Dataspace 10: An Array Representation

Prelude:
Dataspace 0: Those Memex Dreams Again
Dataspace 1: In Search of a Data Model
Dataspace 2: Revenge of the Data Model
Dataspace 3: It Came From The S-Expressions
Dataspace 4: The Term-inator

I want a Memex. Roughly, I want some kind of personal but shareable information desktop where I can enter very small pieces of data, cluster them into large chunks of data, and – most importantly – point to any of these small pieces of data from any of these chunks.

‘Pointable data’ needs a data model. The data model that I am currently exploring is what I call term-expressions (or T-expressions): a modified S-expression syntax and semantics that allows a list to end with (or even simply be, with no preceding list) a logical term in the Prolog sense.

So far, we have been looking at term-expressions as an extension of (or implemented on top of) Lisp or Scheme cons-cell structure. This is fine if we’re running on a Lisp or Scheme. But the most popular languages today are not Lisp or Scheme, and don’t usually have a native cons-cell implementation. Further, the model of all storage as a big undifferentiated soup of cons-cells has a couple of big limitations: 1) access time is O(n) to O(log n), depending on the data structure, if we don’t already have a pointer, and 2) pointers are relative to one big memory pool – they don’t give us an easy way to break our data into chunks and make sure that related data is stored close by.

One way of solving all of these problems is to look at how we can represent term-expressions not on cons-cells, but on a much more fundamental and widely-available data structure: arrays.
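
As a rough illustration of the access-time point (not the encoding the full post goes on to develop), here is a minimal Python sketch. It assumes a cons cell can be modelled as a 2-tuple with None standing in for nil; the names cons, nth_cons and to_array are hypothetical and exist only for this example.

```python
# Hypothetical sketch: the same list as a chain of cons cells and as a flat array.
# cons, nth_cons and to_array are illustrative names, not taken from the posts.

def cons(head, tail):
    """Model a cons cell as a 2-tuple; None plays the role of nil."""
    return (head, tail)

def nth_cons(cell, n):
    """Reach element n by following n links from the head: O(n)."""
    for _ in range(n):
        cell = cell[1]
    return cell[0]

def to_array(cell):
    """Copy a proper (nil-terminated) cons chain into a Python list."""
    out = []
    while cell is not None:
        out.append(cell[0])
        cell = cell[1]
    return out

chain = cons("a", cons("b", cons("c", cons("d", None))))
array = to_array(chain)

third_via_links = nth_cons(chain, 2)  # walks two links to get there
third_via_index = array[2]            # one indexed lookup
```

Both lookups return the same element; the difference is that the array version gets there by position, and its elements sit next to each other rather than wherever the allocator happened to put each cell.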

Dataspace 9: A Tower of Nulls, And Awkward Sets

Looking at term-expressions, one of the first things we notice is that there are a large number of null-like terms. I’m wondering what the meaning of these varieties of null might be.

  • The simplest null-like term is the nil pair or empty list: ()
  • The next one is the empty term: (/)
  • Then we have the empty set (if we can think of /all as a set) or empty union: (/all)
  • Then, for every other term functor X, the empty X: (/X)

An interesting question is whether terms correspond to types (and if so, in which type system), or whether the notion of ‘type’ is unrelated to what we’re looking at here.

Dataspace 8: Example: Movie data

Sidebar: Here’s a quick comparison of what I’m hoping to achieve in terms of syntax and readability, and an example of why I think it’s important to spend a fair bit of time thinking about syntax. Particularly about what’s not in the syntax, so that it’s not there to get in the way.

SWI-Prolog’s SWISH has some wonderful example programs on the web; here, for example, is a simple movie database with nearly 3000 separate facts (probably taken from IMDb).

Dataspace 7: A Low-Level Encoding

Up till now we’ve been looking at term-expressions as a thin layer over S-expressions (i.e., one reserved symbol, the term marker), and assuming that at a machine level they will use a Lisp-like cons cell structure (i.e., linked lists).

The architecture of PicoLisp makes a good argument for using cons cells as the only method of storage, as it simplifies memory management, and simplicity may be more important for reliability and security than raw performance.

But if we wanted, we could have quite a dense encoding for term-expressions, based on the old Lisp Machine tricks of CDR coding and tagged pointers. This means we could map term-expressions directly onto sequences of memory cells.
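
To make the two tricks concrete, here is a small Python sketch of the general idea. The word layout (two low bits of CDR code, two bits of type tag, the rest payload) and every name in it are assumptions made up for this example; it is not the encoding described in the full post, and it only handles non-empty flat lists of non-negative integers and symbols.

```python
# Hypothetical word layout: [ payload | 2-bit type tag | 2-bit CDR code ].
# All constants and function names here are illustrative assumptions.

CDR_NEXT = 0  # the rest of the list is in the next memory cell
CDR_NIL = 1   # this cell holds the last element of the list

TAG_INT = 0     # payload is the integer itself
TAG_SYMBOL = 1  # payload is an index into a symbol table

def pack(payload, tag, cdr):
    """Pack payload, type tag and CDR code into one integer 'word'."""
    return (payload << 4) | (tag << 2) | cdr

def unpack(word):
    """Split a word back into (payload, tag, cdr)."""
    return word >> 4, (word >> 2) & 0b11, word & 0b11

def encode_list(items, symbols):
    """Encode a non-empty flat list as consecutive words (no link pointers)."""
    words = []
    for i, item in enumerate(items):
        cdr = CDR_NIL if i == len(items) - 1 else CDR_NEXT
        if isinstance(item, int):
            words.append(pack(item, TAG_INT, cdr))
        else:
            if item not in symbols:
                symbols.append(item)
            words.append(pack(symbols.index(item), TAG_SYMBOL, cdr))
    return words

def decode_list(words, start, symbols):
    """Read cells from 'start' until a CDR_NIL code ends the list."""
    items = []
    i = start
    while True:
        payload, tag, cdr = unpack(words[i])
        items.append(payload if tag == TAG_INT else symbols[payload])
        if cdr == CDR_NIL:
            return items
        i += 1

symbols = []
words = encode_list(["point", 1, 2], symbols)
roundtrip = decode_list(words, 0, symbols)
```

A three-element list takes three words here, where a plain cons-cell chain would take three cells of two fields each. A real CDR-coded system also needs a code for "the next word is an explicit pointer to the tail", which this sketch leaves out.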

Dataspace 6: Terms as Types

I’m not a huge fan of type theory as it currently exists in functional programming languages such as Haskell. Type theory seems, to me, to be merely an application of logic – and I think that a general logical inference engine like Prolog, one that operates on general logical terms and can operate at runtime, would be much more useful than a type-inference engine that only runs at compile time.

(Because on the Internet, it’s always runtime. There is no ‘compile time’ where you can escape the entire network; all you can do is pass the output of one program into another via sending and receiving data. At some point, we need to start thinking about ‘types’ as merely syntactic transformations of data, or logical statements (themselves pieces of data) made, inferred and proved about data. To handle the creation of new types at runtime, we need functions that can take types – or structured data containing types – as arguments, and return types as values.)
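
As a very loose illustration of that last sentence, here is a Python sketch in which a ‘type’ is just a runtime predicate on data, and new types are built by ordinary functions that take types and return types. This swaps in plain predicates for logical terms and does no inference at all, so it is only a stand-in for the idea; the names is_int, is_str, list_of and either are hypothetical.

```python
# Hypothetical sketch: 'types' as runtime predicates, built by ordinary functions.
# This is not Prolog-style inference; it only shows types as first-class values.

def is_int(value):
    """A 'type': true for integers (excluding booleans)."""
    return isinstance(value, int) and not isinstance(value, bool)

def is_str(value):
    """A 'type': true for strings."""
    return isinstance(value, str)

def list_of(elem_type):
    """Take a type and return a new type: lists whose elements all satisfy it."""
    def check(value):
        return isinstance(value, list) and all(elem_type(v) for v in value)
    return check

def either(a, b):
    """Take two types and return their union."""
    return lambda value: a(value) or b(value)

# New types created at runtime from existing ones.
int_list = list_of(is_int)
mixed_list = list_of(either(is_int, is_str))
nested = list_of(int_list)
```

Because the types are ordinary values, they could in principle be received over the network, stored inside other data, or constructed from input that did not exist when the program was written.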