Dataspace 10: An Array Representation

Prelude:
Dataspace 0: Those Memex Dreams Again
Dataspace 1: In Search of a Data Model
Dataspace 2: Revenge of the Data Model
Dataspace 3: It Came From The S-Expressions
Dataspace 4: The Term-inator

I want a Memex. Roughly, I want some kind of personal but shareable information desktop where I can enter very small pieces of data, cluster them into large chunks of data, and – most importantly – point to any of these small pieces of data from any of these chunks.

‘Pointable data’ needs a data model. The data model that I am currently exploring is what I call term-expressions (or T-expressions): a modified S-expression syntax and semantics that allows a list to end with (or even simply be, with no preceding list) a logical term in the Prolog sense.

So far, we have been looking at term-expressions as an extension of (or implemented on top of) Lisp or Scheme cons-cell structure. This is fine if we’re running on a Lisp or Scheme. But the most popular languages today are not Lisp or Scheme, and don’t usually have a native cons-cell implementation. Further, the model of all storage as a big undifferentiated soup of cons-cells has a couple of big limitations: 1) an O(n)  to O(log n) access time, depending on the data structure, if we don’t already have a pointer, and 2) pointers are relative to a big memory pool – they don’t give us an easy way to break our data into chunks and make sure that related data is stored close by.

One way of solving all of these problems is to look at how we can represent term-expressions not on cons-cells, but on a much more fundamental and widely-available data structure: arrays.

Continue reading “Dataspace 10: An Array Representation”