Dataspace 3: It Came from the S-Expressions

Prelude:
Dataspace 0: Those Memex Dreams Again
Dataspace 1: In Search of a Data Model
Dataspace 2: Revenge of the Data Model
Dataspace 3: It Came From The S-Expressions
Dataspace 4: The Term-inator

I want a Memex. Roughly, I want some kind of personal but shareable information desktop where I can enter very small pieces of data, cluster them into large chunks of data, and – most importantly – point to any of these small pieces of data from any of these chunks.

The Prolog data model – based on logical terms , which are very similar to SQL relations, but can be nested and computed like functions – looks useful, but still has a few rough edges because it was built in 1972 and hasn’t changed much since.

For one thing, a Prolog term looks like a C function call

funny(cats)

but that syntax is rather irregular. Can we come up with a simpler syntax, that gives us more options for how we organise data?

Yes. Yes we can. But there’s a price to pay.

Option 7: S-expressions

The granddaddy of almost all programming languages – specified in 1958, so predating even Algol  is Lisp, which almost entirely minimises the distance between its in-memory semantics and its syntax. Its semantics is linked lists; its syntax is S-expressions.

Linked lists have some drawbacks compared to arrays or objects (they use more memory and are slower to search) but also have the advantage that they are very simple, can store any combination of trees and lists, and can store changes to structured data without needing to copy the entire structure.

Right now, since we’re thinking about a system where remixing data is the use case, it’s the simplicity and flexibility that we’re after rather than performance. We want to find the simplest possible data model that can capture the information we want to store. Can we use S-expressions to represent Prolog term structure?

Traditionally, Prolog systems tend to be compilers (for speed) and internally don’t use linked lists to store data. The architecture generally derives from a 1983 specification, the Warren Abstract Machine, which was intended for raw speed. Unfortunately, the WAM has hardcoded in some annoying limitations. If we look at Prologs based on S-expressions, we might see some more flexible ways of handling Prolog term structure.

There are three interesting answers here.

Pilog

The minimal Lisp-like system PicoLisp bundles a Prolog engine called Pilog. Pilog rules or facts are entered – and stored – as S-expressions. Variables are represented with an @ in front of them, which means we can have symbols with uppercase.

The Prolog clauses:

likes(john, mary).
likes(john, X) :- likes(X,wine),likes(X,food).

becomes in Pilog

(be likes (John Mary))
(be likes (John @X) (likes @X wine) (likes @X food))

‘be’ here is a PicoLisp macro, which just takes the rule and adds it to the Pilog database (stored in property lists of the PicoLisp symbol table, which makes lookup on the functor name fast, but it’s a simple search after that).

An interesting thing you might wonder if you’re a Lisp – or particularly, Scheme – programmer is if you could use dots in these S-expressions? Eg, if you recall how S-expressions work, they’re not arrays but linked lists, so these two expressions describe the same list:

(a b c)
(a . (b . (c . ())))

but this expression

(a . b)

has only one form – it cannot be simplified, and cannot be expressed as an array. It’s just a pair of values. This form is called a dotted pair, and we can end any list with it:

(a b c . d)

The ‘d’ here is not ‘the fourth element of an array’ – it is, rather, ‘the entire rest of the list past the third element’.

What might happen if ‘d’ were a variable? Could we match an entire part of a term, not just one ‘slot’ in it? eg in

(be likes (John . @X) …)

@X might equal not ‘Mary’ but ‘(Mary)’. Not a value, but a list.

and we might also have

(be likes @X …)

which matches @X to ‘(John Mary)’, eg, an entire row in a table, if we were thinking of ‘likes’ as something like a SQL relation.

Note that – just as we can’t express (a b c . d) in an array, we also can’t express ‘likes (John . @X’) in standard Prolog term form. There is no syntax for it, because there’s no semantic model for it – because the Prolog term data model comes from formal logic, not from S-expressions. And because the formal logic community existed centuries to decades before Lisp, nobody in that community quite got around to thinking that ‘capturing the remainder of an expression in a logical variable might be useful’. ‘A tuple might actually be a linked list, with a hidden nil at the end’ just wasn’t the paradigm they were working in – but in a Lisp frame of mind, converting between tuples and lists comes naturally.

This is important. By doing something we thought was quite trivial – just changing our syntax – we actually did something quite major: we expanded our data model’s semantics, because the syntax we moved to was strictly more expressive than our original data model. By looking at the unused parts of our syntax we found semantics that weren’t in our original model, but thinking about them, they seem more complete and well-founded than the original model.

Pilog’s not the only S-expression Prolog, though. Can we learn from another system? I’m glad you asked!

Micro-Prolog

Would you believe me if I told you that in 1983, you could buy an entire working Prolog system, implemented in less than 48K of RAM, for the Sinclair Spectrum cassette-tape based home computer? You’d think I was pulling your leg, right?

But there it is. And the Micro-Prolog Primer textbook is both a serious, academic look at Prolog, and a glimpse of a different way of thinking about Prolog expressions – a more S-expression-y way.

MicroProlog in fact had two separate Prolog languages. Its core language was an S-expression Prolog (using | instead of . , but with the same semantics). This core language was then used to implement a ‘user interface’ language over the top called SIMPLE (an allusion to BASIC, I guess). SIMPLE took a much more ‘triple-based’ approach. Eg the SIMPLE clause (p3)

y greater-of (x y) if x LESS y

would be like the standard Prolog clause

greaterof(y, [x,y]) :- x < y.

Note that (x y) in SIMPLE is a list.

Oh, and the odd trick that Micro-Prolog pulls with variable names is that ‘anything starting with x, y or z is a variable’. It’s not a bad trick, really, for a tiny system; comparable with Microsoft BASIC’s ‘variables must be a letter and maybe a number’ or FORTRAN’s  infamous ‘integers are variable names between i and n’.

SIMPLE, looking so unlike Prolog and so much like RDF’s triples (yet with the full Turing-complete algorithmic power of Prolog behind it) is really interesting! A glimpse of a completely different way the history of 1980s home computing could have unfolded. But it’s the underlying Micro-Prolog Standard Syntax that’s the real treasure here.

From part III, page 247, we get the ‘Standard Syntax’ chapter. p 249 shows us that a Micro-Prolog Standard Syntax clause is very much like a Pilog one:

((friend-of x y) (likes x y) (likes y z))

except instead of putting the functor separately from the arguments, the whole thing is just a list. Isn’t that much simpler? And might this revised syntax – more in line with native S-expressions – perhaps reveal some more semantic possibilities that were formerly hidden by the formal logic notation?

Micro-Prolog certainly has discovered our ‘dotted pair’ trick. It uses it on page 260, to parse SIMPLE expressions. It also uses a trick similar to Lisp’s macros, calling variables which can match parts of a SIMPLE expression ‘meta-variables’.

And Micro-Prolog’s meta-variables – arising naturally from the structure of S-expressions – can match things that standard Prolog forbids. They can match, for instance, the functor (or predicate symbol)  – which in S-expressions is just the head of a list, a part of an expression like any other.

This allows some pretty sophisticated, reflective metalogical programming – on the 48K Sinclair Spectrum in 1983.

Used with dict the meta-variable enables us to find the names on the arcs between particular nodes, as in the above query. It also enables us to find all the nodes connected to a given node together with the name of the connection:

&.WHICH((x z) (dict x) (x John z))
(likes Mary)
(is-a-friend-of Mary)
(likes Jim)

No (more) answers

The clause
((connects x y z) (dict x) (x y a))
is a rule that can be used to walk over this graph.

 

In many ways, this tiny 1983 system was already decades ahead of even the Semantic Web.

Has the rest of the Prolog community learned this trick? Sorta. It took until 1989 before formal academia took note, and by then Prolog was almost dying.

HiLog

In 1989 and 1993 a pair of academic papers showed that yes, you can have something other than just a symbol as a predicate functor. This discovery was given a name: HiLog.

The original paper:

HiLog: A foundation for higher-order logic programming (1993)
by Weidong Chen , Michael Kifer , David S. Warren
Venue: JOURNAL OF LOGIC PROGRAMMING

(that’s the David S Warren who invented the Warren Abstract Machine, by the way, who should know his logic programming)

Abstract

We describe a novel logic, called HiLog, and show that it provides a more suitable basis for logic programming than does traditional predicate logic. HiLog has a higher-order syntax and allows arbitrary terms to appear in places where predicates, functions and atomic formulas occur in predicate calculus. But its semantics is first-order and admits a sound and complete proof procedure. Applications of HiLog are discussed, including DCG grammars, higher-order and modular logic programming, and deductive databases.

I’ll leave you to read the paper itself. It’s a bit dense and not entirely relevant to what we’re looking at. But the kernel of it is:

First-order predicate calculus is well-defined. There exist higher-order logics which are not well behaved at all.  But the formal logic community made a bit of a huge oversight when it assumed that ‘higher order logic’ meant higher order semantics rather than higher order syntax. Higher order semantics, essentially means you’re talking about ‘the contents of a relation’ – which could be infinite, and obviously proving things about infinitely-sized things can get undecidable. Higher order syntax however, is closer to what we do in functional programming: we can have arbitrarily complex structured objects, for example functions, but they’re really just names, not values. We don’t have to make them mean anything (or evaluate to anything) until we want to.

Long and short of it is, Micro-Prolog (which doesn’t even get a mention here), doing the simple and obvious thing with variable matching based on the natural properties of the simplest possible syntax, was years ahead of the formal logic community which had, for historical reasons, missed that names can be structured entities.

The lesson of HiLog, although formally learned in 1989, hasn’t by any means been distributed evenly. SWI Prolog doesn’t include it. Kanren does not think about it at all. One system that does is XSB – but that’s quite a large sprawling project, and not especially well documented. Logic programming is already a small subcommunity, and HiLog semantics are a tiny pool within that.

So maybe S-expression syntax is a very useful thing to hold to, and can give us an edge here. But there are still problems with it.

Both Micro-Prolog and Pilog use the S-expression list (a b c) to mean both ‘Prolog term’ and ‘Lisp list’.

Remember in Prolog, these were two very different things? One used the () syntax and the other used the [] syntax.

There are advantages you get from combining them. The . operator is the most obvious one – it means you can break terms apart like lists. In Prolog, you can do this, sorta, but you have to explicitly convert a term into a list and back again. Like HiLog, you can live without it, but you lose a lot of directness and simplicity.

So representing terms as lists seems like something we want. But we would like to be able to distinguish terms from lists – because once again, we run into a problem of ambiguity if we don’t.

If you just see

(a b c)

in Micro-Prolog, or

(a (b c))

in Pilog, you don’t know what you’re really looking at. Is it a list? Is it a term? You have to go by context. And that’s really, really dangerous in a system which might sprawl across the Internet and hang around for decades. We want maximum flexibility, but minimum ambiguity.

Remember that the thing we liked most about object oriented programming is that every object comes with a type (or class, or prototype) built in? So that given a random piece of data stuff, you can ask it directly: what are you?

Well: can we combine these two so far separate data worlds — terms and lists as S-expressions; and terms and lists as separate entities –and get something perhaps simpler and more powerful than either?

The answer is yes.