In a companion post I give an overview of pyLucid, a version of Lucid close to that found in the Lucid book and implemented in Python. Here is an overview of the implementation.

The first stage is parsing. I used a not-very–well-known mixture of recursive descent and operator precedence (which I reinvented independently). Instead of separate functions for every level of precedence, it has a single function parametrized with the precedence level. For error checking it has a function *expect* parametrized with a lexical token. If the token doesn’t show up as expected, an error message is produced along with the first five tokens that showed up instead. This is usually enough to pinpoint the syntactic error.

The parser produces not a parse tree but objects in a multi-sorted abstract data type that models the various syntactic categories. For example it may produce a *where* (clause) that has a *subject* and a *body,* which is a list of *definition*s each of which has a *rhs* and a *lhs* and so on. The abstract data type has predicates, constructors, and destructors. For example *WhereP(w)* tests if *w* is a where clause, *WhereSubject(w)* returns its subject, and *WhereC(s,b)* returns a clause with subject *s* and body *b*.

The abstract data type is implements with POP2 constructs but this fact, and the implementation details, are a closely guarded secret found only in one file. In turn the POP2 data objects are implemented by Python constructs and the details are found only in another file. And this file also implements POP2 I/O which, as I indicated, doubles as the pyLucid lexxer.

Early on I realized that it was much more convenient to have a single destructor that returns components (Python allows multiple returns). Thus instead of writing

s = WhereSubject(w)

b = WhereBody(w)

you write

s, b = WhereD(w)

Once the program passes the syntax check and is turned into a where clause the semantic checking begins. We verify that the where bodies have at most one definition per variable and that the formal parameters of functions are all distinct. We check that function arities are consistent (always have the same number of parameters) and finally that every variable (besides *input*) has a relevant definition.

Once the checking is done we proceed to transform the program to a form that can be directly evaluated. The first step is a systematic renaming of variables so that every logically different variable has a different name. For example if the program is

X+I

where

X = f(Y);

f(X) = X+Y*I

where

Y = 7;

I = 2;

end;

I=9;

end;

[sorry, I haven’t figured out indenting] the result of renaming is

X+I

where

X = f(Y);

f(X_01) = X_01+Y_01*I

where

Y_01 = 7;

end;

I = 9;

Y = 3;

end;

The crucial points are (1) the transformed program is still a pyLucid program; and (2) it’s semantically equivalent to the original. This will be true at each stage, so we will end up with a very simple (but long) program equivalent to the original source.

At several points we will ‘dissolve’ where clauses by simply dumping their definitions into the enclosing set of definitions. Thus

A+B

where

A = 9;

B = 2 * A

where

A = 7;

end;

end;

is renamed to

A+B*3

where

A = 9;

B = 2 * A_01

where

A_01 = 7;

end;

end;

then the inner where clause can be dissolved giving

A+B*3

where

A = 9;

B = 2 * A_01

A_01 = 7;

end;

The most important transformation is the elimination of user defined functions basically by transforming function calls into branching time iteration (I’ll save the details for another post). Then the program has only zero-order (individual) variables and the rest of the where clauses can be dissolved.

Finally by introducing extra variables we turn compound expressions into ‘atomic’ expressions (only one operation) and the whole program is now a set of atomic equations: one individual variable on the left equated to an expression consisting of an operation applied to individual variables (or a literal/constant).

The program above becomes

output = A+V_00;

V_00 = B*V01;

V_01 = 3;

A = 9;

B = V_02 * A_10;

V_02 = 2;

A_01 = 7;

and it is ready for evaluation.

We can consider the atomic equations form of the program as the analog of assembly language. The output of the compiler is the assembly language and nothing else, and that is the input to the interpreter/evaluator. Thus they communicate with a human-readable text file.

Eduction of the assembly language is straight forward. The evaluator demands values of particular variables for particular space and time coordinates. These coordinates are stored in ‘registers’ rather than passed around as parameters.

For example, if we demand A and it it defined as

A = B+C

then we evaluate B and C without changing registers and return the sum of these values.

If

A = next B

we bump the time register by one, evaluate B, then decrement the time register (restoring its initial value).

If

A = B sby C

we examine the space register. If it’s 0 we evaluate B, otherwise we decrement it, evaluate C, then increment it.

And so on (check out the eduction blog post).

There are actually more registers than just time and space. Our treatment of nesting implies a hierarchy (usually very shallow) of extra time parameters, one for each current level of nesting. There are another two registers specifying the current “place” parameter (place in the function calling tree). The compiler introduces special operators that explicitly manipulate these coordinates.

We have to save computed values in a cache to avoid recomputing them. There is so much memory on even a laptop these days that for the time being we didn’t implement the retirement plan to periodically thin out the cache.

We did, however, find a simple strategy that greatly reduces the size of the cache – we don’t cache variables that are the result of data operations. Thus if A=B+C, we don’t cache A because it can be recomputed B and C.

While exercising the interpreter we discovered an interesting phenomenon: many if not most cached values are never retrieved. We keep track of variables for which at least one value is retrieved. In practice the new V- variables are not retrieved and the same holds for most of the original program variables. Typically even with a larger program only a half dozen or so program variables actually benefit from caching. If we could understand this phenomenon better we could save a lot of memory.