## Dimensionality Analysis: multiple dimensions

[This reports on research carried out in collaboration with my PhD student Monem Shennat and former student Omar Alaqeeli]

In a previous post I showed how to find identify variables in a classic (time only) Lucid program that are constant – do not change with time. This information proves vital for output and for efficient caching.

Now we consider two dimensional Lucid, like pyLucid, in which variables can vary in two independent dimensions, time t and space s.

The situation is more complex because a variable can be constant, sensitive to time only, sensitive to space only, or sensitive to both. If we define the dimensionality of a variable as the set of all dimensions to which it is sensitive, the four possibilities are {}, {t}, {s}, and {s,t}. Our dimensionality analysis assigns one of these to every program variable, including the output. Note that these assignments are upper bounds. For example, we may assign {s,t} to X even though, when we actually run the program, it turns out that the space parameter doesn’t affect the results. Upper bounds are good enough to avoid caching duplicate values.

Here is a two dimensional program that produces the stream of prime numbers:

N2 = 2 sby N2+1;
S = N2 fby (S wherever (S mod init S != 0));
output = init S;

The first step, as before, is to convert it into a set of atomic equations:

N2 = 2 sby T00;
T00 = N2+1;
S = N2 fby T01;
T01 = S wherever T02;
T02 = S mod T03;
T03 = init S;
T04 = T02 != 0;
output = T03;

Next we set everything to dimensionality {} (constant). On the second iteration, we set N2 to {s} and S to {t}.

On the third iteration T00 gets {s}, S gets {t,s}, T01 gets {t}, T02 gets {t}, T03 gets {t}, T04 gets {t}, output gets {t}.

Finally on the fourth iteration N2 gets {s}, T00 gets {s}, S gets {t,s}, T01 gets {t,s}, T02 gets {t,s}, T03 gets {t}, T04 gets {t,s}, and output gets {t}.

The next stage gives the same result so the iteration is complete. We conclude in particular that N2 has dimensionality {s} (a pure vector), S has dimensionality {t,s} (a time varying array), and output has dimensionality {t} (a pure stream). When N2 is cached one space tag is enough, caching S requires a time tag and a space tag, and the output is a stream of scalars.

At each stage the new dimensionality of each variable is calculated from the previous dimensionalities according to rules for the operators. These rules are similar to the rules for time sensitivity but more elaborate. Here are some of them

if V = X sby Y then V is space sensitive, and also time sensitive if either X or Y are
if V = init X then V is space insensitive but time sensitive if X is
if V = X wherever P then X is space sensitive and time sensitive if either X or P is
if V = X + Y then V is space sensitive if either X or Y is and time sensitive if X or Y is

Each rule with operator o can be understood as applying a corresponding operator o* to the dimensionalities of the operands. For example {} sby* {t} is {t,s} and +* is union (of dimensionalities). These *-operators are all monotonic in the subset ordering of dimensionalities so that if di is the approximate dimensionality on the ith iteration, di ⊆ di+1. This guarantees that the iterations eventually settle down.

Multiple Dimensions

What if we have more than two dimensions – say vertical v, horizontal h, forward and backward z? The algorithm should be evident by now. A dimensionality is a set (e.g.{t,h,v}) of dimensions, to which the intension in question is possibly sensitive.

Of course we don’t invent dozens of new operators like fby and sby. Instead we parameterize them with a dimension’s name attached with a period. Thus fby.h is fby in the vertical direction, and first.z is first in the forward/backward dimension. Ordinary fby becomes fby.t.

The algorithm proceeds as above, except that the values being combined are subsets of the set {t,s,v,h,z,…} of all dimensions. (We can assume, since any particular program is finite, that this set itself is finite.)

The general rules are obvious generalizations of those given above for {s,t}:

if V = X + Y then V is assigned the (set) union of the current values of X and Y
if V = next.d X then V is assigned the current value of X
if V = first.d X then V is assigned the current value of X, minus d if it is in this set
if V = A fby.d Y then X is assigned the union of the values of X and Y, plus d
if V = X whenever.d Y then X is assigned the union of the values of X and Y, plus d

Future Work

This analysis is by no means the last word for GLU, the advanced dialect of Lucid developed at SRI as a coordination language. GLU had many more powerful dimensional features. GLU had where clauses (blocks) with temporary declarations of local dimensions. It also allowed user-defined functions. These could be data functions like cube but could be nonpointwise, for example computing an running average. The definitions could be recursive and even dimensionally abstract, e.g. having a dimensional parameter that specifies the dimension in which the running average is performed.

At present we don’t know how to analyze such programs; research is continuing.

How did GLU do it? We don’t know because the GLU implementation is proprietary software and is currently locked up in a corporate vault. It is all but certain that it will never see the light of day. We do know, for example, from discussions with former GLU researchers, that they used crude approximations. In particular, for user defined functions they took the union of the approximate dimensionalities of actual parameters yielding a worst-case result. 