When we discovered the dataflow interpretation of Lucid (see post, Lucid, the dataflow language) we thought we’d found the promised land. We had an attractive computing model that was nevertheless faithful to the declarative semantics of statements-as-equations.
However, there was a snake in paradise, as David May explained in the fateful meeting in the Warwick Arts Center Cafeteria.
The problem was with the if-then-else operator and, more generally, with any operator that can produce output before it has consumed all its inputs. The simplest example is probably the lazy logical disjunction operator, which can produce the output true as soon as an input of true arrives on either of its input pipes.
The simple data-push (data-driven) dataflow model cannot take advantage of these operator’s foresight. In the data-driven model, a computation node has to wait for all its inputs to arrive before it produces any output. Suppose, for example, that an if-then-else node receives 25 on its second input then true on its first. It has enough information to send 25 off as its current output. Instead, the data driven model requires it to wait until (say) 36 appears on its third input. Only then is it free to send out 25, ignoring the 36 that it was waiting for.
We can always loosen the model an allow a node to send of output before the arrival of input known to be unnecessary. But we can’t avoid waiting for the unnecessary input, and discarding it upon its arrival.
To see why, suppose that the next value of the first input arrives and its value is false. This means that the value that succeeds 25 is irrelevant, and what we need is the value that succeeds the current input on line number three (36). However our simple model does not allow us to skip inputs. We have to wait until 36 shows up, discard it, then wait for the value that we will be using as the next output.
And it’s the need to discard data that leads to serious trouble. It could be that a huge amount of resources were required to produce the 36, resources which were wasted because we threw it out. But a real catastrophe happens if the data to be discarded requires
infinite resources. Then we wait forever for a result that is irrelevant, and the computation
Why is this a catastrophe? After all, nontermination is part of computation – there is no
complete model of computation that avoids. The catastrophe, however, is that this nontermination is not predicted by the denotational semantics of the program considered as a set of equations. In particular, the semantics of if-then-else is based on the equations
if true then X else Y = X
if false then X else Y = Y
Our Lucid semantics obeyed these rules even when any of X, Y, or Z has the value ⟂ – the ‘result’ of a computation that does no terminate. In particular
if true then X else ⟂ = X
if false then ⟂ else Y = Y
In other words, the Lucid if-then-else can stare into the abyss of nontermination without losing its soul; but the data-driven if-then-else node cannot (recall that Nietzsche’s remark that when you stare into the abyss, the abyss stares back into you).