In this model there is a first or initial time point, and every time point has a unique successor. Imperative iterations normally terminate, so we should have only finitely many time points. Lucid avoids the complications of finite time domains by making everything at least notionally infinite, so that the domain of time points is the natural numbers with the usual order.

In temporal logic logicians have studied a huge variety of time domains. What do they mean in terms of iteration and how do we write iterative programs over nonstandard time domains?

One simple generalization is to drop the requirement that there be an initial time point and specify instead that every time point also has a unique predecessor. This gives us the integers as the time domain. We won’t get into philosophical problems about infinite negative time and iterations that have formally been going on forever.

We can adapt Lucid to this notion of time by making our streams have the integers as their domain. For primitives we can retain the usual **first**, **next**, and **fby**. It's obvious how the first two work, and that **next** has a dual, **prev**. But **fby** needs some thought. We soon see that *X* **fby** *Y* must be the standard part of *Y* shifted right, preceded by the zero point of *X*, preceded by the nonstandard part of *X*. In other words

*… x_{-2}, x_{-1}, x_{0}, y_{0}, y_{1}, y_{2}, …*

with *x_{0}* at the zero point.
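As a sketch (not part of any Lucid implementation), we can model these generalized streams in Python as functions from integer timepoints to values; all the names below are invented for illustration:

```python
# A bi-infinite stream is modelled as a function from an integer timepoint
# to a value. All names here are invented for illustration.

def const(c):
    return lambda t: c

def nxt(x):                 # Lucid's next
    return lambda t: x(t + 1)

def prev(x):                # its dual
    return lambda t: x(t - 1)

def fby(x, y):
    # x supplies the zero point and the nonstandard (negative) part;
    # y, shifted right one place, supplies the positive timepoints.
    return lambda t: x(t) if t <= 0 else y(t - 1)

square = lambda t: t * t
s = fby(square, const(9))
assert [s(t) for t in range(-2, 3)] == [4, 1, 0, 9, 9]
```

Here *s* has the squares on its nonstandard part and zero point, then the constant 9 from timepoint 1 on, matching the picture above.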

After playing around a bit we discover we need another operator that works like **fby** except that it puts *y_{0}* at the zero point. Since it's sort of a backwards **fby**, we call it **ybf**. For example, suppose we want to define a stream *Fib* of Fibonacci numbers

*1, 1, 2, 3, 5, 8, 13, …*

Then

*Fib = 1 ybf (0 ybf prev prev Fib + prev Fib)*

This definition goes two steps into the past to define the current value of *Fib* in terms of the two previous values. We can also go only one step by writing

*Fib = 0 ybf (1 fby prev Fib + Fib)*

and here we’re defining the next value of *Fib* in terms of the current and previous values.
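Continuing the Python sketch (streams as functions from integer timepoints to values, with invented names), the second definition can be checked directly; **ybf** puts its right operand's values at the zero point onward:

```python
from functools import lru_cache

def fby(x, y):
    return lambda t: x(t) if t <= 0 else y(t - 1)

def ybf(x, y):
    # y supplies the zero point onward; x, shifted left, the negative part
    return lambda t: y(t) if t >= 0 else x(t + 1)

@lru_cache(maxsize=None)
def fib(t):
    # Fib = 0 ybf (1 fby prev Fib + Fib)
    expr = ybf(lambda u: 0,
               fby(lambda u: 1, lambda u: fib(u - 1) + fib(u)))
    return expr(t)

assert [fib(t) for t in range(-3, 7)] == [0, 0, 0, 1, 1, 2, 3, 5, 8, 13]
```

The nonstandard part is all zeros, and from the zero point on we get the Fibonacci numbers.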

Either of these is preferable to the confusing

*Fib = 1 fby 1 fby Fib + next Fib*

which is correct but hard to grasp because it defines **next next** *Fib* in terms of **next** *Fib* and the current *Fib.*

What would this mean for imperative programming? What would a **while** loop look like that has pre-initialization? Let alone one that has already been going on forever? No idea.

What if we want to define a stream by two recurrence relations, one forward and one reverse? The simplest example is a counter, given in standard Lucid as

*I = 0 fby I+1*

This clearly gives the wrong answer in the new interpretation; *I* is

*… 0, 0, 0, 0, 1, 2, 3, 4, …*

So let’s define the left hand part by

*J = J-1 ybf 0*

and this defines *J* as

*… -3, -2, -1, 0, 0, 0, 0, …*

Then we can put them together as *J* **fby** *I*, giving

*… -3, -2, -1, 0, 1, 2, 3, …*

Interestingly, *J* **ybf** *I* gives the same result.

But this is all pretty clumsy. Can we do better? Yes, it turns out

*K = K-1 ybf (0 fby K+1)*

does the job. What if we parenthesize it the other way? What if

*K = (K-1 ybf 0) fby K+1*

It turns out we get the same result. This is not a coincidence. There is a general rule that

*A ybf (M fby B) = (A ybf M) fby B*

for any *A, B,* and *M*. Both expressions denote the generalized stream

*… a_{-3}, a_{-2}, a_{-1}, a_{0}, m_{0}, b_{0}, b_{1}, b_{2}, b_{3}, …*

This is a very pleasing identity (trivial to verify) and justifies us omitting parentheses in definitions such as

*K = K-1 ybf 0 fby K+1*

which combines two recurrence relations, one backwards, one forward.
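In the same Python sketch (invented names), the identity is easy to spot-check on symbolic values:

```python
def fby(x, y):
    return lambda t: x(t) if t <= 0 else y(t - 1)

def ybf(x, y):
    return lambda t: y(t) if t >= 0 else x(t + 1)

A = lambda t: ('a', t)
B = lambda t: ('b', t)
M = lambda t: ('m', t)

lhs = ybf(A, fby(M, B))          # A ybf (M fby B)
rhs = fby(ybf(A, M), B)          # (A ybf M) fby B
assert all(lhs(t) == rhs(t) for t in range(-5, 6))
assert [lhs(t) for t in range(-2, 3)] == \
    [('a', -1), ('a', 0), ('m', 0), ('b', 0), ('b', 1)]
```

Both sides give *…, a_{0}, m_{0}, b_{0}, …* with *m_{0}* at the zero point, as claimed.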

Now let’s do an example involving both space and time. Obviously we’ll allow negative space coordinates. Instead of thinking up new names for the operators we’ll simply add .s to the time operators, e.g. **prev.s**.

The example is a simple minded numerical analysis treatment of heat flow. We have an infinite (in both directions) iron bar at temperature 1 in the middle, tapering off linearly to 0. First we define a distribution that goes negative, then take the max with 0 to get the desired initial heat distribution *Q*:

*P = P-0.01 ybf.s 1 fby.s P-0.01*

*Q = max(P,0)*

Now we define an iteration in which (I know this is simple minded) H starts with Q and at each step each value of H is replaced by the average of the neighboring values.

*H = Q fby (prev.s H + next.s H)/2*

Imagine the headache it would be to do this with only nonnegative indices. We’d have to shift it all over, calculate how much shift, etc.
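Reading the definitions as recurrences gives a direct Python sketch; here *Q* is written in closed form (the triangle clipped at 0, i.e. the max with 0), and memoization stands in for the warehouse:

```python
from functools import lru_cache

def Q(s):
    # initial distribution: 1 at the centre, tapering by 0.01 per step,
    # clipped at 0 -- i.e. max(P, 0) for the triangle P
    return max(1 - 0.01 * abs(s), 0.0)

@lru_cache(maxsize=None)
def H(t, s):
    # H = Q fby (prev.s H + next.s H)/2
    if t == 0:
        return Q(s)
    return (H(t - 1, s - 1) + H(t - 1, s + 1)) / 2

assert H(0, 0) == 1.0
assert round(H(1, 0), 2) == 0.99
```

Negative space coordinates cost nothing here: `H(t, s)` is happy with any integer *s*.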

What about conventional imperative languages – do they have arrays with negative indices? Apparently not … I don’t know of any and neither do knowledgeable friends. Of course in Python (for example) you can write *A*[-3] but this is just for counting from the end of the array. Odd that.

Next post: branching time.


*2, 3, 5, 7, 9, eod, eod, eod, …*

The input and output conventions are adjusted to interpret *eod* as termination. If the above stream is the output, the implementation will ‘print’ the first five values and terminate normally. If a user inputs the first five values, then terminates the input stream, this is not treated as an error. Instead, the ‘missing’ values are evaluated to *eod* if requested.

What makes it interesting is that when a (strict) data operation is evaluated, if any (or all) of the operands are *eod*, the result is *eod*. (Non-strict operations like *if-then-else-fi* need special rules.) Thus termination propagates through expressions, which is almost always what you want. A continuously running filter which computes, say, a running average of its input will terminate normally if its input is terminated. There is no need to repeatedly test for end of input.
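A minimal sketch of the propagation rule, with *eod* as a Python sentinel (names invented, not any real implementation):

```python
EOD = object()    # stand-in for eod; real implementations differ

def strict(op):
    # lift a data operation so that eod propagates through it
    def lifted(*args):
        if any(a is EOD for a in args):
            return EOD
        return op(*args)
    return lifted

add = strict(lambda a, b: a + b)
assert add(3, 4) == 7
assert add(3, EOD) is EOD
assert add(EOD, EOD) is EOD
```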

Furthermore, *eod* allows us to write expressions and filters for problems that in imperative languages require constructs like *while* or *for*.

A simple example is the *last* filter. Suppose we define S

*S = 0 fby S+I*

to be a running sum of the stream *I*. Let's say *I* is finite and we want the sum of its elements. Obviously, this is the last value of *S*; so we write

*Sum = last S*

And how is last defined? Easy

*last X = X asa iseod next X*

Here *iseod* is a special operator that can examine *eod* without being turned to *eod*. It returns true if its argument is *eod*, false otherwise.
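As a sketch in Python, with a finite stream modelled as a list padded with an *eod* sentinel, *last* amounts to scanning for the value whose successor is *eod* (names invented):

```python
EOD = object()

def stream(xs):
    # a finite stream: the list's values, padded with eod
    return lambda t: xs[t] if t < len(xs) else EOD

def last(x):
    # last X = X asa iseod next X
    t = 0
    while x(t + 1) is not EOD:
        t += 1
    return x(t)

s = stream([0, 3, 7, 12])      # running sums of 3, 4, 5
assert last(s) == 12
```

The `is not EOD` test plays the role of *iseod*: it inspects the sentinel without being turned into it.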

I was browsing Hacker News the other day and learned about the “rainfall problem”, a coding exercise used as a solve-at-the-whiteboard interview question. You have a series (finite, of course) of numbers and you must calculate the average of the positive numbers that appear before the sentinel value -999. Let’s solve it in Lucid with *eod*.

The first step is to remove the ad hoc sentinel value and replace it with *eod*. Let *R* be the original data stream; we define *T*, the finite stream of rainfall readings, as

*T = R until R eq -999*

Here *X until P* is like the stream *X* except that once *P* is true, the output is *eod*. The operator *until* (which normally would be built in) has a simple definition:

*X until P = if sofar not P then X else eod fi*

where *sofar Q* is true at a timepoint iff *Q* has been true up to then. We can define *sofar* as

*sofar Q = R where R = true fby R and Q end*

Now that we have the readings as a proper finite stream we can define the stream *P* of the positive readings as

*P = T whenever T>0*

(For this to work our implementation of *whenever* must handle *eod* correctly. This will be the case, for example, if we base it on the recursive definition

*X whenever P = if first P then first X fby (next X whenever next P)*

*else next X whenever next P fi*

which gives sensible results if *X* and/or *P* are finite.)

Finally we define the stream *A* of averages as

*A = S/N where S = first P fby S+next P; N = 1 fby N+1 end*

and the number we want is the last one

*answer = last A fby eod*

Note that if we count *until*, *sofar*, and *last* as being built-in, we don’t use *iseod*.
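The whole pipeline can be transcribed into Python generators as a sketch (this mirrors the equations, not the demand-driven semantics; names invented):

```python
def until(xs, pred):
    # X until P: yield values until pred first holds, then stop (eod)
    for x in xs:
        if pred(x):
            return
        yield x

def rainfall(r):
    t = until(r, lambda v: v == -999)   # T = R until R eq -999
    p = [v for v in t if v > 0]         # P = T whenever T > 0
    return sum(p) / len(p)              # last of the running averages

assert rainfall([2, -1, 4, -999, 7]) == 3.0
```

Generator exhaustion plays the role of *eod* here: downstream stages simply stop when their input does.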

What happens when there is more than the time dimension? What do we do? If there is also a space dimension then we can add another special value, *eos* (end of space). The value *eos* propagates like *eod* when combined with ordinary data. And we add an extra rule: when *eos* combines with *eod* the result is *eod*; *eod* trumps *eos*. With this arrangement we have a simple output convention. If *X* is the 2D stream being output, we evaluate *X* at timepoint 0 and at successive spacepoints till we encounter *eos*. Then we move to the next line, increase the timepoint to 1, and output successive spacepoints till we again encounter *eos*. We then increase the timepoint to 2, output successive spacepoints etc.

If at any stage we encounter *eod*, we terminate normally. We could call this the ‘typewriter’ output convention. There is a corresponding input convention that requires an end-of-line input as well as an end-of-data input.
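A sketch of the typewriter convention in Python, with *eos* and *eod* as sentinels (all names invented):

```python
EOS, EOD = object(), object()    # illustrative sentinels

def typewriter(x):
    # collect a 2D stream: spacepoints across, timepoints down;
    # eos ends a line, eod terminates normally
    lines, t = [], 0
    while True:
        line, s = [], 0
        while True:
            v = x(t, s)
            if v is EOD:
                return lines
            if v is EOS:
                break
            line.append(v)
            s += 1
        lines.append(line)
        t += 1

grid = lambda t, s: EOD if t == 2 else (EOS if s == 3 else 10 * t + s)
assert typewriter(grid) == [[0, 1, 2], [10, 11, 12]]
```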

And what about three dimensions? For example, video in which frames vary in the time dimension and a frame varies in a horizontal (*h*) dimension and a vertical (*v*). We can generalize the typewriter convention using *eoh* (end of horizontal), *eof* (end of frame), and *eod*.

What’s the general situation, when there’s lots of dimensions? W. Du suggested a family of special objects, indexed by a *set* of dimensions. If *eod(S1)* and *eod(S2)* are the objects corresponding to the sets *S1* and *S2* of dimensions, then the result of combining them (say, adding them) is *eod(S1∪S2)*. Thus the bigger the index set, the more overpowering is the object. The value *eos* is revealed to be *eod({s})* and what we call simply *eod* is *eod({s,t})*. In the video context, *eoh* is *eod({h})*, *eof* is *eod({h,v})* and *eod* is *eod({h,v,t})*.
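This union rule is easy to sketch: represent each special object by its dimension set and combine with set union (names invented):

```python
# Dimension-indexed end markers, following the union rule.
def eod_of(*dims):
    return frozenset(dims)

def combine(a, b):
    # result of any strict operation applied to two end markers
    return a | b

eos = eod_of('s')
eod = eod_of('s', 't')
assert combine(eos, eod) == eod                      # eod trumps eos
assert combine(eod_of('h'), eod_of('v')) == eod_of('h', 'v')
```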

EOD


“But wait!” I hear you say. “It’s nice to write elegant equations defining the infinite array of all primes but what about the everyday working world? What about the array of midterm marks of my class? It’s finite! How do you work with that in Lucid2D?”

Not obvious. You can create an infinite array whose first elements are the midterm marks, followed by infinitely many -99 values. Then set a variable N equal to the number of valid marks and (say) average the first N values of the array. But in what form do you read it in in the first place? We need a general input protocol that, obviously, doesn’t give -99 a special status.

This becomes more pressing if we’re inputting a finite number of finite arrays. In which case we need a whole stream of N’s.

As I mentioned, when we first tried to “add arrays” to Lucid we tried to find a simple algebra of finite vectors, matrices, etc., but never succeeded. The algebra would be especially complex if you want ‘ragged’ arrays in which rows have different lengths (as mentioned above). It turned out that infinite arrays are mathematically simpler than finite ones.

Still, a language that can’t easily average a finite list of midterm grades is not much use. Eventually, we came up with a better solution.

The idea is to introduce a special value that works like -99 above but isn’t an actual number. The new object is *eod*, which stands for “end of data”. The object *eod* is not a number (or a string or a boolean etc). It’s the value of a stream that has terminated. I think of it as the value read in (in UNIX) after you hit control-D.

The input convention is clear. A finite stream gets entered as an infinite one padded with *eod’s*. The output convention is also simple. When outputting a stream X, you (as usual) demand the values of X in order and output them. When the value returned is *eod*, you don’t produce anything, rather you terminate normally.

Lucid is based on an algebra: a set of values together with a set of operations on these values. If you extend the set of values, you have to extend the operations and say how they work if any of the operands are the new values.

Fortunately that’s simple for *eod*. The basic rule is that any data operation (like “+”) that is strict (needs all its arguments) returns *eod* if *any* of its arguments are *eod*. In other words

*eod+1 = 4+eod = eod+eod = eod*

For an operation like *if-then-else* that doesn’t need all its arguments, the first argument needed is sensitive on its own to *eod*:

*if eod then 3 else 4 fi = eod*

Otherwise it simply chooses between alternatives as usual:

*if true then 3 else eod fi = 3*

*if false then 3 else eod fi = eod*

*if false then eod else eod fi = eod*

Similarly

*false and eod = false*

*true and eod = eod*

*eod and eod = eod*

As for the space and time Lucid operations, they aren’t affected. For example, the value of *next X* at time *t* is still the value of *X* at time *t+1*, whether or not this value is *eod*. So if we’ve written an eductive interpreter, the only part that needs changing is the part that evaluates data operations.

For strict operations, the rule (above) is simple. For *if-then-else-fi* we first evaluate the test; if it is *eod*, we return *eod*. Otherwise we evaluate and return the chosen alternative. Similarly, to evaluate an *and* expression we evaluate the first operand; if its value is *eod* we return *eod*, and if it is *false* we return *false*. Otherwise, we evaluate the second operand and return its value.
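These evaluation rules can be sketched with Python thunks (zero-argument functions), so unneeded operands are never evaluated (names invented):

```python
EOD = object()

def ite(test, then_, else_):
    # if-then-else-fi over thunks: only the test and the chosen branch run
    t = test()
    if t is EOD:
        return EOD
    return then_() if t else else_()

def and_(a, b):
    v = a()
    if v is EOD:
        return EOD
    if v is False:
        return False          # false and eod = false
    return b()

assert ite(lambda: True, lambda: 3, lambda: EOD) == 3
assert ite(lambda: EOD, lambda: 3, lambda: 4) is EOD
assert and_(lambda: False, lambda: EOD) is False
assert and_(lambda: True, lambda: EOD) is EOD
```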

To conclude let’s consider the midterm grades problem. Suppose that we input the stream (a stream, rather than an array) of grades as G, padded with *eod’s* as above. Let’s say G is

*56, 79, 66, 85, eod, eod, eod, …*

(it’s a small class).

First we define a running sum

*S = first G fby S + next G*

so that S is

*56, 135, 201, 286, eod, eod, eod, …*

Notice that S is also ‘really’ a finite stream of the same length. Yet its definition doesn’t take finiteness into account. When the next value of S is computed by adding the next value of G to the current value, and this next value is *eod*, the resulting value of S is also *eod*.

Now we define a counter and form the running average A:

*N = 1 fby N+1*

*A = S/N*

so that A is

*56, 67.5, 67, 71.5, eod, eod, eod, …*

(because *eod/5* etc is *eod*).

All we need is the last value of A; the value when the next value is *eod*. At the moment we can’t do this because we can’t test for *eod*. The comparison *X eq eod* returns *eod*, because *eq* is a strict data operation.

We need a primitive that can gaze on *eod* without being turned to *eod*. We call it *iseod* and *iseod X* returns *true* if X is *eod*, *false* otherwise. Then we can define *last* as

*last(X) = X asa iseod next X*

and what we want is *last(A)*. However that’s all we want, so if we ask for

*last(A) fby eod*

we get one number and a normal termination, as desired. More eloquently, if

*just(Y) = first Y fby eod*

our output is *just(last(A))*.

There are more interesting operators like *last* that can be defined with *eod* and *iseod*. Also, what about intensions that are finite in space as well as time? I’ll talk about these next time.

Finally, you know how this post has to end.

*eod*


The late Ed Ashcroft and I discovered this possibility when we tried to “add arrays” to Lucid. Initially, we intended Lucid to be a fairly conventional, general purpose language. So we considered various ‘features’ and tried to realize them in terms of expressions and equations.

Static structures like strings and lists were no problem. We ran into trouble with arrays, however. We tried to design an algebra of *finite* multidimensional arrays (along the lines of APL) but the results were complex and messy to reason about.

Finally it dawned on us that we should consider infinite arrays – sort of frozen streams. And that these could be realized by introducing (in the simplest case) a space parameter *s* that works like the time parameter *t*. In other words, Lucid objects would be functions of the two arguments *t* and *s*, not just *t*. These things (we had various names for them) could be thought of as time-varying infinite arrays.

The details were pretty straightforward. We would add space versions of the temporal operators *first*, *next*, *fby* etc. The programmer as before could define variables with arbitrary expressions involving these operators. Let’s call the space operators *initial* (corresponding to *first*), *succ* (successor, like *next*), and *sby* (succeeded by, like *fby*).

In imperative languages arrays are usually created by loops that update components one by one. You can emulate this in Lucid. You need an update operator that takes an array, an index, and a value, and returns an array like the old one except that the value is now stored at the index. However we realized this was kludgey and likely inefficient.

The ‘idiomatic’ way to do it is to define the whole array at once, like we do for streams. Thus

*nums = 1 sby nums+1*

defines the array of counting numbers and

*tri = 1 sby tri + succ nums*

the array of triangular numbers.
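Reading **sby** as a spatial recurrence gives a quick Python check (names invented):

```python
def nums(s):
    # nums = 1 sby nums+1
    return 1 if s == 0 else nums(s - 1) + 1

def tri(s):
    # tri = 1 sby tri + succ nums
    return 1 if s == 0 else tri(s - 1) + nums(s)

assert [nums(s) for s in range(5)] == [1, 2, 3, 4, 5]
assert [tri(s) for s in range(5)] == [1, 3, 6, 10, 15]
```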

If we mix spatial and temporal operators, we can define entities that depend on both dimensions – that are time-varying arrays. Thus

*stot(A) = S where S = A sby S + succ A; end*

*P = 1 fby stot(P)*

gives us Pascal’s triangle (tilted to the left)

*1 1 1 1 1 …*

*1 2 3 4 5 …*

*1 3 6 10 15 …*

*1 4 10 20 35 …*

*1 5 15 35 70 …*

with the space dimension increasing to the right and the time dimension increasing towards the bottom.

The function *stot* is not defined recursively and we can eliminate it by applying the calling rule, giving

*P = 1 fby (S where S = P sby S + succ P end)*

and we can promote S to a global giving

*P = 1 fby S*

*S = P sby S + succ P*

The first equation implies that S is equal to next P and substituting in the right hand side of the second equation gives

*P= 1 fby S*

*S = P sby next P + succ P*

Now we can eliminate S to get

*P = 1 fby P sby next P + succ P*

So we can use the usual rules to transform two-dimensional programs. Even though there are two dimensions, the programs are still equations.
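The final one-equation form translates directly into a two-index recurrence; a Python sketch with memoization:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(t, s):
    # P = 1 fby P sby next P + succ P
    # row 0 and column 0 are all 1s; otherwise left neighbour plus
    # the value above
    if t == 0 or s == 0:
        return 1
    return P(t, s - 1) + P(t - 1, s)

assert [[P(t, s) for s in range(5)] for t in range(3)] == \
    [[1, 1, 1, 1, 1], [1, 2, 3, 4, 5], [1, 3, 6, 10, 15]]
```

The rows reproduce the tilted Pascal's triangle shown above.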

We can use the space dimension to generate primes without recursion. We define a stream of arrays in which on each time step the next array is the result of purging the current array of all the multiples of its initial element.

*N = 2 sby N+1*

*S = N fby S wherever S mod initial S ne 0*

*P = initial S*

This defines P to be the stream of all primes.
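A demand-driven Python sketch of the same sieve; `S(t, s)` is the array at time *t*, and each step keeps the survivors of the previous array (names invented):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def S(t, s):
    # S = N fby S wherever S mod initial S ne 0, with N = 2 sby N+1
    if t == 0:
        return s + 2                  # N: 2, 3, 4, ...
    head = S(t - 1, 0)                # initial S at the previous time
    survivors, i = [], 0
    while len(survivors) <= s:
        v = S(t - 1, i)
        if v % head != 0:             # purge multiples of the head
            survivors.append(v)
        i += 1
    return survivors[s]

def P(t):
    return S(t, 0)                    # P = initial S

assert [P(t) for t in range(6)] == [2, 3, 5, 7, 11, 13]
```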

Finally here is a program that crudely approximates heat transfer in an infinite metal bar. The bar is initially hot (temperature 100) at the left end (initial point) and 0 elsewhere. Thereafter at every timepoint each spacepoint receives or gives a small percentage of the temperature difference with its neighbour.

*eps = 0.1*

*B0 = 100 sby 0*

B = B0 fby 100 sby succ B + eps*(B - succ B) + eps*(succ succ B - succ B)

The output shows the bar gradually warming up as the heat travels from left to right.

*100 0.0 0.0 0.0 …*

*100 10.0 0.0 0.0 …*

*100 18.0 1.0 0.0 …*

*100 24.5 2.6 0.1 …*

*…*
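Reading the equations as recurrences reproduces the table; a Python sketch (invented names, memoization in place of the warehouse):

```python
from functools import lru_cache

EPS = 0.1

@lru_cache(maxsize=None)
def B(t, s):
    # B = B0 fby 100 sby succ B + eps*(B - succ B)
    #                            + eps*(succ succ B - succ B)
    if t == 0:
        return 100.0 if s == 0 else 0.0     # B0 = 100 sby 0
    if s == 0:
        return 100.0                        # the left end stays at 100
    return (B(t - 1, s)
            + EPS * (B(t - 1, s - 1) - B(t - 1, s))
            + EPS * (B(t - 1, s + 1) - B(t - 1, s)))

row = [round(B(3, s), 1) for s in range(4)]
assert row == [100.0, 24.5, 2.6, 0.1]
```

The asserted row matches the last line of the table above.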

The eduction process is easily extended to two dimensions. Instead of demands specifying a variable and a time coordinate, they specify a variable and a time coordinate and a space coordinate. These demands give rise to demands for possibly different variables at possibly different time and space coordinates.

There is one complication, however, involving the warehouse (cache). Some variables may be independent of one or more of the coordinates. For example, in the primes program above the variable *N* does not depend on the time coordinate. In principle, if we cache values of *N* tagged with both the time and space coordinates we risk filling the cache with duplicate values of *N* with the same space coordinates but with different time coordinates.

That doesn’t happen with the primes program because it works out that all the demands for *N* will have time coordinate 0, so no duplicates. Many programs are well behaved in this sense. But not all.

For example, in the heat transfer program, demands for *B* at different time points and space points will lead to demands for *eps* at different time- and space points. But these demands for *eps* at different contexts will all return 0.1, so if the results of these demands are each cached, we’re wasting time and space.

Avoiding this problem requires what we call *dimensionality* *analysis*. The dimensionality of a variable 𝒱 is the set of dimensions that it depends on; the least set of dimensions with the property that knowing the coordinates of these dimensions allows you to compute 𝒱.

If there are two dimensions *s* and *t* there are four possible dimensionalities:

*{}, {s}, {t}, {s,t}*

For example, if the dimensionality is *{t}*, we need to know the time coordinate but not the space coordinate.

In practice we can’t always compute the exact dimensionality because it could turn out e.g. that a complex looking expression always has the same value. But we can compute bounds on the dimensionalities and that’s almost always good enough.

I’ll leave dimensionality analysis to another post – it’s very similar to type inference. Applied to the prime program, for example, it finds that the dimensionality of *N* is *{s}*, of *S* is *{s,t}*, and of *P* is *{t}*. Applied to the heat program, it finds that the dimensionality of *B* is *{s,t}*, of *B0* is *{s}*, and of *eps* is *{}*.

This information allows us to cache and fetch the values of these variables with the minimum tags.
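One way to sketch this: key each variable's warehouse only by the coordinates in its dimensionality, so e.g. a *{}*-dimensional constant is computed and stored exactly once (all names invented):

```python
def make_cache(dimensionality):
    # cache values keyed only by the coordinates the variable depends on
    store = {}
    def lookup(compute, context):
        key = tuple(sorted((d, context[d]) for d in dimensionality))
        if key not in store:
            store[key] = compute(context)
        return store[key]
    return lookup

calls = []
def eps(ctx):                  # dimensionality {} -- a constant
    calls.append(ctx)
    return 0.1

eps_cache = make_cache(set())
assert eps_cache(eps, {'s': 3, 't': 7}) == 0.1
assert eps_cache(eps, {'s': 9, 't': 2}) == 0.1
assert len(calls) == 1         # one computation, one warehouse entry
```

Demands for *eps* at different contexts all hit the same single entry, which is exactly the saving dimensionality analysis buys.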

Notice that *P* and *N* are both one-dimensional – only one coordinate is required to compute a value. But they don’t have the same dimensionality. The *rank* of a variable is the number of coordinates needed. It is the cardinality of the dimensionality. Knowing the rank is not enough to tell you how to cache and fetch a value.

In future posts I’ll talk about what happens when there are a few more dimensions, or a lot more dimensions, or dynamic dimensions, or dimensions as parameters of functions.


The logician Willard Quine defined a paradox as an “absurd” statement backed up by an argument.

The famous result of Banach and Tarski definitely counts as a paradox by this definition. They proved that it is possible to take a unit sphere (a ball with radius 1), divide it into five pieces, then by rotations and translations reassemble it into *two* unit spheres.

Huh?

This would seem to be impossible, based on our experience of the physical world. What happened to conservation of volume? The original sphere had volume 4π/3, the five parts should have total volume 4π/3, but the two spheres have total volume 8π/3. Something doesn’t add up.

That’s literally true. Four of the pieces are so bizarre they don’t have a volume (technically, they are non measurable sets). Therefore you can’t add their volumes.

**Axiom of Choice**

I’ve said before that a paradox can often be understood as a proof by contradiction of one of the (often implicit) assumptions. One of the assumptions here is the additivity of volume. But the other is the *Axiom of Choice*.

The Axiom of Choice (AC) seems harmless at first. It says that if you have a collection of nonempty sets, there is a single function (a “choice function”) that assigns to each set an element of that set.

This seems reasonable and in line with our experience. If you have a bunch of bags each with some candies in them, there is certainly no problem collecting one from each bag (a child can do it and will only be too happy to oblige). Even if the candies in each bag are identical.

Trouble happens when the number of candy bags is uncountably infinite. Why should there be a uniform way of making this infinite number of choices?

**Nonmeasurable sets**

This trouble takes many forms. The Banach Tarski paradox is just one. AC also (obviously) implies that there are sets that don’t have a volume (or area, or length).

The supposed existence of nonmeasurable sets seriously complicates analysis. (Analysis is, roughly speaking, generalized calculus.) Analysis textbooks are full of results which state that such-and-such a procedure always generates a measurable set. If students ask to see an example of one of these mysterious objects that don’t have a volume (or area, or length), the instructor is in trouble. AC tells you that such sets exist, but says nothing about any particular one of them. It’s *non constructive*.

In fact it can be shown that almost any set that is in any sense definable (say, by logical formulas) is measurable. For example, all Borel sets are measurable. If authors simply assumed that all sets are measurable, the average text would shrink to a fraction of its size. And they wouldn’t get into trouble – it is not possible, without AC, to prove the existence of a non measurable set.

**Determinacy**

More trouble arises when we deal with infinite games. Finite games of perfect information (no hidden cards) are well understood. If ties are impossible, then one player ‘owns’ the game – has a winning strategy. (A strategy is basically a complete playbook which tells you what to do in each situation.) Zermelo, the Z in ZF, first proved this. This is called determinacy.

When we move to infinite games (in which the players alternate forever) AC causes trouble. As you can guess, AC implies the existence of nondeterminate games, in which every strategy for player I is beaten by some strategy for player II, and vice versa. Strange. Needless to say, I can’t give you a concrete example of a nondeterminate game. Once again, you can prove that almost any particular game that you can specify is determinate.

**Infinite voting systems**

My final example of a counterintuitive consequence of AC is the *ultrafilter theorem*. To avoid nerdy formulas, I’ll describe it in terms of voting.

Let’s say we have a finite group of voters

*P_{1}, P_{2}, P_{3}, …, P_{n}*

and they each vote Aye or Nay on a resolution. When do the Ayes have it? Obviously, when they have a majority (let’s count ties as the Nays having it). No problem.

When there are infinitely many voters, however, it is not so obvious what to do. A vote can be thought of as an infinite sequence of Ayes and Nays, e.g.

*Aye, Nay, Nay, Aye, Nay, Aye, Aye, Nay, …*

What constitutes a “majority” of an infinite set of voters? You could give it to the Ayes if there are infinitely many of them, but it is also possible that at the same time there are infinitely many Nays, in which case the Nays have grounds for complaint.

It’s useful to make a list of the properties such a voting system should have.

- If the vote is unanimous, then the result should be the same, whether Aye or Nay
- No ties: either the Ayes have it (have a majority), or the Nays do
- If a vote is held and one person changes their vote, the outcome is unaffected.
- If a vote is held and the Ayes have it, and then any number of voters switch from Nay to Aye, the Ayes still have it
- the union of two minorities is a minority, and the intersection of two majorities is a majority

Sounds doable, but how?

We already saw that making all infinite sets majorities won’t work, because their complements may be infinite. In the same way we can’t say minorities are all finite. We can’t choose one or even finitely many people as the deciders, because individual votes don’t count.

Hmmm.

Well, don’t try to solve this because you won’t succeed. It can be shown, again, that there is no concrete (definable) scheme that works. In particular, even if we use Turing machines that can perform an infinite sequence of steps in a finite amount of time (this makes mathematical sense), there is no voting program.

And yet the Axiom of Choice tells us that there is a voting method (not obvious). But don’t ask what it is, it’s a rabbit that AC pulls out of its hat.

**The nature of existence**

What to do about this?

We can retain AC and just live with the absurd Banach-Tarski result, with sets without volume (or area or length), with games that have no winner, and infinite voting.

But in what sense does, say, there *exist* a voting method? AC tells us we are free to imagine that there exists a voting method. Gödel showed that AC is consistent with ZF (assuming, as everyone believes, that ZF is consistent). That means we won’t get into trouble if we use it. But many of its consequences are unsettling.

AC means, for example, that you can say “I know that there is a voting method that works” but not “I know a voting method that works”. Of course this situation happens in real life. But in real life there’s the possibility of resolving the situation. If you know there is a wolf in the woods, you can go into the woods and find it. No use going looking for the voting method because you’ll never find it.

**Other choices**

Can we do without AC? To a point, yes. There are weaker forms that don’t have unsettling consequences. One is Countable Choice (CC), which says that given an (infinite) sequence

*S_{1}, S_{2}, S_{3}, …*

of sets there is a sequence

*x_{1}, x_{2}, x_{3}, …*

with each *x_{i}* an element of *S_{i}*. A somewhat stronger principle is Dependent Choice (DC), which allows each choice to depend on the choices already made.

CC or DC is enough to do most practical mathematics, including most analysis. However it is not enough for important foundational theorems. For example, DC is not enough to prove the completeness theorem for first order logic. (Which says that every formula is either provable or has a counterexample.) For completeness, you need a voting method.

Another possibility is the Axiom of Determinacy (AD) which says that every game has a winner. It has some nice consequences, for example, it implies that every set of real numbers is measurable.

But it also implies that ZF is consistent. This sounds nice, too, but is actually a disaster. It means that we can’t prove the consistency of AD with ZF (assuming the consistency of ZF). In fact it is not known whether ZF+AD is consistent. Not safe for work!

**AC, I can’t quit you**

What to do? I’m afraid I don’t have the answer. AC causes trouble but it also makes life a lot simpler. For example, it implies that any two orders of infinity are comparable. Without AC, cardinal arithmetic is chaos. Set theorists have tried to come up with a weaker version of the Axiom of Determinacy but so far nothing persuasive has appeared.

In the end, it’s an engineering decision. If we choose AC, we have a well ordered mathematical universe with very nice features but also some bizarre objects with properties that contradict our real life experiences. A kind of Disneyland but with monsters. If we reject AC, we have a chaotic, complex universe in which the normal rules don’t apply. A kind of slum with broken windows, collapsing stairways, and cracked foundations. A “disaster” as Horst Herrlich put it.

And there doesn’t seem to be a middle ground. DC fixes some of the cracks and makes a large part of the slum (e.g. analysis) habitable, but doesn’t make it a theme park.

One possibility is to treat AC as a powerful drug and take it only when necessary. Theorems should come with consumer labels saying what went into them. So if you see a box on the shelf of “Banach and Tarski’s Miracle Duplicator! Feed Multitudes!”, it will say on the back of the box “Contains AC”.

*This statement is false.*

If it’s true then it’s false, but if it’s false then it’s true … nothing works.

In my not-so-humble opinion, most (maybe all) paradoxes are the last step in a proof by contradiction that some unstated assumption is false.

In this case, the assumption is that the above statement is meaningful – is either true or false. The assumption is false, the statement is meaningless. End of paradox.

Of course, there’s more to it than that. Behind the Liar Paradox is a more general, and seemingly sensible assumption, that any statement that is syntactically correct is meaningful. Obviously, not the case. Here’s another example

*I do not believe this statement.*

If I believe it, then I don’t, and if I don’t, then I do.

It’s tempting to believe that self-reference is the problem, but there are plenty of self-referential sentences that are (or seem …) meaningful and true; e.g. “I know this sentence is true”.

To get to the bottom of this we need to formalize the paradox. This was first done by the famous logician Alfred Tarski (in 1936). In his formalization, the problem is the phrase “is true”.

More than 80 years later you can explain it without getting too technical. Imagine we have a formal logical language with quantifiers, variables, Boolean connectives, arithmetic operations and (this really helps) strings and string operations. Call this language ℒ. At this stage everything syntactically correct makes sense. For example, we can state that string concatenation is associative, or that multiplication distributes over addition.

Since we have strings, we can talk about expressions and formulas *in the language itself*. We can define a predicate (of strings) that is true iff the string is a syntactically correct formula. We can define an operation “subs” that yields the result of substituting an expression for a variable; more precisely, subs(f,g) is the result of substituting g for every occurrence of x in f. So far, no problem. Can we produce a formula that refers to itself? Not yet.

Gödel numbers? No need. The whole point of Gödel numbering is to show that you don’t need strings, you can represent them as (arbitrarily large) integers. This is important but not particularly interesting. In modern computer science terms, it means *implementing* strings as (arbitrarily long) integers, and nowadays (but not in the 30’s) everyone believes this without seeing the details.

So far so good. One last little step … and we go over the cliff. The last step is to add a predicate T of strings that says its argument is a formula and that this formula is true (with free variables universally quantified). Call the extended language ℒ+. T seems harmless enough, but with it we can reproduce the Liar Paradox.

Provided we can make a sentence refer to itself. This, not Gödel numbering, is the tricky part.

Since ℒ+ has strings, subs, and T, we can talk about whether or not a formula is true of itself (as a string). If a formula is *not* true of itself (it ‘rejects’ itself) let’s call it *neurotic*.

To see that we can define neurosis, let’s say that a formula Φ is true of a formula Θ iff Φ is true when all occurrences of the variable x in Φ are replaced by Θ (as a string constant). If we call the result of this substitution Φ[Θ], then to say that Φ is true of Θ is to say that Φ[Θ] is true.

Then let Ψ be the formula

*¬T(subs(x,x))*

It should be clear that Ψ says that its argument is neurotic. What about Ψ, is it neurotic? Is Ψ[Ψ] true or false?

On the one hand, if it’s false, then by definition of neurosis Ψ is neurotic. But since Ψ tests for neurosis, Ψ[Ψ] should be true. On the other hand, if Ψ[Ψ] is true, then since Ψ tests for neurosis, Ψ is neurotic. But this means by the definition of neurosis, Ψ is not neurotic. No way out. (You may recognize this as a variant of the “barber who shaves all those who don’t shave themselves” paradox.)

Thus Ψ[Ψ] is our liar sentence. I can tell you exactly what it is; it’s

*¬T(subs(“¬T(subs(x,x))”,“¬T(subs(x,x))”))*

and is, by my count, 41 characters long.
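The construction is mechanical enough to check in a few lines of Python. This is only a sketch: the toy subs below just does textual substitution of the variable x, quoting its second argument as a string constant.

```python
# Toy version of subs: substitute g, as a quoted string constant,
# for every occurrence of the variable x in f.
def subs(f, g):
    return f.replace("x", '"' + g + '"')

psi = "¬T(subs(x,x))"      # the formula Ψ
liar = subs(psi, psi)      # Ψ[Ψ], the liar sentence
print(liar)                # ¬T(subs("¬T(subs(x,x))","¬T(subs(x,x))"))
print(len(liar))           # 41
```

Note that "subs" itself contains no x, so the substitution hits exactly the two variable occurrences.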

We can make the argument clearer (if not as precise) using our functional shorthand. We define Ψ by the rule

*Ψ[Φ] = ¬Φ[Φ]*

Then

*Ψ[Ψ] = ¬Ψ[Ψ]*

Those who are familiar with the λ calculus or combinatory logic will detect the Y combinator behind this argument. The combinator Y is, as a λ-expression,

*λF (λx F(x x))(λx F(x x))*

It’s called a fixed point combinator because YF reduces to F(YF); YF is a fixed point of F. The ISWIM (where-clause) version is much easier to understand:

*G(G) where G(x) = F(x(x))*
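In an eagerly evaluated language the textbook Y loops forever, so a runnable sketch needs the η-expanded variant (often called Z). Here it is in Python – my example, not from the original – tying the recursive knot for factorial without any named recursion:

```python
# Z combinator: the eta-expanded Y, safe under eager evaluation.
Z = lambda F: (lambda x: F(lambda v: x(x)(v)))(lambda x: F(lambda v: x(x)(v)))

# F maps a "recursive call" f to the body of factorial; Z ties the knot.
fact = Z(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
print(fact(5))  # 120
```

The definition of fact never mentions fact: all the self-reference comes from Z applying a function to itself, just as Ψ[Ψ] applies a formula to itself.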

Working back from this contradiction, we conclude that we can’t consistently add a truth predicate to our basic language ℒ. That in turn means that we can’t define T in ℒ, otherwise ℒ itself would be inconsistent. That’s what Tarski meant when he called his result “the undefinability of truth”.

Can we salvage anything from this? Yes, and this is due to Tarski and Saul Kripke.

There is no harm in applying T to formulas that don’t use T; the meaning is obvious. Call the language allowing this ℒ′. Similarly, applying T to ℒ′ formulas is ok; call the language where this too is allowed ℒ″. We can create a sequence ℒ, ℒ′, ℒ″, ℒ‴, … (this is Tarski’s hierarchy).

We can throw these all together, producing a language ℒ*. But then we can create ℒ*′, ℒ*″, etc. Generalizing, we get a hierarchy indexed by the countable ordinals (don’t ask). Kripke’s proposal was to define a single language with a single truth predicate in which anything goes, but in which sentences not caught up in this construction have an intermediate truth value. Thus Ψ[Ψ] would be neither -1 (false) nor +1 (true) but 0, with 0 being its own negation. I’ll let you decide whether this makes Ψ[Ψ] meaningful after all.

Sentences that have a conventional truth value Kripke calls *grounded*; those, like the liar sentence, *ungrounded*. You can think of the ungrounded sentences as those in which evaluation fails to terminate. Notice that “this statement is true” is ungrounded. (Kripke found a way around this but I won’t go into the details.)

Finally, infinitesimal logic can shed some light on groundedness. If we redefine T(f) to be the truth value of f *times 𝛆*, and evaluate over the infinitesimal truth domain

-1, -𝛆, -𝛆², -𝛆³, … 0 … 𝛆³, 𝛆², 𝛆, 1

then we get a more nuanced result. The power of the infinitesimal tells us roughly how many layers of truth predicate we have to go through to decide between true and false.


Basically I said that Gödel’s results proved that no fixed set of facts and rules can on their own form the basis of mathematical knowledge. I said that hard-earned experience is indispensable. That mathematics is ultimately an experimental science. (This is not the usual take on Gödel’s work.)

But grammar? For natural languages, it’s the same story. Forget about semantics (meaning). Just the syntax of a natural language like English is infinitely rich and can’t be described by any manageable set of facts and rules. The same goes (sorry) for Go-the-game-not-the-programming-language. To master them you need judgement and experience.

Does this mean facts and rules are not as important as we might think? Actually no, they’re indispensable. In fact they are a vital part of what makes us human!

Formal grammars were invented by Chomsky and, independently, by the Algol committee, to describe natural languages and to specify Algol respectively. They worked spectacularly well for Algol and got off to a good start for natural languages.

For example, one rule that covers a lot of sentences in English (and other languages) is

*<sentence> ::= <noun phrase> <verb phrase> <noun phrase>*

But already you have trouble, because in many languages the verb phrase has to agree with the subject noun phrase in number. So you need two rules

*<sentence> ::= <singular noun phrase> <singular verb phrase> <noun phrase>*
*<sentence> ::= <plural noun phrase> <plural verb phrase> <noun phrase>*

In Russian (in the past tense) the verb phrase has to agree with the noun phrase in terms of gender (there are three in Russian). Six rules.

Let’s stick to English and concentrate on noun phrases. One big thing to deal with is the definite article “the”. Native speakers don’t think about it, but there are rules for “the”. For example, it does not precede personal names, like “John” or “Alison”. Or names of organizations, like “IBM”. Oh wait, what about “The Government” and “the BBC”? Hold on, you don’t say “the NBC” … ???

I have no idea what the rules are. I’ve had many students whose native language (Chinese, Farsi, Korean, … ) has no definite article. I often have to correct their usage and it seems they are always coming up with new ways to get it wrong.

So there seems to be a kind of incompleteness phenomenon here. No matter how many facts and rules you discover, there’s always a sentence that is idiomatic but not covered by these facts and rules.

This is what torpedoed early AI efforts in natural language processing. They were based on grammar and logic and failed because you never had enough facts and rules.

The first efforts at playing games like Chess or Go were also based on facts and logic. The main rule is that the value of a position for one player is the negative of the value of the least favourable (for the other player) position arrived at in one move (whew!).

That rule, and a whole bunch of facts about who wins a terminal position, in principle is enough. But not in practice.
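That one rule is short enough to write down directly. Here is a minimal sketch (the function names moves and value are mine, not from any actual game program):

```python
def negamax(position, moves, value):
    """Value of a position for the player to move: the negative of the
    value of the least favourable (for the other player) successor --
    i.e. the max over moves of the negated value of the next position."""
    successors = moves(position)
    if not successors:               # terminal position: look up who wins
        return value(position)
    return max(-negamax(s, moves, value) for s in successors)

# A three-node toy game: from "a" the mover can go to "b" or "c";
# vals gives the value of a terminal position for the player to move there.
tree = {"a": ["b", "c"], "b": [], "c": []}
vals = {"b": 1, "c": -1}
print(negamax("a", lambda p: tree[p], lambda p: vals[p]))  # 1
```

The "in practice" problem is that for Chess or Go the recursion tree is astronomically large, which is why heuristic evaluation has to cut it off early.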

So instead you need heuristic rules to evaluate positions. (In Chess, having passed pawns, controlling the centre, material superiority etc etc). IBM managed to make this work for Chess but for Go it was hopeless. Too many possible moves, too much context to take into account.

And yet there is AlphaGo, which has beaten the world champion. How does it work?

I don’t know. It uses neural networks to process hundreds of thousands of professional games and millions of games it plays with itself. The only facts and rules that humans give it are (as I understand it) the rules of the game. Maybe not even that – the explanations of AlphaGo are vague, probably because of commercial secrecy.

However, I think I can explain the success of AlphaGo (and, recently, Google translate) by an appeal to human psychology. Specifically, to the notions of *conscious* and *unconscious*.

It’s generally agreed that the brain works in both conscious and unconscious modes. Most of the processing is in the unconscious mode and we are (needless to say) unaware of it. How does the unconscious work? Not clear, though it may involve thrashing out contradictory tendencies.

The unconscious communicates with us through feelings, intuitions, hunches, judgement, perception, aesthetics, reflexes …

Anyone who has taken Go seriously will be amazed at how experts talk about the game. They use concepts like *strength*, *thickness*, good and bad *shape*, even (I’m not making this up) *taste*. Teachers encourage their students to play quickly, relying on instincts (reflexes). Learning Go is not so much about memorizing facts and rules as training your unconscious. Maybe AlphaGo works the same way, by simulating an unconscious and training it.

What then is left for the conscious? Guess what – facts and rules.

I’m convinced that the conscious, rational part of the mind works in a machine-like fashion, using and manipulating facts and rules, devising and following step-by-step protocols (algorithms). This is either an important insight or a banal observation and I’m not sure which.

I’m not saying that people are machines. We rely on our unconscious, which apparently does not work sequentially. The conscious and unconscious work together and make a great team. Only if you consciously ignore your feelings do you become a soulless robot (though there’s a lot of that about).

For example, in mathematics we first discover facts and rules by insight based on experience. Once we’ve found some we have confidence in, we then consciously apply them, draw consequences through step-by-step reasoning and leap way ahead of what we could discover by experience alone.

It’s this teamwork that gives us such an advantage over animals, who act almost completely unconsciously. It’s what makes us human. It gives us the freedom to choose between doing what we feel like doing – or, if it’s not the same thing, doing what is best. It gives us free will.


This result, known as Gödel’s Theorem, has a lot of formal and informal consequences. It means there is no computer program that can infallibly decide whether or not a statement about arithmetic is true or false. It means we will never know everything about arithmetic, though we may know more and more as time goes on. It means, however, that this knowledge will not come about purely as a result of manipulating formal facts and rules. We will have to rely on other sources, including experiment.

Even more interesting is the fact that this situation – the limits of facts and rules – reappears in other domains, including games, natural language, and even psychology.

Experiments? What can mathematicians learn from experiments? Experiments aren’t useless, they can, for example, lead to conjectures. But unless these conjectures are proved, how can they contribute to mathematical knowledge?

It all depends on what you mean by experiment. Almost all conventional mathematics can be done in an axiomatic system called Zermelo-Fraenkel set theory (ZF), sometimes with the Axiom of Choice (ZFC). (If you want details consult Wikipedia). It’s obviously crucial that the facts and rules of ZFC be consistent (non-contradictory). Otherwise every statement (and its opposite) can be formally derived.

Yet Gödel’s results imply that the consistency of ZFC cannot be proven in ZFC; in other words, the consistency of ZFC is not a theorem of conventional mathematics. Nevertheless we believe it, because we use ZFC. People win prizes and piles of money for proving things in ZFC. If ZFC were inconsistent, this would be money for old rope. So it’s safe to say that mathematicians strongly believe that ZFC is consistent.

But believing is not knowing, you say. The consistency of ZFC is not real mathematical knowledge. But what about all the results proved from the facts and rules of ZFC? They’re all tainted because they all assume consistency. So, strictly speaking, we cannot say we “know” that the four colour theorem is true even though there is now a proof. Strictly speaking, we only believe it.

It’s tempting to take refuge in simpler systems, like Peano Arithmetic (PA). PA consists of a handful of simple rules, basically the inductive definitions of the arithmetic operations. To these we add the principle of mathematical induction: to prove P(n) for all n, prove that P(0) is true, and prove that P(n+1) is true assuming P(n) is true.
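As a worked instance of the recipe (my example, not the post’s): take P(n) to be the statement 0 + n = n, with + defined inductively as in PA.

```latex
% P(n) is the statement 0 + n = n, with + defined inductively by
%   m + 0 = m   and   m + S(n) = S(m + n)
% Base case P(0):
0 + 0 = 0                  % first defining equation of +
% Inductive step: assume P(n), i.e. 0 + n = n. Then
0 + S(n) = S(0 + n)        % second defining equation of +
         = S(n)            % induction hypothesis P(n)
% So P(n+1) holds, and by induction 0 + n = n for all n.
```

Note that even this triviality genuinely needs induction: 0 + n = n is not one of the defining equations.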

All sensible and reliable. But how do we *know* that mathematical induction works?

In short, we know that induction is valid because (1) it makes complete sense, and (2) it has never let us down. In other words it *feels* right and in our (extensive) experience works *in practice*.

This last paragraph is not formal mathematics. We are invoking judgement and experience.

Nevertheless I would argue that we have the right to say we “know” (not just believe) that induction is valid. Because we believe with at least the same degree of certainty that the law of gravity is valid. And for the same reasons – judgement and experience. The law makes sense and has never let us down. We *know* it.

For that matter, how do we know that x+y=y+x is valid or even that 23+14=37 is true? By insight and experience. (These are not obvious to kids learning arithmetic.)

In other words, mathematics, like physics, has an empirical element. ZFC has been around for about a century and we can consider the experience of using it as a big experiment. The hypothesis is that ZFC is consistent and in a century of intensive use no contradiction has shown up. Hypothesis confirmed!

There are other formal systems of set theory that are strong enough to do mathematics. One is Gödel-Bernays (GB) which has the advantage that there are only finitely many axioms (ZF has axiom schemas). But GB is equiconsistent with ZF (and ZFC): if one is consistent they all are. So we can use it with confidence.

There is another system, namely Quine’s New Foundations (NF). It’s not known to be equiconsistent with ZFC so the results of the century-long experiment don’t necessarily apply. NF makes sense and hasn’t produced a contradiction, but we don’t have nearly the same experience with NF as we have with ZFC. This means we can’t have nearly the same confidence in results obtained using NF that we do in results obtained using ZFC.

OK, but what does all this have to do with games, natural language, and psychology? Well, this is where it gets interesting … but look at that word count! This post is already too long.

I promise to take this up in the next post, which will be real soon. But think about facts and rules vs feelings and experience and you can probably figure it out for yourself.

The MHC has a big brother, the Hybrid Predicate Calculus (HPC), which (apparently) has the power of full predicate logic. But at a certain point, it gets weird!

The basic idea is simple enough, you expand MHC by allowing property constants to have extra arguments (still on the left). For example, to say that Socrates (s) likes Plato (p) you write

spL

Notice the verb comes last – the HPC is an SOV language, like Japanese. (This means the typical simple sentence has a subject, an object, and a verb, in that order).

As we saw, the MHC has expressions that correspond to natural language quantifier phrases, such as “All Greeks”. The HPC has them too, and they can be used like nouns. Thus

[G]pL

says that all Greeks like Plato.

The HPC allows partial application – a relation constant (like L) does not have to have a full set of arguments. Thus L on its own denotes the liking relation, pL denotes the property of liking Plato, and thus spL can be understood as saying that Socrates has this property. In other words, that Socrates likes Plato.

Since pL is a property, we can form the quantifier phrase [pL], which clearly means “everyone who likes Plato”. Thus

[pL]aL

says that everyone who likes Plato likes Aristotle.

In this way we can nest brackets and say things that in conventional logic require nested quantifiers. 〈A〉L is the property of liking some Athenian. [〈A〉L] therefore means “everyone who likes some Athenian” and

[〈A〉L]sL

says that anyone who likes some Athenian likes Socrates.

The students and I managed to say some pretty complex things without bound variables. One of my bonus questions was

Every student registered in at least one course taught by professor Egdaw is registered in every course Pat is registered in.

We use R as the registered-in relation, T as the teaches relation, e for professor Egdaw, and p for Pat.

However, we immediately run into a problem: how to express the property of being a course taught by professor Egdaw. eT doesn’t work, it’s the property of teaching professor Egdaw. What we need is the taught-by relation, the converse of T. There is no way of doing this with what we have. But there’s an easy fix: add the converse operator. We denote it by the (suggestive) tilde symbol “~”. In general, ~K is the relation K with the first two arguments swapped, and in particular ~T is the taught-by relation (“~” often translates the passive voice).

We can now proceed in stages. e~T is the property of being taught by professor Egdaw and 〈e~T〉 is “some course taught by professor Egdaw”. 〈e~T〉R is the property of being registered in some course taught by (the brilliant) Egdaw, and [〈e~T〉R] is “every student registered in a (at least one) course taught by professor Egdaw”.

Now for the second main quantifier phrase. p~R is the property of being a course Pat is registered in, and [p~R] is “every course Pat is registered in”. We simply put the two phrases one after the other, followed by R, and get

[〈e~T〉R][p~R]R

I used to think this was hard but actually it’s pretty straightforward. It’s interesting to compare it with the first order logic formalization

∀s(∃c(R(s,c) ∧ T(e,c)) → ∀c(R(p,c) → R(s,c)))

Of course, it was not a good sign that we had to introduce a new feature (the converse operator). Will other examples require other features? How far will this go?

Just a bit further. We run into another problem if we try to say “Plato likes all Athenians who like themselves”. How do we express the property of liking oneself?

Again, impossible with what we’ve got. We have to introduce a sort of self operator /. /K is like K except the first argument to K is duplicated. /L is the property of liking yourself and the expression we’re looking for is

p[A∧/L]L

We need one more operator that has the effect of ignoring an argument. We use “*” and *K is like K except *K ignores its first argument, its second argument is the first given to K, its third argument is the second given to K, and so on. We’d need it to express e.g. the relation of Spartans liking Athenians. The expression *S∧A∧L denotes the relation that says that its second argument, who is Spartan, likes its first argument, who is Athenian.

Perhaps these equivalences will help

cba~K ⇔ cabK

cba/K ⇔ cbaaK

cba*K ⇔ cbK

You can think of these three operators as replacements for things you can do with arbitrary use of variables. First, variables can be out of order, hence ~; variables can be duplicated, hence /; and variables can be omitted, hence *.
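You can see how this might cash out in the relational-table reading mentioned at the end. Here is a toy model, entirely my own sketch: relations are Python sets of tuples, the innermost HPC argument binds the first column, and a sentence is true iff evaluation leaves the empty tuple () in the result.

```python
def app(a, K):
    """aK: apply argument a, binding K's first column."""
    return {t[1:] for t in K if t and t[0] == a}

def conv(K):
    """~K: the converse -- first two arguments swapped."""
    return {(t[1], t[0]) + t[2:] for t in K}

def dup(K):
    """/K: first argument duplicated, so a/K behaves like aaK."""
    return {(t[0],) + t[2:] for t in K if len(t) >= 2 and t[0] == t[1]}

def every(P, Q):
    """[P]Q as a sentence: everything satisfying P satisfies Q."""
    return P <= Q

# L = likes, stored as (liked, liker) so that pL is "likes Plato".
L = {("p", "s"), ("p", "p"), ("a", "s"), ("a", "p")}

print(() in app("s", app("p", L)))        # spL: Socrates likes Plato -> True
print(() in app("p", dup(L)))             # p/L: Plato likes himself -> True
print(app("s", conv(L)))                  # s~L: those Socrates likes: p and a
print(every(app("p", L), app("a", L)))    # [pL]aL -> True in this model
```

The column convention (innermost argument first) is an assumption on my part; the point is only that each operator is a one-line table manipulation.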

The only problem is that these three operators provide only simple cases of swapping, duplication, and omission. What if you need, say, to swap the third and second arguments, duplicate the third or omit the fourth? Don’t you need whole families of operators?

Not quite; instead, we add a meta operator that generates these families. In general, if ⊗ is an operator then ⊗’ is the operator that works with the indexes of the arguments shifted by one. Thus ~’ swaps the third and second arguments, /’ duplicates the second argument, and *’ omits the second argument and shifts the higher ones. This gives the equivalences

dcba~’K ⇔ dbcaK

dcba/’K ⇔ dcbbaK

dcba*’K ⇔ dcaK

The meta-operator ‘ can be iterated (e.g. /”’) and can also be applied to any quantifier phrase.

I’m pretty sure that this is enough, that anything that can be said in first order logic can be said in HPC. The only problem is, sometimes the result looks like a dog’s breakfast. The experience has been that assertions that can be expressed simply in natural language can be expressed simply in HPC, but that more technical statements (like the axioms of set theory) are often incomprehensible.

There’s more to the story. For example, HPC can be easily understood in terms of operations on relational tables. But I’ll leave that to a future post.


I use the blackboard. I hate powerpoint, as do many students. For one thing it’s a lot of preparation work. Also it’s too easy to present way too much information. Click, click, click, each slide crammed with information. The blackboard slows you down to just the right pace.

Part of what got me thinking about video was a mainly good experience with still photography. I would put an effort into writing clearly and laying the blackboard panes out neatly, then I would take pictures. I would post them on line and thus the students had lecture notes.

What should have tipped me off is my discovery that even taking still photos of blackboards is not straightforward. The naive approach is to stand right in front of the board, point the camera at it, and press the shutter.

Two things can go wrong, depending on what happens next. If the flash doesn’t go off, chances are there won’t be enough light on the board. Your camera (which of course you’ve set on automatic) uses a long exposure and the image is blurred.

On the other hand, if the flash goes off, there’s a big bright spot in the center of the photo where the flash reflects off the board. Nothing in the middle of the board is readable and photoshop won’t fix it.

Experienced photographers know how to do it properly. One solution is to set up a remote flash that illuminates the board off centre. However I didn’t have one and didn’t want the hassle of setting it up for every class.

The other solution is to use a tripod (and no flash), but again I didn’t want the hassle of hauling equipment to class and setting it up.

Finally I came up with a third solution: take the picture (with flash) from just off to the side, and a few steps back. In the resulting photo there is no glare spot, though the board is distorted. Fix the distortion with photoshop skew, and you’re in business. A bit of extra work, but worth it. I used this for several classes over many terms.

**On to video**

Encouraged by my still photography experience, I decided to move on to video. All I needed was a camera and (unavoidably) a tripod. I already had a tripod and borrowed a video camera from a friend (who used it to record things like birthdays).

I brought them in to class, set them up, aimed the camera and turned it on. I began lecturing as usual … couldn’t wait to see the result.

Which was awful. The image was low definition (probably 480p) and was useless because you couldn’t read the blackboard.

OK, I need a camera that can take high definition movies. I settled on a Canon digital SLR – a Rebel T1i (the series is now up to the T6i). I bring it to class, set it up, aim it, turn it on, lecture, then view the final result.

Which is OK till about halfway through the lecture, when the video stops in mid-sentence. Huh? I didn’t turn it off!

Out of desperation I consulted the manual and soon found the problem: it can only take 30 mins of video at a time, something to do with buffers filling up (it doesn’t matter how much the card will hold). After 30 minutes it stops recording. And the worst part is, it does so silently, without the slightest beep. There is no remedy and all the comparable cameras work the same way.

Also, the results weren’t that great. I had to place the camera far enough back that both blackboard panels appear. That meant they were both small and hard to read and most of the image was wasted.

So it was just not practical to simply set the camera up and let it run for an hour.

**Plan B**

The backup plan that emerged was to move the camera up to one of the blackboard panels (usually the left one) and have it fill the viewfinder. Also, I got a remote control for the camera so I could turn it off and on without leaving the blackboard. The idea was to record not the whole lecture, just highlights a few minutes long each.

This worked pretty well. No danger of the camera switching off, and the writing on the board was clearly readable.

I didn’t want to go back and forth pointing the camera at each of the two panels in turn. No need – the classrooms I was in had sliding panels. So I’d work on one panel and film, then turn the camera off, slide the right panel to the left, then start lecturing and filming again.

The trouble began after the filming. I had to stitch together these short filmlets. Not hard – though I had to learn iMovie to do it.

The next step is to actually watch the resulting movie, and if you follow in my footsteps you’re in for a shock.

The first shock is seeing yourself as you really are. You may look fatter than you imagine, or older than you imagine. Maybe you notice a sort of smug smirk you didn’t know you had, or some annoying mannerism (none of this applies to me, of course). Unfortunately iMovie can’t help here – it’s not photoshop. In time you’ll get used to you (or not; many people don’t even want to start filming for fear of what they’ll see).

**Yawn**

The second shock is realizing just how boring your lecturing can be. For example, perhaps you go way too slow, or give too much detail. But you can fix this.

One seemingly unavoidable source of boredom is writing on the board. I said earlier that the blackboard slows you down and this is a good thing. Not always.

Suppose I’m doing a logic course and want to deal with the resolution method. I announce that we’re going to look at the “Resolution Method” (so far so good). The next thing I do is turn to the blackboard and start writing RESOLUTION METHOD on the board. Except it seems in the movie to take about a week.

R – E – S (clack clack clack) O – L – U (clack clack clack) …. H – O – D

(“clack” is the sound of chalk hitting the board).

And all the time what are you looking at? My butt.

Here, iMovie can help. You cut out most of the clack-clack and crossfade the ends together. Then it’s magic. I announce Resolution Method to the class, turn to the blackboard, raise my chalk, clack … whoosh … RESOLUTION METHOD appears instantaneously, and I turn back to the audience.

In the end the videos began to look quite slick – you can see one of the more popular ones here.

**Epilogue**

So where are all these slick videos? I left some on youtube, you can find them by searching for “billwadge”. But there were never that many.

The reason is that once the fun wore off, I realized that it was all a lot of work. Hauling the equipment to class, setting it up, sliding the panels, turning the camera on and off, editing in iMovie – worse than preparing powerpoint.

My conclusion is that it’s not practical without help. Ideally, you need a camera operator if not two cameras and two operators. Plus someone to do good video editing. Plus maybe some good lighting.

Anything less is too much work for the poor lecturer and not helpful enough to be worth the effort.

[FADE TO BLACK]
