The Ballad of the Lambda Calculus
Computer Science can be said to predate even the idea of a computer by at least two millennia. Before Babbage started designing his calculating engines (for simplicity, I will here ignore the mechanical calculation devices which predate Babbage), mathematicians had long been analyzing “algorithms”: mathematical procedures which behave like functions and are calculated by following a specific, completely prescribed sequence of steps. In fact, although the name was not given to them until the 17th or 18th century, algorithms go back at least as far as Euclid, around 300 BC. Since algorithms, the fundamental basis of computer programming, are essentially computer programs without the syntax, it at first seems surprising that they would predate machines capable of running them by so much. But this really isn’t so unusual, since an algorithm can in principle be worked by a human with a sheet of paper; all that is necessary for the properties of an algorithm to hold is that something, mechanical or human, follows the steps without room to insert any will of its own.
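Euclid’s greatest-common-divisor procedure makes a good concrete example of all this. As an illustration (mine, transliterated into Python; Euclid of course stated it in terms of repeatedly subtracting line segments), the whole algorithm is:

```python
def euclid_gcd(a, b):
    """Euclid's algorithm (Elements, c. 300 BC): the greatest common
    divisor of a and b, found by repeated division with remainder."""
    while b != 0:
        a, b = b, a % b   # replace the pair with (b, remainder of a / b)
    return a
```

A human with paper, or a machine, can follow these steps with no judgment calls at all; that is exactly the property that makes it an algorithm.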
Work on understanding algorithms before the 20th century, however, was stymied by a lack of rigor in what an algorithm was understood to be. Though it did occur to mathematicians hundreds of years before Babbage that algorithms are not just things to design but objects in their own right, whose properties can be meaningfully analyzed, all such analysis was ultimately doomed by the intuitive (rather than rigorous) nature of the notion. For example, I say above that an algorithm consists of a series of “steps”. But what, exactly, is a “step”? For a machine, such as a computer, a “step” can easily be understood to be a machine instruction. Instructions for humans, on the other hand, do not come in discrete units: if step one in an algorithm is “write down a number”, is that actually one “step”, or should we break it down further into “pick up a pencil”, “pick a number”, “position hand over paper”, and so on? This sounds nitpicky, but it’s crucially important, since without knowing what a step is it is of course impossible to ask any of the questions about algorithms which concern steps, questions like “if this algorithm is executed, how many steps will it take?”. What early attempts to understand algorithms failed to grasp was that algorithms cannot really be meaningfully studied in the absence of some specific “model of computation”: a formal system by which algorithms can be specified according to some exact scheme and then deterministically executed.
The path that led to this realization was arguably first set off by David Hilbert, who (in the 1928 logic textbook written with Wilhelm Ackermann, in the same spirit as his famous 23 problems of 1900) asked:
The Entscheidungsproblem [decision problem for first-order logic] is solved when we know a procedure that allows for any given logical expression to decide by finitely many operations its validity or satisfiability … The Entscheidungsproblem must be considered the main problem of mathematical logic.
This question was eventually answered in 1936, before the first computer had ever been built, in a famous paper written by one Alan Turing; it is in this paper that the “Turing machines” which now bear his name were first defined. The answer was not the one Hilbert would have hoped for. Turing defined a simple, abstract machine which could in principle be built, but would in practice be hopelessly impractical; the point of this machine was simply to provide a sort of least common denominator for computing devices. Studying this machine provided a number of surprises. The surprise which would have disappointed Hilbert was that his question was unanswerable as posed: there are logical expressions whose validity or satisfiability cannot be decided in finitely many operations.
The surprise more interesting to us, however, was that the machine turns out to be universal. The Turing machine, despite its simplicity, has all the capabilities of any finite-time deterministic machine which can be built or even imagined. A related surprise was that any machine capable of emulating a Turing machine is, itself, just as powerful as a Turing machine. This gives rise to the idea of a “universal model of computation”. There are countless such models, and the Turing machine was simply the first to be known. Any such universal model provides the rigor needed to analyze algorithms; whichever one you choose, or whichever one you mechanically implement, things will behave in certain important ways the same.
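To make the idea concrete, here is a minimal sketch of a Turing machine simulator (my own illustration, not any particular historical formulation): a transition table, a tape, and a head, with the machine halting whenever no rule applies. The example machine at the bottom is a deliberately trivial one that sweeps right and inverts every bit.

```python
from collections import defaultdict

def run_turing_machine(rules, tape, state, blank="_", max_steps=10_000):
    """Run a Turing machine. `rules` maps (state, symbol) to
    (next_state, write_symbol, head_move); the machine halts when no
    rule applies. Returns the non-blank portion of the final tape."""
    cells = defaultdict(lambda: blank, enumerate(tape))  # infinite tape
    head = 0
    for _ in range(max_steps):
        key = (state, cells[head])
        if key not in rules:        # no applicable rule: halt
            break
        state, cells[head], move = rules[key]
        head += move
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example machine (hypothetical, for illustration): scan rightward,
# inverting each bit, and halt at the first blank cell.
invert = {
    ("scan", "0"): ("scan", "1", +1),
    ("scan", "1"): ("scan", "0", +1),
}
```

The point of universality is that a table like `rules`, fed to a simulator like this, suffices to express any finite-time deterministic computation whatsoever; the simulator itself never needs to change.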
Interestingly, however, though we tend to think of it as sort of the platonic form of a computational model, the Turing machine was not the first universal model of computation ever invented. The first was in fact invented, without anyone having fully realized its significance, by Alan Turing’s teacher, Alonzo Church. Years before Turing’s paper, Church had defined a “Lambda calculus”: a sort of way of defining trees of functions that call other functions, equipped with a rule for simplifying these trees as if the functions’ values were being evaluated. (A function’s value turns out to be another tree of functions that call other functions.) The lambda calculus is if anything even simpler than the Turing machine, giving it a valid claim to being the more fundamental of the two; and it has a certain strange beauty, in that “programs” written in it build all the normal accouterments of programming out of nothing but functions. (Numbers, for example, are encoded by chaining functions together in such a way that they “count” themselves off when evaluated.) Unfortunately the lambda calculus is completely alien to anything encountered in ordinary experience, and is infamously difficult to understand; so difficult, in fact, that the proofs that the Lambda calculus is exactly as powerful as the Turing machine were supplied not by Church himself but by others (his student Stephen Kleene, and Turing, shortly after Turing’s paper appeared). Because it failed to gain historical precedence, and because it is neither as easy to grasp nor as close in resemblance to a normal computer as the Turing machine, the Lambda calculus is today largely unknown, although in an odd twist a rephrasing of it (combinatory logic, the calculus of combinators) is still used today deep in the guts of the virtual machines of certain so-called “functional programming” languages, as a sort of abstract “machine code” the virtual machine evaluates.
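To make the numbers-as-functions idea concrete, here is Church’s encoding of the natural numbers transliterated into Python (the encoding is Church’s; the helper names are mine). The numeral n is simply “the function that applies f to x, n times”, and arithmetic falls out of composing applications.

```python
# Church numerals: the number n is "apply f to x, n times".
zero = lambda f: lambda x: x            # (λ (f) (λ (x) x))

def successor(n):
    # n+1 applies f one more time than n does
    return lambda f: lambda x: f(n(f)(x))

def add(m, n):
    # m+n applies f n times, then m more times
    return lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # make a numeral "count itself off" by handing it +1 and 0
    return n(lambda k: k + 1)(0)

one = successor(zero)
two = successor(one)
```

Nothing here is a number in Python’s sense until `to_int` asks the numeral to count itself; until then it is functions all the way down, exactly as in the calculus.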
The Lambda calculus looks like this:
((λ (a b c d e f g h i j k l) ((λ (m n o p q r) ((λ (s t u) ((λ (v w x) (λ (y) (λ (z) (i (v z) z (n z (f c) (λ (aa ab) (j (k aa) ab)) l (t q (x (w y z) (s q (((f c) l) z))))))))) (λ (v) (k ((o l) (s b ((o (e l)) v))))) (λ (v w) (u (g p e) (s e v) (s e ((c (c (d (f l)))) w)))) ((λ (v w) (λ (x y) (u (g p q) (w y) (v x)))) ((λ (v w x y z) (λ (aa) (z (g p q) (y (g p q) (x v p aa)) (x w d (((c o) l) aa))))) (λ (v) (r (((c l) v) a) ((((o c) l) v) a))) (λ (v) ((λ (w) (r (w a) ((λ (x) (r (x a) (r ((x b) a) (((o l) x) a)))) ((p l) w)))) ((o (d l)) v))) ((λ (v w) (λ (x y z) (m (g q p) ((q (p (w x))) (v y z)) b))) (λ (v w) (m v (m v w b) (j b ((v l) w)))) (λ (v) (λ (w) (j (v w) w)))) (λ (v w) (n w v (λ (x y) (j ((k x) a) y)) l b)) ((λ (v) (λ (w x y) (n (j a (j x y)) w (λ (z aa) (j (((k z) k) (((k (k (l z))) k) (((k (l (l z))) k) a))) aa)) (λ (z) (j ((i (k z) g v) (k (k (l z))) (k (l (l z)))) (j (l (k (l z))) (l (l (l z)))))) b))) (λ (v w) ((v a) w)))) ((λ (v w) (λ (x) (v q (w q (v q x))))) (λ (v w) (n w v (λ (x x) ((λ (y z aa) (y aa aa x (y aa z (l (l (l (l x)))) x))) (λ (y z aa aa) (j (k aa) (j (k (l aa)) (j (r (k (l (l aa))) (y (k aa) (k (l aa)))) (j (r (k (l (l (l aa)))) (z (k aa) (k (l aa)))) aa))))) (λ (y z) (((y a) z) a)) (λ (y z) (y (z a))))) (p l) b)) (λ (v w) (n w v (λ (x x) ((λ (y) (j ((((x b) b) b) a) (j (((y b) b) a) (j ((y b) a) (j ((x b) a) (j (((x b) b) a) (j ((((y b) b) b) a) (j (x a) (j (y a) x))))))))) ((((x b) b) b) b))) (p l) b)))))) ((λ (s) (λ (t u) (n u t (λ (v w) (s (k v) w)) l b))) ((λ (s) (λ (t u) (n (s t) p (λ (v w) (j (k v) w)) (λ (v) (s (l v))) u))) (λ (s) ((s (λ (t) (j ((t a) a) (h (t b) (t a))))) (λ (t) a))))) ((λ (s) (λ (t u) (s t (c o) (s (g c t) o (s (g o t) c u))))) (λ (s t u) (n u s (λ (v w) (j (h (k v) (g t (k (l v)))) w)) (c l) b))) (λ (s t u) (n (j t u) s (λ (v w) (j (r (k (k v)) (k (l v))) w)) (λ (v) (j (l (k v)) (l (l v)))) b)))) (λ (m n o) ((λ (p) (((m p) (j n o)) b)) (λ (p) (j ((p a) b) (j ((p a) a) (p b)))))) (λ (m n o p q) ((k ((n 
(λ (r) (j (λ (s) ((k r) (o (l r) s))) (p (l r))))) (j (λ (r) r) m))) q)) (c c) (d c) (g (f c) (g d e)) (λ (m n) ((m n) ((n m) a))))) (λ (a) (λ (b) b)) (λ (a) a) (λ (a) (λ (b) (a (a b)))) (λ (a) (λ (b) (a (a (a b))))) (λ (a) (λ (b) (a (a (a (a (a b))))))) (λ (a) (λ (b) (a (a (a (a (a (a (a b))))))))) (λ (a b) (λ (c) (a (b c)))) (λ (a b) (λ (c) (λ (d) ((b c) ((a c) d))))) (λ (a b c) ((a (λ (d) c)) b)) (λ (a b) (λ (c) ((c (λ (d) b)) a))) (λ (a) (a (λ (b) (λ (c) c)))) (λ (a) (a (λ (b) b))))
November 29th, 2007 at 2:09 am
You could have pretty-printed that. I’m not going to bother to try to parse it, nor to ask if it requires differentiation of normal-order or applicative-order evaluation.
— Uncle Glenny, whose first (computer) language was LISP (in 1974).
November 29th, 2007 at 5:20 am
Hi Glenny… I guess it wasn’t really clear, but the block of lambda calculus at the end there is actually just an obfuscated version of the (anonymously-authored) code linked immediately beforehand. The original is still here if you want to take a look. There’s even comments…!
I don’t think evaluation order should matter for this particular code, since there are no side-effects…
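For anyone else wondering about the distinction: applicative order evaluates a function’s arguments before the call, while normal order evaluates them only if and when they are actually used, so the two can differ only when an argument has side effects or fails to terminate. A sketch in Python (my illustration; Python itself is applicative-order, so the unevaluated argument is modeled as a thunk):

```python
def diverge():
    # stands in for a computation that never terminates
    raise RuntimeError("this would loop forever")

def k(a):
    # the K combinator, (λ (a b) a): returns a, ignores b entirely
    return lambda b: a

# Applicative order evaluates arguments first, so k(1)(diverge())
# would "diverge" before k ever ran. Normal order substitutes the
# argument unevaluated; passing the bare function models that, and
# since k never uses its second argument, the thunk is never forced:
answer = k(1)(diverge)
```

In side-effect-free, always-terminating code like the block above, both orders agree on the result, which is why it doesn’t matter there.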