Lisp is not based on the Lambda Calculus

Lisp is not based on the Lambda Calculus (danielsz.github.io)

285 points by danielszm 6 years ago | 133 comments

vga805 6 years ago |

The post quotes McCarthy:

"one of the myths concerning LISP that people think up or invent for themselves becomes apparent, and that is that LISP is somehow a realization of the lambda calculus, or that was the intention. The truth is that I didn't understand the lambda calculus, really" - John McCarthy

So there are two issues here: 1) whether or not it was McCarthy's intention to realize the Lambda Calculus in LISP, and 2) whether or not LISP is such a realization, or at least some kind of close realization.

The answer to 1 is clearly no. This doesn't imply an answer to 2 one way or another.

If 2 isn't true, what explains the widespread belief? Is it really just that he, McCarthy, borrowed some notation?

vilhelm_s 6 years ago | |

Modern lisps do realize the lambda calculus, but this was not immediate. In particular, in order to exactly match the lambda-calculus beta-reduction rule, you need to use lexical rather than dynamic scope, which did not really become popular until Scheme in the 1970s.
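To make the scoping point concrete, here is a minimal sketch in Python rather than Lisp. The names `dynamic_env`, `lookup`, and `call_with_bindings` are hypothetical helpers that imitate a dynamic binding stack; they are not taken from any real Lisp implementation.

```python
# Lexical scope: the returned function remembers the x that was
# visible where it was defined, which matches beta-reduction.
def make_adder_lexical(x):
    return lambda y: x + y


# Dynamic scope, imitated with an explicit stack of binding frames
# (an assumption for illustration, not how any particular Lisp did it).
dynamic_env = []


def lookup(name):
    # Search the most recent bindings first.
    for frame in reversed(dynamic_env):
        if name in frame:
            return frame[name]
    raise NameError(name)


def call_with_bindings(bindings, thunk):
    # Bindings are visible only while thunk is running.
    dynamic_env.append(bindings)
    try:
        return thunk()
    finally:
        dynamic_env.pop()


def make_adder_dynamic(x):
    # x is bound on the stack only during this call; the returned
    # function looks x up again later, at call time.
    return call_with_bindings({"x": x}, lambda: (lambda y: lookup("x") + y))


add5 = make_adder_lexical(5)
print(add5(1))  # 6: x was substituted when the function was created

add5_dyn = make_adder_dynamic(5)
# The 5 is gone by now; whatever x the caller has bound is used instead.
print(call_with_bindings({"x": 100}, lambda: add5_dyn(1)))  # 101
```

Under substitution, `(λx. λy. x + y) 5` can only ever mean "add 5", which is what the lexical version does and the dynamic one does not.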

btilly 6 years ago | | |

Did not become popular, or did not become implemented?

My understanding is that lexical scope was first implemented in Algol and Pascal, and then was first implemented with true garbage collection in Scheme. (Thereby leading to the restriction in Algol and Pascal that closures existed, but they could only be passed into functions, and never returned from them. That way the variables being closed over could live on the stack.)

But I'd love to learn that I'm wrong and that these things came before that.

paulddraper 6 years ago | | |

Now, all the Lisps (Racket, Clojure) are lexically scoped.

Common LISP is lexically scoped, though it does still have opt-in dynamic scoping ("special variables").

jasonhansel 6 years ago | | |

Question: if lexical scopes are in the language's data structures, but can't be explicitly created or made visible in the language's syntax, is it still homoiconic?

_emacsomancer_ 6 years ago | |

> So there are two issues here: 1) whether or not it was McCarthy's intention to realize the Lambda Calculus in LISP, and 2) whether or not LISP is such a realization, or at least some kind of close realization.

This would fit in with Graham's suggestion that McCarthy more "discovered" Lisp than "invented" it.

kmill 6 years ago | |

If it was a realization of a lambda calculus, then it is one with (a) primitives, (b) strict evaluation, (c) quoted lambda terms, and (d) "dynamic" bindings.

(a) In classic lambda calculus, everything is a lambda term. McCarthy's Lisp has primitives like lists and numbers. However, it is known that lambda calculus is powerful enough to encode these things as lambda terms (for example,

  null = (lambda (n c) (n))
  (cons a b) = (lambda (n c) (c a b))

gives a way to encode lists. The car function would be something like

  car = (lambda (lst)
          (lst (lambda () (error "car: not cons"))
               (lambda (a b) a)))

This would not work in the original Lisp because of binding issues: the definition of cons requires the specific a and b bindings be remembered by the returned lambda.)

(b) Lambda calculus does not have any evaluation rules. Rather, it is like algebra where you can try to normalize an expression if you wish, but the point is that some lambda terms are equivalent to others based on some simple rules that model abstract properties of function compositions. Lambda-calculus-based programming languages choose some evaluation rule, but there is no guarantee of convergence: there might be two programs that lambda calculus says are formally equivalent, but one might terminate while the other might not. Depending on how you're feeling, you might say that no PL for a computer can ever realize the lambda calculus, but more pragmatically we can say most languages use lambda calculus with a strict evaluation strategy.

(c) The lambda terms in lambda calculus are not inspectable objects, but more just a sequence of symbols. Perhaps one of the innovations of McCarthy is that lambda terms can be represented using lists, and the evaluator can be written as a list processor (much better than Godel numbering!). In any case, the fact that terms have the ability to evaluate representations of terms within the context of the eval makes things a little different. It's also not too hard to construct a lambda evaluator in the lambda calculus[1], but you don't have the "level collapse" of Lisp.

(d) In lambda calculus, one way to model function application is that you immediately substitute in arguments wherever that parameter is used in the function body. Shadowing is dealt with using a convention in PL known as lexical scoping, and an efficient implementation uses a linked list of environments. In the original Lisp, there was a stack of variable bindings instead, leading to something that is now known as dynamic scoping, which gives different results from the immediate substitution model. Pretty much everything fun you can do with the lambda calculus depends on having lexical scoping.

All this said, the widespread belief about Lisp being the lambda calculus probably comes from Scheme, which was intentionally lambda calculus with a strict evaluation model. Steele and Sussman were learning about actors for AI research, and I think it was Sussman (a former logician) who suggested that their planning language Schemer (truncated to Scheme) ought to have real lambdas. At some point, they realized actors and lambdas (with mutable environments) had the exact same implementation. This led to "Scheme: An Interpreter for Extended Lambda Calculus" (1975) and the "Lambda the ultimate something" papers. Later, many of these ideas were backported to Lisp during the standardization of Common Lisp.

[1] https://math.berkeley.edu/~kmill/blog/blog_2018_5_31_univers...
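The list encoding from point (a) can be tried out in any language with lexically scoped closures. A minimal sketch in Python follows; `cdr` and `to_pylist` are additions for illustration and are not part of the encoding given above.

```python
# A list is a function of two handlers: n (called for the empty list)
# and c (called with head and tail for a cons cell).
def null(n, c):
    return n()


def cons(a, b):
    # The returned lambda must remember this particular a and b,
    # which is exactly what lexical scoping provides.
    return lambda n, c: c(a, b)


def _not_cons(who):
    def fail():
        raise ValueError(who + ": not cons")
    return fail


def car(lst):
    return lst(_not_cons("car"), lambda a, b: a)


def cdr(lst):
    return lst(_not_cons("cdr"), lambda a, b: b)


def to_pylist(lst):
    # Convert to a native list, only for inspecting results.
    out = []
    while True:
        cell = lst(lambda: None, lambda a, b: (a, b))
        if cell is None:
            return out
        out.append(cell[0])
        lst = cell[1]


xs = cons(1, cons(2, cons(3, null)))
print(car(xs))             # 1
print(car(cdr(xs)))        # 2
print(to_pylist(cdr(xs)))  # [2, 3]
```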

carlehewitt 6 years ago | | |

Schemer (later renamed Scheme) was invented to scheme against Actors, reprising Conniver, which was invented to connive against Planner.

See the following for the current state of the art including the latest Actor approach to Eval, which is more modular and concurrent than the Eval in Lisp and Scheme:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3418003

The above article explains exactly how Actors are much more powerful than lambdas with mutable environments.

Cybiote 6 years ago | | |

Adding a little to your wonderful post: another possible reason the original Lisp gets conflated with the lambda calculus is that its notation was heavily inspired by the lambda calculus, even though it was more properly a refinement of the μ-recursive functions (of form f:(N,N,...N) -> N) over the natural numbers.

While a lot of people are trying to defend the lambda calculus as a basis, I think this actually undersells the significance of LISP. Apart from Lisp the language family and its implementations, there is Lisp, (arguably) the first practically realizable mathematical model of computation. That is, it stands on its own as a model for computation†, continuing along a long line of which I think Grassmann's 1861 work on arithmetic and induction is a good starting point.

Turing Machines are intuitive and the lambda calculus is subtle and expressive, but Lisp's contribution was to place partial recursive functions on a more intuitive/realizable basis in terms of simple building blocks of partial functions, predicates, conditional expressions and symbolic expressions (ordered pairs/lists of atomic symbols). Lambdas come in as a notation for functions with a modification to facilitate recursive definitions.

†Making Greenspun's Tenth Rule trivially true.

btilly 6 years ago | | |

Thank you.

This is probably the most informative comment that I've read on HN in the last couple of months.

paulddraper 6 years ago | | |

> In classic lambda calculus, everything is a lambda term

OO says everything is an object. Even though Java has non-object primitives, we're still gonna classify Java as OO.

> Lambda calculus does not have any evaluation rules.

> The lambda terms in lambda calculus are not inspectable objects, but more just a sequence of symbols.

It's not clear to me why this makes Lisp not in the family of Lambda implementations.

> In the original Lisp, there was a stack of variable bindings instead, leading to something that is now known as dynamic scoping.

That's true. Every modern Lisp (Scheme, Clojure, Racket) has lexical scoping. And Common LISP uses lexical by default.

> Later, many of these ideas were backported to Lisp during the standardization of Common Lisp.

Again this contributes to the notion that LISP/Scheme/Lambda Calculus were "discovered", not that Lambda calculus has an explicit pedigree.

danielszm 6 years ago | | |

This is excellent. Thank you!

John McCarthy said that he never had the intention to realize the lambda calculus, but he followed that statement with the corollary that had someone "started out with that intention, he might have ended with something like LISP." Peter Landin was a pioneer in that regard. See "The Mechanical Evaluation of Expressions", published in 1964, and the SECD virtual machine. Machine interpreters like SECD and CEK may come close to a "realization" of the lambda calculus. Their design is directly inspired by its semantics. You don't necessarily end up with something like LISP, but you can, see Lispkit and LispMe.

dwohnitmok 6 years ago | | |

For (b) isn't this the appeal of normal-order evaluation (or its related sibling lazy evaluation)? If there is a terminating reduction sequence lazy evaluation will find it, whereas eager evaluation will fail to find it.

Moreover the lambda calculus is confluent, so if you find the terminating reduction sequence, you're guaranteed all other terminating sequences end up with the same result.

So as long as your PL uses normal-order evaluation or lazy evaluation you can entirely realize any equivalences in the lambda calculus.
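A rough illustration of the termination difference, sketched in Python: laziness is imitated by passing a zero-argument function (a thunk) instead of a value. The names `const_one_eager`, `const_one_lazy`, and `omega` are made up for this example.

```python
def const_one_eager(x):
    # Stand-in for (\x. 1) under eager evaluation: the argument has
    # already been evaluated by the time we get here.
    return 1


def const_one_lazy(x_thunk):
    # Stand-in for (\x. 1) under lazy evaluation: the argument arrives
    # as an unevaluated thunk, and this body never forces it.
    return 1


def omega():
    # Stand-in for (\x. x x)(\x. x x), which has no normal form.
    return (lambda x: x(x))(lambda x: x(x))


# Lazy: the diverging argument is never evaluated, so we get an answer.
print(const_one_lazy(omega))  # 1

# Eager: the argument is evaluated first and never finishes.
# CPython reports the runaway recursion as RecursionError.
try:
    const_one_eager(omega())
except RecursionError:
    print("eager evaluation diverged")
```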

kazinator 6 years ago | | |

and (e) mutation: `setq`, `rplacd`, ...

illvm 6 years ago | |

Could it be that at least some universities teach programming languages with the idea that Lisp has this feature? My PL course at Northeastern had SHLAC (Scheme Has Lambda Calculus) where we started with an identity function and built up from there.

vga805 6 years ago | | |

That seems likely. But then, I would think LISP does realize the lambda calculus in some sense. It naturally lends itself to this sort of exercise and it's really successful.

agumonkey 6 years ago | |

Both fair points, but I'd like to know if he mentioned why on earth he picked lambda. Lambda expressions/closures turned out to be a very peculiar and important path.

nils-m-holm 6 years ago |

Lambda Calculus (LC) versus LISP is not just about lexical scoping, but also about partial application (currying), which is cumbersome in LISP and natural in LC. In LC (where \ = lambda)

    (\xy.x)M  ==>  \y.M

while in LISP

    ((lambda (x y) x) M)  ==>  undefined

because the lambda function expects two arguments. Of course \xy.x is just an abbreviation for \x.\y.x, so the LISP counterpart would really be

    ((lambda (x) (lambda (y) x)) M)  ==>  (lambda (y) M)

but this only proves the point that currying is natural in LC and not in LISP, because LC provides syntactic sugar that allows one to treat higher-order functions and functions of multiple variables in the same way.

Also, LC is not compatible with functions with a variable number of arguments, which is common in LISP. For instance,

    (+ 1)  ==>  1

in most LISPs, but given PLUS == \mnfx.mf(nfx) and 1 == \fx.fx

    PLUS 1  ==>  \nfx.f(nfx) == SUCC

i.e., (PLUS 1) reduces to "SUCCessor", the function adding one to its argument.

In most LISP dialects, you can pass any number of arguments to a variable-argument function like +. So what does the syntax (F X) denote in general? The application of a unary function to one argument or the partial application of a binary function? Or a ternary one...?

In LC it does not matter, because multi-variable functions and higher-order functions are the same.

I have developed a LISPy language that uses currying instead of functions of multiple arguments in the book Compiling Lambda Calculus (https://www.t3x.org/clc/index.html).

You can download the code here: https://www.t3x.org/clc/lc1.html.
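The currying point carries over to most languages with multi-argument functions. A minimal sketch in Python, assuming the usual Church numeral definitions; `church` and `to_int` are helper names invented here for converting to and from ordinary integers.

```python
# Multi-argument function: partial application is an error.
k_multi = lambda x, y: x

# Curried form, the direct counterpart of \x.\y.x.
k_curried = lambda x: lambda y: x

# Church numerals, written in curried style as in LC.
PLUS = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ONE = lambda f: lambda x: f(x)


def church(k):
    # Build the Church numeral for a non-negative integer k.
    n = lambda f: lambda x: x
    for _ in range(k):
        n = SUCC(n)
    return n


def to_int(n):
    # Read a Church numeral back by counting applications of f.
    return n(lambda i: i + 1)(0)


print(k_curried("M")("ignored"))  # M

try:
    k_multi("M")
except TypeError:
    print("k_multi needs both arguments at once")

# PLUS applied to ONE alone is already a usable function, and it
# behaves like SUCC on every numeral we try.
plus_one = PLUS(ONE)
print(to_int(plus_one(church(4))))  # 5
print(to_int(SUCC(church(4))))      # 5
```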

jasim 6 years ago |

If I understood you right, Lisp was not directly inspired by lambda calculus, but by McCarthy's own research into recursive functions where he found that the three primary functions can cover the whole of computation.

What I'm extrapolating from this is that McCarthy's ideas are similar in implication to Lambda Calculus where you can define computation with just function abstraction and application, and use Peano numbers to represent data. Both approaches end up creating a purely functional way to write programs.

Would that be correct? I also wonder whether there is anything we can take away from this knowledge that is applicable to programming or how we look at it?

tudelo 6 years ago | |

What "three primary functions" are you referring to?

ayushgp 6 years ago | | |

I suggest you watch this video if you want to understand how 3 basic functions can be used to create a whole language https://youtu.be/3VQ382QG-y4

Shebanator 6 years ago | | |

This is touched on in the article:

"They are interesting because with just three initial functions (successor, constant and projection functions) closed under composition and primitive recursion, one can produce most computable functions studied in number theory (addition, division, factorial, exponential, etc.)."

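As a rough sketch of what the quoted passage describes, here are the three initial functions plus composition and primitive recursion in Python. The helper names are made up for illustration, and the recursion scheme is unrolled into a loop for convenience; treat it as one possible rendering, not the article's own construction.

```python
# The three initial functions.
def succ(n):
    return n + 1


def const(k):
    # Constant function (textbooks usually take only zero as initial).
    return lambda *args: k


def proj(i):
    # Projection: pick the i-th argument (0-based here).
    return lambda *args: args[i]


def compose(f, *gs):
    # Composition: feed the same arguments to every g, then apply f.
    return lambda *args: f(*(g(*args) for g in gs))


def prim_rec(base, step):
    # Primitive recursion on the first argument:
    #   h(0, xs)     = base(xs)
    #   h(n + 1, xs) = step(n, h(n, xs), xs)
    def h(n, *xs):
        acc = base(*xs)
        for i in range(n):
            acc = step(i, acc, *xs)
        return acc
    return h


# add(n, x) = x + n, mul(n, x) = n * x, fact(n) = n!
add = prim_rec(proj(0), compose(succ, proj(1)))
mul = prim_rec(const(0), compose(add, proj(1), proj(2)))
fact = prim_rec(const(1), compose(mul, compose(succ, proj(0)), proj(1)))

print(add(3, 4))  # 7
print(mul(3, 4))  # 12
print(fact(5))    # 120
```
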
dualogy 6 years ago | | |

Probably means the 3 irreducible primitives in LC: applications, abstractions, and "variables" (ie. attribute identifiers)

pwpwp 6 years ago |

One of the newest Lisp dialects, Kernel, is pretty close to lambda calculus, though. Like in LC, there is no implicit evaluation of arguments. A fexpr receives the "source code" of its input expressions, similar to LC. Then it can explicitly evaluate those it cares about.

https://web.cs.wpi.edu/~jshutt/kernel.html

kd0amg 6 years ago | |

"Receiving the source code" of an argument in lambda calculus is an accident of notation. The source code is not observable by the function it is passed to. Confluence implies that there is no way within lambda calculus to distinguish the result of reducing a term from the term itself.

bandrami 6 years ago |

I'm not even sure why it gets pushed as "functional"; I mean, you can pass and return functions, but that's really not the point of the language like it is with ML or Haskell. It's primarily a symbolic language.

bjourne 6 years ago |

Afaik, Haskell is a realization of the (typed!) lambda calculus. Lisps aren't because they don't do lazy evaluation. The LC beta reduction of (\a. a) (\c. d) (\e. f) is (\c. d) (\e. f) but most lisps will reduce it to (\a. a) d. This might seem like a minor detail but means general recursion using the y combinator isn't actually implementable in lisps (I could be wrong though).

curryhoward 6 years ago | |

> Afaik, Haskell is a realization of the (typed!) lambda calculus. Lisps aren't because they don't do lazy evaluation.

Lazy evaluation is just one possible operational semantics for a lambda calculus. Eager evaluation is another. In fact, all of the versions of lambda calculi presented in Benjamin Pierce's widely-read textbook "Types and Programming Languages" feature eager evaluation rather than lazy evaluation.

So the claim that the reason that Lisps aren't based on the lambda calculus is due to lack of lazy evaluation is incorrect. There are other reasons that Lisps diverge from lambda calculi but the evaluation strategy isn't one of them.
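On the question of general recursion under eager evaluation: the plain Y combinator does loop forever in a strict language, but its eta-expanded variant (commonly called the Z combinator) works. A minimal sketch in Python, which is strict like most Lisps:

```python
# Z combinator: the Y combinator with the self-application wrapped in
# (lambda v: ...) so it is only unfolded when the recursive call
# actually happens.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial without any named self-reference: "self" is supplied by Z.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(fact(5))  # 120
```

The only change from Y is the extra `lambda v:` wrapper, which delays `x(x)` until it is needed.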

juliangamble 6 years ago |

There are some common themes here. Let's get some precise terminology so we can all talk about the same thing.

Some questions to ponder:

Is Lisp a term re-writing system? https://news.ycombinator.com/item?id=9554335

Is lambda calculus a term rewriting system? https://cstheory.stackexchange.com/questions/36090/how-is-la...

Is the Mathematica language a term-rewriting system? https://mathematica.stackexchange.com/questions/119933/why-d...

And to round it all up: Is Lisp an evaluation system and Lambda calculus an evaluation system? [I'll leave this one to the reader]

danharaj 6 years ago |

It is difficult to believe that McCarthy did not understand he was beating the same horse along with Church, Curry, Schoenfinkel, et al.

vilhelm_s 6 years ago | |

I mean, he obviously was aware of lambda-calculus as a thing that existed (since he used lambda notation for functions!), but might not have bothered to study it very closely. Nowadays the lambda calculus is considered very elegant, but in the 1940s I think it was considered quite weird and not very well known. (I forget who, but I think some famous logician complained that people did not read his thesis because it used the lambda-calculus formalism, or was advised not to use it for that reason, or something like that.)

I think, actually the main reason it became so popular is exactly because it was implemented in Lisp/Scheme...

lonelappde 6 years ago | |

Why? Does every compiler writer know all the theoretical underpinnings and generalizations of their work? Or do they make something that solves a problem without investigating the entire universe around it?

danharaj 6 years ago | | |

Because he was certainly aware of the literature and he was a top notch scholar. Your follow-up questions seem to be implying something, care to spell it out for me?

empath75 6 years ago | |

From what I understand, he purposefully took ideas from it, but because he did not feel like he understood it fully, he didn’t try to make a full implementation of it.

klawed 6 years ago | |

Don't forget Gödel!

carlehewitt 6 years ago | | |

In his famous 1936 article, Turing correctly noted that the proof of the computational undecidability of the halting problem does not involve the same fixed point as the one used by Gödel.

See the following: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3418003

leshow 6 years ago |

Stupid question: why is it often written "_the_ lambda calculus" and not just "lambda calculus"?

waitwhatwhere 6 years ago |

Interesting parallel with stories out there where authors think teachers get their writings “wrong”.

Unintended metaphor and application are things.

Smacks of a cognitive bias known as functional fixedness: https://en.m.wikipedia.org/wiki/Functional_fixedness

A screwdriver can also be a pry bar :-)

Imo this is why looser IP laws are important. Humanity needs to be able to rethink and find new application of its epistemological ideas to find new ideas of interest.

Too often we’re held to thinking about IP only the way the author intended. It’s almost pushing into thought policing.

dogfishbar 6 years ago |

I spent a lot of time on this. See M-LISP: a representation-independent dialect of LISP with reduction semantics, TOPLAS, 1992, the relevant bit is in section 2.

It's true that J. McCarthy had only a passing familiarity with LC. M-expression LISP, as it was originally conceived, was all about first-order recursion schemes over S-expressions. But due to a very simple error in the base case of an inductive definition, LISP 1.0 "featured" or "supported" higher-order functions, à la LC.

namelosw 6 years ago |

The problem is like 'is Erlang an Actor language?'. The answer is yes.

Carl Hewitt developed the Actor model based on Smalltalk in the 1970s.

Joe Armstrong created Erlang in the 1980s, though he didn't know the Actor model at all at that time. Erlang doesn't even have the concept of an Actor; it accidentally implemented the Actor model through the elegant design of its processes.

But when it comes to the Actor model nowadays, Erlang is basically a must-mention language, although the intention wasn't about Actor.

ProfHewitt 6 years ago | |

The following article has a critique of Erlang as an Actor language:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3418003

ProfHewitt 6 years ago | |

Actors were influenced more by Simula-67 than by SmallTalk'72, which was a byte-stream language. However, neither language had adequate constructs for concurrency.

tempguy9999 6 years ago |

This is not relevant directly to the subject but perhaps someone in formal langs can help me. I'm interested in optimisation of (necessarily) pure functional langs. Starting with deforesting (the elimination of intermediate structures) eg.

  map(f, map(g, list(1, 2, 3)))

can be optimised trivially by a human to

  map(f.g, list(1, 2, 3))

(where f.g is functional composition) but I want to do this automatically, and the first step is to play with it. I've defined stuff on paper then started substituting but it's slow and, being me, error prone, when done with paper and pen.

Does anyone know of symbolic manipulation software for Haskell, or similar syntax (prefer to avoid Lisp syntax if poss, but ok if nothing else) which will allow me to do this easily and get a feel for it?

Thanks
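For playing with this kind of rewrite by machine rather than on paper, even a tiny hand-rolled term rewriter goes a long way. A minimal sketch in Python, assuming expressions are nested tuples such as `("map", f, expr)`; the tuple format and the names `fuse`, `evaluate`, and `apply_fn` are invented for this example and are not from any library.

```python
def fuse(expr):
    # Rewrite ("map", f, ("map", g, e)) into ("map", ("compose", f, g), e),
    # working from the innermost map outwards.
    if isinstance(expr, tuple) and expr[0] == "map":
        _, f, inner = expr
        inner = fuse(inner)
        if isinstance(inner, tuple) and inner[0] == "map":
            _, g, e = inner
            return ("map", ("compose", f, g), e)
        return ("map", f, inner)
    return expr


def apply_fn(fn, env, value):
    # A function is either a name looked up in env or a composition.
    if isinstance(fn, tuple) and fn[0] == "compose":
        return apply_fn(fn[1], env, apply_fn(fn[2], env, value))
    return env[fn](value)


def evaluate(expr, env):
    # Small evaluator, used only to check that the rewrite keeps meaning.
    if expr[0] == "list":
        return list(expr[1:])
    if expr[0] == "map":
        return [apply_fn(expr[1], env, v) for v in evaluate(expr[2], env)]
    raise ValueError("unknown expression: %r" % (expr,))


original = ("map", "f", ("map", "g", ("list", 1, 2, 3)))
fused = fuse(original)
print(fused)  # ('map', ('compose', 'f', 'g'), ('list', 1, 2, 3))

env = {"f": lambda x: x + 1, "g": lambda x: x * 2}
print(evaluate(original, env))  # [3, 5, 7]
print(evaluate(fused, env))     # [3, 5, 7]
```

The fused form traverses the list once and builds no intermediate list, which is the point of deforestation; more rules can be added as further cases in `fuse`.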

bontaq 6 years ago | |

You could use uniplate and a small AST to play more with it, the paper has examples of transformations

the paper: https://ndmitchell.com/downloads/paper-uniform_boilerplate_a...

small tutorial: https://www.cs.york.ac.uk/fp/darcs/uniplate/uniplate.htm

tempguy9999 6 years ago | | |

This seems (AFAICT) a bit higher-level than what I'm after, but very interesting nonetheless, I'll have a play, thanks.

empath75 6 years ago | |

The compiler will generally do that sort of optimization.

tempguy9999 6 years ago | | |

Lord, that's an unhelpful comment. Some compilers do not do that eg. scala, and the cost is high hence my request.

wvlia5 6 years ago | |

Sounds like you are in need of transducers

tempguy9999 6 years ago | | |

Could you elaborate? A bit of searching throws up some very interesting stuff but AFAICS a transducer is roughly comparable to a partial function, which doesn't relate to my request for manual expression manipulation support.

If I'm missing something, please say.

didibus 6 years ago |

Can we extend from this another misconception then? That functional programming stems from the Lambda Calculus? When in reality, it might come from Lisp, which does not come from Lambda Calculus, thus making Lisp the root of the tree for the origin of functional programming?

carapace 6 years ago | |

We know "the root of the tree for the origin of functional programming": John Backus's Turing Award lecture "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs"

https://amturing.acm.org/award_winners/backus_0703524.cfm It's not obvious but the "ACM Turing Award Lecture" link is the PDF.

tempguy9999 6 years ago | | |

I dunno. I thought the foundations were laid in mathematics considerably pre-computer, eg. from wiki's page on Haskell Curry:

The focus of Curry's work were attempts to show that combinatory logic could provide a foundation for mathematics. (edit: accidentally stripped the part here mentioned that was in 1933 ie. very pre-computer) [...]. The paradox, developed by Rosser and Stephen Kleene, had proved the inconsistency of a number of related formal systems, including one proposed by Alonzo Church (a system which had the lambda calculus as a consistent subsystem) and Curry's own system. [...]

By working in the area of Combinatory Logic for his entire career, Curry essentially became the founder and biggest name in the field. Combinatory logic is the foundation for one style of functional programming language. The power and scope of combinatory logic are quite similar to that of the lambda calculus of Church, and the latter formalism has tended to predominate in recent decades.

And I think there's more but it's hardly my field. Prolog grew out of predicate calculus, which has its roots in propositional calculus, which goes back to the ancient Greeks.

The mathematical foundations of things can be surprisingly old. I saw a 3D wireframe of a goblet with perspective, and that was from the 1500's. It could have been done on a 1980's home computer by appearance.

cannabis_sam 6 years ago | |

Not really, Peter Landin’s ISWIM is essentially syntactic sugar over lambda calculus, and went on to influence ML and Haskell. So there is a more direct lineage from lambda calculus to functional programming via that route.

(Regarding the sibling comment: Landin’s paper also predates Backus’ paper by about 10 years)

didibus 6 years ago | | |

No idea how accurate it is, but Wikipedia says that ISWIM was influenced by Lisp...

didibus 6 years ago |

It isn't clear, though, whether McCarthy didn't know anything about the Lambda Calculus, or simply didn't know it well and didn't create Lisp as a concrete realization of it. That is, he might have created Lisp for other reasons and been doing his own exploration, but it's probable that in doing so he drew on his knowledge of the pre-existing literature, which could include some of the Lambda Calculus; hence Lisp's resemblance to it, such as the use of lambda to define functions.

Also, realistically speaking, no programming language is based on the Lambda Calculus as is, even those that try to be.

peterkelly 6 years ago |

Here's a simpler version:

Lisp has mutable variables. Lambda calculus doesn't.

deckard1 6 years ago |

When pedantry goes wrong...?

This article repeats this "TL;DR Lisp is not based on the Lambda Calculus"

But that's not actually what McCarthy said. McCarthy said:

> one of the myths concerning LISP that people think up or invent for themselves becomes apparent, and that is that LISP is somehow a realization of the lambda calculus

"Based on" and "realization of" are two different things. This kind of exaggerated or hyperbolic pedantry strikes me as clickbait. Which is unfortunate, because the article does contain some good content.

If you read the LISP I manual, you will see that concepts beyond the obvious lambda notation are used directly from The Calculi of Lambda Conversion. Notably, the distinction between forms and functions.

Clearly, we're splitting some very fine hairs here.

proc0 6 years ago |

So it wasn't based on LC, but is probably isomorphic to LC, right? This is why it's hard for me to believe math is invented. Different people and separate efforts but all arrive at the same patterns with different names.

pankajdoharey 6 years ago |

It is entirely possible to realise Lambda calculus using lisp. But McCarthy not understanding it is surprising.

Isamu 6 years ago | |

>McCarthy not understanding it is surprising.

I think he is commenting on the subtleties of it.

I think many reading here will say they understand it or have studied it in a course but I am not so sure everyone gets the subtle points. Myself I have always puzzled over the difference between what programmers call LC and what seems to be discussed by Church.

ozmaverick72 6 years ago | | |

I understand that Turing and Church came up with different approaches to describing the fundamentals of computing. You can see there is a relationship between LC and LISP. My question is how did we get to the von Neumann architecture and CPU instruction sets from either Church or Turing's work?

lonelappde 6 years ago | |

Why? Lambda calculus is based on functions. Lisp supports functions. Both use lambda because that's a known notation for functions.

Lambda calculus can be modeled in lisp. But there are millions of things you can build with Lisp that McCarthy might not know or care about.

pankajdoharey 6 years ago | | |

It is not that hard to begin with.

pdpi 6 years ago | |

McCarthy not understanding it at the time

qubex 6 years ago |

Anybody who holds forth on syntactical matters (lambda calculus and LISP being two examples thereof) and commits the grammatical heresy of writing “I wasn’t going to go home” (emphasis mine) in lieu of “I wouldn’t be going home” has just neutered themselves, in my humble opinion at least.

foldr 6 years ago | |

There's nothing grammatically wrong with "going to go home".

qubex 6 years ago | | |

When I was taught English it was most definitely frowned upon and disparaged as “at best an Americanism”. It is grating to the native British ear and has no place in formal writings. There is no situation where it cannot be avoided by rephrasing the sentence (usually, by no more than employing “will be going”, but occasionally resorting to other constructs). During the IB we were absolutely forbidden from using it and would be marked down severely.

quickthrower2 6 years ago | |

Programming language grammars and spoken language grammars are two completely separate things. Someone can conceivably be a great programming language researcher who sometimes gets a rule of the English language wrong in a sentence.