Not everything is an expression(codewords.recurse.com) |
Not everything is an expression(codewords.recurse.com) |
The author suggests that "the obvious way to implement a DSL as a macro, as we saw with if-match, hard-codes the form of the new syntax class". I disagree. That's not what I'd consider the obvious way at all.
I'd consider the most obvious approach would be to pass the macro onto a polymorphic function of some description:
(defmulti if-match*
(fn [pat _ _ _] (if (list? pat) (first pat) (type pat)))
(defmacro if-match [pat expr then else]
(if-match* pat expr then else))
Macros have all the same capabilities for extensibility as regular functions. In Clojure at least, macros are just functions with some metadata attached.However:
1. The code you give still isn't smart enough. It dispatches on the symbol at the head of the list, but that doesn't account for namespacing. So your pattern-macros will all end up in one giant namespace. You could probably invent something clever to account for this but...
2. My overall point[1] was that writing a macro-extensible macro shouldn't require cleverness or new code - it should be in the standard library! Indeed, ideally defining a "pattern-macro" should be accomplished via the same mechanism as defining an "expression-macro"; you shouldn't need separate, custom macro-defining-macros for each syntax class. I'd settle for it just being easy to define an extensible syntax class along with a macro-defining-macro for it, though.
[1] Admittedly, this point could have been far clearer.
The idea that there should be "pattern-macros" and "expression-macros" is confusing how macros are used with what macros are.
The namespacing issue can be solved in the usual way; by evaluating the symbol in some fashion. How that's done really depends on what you want the syntax to look like. The simplest mechanism is just to pass the macro on to another macro:
(defmacro if-match [pat expr then else]
(if (list? pat)
(list pat expr then else)
(if-match-on-type pat expr then else)))And for 2, even though what you say seems desirable on the surface, you still approaches the problem in a way that is too fuzzy, or abstract. Just as saying "we should write more secure code" and then failing to attack the problem directly.
No offense, but even though you may have a nice idea, your explanation is a little too handwavy.
That aside, clojure.core.match is implemented pretty much as you describe:
https://github.com/clojure/core.match/wiki/Extending-match-f...
I too was a bit confused about what the article was getting at.
I'm working on a Lisp variant that recognizes the difference between expressions and... non-expressions? But taking the opposite approach: I found a way to make it so that everything is an expression while allowing one to do everything you would do with a non-expression via expressions. To achieve this, there are two kinds of expression: mutations and functions.
There are two things to understand about this:
1. All expressions take in the environment. Most functions don't use it, while most mutations do.
2. All expressions run inside a trampoline that evaluates them. The difference between a mutation and a function is that when the trampoline evaluates a function, it places its result into the return register (where it can be picked up by something else). In contrast, when the trampoline evaluates a mutation, it replaces the environment with the result. This is why mutations typically use the environment--rather than destroying the environment, you usually want to build the new environment with most of the old environment.
Some examples:
((mut () env (assoc env :foo 1))) ; equivalent to (define foo 1)
((mut () env
(assoc env :my-define (mut (dest src) env (assoc env dest src)))))
; this is actually how `define` is defined
((mut () env (map)))
((+ 1 1)) ; throws exception "undefined symbol +" because previous line emptied the environment
The "everything takes env" bit is inspired by J. Shutt's paper on his Kernel programming language: https://www.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unr... and a lot of what I'm working on is built on his work.The fact that Lisp does not distinguish between statements and declarations is closely tied to the fact that Lisp is very much a dynamic language (in particular, it is dynamically typed). The article uses the example of Python declaration vs. statement; but actually Python declarations are statements, too. This is typical of dynamic languages.
On the other hand, in a statically typed language there is necessarily a distinction between code that is executed at runtime and (although we often don't talk about it this way) code that is executed at compile time. Declarations happen at compile time. Expressions and statements happen at runtime. The two categories almost always use very different syntax.
Among statically typed languages, Haskell is particularly interesting, because, while it necessarily makes a strong distinction between expressions and declarations, it has erased the distinction between expression and statement: the latter is represented by an expression that returns a list of side effects.
Another interesting take on this issue can be found in Daan Leijen's Koka programming language[1]. In Koka, whether a function has side effects is part of its type. So effect inference can be done. The result, if I understand things correctly, is that the expression-or-statement issue becomes more than just a yes/no thing. I think these ideas are worth further exploration.
Lastly: an extensible pattern set. My goodness, yes. That's the big lack I feel in Haskell; I want to define new kinds of patterns. I've read that F# has good support for this, but I know nothing about it; can anyone comment?
I may be confused about what the exact claim is here, but I don't see how this has anything to do with whether or not something is an expression. I don't quite understand how having "not everything is an expression" helps solve this problem.
syntax-rules is a form of pattern matching (mentioned in footnote 2). But it's only for matching on syntax, not on ordinary values. I can't write the fibonacci function using syntax-rules.
The idea behind "not everything is an expression" is that ordinary macros only extend the expressions in a language. But languages have more than expressions to them - they also have patterns, and possibly other syntax classes (loop formats, LINQ, monadic do-syntax). I think those syntax classes ought to be macro-extensible as well. That's what I mean when I say that ordinary macros don't acknowledge that not everything is an expression.
It's rather a roundabout way to say it, I guess.
[0] https://common-lisp.net/project/iterate/
[1] https://common-lisp.net/project/iterate/doc/Rolling-Your-Own...
That sounds great, for lisps, when everything is an expression. I feel like you haven't fully embraced lisp, because the article just kind of asserts that not everything is an expression, without justifying it.
In lisp the point of expressions isn't that everything has a return value, the bigger gain is that everything fits together, like legos. One of the big wins is that you can use your same toolkit, functions & macros, on everything. The point of the article seems to desire restricting that.
I would say that declarations, ... are not syntax but semantic classes.
Too bad the conclusion does not offer a glimpse of what would the extension mechanism look like. Still, nice article.
[0] http://www.amazon.com/Art-Metaobject-Protocol-Gregor-Kiczale...
(defmacro foo [x] :foo)
which always returns a keyword.
All lisps that I'm aware of only let macros dispatch on list evaluation, i.e. when you see (foo ...), call the function defined in (defmacro foo), but I'm not aware of any limitation preventing you from applying that to other types of data.
[1] technically, they run at macro-expansion time, which is after reading the expression, and before evaluating.
In a language that doesn't allow that, we end up having to do some really weird things (https://github.com/yawaramin/lambdak).
Code:
fn sum(l: &[i64]) -> i64 {
match l {
[] => 0,
[x, xs..] => x + sum(xs)
}
}
Assembly: _ZN3sum20hf66fc5855a7cf5fc3aaE:
.cfi_startproc
cmpq %fs:112, %rsp
ja .LBB2_2
movabsq $24, %r10
movabsq $0, %r11
callq __morestack
retq
.LBB2_2:
pushq %rbx
.Ltmp10:
.cfi_def_cfa_offset 16
subq $16, %rsp
.Ltmp11:
.cfi_def_cfa_offset 32
.Ltmp12:
.cfi_offset %rbx, -16
movq 8(%rdi), %rcx
xorl %eax, %eax
testq %rcx, %rcx
je .LBB2_4
movq (%rdi), %rax
decq %rcx
movq (%rax), %rbx
addq $8, %rax
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
leaq (%rsp), %rdi
callq _ZN3sum20hf66fc5855a7cf5fc3aaE
addq %rbx, %rax
.LBB2_4:
addq $16, %rsp
popq %rbx
retq
So... no. fn sum_tail(v : i64, l: &[i64]) -> i64 {
match l {
[] => v,
[x, xs..] => sum(v + x, xs)
}
}With respect to variable assignments or raising exceptions (though examples exist of the others...) aren't these by the author's definition statements?
(def a "apple")
or
(throw (Exception. "my exception message"))For instance, in JavaScript one can use ternary syntax in place of an if-statement. Is a ternary condition actually an expression? It seems more of an expression than a statement, but one could argue it's just a simplified syntax of a conditional statement.
I would have code his `if-mathch` in a simpler way in clojure:
(case (f data-structure)
0 (do-something)
1 (do-something-else)
(do-default))
Where `f` can be a function defined in a protocol or a multimethod, so you can actually implement your own `f` for any data structure you like.Now, what I am missing ?
One needs a registry, an interning function and a driving function. Below is just an example:
(defvar *patterns* (make-hash-table))
(defparameter *pattern-names* nil)
(defun intern-pattern (name if-pattern then-pattern)
(setf *pattern-names*
(append *pattern-names* (list name)))
(setf (gethash name *patterns*)
(list (compile nil `(lambda (pat)
,if-pattern))
(compile nil `(lambda (pat expr then else)
(declare (ignorable pat expr then else))
,then-pattern))))
name)
(defmacro if-match (pat expr then else)
(loop for name in *pattern-names*
for (if-part then-part) = (gethash name *patterns*)
when (funcall if-part pat)
do (return (funcall then-part pat expr then else))))
(intern-pattern 'variable
'(and pat (symbolp pat))
'`(let ((,pat ,expr)) ,then))
(intern-pattern 'literal-atom
'(atom pat)
'`(if (equalp ',pat ,expr) ,then ,else))
(intern-pattern 'cons
'(eq 'cons (car pat))
'(destructuring-bind (_ p-car p-cdr) pat
(declare (ignore _))
(let ((tmp (gensym)))
`(let ((,tmp ,expr))
(if (consp ,tmp)
(if-match ,p-car
(car ,tmp)
(if-match ,p-cdr
(cdr ,tmp)
,then
,else)
,else)
,else)))))
Writing the macro DEFPATTERN is then trivial...I help maintain an old Lisp-based web server, which was written in the mid-90s on the Symbolics Lisp Machine. It literally has zillions of these registry/intern/machinery/defining-macro combinations...
It's just: one has to program those. But it has been done many many many times.
There's lot of good work on programming languages that allow metasyntactic runtime extensions, going back to Brian Smith's work at PARC.
match compare x v with
| 0 -> ...
| 1 -> ...
| _ -> ...
with the minor wart that since compare returns an int, you need to match the last case with a default to prevent the compiler from warning you about possibly not matching other int values.Hopefully it doesn't make my research redundant. :)
I think "statements" is the usual term, with "sentences" being the general term which encompasses both (a "sentence" in this context is something recognized by your formal grammar).
Can your Lisp variant handle unrestricted continuations?
Here's what I'm doing with continuations:
At the bytecode level, call/cc is trivial to implement given my current design. All you need is the current instruction pointer, the current environment (environments right now are just immutable A-lists), and the current return pointer. But it's not obvious to me how to translate call/cc from the syntax to the bytecode.
Tail call elimination is the next thing I'm implementing, and unless there's something wrong with my design I haven't noticed yet, it won't be hard to do. Another step to take down that direction, though, is adding a pass transforming to continuation-passing style. I think this is going to be one of the last things I do, because I'd like to explore other optimizations so I can compare performance.
I'm not sure what you mean by sentences being "recognized" by the grammar. Keep in mind that there are languages where only a series of 'statements' is allowed, no freestanding expressions.
I'm not sure it's appropriate to say that declarations are "executed at" compile time in a statically-typed language. In SML, for example, there's a notion of "phase separation" by which you can split the meaning of a program into its compile-phase and its run-phase meanings. Many declarations end up having both compile- and run-time components. In Haskell you might say that declarations are executed at compile time, but this means nothing more than that name-resolution and type-checking happen at compile time.
Is the distinction or separation necessarily obvious? Statically typed languages which have a unified term- and type-level seem to get a bit subtle in this regard, since a type can be used as a value. And Idris passes proofs around, and has to use proof erasure in order to not let it infect the runtime: proofs can remain until run-time if you're not careful.
I am asking this as a question since I don't have enough experience with such languages to really know myself.
I don't think so, but in practice it usually is obvious, in the programming languages that actually get used.
OTOH, the notion that the compile-time language and the run-time language could be very similar is an idea that might be slowly catching on. We'll see ....
And in this system, patterns are expressions. So "everything is an expression" isn't exactly wrong in this case.
In Scala, you are given full control over both how a name "applies" (i.e. what Foo(...) does when called as a sort of constructor or factory method) and how it matches, in the form of an "unapply" static method that you implement. It seems like a "match" macro in a LISP could desugar a pattern into a program that matches that pattern, the same way the Scala compiler generates code that calls "unapply".
The really important point, that moxy explored and that I think I didn't make clear in the article, is that patterns are just one example of where you want extensibility of a non-expression syntax class. Really you want to be able to define your own syntax classes and be able to extend them!
I only mentioned moxy in a footnote in the article because it's "research-quality" at the moment - it's not well documented and I'm still having second thoughts about its design. I've moved on to other thing and am busy with work, so I don't imagine I'll be working on it in the immediate future.
Declarations are definitely a syntactically distinct class in, for example, Haskell or SML or C. Not so much in Lisp, Python, Javascript, and so forth.
I think the question of what an extensible extension mechanism would look like is a tough one, which is why I punted on it. I have worked on it a bit myself, as mentioned in another comment, but not in Lisp. I definitely might try my hand at a Lisp version in future, though, and I'm interested to see what other people might come up with!
Re 1:
In a.clj:
(ns a (:refer-clojure))
(defmulti if-match*
(fn [pat _ _ _] (if (list? pat) (first pat) (type pat))))
(defmacro if-match [pat expr then else]
(if-match* pat expr then else))
(defmethod if-match* clojure.lang.Symbol
[pat expr then else]
`(let [~pat ~expr] ~then))
In b.clj: (ns b (:refer-clojure) (:require a))
;; a 'foo pattern-macro that does negation
(defmethod a/if-match* 'foo [pat expr then else]
`(a/if-match ~(second pat) ~expr ~else ~then))
In c.clj: (ns c (:refer-clojure) (:require a))
;; a 'foo pattern-macro that is the identity pattern
(defmethod a/if-match* 'foo [pat expr then else]
`(a/if-match ~(second pat) ~expr ~then ~else))
Now, at the repl: user=> (use 'a)
nil
user=> (require 'b)
nil
user=> (if-match (foo x) 'yes x 'no)
no
user=> (require 'c)
nil
user=> (if-match (foo x) 'yes x 'no)
yes
Note that 'require doesn't actually import the symbols defined by b.clj or c.clj, and despite that we're able to use the pattern-macro 'foo they define, because they're modifying if-match's dispatch-table. So our pattern-macros aren't being scoped the same way our regular macros are. I think this is wrong. Moreover, our pattern-macros have a single namespace, so we get collisions between what b.clj defines 'foo to mean and what c.clj defines it to mean, which is why (if-match (foo x) 'yes x 'no) changes behavior after the (require 'c).Without justifying? The article gives some sample code (in a lisp) colored according to what syntax class each bit of code belongs to. The code is not all red (like it would be if it were only expressions).
What I mean by a "macro" is: a convenient way to extend the syntax of a particular syntax class by telling my language how to translate the syntactic construct I want to add into constructs it already understands. In this sense one definitely can have expression-macros and pattern-macros. Is there some reason you think one shouldn't have this in a language?
Your new proposed `if-match' is strange. I'm not sure how I'd use it to, for example, implement a "list" pattern-macro, such that `(list p1 p2 p3)' matches any list of three elements, whose first element matches p1, second matches p2, and third matches p3. You seem to intend that `((list p1 p2 p3) expr then else)` should macro-expand to the code that performs this pattern match. This is difficult for two reasons:
1. I'd need to shadow the definition of `list' with a macro that usually behaves like the list function, but if used inside of if-match behaves differently. This seems difficult, and is conceptually ugly.
2. AIUI, to get `((list p1 p2 p3) expr then else)' to macro-expand specially, `(list p1 p2 p3)' has to macro-expand to a symbol naming a macro, call it MAGIC, such that `(MAGIC expr then else)` expands to the desired result. This is difficult, because MAGIC needs to know about p1, p2, and p3. So `(list p1 p2 p3)' will need to dynamically define a new macro (with a gensym'ed name) that knows specifically about p1, p2, and p3, and return that. This may be possible, but is gnarly as fuck.
This is certainly very far from being "the obvious way to do it" to anyone but a wizard.
Maybe I'm misunderstanding you, or you made a typo?
In this sense, the idea of an "expression-macro" and a "pattern-macro" makes as much sense as an "expression-function" or "pattern-function". We don't typically have different classes of function to fulfil different purposes.
The `if-match` macro you describe could be written as a function, albeit with less pleasing syntax:
(defn if-match [pat expr then else]
(if-let [match (if (fn? pat) (pat expr) (isa? pat expr)]
(then match)
(else match)))
(if-match 0 0 (fn [_] true) (fn [_] false))
(if-match (pattern/cons 0 'x)
'(0 hi there)
(fn [{:syms [x]}] x)
(fn [_] nil))
So in the above case, we allow literal values to be passed in, or a function which returns a map of symbols to matching values. It's the function that allows for the extensibility.A macro can easily smooth over the janky syntax of the `then` and `else` functions, and we'd end up something like this:
(if-match (pattern/cons 0 x)
'(0 hi there)
x
nil)
This is essentially what my previous example was alluding to, though in that I intended to use a namespaced macro rather than a namespaced function for extensibility.You might well complain that `(pattern/cons 0 x)` is more verbose than `(cons 0 x)`, but it's also unambiguous, and doesn't require what would amount to a separate namespacing system, which a Lisp-1 like Clojure tends to reject.
Now, I don't think the above examples are magical in any way. The macro is a small improvement over the function, and the function isn't particularly complex. If higher-order functions are your idea of wizardry, then there's not much in Lisp that you wouldn't consider magic.
Maybe your objection just stems from the use of something like `p/cons` instead of just `cons`, while still desiring `cons` to be namespaced in some way. In which case, I'd suggest that what you're really asking is not 'can I have expression-macros and pattern-macros', but 'can I create new namespacing systems'.
More fundamentally, being known to something else is not the direction "is a" works in.
(a) it's not a well-known technique
(b) it's not in the standard library of any Lisp I know of
(c) there are interesting open design questions to be answered in the implementation of such a system
I hope this article will get folks thinking about these issues.
For example, walking the AST and calling macroexpand will work, but then you can't have a macro that expands differently in different contexts - when interpreted as an expression versus as a pattern, for example. I think this is an important feature.
This[0] may be of some help. You could set it to expand macros differently according to the current context (some global variable).
You could also write your own macroexpand function which would default to the implementation's macroexpand.
This is definitely something interesting to think about. Thank you for the discussion.
[0] http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec...
Here's a more thorough summary of the problem: http://qiita.com/guicho271828/items/07ba4ff11bff494dc03f
Coincidentally, you have contributed to that, so I thought I would ask you what is the problem with using macroexpand-dammit?
A macroexpansion facility works more like a framework than a library. Each macro you define is basically a callback function triggered when the compiler sees a form with a particular symbol at the head. You want to be able to describe what should happen in particular situations you're interested in, without having to specify anything about other situations. A code walker spoils that separation of responsibilities because it has to reimplement large parts of a macroexpansion framework just to make it possible to add a few new features which are then used in conjunction with the original framework.
The potential for bug contagion is quite high, because any single bug's effects aren't localized to the few forms that make use of the extended functionality you've rigged up. This is tantamount to introducing a bug in the language implementation. If you're expanding something like (my-macro (... (my-other-macro))), a bug in your code walker can break the intermediate code. This might be acceptable if you're just using it to write a couple of one-off macros that don't trigger any weird corner cases and are only used in the same project in which they're defined. But if you're creating a general-purpose DSL like iterate that you intend for people to use in all sorts of situations you never anticipated, it's not. In fact, the same bug is more dangerous as part of a code walker than it would be as part of a Lisp implementation's internals. If it's in the implementation, at least it would show up consistently whether the vulnerable code is wrapped in your macro or not, making easier to identify, document, and manage the problem. And probably easier to get somebody to fix, because the implementation ought to be better maintained than the walker.
These are not merely hypothetical concerns. Some projects already rely on macroexpand-dammit working in the quirky way it does; Masataro tried to have his improved version supersede the existing version on Quicklisp, but it broke some packages that were using it: https://github.com/guicho271828/macroexpand-dammit/pull/4#is...
Sometimes I use those packages too, so instead of using his version with the fix I submitted, it's actually less of a headache for me to apply my fix using a wrapper package on top of the Quicklisp version, duplicating some of the code which itself duplicates what Lisp already does. The original duplication problem led to bugs which led to the need to load two separate implementations which led to a name clashing problem which justified additional duplication.
Also, I tried to implement that same iter example using a similar trick before, and I see that your code generates the same mysterious output, at least on SBCL:
; compilation unit finished
; caught 1 fatal ERROR condition
A fatal error during compilation, yet it clearly compiled and ran. What was the error? I don't know, it looks like something in the code walker muffled the details. Does it matter? Could it matter in any other scenario? Something about macroexpand-dammit doesn't play nice with the implementation, but we don't know what, why, or how, and it makes it more difficult to use the very language features that would normally help us find out.I think a lot of critiques of the "bipolar Lisp programmer" [http://www.lambdassociates.org/blog/bipolar.htm] are overblown, but in the case of these kinds of tools, they're justified. These kinds of techniques will always be restricted to obscure one-off software projects unless they're supported by core language features like what rntz and Masataro are proposing. The designs of macro systems are optimized for making it easy to make small tweaks to the language to fit the particular problem at hand. They're not quite as flexible as something like the MOP. If you try to use a facility designed for making small, local interventions in the language to write very general language extensions which themselves make broad, global interventions in the underlying system for making small, local interventions, you're gonna have a bad time.