Python vs Common Lisp, workflow and ecosystem (2019) (lisp-journey.gitlab.io)
One thing, re “In Python we typically restart everything at each code change”: I sometimes run Python in Emacs with a REPL and evaluate the region to pick up edits. Not bad.
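Outside an editor, a rough stand-in for "evaluate region" is `importlib.reload`. The sketch below is only an illustration under assumptions: the module name `scratch_mod` and the helper `demo_reload` are made up, and bytecode caching is switched off for the duration so that a quick edit is never masked by a stale cache.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_reload():
    """Edit a module on disk and pick up the change without restarting."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep this quick demo free of cached bytecode
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "scratch_mod.py"  # hypothetical module name
        source.write_text("def greet():\n    return 'v1'\n")
        sys.path.insert(0, tmp)
        try:
            import scratch_mod
            before = scratch_mod.greet()
            # Simulate an edit in the editor, then re-evaluate the module.
            source.write_text("def greet():\n    return 'version 2'\n")
            importlib.reload(scratch_mod)
            after = scratch_mod.greet()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("scratch_mod", None)
            sys.dont_write_bytecode = old_flag
    return before, after
```

Note that reloading rebinds the module's names but does not update objects created from the old definitions, which is part of the gap with the Common Lisp REPL.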
The big win for the Common Lisp REPL is being able to modify data, do restarts, etc. I usually use Common Lisp, but for right now I am heavily using Clojure to write examples for a new Clojure AI book that I am writing. I miss the Common Lisp REPL!
Clojure was an eye opener for me and I think it offers a great developer experience (e.g. I'm addicted to C-c C-p)
It seems my journey hasn't ended and I definitely have to check out CL!
But atm it's hard for me to give up the things Clojure offers me: persistent data structures, access to a great ecosystem and a very well designed standard library.
I know people are tired of the "lisp/smalltalk did it better" but what features of python are not possible (or hard) in CL[OS] ?
ps: how many CL shops are out there ? I'd work near free just to try a CL team once.
The way methods are ‘attached’ to classes gives you a natural form of type-directed name lookup. CL generics have the advantage that you can define your own methods on existing classes while naming the methods in your own package so they don't conflict, but also have a curse of inconvenience along the way where importing a class doesn't naturally pull in everything associated with it, and you wind up writing the class name again and again when dealing with fields of an object. with-slots et al. are poor substitutes. (In an experimental sublanguage at one point I actually had local variables with object type declarations implicitly look up the class using the MOP and symbol-macrolet every available var.slot combination within the scope as a brute hack around the most common desirable case.)
Python's short infix/prefix operators are naturally generic, since they're implemented as method calls. In CL there's the generic-cl extension, but I haven't seen it have that much uptake… in particular, any library code that isn't explicitly aware of it won't use it ‘naturally’ on foreign objects, which could be good or bad.
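To make the "operators are method calls" point concrete, here is a minimal sketch. `Vec2` and `total` are hypothetical names invented for illustration; the only real mechanism shown is that `a + b` dispatches to `a.__add__(b)`.

```python
class Vec2:
    """Hypothetical 2D vector used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # `a + b` is dispatched to `a.__add__(b)`.
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


def total(items, start):
    # Library-style code that knows nothing about Vec2; it only uses `+`.
    result = start
    for item in items:
        result = result + item
    return result
```

The same `total` works on plain numbers and on `Vec2` without being aware of either, which is the "natural" use on foreign objects mentioned above.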
That shades into the very-concrete type system that CL starts out with, where any attempt at ad-hoc polymorphic interop is a disaster unless everyone already agrees on what methods to use. I can't make a thing that acts like a hash table but uses a different implementation underneath, then pass it to something that expects to be able to gethash on it. I especially seem to get bitten by this in cases where alists are the expected way of representing key-value maps: there's no way to extricate yourself from the linear search without rewriting every piece of code that touches it, there's often an implicit contract that you don't want duplicate keys but it's easy to violate by accident and create bad behavior down the line, and so on. By comparison, Java collections in particular got this very right in terms of decoupling intention from implementation, and Python does basically the same thing but with a looser set of ‘expected’ methods.
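As a sketch of what "acts like a hash table but uses a different implementation underneath" looks like on the Python side: the class below (a made-up `SortedListMap`, backed by two sorted lists and `bisect`) implements the five methods `collections.abc.MutableMapping` requires, and generic code written against dicts accepts it unchanged. `count_words` is likewise a hypothetical consumer.

```python
from bisect import bisect_left
from collections.abc import MutableMapping


class SortedListMap(MutableMapping):
    """Hypothetical mapping backed by two parallel sorted lists, not a hash table."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        # Binary search for the key's position and whether it is present.
        i = bisect_left(self._keys, key)
        found = i < len(self._keys) and self._keys[i] == key
        return i, found

    def __getitem__(self, key):
        i, found = self._find(key)
        if not found:
            raise KeyError(key)
        return self._values[i]

    def __setitem__(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value  # overwrite, so duplicate keys cannot occur
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def __delitem__(self, key):
        i, found = self._find(key)
        if not found:
            raise KeyError(key)
        del self._keys[i]
        del self._values[i]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


def count_words(words, counts):
    # Caller-agnostic code: works with dict or any other MutableMapping.
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

`get`, `items`, `in` and friends come for free from the abstract base class, so the consumer never needs to know which implementation it was handed.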
By default, Python objects have a ‘purely’ dynamic set of properties, rather than the fixed slots CLOS imputes on an object via its class. Indeed the class-level property one can set in Python to constrain this for possible performance gains is called __slots__.
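A small sketch of the difference, with made-up class names (`Dynamic`, `Fixed`) and a helper (`try_set_color`) that exist only for this illustration:

```python
class Dynamic:
    """Ordinary class: instances carry a per-object __dict__."""

    def __init__(self, radius):
        self.radius = radius


class Fixed:
    """Same shape, but attributes are restricted to the declared slots."""

    __slots__ = ("radius",)

    def __init__(self, radius):
        self.radius = radius


def try_set_color(obj):
    # Attempt to attach an attribute the class never mentioned.
    try:
        obj.color = "red"
        return True
    except AttributeError:
        return False
```

With these definitions, `try_set_color(Dynamic(3))` succeeds while `try_set_color(Fixed(3))` does not, and `Fixed` instances have no `__dict__` at all.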
CLOS already offered AOP, which you can use to control such calls.
https://lispcookbook.github.io/cl-cookbook/clos.html#dispatc...
All I can think of is what a mistake those features were in Python. "Fancy tricks" are generally the author trying to be clever (in the Kernighan sense) and end up obfuscating the result. Not saying it works this way in CL, I don't have the experience to make the call, but in Python it was (and probably still is) prevalent.
> the author trying to be clever (in the Kernighan sense)
I know that Kernighan is the author of the C book, but it's been a while since I skimmed it.
This manifested in a few spots. The first is, yes, s-expressions. Yes, yes, I know, s-expressions are integral to the power of lisp, and they're really not hard to read once you get used to them. All of that is beside the point. The reality I saw on the ground, when teaching people to program, is that even people who have no prior programming experience whatsoever, and therefore no preconceptions to get over, have a harder time grasping s-expressions than they do algol-style syntax. I don't know why. I didn't have the same problem myself. But it's a real phenomenon that I struggled to help people through on a regular basis, and the lisp community's defensiveness about it is not going to make it go away.
Arbitrary-seeming names with zero mnemonic value is another problem. Car, cdr, progn, etc. - it takes a special kind of personality to not be put off by this sort of stuff. Not everyone has that kind of personality. Not everyone should have that kind of personality.
Finally, all the hair-splitty (at least to a newcomer) distinctions to contend with. =, eq, eql, equal, and equalp, or let, let* and letrec. Sure, there are reasons for these distinctions. But a language that can get by without quite so many of them is going to be a lot more attractive to newcomers. Even if it comes at the cost of footguns, if they're unlikely to be discovered until later.
I like python but I'm a bit fed up with the mobthink (how surprising).
For someone to even make the determination that "Lisp did it better" they must
A. Have a comprehensive knowledge of both Python and Lisp.
B. Have some understanding of the history of the languages.
C. Understand the problem on an intrinsic, fundamental level that enables them to evaluate which approach is better.
D. Generalize the problem to a broader scope to demonstrate why one language is better on a broader level.
E. Understand every other language ever used so when they show why Lisp does it better, they can defend themselves when someone comes up and tells them, "actually, Pascal does this even better yet."
Furthermore, it doesn't really matter what language did it better. The point of most talks is simply to demonstrate how to solve some problem in a certain language. If every talk started with "you can do this in Python, and I'll show you how, but you should probably just switch to Lisp because it does it better," that isn't very helpful.
It's more about the 'wheel reinvention' syndrome that creates fatigue in me.
Can the common lisp condition system be adapted to Elixir? Is there an advantage to doing so? Is there some obvious tradeoff between the two I'm not expressing?
Thanks.
See this thread from HN for more about adapting the condition system elsewhere.
LFE(Lisp-flavored Erlang) is an existing language that combines Lisp syntax with Erlang's backend, though I haven't used it myself yet.
In Lisp, you are learning a new language with every lib, because each author thinks they are a god-tier language designer and that their macros rock. Also they don't need good docs, because the macros are obvious. Or good error messages, because they never break.
Also, the way the author dismisses the gap in the number of available packages is ignoring the elephant in the room.
You have any specific examples of Common Lisp libraries you found hard to understand? I had a hard time when first learning the language, but once I got proficient to write my own code, I found it easy to understand most libraries I ended up using myself, the same as any language really. That the REPL makes it so easy to explore them with your own context, helped a lot as well.
> Also the way the author dismiss the number gap of packages available is ignoring the elephant in the room.
In the very same section, the author describes why the number of packages doesn't matter as much as you think it does: curation vs free-for-all publishing (like APT vs NPM). Add to that the fact that Common Lisp "the language" has been stable for decades, which makes it much more likely that you can use any of the libraries you find, whereas in the Python world we both know this not to be true (just Python 2 vs Python 3 makes this a whole other world of mess).
[1] Off the top of my head, I can recall Cells (early reactive/dataflow system - weirdness included symbol names made to sort first in Allegro CL IDE) and hu.dwim.* stuff which had its own wrapper around CL:DEFUN and CL:DEFMETHOD, iirc. But I successfully used their stuff without caring about that.
I want to say this is an overgeneralization, but I am living this in Clojure right now.
However, I think this article is a bit skewed and not highlighting things Python has.
For instance, the standard library means that Python is more usable out of the box.
Also when you've got things like iPython or Jupyter it means you can get off the ground easily.
So, in the end, they're two different languages, and I do not think either is better of the two. Right tool for the job and all that.
[0]: https://lisp-journey.gitlab.io/pythonvslisp/#state-of-the-li...
There definitely are many areas where Python is best, but for ML and data science, Julia is (i) very competitive in library coverage, (ii) more performant and flexible, and (iii) has a very good Python bridge if it's needed.
I can imagine there are niches within ML and data science where what you need are Python-only libraries, you don't miss anything restricting yourself to the numpy type hierarchy and there's no advantage to calling the libraries from Julia, but I'm curious to check if that is what you actually meant and if so, what you are doing.
I think people often underestimate just how much faster Julia is than numpy, I've consistently seen performance improvements on the order of 10x-30x when porting code.
There's so much buggy low quality stuff in that space that I'd write a serious application in C or C++ from scratch.
It would be a custom application, sure, but not everything needs to be general.
Also, I find Lisp much more natural for mathematical reasoning.
Pandas is huge, and libraries like Spacy, NetworkX, etc. exist. It's a massive and good ecosystem. I'd hazard a guess that, for newer students in most of the sciences, Python is the go-to for scientific computing over the older R and Julia.
This will be blindingly obvious if you work in that area. Yes, you can do it in another language, but you're missing out on a lot of stuff that is just done and is state of the art and is fast because the speedy parts aren't in Python. The complaints about parens for Lisp are superficial, but in my experience the same goes for whitespace in Python. They just don't matter.
People are actually doing it. And a lot of it too. Both in terms of data science (as a broad term that can mean a bunch of different things) and in terms of computation for specific scientific fields like physics or biology.
Python is a great high level language for basically everything, but hardcore low latency apps. I can parse text, connect to databases, do sparse matrix computations on massive matrices, calculate network flows, generate large node-graph diagrams, use a Python based API to connect to any vendor software I've seen, do any kind of statistical analysis thing I need with pandas, amazing and free IDE allows me to use a REPL, code editor, and data structure viewer with ease, Python notebooks for education...etc etc. I've frequently found that I can rewrite a vendor's 10k line C++ program in a few pages of Python as the built-in Python data structures make text parsing extremely flexible and simple.
Programming is something interesting and fun they are going to do for a few years while young, until (pick one) {their band takes off, someone funds their startup idea and they hire others to do the programming while they generate genius ideas and run the business, they get promoted to a high paying executive position that involves management and architecture while others do the coding, their podcast becomes a hit and they can live off that, the small company they work at IPOs or gets bought and they make enough to retire at 30, etc}.
So they learn a fairly easy language that has lots of libraries that cover most things you do in a routine developer job.
...and before they know it they are 50-60 and still writing a lot of code, and realizing that if they had known they would still be doing this 30-40 years later they would have been better off if they had learned and used and gotten good with some of the languages that have a reputation of being very productive but hard to learn.
I'd also add spreadsheets and databases to that. At one point I was the database guy at work, because no one else was available. I learned enough SQL to get by, but was in no way a database expert. Heck, we had to pick job titles at one point that described what we did to have on the business cards the company was giving us, and I put down "Database Roustabout" [1], which should give you an idea of where I stood. That was 20 years ago and I'm still the database guy at work. It would have been a lot better if sometime early on I had said "I'm going to become really good at SQL even though I'm sure someone else will become database guy in a year or so".
[1] Roustabout. NOUN. An unskilled or casual laborer. (North American) A circus laborer.
    class KnowItAll(object):
        def __getattr__(self, attr):
            return lambda: "yes, I know how to " + attr

We then have:

    >>> k = KnowItAll()
    >>> k.reveal()
    'yes, I know how to reveal'
    >>> k.transfigure()
    'yes, I know how to transfigure'
    >>> k.make_sandwiches()
    'yes, I know how to make_sandwiches'

Ruby does this with method_missing instead, which is where I'm most used to it happening (and I think it's used a lot more in Ruby than in Python owing to the language-culture's higher tolerance for magic). Smalltalk used doesNotUnderstand, IIRC. One of the key secondary results of this is that you can do things like https://paste.ee/p/tUaRP, which is a toy “Tracer” class which intercepts, prints, and forwards method calls and attribute accesses (ignoring some edge cases).
If we were in CLOS, and started with:

    ;;; widget.lisp
    (defclass widget () ((radius :initarg :radius)))
    (defmethod grow ((w widget)) (incf (slot-value w 'radius)))

What I would expect for the equivalent is that, given:

    ;;; tracer.lisp
    (declaim (ftype (function (t) t) make-tracer))
    (defun make-tracer (object) ...)
    ;; ... further code goes here ...

Somewhere else, we can do:

    ;;; fiddle-with-widgets.lisp
    (let* ((w (make-instance 'widget :radius 3))
           (w* (make-tracer w)))
      ;; ???
      (grow w*))

Can you add code to tracer.lisp, without specific reference to anything from widget.lisp, such that this has the effect of (grow w) but prints what it's doing? Note also that my use of slot-value above is very deliberately a ‘raw’ access. I'm here completely ignoring the “make up entirely new methods on the fly as needed” part that method_missing also gets used for, which is even more impossible in CLOS given that it would require intercepting, what, all symbol lookups…
In CLOS, the class doesn't ‘own’ the method, it provides a type for dispatching on, so there's no way to do “give me some control over every generic function so long as the first arg is of ‘my’ type”. Which is a reasonable model, but means you can't do the same thing. Indeed the flip side is that in the Smalltalk-like model, generics are not reified, and methods that are specializations of the ‘same’ thing have no ‘real’ identity to them, so you can't do a type-ignoring :around method for an ‘entire generic’. (Often there will be a superclass to attach to instead, but it's considered dangerous “monkey patching” to mess around with someone else's class hierarchy like that, and in the case of more abstract interfaces there's nothing.)
Does that make sense?
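For reference, the Python side of the forwarding tracer described above can be sketched roughly like this. This is not the code from the linked paste; `Tracer`, its `log` parameter and the `Widget` stand-in are assumptions made for illustration, and private names (leading underscore) are deliberately not forwarded.

```python
class Tracer:
    """Hypothetical forwarding proxy; not the code from the linked paste."""

    def __init__(self, target, log=print):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for names Tracer itself lacks.
        if name.startswith("_"):
            # Edge case ignored on purpose: private names are not forwarded,
            # which also avoids recursion if _target is not set yet.
            raise AttributeError(name)
        value = getattr(self._target, name)
        if not callable(value):
            self._log(f"get {name} -> {value!r}")
            return value

        def traced(*args, **kwargs):
            self._log(f"call {name} args={args!r} kwargs={kwargs!r}")
            result = value(*args, **kwargs)
            self._log(f"  -> {result!r}")
            return result

        return traced


class Widget:
    """Stand-in for the widget class; it knows nothing about Tracer."""

    def __init__(self, radius):
        self.radius = radius

    def grow(self):
        self.radius += 1
        return self.radius
```

Calling `Tracer(Widget(3)).grow()` logs the call and its result, then returns what the wrapped widget returned, without `Tracer` referring to anything specific to `Widget`.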

    (defclass tracer () ((actual :initarg :actual)))
    (defun make-tracer (object)
      (make-instance 'tracer :actual object))
    ;; But please don't.
    (defmethod no-applicable-method :around (gf &rest args)
      (if (typep (car args) 'tracer)
          (let* ((tracer (car args))
                 (args* (cdr args))
                 (actual (slot-value tracer 'actual)))
            (format t "Calling ~S on ~S" gf (cons actual args*))
            (apply gf actual args*))
          (apply #'call-next-method gf args)))

You can't do this with a real specialization on no-applicable-method, incidentally, because the first arg isn't special enough, it's just folded into the &rest. And that, I'm pretty sure, means this doesn't coexist with other uses of no-applicable-method properly… and you still can't do on-the-fly method names that aren't attached to a generic, and so on, but this does sort of account for the object forwarder case (and in fact you could extend it to allow tracers on more of the arguments!). It does, I expect, remain extremely unidiomatic by comparison.
As someone new to the ecosystem of Common Lisp and in general a bit sadist (ref https://www.youtube.com/watch?v=mZyvIHYn2zk), could you share which ones these are so I can enjoy not understanding them at all?
Edit: I see now after I made my comment you added examples, thanks :)
But their use isn't that widespread, and in practice you can pretty quickly get used to the rare case that needs you to understand them.
I think the most complex is stuff that requires code-walkers and involved things like macros for CPS transformers of code.
This is completely impossible to do in the Python language, unless you resort to external tooling written in C or Fortran. Sure, you can call these codes from Python, as you can call them from any other language.
Numerical methods and data science are mostly done by engineers, mathematicians, and other random stem folks. I've yet to meet someone who is even cognizant that Numpy is really calling out to some low level C, C++, or Fortran library. They just know that you call a library like any other and the code works.
If you're trying to say that any language with FFI capabilities can do that, you'd be right, but it also doesn't matter much. Python has somehow found a sweet spot where it's easy to learn and onboard people and there is support for a lot of stuff with relatively low hassle. It certainly isn't lisp, but somehow seems to be orders of magnitude more successful.
I've been searching for a tool/language/ecosystem to replace Python for ages, but nothing ends up becoming close. I spent a significant amount of time learning lisp, but a lot of what I saw (besides the power of macros and restarting) was just a less intuitive way of doing things I could easily do in Python, Ruby, or Perl. Lisp is secret alien technology if you're coming from C or C++, but coming from Python it seems closer to a wash.
Then you've never met anybody who builds the tools that you use. Which is alright. But if you disparage their point of view then you sound a bit funny.
Of course you could call the same functions from the ffi of any other language, but nobody does that for the same reason that nobody writes web applications in C.
I hate python, as far as I'm concerned it's a nightmare hell of a language that does everything wrong, and yet it's probably the language I use the most due to its sheer convenience and massive ecosystem.
There's PyPy, a JIT Python interpreter written entirely in Python, and it does not depend on C. It is also much faster than the common interpreter, CPython. Unfortunately it is still not appropriate for numerical computation, as the language itself makes working directly with numbers very cumbersome (and this was the point I wanted to make).
I'd love to write my own solution in assembly or C where I give birth to every function, but nobody has time for that level of monumental effort. Low level matrix libraries have a lot of inertia for a reason.
I'm not disparaging anybody's point of view. Yours is certainly valid for a small group of elite users. I'm just trying to point out that it is only a valid point for a very small group. Most simply view these things from the perspective of the entire ecosystem. Even scientists well aware of the C internals will not always use that knowledge.
I think we may be speaking past each other a bit.
Having worked for months with a slew of senior data scientists, this was a bit painful. Python is so slow, and those data scientists were very good at coming up with solutions for the company's issues, but the implementations (using Spacy, Pandas and other libs) had enough Python in them to make them impractical for the company's use case. They were nice prototypes, which I then had to fix or even rewrite in C/C++ (we tried Rust as well) to make them usable in the company data pipeline.
I think companies are burning millions (billions in total?) on depressingly slow solutions in this space by throwing massive power at it all to make them complete their computations before the sun dies out.
Example: we needed a specific keyword extraction algorithm for multiple languages; my colleague used Spacy and Python to create it. It took a couple of seconds per page of text; we needed max a few ms on modern hardware. He spent quite a lot of time rewriting and changing it, but never got it under 1s per page on xlarge aws instances. My version takes a few ms on average executing the same algorithm but in optimised c/c++.
Sure we could've spun up a lot more instances, but my rewrite was far cheaper than that, even in the first month.
If you want to email me at matt@explosion.ai , I'd be interested in the specifics of the algorithm and why the implementation was slow.
The idea for something like that keyword extraction algorithm would be that if the Python API is slow, you should just use Cython. The Cython API of spaCy is really fast because the `Doc` is just a `TokenC*`, and the tokens just hold a pointer to their lexeme struct, which has the various attributes encoded as integers.
I've never really done a good job of teaching people to use the Cython API though. I completely agree that it's not productive to have slow solutions, and using too many libraries can be a problem. The issue is that Python loops are just too slow, you need to be able to write a loop in C/Cython/etc. Thinking through data structures is also very important.
I get very frustrated that there's this emphasis on parallelism to "solve" speed for Python. Very often the inputs and outputs of the function calls are large enough that you cannot possibly outrace the transfer, pickle and function call overheads, so the more workers you add, the slower it is. Meanwhile if you just write it properly it's 200x faster to start with, and there's no problem.
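A rough way to see the transfer cost on its own, without spinning up any workers: serialize and deserialize the input the way a process pool would have to, and compare that with a trivial computation over the same data. This is only a sketch under assumptions (the function name `transfer_vs_compute` and the choice of `sum` as the "work" are made up), and the timings vary by machine.

```python
import pickle
import time


def transfer_vs_compute(n=100_000):
    """Measure pickle round-trip cost against a trivial computation."""
    data = [float(i) for i in range(n)]

    # What a worker process would need before it can even start.
    t0 = time.perf_counter()
    payload = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    transfer_s = time.perf_counter() - t0

    # The actual "work", done in-process on the restored copy.
    t0 = time.perf_counter()
    result = sum(restored)
    compute_s = time.perf_counter() - t0

    return {
        "payload_bytes": len(payload),
        "transfer_s": transfer_s,
        "compute_s": compute_s,
        "result": result,
    }
```

When the payload is large relative to the work done on it, `transfer_s` is paid per call and per worker, which is the overhead described above.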
Cython part sounds good; I will try it out and email you if I get totally stuck, thanks!
In either case, I much prefer prototypes in Python to, say, Matlab. To speed things up I once rewrote an internal Scipy function into a version that allowed me to use it in vectorized code on my end. If the prototype is in Matlab, the optimization and integration possibilities are much more limited due to licensing, toolboxes, and the closed ecosystem in general.
Even as someone who 'knows' C and C++ I still find it faster and easier overall to do the exploratory and 'science' part in Python, making sure it works and gives me the answers I need in the format I want etc. only to then rewrite and optimize the slow parts in C or C++ if necessary.
This development process allowed creation of a good solution that you were then able to quickly port to a performant production platform.
Sure maybe a 3 minute task in python that reconciles a few million transactions and builds some very useful projections is too slow for some pipelines, but it worked for my clients.
I can't speak as to the specific use cases that you've encountered, but performance wise, I have found Python to be a fine choice for several ML services.
The underlying point I often make to people is that Python's slowness introduces a lot of incidental complexity, and you find yourself fiddling with numpy or something instead of just writing normal code and expecting it to perform normally.