Optimizing Python – A Case Study (airpair.com)
My personal favourite optimisation, from needing to shave a few milliseconds off our API response times, was discovering that it's measurably slower to use *args and **kwargs, and switching to explicitly declaring and passing arguments in the relevant parts of the code.
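A rough way to check this on your own interpreter is a small timeit comparison. This is only a sketch: `target`, `explicit`, `generic` and `compare` are made-up names for illustration, not the code from our API, and the size of the gap (if any) will depend on the Python version and the call pattern.

```python
import timeit


def target(a, b, c=0):
    return a + b + c


def explicit(a, b, c=0):
    # Arguments are declared by name and forwarded directly.
    return target(a, b, c)


def generic(*args, **kwargs):
    # Arguments are packed into a tuple and a dict, then unpacked again.
    return target(*args, **kwargs)


def compare(number=200_000):
    # Time the same call through both wrappers and return both totals.
    t_explicit = timeit.timeit(lambda: explicit(1, 2, c=3), number=number)
    t_generic = timeit.timeit(lambda: generic(1, 2, c=3), number=number)
    return t_explicit, t_generic


if __name__ == "__main__":
    t_explicit, t_generic = compare()
    print(f"explicit: {t_explicit:.4f}s  *args/**kwargs: {t_generic:.4f}s")
```

Measure inside the real hot path before changing anything; a micro-benchmark like this only tells you whether the effect exists at all.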
We also did a few other neat things:
- Rolled our own UUID-like generator in pure Python (I was surprised this helped, but the profiler doesn't lie)
- Switched to working directly with WebOb Request and Response objects rather than using a framework
- Used a background thread with a single slot queue to make sure our response was returned to the user before we emitted the event log message, but always emit the message before moving to the next request
- Heavy optimisation of memcache / redis reads and writes
Edit: Fixed formatting
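For the background thread with a single slot queue mentioned above, a minimal sketch using only the standard library could look like this. `DeferredEmitter`, `handle_request` and the `send` callback are hypothetical names, and waiting on the queue at the start of the next request is my assumption about one way to get the "emit before the next request" guarantee, not a description of the original code.

```python
import queue
import threading


class DeferredEmitter:
    """Emit messages on a background thread, one at a time."""

    def __init__(self, emit):
        self._emit = emit
        self._slot = queue.Queue(maxsize=1)  # the single slot
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            message = self._slot.get()
            try:
                self._emit(message)
            except Exception:
                pass  # a real system would report the failure somewhere
            finally:
                self._slot.task_done()

    def submit(self, message):
        # Hands the message to the worker; blocks only if the slot is full.
        self._slot.put(message)

    def wait_until_emitted(self):
        # Returns once every submitted message has been emitted.
        self._slot.join()


def handle_request(emitter, request, send):
    # The previous request's message must be out before we start this one.
    emitter.wait_until_emitted()
    response = f"handled {request}"
    send(response)                           # the user gets the response first
    emitter.submit(f"event for {request}")   # the log message follows
    return response
```

The caller never waits on the log write for its own request, only (at worst) on the one left over from the previous request.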
The crosstown_traffic API in hendrix does exactly this.
Dump your virtualenv, create a new one with pypy, reinstall libraries and test your app. Takes less than 20 minutes, even for complex applications.
This is the advantage Python has over lower-level languages: an easy way to implement complicated things.
Kind of like Linus's quote: "Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
For example, I am writing code that implements networks that evolve over time for AI research. Prototyping it in Python makes it easy to test things out, but I expect that I will have to rewrite it in C++ or maybe something more fun, like Haskell[1].
1. Mostly for the sheer joy of trolling my colleagues with a learning agent monad.
Well, most of the time at least. Think about DST and leap seconds.
JRuby is faster than standard C Ruby too.
In my experience, the reasons you wrote something in Python in the first place (implementing features faster) remain valid later on, when you want to add functionality.
By using PyPy or Cython, rewriting small parts in C/C++, and/or using libraries that are written in C (such as numpy), you can normally make the hotspots in your code fast enough while keeping the advantages of Python.
Come check out #hendrix on freenode.
That's why you must quantify the advantages and disadvantages, including risk minimization; only then will you see whether a given course of action is viable.
Also, profiling pypy is less straightforward than profiling CPython code since the hotspot changes the runtime characteristics of the program. This means you need to run tests many times over to make sure the code warms up. This makes further optimizations slightly more difficult. It's not a problem for people with experience optimizing Python code, but for people who actually hope to learn something from OP's blog post, it might be a sticking point.
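To make the warm-up point concrete, here is a small sketch of a timing helper that throws away a number of initial runs before recording anything. `work` and `timed_runs` are made-up names, and the warm-up count of 50 is an arbitrary assumption; how many iterations a JIT needs depends on the code.

```python
import statistics
import time


def work(n=2000):
    # Stand-in for the code being measured.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_runs(func, warmup=50, runs=20):
    # Discard the first calls so a JIT has a chance to compile the hot loop.
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    samples = timed_runs(work)
    print(f"median {statistics.median(samples):.6f}s, min {min(samples):.6f}s")
```

On CPython the warm-up runs change little; under PyPy the early and late samples can differ a lot, which is exactly what makes single-shot profiles misleading.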
In my experience using Python for stats and script-type work (as opposed to writing servers and daemons), pypy just isn't that useful. All the Python code is doing is gluing numpy and Cython code together, and pypy isn't likely to be able to warm up in time to beat it. It won't beat it anyway, since the program is spending most of its time in C.
Obviously, if pypy is an ideal choice for you, use it. But I don't think your experience should really be put forward as a general approach.
Everything you say is true, but under the systems I use/write I am able to test for correctness pretty quickly. The nice thing about switching from CPython to PyPy is that everything gets faster. I have also found that using PyPy has removed lots of cases where I would want to drop down to native code.
Changing platforms can make one's designs simpler and more robust. When it comes to structured storage, I'll start with sqlite, then when it starts to get slow I'll switch to PostgreSQL. It takes almost no work to port from one to the other.
You really should give PyPy another shot. It supports more of numpy every day and the startup time is excellent. Maybe give jitpy a try if you are not likely to move off of CPython.
It IS an interpreter itself, and it runs on a virtual machine that takes care of JIT compilation.
Jython may do the same: rather than create CPython bytecode, it may produce JVM bytecode, which may be optimized by the JVM. However, I do not know anything about Jython performance, and they may not be employing the same tactics as JRuby.
With modern Java features like lambdas coming into Java 7 and 8, and other interesting languages like Scala, Groovy, etc. being written for the JVM, I'm sure things have come a long way since the time Jython 2.4 was being developed on Java 5/6, and I'm sure the JVM has many more optimizations that dynamic languages may benefit from.
I think Clojure is one of the most interesting, so I'm glad you included it in your minimal list of examples. I don't think Clojure actually uses the Java 7 "invoke dynamic" bytecode though.
An interpreter executes scripting instructions directly.
A virtual machine implements a faux (virtualized) cpu, with its instruction set etc, and executes its "assembly code" (with or without JITing).
(Things get complicated in that you can also have combinations of those concepts).
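A toy illustration of the distinction, as a sketch only: the tuple-based expression format and the `PUSH`/`ADD`/`MUL` instruction set are invented for this example and have nothing to do with how CPython, PyPy or the JVM are actually built.

```python
# An expression is a number, or a tuple (op, left, right) with op "+" or "*".

def interpret(node):
    # Interpreter style: walk the source structure and execute it directly.
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b


def compile_expr(node, code=None):
    # Translate the expression into instructions for a made-up stack machine.
    code = [] if code is None else code
    if isinstance(node, (int, float)):
        code.append(("PUSH", node))
    else:
        op, left, right = node
        compile_expr(left, code)
        compile_expr(right, code)
        code.append(("ADD",) if op == "+" else ("MUL",))
    return code


def run_vm(code):
    # Virtual machine style: execute the instruction set, not the source.
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


expr = ("+", 1, ("*", 2, 3))
print(interpret(expr))             # 7
print(run_vm(compile_expr(expr)))  # 7
```

Both paths give the same answer; the second one just goes through an intermediate "assembly code", which is the part a JIT could later turn into native code.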