Write Fast Apps Using Async Python 3.6 and Redis(eng.paxos.com) |
more performant than....what exactly? If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms? Answer: no. async only gives you throughput, it has nothing to do with "faster" as far as the Python interpreter / GIL / anything like that. If you aren't actually spanning among dozens/hundreds/thousands of network connections, non-blocking IO isn't buying you much at all over using blocking IO with threads, and of course async / greenlets / threads are not a prerequisite for non-blocking IO in any case (only select() is).
it's nice that uvloop seems to be working on removing the terrible performance latency that out-of-the-box asyncio adds, so that's a reason that asyncio can really be viable as a means of gaining throughput without adding lots of latency you wouldn't get with gevent. But I can do without the enforced async boilerplate. Thanks javascript!
From the last benchmark I ran [1] async IO was insignificantly faster than thread-per-connection blocking IO in terms of latency, and marginally faster only after we hit a large number of clients.
Async IO doesn't necessarily make your code faster, it just makes it difficult to read.
[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...
const tweets = await getTweets(users);
console.log(tweets);
Is async code really harder to read?
Then you'll really be irritated.
Yes, it does not magically make fetching 100 rows any faster, nor your pbkdf2()/bcrypt() call. You still need to wait for those.
This type of operation is a given in any production quality webserver, whether it runs with multiple threads and blocking IO or using a non-blocking approach with greenlets. For a web application, this is an implementation detail that should not be explicit within the request handling code (a request handled in the context of a web container after all is a package of data in, a package of data out. no network reading/writing is usually exposed to the web application unless it's trying to expose IO handles to the app, which is unusual). Easy enough with something like Gunicorn.
If you have to do 1000 queries it could, since async would make it feasible to run them in parallel. If it's a single query, maybe async would make it feasible to shard the database.
Am I wrong?
Potentially, it depends on if you can do other tasks for the same request that don't depend on the data. You might be able to render most of the page for instance. It's not purely about throughput.
Please tell me that 300ms was made up too and that it's not really taking that long.
it seems the main bottleneck when using aiohttp is aiohttp itself, which practically makes the use of uvloop irrelevant
Well, actually, yes. Without async rendering, your webpage is not ready until the full list of 1000 rows has been loaded into Python memory, rendered to HTML as a whole, and returned to your browser after something like 300ms of server time.
With async rendering, your webpage's headers and such can be returned immediately, so your time to first byte can be under 50ms, and the page then loads incrementally as the remaining rows are enumerated and rendered.
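A minimal sketch of the incremental rendering described above, assuming a hypothetical `fetch_rows` that stands in for a database cursor (no real driver or web framework is used; a real server would write each chunk to the socket instead of collecting them):

```python
import asyncio


async def fetch_rows(n):
    """Hypothetical stand-in for a database cursor that yields rows one by one."""
    for i in range(n):
        await asyncio.sleep(0)  # give control back to the loop, as a driver would while waiting on IO
        yield f"row {i}"


async def render_page(n_rows):
    """Yield the page in chunks so the first bytes can leave before all rows are read."""
    yield "<html><body><ul>"
    async for row in fetch_rows(n_rows):
        yield f"<li>{row}</li>"
    yield "</ul></body></html>"


async def collect(n_rows):
    # Gather the chunks into a list only so the result can be inspected.
    return [chunk async for chunk in render_page(n_rows)]


# asyncio.run needs Python 3.7+; on 3.6 use loop.run_until_complete instead.
chunks = asyncio.run(collect(3))
```

The opening chunk is available before any row has been fetched, which is the whole point of the time-to-first-byte argument.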
def on_connection(send, db):
    # Plain blocking code; send and db are supplied by whatever server calls this.
    send("headers")
    send("start of page")
    for row in db:
        send(row)
    send("footer")
will have the exact same effect as what you said (not that it applies regardless; I don't think Jinja outputs partial renders, since it's made for Flask).
The performance comparison is between Python-managed green threads and OS-managed actual threads. You don't get any new features.
While I can write this kind of code, I don't feel like I completely understand some of the concepts.
https://pragprog.com/book/pb7con/seven-concurrency-models-in...
When working with Python and Ruby I find 80ms responses acceptable. In very optimized situations (no framework) this can go down to 20ms.
Now I've used some Haskell, OCaml and Go and I have learned that they can typically respond in <5ms. And that having a framework in place barely increases the response times.
In both cases this includes querying the db several times (db queries usually take less than a millisecond; Redis should be similar, to the extent that it does not change the outcome).
<5ms makes it possible to not worry about caching (and thus cache invalidation) for a much longer time.
I've come to the conclusion that --considering other languages-- speed is not to be found in Python and Ruby.
Apart from the speed story there's also resource consumption, and in that game it is only compiled languages that truly compete.
Last point: given the point I make above and that nowadays "the web is the UI", I believe that languages for hi-perf application development should: compile to native and compile to JS. Candidates: OCaml/Reason (BuckleScript), Haskell (GHCJS), PureScript (ps-native), [please add if I forgot any]
Wow, never managed to do that. Maybe I have to try it again (last time checked on Django was some years ago).
I'm confused by the relationship between Paxos, the company, and Paxos, the algorithm. Do the authors of Paxos work for Paxos?
Edit:
https://en.m.wikipedia.org/wiki/Paxos_(computer_science)
Ah; both are named for a fictional financial system
https://magic.io/blog/uvloop-blazing-fast-python-networking/
I would prefer standard benchmarks for this. I hope they submit their framework to the TechEmpower benchmarks.
One of the benefits of modern RDBMS is that they make extremely sophisticated use of RAM, and all levels of fast to slow storage below that SSD / RAIDs / slow single spindle.
It is a relatively thin layer of Rust code between the Redis module interface and SQLite.
At the moment you can simply execute statements but any suggestion and feature request is very welcome.
Yes, it is possible to do join, to use the LIKE operator and pretty much everything that SQLite gives you.
It is a multi-threaded module, which means that it does NOT block the main Redis thread and performs quite well. On my machine I achieved 50,000 inserts per second for the in-memory database.
If you have any questions, feel free to ask here or to open issues and pull requests in the main repo.
:)
Proceeds to show an animation of posting a blog post that performs no faster than if it was built using Django.
Might be that the server is insanely slow, but I would have no problems reaching 10k page views per second with some basic PHP and even MariaDB on a low end E3-1230 server. Pretty sure more would be quite easy to...
In async applications, the event loop is what actually executes your code and performs IO. In essence, event loops are under load all the time.
In the real world, your web page involves more than one query (say MySQL + Redis + some RPC calls to microservices). With async APIs, you can issue all of the queries concurrently and join the results at rendering time.
The async benefits can add up to a much more responsive server.
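A rough sketch of that "issue everything at once, join at render time" idea, using `asyncio.gather`. The `fake_query` coroutine is a hypothetical stand-in that just sleeps, not a real MySQL, Redis or RPC client, so the timings only illustrate the shape of the effect:

```python
import asyncio
import time

BACKENDS = ("mysql", "redis", "rpc")


async def fake_query(name, delay):
    """Hypothetical stand-in for a backend call; sleeps instead of doing IO."""
    await asyncio.sleep(delay)
    return name


async def serial(delay):
    # One query after another: total time is roughly the sum of the delays.
    results = []
    for name in BACKENDS:
        results.append(await fake_query(name, delay))
    return results


async def concurrent(delay):
    # All queries in flight together: total time is roughly the slowest one.
    return await asyncio.gather(*(fake_query(name, delay) for name in BACKENDS))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)  # Python 3.7+; use run_until_complete on 3.6
    return result, time.perf_counter() - start


serial_result, serial_time = timed(serial(0.05))
concurrent_result, concurrent_time = timed(concurrent(0.05))
```

Note that this only helps when the queries are independent of each other and the backends can actually serve them in parallel.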
a program with threads can support multiple requests simultaneously. a program with green threads can support multiple requests simultaneously.
You aren't giving any reasons why green threads in Python perform better than threads in the OS.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms6...
It's built by Tom Christie - the original author of Django REST Framework.
http://discuss.apistar.org/t/how-does-api-star-fit-in-with-d...
Interestingly it eschews Swagger/OpenAPI in favour of JSON Schema, wonder how that'll pan out; I like the promise of codegen that swagger offers, but haven't found the generated clients to be particularly usable.
I keep hitting a wall with Python when I want to do something like:
1. subscribe to a websocket connection and keep the last received message in state
2. expose an http endpoint to let a client GET that last message.
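For what it's worth, the two steps can be sketched in a single asyncio process. This is only an outline under assumptions: an `asyncio.Queue` stands in for the websocket feed, `handle_get` stands in for an HTTP handler, and `LastMessage` is a made-up holder class; no real websocket or HTTP library is involved.

```python
import asyncio


class LastMessage:
    """Hypothetical in-process holder for the most recent message."""

    def __init__(self):
        self.value = None


async def consume(feed, state):
    """Stand-in for a websocket reader: keep only the newest message."""
    while True:
        msg = await feed.get()
        if msg is None:  # sentinel used only to end this demo
            break
        state.value = msg


async def handle_get(state):
    """Stand-in for an HTTP GET handler: return whatever was seen last."""
    return state.value


async def demo():
    feed = asyncio.Queue()
    state = LastMessage()
    reader = asyncio.ensure_future(consume(feed, state))  # runs alongside the handler
    for msg in ("a", "b", "c"):
        await feed.put(msg)
    await feed.put(None)
    await reader
    return await handle_get(state)


last = asyncio.run(demo())  # Python 3.7+
```

Because both coroutines run on the same event loop in the same process, the shared state needs no locking; that stops being true once there are several processes or servers.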
If you were going to share state in memory between threads, how would you handle the case where the second request goes to a different server or that the process has restarted? You'd need redis anyway, so you might as well just use it in all cases.
http://flask.pocoo.org/docs/0.12/patterns/caching/
I'm not sure how this works with multiple threads though, I imagine you would have to synchronize it yourself.
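A sketch of what "synchronize it yourself" could look like with plain threads. `LockedDict` is a hypothetical helper written for this example, not something from Flask or Werkzeug:

```python
import threading


class LockedDict:
    """Hypothetical in-memory cache guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def increment(self, key):
        # Read-modify-write must happen under the lock to avoid lost updates.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]


def hammer(cache, n_threads=8, n_incr=1000):
    """Increment one key from several threads and return the final count."""

    def worker():
        for _ in range(n_incr):
            cache.increment("hits")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.get("hits")


total = hammer(LockedDict())
```

This only covers threads inside one process; it does nothing for the multi-server or restart cases mentioned earlier.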
Enjoy!
It's typical because people are still in a "single thread single transaction ORM crud" model of thinking. "It's linear because that's how it is"?
Sure, but it can now be a less heavy web request! ¯\_(ツ)_/¯
> a request typically has a single transaction going out to the database
The fact of the matter is, as applications develop, become richer, and grow larger, it becomes less and less uncommon to have more than one query per page. Especially in the context of larger organizations, it's very common to have everything wrapped behind a service call with an entire armada of infrastructure hidden behind it, and having to make many service calls to put together one web API result or page.
---
sigh Slight tangent. Look at where we are now and how we came here.
Back in the non-ajax days we used to do them all on the server side, then render the whole page all in one go. This would have come in handy back then! Imagine doing 5x 50ms queries asynchronously, dropping a 250ms response delay down to 50ms! But this stuff was hard back then, and we mostly left it alone.
This is also along the times when we figured out that since we can have pages that take a long time to load and block the interpreter, perhaps it's not such a great idea to serve many requests with a single interpreter, so people started using stuff like nginx to run multiple python interpreters in parallel (not even getting into threads here), which was easier to reason about since each python process is a separate universe that can block entirely, but overall we can still serve a new request with a new interpreter, so for the most part things are good.
Then the twisted people thought that this was silly, and why should we block in the first place, and they decided that the way to fix this was to change the way we program entirely, and re-create or wrap an entire ecosystem of software. It sort of worked, except there wasn't a good twisted package for your thing. But all in all it worked.
Then the greenlets (or one of its other 20 names) people came and wanted to instead use fine-grained implicit concurrency, and said "no no, we can get something with nicer abstraction packaging while mostly not changing the code we have", and that was even nicer, except when something didn't get monkey patched correctly for some reason. We got stuff like gunicorn, which was impressive.
Then we moved more stuff to the client to create more responsive (in the original meaning of the word) applications, so we pushed the burden of requesting and fetching data to the browser side, which means that as a page loads, it might call REST APIs one by one (hopefully asynchronously!), each of which might make a single (finer-grained) database or service call behind the scenes.
So how different is this now from the gunicorn model? In the latter, you get fine threads of control, each working asynchronously to fetch their own thing, which gets put together in the server side, and then sent back to the client. In the former, you get similarly fine threads of control, but the fine threads perhaps live in their own universes, and it doesn't get all put back together until it travels over the internet to the browser.
So it's a little bit different, but overall what's happening is similar. It feels like we just keep moving concerns and procedures up and down the stack.
Surely there's reasons for all this. Times and technologies change, and we find ways to adapt. I like the "async" stuff because it makes things explicit. It's the middle-ground result of the culmination of our learnings that hiding async behavior makes libraries hard to design and can result in frustrating and unpredictable behavior, whilst changing the entire programming model isn't great either. So we get asyncio. I'm mostly happy with this result. Admittedly this article isn't doing any of this justice.
In Python, you've also got to run the event loop and pass the async function to it. This makes playing with async code in the interpreter more difficult. Also don't forget that async is also turtles all the way up (same as in JS). It'll infect any synchronous code that touches it.
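To make the "run the event loop and pass the async function to it" step concrete, here is a small sketch. `get_answer` is a made-up coroutine used purely for illustration:

```python
import asyncio


async def get_answer():
    await asyncio.sleep(0)  # placeholder for real IO
    return 42


# Calling an async function only builds a coroutine object; nothing has run yet.
coro = get_answer()
is_coroutine = asyncio.iscoroutine(coro)

# Python 3.6 style: create a loop explicitly and hand the coroutine to it.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coro)
finally:
    loop.close()

# On Python 3.7+ the same thing is a one-liner: asyncio.run(get_answer())
```

Any plain function that wants the value has to go through a loop like this, or become `async` itself, which is the "turtles all the way up" effect.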
I've written a Tornado app which makes heavy use of asyncio, and while it's pretty efficient, I would reconsider writing it the same way if I had to go back in time.
In your example you'd probably want to be using Promise.all to run two IO operations simultaneously.
The benefits are generally larger-scale than a single method.
Actual asynchronicity, usually with event-based systems, gets very ugly, very fast, because you end up having to make callback chains and queue up your async work. There can be a good benefit to doing it, but it's going to be a lot less readable than most sync code, and sometimes not any faster, as in the case of Node.js and its community forcing the use of async functions in places where they don't need to be used.
If APIStar happens to target the same subset, that's not a problem of course.
I'm happy with the choice.
The toy experiment is how to do what's trivial in Node with Python. Mainly because I like working with python. I think the answer might be: Python is the wrong tool for the job.
The simplest solution is to use a small DB system like sqlite. It is built into Python (import sqlite3), performs reasonably well, and you do not have to run an additional service.
Now if a small DB like sqlite already feels overblown to you (and it really is simple and small) you might not need concurrent access either, so the simplest solution is to just use a file where you store your state.
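As an illustration of the sqlite suggestion, a minimal key/value store for "last seen" state using only the standard library. The table name and the `open_store` / `save` / `load` helpers are made up for this sketch:

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database; pass a file path to persist across restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT)"
    )
    return conn


def save(conn, key, value):
    # "with conn" commits on success and rolls back if the statement raises.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)",
            (key, value),
        )


def load(conn, key):
    row = conn.execute(
        "SELECT value FROM state WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


conn = open_store()
save(conn, "last_message", "hello")
save(conn, "last_message", "world")  # overwrites the previous value
latest = load(conn, "last_message")
```

With a file path instead of ":memory:", several processes on the same machine can share the state, which is what the in-memory approaches cannot do.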
What you're referring to works equally well in the single-process case for both.
Python is excellent for toy implementations, and real ones too in many cases.
https://github.com/mkj/wort-templog/blob/master/web/templog.... is my not-quite-toy example - a single process runs from uwsgi with Bottle (like Flask) and gevent. The long polling waits on a global Event variable that's updated by another request, nice and simple.
And IMHO nodejs is not a standard BSD license and comes with patent grant. That discussion went on for a year in the TSC . https://github.com/nodejs/node/blob/master/LICENSE
In general, this stuff is not always evident. But the BSD license by itself is not as good as Apache.
When in doubt, use Apache!
P.S. fyi, doing this later is super heavy-duty hard.
Then I found this http://oboejs.com/ and it was even more work, and I gave up. In the end it required rethinking everything and battling against a whole set of tools and libraries that just didn't think that way.
besides all that, it's just simple cases that they are testing. I would never ever trust this site or any result they got.
I said transaction, not query. A database transaction is on a single connection at a time and queries are performed via the transaction serially.
Unless you're doing e-commerce or banking sites, that's far less common than non-transaction requests.
edit: also, I'd challenge you to prove that for a web request that needs to make ten read queries to a relational database, from Python, you can get better performance by opening up ten separate database connections (or taking them from a pool) and running one query in each, bundled into the async construct of your choice and then merging them all back into your response, vs. just running ten queries on a single connection in serial. Assume these are not slow reporting-style transactions, just the usual "load the user's full name, load the current status, load the user's current items", etc., small queries common in a web request that is looking for a very fast response with ten SQL queries.
Note that at the very least, it means your web application needs to use ten times as many database connections for a given set of load. In database-land that's more or less crazy.
and also my count example: it just makes no sense to have the count and the list data called inside a transaction (OK, there are cases, but these are far rarer, because mostly it's not too bad to give users a wrong count; you don't need strict serializability)
Anyway, I respect your position that yes, for the average user, throwing a bunch of "async" in there isn't going to make their code faster, and it's just cargo cult programming. And yes, there is some tradeoff curve where sometimes, for a small benefit, it's not worth the effort to worry about it, as with all things. But it's just a tough sell to argue that no one should need this :-)
More and more often today, the backend serves as glue between frontend clients and a horde of services / data systems. This is often an I/O heavy workload (wait while I make a request, wait for a response, wait while I download x10). This kind of workload is ripe for speeding up with async. That's all I'm saying!
At least in Java, C#, Golang. And even psycopg2 offers a pooling abstraction (I guess it's not used in Django, but SQLAlchemy offers that as well). But of course running a blocking driver atop a non-blocking framework does not give the best performance.
However just challenging it without proof is not really that useful.
Also, some workloads are better suited to threaded servers while others are better in async fashion. It's also highly unlikely that just wrapping your database connection in an async function will make it faster or better suited for an async workload. If you are not non-blocking from the ground up you will still carry a lot of overhead around.
OK but you're doing....500 req/s let's say, so, if base latency is 50ms, you're going to have at least 25 requests in play at once, so at ten connections per request that's 250 database connections. That's one worker process. If your site is using....two app servers, or your web service has multiple worker processes, or etc., now you have 500, 750, etc. database connections in play at capacity. This is a lot. Not to mention you'd better be using a middleware connection pool if you have that many connections per process to at least reduce the DB connection use for processes that aren't at capacity.
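The in-flight figure is just arrival rate times latency (Little's law). A small sketch for playing with the numbers; the function names are made up, and the ten-connections-per-request figure is taken from the challenge earlier in the thread:

```python
def in_flight_requests(requests_per_second, latency_ms):
    """Average number of requests being served at once (Little's law: L = rate * time)."""
    return requests_per_second * latency_ms / 1000


def connections_needed(requests_per_second, latency_ms, connections_per_request,
                       worker_processes=1):
    """Database connections held at capacity if every request opens its own set."""
    per_process = in_flight_requests(requests_per_second, latency_ms) * connections_per_request
    return per_process * worker_processes


# 500 req/s at 50 ms base latency gives 25 requests in flight.
concurrent_requests = in_flight_requests(500, 50)

# One connection per request versus ten per request, single worker process.
serial_style = connections_needed(500, 50, connections_per_request=1)
fan_out_style = connections_needed(500, 50, connections_per_request=10)
```

Whatever the exact inputs, the fan-out approach multiplies the connection count by the number of queries per request.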
On MySQL, each DB connection is a thread (MariaDB has a new thread pooling option also), so after all the trouble we've gone to not use threads, we are stuck with them anyway. On Postgresql, each DB connection is a fork(), and they also use a lot of memory. In both of these cases, we have to be mindful of having too many connections in play for the DB servers to perform well. We're purposely using many, many more DB connections than we need on the client side to try to grab at some fleeting performance gain by stacking small queries among several transactions/connections per request which is not how these databases were designed to be used (a DB like Redis, sure, but an RDBMS, not so much), and on the client side, I still argue that the overhead of all the async primitives is going to be in a very tight race to not be ultimately slower than running the queries in serial (plus the code is much more complicated), and throughput across many requests is reduced using this approach. Marginal / fleeting gains on the client vs. huge price to pay on the server + code complexity + ACID is gone makes this a pretty tough value proposition.
Postgresql wiki at https://wiki.postgresql.org/wiki/Number_Of_Database_Connecti...: "You can generally improve both latency and throughput by limiting the number of database connections with active transactions to match the available number of resources, and queuing any requests to start a new database transaction which come in while at the limit. ". Which means stuffing a load of connections per request means you're limiting the throughput of your applications....and throughput is the reason we'd want to use non-blocking IO in the first place.
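The "limit active connections and queue the rest" advice from the wiki can be sketched on the client side with an `asyncio.Semaphore`. This is an illustration only: `run_query` sleeps instead of talking to a database, and real applications would normally get this behaviour from a connection pool.

```python
import asyncio


async def run_query(limiter, stats, delay=0.01):
    """Pretend query: waits for a free slot, then holds it for `delay` seconds."""
    async with limiter:  # callers beyond the limit queue here
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(delay)  # stand-in for the actual database round trip
        stats["active"] -= 1


async def demo(n_requests=20, max_connections=4):
    limiter = asyncio.Semaphore(max_connections)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(run_query(limiter, stats) for _ in range(n_requests)))
    return stats["peak"]


peak_active = asyncio.run(demo())  # Python 3.7+
```

All twenty pretend queries complete, but never more than `max_connections` of them hold a slot at the same time.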
> However just challenging it without proof is not really that useful.
this is all about a commonly made assertion (async == speed) that is never shown to be true and I only ask for proof of that assertion. Or maybe if blog posts like this one could be a little more specific in their language, which would go a long way towards bringing people back into reality.
yes there are workloads where everything you say is true. but most other workloads, like 80% of all web pages, don't need what you describe.
also some pages don't have a conventional database at all. some people have a cache or some other services in place, some people use microservices, some people connect to other internet providers, other services like lpd/ipp etc. the world is just not black and white. everything you describe is utterly crap since you just try to talk around it, because your application is not as complex as others. and yes in probably 60-70% of the cases async will not yield more "speed"/"performance", however you call it.
I work with OpenStack. I don't think you're going to find something more complicated :). (It does use eventlet for most services, though it's starting to move away from that model back to mod_wsgi / threads.)