Maybe Rust isn’t a good tool for massively concurrent, userspace software (bitbashing.io)

704 points by mrkline 2 years ago | 613 comments

Animats 2 years ago |

OK, I suppose I should write to this.

As I've mentioned before, I'm writing a high performance metaverse client. Here's a demo video.[1] It's about 40,000 lines of Rust so far.

If you are doing a non-crappy metaverse, which is rare, you need to wrangle a rather excessive amount of data in near real time. In games, there's heavy optimization during game development to prevent overloading the play engine. In a metaverse, as with a web browser, you have to take what the users create and deal with it. You need 2x-3x the VRAM a comparable game would need, a few hundred megabits per second of network bandwidth to load all the assets from servers, a half dozen or so CPUs running flat out, and Vulkan to let you put data into the GPU from one thread while another thread is rendering.

So there will be some parallelism involved.

This is not like "web-scale" concurrency, which is typically a large number of mini-servers, each doing their own thing, that just happen to run in the same address space. This is different. There's a high priority render thread drawing the graphics. There's an update thread processing incoming events from the network. There are several asset loading and decompression threads, which use up more CPU time than I'd like. There are about a half dozen other threads doing various miscellaneous tasks - handling moving objects, updating levels of detail, purging caches, and such.

There's considerable locking, but no "static" data other than constants. No globals. Channels are used where appropriate to the problem. The main object tree is single ownership, and used mostly by the update thread. Its links to graphics objects are Arc reference counted, and those are updated by both the update thread and the asset loading threads. They in turn use reference counted handles into the Rend3 library, which, via WGPU and Vulkan, puts graphics content (meshes and textures) into the GPU. Rendering is a loop which just tells Rend3 "Go", over and over.

This works out quite well in Rust. If I had to do this in C++, I'd be fighting crashes all the time. There's a reason most of the highly publicized failed metaverse projects didn't reach this level of concurrency. In Rust, I have about one memory related crash per year, and it's always been in someone else's "unsafe" code. My own code has no "unsafe", and I have "unsafe" locked out to prevent it from creeping in. The normal development process is that it's hard to get things to compile, and then it Just Works. That's great! I hate using a debugger, especially on concurrent programs. Yes, sometimes you can get stuck for a day, trying to express something within the ownership rules. Beats debugging.

I have my complaints about Rust. The main ones are:

- Rust is race condition free, but not deadlock free. It needs a static deadlock analyzer, one that tracks through the call chain and finds that lock A is locked before lock B on path X, while lock B is locked before lock A on path Y. Deadlocks, though, tend to show up early and are solid problems, while race conditions show up randomly and are hard to diagnose.

- Async contamination. Async is all wrong when there's considerable compute-bound work, and incompatible with threads running at multiple priorities. It keeps creeping in. I need to contact a crate maintainer and get them to make their unused use of "reqwest" dependent on a feature, so I don't pull in Tokio. I'm not using it, but it's there.

- Single ownership with a back reference is a very common need, and it's too hard to do. I use Rc and Weak for that, but shouldn't have to. What's needed is a set of traits to manage consistent forward and back links (that's been done by others) and static analysis to eliminate the reference counts. The basic constraints are ordinary borrow checker restrictions - if you have mutable access to either parent or child, you can't have access to the other one. But you can have non-mutable access to both. If I had time, I'd go work on that.

- I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm.

- The core graphics crates aren't finished. There was an article on HN a few days ago about this. "Rust has 5 games and 50 game engines". That's not a language problem, that's an ecosystem problem. Not enough people are doing non-toy graphics in Rust. Watch my video linked below.[1] Compared to a modern AAA game title, it's not that great. Compared to anything else being done in Rust (see [2]) it's near the front. This indicates a lack of serious game dev in Rust. I've been asked about this by some pro game devs. My comment is that if you have a schedule to meet, the Rust game ecosystem isn't ready. It's probably about five people working for a year from being ready.

[1] https://video.hardlimit.com/w/tp9mLAQoHaFR32YAVKVDrz

[2] https://gamedev.rs/
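For readers unfamiliar with the "single ownership with a back reference" item in the list above, here is a minimal sketch of the Rc/Weak workaround it mentions. The `TreeNode` type and its methods are hypothetical names chosen for illustration, not code from the metaverse client.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node: the parent owns its children (strong `Rc`),
/// and each child holds a non-owning `Weak` back reference to its parent.
struct TreeNode {
    name: String,
    parent: RefCell<Weak<TreeNode>>,
    children: RefCell<Vec<Rc<TreeNode>>>,
}

impl TreeNode {
    fn new(name: &str) -> Rc<TreeNode> {
        Rc::new(TreeNode {
            name: name.to_string(),
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(Vec::new()),
        })
    }

    /// Links `child` under `parent`, setting the forward and back links together.
    fn add_child(parent: &Rc<TreeNode>, child: Rc<TreeNode>) {
        *child.parent.borrow_mut() = Rc::downgrade(parent);
        parent.children.borrow_mut().push(child);
    }

    /// Follows the back link; returns `None` if the parent is unset or already dropped.
    fn parent_name(&self) -> Option<String> {
        self.parent.borrow().upgrade().map(|p| p.name.clone())
    }
}

/// Returns the child's view of its parent before and after the parent is dropped.
fn back_link_demo() -> (Option<String>, Option<String>) {
    let root = TreeNode::new("root");
    let leaf = TreeNode::new("leaf");
    TreeNode::add_child(&root, Rc::clone(&leaf));
    let before = leaf.parent_name();
    drop(root); // the Weak link does not keep the parent alive
    let after = leaf.parent_name();
    (before, after)
}
```

The reference counts and `RefCell` borrow flags here are checked at run time, which is the overhead the comment would like static analysis to remove.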

Dinux 2 years ago | |

We've been building our robotic simulators in Rust for the past 3 years and I have the exact same experience. So far, I think, we've encountered maybe 5 actual runtime bugs over the last 3 years. Sure, Rust has some problems, and yes, async isn't fully there yet, but overall the benefits outweigh the problems.

jvanderbot 2 years ago | | |

Async as a paradigm seems so against what GP was discussing. If I understood, and from my experience, we're talking more about concurrent execution with carefully-designed priorities, locks, and timing requirements. This is closer to the embedded / systems-level concurrency, if I understand it right. Are we really expecting a coroutine/ async style to just lift into this world?

NalNezumi 2 years ago | | |

Out of curiosity, is this robotics simulator open source/available?

JoshTriplett 2 years ago | |

> Rust is race condition free, but not deadlock free. It needs a static deadlock analyzer, one that tracks through the call chain and finds that lock A is locked before lock B on path X, while lock B is locked before lock A on path Y.

That sounds like a great idea. Something in the style of lockdep, that (when enabled) analyzes what locks are currently held while any other lock is taken, and reports any potential deadlocks (even if they haven't actually deadlocked).

That would require some annotation to handle cases of complex locking, so that the deadlock detection knows (for instance) that a given class of locks are always obtained in address order so they can't deadlock. But it's doable.
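As an illustration of the "always obtained in address order" convention mentioned above, here is a minimal sketch using only the standard library. `lock_both` and `transfer_demo` are hypothetical helpers, and the sketch assumes the two mutexes are distinct.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Locks two distinct mutexes of the same class in address order, so two
/// threads calling this with the arguments swapped still acquire the
/// underlying locks in the same order. Guards are returned in argument order.
fn lock_both<'a, T>(
    a: &'a Mutex<T>,
    b: &'a Mutex<T>,
) -> (MutexGuard<'a, T>, MutexGuard<'a, T>) {
    assert!(!std::ptr::eq(a, b), "lock_both needs two distinct mutexes");
    let a_first = (a as *const Mutex<T> as usize) < (b as *const Mutex<T> as usize);
    if a_first {
        let ga = a.lock().unwrap();
        let gb = b.lock().unwrap();
        (ga, gb)
    } else {
        let gb = b.lock().unwrap();
        let ga = a.lock().unwrap();
        (ga, gb)
    }
}

/// Two threads move units in opposite directions. Locking naively in argument
/// order could deadlock here; with address ordering it always completes.
fn transfer_demo(rounds: u32) -> (i64, i64) {
    let x = Arc::new(Mutex::new(1000i64));
    let y = Arc::new(Mutex::new(1000i64));

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let forward = thread::spawn(move || {
        for _ in 0..rounds {
            let (mut from, mut to) = lock_both(&x1, &y1);
            *from -= 1;
            *to += 1;
        }
    });

    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let backward = thread::spawn(move || {
        for _ in 0..rounds {
            let (mut from, mut to) = lock_both(&y2, &x2);
            *from -= 1;
            *to += 1;
        }
    });

    forward.join().unwrap();
    backward.join().unwrap();
    let totals = (*x.lock().unwrap(), *y.lock().unwrap());
    totals
}
```

A lockdep-style checker would still need an annotation to know that locks taken through a helper like this cannot deadlock against each other.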

elabajaba 2 years ago | | |

There's tracing-mutex that builds a dag of your locks when you acquire them and panics (at runtime) if it could deadlock: https://github.com/bertptrs/tracing-mutex

parking_lot has a deadlock detection feature for when you deadlock that iirc tells you what deadlocked (so you're not trying to figure it out with a debugger and a lot of time) https://amanieu.github.io/parking_lot/parking_lot/deadlock/i...

I also just found out about https://github.com/BurtonQin/lockbud which seems to detect deadlocks and a few other issues statically? (seems to require compiling your crate with the same version of rust as lockbud uses, which from the docs is an old 1.63 nightly build?)

wrsh07 2 years ago | | |

Google has `acquired_before`: https://abseil.io/docs/cpp/guides/synchronization#thread-ann...

It's quite nice, but for cpp not rust

nine_k 2 years ago | | |

I wonder if locks may have some thread-local registry, at least in debug builds.

If locks can be numbered or otherwise ordered, it would be easy to enforce a strict order of taking locks and an inverse strict order of releasing them, by looking up in the registry which locks your thread is currently holding. This would prevent deadlocks.

This, of course, would require having an idea of all the locks you may want to hold, and their relative order (at least partial), as Dijkstra described back in the day. But thinking about locks ahead of time is a good idea anyway.
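A rough sketch of what such a thread-local registry could look like, assuming each lock gets a fixed numeric rank. `RankedMutex`, `RankedGuard`, and `violates_order` are hypothetical names. The sketch checks acquisition order only and does not enforce the inverse release order.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::panic::{self, AssertUnwindSafe};
use std::sync::{Mutex, MutexGuard};

thread_local! {
    /// Ranks of the locks currently held by this thread.
    static HELD_RANKS: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

/// Hypothetical wrapper: a mutex with a fixed rank in a global lock order.
pub struct RankedMutex<T> {
    rank: u32,
    inner: Mutex<T>,
}

/// Guard that removes its rank from the thread-local registry on drop.
pub struct RankedGuard<'a, T> {
    rank: u32,
    guard: MutexGuard<'a, T>,
}

impl<T> RankedMutex<T> {
    pub fn new(rank: u32, value: T) -> Self {
        RankedMutex { rank, inner: Mutex::new(value) }
    }

    /// Locks the mutex, panicking if this thread already holds a lock of
    /// equal or higher rank, since that would break the global order.
    pub fn lock<'a>(&'a self) -> RankedGuard<'a, T> {
        HELD_RANKS.with(|held| {
            let mut held = held.borrow_mut();
            if let Some(&highest) = held.iter().max() {
                assert!(
                    self.rank > highest,
                    "lock order violation: taking rank {} while holding rank {}",
                    self.rank,
                    highest
                );
            }
            held.push(self.rank);
        });
        // A poisoned mutex is still usable for this demonstration.
        let guard = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        RankedGuard { rank: self.rank, guard }
    }
}

impl<'a, T> Drop for RankedGuard<'a, T> {
    fn drop(&mut self) {
        HELD_RANKS.with(|held| {
            let mut held = held.borrow_mut();
            if let Some(pos) = held.iter().rposition(|&r| r == self.rank) {
                held.remove(pos);
            }
        });
    }
}

impl<'a, T> Deref for RankedGuard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.guard
    }
}

impl<'a, T> DerefMut for RankedGuard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.guard
    }
}

/// Returns true if taking `second` while holding `first` trips the order check.
pub fn violates_order(first: &RankedMutex<u32>, second: &RankedMutex<u32>) -> bool {
    panic::catch_unwind(AssertUnwindSafe(|| {
        let _g1 = first.lock();
        let _g2 = second.lock();
    }))
    .is_err()
}
```

Because the check only looks at what the current thread holds, it flags a bad ordering the first time any thread attempts it, even if no deadlock actually occurs on that run. In a real build the check would probably be compiled in for debug builds only.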

winrid 2 years ago | |

I'm doing basically the same thing in Java for an MMO and the JDK makes it so easy. Just move objects via concurrent queues from network to model creation to UI threads. It's actually quite boring, and fast!

Animats 2 years ago | | |

Is there video or a demo?

ttfkam 2 years ago | | |

Non-blocking I/O is quite mature on Java, and it shows. Unfortunately Java is still a rabid devourer of memory. Its RAM consumption tends to be the biggest con whenever evaluating the pros of using Java. Sometimes it's worth it. More and more often it's not anymore.

elcritch 2 years ago | |

Good luck on the metaverse app! I'd love to see more interesting metaverse takes.

One quibble though. Rust isn't race condition free, it's data race free. You can still end up with race conditions outside of data access. https://news.ycombinator.com/item?id=23599598
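To make the distinction concrete, here is a small sketch of a race condition that contains no data race: every access goes through a `Mutex`, but the check and the update are separate critical sections. The function names are made up for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Racy: the check and the update use two separate lock acquisitions, so
/// another thread can withdraw in between. There is no data race, yet the
/// outcome depends on thread timing.
fn withdraw_racy(balance: &Mutex<u64>, amount: u64) -> bool {
    let enough = *balance.lock().unwrap() >= amount; // lock released here
    if enough {
        let mut b = balance.lock().unwrap();
        *b = b.saturating_sub(amount); // may act on a stale check
        true
    } else {
        false
    }
}

/// Fixed: the check and the update happen under one lock acquisition.
fn withdraw_atomic(balance: &Mutex<u64>, amount: u64) -> bool {
    let mut b = balance.lock().unwrap();
    if *b >= amount {
        *b -= amount;
        true
    } else {
        false
    }
}

/// Runs `threads` concurrent withdrawals with the fixed version and returns
/// (number of successful withdrawals, remaining balance).
fn run_withdrawals(start: u64, amount: u64, threads: usize) -> (usize, u64) {
    let balance = Arc::new(Mutex::new(start));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let balance = Arc::clone(&balance);
            thread::spawn(move || withdraw_atomic(&balance, amount))
        })
        .collect();
    let successes = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&ok| ok)
        .count();
    let remaining = *balance.lock().unwrap();
    (successes, remaining)
}
```

Both versions compile without complaint. The compiler rules out unsynchronized access, not a check that has gone stale by the time it is acted on.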

hedora 2 years ago | |

> Async is all wrong when there's considerable compute-bound work, and incompatible with threads running at multiple priorities

The priority thing is relatively easy to fix:

Either create multiple thread pools, and route your futures to them appropriately.

Or, write your own event loop, and have it pull from more than one event queue (each with a different priority).

It should be even easier than that, but I don’t know of a crate that does the above out of the box.

One advantage of the second approach is (if your task runtime is bounded) that you can have soft realtime guarantees for high priority stuff even when you are making progress on low priority stuff and running at 100% CPU.
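A minimal sketch of the second approach, assuming plain closures rather than futures and a single worker thread. `PriorityQueue` and its methods are hypothetical names, built on a standard library `Mutex` and `Condvar`.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

type Task = Box<dyn FnOnce() + Send + 'static>;

#[derive(Default)]
struct Queues {
    high: VecDeque<Task>,
    low: VecDeque<Task>,
    closed: bool,
}

/// Hypothetical two-level task queue shared by producers and one event loop.
#[derive(Default)]
pub struct PriorityQueue {
    queues: Mutex<Queues>,
    ready: Condvar,
}

impl PriorityQueue {
    pub fn push_high(&self, task: impl FnOnce() + Send + 'static) {
        self.queues.lock().unwrap().high.push_back(Box::new(task));
        self.ready.notify_one();
    }

    pub fn push_low(&self, task: impl FnOnce() + Send + 'static) {
        self.queues.lock().unwrap().low.push_back(Box::new(task));
        self.ready.notify_one();
    }

    /// Signals that no more tasks will arrive.
    pub fn close(&self) {
        self.queues.lock().unwrap().closed = true;
        self.ready.notify_all();
    }

    /// Event loop: always takes a high-priority task if one is queued,
    /// otherwise a low-priority one. Returns once closed and empty.
    pub fn run(&self) {
        loop {
            let task = {
                let mut q = self.queues.lock().unwrap();
                loop {
                    if let Some(t) = q.high.pop_front() {
                        break Some(t);
                    }
                    if let Some(t) = q.low.pop_front() {
                        break Some(t);
                    }
                    if q.closed {
                        break None;
                    }
                    q = self.ready.wait(q).unwrap();
                }
            };
            match task {
                Some(t) => t(), // run outside the lock so producers are not blocked
                None => return,
            }
        }
    }
}

/// Queues two low-priority tasks, then one high-priority task, and returns
/// the order in which a worker thread executed them.
pub fn priority_demo() -> Vec<&'static str> {
    let queue = Arc::new(PriorityQueue::default());
    let log = Arc::new(Mutex::new(Vec::new()));
    for name in ["low-1", "low-2"] {
        let log = Arc::clone(&log);
        queue.push_low(move || {
            log.lock().unwrap().push(name);
        });
    }
    let high_log = Arc::clone(&log);
    queue.push_high(move || {
        high_log.lock().unwrap().push("high-1");
    });
    queue.close();

    let worker_queue = Arc::clone(&queue);
    thread::spawn(move || worker_queue.run()).join().unwrap();
    let order = log.lock().unwrap().clone();
    order
}
```

A running low-priority task is never preempted, so the latency of high-priority work is bounded by the longest task, which is why the bounded-runtime condition above matters.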

astrange 2 years ago | | |

This doesn't help with priority inversions; since you don't know who is waiting on a future/promise until it starts waiting on it, you can't resolve them until then, which means you can have work running at too low a priority. It's not structured enough.

o11c 2 years ago | |

> Single ownership with a back reference is a very common need, and it's too hard to do.

I've been collecting a list[1] of what memory-management policies programmers actually want in their code; it is far more extensive than any particular language actually implements. Contributions are welcome!

I already had back reference on the list, but added some details. When the ownership is indirect (really common) it is difficult to automate.

One thing that always irritates me: Rust's decision to make all objects moveable really hurts it at times.

[1] https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f...

Animats 2 years ago | | |

Yes, back-linked objects are probably going to have to be pinned.

sillysaurusx 2 years ago | |

Cheering for your metaverse app. Hope to hear more about it. I suspected you might be doing gamedev but this is the first time you’ve shown extensive work.

One challenge with rust is that (for better or worse) most gamedev talent is C++. If you ever open source it I’d be interested in contributing, though I’m not sure how effective the contributions would be.

Good luck!

Animats 2 years ago | | |

Email sent.

I'm not as interested in self-promotion here as I am in getting more activity on Rust graphics development. I think the Rust core graphics ecosystem needs about five good graphics people for a year to get unstuck. Rust is a good language for this sort of thing, but you've got to have reliable heavy machinery down in the graphics engine room.

Until that exists, nobody can bet a project with a schedule and a budget on Rust. The only successful commercial high-detail game title I know of that uses Rust is a sailing race simulator. They simply linked directly to "good old DX11" (Microsoft Direct-X 11) and wrote the game logic in Rust. Bypassed Rust's own 3D ecosystems completely.

0xDEF 2 years ago | |

    "I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm."

Can you expand on this? I come from the C# world and the Rust trait system feels expressive enough to implement the good parts of OOP.

bombela 2 years ago | | |

I understand this not as objects being missing; after all, structs with methods and traits are objects, aren't they? It's more the lack of hierarchical inheritance, which is most often used in OOP to conveniently share common code with added specialization. Override only the methods you want. You can do it with traits of course, but it's much more verbose. You can technically use the Deref trait to simulate a sort of method inheritance, but it is frowned upon as it should be reserved for smart-pointer-like objects (so the doc says).
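For comparison, here is a small sketch of the trait-based version of "override only the methods you want", using default methods. The `Asset` trait and the two implementing types are invented for illustration and are not taken from the asset-processing code discussed above.

```rust
/// Hypothetical asset-processing trait. Default methods play the role of
/// base-class behaviour; implementors override only what differs.
trait Asset {
    /// Required: there is no sensible default.
    fn name(&self) -> String;

    /// Default decode step, used unless an implementor overrides it.
    fn decode(&self) -> String {
        format!("generic decode of {}", self.name())
    }

    /// Shared pipeline; calls the (possibly overridden) `decode`.
    fn load(&self) -> String {
        format!("{} -> uploaded", self.decode())
    }
}

struct Mesh;
struct Texture;

impl Asset for Mesh {
    fn name(&self) -> String {
        "mesh".to_string()
    }
    // Uses the default `decode` and `load`.
}

impl Asset for Texture {
    fn name(&self) -> String {
        "texture".to_string()
    }

    // Overrides one step only, much like overriding a virtual method.
    fn decode(&self) -> String {
        "custom decode of texture".to_string()
    }
}

/// Runs the shared pipeline over a mixed list through dynamic dispatch.
fn load_all(assets: &[Box<dyn Asset>]) -> Vec<String> {
    assets.iter().map(|a| a.load()).collect()
}
```

The verbosity shows up once the shared code needs data: a trait cannot declare fields, so every piece of common state needs an accessor method that each implementor writes out.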

throw10920 2 years ago | |

> Async contamination

I've always wondered why the "color" of a function can't be a property of its call site instead of its definition. That would completely solve this problem - you declare your functions once, colorlessly, and then can invoke them as async anywhere you want.

lmm 2 years ago | | |

> I've always wondered why the "color" of a function can't be a property of its call site instead of its definition. That would completely solve this problem - you declare your functions once, colorlessly, and then can invoke them as async anywhere you want.

If you have a non-joke type system (which is to say, Haskell or Scala) you can. I do it all the time. But you need HKT and in Rust each baby step towards that is an RFC buried under a mountain of discussion.

ditsuke 2 years ago | | |

The rust guys are working on this very problem with the keyword generics proposal https://blog.rust-lang.org/inside-rust/2022/07/27/keyword-ge...

kprotty 2 years ago | | |

If a function calls something that does something async, that can't be evaluated synchronously due to 1) no setup; could be async IO and require being called in the context of an async runtime (library feature, not language feature) and 2) blocking synchronously on an async task in an async runtime can result in deadlocks from task waiting on runtime IO polling but the waiting preventing the runtime from being polled.

kevincox 2 years ago | | |

In most runtimes you can just call something like `block_on`. There are some things to be careful about to avoid starving other tasks, but most general-purpose runtimes will spawn more threads as needed. Similarly, blocking in an async task is generally not much of an issue for these runtimes for the same reasons.

It isn't like JavaScript where there is truly only one thread of execution at a time and blocking it will block everything.

sznio 2 years ago | | |

`std::thread::spawn()` and `.join()` are the ultimate async implementation.
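A quick sketch of that style, where spawning stands in for the async call and `join` for the await. `parallel_sum` is a made-up helper name.

```rust
use std::thread;

/// Splits `data` into roughly `parts` chunks, sums each chunk on its own OS
/// thread, and joins the partial results.
fn parallel_sum(data: Vec<u64>, parts: usize) -> u64 {
    let parts = parts.max(1);
    // Ceiling division, clamped so `chunks` never receives zero.
    let chunk_len = ((data.len() + parts - 1) / parts).max(1);
    let handles: Vec<thread::JoinHandle<u64>> = data
        .chunks(chunk_len)
        .map(|chunk| {
            let owned = chunk.to_vec(); // each thread owns its share of the work
            thread::spawn(move || owned.iter().sum::<u64>())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}
```

This works well for a handful of compute-bound jobs. It stops scaling when the number of concurrent waits reaches the tens of thousands, which is the case async runtimes target.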

RetpolineDrama 2 years ago | |

>The normal development process is that it's hard to get things to compile, and then it Just Works. That's great! I hate using a debugger, especially on concurrent programs. Yes, sometimes you can get stuck for a day, trying to express something within the ownership rules. Beats debugging.

This is a far superior workflow when you factor in outcomes. More up front time to get a "correct"/more-reliable output scales infinitely better than churning out crap that you need to wrap in 10,000 lines of tests to keep from breaking/validate (See: the dumpster-fire that is Rails)

Someone 2 years ago | | |

> This is a far superior workflow when you factor in outcomes.

I’m a strong-typing enthusiast, too, but still, I’m not fully convinced that’s true.

It seems you can’t iterate fast at all in Rust because the code wouldn’t compile, but can iterate fast in C++, except for the fact that the resulting code may be/often is unstable.

If you need to try out a lot of things before finding the right solution, the ability to iterate fast may be worth those crashes.

Maybe, using C++ for fast iterations, and only using various tools to hunt down issues the borrow checker would catch on the iteration you want to keep beats using Rust.

Or do Rust programmers iterate fast using unsafe where needed and then fix things once they’ve settled on a design?

Animats 2 years ago | | |

I tend to agree, but pro game dev is a hell where people demand that a new feature be demoed for the producer by 1 PM tomorrow. I have the luxury of not being under such pressure.

jstx1 2 years ago | |

> Yes, sometimes you can get stuck for a day, trying to express something within the ownership rules.

This is a big problem. Fast iteration time is very valuable.

And who likes doing this to themselves anyway? Isn't it a very frustrating experience? How is this the most loved language?

ghosty141 2 years ago | | |

> And who likes doing this to themselves anyway? Isn't it a very frustrating experience? How is this the most loved language?

The thing is, these dependencies do exist no matter what language you use if they stem from an underlying concept. In that case Rust just makes you explicitly write them, which is a good thing, since in C++ all these dependencies would be more or less implicit, and every time somebody edits the code he needs to think all these cases through and get a mental model (if he sees it at all!). In Rust you at least have the lifetime annotations, which A: make it obvious there is some special dependency going on and B: show the explicit lifetimes etc.

So what I'm saying, you need to put in this work no matter which language you choose, writing it down is then not a big problem anymore. If you don't think about these rules your program will probably work most of the time but only most of the time, and that can be very bad for certain scenarios.

throw10920 2 years ago | | |

> How is this the most loved language?

Personal preference and pain tolerance. Just like learning Emacs[1] - there's lots of things that programmers can prioritize, ignore, enjoy, or barely tolerate. Some people are alright with the fact that they're prototyping their code 10x more slowly than in another language because they enjoy performance optimization and seeing their code run fast, and there's nothing wrong with that. I, myself, have wasted a lot of time trying to get the types in some of my programs just right - but I enjoy it, so it's worth it, even though my productivity has decreased.

Plus, Rust seems to have pushed out the language design performance-productivity-safety efficiency frontier in the area of performance-focused languages. If you're a performance-oriented programmer used to buggy programs that take a long time to build, then a language that gives you the performance you're used to with far fewer bugs and faster development time is really cool, even if it's still very un-productive next to productivity-oriented languages (e.g. Python). If something similar happened with productivity languages, I'd get excited, too - actually, I think that's what's happening with Mojo currently (same productivity, greater performance) and I'm very interested.

[1] https://news.ycombinator.com/item?id=37438842

pshc 2 years ago | | |

Debugging rare crashes and heisenbugs is more frustrating, and in non-safe languages, a chronic problem.

Whereas after you prove the safety of a design once, it stays with you.

otikik 2 years ago | | |

The “beats debugging” part I took it as meaning “it is better than spending that day debugging”.

I have fought the ownership rules and lost (replaced references by integers into a common vector; ugly stuff, but I was time constrained). But I have seen people spend several weeks debugging a single problem, and that was really soul-crushing.
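For anyone who hasn't seen the "integers into a common vector" workaround, a minimal sketch follows. `Scene` and `SceneNode` are hypothetical names, and the sketch assumes nodes are never removed.

```rust
/// Hypothetical scene graph: nodes refer to each other by index into one
/// `Vec`, so there are no references for the borrow checker to track.
#[derive(Debug)]
struct SceneNode {
    name: String,
    parent: Option<usize>,
    children: Vec<usize>,
}

#[derive(Default)]
struct Scene {
    nodes: Vec<SceneNode>,
}

impl Scene {
    /// Adds a node and returns its index, which acts as its handle.
    fn add(&mut self, name: &str, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(SceneNode {
            name: name.to_string(),
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    /// Walks the back links from `id` up to the root, collecting names.
    fn path_to_root(&self, id: usize) -> Vec<&str> {
        let mut path = Vec::new();
        let mut current = Some(id);
        while let Some(i) = current {
            path.push(self.nodes[i].name.as_str());
            current = self.nodes[i].parent;
        }
        path
    }
}
```

The ugliness is that an index is an unchecked pointer in disguise. Once nodes can be removed, a stale index silently points at the wrong node unless something like a generation counter is added.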

devjab 2 years ago | | |

I think you may be misunderstanding what GP means. It's about spending a day working on issues. You're either doing it before you launch your iteration, or you're doing it after. GP thinks it's better to spend the time before you push the change. From a quality perspective it's hard to see how anyone could disagree with that, but I can certainly see why there would be different preferences from programmers.

I don't personally mind debugging, too much, but if your goal is to avoid bugs in your running software, then Rust has some serious advantages. We mainly use TypeScript to do things, which isn't really comparable to Rust. But we do use C when we need performance, and we looked into Rust, even did a few PoCs on real world issues, and we sort of ended up in a situation similar to GP. Rust is great though a bit "verbose" to write, but its ecosystem is too young to be "boring" enough for us, so we're sticking with C for the time being. But being able to avoid running into crashes by doing the work before you push your code is immensely valuable in fault-intolerant systems. Like, we do financial work with C, it cannot fail. So we're actually still doing a lot of the work up-front, and then we handle it by rigorously testing everything. Because it's mainly used for small performance enhancement, our C programs are small enough to where this isn't an issue, but it would be a nightmare to do with 40.000 lines of C code.

kelnos 2 years ago | | |

I agree that fast iteration time is valuable, but I don't think this has to hold 100% of the time.

I would much rather bang my head against a compiler for N hours, and then finally have something that compiles -- and thus am fairly confident works properly -- than have something that compiles and runs immediately, but then later I find I have to spend N hours (or, more likely, >N hours) debugging.

Your preferences may differ on this, and that's fine. But in the medium to long term, I find myself much more productive in a language like Rust than, say, Python.

bigyikes 2 years ago | |

I’m working on an unrelated project that does some stuff similarly to you. I’m at 4k lines right now.

Just wondering, how long did it take you to hit 40k lines? I’m a new Rust developer and it’s taken me ages to get this far.

I totally relate to your experience though. When I finally get my code to compile, it “just works” without crashes. I’ve never felt so confident in my code before.

Animats 2 years ago | | |

> how long did it take you to hit 40k lines?

3 years.

jksmith 2 years ago | | |

>When I finally get my code to compile, it “just works” without crashes. I’ve never felt so confident in my code before.

This isn't a new idea for a desirable state. Same experience with Modula-2 three decades ago. A page or more of compiler errors to clear, then suddenly adiabatic. A very satisfying experience.

benreesman 2 years ago | |

I don’t know what you mean by web-scale, you’d be mistaken if you meant “the multi-threaded services that power giant internet properties”.

If you want extreme low contention extreme high-utilization, you’re doing threading and event-driven simultaneously, there are no easy answers on heavily contended data structures because you can’t duplicate to exploit immutability if mere insane complexity is an alternative, and mistakes cost millions in real time.

There’s a reason why those places scout so heavily for the lock-free/wait-free galaxy brains the minute they finish their PhDs.

Ygg2 2 years ago | |

> There was an article on HN a few days ago about this. "Rust has 5 games and 50 game engines".

That's not a serious article. That's a humourous video.

Source: https://youtu.be/TGfQu0bQTKc?t=169

vacuity 2 years ago | | |

It has some truth to it, still.

yazaddaruvala 2 years ago | |

Out of curiosity, why not go all in on Tokio? Make everything a future, including writing to the GPU.

And are you using an ECS based architecture? Do you feel you’d have a different opinion if you were?

ay 2 years ago | |

As a past active SecondLife user back in the day (circa 15 years ago), and a short-stunt OpenSimulator dev, I had been thinking a lot about how much better SecondLife could be if it had the modern tech absorbed - thanks for doing this! :-) I did a short return to try SL recently, and the lagginess of the viewer made me sad.

Is there a ML to subscribe to, to learn when the viewer is more generally available for testing? Thanks again!

rhaps0dy 2 years ago | |

What’s the server for a metaverse client? Is there a standardized protocol, or a particularly popular one you’re targeting?

Animats 2 years ago | | |

It's a client for Second Life or Open Simulator.

LukaD 2 years ago | |

Rust is not race condition free, it guarantees no data races though.

jonstewart 2 years ago | |

This is very interesting. How do you manage latency of events coming over the network?

Do... you... wind up having to set TCP_NODELAY?

•͡˘㇁•͡˘

Animats 2 years ago | | |

Embarrassingly, yes, because I can't turn off delayed ACKs from Rust.

gigel82 2 years ago | |

> If I had to do this in C++, I'd be fighting crashes all the time.

Why? I'd take modern C++ over Rust every day of the week.

m4tthumphrey 2 years ago | |

> As I've mentioned before, I'm writing a high performance metaverse client.

Why? (Serious question)

ipaddr 2 years ago | | |

Started 3 years ago during covid when metaverse looked attractive. In 3 years many of these AI applications will face the same questions.

Reticularas 2 years ago | |

What you've just described is basically every networked video game, the majority of which are happily running via c++.

(Plus some increase in content load over the network, which does exist ala runtime mod loading, streaming, etc)

pjc50 2 years ago | | |

Yes? Not architecturally different, but with fewer bugs. People are always complaining about bugs in videogames.

silisili 2 years ago | |

Looks great!

Without judgment I must ask, what made you decide to target metaverse specifically? Is it more of a fun challenge, or do you see it having a bright/popular future?

lordnacho 2 years ago |

I find myself in this weird corner when it comes to async rust.

The guy's got a point in that doing a bunch of Arc, RwLock, and general sharing of state is going to get messy. Especially once you are sprinkling 'static all over the place, it infects everything, much like colored functions. I did this whole thing once back when I was starting off where I would Arc<RwLock> stuff, and try to be smart about borrow lifetimes. Total mess.

But then rust also has channels. When you read about it, it talks about "messages", which to me means little objects. Like a few bytes little. This is the solution, pretty much everything I write now is just a few tasks that service some channels. They look at what's arrived and if there's something to output, they will put a message on the appropriate channel for another task to deal with. No sharing objects or anything. If there's a large object that more than one task needs, either you put it in a task that sends messages containing the relevant query result, or you let each task construct its own copy from the stream of messages.
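A rough sketch of this style, using OS threads and `std::sync::mpsc` in place of async tasks to stay within the standard library. The `Msg` enum and function names are illustrative assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Small messages passed between tasks.
enum Msg {
    Price(u32),
    /// A query carries its own reply channel, so no state is shared.
    GetTotal(mpsc::Sender<u32>),
}

/// Spawns a task that owns the running total and services one channel.
fn spawn_totaller() -> (mpsc::Sender<Msg>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Msg>();
    let handle = thread::spawn(move || {
        let mut total = 0u32; // state lives only inside this task
        for msg in rx {
            // The loop ends when every Sender has been dropped.
            match msg {
                Msg::Price(p) => total += p,
                Msg::GetTotal(reply) => {
                    let _ = reply.send(total); // the asker may have gone away
                }
            }
        }
    });
    (tx, handle)
}

/// Feeds prices to the task, then asks it for the total by message.
fn demo_total(prices: &[u32]) -> u32 {
    let (tx, handle) = spawn_totaller();
    for &p in prices {
        tx.send(Msg::Price(p)).unwrap();
    }
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Msg::GetTotal(reply_tx)).unwrap();
    let total = reply_rx.recv().unwrap();
    drop(tx); // closing the channel lets the task exit
    handle.join().unwrap();
    total
}
```

No `Arc`, no lock, and no lifetime annotation appears anywhere. The price is that every read of the state becomes a round trip through a channel.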

And yet I see a heck of a lot of articles about how to Arc or what to do about lifetimes. They seem to be things that the language needs, especially if you are implementing the async runtime, but I don't understand why the average library user needs to focus so much on this.

grug_htmx_dev 2 years ago |

Yes, async is effectively a much harder version of Rust, and it's regrettable how it's been shoved down the throats of everyone, while only 1% of projects using it really need it. However, async is also amazing in these 1% of cases when it's useful.

If you have a service that handles massive amounts of network calls at the core (think linkerd, nginx, etc.), or you want to have a massive amount of lightweight tasks in your game, or working on an embedded software where you want cooperative concurrency, async Rust is an amazing super-power.

Most system/application level things are not going to need async IO. Your REST app is going to be perfectly fine with a threadpool. Even when you do need async, you probably want to use it in a relatively small part of your software (network), while doing most of the things in threads, using channels to pass work around between async/blocking IO parts (aka hybrid model).

Rust community just mindlessly over-did using async literally everywhere, to the point where the blocking IO Rust (the actually better UX one) became a second class citizen in the ecosystem.

Especially visible with web frameworks, where there are N well-designed async web frameworks (Axum, Warp, etc.) and if you want a blocking one you get:

  tiny_http - absolute bare bones but very well done
  rouille - more wholesome, on top of tiny_http, but APIs feel very meh compared to e.g. Axum
  astra - very interesting but immature, and rather barebones

lionkor 2 years ago |

Async is also spread through so many crates that your program will have to be async in its entirety, or at least depend on the tokio crate for a lot of things. Want a web server? Async + tokio or gtfo. Want an sql connector? You better write your own, unless you want async. Each with a different solution to the various problems async brings -- and don't even get me started on async closures and such shit, that's where hell pokes through the earth and does unholy things to your compiler.

I enjoy Rust, and I love how the compiler helps me solve problems. However, the ecosystem is "async or gtfo", or "just write it yourself if you don't want async lmao", and that's not good enough.

Sytten 2 years ago | |

A lot of that pain could have been avoided if the language had better primitives for async in the std or in the futures crate. Like a trait that executor must implement and a "default" blocking executor to execute async code from sync.

Right now even building a library that supports multiple async runtimes is a PITA, I have done it a couple times. So you end up supporting just tokio and maybe async-std.

half-kh-hacker 2 years ago | | |

so it's clear to non-Rust devs, we do have basic primitives for "running async code from sync":

https://docs.rs/futures/latest/futures/executor/fn.block_on....

imagine you have an:

    async fn do_things() -> Something { /* ... */ }

you can:

    use futures::executor::block_on;
    fn my_normal_code() {
      let something = block_on(do_things());
    }

but this does get messy if the async code you're running isn't runtime-agnostic :(

JoshTriplett 2 years ago | | |

> A lot of that pain could have been avoided if the language had better primitives for async in the std or in the futures crate. Like a trait that executor must implement and a "default" blocking executor to execute async code from sync.

This is one of the goals of the async working group. Hopefully, when ready, that'll make it possible to swap out async runtimes underneath arbitrary code without issues.

bionhoward 2 years ago |

Admittedly, I’m no expert in async rust, but I’ve written several thousand lines of sync rust this month. One thing I’ve found is when rustc makes a particular approach hard to implement, it usually does so for a good reason (i.e. there is a better way to achieve a similar result).

If you’re learning the language, I would suggest starting out with some more vanilla sync code, loops and if statements, get used to the borrowing. Async is clearly still under heavy development, and not just from an implementation level, but also from the level of our philosophical paradigm about what async means and how it ought to work for the user. It’s entirely possible for humanity to have the wrong approach to this issue and maybe someone in this discussion will be able to answer it more effectively.

The compiler really depends on traits, and the ability for traits to handle async is not stable. Many highly intelligent people are hard at work thinking about how to make async rust more correct, readable, and accessible. For example, look here: https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-i...

I would argue, if the async functionality of traits is not stable in rust, then it is silly for us to attack rust for not having nice async code, because we’re effectively criticizing an early rough draft of what will eventually be a correct and performant and accessible book.

Aurornis 2 years ago |

> Used pervasively, Arc gives you the world’s worst garbage collector. Like a GC, the lifetime of objects and the resources they represent (memory, files, sockets) is unknowable. But you take this loss without the wins you’d get from an actual GC!

The lifetime of an Arc isn’t unknowable, it’s determined by where and how you hold it.
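A small sketch of what "where and how you hold it" means in practice: the strong count changes only at the points where a handle is cloned or dropped. The function name is hypothetical.

```rust
use std::sync::Arc;

/// Record the strong count after creating, cloning, and dropping a handle.
fn strong_counts() -> [usize; 3] {
    let a = Arc::new(String::from("socket"));
    let after_new = Arc::strong_count(&a);

    let b = Arc::clone(&a); // second owner
    let after_clone = Arc::strong_count(&a);

    drop(b); // the second owner goes away exactly here
    let after_drop = Arc::strong_count(&a);

    [after_new, after_clone, after_drop]
}

fn main() {
    println!("{:?}", strong_counts());
}
```

The value itself is freed when the last handle is dropped, so the release point is deterministic even if it is not always obvious from reading one function.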

I think maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it (such as garbage collection) rather than learning how to work with the language. It’s a common trap for anyone trying a new programming language, but Rust seems to trip people up more than most.

mmastrac 2 years ago |

I love Rust, but async is a hot mess and you cannot just write async code the same way that you write sync code. I'm getting more convinced that mixing the two is a bad idea, and that Go's approach of making everything sync with a single async channel primitive might be right.

I'm currently plumbing through some logic to call a sync method on a struct that implements Future and it's... an interesting challenge.

While we can make zero-cost async abstractions somewhat easy for users, the library developers are the ones who suffer the pain.

fzeindl 2 years ago |

Async Everything is a bad language.

Async/await was a terrible idea for fixing JavaScript's lack of proper blocked threading that is currently being bolted onto every language. It splits every language and every library-ecosystem in half and will cause pains for many years to come.

Everyone who worked with multi-threading outside of JavaScript knows that using actors/communicating sequential processes is the best way to do multi-threading.

I recently found an explanation for that in Joe Armstrong's thesis. He argues that the only way to understand multi-threaded programs is writing strictly sequential code for every thread and not muddling all the code for all the threads in one place:

"The structure of the program should exactly follow the structure of the problem. Each real world concurrent activity should be mapped onto exactly one concurrent process in our programming language. If there is a 1:1 mapping of the problem onto the program we say that the program is isomorphic to the problem.

It is extremely important that the mapping is exactly 1:1. The reason for this is that it minimizes the conceptual gap between the problem and the solution. If this mapping is not 1:1 the program will quickly degenerate, and become difficult to understand. This degeneration is often observed when non-CO languages ["non concurrency-oriented", looking at you JavaScript!] are used to solve concurrent problems. Often the only way to get the program to work is to force several independent activities to be controlled by the same language thread or process. This leads to a inevitable loss of clarity, and makes the programs subject to complex and irreproducible interference errors." [0]

[0] https://erlang.org/download/armstrong_thesis_2003.pdf

There is also a good rant against async/await by Ron Pressler who implemented project loom in java: https://www.youtube.com/watch?v=oNnITaBseYQ

TensorTinkerer 2 years ago |

While the article elucidates well on the intricacies and challenges of async Rust, I feel it's crucial to note that one of Rust's core philosophies is ensuring memory safety without sacrificing performance.

The async patterns in Rust, especially with regards to data safety assurances for the compiler, are emblematic of this philosophy. Though there are complexities, the value proposition is a safer concurrency model that requires developers to think deeply about their data and execution flow. I do concur that Rust might not be the go-to for every massively concurrent userspace application, but for systems where robustness and safety are paramount, the trade-offs are justifiable. It's also worth noting that as the ecosystem evolves, we'll likely see more abstractions and libraries that ease these pain points.

Still, diving into the intricacies as this article does, gives developers a better foundational understanding, which in itself is invaluable.

hedora 2 years ago |

I've been writing a lot of async lock free rust. The main problem is that tokio futures are 'static, which is because of a design mistake that's baked deep into the rust ecosystem: Leaking memory is 'safe'.

This implies that you can't statically guarantee that a future is cleaned up properly, which means that if you spawn some async work, something may std::mem::forget a future, and then the borrow checker won't know that the references that were transitively handed out by the future are still live.
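The "leaking is safe" point can be seen in a few lines. This is only a sketch with a made-up `Guard` type: `std::mem::forget` is a safe function, so no API can rely on a destructor running to end a borrow.

```rust
use std::cell::Cell;

/// Hypothetical guard that records when its destructor runs.
struct Guard<'a>(&'a Cell<bool>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Returns whether the guard's destructor ran.
fn destructor_ran(leak: bool) -> bool {
    let flag = Cell::new(false);
    let guard = Guard(&flag);
    if leak {
        // Entirely safe code: the destructor is simply never called.
        std::mem::forget(guard);
    } else {
        drop(guard);
    }
    flag.get()
}

fn main() {
    println!("dropped normally: {}", destructor_ran(false));
    println!("after forget:     {}", destructor_ran(true));
}
```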

Rather than sprinkle Arc everywhere, I just use an unsafe crate like this:

https://docs.rs/async-scoped/latest/async_scoped/

This catches 99% of the bugs I would have written in C++, so it's a reasonable compromise. There's been some work to try to implement non-'static futures in a safe way. I'm hoping it succeeds.

The other big problem with Rust (but this is on the roadmap to be fixed this year) is that async traits currently require Box'ed futures, which adds a malloc/free to function call boundaries(!!!)

As for the "just use a channel" advice: I've dealt with large codebases that are structured this way. It explodes your control flow all over the place. I think of channels as the modern equivalent of GOTO. (I do use them, but not often, and certainly not in cases where I just need to run a few things in parallel and then wait for completion.)

divyekapoor 2 years ago |

We do not want red and blue functions. Any language that implements async / await as coroutines instead of green threads is making a fundamental CS mistake. https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

Concurrency's correct primitive is Hoare's Communicating Sequential Processes mapped onto green threads. Some languages that have it right are Java (virtual threads, final since JDK 21), Go, Kotlin.

notacoward 2 years ago |

I've used all of the models mentioned. Personally I've found async/await to be the most annoying. However, I think there's an important option that's overlooked here. For many of the domains where these approaches tend to fall down hardest, notably network servers with very many connections (I was cited in the C10K paper on this subject BTW), arenas can absorb a lot of the annoying lifetime issues. They can be per-request, per-connection, per-something else, but however you do it is likely to leave you with a much smaller set of objects and lifetimes that can be managed by other means. Arenas for most things plus a few very well-isolated bits of code to handle the remainder worked for me from at least 1992 through 2017 in code bases up to more than a million lines (yes, highly concurrent the whole way) without having to deal with async/await or promises/futures. Maybe a bit of CSP/actor model here and there, but that's it.

miguelmurca 2 years ago |

This is bound to get some criticism (or some tangent-at-best discussion), but it seems like a pretty fair discussion to me.

What I'm missing at the end of the article is the author's point: I believe they're advocating for the use of raw threads and manual management of concurrency, and doing away with the async paraphernalia. But, at the same time, earlier in the article they give the example of networking-related tasks as something that isn't so easy to deal with using only raw threads.

So, taking into account that await&co. are basically syntactic sugar + an API standard (iirc, I haven't used Rust so much lately), I wonder about what the alternative is. In particular, it seems to me like the alternative you could have would be everyone rolling their own "concurrency API", where each crate (inconsistently) exposes some sort of `await()` function, and you have to manually roll your async runtime every time. This would obviously also not be ideal.

nemetroid 2 years ago | |

I thought the author's point was relatively clear: Rust might not be a good fit for the kind of tasks that need more concurrency than raw threads can give you. Such programs should be written in some other language instead.

> Maybe Rust isn’t a good tool for massively concurrent, userspace software. We can save it for the 99% of our projects that don’t have to be.

hot_gril 2 years ago | | |

So 99% of projects need raw threads only, according to the author. I doubt that.

lowbloodsugar 2 years ago | |

I thought his point was async is not good for apps with lots of work to do, and that green threads are a much better idea. IDK.

marcosdumay 2 years ago | |

The author's point is that Rust is not a good language for software like that example. But very, very little software is like that, and you can always divide it up into large blocks, inside of which Rust fits quite well.

Personally, I'm a bit more radical than the author. You won't be able to write software like the example correctly. It should just not be done, ever. Machines can still optimize some sanely organized software into the same thing, maybe, if it happens to be a tractable problem (I'm not sure anybody knows). But people shouldn't touch that thing.

Taek 2 years ago |

Rust made a critical safety mistake when it chose its async paradigm. It gave the code the option to decide when to yield.

What that means is that when I'm writing async code, I have to audit every library I import to make sure that library is guaranteed to yield after a few microseconds of execution, otherwise my own core loops starve. Importing unknown code when using async rust is not safe for any application that needs to know its own threads won't starve.

A safe async language must guarantee that threads will make progress. Rust should change the scheduler so that it can pre-empt any code after that code has hogged a thread for too long.
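To make the starvation concern concrete, here is a toy single-threaded round-robin executor. Everything in it (`YieldNow`, `worker`, `run`) is a made-up sketch and not how any real runtime is implemented. A task only hands control back when it returns `Pending`, so a task that never yields runs all of its steps before anyone else gets a turn.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` once, which hands control back to the executor.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;

/// Logs `steps` entries; yields between them only if `polite` is true.
async fn worker(name: &'static str, steps: u32, polite: bool, log: Log) {
    for i in 0..steps {
        log.borrow_mut().push(format!("{name}{i}"));
        if polite {
            YieldNow(false).await;
        }
    }
}

/// Runs tasks "a" and "b" round-robin; if `hog` is true, "a" never yields.
fn run(hog: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut queue: VecDeque<Pin<Box<dyn Future<Output = ()>>>> = VecDeque::new();
    queue.push_back(Box::pin(worker("a", 3, !hog, log.clone())));
    queue.push_back(Box::pin(worker("b", 3, true, log.clone())));

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // The executor can only switch tasks when `poll` returns.
    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }
    let entries = log.borrow().clone();
    entries
}

fn main() {
    println!("cooperative: {:?}", run(false));
    println!("with a hog:  {:?}", run(true));
}
```

With both tasks yielding, the log interleaves; with the hog, all of "a" finishes before "b" starts. A real hog would be a long CPU loop or a blocking call rather than three log entries.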

SkiFire13 2 years ago | |

> Rust should change the scheduler

Rust doesn't have a scheduler, and having one would be a no-go for any sufficiently low level code (e.g. in microcontrollers).

zamalek 2 years ago | |

> A safe async language must guarantee that threads will make progress.

You might be looking for parallelism, not concurrency.

brickteacup 2 years ago | | |

No? You don't need parallelism to guarantee global progress as long as the scheduler has the ability to preempt tasks. Of course coroutines (as opposed to e.g. userspace threads) can't really be preempted, which is the issue here.

kasdi 2 years ago | |

Naive question: Why not use OS concurrency then?

ilyt 2 years ago |

Promise/Future style of async is just a bad idea regardless of language.

It was adopted because of the ineptitude of the languages where it became popular, and it's far easier to implement in GC-less languages than message-passing-based asynchrony, but it's just misery to write code in. I'd rather suffer Go's ineptitudes and use its bastardised message passing, called channels, than any of the Python/JS/Rust async.

marcosdumay 2 years ago | |

Yes. That atomized model of concurrency where your state goes everywhere and you somehow collect it back at some point was always (literally) the textbook example of how not to do it.

It was created to be an improvement over the Javascript situation, and somehow every language that had a sane structure adopted it as if it was not only good, but the way to do things. This is insane.

hot_gril 2 years ago | | |

And yet, people are going to use async in Rust. The feature has already proven itself useful long ago in other languages, beyond the timespan a fad could survive. Everyone started out doing it the other way and got sick of it.

seabrookmx 2 years ago | | |

> It was created to be an improvement over the Javascript situation

I see this repeated everywhere in this thread. async/await originated in C# not JS.

riku_iki 2 years ago | | |

> concurrency where your state goes everywhere and you somehow collect it back at some point was always (literally) the textbook example of how not to do it.

Can you explain why, in your opinion, it is how not to do it? What are the obvious issues with this approach?

smallerfish 2 years ago | |

> Promise/Future style of async is just a bad idea

JVM's futures are a joy to work with compared to JS's promises (or Kotlin's coroutines for that matter). While similar, I don't think you can conflate them.

SkiFire13 2 years ago | |

You're conflating the idea of using channels with green threads. They are different, you can easily use channels with async/await and global state/mutexes with green threads.

snailscale 2 years ago |

I somewhat agree with the author, sometimes with async rust I need to figure out how to tell the compiler that yes I want to recursively call this async function. This can be a huge pain, especially because it’s not always clear what went wrong.
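For readers who have not hit this: a recursive async function has a future type that would contain itself, and one common workaround is to box the future so its size is known. A minimal sketch follows; `countdown` and the busy-polling `run_to_completion` helper are hypothetical names, and the helper stands in for a real executor.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polling stand-in for an executor; fine here because nothing waits on I/O.
fn run_to_completion<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Recursion through a boxed future: each level is a separate heap allocation,
/// so the future's type no longer has to contain itself.
fn countdown(n: u32) -> Pin<Box<dyn Future<Output = u32>>> {
    Box::pin(async move {
        if n == 0 {
            0
        } else {
            1 + countdown(n - 1).await
        }
    })
}

fn main() {
    println!("depth = {}", run_to_completion(countdown(5)));
}
```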

Other times, however, Rust stops me from writing buggy code where I didn’t quite understand what I was doing. In some sense it can help you understand your software better (when the problem isn’t an implementation detail).

I get the author's frustration; I often have the same feelings. Sometimes you just want to tell Rust to get out of your way.

As an aside, I think there is room for a language similar to golang, with sum types and modules, that would be a joy.

zupa-hu 2 years ago | |

What do you mean by modules?

dekhn 2 years ago |

I wish people would stop saying concurrency and parallelism are different.

Concurrency is a subtype of parallelism. All concurrency is parallelism, but leaving some aspects of parallelism off the table.

I've worked in both worlds: I've built codes that manage thousands of connections through the ancient select() call on single processes (classic concurrency- IO multiplexing where most channels are not active simultaneously, and the amount of CPU work per channel is small) to synchronous parallelism on enormous supercomputers using MPI to eke out that last bit from Amdahl's law.

Over time I've come to the conclusion that the best approach is a thread pool (possibly managed by the language runtime) that uses channels for communication and has optimizations for work stealing (to keep queues balanced) and for eliminating context switches. Although it does not reach the optimal throughput of the machine (because shared memory is faster than message passing), it's a straightforward paradigm to work with, and the developers of the concurrency/parallel frameworks are wise.
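A bare-bones sketch of that shape in Rust, using only `std`: a fixed set of worker threads pulling jobs from one channel and pushing results to another. There is no work stealing here, and `square_all` is a made-up example workload.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Square every input on a small pool of worker threads.
fn square_all(inputs: Vec<u64>, workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<(usize, u64)>();
    let (res_tx, res_rx) = mpsc::channel::<(usize, u64)>();
    // std's Receiver has a single consumer, so workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let mut handles = Vec::new();
    for _ in 0..workers.max(1) {
        let job_rx = Arc::clone(&job_rx);
        let res_tx = res_tx.clone();
        handles.push(thread::spawn(move || loop {
            // The lock is released at the end of this statement.
            let job = job_rx.lock().unwrap().recv();
            match job {
                Ok((i, x)) => res_tx.send((i, x * x)).unwrap(),
                Err(_) => break, // job channel closed: no more work
            }
        }));
    }
    drop(res_tx); // only the workers hold result senders now

    let n = inputs.len();
    for (i, x) in inputs.into_iter().enumerate() {
        job_tx.send((i, x)).unwrap();
    }
    drop(job_tx); // closing the channel tells the workers to stop

    let mut out = vec![0; n];
    for (i, y) in res_rx {
        out[i] = y;
    }
    for h in handles {
        h.join().unwrap();
    }
    out
}

fn main() {
    println!("{:?}", square_all(vec![1, 2, 3, 4], 3));
}
```

Each worker body is plain sequential code, which is the appeal of the paradigm; the cost is copying or moving data through the channels.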

pdimitar 2 years ago | |

If that's what you need: it's Erlang / Elixir then.

vore 2 years ago |

Async Rust also ends up with these super nasty types involving Future that can't even be named half the time and you have to refer to them by existential types, like `impl Future<Output = Foo>`.

But these existential types can only be specified in function return or parameter position, so if you want to name a type for e.g.:

  let x = async { };

You can't! Because you can only refer to it as `impl Future<Output = ()>` but that's not allowed in a variable binding!

saghm 2 years ago | |

You're not wrong, but I don't really see the problem? Even well before async Rust, closures worked the same way with not being able to specify a concrete type, and `impl Trait` syntax didn't even exist for a while. Annotating local variable types is a way to fix certain things that would otherwise be ambiguous; it's a means to an end, not an end in itself.

vore 2 years ago | | |

Ah that's true, but I think it ends up hairier when you combine the two together and have closures that are async, e.g.:

  let x = || -> i32 { 1 };  // fine
  let x = || -> impl Future<Output = i32> { async { 1 } };  // error: `impl Trait` only allowed in function and inherent method return types, not in closure return types

Unless I'm missing something, sometimes you do have to name the return type of an async closure if it's returning e.g Result<T, Box<dyn Error>>, and use of the ? operator means that the return type can't be inferred without an explicit annotation.
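Two workarounds that are commonly used for this, sketched under assumptions: box the future so the binding has a nameable type (at the cost of an allocation), and pin down the error type with a turbofish on the final `Ok` so that `?` has something to convert into. The `BoxFuture` alias, `parse_doubled`, and the busy-polling `run_to_completion` helper are illustrative names only.

```rust
use std::error::Error;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polling stand-in for an executor (illustrative only).
fn run_to_completion<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// A nameable (but heap-allocated) future type.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

fn parse_doubled(s: &'static str) -> Result<i32, String> {
    // A closure returning an async block. The turbofish on `Ok` fixes the
    // error type, so `?` knows what to convert the parse error into.
    let parse = |s: &'static str| async move {
        let n: i32 = s.parse()?;
        Ok::<i32, Box<dyn Error>>(n * 2)
    };

    // Boxing gives the variable a type that can be written down.
    let fut: BoxFuture<Result<i32, Box<dyn Error>>> = Box::pin(parse(s));
    run_to_completion(fut).map_err(|e| e.to_string())
}

fn main() {
    println!("{:?}", parse_doubled("21"));
    println!("{:?}", parse_doubled("not a number"));
}
```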

cmrdporcupine 2 years ago |

Now, I think async is bad syntactic sugar that hides what's really going on under the surface. And I rail against it all the time. Especially the way dropping in async contaminates code bases by building tendrils across call-sites all through the application. ... But the tools that have been built around it are very useful and there's some good stuff there.

I have some quibbles with this article:

"Rust comes at this problem with an “async/await” model"

No, it does not. It allows for that, and there's a big ... community ... around the async stuff, but in reality the language is entirely fine with operating using explicit concurrency constructs. And in fact for most applications I think standard synchronous functions combined with communicating channels is cleaner. I work in code bases that do both, and I find the explicit approach easier to reason about.

In the end, Async is something people ideally reach for only after they hit the wall with blocking on I/O. But in reality they're often reaching for it just because -- either because it's cool... or because some framework they are relying on mandates it.

But I think the pendulum will swing back the other way at some point. I don't think it's fair to tar the whole language with it.

uhura 2 years ago |

I don't get the "Maybe Rust isn’t a good tool for massively concurrent, userspace software" conclusion.

Rust is all about lifetimes and the borrow checker. Async code (a la C#) will introduce overhead when reasoning about lifetimes, and it might not be as "fun" as it is in other languages that make use of GCs and bigger runtimes.

The CSP vs Async/Await discussion is valid, but as in the majority of cases, the drawbacks and benefits are not language-specific.

In CSP, the concurrent snippets behave just like linear/sequential code, as channels abstract away a lot of the ugly bits. Sequential code tends to be easier to reason about, and this might be very important for Rust considering its design.

Whether something is a good tool for massively concurrent software will, as expected, depend on the aspects you're evaluating:

* Performance: the text does not show benchmarks that show Rust to be a slow language.

* Code/Feature throughput: the overall conclusion from the text is that Async Rust is a complex tool and exposes programmers to many ways of shooting themselves in the foot.

Assuming the "Maybe Rust..." is only talking about Async Rust, the existence of big Async Rust projects is a good counter argument. We also have the whole rest of the Rust language to code massively concurrent, userspace software.

Massively concurrent, userspace software tends to be complex and big, to the point that design decisions generally have far more impact than the choice of language.

Rust is a modern language with interesting features to prevent programmers from writing unsafe programs, and this is a good head start for many when making those kinds of programs, more so than whether you want to use Async code or not.

addisonj 2 years ago |

This is a pretty interesting article... and I generally agree with the pain points... but I don't really like the conclusion

* While the author states that not many apps "need" high concurrency in userspace... I would invert that and say that we may be missing so much performance, new potential applications, etc because highly concurrent code is so hard to get right. One bit of evidence of this (to me at least) is how often in my career I have had to scale things up due to memory or other resource limitations and not CPU. And when it is CPU, so often looking into it more finds bugs with concurrency that are the root cause or at least exacerbate the issue

* While I completely agree that rust is not easy with async and have myself poked around at which magical type things I need to do each time I have touched async rust code, I don't really like the suggestion being to "go use a different language", first, because if you are picking up rust, you (IMHO) should have a very good reason to already have chosen it. Rust is not easy enough or ubiquitous enough that you should be choosing it "just for fun" and your reason for using Rust should be compelling enough that you (right now) are willing to put in the effort to learn async when you need it

* What the author mentions in the body of the article, and what I think my own suggestion would be: don't use async unless you need it! While I would love to see Rust (and think it should) evolve to the point where async is "easy", maybe we instead just need to get more pragmatic in what is taught and written about. I think when people start Rust they want to use all the fanciness, which includes async, and while some of that is just programmers, I think it is also how tutorials, docs, and general communication about a programming language happens where we show the breadth of capability, rather than the more realistic learning path, which leads people to feel like if they don't use async, they aren't doing it right

Finally, I do really hope Rust keeps working on the promise of these zero cost abstractions that can really simplify things... but if that doesn't work, I am at least hopeful of what people can build on top of the rust featureset/toolchain to help make things like async more realistic to be the default without the need for a complex VM/runtime.

jiggawatts 2 years ago |

I wonder if Rust should have gone down the same path as Java’s Project Loom and implemented async I/O using the same memory model that is used with operating system threads.

I suspect that to take advantage of 1024-thread systems the only sane programming model will be structured concurrency with virtual threads instead of coroutines.

It’s the same progression as we saw in the industry going from unstructured imperative assembly programming to structured programming with modular features.

Both traditional mutexes and to a degree async programming are unstructured and global. They infect the whole codebase and can’t be reasoned about in isolation. This just doesn’t scale.

za3faran 2 years ago | |

I believe rust started with green threads early on, before ditching them.

To your point, the C# guys seem to be interested in experimenting with green threads: https://twitter.com/davidfowl/status/1532880744732758018

neonsunset 2 years ago | | |

It was looked at and deemed an inferior design. Especially so given that existing async/await paradigm in .NET works really well with existing language features that would make adoption of green-threads-like approach problematic.

whoknowsidont 2 years ago |

Use Erlang/Elixir for orchestration and call into rust implementations.

It's an amazing combination.

technojamin 2 years ago | |

Elixir/Rust is the new Python/C++, and Rustler makes communication between the two languages super easy: https://github.com/rusterlium/rustler

balencpp 2 years ago | | |

Yup. I think it's because Elixir has powerful LISP-like meta-programming facilities that it allows the seamless communication Rustler and Zigler provide. I haven't seen anything as good as Rustler for Python despite its popularity.

travisgriggs 2 years ago | | |

And if Zig feels like it fits the gestalt of Elixir/Erlang better for you... then there's Zigler (https://hexdocs.pm/zigler/Zig.html). The fact that you can just "insert a little zig code right here" in the middle of your Elixir code in a ~Z sigil is the coolest darn thing there is. I haven't seen something that cool in the way of embedding performance enhancing fragments/snippets in another dynamic/expressive system since Klaus Gittinger's Smalltalk/X (https://live.exept.de/doc/online/english/programming/primiti...)

phendrenad2 2 years ago |

Java performance is really good, and I find Java development much faster than Rust. Also the JVM has all these awesome performance monitoring and profiling tools. Rust is unmatched for very high performance concurrency, but for most things, a garbage collected language is going to do just great.

pornel 2 years ago |

Most of the rant, apart from the old man yells at function colors, is about lifetimes of arguments of async functions. And it's presenting a special case as some kind of pervasive limitation.

Async functions don't have to always own their arguments. Just the outermost future that is getting spawned on another thread has to. The rest of the async program can borrow arguments as usual. You don't need to spawn() every task — there are other primitives for running multiple futures, with borrowed data, on the same thread.
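A sketch of that idea with only `std`: two async functions borrow the same slice and are driven together on one thread, with no `spawn` and no `Arc`. The hand-rolled `join2` and the busy-polling `run_to_completion` are hypothetical stand-ins for what a runtime or the `futures` crate would normally provide.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polling stand-in for an executor (illustrative only).
fn run_to_completion<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Hypothetical two-way join: polls both futures on the current thread.
async fn join2<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut ra: Option<A::Output> = None;
    let mut rb: Option<B::Output> = None;
    poll_fn(|cx| {
        // Never poll a future again once it has produced its value.
        if ra.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(cx) {
                ra = Some(v);
            }
        }
        if rb.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(cx) {
                rb = Some(v);
            }
        }
        if ra.is_some() && rb.is_some() {
            Poll::Ready((ra.take().unwrap(), rb.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}

// Both functions borrow their argument; nothing here needs to be 'static.
async fn total(data: &[u32]) -> u32 {
    data.iter().sum()
}

async fn largest(data: &[u32]) -> u32 {
    data.iter().copied().max().unwrap_or(0)
}

fn stats(data: &[u32]) -> (u32, u32) {
    run_to_completion(join2(total(data), largest(data)))
}

fn main() {
    let data = vec![3, 9, 4];
    println!("{:?}", stats(&data));
}
```

The two futures here complete on their first poll, so this shows the borrowing rather than real interleaving; the ownership requirement only appears once a future is handed to something like a multi-threaded `spawn`.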

In fact, this ability for a future to borrow from itself is the reason why Rust has native await instead of using callbacks. Futures can be "self-referential" in Rust, and nothing else is allowed to.

stallmanwasrigh 2 years ago |

> At this scale, threads won’t cut it—while they’re pretty cheap, fire up a thread per connection and your computer will grind to a halt.

Maybe in the 2000's but I feel this reasoning is no longer valid in 2023 and should be put to rest.

10k problem.. Wouldn't modern computing not work if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8 core machine.
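This is easy to try for yourself. A rough sketch: spawn a configurable number of OS threads with a small explicit stack size and join them all. The counts and the stack size below are arbitrary assumptions, and the limits you hit will depend on the machine and its ulimits.

```rust
use std::io;
use std::thread;

/// Spawn `count` OS threads with `stack_bytes` of stack each and join them.
/// Each thread returns its own index; the function returns the sum.
fn spawn_many(count: usize, stack_bytes: usize) -> io::Result<usize> {
    let mut handles = Vec::with_capacity(count);
    for i in 0..count {
        let handle = thread::Builder::new()
            .stack_size(stack_bytes)
            .spawn(move || i)?; // fails here if the OS refuses another thread
        handles.push(handle);
    }
    let sum = handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum::<usize>();
    Ok(sum)
}

fn main() {
    // Arbitrary numbers for illustration: 1,000 threads, 64 KiB stacks.
    match spawn_many(1_000, 64 * 1024) {
        Ok(sum) => println!("joined all threads, sum of indices = {sum}"),
        Err(e) => println!("could not spawn: {e}"),
    }
}
```

Note that these threads exit almost immediately, so this measures how many threads can be created, not how well thousands of mostly-blocked threads behave under load.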

efficax 2 years ago | |

The context switch for threads remains very expensive. You have 4,000 threads, but that's lots of different processes spinning up their own threads. It's still more efficient to have one thread per core for a single computational problem, or at most one per CPU thread (often 2 threads per core now). You can test this by using something like rayon or GNU parallel using more threads than you have cores. It won't go faster, and after a certain point, it goes slower.
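For CPU-bound work, the usual way to follow that advice with only `std` is to size the number of workers from `available_parallelism`. A minimal sketch (the `parallel_sum` function is a made-up example, and real code would more likely use a library such as rayon):

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Sum a slice using roughly one scoped thread per available hardware thread.
fn parallel_sum(data: &[u64]) -> u64 {
    // Falls back to a single worker if the parallelism cannot be queried.
    let workers = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);
    // Ceiling division; at least 1 so `chunks` never sees a zero size.
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    println!("{}", parallel_sum(&data));
}
```

Raising the worker count well past the core count and timing the result is a quick way to check the claim above on your own hardware.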

The async case is suited to situations where you're blocking for things like network requests. In that case the thread will be doing nothing, so we want to hand off the work to another task of some kind that is active. Green threads mean you can do that without a context switch.

lossolo 2 years ago | | |

> The context switch for threads remains very expensive

It got even more expensive in recent years after all the speculative execution vulnerabilities in CPUs, so now you have additional logic on every context switch with mitigations on in kernel.

marcosdumay 2 years ago | | |

Since that time, context switching changed from a O(log(n)) operation to an O(1) one.

I have no doubt that having a thread per core and managing the data with only non-blocking operations is much faster. But I'm pretty sure current machines can manage a thousand or so threads locked almost the entire time just fine.

mwcampbell 2 years ago | |

> Maybe in the 2000's but I feel this reasoning is no longer valid in 2023 and should be put to rest.

So do we discard existing ways of making software more efficient because we can be more wasteful on more recent hardware? What if we could develop our software such that 2000s computers are still useful, rather than letting those computers become e-waste?

ripley12 2 years ago | |

Yes, I think you're generally right. I'm a big fan of this blog post: https://eli.thegreenplace.net/2018/measuring-context-switchi...

> The numbers reported here paint an interesting picture on the state of Linux multi-threaded performance in 2018. I would say that the limits still exist - running a million threads is probably not going to make sense; however, the limits have definitely shifted since the past, and a lot of folklore from the early 2000s doesn't apply today. On a beefy multi-core machine with lots of RAM we can easily run 10,000 threads in a single process today, in production. As I've mentioned above, it's highly recommended to watch Google's talk on fibers; through careful tuning of the kernel (and setting smaller default stacks) Google is able to run an order of magnitude more threads in parallel.

riku_iki 2 years ago | | |

So, in that benchmark, a context switch is comparable to copying 64k of memory, which is kinda significant. I run a heavily loaded database with a few hundred threads, and I see it doing 100k context switches per second sometimes.

Jtsummers 2 years ago | |

> 10k problem.. Wouldn't modern computing not work if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8 core machine.

By the 2010s the problem had been updated to C10M. The people discussing it (well, perhaps some) aren't idiots and understand that the threshold changes as hardware changes.

Also, the issue isn't creating 10k threads it's dealing with 10k concurrent users (or, again, a much higher number today).

HippoBaro 2 years ago |

Async Rust is especially problematic in the enterprise world where large software is built out of micro-services connected through RPC.

Typically, if you want to build something with Rust, it'll have to use async, at least because gRPC and the like are implemented that way. So the vanilla (and excellent, IMO) Rust language doesn't exist there. Everything is async from the get-go.

jnxx 2 years ago | |

> Async Rust is especially problematic in the enterprise world where large software is built out of micro-services connected through RPC.

A weird way to use Rust since you can do a lot of messaging within the process, and use the computing power much more efficiently.

RPC is essentially messaging and message-passing. Message-passing is a way to avoid mutable shared state - this is the model with which Go became successful.

RPC surely has its uses, but message passing is only one solution, and very often an inferior one, to a problem set for which Rust has excellent solutions of its own.

hot_gril 2 years ago | | |

RPCs are for separate services possibly operating on separate machines, where in-process message passing wouldn't work.

nu11ptr 2 years ago |

One of these days I really want to sit down and read a bunch of the takes on different async approaches by the Rust designers, and ponder the design choices in depth. Until then, I will defer judgement. I will say I prefer green threads as a user, but the common argument that they are not a zero cost abstraction, and thus not appropriate for Rust, makes sense to me atm. I do wish async would get rid of its rough edges though (lack of async drop, async traits, scoped tasks, etc.) and at least become a first class citizen.

poorman 2 years ago |

Does anyone have links to resources that someone new to Rust should read on how to conceptualize async code in Rust? Based on reading the comments, it would seem there are ways to start with writing synchronous code, and if necessary make it async but do it in such a way that is runtime agnostic... don't just reach for Tokio.

If I'm implementing a library, how should I write it so that the consumer of the library doesn't have to pull in Tokio if they don't want to?

mightyham 2 years ago |

I really can't even begin to empathize with the author. I find async/await to be significantly more ergonomic than the alternative in almost every use case.

The arguments about Arc fall flat because how else would you safely manage shared references, even in other lower level languages. And so called "modern GCs" still do come with a significant hit in performance; it's not just some "bizarre psyop".

Really the only problem I've run into with Rust's async/await is the fact that there is not much support for composing async tasks in a structured way (i.e. structured programming) and the author doesn't even touch on this issue.

Ultimately the goals and criticism of the author are just downright confusing because at the end he admits that he doesn't actually care for the fact that Rust is design constrained by being a low level language and instead advocates for using Haskell or Go for any application that requires significant concurrency. So to reformulate his argument: we should never use or design into low level languages an ergonomically integrated concurrency runtime because it may have a handful of engineering challenges. When put concisely, their thesis is really quite ridiculous.

drogus 2 years ago | |

I find this post extremely weird. The author starts with saying sharing state is bad and then continues to complain sharing state in async Rust is hard, which is like, use channels if you don't want to pass mutexes everywhere?
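
A minimal sketch of the channel alternative mentioned above, assuming plain OS threads and `std::sync::mpsc` rather than an async runtime (the function name `count_with_channel` is made up for illustration). The counter is owned by the receiving loop only, so nothing has to be wrapped in a `Mutex`:

```rust
use std::sync::mpsc;
use std::thread;

/// Sums `per_worker` increments sent by each of `workers` threads.
/// Only the receiving side owns the counter, so no Mutex is involved.
fn count_with_channel(workers: usize, per_worker: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let tx = tx.clone(); // each worker gets its own sender
            thread::spawn(move || {
                for _ in 0..per_worker {
                    tx.send(1).expect("receiver dropped");
                }
            })
        })
        .collect();

    // Drop the original sender so the receiver sees end-of-stream
    // once every worker has finished and dropped its clone.
    drop(tx);

    // The state lives in exactly one place: this loop.
    let mut total = 0;
    for n in rx {
        total += n;
    }

    for h in handles {
        h.join().expect("worker panicked");
    }
    total
}
```

Async channel types generally follow the same shape (cloneable senders, one owner of the state), though their exact APIs differ from this std version.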

zozbot234 2 years ago |

Async Rust is a bit half-baked still, but the Rust folks are quite aware of that, and working to improve things. See https://rust-lang.github.io/wg-async/welcome.html and https://github.com/rust-lang/wg-async

ribit 2 years ago |

To do user space concurrent software efficiently, one really needs at least some degree of OS cooperation. Only the system knows what kind of work is running on the machine and what the relative priorities are. One problem with the common executors is that they attempt to grab all the resources the system can offer - if you have multiple applications like that, you end up with extra thread contention. I also agree with the article in that the common coroutine as state machine model has its pitfalls.

With all this in mind, I really like the Swift concurrency runtime. It does automatic thread migration and compaction to reduce the overhead of context switches, balances the thread allocation system-wide taking relative priorities into account, and it appears to be based on continuations instead of state machines. A very interesting design worth studying IMO.

andrewstuart 2 years ago |

Rust still isn't the language I'm looking for.

It's too complex.

Something simpler is needed with the benefits of memory safety.

memsafmeme 2 years ago | |

So anything from Python, Perl, PHP, Visual Basic, Java, C#, Go, Ruby, Erlang, OCaml, Scheme, Common Lisp?

lowbloodsugar 2 years ago |

I don’t know enough rust to comment on that part of the article but I’ve run GC-based systems at scale and load and the remarks about that do not match my experience at all.

I've coded performant applications on an OS that used channels and it sucked. It just got in the way and was confusing to engineers used to lower level constructs. "Just get out of my way!"

I think rust async is hard.

And that's what it comes down to. 99.9% (maybe more nines) of people do not need that level of control. They need conceptually simple things, like channels, and GC, and that will work for nearly everyone. The ones who need to drop to rust either have the engineers to do that, or their problem is intractable (for them). I pity those who drop to rust prematurely because it's cool.

vulcan01 2 years ago | |

> an OS that used channels

I'm very curious; what OS is this?

jnxx 2 years ago |

> We want to use the whole computer. Code runs on CPUs, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several cores.

Isn't that already, in this strong generality, an almost always wrong assumption?

Sure, one can do massively parallel or embarrassingly parallel computation.

Sure, graphic cards are parallel computers.

Sure, OS kernels use multiple cores.

Sure, languages and concepts like Clojure exist and work - for a specific domain, like web services (and for that, Clojure works fascinatingly well).

But there are many, even conceptually simple algorithms which are not easy to parallelize. There is no efficient parallel Fast Fourier Transform I know of.

hot_gril 2 years ago | |

And there are even different degrees of parallelization. Some things will scale almost linearly to CPU cores, some will share a little state and see diminishing returns, some will share a lot of state and maybe only make good use of 2 cores, and it'll all depend on the hardware too.

anonymoushn 2 years ago | |

Well, if he wants to get close to using 12% of the machine, he'll need the SIMD intrinsics that are hidden behind `unsafe` :(

aidenn0 2 years ago |

In the past I did concurrency with non-preemptive multitasking and all I/O was handled by an event-loop. I find this strictly superior to async. It seems to have about 0.1% the popularity of async though.

nitwit005 2 years ago |

> Some problems demand a lot of concurrency. The canonical example, described by Dan Kegel as the C10K problem back in 1999, is a web server connected to tens of thousands of concurrent users. At this scale, threads won’t cut it: while they’re pretty cheap, fire up a thread per connection and your computer will grind to a halt.

Try it. It'll probably work fine. It may be very expensive, memory wise, but it's easy to get a machine with a lot of memory.

pdimitar 2 years ago | |

It's not just that. As you increase OS thread active count, each thread starts to respond slower and slower.

It's been tried, periodically. Still sucks.

nitwit005 2 years ago | | |

Write a little program that starts up 10k threads that just wait. The other tasks on the machine won't be any slower once they're set up.

Of course, if they're doing real work they'll be using CPU time, but that's true of any scheme you might pick.
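
The experiment described above could be sketched roughly like this in Rust. The function name `spawn_waiting_threads` and the stack size passed in are assumptions for illustration; whether 10,000 threads can actually be created depends on the OS limits of the machine it runs on.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Spawns `n` threads that block on a condition variable, then releases them.
/// Returns how many threads were spawned and joined.
fn spawn_waiting_threads(n: usize, stack_bytes: usize) -> usize {
    // Shared "gate": a flag plus the condvar the threads sleep on.
    let gate = Arc::new((Mutex::new(false), Condvar::new()));
    let mut handles = Vec::with_capacity(n);

    for _ in 0..n {
        let gate = Arc::clone(&gate);
        let handle = thread::Builder::new()
            // A small stack keeps the per-thread memory reservation low.
            .stack_size(stack_bytes)
            .spawn(move || {
                let (lock, cvar) = &*gate;
                let mut open = lock.lock().unwrap();
                // Sleep until the gate opens; waiting threads use no CPU.
                while !*open {
                    open = cvar.wait(open).unwrap();
                }
            })
            .expect("the OS refused to create another thread");
        handles.push(handle);
    }

    // All threads now exist and are waiting (or about to wait). Open the gate.
    {
        let (lock, cvar) = &*gate;
        *lock.lock().unwrap() = true;
        cvar.notify_all();
    }

    let joined = handles.len();
    for h in handles {
        h.join().unwrap();
    }
    joined
}
```

Calling `spawn_waiting_threads(10_000, 64 * 1024)` from `main` and watching the machine while the threads sit idle is one way to try the claim; the 64 KiB stack is an arbitrary choice, not a recommendation.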

devit 2 years ago |

Rust is designed like this because it seeks to achieve zero-cost abstractions and safety.

Or in other words, the goal is that you can think in abstract what the natural optimal machine code would be for a program, and you can write a Rust program that, in principle, can compile to that machine code, with as little constraints as possible on what that machine code looks like.

Unlike C, which also has this property, Rust additionally seeks to guarantee that any code will satisfy a bunch of invariants (such as that a variable of a data type actually always holds a valid value of that data type) provided the unsafe code part satisfies a bunch of invariants.

If you use Go or Haskell, that's not possible.

For example, Go requires a GC (and thus has to waste CPU cycles uselessly scanning memory), and Haskell has to use memory to store thunks rather than the actual data and has limited mutation (meaning you waste CPU cycles uselessly handling lazy computations and copying data). Obviously neither of these is required for the vast majority of programs, so choosing such a language means your program is unfixably handicapped in terms of efficiency, and has no chance to compile to the machine code that any reasonable programmer would conceive as the best solution to the problem.

reacharavindh 2 years ago |

Is it possible to use Rust and all the important 3rd party crates without using async?

Out of curiosity, could Rust be limited to a language subset to mimic the simplicity of Golang (with channels and message passing) and trade-off some of the powerful features that seem to be causing pain?

Pardon a naïve question. I’m a systems engineer who occasionally dabbles with simple cli tools in all languages for fun, but don’t have a serious need for them.

pdimitar 2 years ago | |

I'd really like something like this! I would also love a Cargo switch / config that disables all panicky Rust API like `expect` and `unwrap`.

From what I can gather, such projects will never happen though. That's why I moved part of my work to Golang itself.

Rust is an amazing language. Though the team really takes the "system language" thing very seriously and they're making decisions and tradeoffs based on that, so it seems us its users should adapt and not use Rust for everything. That's what I ended up doing.

agentultra 2 years ago |

Although still experimental, GHC Haskell has a linear types extension that enables developers to specify lifetimes [0] that can be statically checked.

Good call, re: garbage collection FUD. Ultimately many programs have to clean up memory after it is no longer needed by the program and at a certain scale in a program it becomes necessary to write code that handles allocations/deallocations; and you end up manually writing a garbage collector. Done well you can get better performance for certain cases but often it's done haphazardly and you end up with poor performance.

It seems a good amount of Rust evangelism has given up on the, "no GC is required for performance," maxim. Is that the case, Rust friends?

That being said, I think it would be neat if there were a language like Haskell where there was an interface exposed by the compiler where a user could specify their own GC.

[0] https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/line...

samsquire 2 years ago |

I think the people behind Rust are working on particularly hard and complicated problems, so I don't want to say that the async Rust work is bad. I would instead say it is complicated and hard to understand and use casually.

Async Rust is many language features and behaviours all interacting with each other at the same time, to create something more complicated than how you would describe the problem you're actually trying to solve (I want to do X when Y happens, and I want to do X when Y happens × the number of hardware threads). When you're using async Rust, you have to think more carefully about:

* memory management (Arc) and safety and performance

* concurrency

* parallelism, arbitrary interleavings

* thread safety

* lifetimes

* function colouring

All interacting together to create a high cognitive load.

Is the assembly equivalent of multithreading and async complicated?

Multithreading, async, coroutines, concurrency and parallelism are a hobby research area of mine that I enjoy. I journal about it all the time.

* I think there's a way to design systems to be data intensive (Kleppmann) and data orientated (Mike Acton) with known-to-scale practices.

* I want programming languages to adopt these known-to-scale practices and make them easy.

* I want programs written in the model of the language to scale (linearly) by default. Rama from Red Planet Labs is an example of a model that scales.

* HN user mgaunard [0] told me about "algorithmic skeletons" which might be helpful if you're trying to parallelise. https://en.wikipedia.org/wiki/Algorithmic_skeleton

I think the concurrency primitives in programming languages are sharp-edged and low level; people reach for them and end up building on primitives that are too low level for the desired outcome.

[0]: https://news.ycombinator.com/item?id=36792796

[1]: https://blog.redplanetlabs.com/2023/08/15/how-we-reduced-the...

Note: You can use async Rust without threading but I assumed you're using multithreading.

brundolf 2 years ago |

Whether or not you agree with the conclusion, this is an excellent article. It gives a great overview of the background topics (I had a few gaps filled in!), and does a really good job of explaining the problem

Re: the conclusion, I wonder if this is a problem that can be solved over time with abstractions (i.e. async Rust is a good foundation that's just too low-level for direct use)?

hannofcart 2 years ago |

As far as I understand the article, the author is talking about "massively concurrent userspace programs" which also have one more constraint: they can't rely on IPC to breakup or offload that concurrency.

(They mention this extra constraint early in the article: "But this approach has its limitations. Inter-process communication is not cheap, since most implementations copy data to OS memory and back.")

I'm familiar with writing services with large throughputs by offloading tasks onto a queue (say Redis/Rabbitmq whatever) and having a lot of single threaded "agents" or "workers" picking them off the queue and processing them.

But as implied in the earlier quote from the article, this is not a fast or cheap enough solution for the problems the author is talking about.

So now I am left wondering: what are some examples of the class of (1%) problems the author is talking about in this article?

kunley 2 years ago |

I am very happy someone finally addressed the concerns, and kudos for breaking out of the anti-gc zealotry.

nesarkvechnep 2 years ago |

Not a single mention of Erlang. Haskell's parallelism is very much inspired by Erlang's model.

endorphine 2 years ago |

Can someone explain where the stackless/stackful naming comes from and what does it mean?

dgroshev 2 years ago | |

Rust futures don't have stacks: all local variables get translated into struct fields, so suspending execution at an await point is trivial; you just start executing another future. It also means there is no need for a runtime for a future to make sense (it's just a struct with a trait implemented) and the futures are completely static and amenable to compiler optimisations like inlining. On the other hand, it means that the structure may be self-referential (one "local" variable might refer to another, so the struct has addresses in it), which means it can't be safely moved (because those addresses would still point at the old location).
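
To make the "locals become struct fields" idea concrete, here is a hand-written approximation of a stackless future. It is a sketch, not actual compiler output: the names `AddLater`, `NoopWake` and `drive` are invented for illustration, and this particular future holds no self-references, which is why it can be polled without any pinning tricks.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Roughly what an async fn that yields once and then returns `a + b`
/// could look like as a state machine. The "locals" `a` and `b` are
/// stored in the enum variants instead of on a stack.
enum AddLater {
    Start { a: u32, b: u32 },
    Suspended { a: u32, b: u32 },
    Done,
}

impl Future for AddLater {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // Allowed because AddLater is Unpin (it contains no self-references).
        let this = self.get_mut();
        match *this {
            AddLater::Start { a, b } => {
                // "Suspend": record where to resume and hand control back.
                *this = AddLater::Suspended { a, b };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            AddLater::Suspended { a, b } => {
                *this = AddLater::Done;
                Poll::Ready(a + b)
            }
            AddLater::Done => panic!("polled after completion"),
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future in a loop and returns (number of polls, result).
fn drive(mut fut: AddLater) -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return (polls, value);
        }
    }
}
```

Nothing here needs a runtime: `drive(AddLater::Start { a: 2, b: 3 })` is an ordinary function call on an ordinary value.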

"Stackful" coroutines, on the other hand, do have runtime stacks (holding local variables) that get swapped out by the runtime on await points. It makes the code behave exactly like non-async code, but requires a runtime to manage those stacks. Rust didn't go this way, preferring the benefits of the stackless approach.

Veliladon 2 years ago |

> You might say this isn’t a fair comparison—after all, those languages hide the difference between blocking and non-blocking code behind fat runtimes, and lifetimes are handwaved with garbage collection. But that’s exactly the point! These are pure wins when we’re doing this sort of programming.

Until all the work you're trying to push is generating so many allocations that your GC goes to shit once every two minutes trying to clean up the mess you made. (https://discord.com/blog/why-discord-is-switching-from-go-to...)

mrkline 2 years ago | |

Go's GC is hardly state of the art.

pdimitar 2 years ago | | |

I got no horse in this race -- I like both Golang and Rust and use them for different things -- but as far as I can tell, Golang's GC has improved a lot.

Not sure where it stands on a global competition rating board (if there's even such a thing) but it's pretty good. I've never seen it crap the bed, though I also never worked at the scale of Twitch and Discord.

hot_gril 2 years ago | |

I have a lot of respect for Discord's technical decisions. They know when to do things the bland way and when to use more specialized technologies. Note that that article also praises async Rust.

adam_arthur 2 years ago |

Anyone know why there isn't a single type/interface that allows for consumers to supply any of Arc, Rc etc boxed values?

I haven't investigated it deeply, but I was developing something in Rust, and whether something needs to be threadsafe or not is entirely on the consumer's use case... bad separation of concerns for the provider of a generic interface to have to specify the specific type of boxed value. 100% fine if the behavior in this case is to pre-allocate the max possible boxed type memory requirement.

This is the only thing I was really frustrated with in Rust

loeg 2 years ago | |

> Anyone know why there isn't a single type/interface that allows for consumers to supply any of Arc, Rc etc boxed values?

Your generic interface just takes a reference to the value inside the box.

adam_arthur 2 years ago | | |

Borrowing the value didn't work for this case, but I forget the reasoning at the moment. This is a struct that may or may not be used in threaded code. The code is highly polymorphic and involves a few traits.

Using Arc everywhere solves it, but it's dumb and inefficient for non-threaded use cases. Maybe the compiler optimizes this, who knows. Semantically it's wrong though.

Honestly forget the specifics enough at this point to discuss so I'll drop it haha.

Was just curious whether somebody else was tracking this, or there was a known workaround. I think it's something the language will eventually support. I saw other threads on rustlang asking for the same thing, and best I saw was some sort of enum style hack representing the boxed types to emulate it

devit 2 years ago | |

You can use impl Borrow<T> for that if the choice is static.

If it's dynamic, you can use Cow or the supercow/bos/... crates if you want Arc/Rc to be options as well.
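
A minimal sketch of the static `impl Borrow<T>` option, assuming a made-up `Config` payload and made-up `describe` and `Service` names. It only covers the case where the pointer type is fixed at compile time:

```rust
use std::borrow::Borrow;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical payload type, for illustration only.
struct Config {
    name: String,
}

/// Accepts a Config by value, by reference, or behind Box, Rc or Arc:
/// all of those implement Borrow<Config> in the standard library.
fn describe(cfg: impl Borrow<Config>) -> String {
    let cfg: &Config = cfg.borrow();
    format!("config:{}", cfg.name)
}

/// A struct that is generic over how it holds its Config. The caller
/// picks Rc for single-threaded use or Arc for threaded use.
struct Service<P: Borrow<Config>> {
    cfg: P,
}

impl<P: Borrow<Config>> Service<P> {
    fn name_len(&self) -> usize {
        let cfg: &Config = self.cfg.borrow();
        cfg.name.len()
    }
}
```

With this shape, `Service<Rc<Config>>` never pays for atomic reference counting, while `Service<Arc<Config>>` can be shared across threads; the trade-off is that the choice is baked into the type.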

adam_arthur 2 years ago | | |

I forget the specifics at the moment, but impl Borrow didn't work for my case, I definitely tried it and expected it to work though.

I'll check the Cow crate.

legohead 2 years ago |

Hopefully this is an appropriate place to ask this question: I don't know what to do about threads/CPU for a game server project I want to write.

I really want to use TypeScript, as I like the language and I want to use this as a way to learn it better. I'm not expecting to have some super successful game, but the programmer part of my mind is upset at not utilizing all the cores of the machine. So, what do people do? Split up the server into multiple independent running components, or is my choice really to just use another language?

tiberriver256 2 years ago |

Anyone with electrical engineering know if pata vs. sata cables is a good analogy for async vs. sync?

I know parallel ATA cables were all the rage. They had a higher theoretical throughput when compared with serial ATA cables but there was too much cross-talk involved to make it actually faster in the end so now we have serial ATA cables everywhere with much higher throughput than parallel ATA cables could ever achieve.

Should we move back away from parallelism and focus on handling synchronous stuff faster instead?

adrienthebo 2 years ago | |

PATA vs. SATA is a somewhat limited metaphor; PATA had a number of limitations such as the inability to hot swap hardware as well as using wide ribbon cables that made it largely obsolete. In contrast, both sync and async programming have reasonable applications; we're likely using both for the foreseeable future. The best EE analogy I can think of is using hyperthreading to execute multiple processes on a single core vs scheduling each thread to a separate core, but that's less a metaphor and more of a simplified model of what async vs sync is actually doing.

> Should we move back away from parallelism and focus on handling synchronous stuff faster instead?

Rust already has excellent handling of synchronous computation, given that it can meet/sometimes exceed equivalent performance in C. The problem is when you're I/O or network bound; you can either throw threads at the problem (and by extension throw memory at the problem for the thread stacks) or use async programming.

ansible 2 years ago | |

We're dealing with fundamental limitations of I/O though, which is causing the delays.

I want to write stuff to disk (SSD these days). I can issue a request, then have to wait tens to hundreds of milliseconds (in the average case, the worst case can be far longer) for that request to finish and let me know that my I/O request succeeded or failed. There's no getting around that with present-day technology.

The situation is worse and even less reliable with network I/O. If you are talking to a server on another continent, the speed of light determines the minimum time before I hear back from it, even if it (and all the intermediary network links) are lightly loaded and functioning perfectly.

bullen 2 years ago |

I don't know if I'm adding to the noise or signal but there is a definite solution to this problem: Put your C data in Arrays of 64 byte Structs.

Java is ok too if you want object oriented atomic joint parallelism, but I only recommend using it on the server where you need a VM anyhow.

C from 1970 and Java from 1990 still got things right.

Also Vulkan/Metal/DX12 does not really help, OpenGL 3 with VAO is enough.

nailer 2 years ago |

> We want to use the whole computer. Code runs on CPUs, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several cores.

Er, no. That’s not what those words mean.

“We want to use the whole computer. Code runs on CPU cores, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several threads.”

qwerty456127 2 years ago |

> We want to use the whole computer. Code runs on CPUs

Well, I would hardly mind to use the GPU for any part of my program which would fit it. That's why I believe it could be a great idea for a modern programming language to include first-class GPU-accelerated types and instructions.

vlovich123 2 years ago |

Async actually doesn’t require sync/send. This only happens if you have a work stealing runtime like tokio. However, single threaded runtimes are possible and do exist. Work stealing is appropriate for some use cases, thread per core designs for other use cases.
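
As a rough illustration of the single-threaded case, here is a tiny `block_on` written against the standard library only. It is a sketch under stated assumptions, not how any particular runtime is implemented; `ThreadWaker`, `block_on` and `shared_counter_demo` are names invented here. The future it runs holds an `Rc<RefCell<_>>` across an await point, so it is not `Send`, and nothing complains because the executor never moves it to another thread.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the executor by unparking the thread that is running it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-threaded executor. Note the absence of a `Send` bound.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until some waker calls unpark on this thread.
            Poll::Pending => thread::park(),
        }
    }
}

/// Runs a non-Send future that mutates shared state through Rc<RefCell<_>>.
fn shared_counter_demo() -> u32 {
    let counter = Rc::new(RefCell::new(0u32));
    let handle = Rc::clone(&counter);
    block_on(async move {
        *handle.borrow_mut() += 1;
        // `handle` (an Rc) is alive across this await point.
        std::future::ready(()).await;
        *handle.borrow_mut() += 1;
    });
    let total = *counter.borrow();
    total
}
```

A work-stealing scheduler has to require `Send` for spawned tasks precisely because it may resume them on a different thread than the one that started them.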

bmacho 2 years ago |

> Maybe Rust isn’t a good tool for massively concurrent, userspace software. We can save it for the 99% of our projects that don’t have to be.

Please make it happen! I want my userspace software to be in Rust!

Although, if it won't happen, then even better, a free real estate for a RustScript.

the__alchemist 2 years ago |

I'm a bit on my own for HAL-level tooling in embedded Rust, partly because many of the libs and discussion in OSS Rust mediums focus on async. I've tried to grok and use async a few times, and have come off disliking it each time.

Not my cup of tea.

chmod600 2 years ago |

What's expensive about having a lot of threads/processes? Could those costs be brought down somehow? Would it help to configure the threads as they are starting (e.g. start a tiny thread)?

IceHegel 2 years ago |

I like how this guy writes. Simple, casual, clean.

no_wizard 2 years ago |

Go-style coroutines would have been a better path than the async keyword. Java with Project Loom got this right.

I think having to keyword async is frustrating as a design decision

not_me_ever 2 years ago |

What a pile of garbage.

chrisweekly 2 years ago |

I'm not a Rust programmer (yet?) but this seems like a pretty valid set of criticisms which I'll keep in mind if/when that changes.

gorenb 2 years ago |

You are wrong. Very wrong. Async Rust is the best. Best Rust is async. I’m going to await at you. Async rust is good. Rust is fast and safe. More than C.

neonsunset 2 years ago |

I have been using "async/await is bad, use {feature name[0]}" as a litmus test for people who are generally bad at programming, and especially so at the concurrent flavour of it.

Sure, Rust is certainly verbose and very strict about how the ownership rules apply in the context of async, but this is a hard constraint of its memory safety model. We could probably do better while retaining all performance but this is by far one of the best implementations. Another example of nice-to-use async/await is C#, which trades performance/memory (state has to be boxed if it is to live across continuations) for convenience (you just write it naturally without worrying about underlying behavior).

There is a reason Rust toyed with "green threads" at its inception but decided against them. The only popular languages of today that do these are Go and Java (which was basically forced to do this because you can't go async without introducing the feature early in the lifecycle of the language, and the authors of project Loom are simply wrong with their excuses why this is superior to async/await).

Async/await is here to stay and is the right abstraction, git good, and it's not even difficult to use anyway.

[0] where feature name is green threads, not doing concurrency at all, doing it manually, etc.

    var user = try alloc.create(@Frame(service.GetUser));
    user.* = async service.GetUser(id);
    defer alloc.destroy(user);
    var promos = try alloc.create(@Frame(service.GetPromotions));
    promos.* = async service.GetPromotions(category);
    defer alloc.destroy(promos);
    var eligible = GetEligibility(await user.*, await promos.*);