Java is ok too if you want object oriented atomic joint parallelism, but I only recommend using it on the server where you need a VM anyhow.
C from 1970 and Java from 1990 still got things right.
Also Vulkan/Metal/DX12 does not really help, OpenGL 3 with VAO is enough.
Er, no. That’s not what those words mean.
“We want to use the whole computer. Code runs on CPU cores, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several threads.”
Well, I would hardly mind using the GPU for any part of my program that fits it. That's why I believe it could be a great idea for a modern programming language to include first-class GPU-accelerated types and instructions.
Please make it happen! I want my userspace software to be in Rust!
Although, if it won't happen, then even better, a free real estate for a RustScript.
Not my cup of tea.
I think having to keyword async is frustrating as a design decision
Sure, Rust is certainly verbose and very strict how the ownership rules apply in the context of async, but this is a hard constraint of its memory safety model. We could probably do better while retaining all performance but this is by far one of the best implementations. Another example of nice to use async/await is C# which trades performance/memory (state has to be boxed if it is to live across continuations) for convenience (you just write it naturally without worrying about underlying behavior).
There is a reason Rust toyed with "green threads" at its inception but decided against them. The only popular languages of today that do these are Go and Java (which was basically forced to do this because you can't go async without introducing the feature early in the lifecycle of the language, and the authors of Project Loom are simply wrong with their excuses why this is superior to async/await).
Async/await is here to stay and is the right abstraction, git good, and it's not even difficult to use anyway.
[0] where feature name is green threads, not doing concurrency at all, doing it manually, etc.
It's probably the right abstraction for Haskell, or any other language that works well with functional programming, lambdas and monads. Loom is a better fit for Java. Rust also would have probably been better off with something else. Effect handlers might have been a good choice.
Why?
I guess I should specify that this is true even in a single-threaded context and even without any additional buffers or whatever.
I just write the same code that I would write if it were synchronous and it executes asynchronously.
var user = service.GetUser(id);
var promos = service.GetPromotions(category);
var eligible = GetEligibility(await user, await promos);
var user = try alloc.create(@Frame(service.GetUser));
user.* = async service.GetUser(id);
defer alloc.destroy(user);
var promos = try alloc.create(@Frame(service.GetPromotions));
promos.* = async service.GetPromotions(category);
defer alloc.destroy(promos);
var eligible = GetEligibility(await user.*, await promos.*);
but the claimed difference is not present in this code. It is that GetUser and GetPromotions do not themselves know whether they are async. The author of service is able to just take some readers and writers or some types or objects implementing some protocols and use them, then service's methods inherit the async-ness of its dependencies.
The point of async/await is that it converts your function into a reentrant state machine (in a different way than compiling a sync function already turned it into a state machine.) The problem with the usual design is that it uses futures, which are bad because they have dynamic lifetimes.
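To make the "reentrant state machine" point concrete, here is a rough hand-written sketch of what an async function that suspends twice and then returns a sum could look like. It is not what the Rust compiler actually emits; `AddLater`, `NoopWake` and `block_on_counting` are hypothetical names made up for illustration, and the "executor" is just a busy loop.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written state machine: each variant holds exactly the locals
/// that must survive across a suspension point.
enum AddLater {
    Start { a: u32, b: u32 },
    Waiting { a: u32, b: u32, polls_left: u32 },
    Done,
}

impl Future for AddLater {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // AddLater holds only plain integers, so it is Unpin and get_mut is allowed.
        let this = self.get_mut();
        loop {
            match *this {
                AddLater::Start { a, b } => {
                    // First entry: set up the state, then fall through to Waiting.
                    *this = AddLater::Waiting { a, b, polls_left: 2 };
                }
                AddLater::Waiting { a, b, polls_left } => {
                    if polls_left == 0 {
                        *this = AddLater::Done;
                        return Poll::Ready(a + b);
                    }
                    // Suspend: remember where we are and ask to be polled again.
                    *this = AddLater::Waiting { a, b, polls_left: polls_left - 1 };
                    cx.waker().wake_by_ref();
                    return Poll::Pending;
                }
                AddLater::Done => panic!("polled after completion"),
            }
        }
    }
}

/// Waker that does nothing; enough for a loop that re-polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (assumption: busy-polling is acceptable for a demo).
/// Returns the output and how many times the future was polled.
fn block_on_counting<F: Future + Unpin>(mut fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (sum, polls) = block_on_counting(AddLater::Start { a: 2, b: 3 });
    println!("result {sum} after {polls} polls");
}
```

The whole "stack" of the task is the enum itself, which is why no per-task runtime stack is needed.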
Maybe in the 2000's but I feel this reasoning is no longer valid in 2023 and should be put to rest.
10k problem.. Wouldn't modern computing not work if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8 core machine.
The async case is suited to situations where you're blocking for things like network requests. In that case the thread will be doing nothing, so we want to hand off the work to another task of some kind that is active. Green threads mean you can do that without a context switch.
It got even more expensive in recent years after all the speculative execution vulnerabilities in CPUs, so now you have additional logic on every context switch with mitigations on in kernel.
I have no doubt that having a thread per core and managing the data with only non-blocking operations is much faster. But I'm pretty sure current machines can manage a thousand or so threads locked almost the entire time just fine.
So do we discard existing ways of making software more efficient because we can be more wasteful on more recent hardware? What if we could develop our software such that 2000s computers are still useful, rather than letting those computers become e-waste?
> The numbers reported here paint an interesting picture on the state of Linux multi-threaded performance in 2018. I would say that the limits still exist - running a million threads is probably not going to make sense; however, the limits have definitely shifted since the past, and a lot of folklore from the early 2000s doesn't apply today. On a beefy multi-core machine with lots of RAM we can easily run 10,000 threads in a single process today, in production. As I've mentioned above, it's highly recommended to watch Google's talk on fibers; through careful tuning of the kernel (and setting smaller default stacks) Google is able to run an order of magnitude more threads in parallel.
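If you want to try the "smaller default stacks" idea from that quote yourself, a minimal sketch with the standard library might look like this. The thread count (256) and stack size (64 KiB) are arbitrary assumptions, not tuned numbers, and `run_small_stack_threads` is a made-up helper name.

```rust
use std::thread;

/// Spawns `n` OS threads with a reduced stack size. Each thread returns its
/// own index, and the indices are summed after joining.
fn run_small_stack_threads(n: usize, stack_bytes: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|i| {
            thread::Builder::new()
                .stack_size(stack_bytes) // the platform may round this up
                .spawn(move || i)
                .expect("failed to spawn thread")
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let total = run_small_stack_threads(256, 64 * 1024);
    println!("sum of thread indices: {total}");
}
```

Real workloads need deeper stacks than a closure that returns an integer, so the usable minimum depends on what the threads actually call.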
By the 2010s the problem had been updated to C10M. The people discussing it (well, perhaps some) aren't idiots and understand that the threshold changes as hardware changes.
Also, the issue isn't creating 10k threads it's dealing with 10k concurrent users (or, again, a much higher number today).
Typically, if you want to build something with Rust, it'll have to use async, at least because gRPC and the like are implemented that way. So the vanilla (and excellent, IMO) Rust language doesn't exist there. Everything is async from the get-go.
A weird way to use Rust since you can do a lot of messaging within the process, and use the computing power much more efficiently.
RPC is essentially messaging and message-passing. Message-passing is a way to avoid mutable shared state - this is the model with which Go became successful.
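For comparison, in-process message passing in Rust needs nothing beyond `std::sync::mpsc`. The sketch below keeps a counter map owned by a single thread and only talks to it through messages; `Msg`, `spawn_counter` and `total_after_adds` are hypothetical names for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Requests understood by the counter thread.
enum Msg {
    Add { key: String, amount: u64 },
    Get { key: String, reply: mpsc::Sender<u64> },
}

/// Starts a thread that exclusively owns the map. No locks are involved:
/// other threads can only send it messages.
fn spawn_counter() -> (mpsc::Sender<Msg>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Msg>();
    let handle = thread::spawn(move || {
        let mut counts: HashMap<String, u64> = HashMap::new();
        // The loop ends once every Sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Add { key, amount } => *counts.entry(key).or_insert(0) += amount,
                Msg::Get { key, reply } => {
                    let _ = reply.send(counts.get(&key).copied().unwrap_or(0));
                }
            }
        }
    });
    (tx, handle)
}

/// Sends each amount from its own thread, then asks for the total.
fn total_after_adds(adds: &[u64]) -> u64 {
    let (tx, handle) = spawn_counter();

    let senders: Vec<_> = adds
        .iter()
        .map(|&amount| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(Msg::Add { key: "hits".to_string(), amount }).unwrap()
            })
        })
        .collect();
    for s in senders {
        s.join().unwrap();
    }

    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Msg::Get { key: "hits".to_string(), reply: reply_tx }).unwrap();
    let total = reply_rx.recv().unwrap();

    drop(tx); // close the channel so the counter thread exits
    handle.join().unwrap();
    total
}

fn main() {
    println!("total = {}", total_after_adds(&[1, 2, 3, 4]));
}
```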
RPC surely has its uses, but message passing is just another, and very often inferior, solution to a problem set for which Rust has excellent solutions of its own.
If I'm implementing a library, how should I write it so that the consumer of the library doesn't have to pull in Tokio if they don't want to?
The arguments about Arc fall flat because how else would you safely manage shared references, even in other lower level languages. And so called "modern GCs" still do come with a significant hit in performance; it's not just some "bizarre psyop".
Really the only problem I've run into with Rust's async/await is the fact that there is not much support for composing async tasks in a structured way (i.e. structured programming) and the author doesn't even touch on this issue.
Ultimately the goals and criticism of the author are just downright confusing because at the end he admits that he doesn't actually care for the fact that Rust is design constrained by being a low level language and instead advocates for using Haskell or Go for any application that requires significant concurrency. So to reformulate his argument: we should never use or design into low level languages an ergonomically integrated concurrency runtime because it may have a handful of engineering challenges. When put concisely, their thesis is really quite ridiculous.
With all this in mind, I really like Swift's concurrency runtime. It does automatic thread migration and compaction to reduce the overhead of context switches, balances the thread allocation system-wide taking relative priorities into account, and it appears to be based on continuations instead of state machines. A very interesting design worth studying IMO.
It's too complex.
Something simpler is needed with the benefits of memory safety.
I've coded performant applications on an OS that used channels and it sucked. It just got in the way and was confusing to engineers used to lower level constructs. "Just get out of my way!"
I think rust async is hard.
And that's what it comes down to. 99.9% (maybe more nines) of people do not need that level of control. They need conceptually simple things, like channels, and GC, and that will work for nearly everyone. The ones who need to drop to rust either have the engineers to do that, or their problem is intractable (for them). I pity those who drop to rust prematurely because it's cool.
I'm very curious; what OS is this?
Isn't that already, in this strong generality, an almost always wrong assumption?
Sure, one can do massively parallel or embarrassingly parallel computation.
Sure, graphic cards are parallel computers.
Sure, OS kernels use multiple cores.
Sure, languages and concepts like Clojure exist and work - for a specific domain, like web services (and for that, Clojure works fascinatingly well).
But there are many, even conceptually simple algorithms which are not easy to parallelize. There is no efficient parallel Fast Fourier Transform I know of.
Try it. It'll probably work fine. It may be very expensive, memory wise, but it's easy to get a machine with a lot of memory.
It's been tried, periodically. Still sucks.
Of course, if they're doing real work they'll be using CPU time, but that's true of any scheme you might pick.
Or in other words, the goal is that you can think in the abstract about what the natural optimal machine code would be for a program, and you can write a Rust program that, in principle, can compile to that machine code, with as few constraints as possible on what that machine code looks like.
Unlike C, that also has this property, Rust additionally seeks to guarantee that any code will satisfy a bunch of invariants (such as that a variable of a data type actually always holds a valid value of that data type) provided the unsafe code part satisfies a bunch of invariants.
If you use Go or Haskell, that's not possible.
For example, Go requires a GC (and thus has to waste CPU cycles uselessly scanning memory), and Haskell needs memory to store thunks rather than the actual data and has limited mutation (meaning you waste CPU cycles uselessly handling lazy computations and copying data). Obviously neither of these is required for the vast majority of programs, so choosing such a language means your program is unfixably handicapped in terms of efficiency, and has no chance to compile to the machine code that any reasonable programmer would conceive as the best solution to the problem.
Out of curiosity, could Rust be limited to a language subset to mimic the simplicity of Golang (with channels and message passing) and trade-off some of the powerful features that seem to be causing pain?
Pardon a naïve question. I’m a systems engineer who occasionally dabbles with simple cli tools in all languages for fun, but don’t have a serious need for them.
From what I can gather, such projects will never happen though. That's why I moved part of my work to Golang itself.
Rust is an amazing language. Though the team really takes the "system language" thing very seriously and they're making decisions and tradeoffs based on that, so it seems we, its users, should adapt and not use Rust for everything. That's what I ended up doing.
Good call, re: garbage collection FUD. Ultimately many programs have to clean up memory after it is no longer needed by the program and at a certain scale in a program it becomes necessary to write code that handles allocations/deallocations; and you end up manually writing a garbage collector. Done well you can get better performance for certain cases but often it's done haphazardly and you end up with poor performance.
It seems a good amount of Rust evangelism has given up on the, "no GC is required for performance," maxim. Is that the case, Rust friends?
That being said, I think it would be neat if there were a language like Haskell where there was an interface exposed by the compiler where a user could specify their own GC.
[0] https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/line...
Async Rust is many language features and behaviours all interacting with each other at the same time, creating something more complicated than how you would describe the problem you're actually trying to solve (I want to do X when Y happens, and I want to do X when Y happens × the number of hardware threads). When you're using async Rust, you have to think more carefully about:
* memory management (Arc) and safety and performance
* concurrency
* parallelism, arbitrary interleavings
* thread safety
* lifetimes
* function colouring
All interacting together to create a high cognitive load.
Is the assembly equivalent of multithreading and async complicated?
Multithreading, async, coroutines, concurrency and parallelism are a research hobby of mine that I enjoy. I journal about it all the time.
* I think there's a way to design systems to be data intensive (Kleppmann) and data orientated (Mike Acton) with known-to-scale practices.
* I want programming languages to adopt these known-to-scale practices and make them easy.
* I want programs written in the model of the language to scale (linearly) by default. Rama from Red Planet Labs is an example of a model that scales.
* HN user mgaunard [0] told me about "algorithmic skeletons" which might be helpful if you're trying to parallelise. https://en.wikipedia.org/wiki/Algorithmic_skeleton
I think the concurrency primitives in programming languages are sharp-edged and low-level; people reach for them and build on primitives that are too low-level for the desired outcome.
[0]: https://news.ycombinator.com/item?id=36792796
[1]: https://blog.redplanetlabs.com/2023/08/15/how-we-reduced-the...
Note: You can use async Rust without threading but I assumed you're using multithreading.
Re: the conclusion, I wonder if this is a problem that can be solved over time with abstractions (i.e. async Rust is a good foundation that's just too low-level for direct use)?
(They mention this extra constraint early in the article: "But this approach has its limitations. Inter-process communication is not cheap, since most implementations copy data to OS memory and back.")
I'm familiar with writing services with large throughputs by offloading tasks onto a queue (say Redis/Rabbitmq whatever) and having a lot of single threaded "agents" or "workers" picking them off the queue and processing them.
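For reference, the in-process analogue of that queue-plus-workers setup can be sketched with a shared channel instead of Redis or RabbitMQ. This is only an illustration: the "job" is squaring a number, and `process_jobs` is a hypothetical helper, not anything from the article.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Pushes `jobs` onto a queue, lets `workers` threads pick them off,
/// and returns the sorted results.
fn process_jobs(jobs: Vec<u64>, workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<u64>();
    let (result_tx, result_rx) = mpsc::channel::<u64>();
    // std's Receiver is single-consumer, so workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let result_tx = result_tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while waiting for and taking one job.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok(n) => result_tx.send(n * n).unwrap(),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(result_tx); // only the workers hold result senders now

    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx); // closing the queue tells the workers to stop

    for h in handles {
        h.join().unwrap();
    }
    let mut results: Vec<u64> = result_rx.iter().collect();
    results.sort_unstable();
    results
}

fn main() {
    println!("{:?}", process_jobs(vec![3, 1, 2], 4));
}
```

Nothing is copied through the OS here, which is the cost the article's quote complains about for the multi-process version.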
But as implied in the earlier quote from the article, this is not a fast or cheap enough solution for the problems the author is talking about.
So now I am left wondering: what are some examples of the class of (1%) problems the author is talking about in this article?
"Stackful" coroutines, on the other hand, do have runtime stacks (holding local variables) that get swapped out by the runtime on await points. It makes the code behave exactly like non-async code, but requires a runtime to manage those stacks. Rust didn't go this way, preferring the benefits of the stackless approach.
Until all the work you're trying to push is generating so many allocations that your GC goes to shit once every two minutes trying to clean up the mess you made. (https://discord.com/blog/why-discord-is-switching-from-go-to...)
Not sure where it stands on a global competition rating board (if there's even such a thing) but it's pretty good. I've never seen it crap the bed, though I also never worked at the scale of Twitch and Discord.
I haven't investigated it deeply, but I was developing something in Rust, and whether something needs to be threadsafe or not is entirely on the consumer's use case... bad separation of concerns for the provider of a generic interface to have to specify the specific type of boxed value. 100% fine if the behavior in this case is to pre-allocate the max possible boxed type memory requirement.
This is the only thing I was really frustrated with in Rust
Your generic interface just takes a reference to the value inside the box.
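A small sketch of what that suggestion could look like: the library function borrows the value, and the caller decides whether it lives in a `Box`, an `Rc` or an `Arc`. `Config` and `describe` are made-up names for illustration.

```rust
use std::rc::Rc;
use std::sync::Arc;

struct Config {
    name: String,
    retries: u32,
}

/// Library side: borrows the value and does not care what owns it.
fn describe(cfg: &Config) -> String {
    format!("{} (retries: {})", cfg.name, cfg.retries)
}

fn make(name: &str) -> Config {
    Config { name: name.to_string(), retries: 3 }
}

fn main() {
    let boxed = Box::new(make("boxed"));
    let local = Rc::new(make("single-threaded"));
    let shared = Arc::new(make("thread-safe"));

    // Deref coercion turns &Box<Config>, &Rc<Config> and &Arc<Config>
    // into &Config at the call site.
    println!("{}", describe(&boxed));
    println!("{}", describe(&local));
    println!("{}", describe(&shared));
}
```

This only covers interfaces that can work with a borrow; if the library has to store the value and hand out shared ownership, the choice of pointer type comes back.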
Using Arc everywhere solves it, but it's dumb and inefficient for non-threaded use cases. Maybe the compiler optimizes this though, who knows. Semantically it's wrong though.
Honestly forget the specifics enough at this point to discuss so I'll drop it haha.
Was just curious whether somebody else was tracking this, or there was a known workaround. I think it's something the language will eventually support. I saw other threads on rustlang asking for the same thing, and best I saw was some sort of enum style hack representing the boxed types to emulate it
If it's dynamic, you can use Cow or the supercow/bos/... crates if you want Arc/Rc to be options as well.
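For the plain `Cow` case (the one in `std::borrow`, not the third-party crates mentioned), a minimal sketch: the function hands back a borrow when nothing changes and allocates only when it has to. `expand_tabs` is a hypothetical example function.

```rust
use std::borrow::Cow;

/// Replaces tabs with four spaces, allocating only if the input has a tab.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    for line in ["no tabs here", "one\ttab"] {
        match expand_tabs(line) {
            Cow::Borrowed(s) => println!("borrowed: {s}"),
            Cow::Owned(s) => println!("owned:    {s}"),
        }
    }
}
```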
I'll check the Cow crate.
I really want to use TypeScript, as I like the language and I want to use this as a way to learn it better. I'm not expecting to have some super successful game, but the programmer part of my mind is upset at not utilizing all the cores of the machine. So, what do people do? Split up the server into multiple independent running components, or is my choice really to just use another language?
I know parallel ATA cables were all the rage. They had a higher theoretical throughput when compared with serial ATA cables but there was too much cross-talk involved to make it actually faster in the end so now we have serial ATA cables everywhere with much higher throughput than parallel ATA cables could ever achieve.
Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
> Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
Rust already has excellent handling of synchronous computation, given that it can meet/sometimes exceed equivalent performance in C. The problem is when you're I/O or network bound; you can either throw threads at the problem (and by extension throw memory at the problem for the thread stacks) or use async programming.
I want to write stuff to disk (SSD these days). I can issue a request, then have to wait tens to hundreds of milliseconds (in the average case, the worst case can be far longer) for that request to finish and let me know that my I/O request succeeded or failed. There's no getting around that with present-day technology.
The situation is worse and even less reliable with network I/O. If you are talking to a server on another continent, the speed of light determines the minimum time before I hear back from it, even if it (and all the intermediary network links) are lightly loaded and functioning perfectly.
Whereas spawning native OS threads (not sure about the 10k number, could be even more with the good hardware these days) and having them all do stuff is gonna lag a whole lot more due to context switches.
So you know, apples to apples, but some apples are much better than others.
I'm not sure how that's relevant here, if for example something takes 1ms and I do it 1000 times a second, I'm using 1000 ms of CPU time vs not doing it at all. So if you want to use big o notation in this context it should be O(n) where n is the number of context switches, because you are not comparing algorithms used to switch between threads but you are comparing doing context switch or not doing it at all.
As I've mentioned before, I'm writing a high performance metaverse client. Here's a demo video.[1] It's about 40,000 lines of Rust so far.
If you are doing a non-crappy metaverse, which is rare, you need to wrangle a rather excessive amount of data in near real time. In games, there's heavy optimization during game development to prevent overloading the play engine. In a metaverse, as with a web browser, you have to take what the users create and deal with it. You need 2x-3x the VRAM a comparable game would need, a few hundred megabits per second of network bandwidth to load all the assets from servers, a half dozen or so CPUs running flat out, and Vulkan to let you put data into the GPU from one thread while another thread is rendering.
So there will be some parallelism involved.
This is not like "web-scale" concurrency, which is typically a large number of mini-servers, each doing their own thing, that just happen to run in the same address space. This is different. There's a high priority render thread drawing the graphics. There's an update thread processing incoming events from the network. There are several asset loading and decompression threads, which use up more CPU time than I'd like. There are about a half dozen other threads doing various miscellaneous tasks - handling moving objects, updating levels of detail, purging caches, and such.
There's considerable locking, but no "static" data other than constants. No globals. Channels are used where appropriate to the problem. The main object tree is single ownership, and used mostly by the update thread. Its links to graphics objects are Arc reference counted, and those are updated by both the update thread and the asset loading threads. They in turn use reference counted handles into the Rend3 library, which, via WGPU and Vulkan, puts graphics content (meshes and textures) into the GPU. Rendering is a loop which just tells Rend3 "Go", over and over.
This works out quite well in Rust. If I had to do this in C++, I'd be fighting crashes all the time. There's a reason most of the highly publicized failed metaverse projects didn't reach this level of concurrency. In Rust, I have about one memory related crash per year, and it's always been in someone else's "unsafe" code. My own code has no "unsafe", and I have "unsafe" locked out to prevent it from creeping in. The normal development process is that it's hard to get things to compile, and then it Just Works. That's great! I hate using a debugger, especially on concurrent programs. Yes, sometimes you can get stuck for a day, trying to express something within the ownership rules. Beats debugging.
I have my complaints about Rust. The main ones are:
- Rust is race condition free, but not deadlock free. It needs a static deadlock analyzer, one that tracks through the call chain and finds that lock A is locked before lock B on path X, while lock B is locked before lock A on path Y. Deadlocks, though, tend to show up early and are solid problems, while race conditions show up randomly and are hard to diagnose.
- Async contamination. Async is all wrong when there's considerable compute-bound work, and incompatible with threads running at multiple priorities. It keeps creeping in. I need to contact a crate maintainer and get them to make their unused use of "reqwest" dependent on a feature, so I don't pull in Tokio. I'm not using it, but it's there.
- Single ownership with a back reference is a very common need, and it's too hard to do. I use Rc and Weak for that, but shouldn't have to. What's needed is a set of traits to manage consistent forward and back links (that's been done by others) and static analysis to eliminate the reference counts. The basic constraints are ordinary borrow checker restrictions - if you have mutable access to either parent or child, you can't have access to the other one. But you can have non-mutable access to both. If I had time, I'd go work on that.
- I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm.
- The core graphics crates aren't finished. There was an article on HN a few days ago about this. "Rust has 5 games and 50 game engines". That's not a language problem, that's an ecosystem problem. Not enough people are doing non-toy graphics in Rust. Watch my video linked below.[1] Compared to a modern AAA game title, it's not that great. Compared to anything else being done in Rust (see [2]) it's near the front. This indicates a lack of serious game dev in Rust. I've been asked about this by some pro game devs. My comment is that if you have a schedule to meet, the Rust game ecosystem isn't ready. It's probably about five people working for a year from being ready.
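The "single ownership with a back reference" complaint in the list above refers to a pattern that, with today's standard library, looks roughly like the sketch below. `Parent`, `Child`, `add_child` and `parent_name` are hypothetical names; the runtime reference counts are exactly the overhead the comment would like static analysis to remove.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Parent {
    name: String,
    children: RefCell<Vec<Rc<Child>>>, // forward links keep children alive
}

struct Child {
    name: String,
    parent: Weak<Parent>, // back link; does not keep the parent alive
}

fn add_child(parent: &Rc<Parent>, name: &str) -> Rc<Child> {
    let child = Rc::new(Child {
        name: name.to_string(),
        parent: Rc::downgrade(parent),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    child
}

/// Follows the back link; returns None once the parent has been dropped.
fn parent_name(child: &Child) -> Option<String> {
    child.parent.upgrade().map(|p| p.name.clone())
}

fn main() {
    let scene = Rc::new(Parent {
        name: "scene".to_string(),
        children: RefCell::new(Vec::new()),
    });
    let mesh = add_child(&scene, "mesh");
    println!("{} -> {:?}", mesh.name, parent_name(&mesh));

    drop(scene);
    println!("{} -> {:?}", mesh.name, parent_name(&mesh));
}
```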
That sounds like a great idea. Something in the style of lockdep, that (when enabled) analyzes what locks are currently held while any other lock is taken, and reports any potential deadlocks (even if they haven't actually deadlocked).
That would require some annotation to handle cases of complex locking, so that the deadlock detection knows (for instance) that a given class of locks are always obtained in address order so they can't deadlock. But it's doable.
parking_lot has a deadlock detection feature for when you deadlock that iirc tells you what deadlocked (so you're not trying to figure it out with a debugger and a lot of time) https://amanieu.github.io/parking_lot/parking_lot/deadlock/i...
I also just found out about https://github.com/BurtonQin/lockbud which seems to detect deadlocks and a few other issues statically? (seems to require compiling your crate with the same version of rust as lockbud uses, which from the docs is an old 1.63 nightly build?)
It's quite nice, but for cpp not rust
If locks can be numbered or otherwise ordered, it would be easy to enforce a strict order of taking locks and an inverse strict order of releasing them, by looking up in the registry which locks your thread is currently holding. This would prevent deadlocks.
This, of course, would require having an idea of all the locks you may want to hold, and their relative order (at least partial), as Dijkstra described back in the day. But thinking about locks ahead of time is a good idea anyway.
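A stripped-down sketch of the ordering idea, without the registry: every lock gets a number, and a helper always takes the lower-numbered one first. `Account`, `lock_both`, `transfer` and `run_transfers` are illustrative names, and this only handles the two-lock case.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

struct Account {
    id: u32, // doubles as the lock's position in the global order
    balance: Mutex<i64>,
}

/// Locks both accounts in ascending `id` order, whatever the argument order,
/// and returns the guards as (guard for `a`, guard for `b`).
fn lock_both<'a>(
    a: &'a Account,
    b: &'a Account,
) -> (MutexGuard<'a, i64>, MutexGuard<'a, i64>) {
    assert_ne!(a.id, b.id, "cannot lock the same account twice");
    if a.id < b.id {
        let ga = a.balance.lock().unwrap();
        let gb = b.balance.lock().unwrap();
        (ga, gb)
    } else {
        let gb = b.balance.lock().unwrap();
        let ga = a.balance.lock().unwrap();
        (ga, gb)
    }
}

fn transfer(from: &Account, to: &Account, amount: i64) {
    let (mut from_bal, mut to_bal) = lock_both(from, to);
    *from_bal -= amount;
    *to_bal += amount;
}

/// Two threads transfer in opposite directions, the classic way to deadlock
/// when each simply locks "from" and then "to".
fn run_transfers(rounds: u32) -> (i64, i64) {
    let a = Arc::new(Account { id: 1, balance: Mutex::new(1000) });
    let b = Arc::new(Account { id: 2, balance: Mutex::new(1000) });

    let t1 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || for _ in 0..rounds { transfer(&a, &b, 1) })
    };
    let t2 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || for _ in 0..rounds { transfer(&b, &a, 1) })
    };
    t1.join().unwrap();
    t2.join().unwrap();

    let final_a = *a.balance.lock().unwrap();
    let final_b = *b.balance.lock().unwrap();
    (final_a, final_b)
}

fn main() {
    println!("{:?}", run_transfers(10_000));
}
```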
One quibble though. Rust isn't race condition free, it's data race free. You can still end up with race conditions outside of data access. https://news.ycombinator.com/item?id=23599598
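A tiny illustration of the distinction, as a sketch: the "racy" path below is entirely safe Rust with no data race (every access is atomic), yet it can still lose updates because the read and the write are separate steps. `count_with` is a made-up helper.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Increments a shared counter `threads * per_thread` times and returns
/// the final value.
fn count_with(threads: u64, per_thread: u64, racy: bool) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    if racy {
                        // Race condition: another thread may increment
                        // between the load and the store, and that
                        // increment is then overwritten.
                        let seen = counter.load(Ordering::SeqCst);
                        counter.store(seen + 1, Ordering::SeqCst);
                    } else {
                        // One indivisible read-modify-write.
                        counter.fetch_add(1, Ordering::SeqCst);
                    }
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("atomic add: {}", count_with(4, 100_000, false));
    println!("load+store: {}", count_with(4, 100_000, true));
}
```

The second number varies from run to run and is usually lower than the first; the compiler accepts both versions.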
The priority thing is relatively easy to fix:
Either create multiple thread pools, and route your futures to them appropriately.
Or, write your own event loop, and have it pull from more than one event queue (each with a different priority).
It should be even easier than that, but I don’t know of a crate that does the above out of the box.
One advantage of the second approach is (if your task runtime is bounded) that you can have soft realtime guarantees for high priority stuff even when you are making progress on low priority stuff and running at 100% CPU.
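The second option (one loop, several queues) can be sketched with two std channels. This is a deliberately small version under some assumptions: it drains whatever is queued and then stops, rather than blocking for new work, and tasks are plain strings. `drain_by_priority` is a hypothetical name.

```rust
use std::sync::mpsc::{self, Receiver};

/// Handles everything currently queued, always preferring the high-priority
/// queue, and returns the tasks in the order they were handled.
fn drain_by_priority(high: &Receiver<String>, low: &Receiver<String>) -> Vec<String> {
    let mut handled = Vec::new();
    loop {
        // Re-check the high-priority queue before every low-priority task.
        if let Ok(task) = high.try_recv() {
            handled.push(task);
            continue;
        }
        match low.try_recv() {
            Ok(task) => handled.push(task),
            Err(_) => break, // both queues are empty (or closed)
        }
    }
    handled
}

fn main() {
    let (high_tx, high_rx) = mpsc::channel();
    let (low_tx, low_rx) = mpsc::channel();

    low_tx.send("purge cache".to_string()).unwrap();
    high_tx.send("render frame".to_string()).unwrap();
    low_tx.send("update LOD".to_string()).unwrap();

    for task in drain_by_priority(&high_rx, &low_rx) {
        println!("{task}");
    }
}
```

Because the high queue is checked between every pair of low-priority tasks, the delay before a high-priority task starts is bounded by the longest single task, which is the soft realtime property mentioned above.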
I've been collecting a list[1] of what memory-management policies programmers actually want in their code; it is far more extensive than any particular language actually implements. Contributions are welcome!
I already had back reference on the list, but added some details. When the ownership is indirect (really common) it is difficult to automate.
One thing that always irritates me: Rust's decision to make all objects moveable really hurts it at times.
[1] https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f...
One challenge with rust is that (for better or worse) most gamedev talent is C++. If you ever open source it I’d be interested in contributing, though I’m not sure how effective the contributions would be.
Good luck!
I'm not that interested in self-promotion here as I am in getting more activity on Rust graphics development. I think the Rust core graphics ecosystem needs about five good graphics people for a year to get unstuck. Rust is a good language for this sort of thing, but you've got to have reliable heavy machinery down in the graphics engine room.
Until that exists, nobody can bet a project with a schedule and a budget on Rust. The only successful commercial high-detail game title I know of that uses Rust is a sailing race simulator. They simply linked directly to "good old DX11" (Microsoft Direct-X 11) and wrote the game logic in Rust. Bypassed Rust's own 3D ecosystems completely.
"I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm."
Can you expand on this? I come from the C# world and the Rust trait system feels expressive enough to implement the good parts of OOP.
I've always wondered why the "color" of a function can't be a property of its call site instead of its definition. That would completely solve this problem - you declare your functions once, colorlessly, and then can invoke them as async anywhere you want.
If you have a non-joke type system (which is to say, Haskell or Scala) you can. I do it all the time. But you need HKT and in Rust each baby step towards that is an RFC buried under a mountain of discussion.
It isn't like JavaScript where there is truly only one thread of execution at a time and blocking it will block everything.
This is a far superior workflow when you factor in outcomes. More up front time to get a "correct"/more-reliable output scales infinitely better than churning out crap that you need to wrap in 10,000 lines of tests to keep from breaking/validate (See: the dumpster-fire that is Rails)
I’m a strong-typing enthusiast, too, but still, I’m not fully convinced that’s true.
It seems you can’t iterate fast at all in Rust because the code wouldn’t compile, but can iterate fast in C++, except for the fact that the resulting code may be/often is unstable.
If you need to try out a lot of things before finding the right solution, the ability to iterate fast may be worth those crashes.
Maybe, using C++ for fast iterations, and only using various tools to hunt down issues the borrow checker would catch on the iteration you want to keep beats using Rust.
Or do Rust programmers iterate fast using unsafe where needed and then fix things once they’ve settled on a design?
This is a big problem. Fast iteration time is very valuable.
And who likes doing this to themselves anyway? Isn't it a very frustrating experience? How is this the most loved language?
The thing is, these dependencies do exist no matter what language you use if they stem from an underlying concept. In that case Rust just makes you explicitly write them, which is a good thing since in C++ all these dependencies would be more or less implicit and every time somebody edits the code he needs to think all these cases through and get a mental model (if he sees it at all!). In Rust you at least have the lifetime annotations which make it A: obvious there is some special dependency going on and B: show the explicit lifetimes etc.
So what I'm saying, you need to put in this work no matter which language you choose, writing it down is then not a big problem anymore. If you don't think about these rules your program will probably work most of the time but only most of the time, and that can be very bad for certain scenarios.
Personal preference and pain tolerance. Just like learning Emacs[1] - there's lots of things that programmers can prioritize, ignore, enjoy, or barely tolerate. Some people are alright with the fact that they're prototyping their code 10x more slowly than in another language because they enjoy performance optimization and seeing their code run fast, and there's nothing wrong with that. I, myself, have wasted a lot of time trying to get the types in some of my programs just right - but I enjoy it, so it's worth it, even though my productivity has decreased.
Plus, Rust seems to have pushed out the language design performance-productivity-safety efficiency frontier in the area of performance-focused languages. If you're a performance-oriented programmer used to buggy programs that take a long time to build, then a language that gives you the performance you're used to with far fewer bugs and faster development time is really cool, even if it's still very un-productive next to productivity-oriented languages (e.g. Python). If something similar happened with productivity languages, I'd get excited, too - actually, I think that's what's happening with Mojo currently (same productivity, greater performance) and I'm very interested.
Whereas after you prove the safety of a design once, it stays with you.
I have fought the ownership rules and lost (replaced references by integers to a common vector - ugly stuff, but I was time constrained). But I have seen people spend several weeks debugging a single problem, and that was really soul-crushing.
I don't personally mind debugging too much, but if your goal is to avoid bugs in your running software, then Rust has some serious advantages. We mainly use TypeScript to do things, which isn't really comparable to Rust. But we do use C when we need performance, and we looked into Rust, even did a few PoCs on real world issues, and we sort of ended up in a situation similar to GP. Rust is great though a bit "verbose" to write, but its ecosystem is too young to be "boring" enough for us, so we're sticking with C for the time being. But being able to avoid running into crashes by doing the work before you push your code is immensely valuable in fault-intolerant systems. Like, we do financial work with C, it cannot fail. So we're actually still doing a lot of the work up-front, and then we handle it by rigorously testing everything. Because it's mainly used for small performance enhancements, our C programs are small enough that this isn't an issue, but it would be a nightmare to do with 40.000 lines of C code.
I would much rather bang my head against a compiler for N hours, and then finally have something that compiles -- and thus am fairly confident works properly -- than have something that compiles and runs immediately, but then later I find I have to spend N hours (or, more likely, >N hours) debugging.
Your preferences may differ on this, and that's fine. But in the medium to long term, I find myself much more productive in a language like Rust than, say, Python.
Just wondering, how long did it take you to hit 40k lines? I’m a new Rust developer and it’s taken me ages to get this far.
I totally relate to your experience though. When I finally get my code to compile, it “just works” without crashes. I’ve never felt so confident in my code before.
3 years.
This isn't a new idea for a desirable state. Same experience with Modula-2 three decades ago. A page or more of compiler errors to clear, then suddenly adiabatic. A very satisfying experience.
If you want extreme low contention extreme high-utilization, you’re doing threading and event-driven simultaneously, there are no easy answers on heavily contended data structures because you can’t duplicate to exploit immutability if mere insane complexity is an alternative, and mistakes cost millions in real time.
There’s a reason why those places scout so heavily for the lock-free/wait-free galaxy brains the minute they finish their PhDs.
That's not a serious article. That's a humorous video.
And are you using an ECS based architecture? Do you feel you’d have a different opinion if you were?
Is there a ML to subscribe to, to learn when the viewer is more generally available for testing? Thanks again!
Do... you... wind up having to set TCP_NODELAY?
•͡˘㇁•͡˘
Why? I'd take modern C++ over Rust every day of the week.
Why? (Serious question)
(Plus some increase in content load over the network, which does exist ala runtime mod loading, streaming, etc)
Without judgment I must ask, what made you decide to target metaverse specifically? Is it more of a fun challenge, or do you see it having a bright/popular future?
The guy's got a point in that doing a bunch of Arc, RwLock, and general sharing of state is going to get messy. Especially once you are sprinkling 'static all over the place, it infects everything, much like colored functions. I did this whole thing once back when I was starting off where I would Arc<RwLock> stuff, and try to be smart about borrow lifetimes. Total mess.
But then rust also has channels. When you read about it, it talks about "messages", which to me means little objects. Like a few bytes little. This is the solution, pretty much everything I write now is just a few tasks that service some channels. They look at what's arrived and if there's something to output, they will put a message on the appropriate channel for another task to deal with. No sharing objects or anything. If there's a large object that more than one task needs, either you put it in a task that sends messages containing the relevant query result, or you let each task construct its own copy from the stream of messages.
And yet I see a heck of a lot of articles about how to Arc or what to do about lifetimes. They seem to be things that the language needs, especially if you are implementing the async runtime, but I don't understand why the average library user needs to focus so much on this.
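To make the "few tasks that service some channels" style concrete, here is a minimal sketch. It uses plain OS threads and `std::sync::mpsc` instead of async tasks so that it stays dependency-free; the `Query` enum, `spawn_owner` and `get` are names made up for this illustration, not anything from the comment above. One thread owns the large object and everyone else only sends it small messages.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Messages understood by the task that owns the (hypothetical) table.
enum Query {
    /// Look up one key; the answer goes back on the enclosed reply channel.
    Get { key: String, reply: Sender<Option<u32>> },
    /// Update one entry; no reply needed.
    Set { key: String, value: u32 },
}

/// Spawns the owner task and returns the sending half of its inbox.
/// The HashMap lives only inside this thread, so nothing is shared.
fn spawn_owner() -> Sender<Query> {
    let (tx, rx) = mpsc::channel::<Query>();
    thread::spawn(move || {
        let mut table: HashMap<String, u32> = HashMap::new();
        // The loop ends once every Sender has been dropped.
        for msg in rx {
            match msg {
                Query::Set { key, value } => {
                    table.insert(key, value);
                }
                Query::Get { key, reply } => {
                    // Ignore the error if the asker has already gone away.
                    let _ = reply.send(table.get(&key).copied());
                }
            }
        }
    });
    tx
}

/// Helper for callers: send a query and wait for the small reply message.
fn get(owner: &Sender<Query>, key: &str) -> Option<u32> {
    let (reply_tx, reply_rx) = mpsc::channel();
    owner
        .send(Query::Get { key: key.to_string(), reply: reply_tx })
        .expect("owner task has stopped");
    reply_rx.recv().expect("owner task dropped the reply channel")
}
```

No `Arc`, no lock and no lifetime annotation appears in the caller's code; the price is one extra message round trip per query.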
If you have a service that handles massive amounts of network calls at the core (think linkerd, nginx, etc.), or you want to have a massive amount of lightweight tasks in your game, or working on an embedded software where you want cooperative concurrency, async Rust is an amazing super-power.
Most system/application level things are not going to need async IO. Your REST app is going to be perfectly fine with a threadpool. Even when you do need async, you probably want to use it in a relatively small part of your software (network), while doing most of the things in threads, using channels to pass work around between async/blocking IO parts (aka hybrid model).
Rust community just mindlessly over-did using async literally everywhere, to the point where the blocking IO Rust (the actually better UX one) became a second class citizen in the ecosystem.
This is especially visible with web frameworks, where there are N well designed async web frameworks (Axum, Warp, etc.) and if you want a blocking one you get:
tiny_http - absolute bare bones but very well done
rouille - more wholesome, on top of tiny_http, but APIs feel very meh compared to e.g. Axum
astra - very interesting but immature, and rather barebones
I enjoy Rust, and I love how the compiler helps me solve problems. However, the ecosystem is "async or gtfo", or "just write it yourself if you dont want async lmao", and that's not good enough.
Right now even building a library that supports multiple async runtimes is a PITA, I have done it a couple times. So you end up supporting just tokio, and maybe async-std.
https://docs.rs/futures/latest/futures/executor/fn.block_on....
imagine you have an:
    async fn do_things() -> Something { /* ... */ }
you can:
    use futures::executor::block_on;

    fn my_normal_code() {
        let something = block_on(do_things());
    }
but this does get messy if the async code you're running isn't runtime-agnostic :(
This is one of the goals of the async working group. Hopefully, when ready, that'll make it possible to swap out async runtimes underneath arbitrary code without issues.
If you’re learning the language, I would suggest starting out with some more vanilla sync code, loops and if statements, get used to the borrowing. Async is clearly still under heavy development, and not just from an implementation level, but also from the level of our philosophical paradigm about what async means and how it ought to work for the user. It’s entirely possible for humanity to have the wrong approach to this issue and maybe someone in this discussion will be able to answer it more effectively.
The compiler really depends on traits, and the ability for traits to handle async is not stable. Many highly intelligent people are hard at work thinking about how to make async rust more correct, readable, and accessible. For example, look here: https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-i...
I would argue, if the async functionality of traits is not stable in rust, then it is silly for us to attack rust for not having nice async code, because we’re effectively criticizing an early rough draft of what will eventually be a correct and performant and accessible book.
The lifetime of an Arc isn’t unknowable, it’s determined by where and how you hold it.
I think maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it (such as garbage collection) rather than learning how to work with the language. It’s a common trap for anyone trying a new programming language, but Rust seems to trip people up more than most.
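A small sketch of the point about `Arc` lifetimes, using only the standard library (the function name `arc_counts` is made up for this example): the allocation lives exactly as long as some handle to it does, and you can watch the count move as handles are created and dropped.

```rust
use std::sync::Arc;
use std::thread;

/// Returns the strong counts observed at three points in the Arc's life.
fn arc_counts() -> (usize, usize, usize) {
    let config = Arc::new(String::from("shared config"));
    let before = Arc::strong_count(&config); // only `config` holds it

    let for_worker = Arc::clone(&config);
    let during = Arc::strong_count(&config); // `config` and `for_worker`

    // The clone moves into the thread and is dropped when the closure returns.
    let handle = thread::spawn(move || for_worker.len());
    handle.join().expect("worker panicked");

    let after = Arc::strong_count(&config); // the worker's handle is gone
    (before, during, after)
}
```

Nothing here is decided by a collector at some later time: the count drops at the point where the worker's closure finishes, and the string is freed when `config` itself goes out of scope.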
I'm currently plumbing through some logic to call a sync method on a struct that implements Future and it's... an interesting challenge.
While we can make zero-cost async abstractions somewhat easy for users, the library developers are the ones who suffer the pain.
Async/await was a terrible idea for fixing JavaScript's lack of proper blocked threading that is currently being bolted onto every language. It splits every language and every library-ecosystem in half and will cause pains for many years to come.
Everyone who worked with multi-threading outside of JavaScript knows that using actors/communicating sequential processes is the best way to do multi-threading.
I recently found an explanation for that in Joe Armstrong's thesis. He argues that the only way to understand multi-threaded programs is writing strictly sequential code for every thread and not muddling all the code for all the threads in one place:
"The structure of the program should exactly follow the structure of the problem. Each real world concurrent activity should be mapped onto exactly one concurrent process in our programming language. If there is a 1:1 mapping of the problem onto the program we say that the program is isomorphic to the problem.
It is extremely important that the mapping is exactly 1:1. The reason for this is that it minimizes the conceptual gap between the problem and the solution. If this mapping is not 1:1 the program will quickly degenerate, and become difficult to understand. This degeneration is often observed when non-CO languages ["non concurrency-oriented", looking at you JavaScript!] are used to solve concurrent problems. Often the only way to get the program to work is to force several independent activities to be controlled by the same language thread or process. This leads to a inevitable loss of clarity, and makes the programs subject to complex and irreproducible interference errors." [0]
[0] https://erlang.org/download/armstrong_thesis_2003.pdf
There is also a good rant against async/await by Ron Pressler, who implemented Project Loom in Java: https://www.youtube.com/watch?v=oNnITaBseYQ
The async patterns in Rust, especially with regards to data safety assurances for the compiler, are emblematic of this philosophy. Though there are complexities, the value proposition is a safer concurrency model that requires developers to think deeply about their data and execution flow. I do concur that Rust might not be the go-to for every massively concurrent userspace application, but for systems where robustness and safety are paramount, the trade-offs are justifiable. It's also worth noting that as the ecosystem evolves, we'll likely see more abstractions and libraries that ease these pain points.
Still, diving into the intricacies as this article does, gives developers a better foundational understanding, which in itself is invaluable.
This implies that you can't statically guarantee that a future is cleaned up properly, which means that if you spawn some async work, something may std::mem::forget a future, and then the borrow checker won't know that the references that were transitively handed out by the future are still live.
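The underlying language rule can be shown without any async code at all. In this sketch (the `Guard` type and `drop_vs_forget` are invented for the illustration) `std::mem::forget` is called from entirely safe code and the destructor simply never runs. A future is an ordinary value, so the same thing can happen to it, which is why a scoped-spawn API cannot rely on `Drop` for soundness.

```rust
use std::cell::Cell;
use std::mem;

/// A guard that records in `flag` when its destructor runs.
struct Guard<'a> {
    flag: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.flag.set(true);
    }
}

/// Returns (destructor ran after a normal drop, destructor ran after forget).
fn drop_vs_forget() -> (bool, bool) {
    let dropped = Cell::new(false);
    drop(Guard { flag: &dropped });

    let forgotten = Cell::new(false);
    // Entirely safe code: the value is leaked and `Drop::drop` never runs.
    mem::forget(Guard { flag: &forgotten });

    (dropped.get(), forgotten.get())
}
```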
Rather than sprinkle Arc everywhere, I just use an unsafe crate like this:
https://docs.rs/async-scoped/latest/async_scoped/
This catches 99% of the bugs I would have written in C++, so it's a reasonable compromise. There's been some work to try to implement non-'static futures in a safe way. I'm hoping it succeeds.
The other big problem with Rust (but this is on the roadmap to be fixed this year) is that async traits currently require Box'ed futures, which adds a malloc/free to function call boundaries(!!!)
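For readers who have not seen it, this is roughly what the boxed form looks like when written by hand. It is a sketch under assumptions: the `Store` trait, `MemStore` and `load_blocking` are hypothetical names, and the tiny `block_on` exists only so the example runs without an external runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// The boxed return type an async-style trait method is written with here.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Hypothetical trait, named only for this sketch.
trait Store {
    fn load<'a>(&'a self, key: &'a str) -> BoxFuture<'a, Option<u32>>;
}

struct MemStore {
    entries: Vec<(String, u32)>,
}

impl Store for MemStore {
    fn load<'a>(&'a self, key: &'a str) -> BoxFuture<'a, Option<u32>> {
        // One heap allocation per call: the cost the comment above refers to.
        Box::pin(async move {
            self.entries
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| *v)
        })
    }
}

/// Minimal single-future executor so the sketch needs no runtime crate.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Calls the trait method through dynamic dispatch and waits for the result.
fn load_blocking(store: &dyn Store, key: &str) -> Option<u32> {
    block_on(store.load(key))
}
```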
As for the "just use a channel" advice: I've dealt with large codebases that are structured this way. It explodes your control flow all over the place. I think of channels as the modern equivalent of GOTO. (I do use them, but not often, and certainly not in cases where I just need to run a few things in parallel and then wait for completion.)
Concurrency's correct primitive is Hoare's Communicating Sequential Processes mapped onto green threads. Some languages that have it right are Java (since JDK 21 - Java Virtual Threads), Go, Kotlin.
What I'm missing at the end of the article is the author's point: I believe they're advocating for the use of raw threads and manual management of concurrency, and doing away with the async paraphernalia. But, at the same time, earlier in the article they give the example of networking-related tasks as something that isn't so easy to deal with using only raw threads.
So, taking into account that await&co. are basically syntactic sugar + an API standard (iirc, I haven't used Rust so much lately), I wonder about what the alternative is. In particular, it seems to me like the alternative you could have would be everyone rolling their own "concurrency API", where each crate (inconsistently) exposes some sort of `await()` function, and you have to manually roll your async runtime every time. This would obviously also not be ideal.
> Maybe Rust isn’t a good tool for massively concurrent, userspace software. We can save it for the 99% of our projects that don’t have to be.
Personally, I'm a bit more radical than the author. You won't be able to write software like the example correctly. It should just not be done, ever. Machines can still optimize some sanely organized software into the same thing, maybe, if it happens to be a tractable problem (I'm not sure anybody knows). But people shouldn't touch that thing.
What that means is that when I'm writing async code, I have to audit every library I import to make sure that library is guaranteed to yield after a few microseconds of execution, otherwise my own core loops starve. Importing unknown code when using async rust is not safe for any application that needs to know its own threads won't starve.
A safe async language must guarantee that threads will make progress. Rust should change the scheduler so that it can pre-empt any code after that code has hogged a thread for too long.
Rust doesn't have a scheduler, and having one would be a no-go for any sufficiently low level code (e.g. in microcontrollers).
You might be looking for parallelism, not concurrency.
It was used because of the ineptitude of the languages where it became popular, and it's far easier to implement in GC-less languages than message-passing-based asynchrony, but it's just misery to write code in. I'd prefer to suffer Go's ineptitudes just to use the bastardised message passing called channels there rather than any of the Python/JS/Rust async.
It was created to be an improvement over the Javascript situation, and somehow every language that had a sane structure adopted it as if it was not only good, but the way to do things. This is insane.
I see this repeated everywhere in this thread. async/await originated in C# not JS.
Can you say why, in your opinion, this is how not to do it? What are the obvious issues with this approach?
JVM's futures are a joy to work with compared to JS's promises (or Kotlin's coroutines for that matter). While similar, I don't think you can conflate them.
Other times, however, Rust stops me from writing buggy code where I didn’t quite understand what I was doing. In some sense it can help you understand your software better (when the problem isn’t an implementation detail).
I get the author's frustration; I often have the same feelings. Sometimes you just want to tell Rust to get out of your way.
As an aside, I think there is room for a language similar to golang with sum types and modules that would be a joy.
Concurrency is a subtype of parallelism. All concurrency is parallelism, but with some aspects of parallelism left off the table.
I've worked in both worlds: I've built codes that manage thousands of connections through the ancient select() call on single processes (classic concurrency- IO multiplexing where most channels are not active simultaneously, and the amount of CPU work per channel is small) to synchronous parallelism on enormous supercomputers using MPI to eke out that last bit from Amdahl's law.
Over time I've come to the conclusion that the best default is a thread pool (possibly managed by the language runtime) that uses channels for communication and has optimizations for work stealing (to keep queues balanced) and for eliminating context switches. Although it does not reach the optimal throughput of the machine (because shared memory is faster than message passing), it's a straightforward paradigm to work with, and the developers of the concurrency/parallel frameworks are wise.
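A bare-bones version of that paradigm, as a sketch only: a fixed set of worker threads pulling jobs from one channel and pushing results onto another. It uses a single shared queue rather than per-worker queues with work stealing, so it shows the shape of the model and none of the optimizations; `square_all` is a name invented for the example.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs `jobs` on `workers` threads and returns the squared values, sorted.
fn square_all(jobs: Vec<u64>, workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<u64>();
    let (result_tx, result_rx) = mpsc::channel::<u64>();
    // std's Receiver is single-consumer, so the workers share it behind a Mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let result_tx = result_tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while taking the next job off the queue.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok(n) => result_tx.send(n * n).unwrap(),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(result_tx); // keep only the workers' clones alive

    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx); // closing the queue lets the workers exit

    // Ends once every worker has dropped its result sender.
    let mut results: Vec<u64> = result_rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    results.sort_unstable();
    results
}
```

Results arrive in whatever order the workers finish, which is why the function sorts before returning.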
But these existential types can only be specified in function return or parameter position, so if you want to name a type for e.g.:
    let x = async { };
You can't! Because you can only refer to it as `impl Future<Output = ()>` but that's not allowed in a variable binding!
    let x = || -> i32 { 1 }; // fine
    let x = || -> impl Future<Output = i32> { async { 1 } }; // error: `impl Trait` only allowed in function and inherent method return types, not in closure return types
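One common workaround for the naming problem above, sketched under assumptions: erase the anonymous type behind a boxed trait object, which does have a name you can write in a binding or a closure signature. The `BoxFuture` alias and `boxed_examples` are names chosen for this example, the small `block_on` is only there to make it runnable without a runtime crate, and the cost is one heap allocation per future.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A nameable stand-in for the anonymous future type.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

/// Minimal single-future executor so the sketch needs no runtime crate.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

fn boxed_examples() -> (i32, i32) {
    // The async block's own type cannot be written, but the boxed trait object can.
    let x: BoxFuture<i32> = Box::pin(async { 1 });

    // The same alias works as an explicit closure return type.
    let make = |n: i32| -> BoxFuture<i32> { Box::pin(async move { n + 1 }) };

    (block_on(x), block_on(make(1)))
}
```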
Unless I'm missing something, sometimes you do have to name the return type of an async closure if it's returning e.g. Result<T, Box<dyn Error>>, and use of the ? operator means that the return type can't be inferred without an explicit annotation.
I have some quibbles with this article:
"Rust comes at this problem with an “async/await” model"
No, it does not. It allows for that, and there's a big ... community ... around the async stuff, but in reality the language is entirely fine with operating using explicit concurrency constructs. And in fact for most applications I think standard synchronous functions combined with communicating channels is cleaner. I work in code bases that do both, and I find the explicit approach easier to reason about.
In the end, Async is something people ideally reach for only after they hit the wall with blocking on I/O. But in reality they're often reaching for it just because -- either because it's cool... or because some framework they are relying on mandates it.
But I think the pendulum will swing back the other way at some point. I don't think it's fair to tar the whole language with it.
Rust is all about lifetimes and the borrow checker. Async code (a la C#) will introduce overhead when reasoning about lifetimes, and it might not be as "fun" as it is in other languages that make use of GCs and bigger runtimes.
The CSP vs Async/Await discussion is valid, but as in the majority of cases, the drawbacks and benefits are not language-specific.
In CSP, the concurrent snippets behave just like linear/sequential code, as channels abstract away a lot of the ugly bits. Sequential code tends to be easier to reason about, and this might be very important for Rust considering its design.
A good tool for massively concurrent software will, as expected, depend on the aspects you're evaluating:
- Performance: the text does not show benchmarks evaluating Rust as a slow language.
- Code/Feature throughput: the overall conclusion from the text is that Async Rust is a complex tool that exposes programmers to many ways to shoot themselves in the foot.
Assuming the "Maybe Rust..." is only talking about Async Rust, the existence of big Async Rust projects is a good counter argument. We also have the whole rest of the Rust language to code massively concurrent, userspace software.
Massively concurrent, userspace software tends to be complex and big, to the point that design decisions generally matter far more than the choice of language.
Rust is a modern language with interesting features to prevent programmers from writing unsafe programs, and this is a good head start for many when making those kinds of programs, more so than whether you want to use Async code or not.
* While the author states that not many apps "need" high concurrency in userspace... I would invert that and say that we may be missing so much performance, new potential applications, etc because highly concurrent code is so hard to get right. One bit of evidence of this (to me at least) is how often in my career I have had to scale things up due to memory or other resource limitations and not CPU. And when it is CPU, so often looking into it more finds bugs with concurrency that are the root cause or at least exacerbate the issue
* While I completely agree that rust is not easy with async and have myself poked around at which magical type things I need to do each time I have touched async rust code, I don't really like the suggestion being to "go use a different language", first, because if you are picking up rust, you (IMHO) should have a very good reason to already have chosen it. Rust is not easy enough or ubiquitous enough that you should be choosing it "just for fun" and your reason for using Rust should be compelling enough that you (right now) are willing to put in the effort to learn async when you need it
* What the author mentions in the body of the article, but I think is more of what my suggestion would be: don't use async unless you need it! While I would love to see Rust (and think it should) evolve to the point where async is "easy", maybe we instead just need to get more pragmatic in what is taught and written about. I think when people start Rust they want to use all the fanciness, which includes async, and while some of that is just programmers, I think it is also how tutorials, docs, and general communication about a programming language happens where we show the breadth of capability, rather than the more realistic learning path, which leads people to feel like if they don't use async, they aren't doing it right
Finally, I do really hope Rust keeps working on the promise of these zero cost abstractions that can really simplify things... but if that doesn't work, I am at least hopeful of what people can build on top of the rust featureset/toolchain to help make things like async more realistic to be the default without the need for a complex VM/runtime.
I suspect that to take advantage of 1024-thread systems the only sane programming model will be structured concurrency with virtual threads instead of coroutines.
It’s the same progression as we saw in the industry going from unstructured imperative assembly programming to structured programming with modular features.
Both traditional mutexes and to a degree async programming are unstructured and global. They infect the whole codebase and can’t be reasoned about in isolation. This just doesn’t scale.
To your point, the C# guys seem to be interested in experimenting with green threads: https://twitter.com/davidfowl/status/1532880744732758018
It's an amazing combination.
Async functions don't have to always own their arguments. Just the outermost future that is getting spawned on another thread has to. The rest of the async program can borrow arguments as usual. You don't need to spawn() every task — there are other primitives for running multiple futures, with borrowed data, on the same thread.
In fact, this ability for a future to borrow from itself is the reason why Rust has native await instead of using callbacks. Futures can be "self-referential" in Rust, and nothing else is allowed to.
It is data-race free however.
This is the metaverse data overload problem - many creators, little instancing. No art director. No Q/A department. No game polishing. It's quite solveable, though.
Those occasional flashes on screen are the avatar (just a block in this version) moving asynchronously from the camera. That's been fixed.
The trouble is, we actually have tens/hundreds of people, all working on their own. The blessing and curse of open source development
When moving between threads I do what you suggest here and use channels to send signals rather than having a lot of shared state. Sometimes there is a crucial global state something that’s easier to just directly access, but I just write struct that manages all the Arc/RwLock or whatever other exclusive access mechanism I need for the access patterns. From the callers point of view everything is just a simple function call. When writing the struct I need to be thoughtful of sharing semantics but it’s a very small struct and I write it once and move on.
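Something like the following is what such a wrapper struct can look like; a sketch only, with a hypothetical `Settings` type standing in for the "crucial global state". All of the `Arc`/`RwLock` handling stays inside the two methods, and callers just see plain function calls.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared settings store; cloning it clones only the Arc handle.
#[derive(Clone, Default)]
struct Settings {
    inner: Arc<RwLock<HashMap<String, String>>>,
}

impl Settings {
    fn set(&self, key: &str, value: &str) {
        // A poisoned lock means another thread panicked mid-write; fail loudly.
        let mut map = self.inner.write().expect("settings lock poisoned");
        map.insert(key.to_string(), value.to_string());
    }

    fn get(&self, key: &str) -> Option<String> {
        let map = self.inner.read().expect("settings lock poisoned");
        // Return an owned copy so no lock guard escapes to the caller.
        map.get(key).cloned()
    }
}

/// Writes from one thread and reads from another through the same API.
fn settings_demo() -> Option<String> {
    let settings = Settings::default();
    let writer = settings.clone();
    thread::spawn(move || writer.set("mode", "fast"))
        .join()
        .expect("writer panicked");
    settings.get("mode")
}
```

Returning owned values from `get` is a deliberate choice here: it costs a clone, but it means the sharing semantics never leak into the caller's types.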
I also don’t understand their concern about making things Send+Sync. In my experience almost everything is easily Send+Sync, and things that aren’t shouldn’t or couldn’t be.
I get that sometimes you just want to wear sweatpants and write code without thought of the details, but most languages that offer that out of the box don’t really offer efficient concurrency and parallelism. And frankly you rarely actually need those things even if the “but it’s cool” itch is driving you. Most of the time a nodejs-esque single threaded async program is entirely sufficient, and a lot of the time Async isn’t even necessary or particularly useful. But when you need all these things, you probably need to hike up your sweatpants and write some actual computer code - because microseconds matter, profiled throughput is crucial, and nothing in life that’s complex is easy and anyone selling you otherwise is lying.
This is a recurring pattern I've started to notice with Rust: most things that repeatedly feel clunky, or noisy, or arduous, can be wrapped in an abstraction that allows your business logic to come back into focus. I've started to think this mentality is essential to any significant Rust project.
Async the keyword doesn’t, but Tokio forces all of your async functions to be multi thread safe. And at the moment, Tokio is almost the only async runtime in use. 95% of async libraries only support Tokio. So you’re basically forced to write multi thread safe code even if you’d benefit more from a single thread event loop.
Rust async’s set up is horrid and I wish the community would pivot away to something else like Project Loom.
I write a fair amount of code in Elixir professionally and this isn't how I view it.
There are some specific Elixir/Erlang bits of ceremony you need to do to set up your supervision tree of GenServers, but then once that's done you get to write code that feels like single threaded "ignore the rest of the world" code. Some of the function calls you're making might be "send a message and wait for a response" from GenServers etc. but the framework takes care of that.
I wrote some driver code for an NXP tag chip. Driving the inventory process is a bit involved, you have to do a series of things, set up hardware, turn on radio, wait a bit, send data, service the SPI the whole time in parallel. With the right setup for the hardware interface I just wrote the whole thing as a sequence, it was the simplest possible code you could imagine for it. And this at the same time as running a web server, and servicing hardware interrupts that cause it to reload the state of some registers and show them to each connected web session.
I imagine Rust to be a language far more similar to Go, in both use cases and functionality, than JS.
The dream of Smalltalk and true OOP is still alive.
If you say Smalltalk is better OOP I might agree, but calling it "true" is not correct.
When you need the absolute best performance sharing state is sometimes better - but you need a deep understanding of how your CPUs share state. A mutex or atomic write operation is almost always needed (the exceptions are really weird), and those will kill performance so you better spend a lot of time minimizing where you have them.
I would also suggest looking into ringbuffers and LMAX Disruptor pattern.
There is also Red Planet Lab's Rama, which takes the data flow idea and uses it to scale.
As a wise programmer once said, "Do not communicate by sharing memory; instead, share memory by communicating"
(But if you're only firing up a few tasks, why not just use threads? To get a nice wrapper around an I/O event loop?)
(This is assuming you are already switching to communicating using channels or similar abstraction.)
To get easier timers, to make cancellation at all possible (how to cancel a sync I/O operation?), and to write composable code.
There are patterns that become simpler in async code and much more complicated in sync code.
From https://news.ycombinator.com/item?id=37289579 :
> I haven't checked, but by the end of the day, I doubt eBPF is much slower than select() on a pipe()?
Channels have a per-platform implementation.
- "Patterns of Distributed Systems (2022)" (2023) https://news.ycombinator.com/item?id=36504073
Async code can scale essentially infinitely, because it can multiplex thousands of Futures onto a single thread. And you can have millions of Futures multiplexed onto a dozen threads.
This makes async ideal for situations where your program needs to handle a lot of simultaneous I/O operations... such as a web server:
http://aturon.github.io/blog/2016/08/11/futures/
Async wasn't invented for the fun of it, it was invented to solve practical real world problems.
Ultimately, it depends on your data model.
When you can guarantee sole ownership, why not put that exclusive pointer in the message? I’d think that this sort of compile-time lock would be an important advantage for the type system. (I think some VMs actually do this sort of thing dynamically, but I can’t quite remember where I read about it.)
On a multiprocessor, there’s of course a balance to be determined between the overhead of shuffling the object’s data back and forth between CPUs and the overhead of serializing and shuffling the queries and responses to the object’s owning thread. But I don’t think the latter approach always wins, does it? At least I can’t tell why it obviously should.
like “send request to channel A with message 123, make sure to get a response back from channel B exactly for that message”
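That request/response matching usually comes down to carrying a correlation id in every message. A minimal sketch with std channels follows; the `Request`/`Response` structs, the id scheme starting at 100, and the worker that deliberately answers in reverse order are all invented for the illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

struct Request {
    id: u64,
    payload: u32,
}

struct Response {
    id: u64,
    result: u32,
}

/// Sends every payload on "channel A" and pairs each reply from "channel B"
/// with its request by id. Returns (payload, result) pairs, sorted.
fn round_trip(payloads: &[u32]) -> Vec<(u32, u32)> {
    let (req_tx, req_rx) = mpsc::channel::<Request>(); // "channel A"
    let (resp_tx, resp_rx) = mpsc::channel::<Response>(); // "channel B"

    // The worker answers in reverse order, so arrival order cannot be relied on.
    let worker = thread::spawn(move || {
        let mut batch: Vec<Request> = req_rx.iter().collect();
        batch.reverse();
        for req in batch {
            let reply = Response { id: req.id, result: req.payload * 2 };
            resp_tx.send(reply).unwrap();
        }
    });

    // Remember which payload each id belongs to.
    let mut pending: HashMap<u64, u32> = HashMap::new();
    for (i, &payload) in payloads.iter().enumerate() {
        let id = 100 + i as u64;
        pending.insert(id, payload);
        req_tx.send(Request { id, payload }).unwrap();
    }
    drop(req_tx); // lets the worker's `iter()` finish

    let mut matched = Vec::new();
    for resp in resp_rx {
        // Pair each response with its request by id, not by position.
        if let Some(payload) = pending.remove(&resp.id) {
            matched.push((payload, resp.result));
        }
    }
    worker.join().unwrap();
    matched.sort_unstable();
    matched
}
```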
But green threads were not and are not the right solution for Rust, so it's kind of beside the point. Async Rust is difficult, but it will eventually be possible to use Async Rust inside the Linux kernel, which is something you can't do with the Go approach.
Rust: it turns out that not every concurrency needs to be zero-cost abstraction
But it also praises Go for its implementation, which is also based on a coroutine of a different kind. Stackful coroutines, which do not have any of these problems.
Rust considered using those (and, at first, that was the project's direction). Ultimately, they went to the stackless operation model because stackful coroutines require a runtime that preempts coroutines (to do essentially what the kernel does with threads). This was deemed too expensive.
Most people forget, however, that almost no one is using runtime-free async Rust. Most people use Tokio, which is a runtime that does essentially everything the runtime they were trying to avoid building would have done.
So we are left in a situation where most people using async Rust have the worst of both worlds.
That being said, you can use async Rust without an async runtime (or rather, an extremely rudimentary one with extremely low overhead). People in the embedded world do. But they are few, and even they often are unconvinced by async Rust for their own reasons.
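For a sense of how rudimentary such a runtime can be, here is a sketch of a single-future executor built only from the standard library. It is essentially the thread-parking pattern from the `std::task::Wake` documentation; `YieldOnce` and `add_after_yield` are made-up names used to exercise the pending-then-woken path. An embedded executor would differ in the details (no threads to park, for one), so treat this as an illustration of the moving parts rather than something to ship.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drives one future to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until some waker calls `unpark` on this thread.
            Poll::Pending => thread::park(),
        }
    }
}

/// A future that returns `Pending` once, then completes.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Ask to be polled again straight away.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// An ordinary async fn that suspends once before producing its result.
async fn add_after_yield(a: u32, b: u32) -> u32 {
    YieldOnce(false).await;
    a + b
}
```

Calling `block_on(add_after_yield(2, 3))` polls the future twice: the first poll suspends at the `.await`, the wake-up unparks the thread, and the second poll finishes.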
However, async Rust is not using stackless coroutines for this reason - it's using stackless coroutines because they achieve a better performance profile than stackful coroutines. You can read all about it on Aaron Turon's blog from 2016, when the futures library was first released:
http://aturon.github.io/blog/2016/08/11/futures/
http://aturon.github.io/blog/2016/09/07/futures-design/
It is not the case that people using async Rust are getting the "worst of both worlds." They are getting better performance by default and far greater control over their runtime than they would be using a stackful coroutine feature like Go provides. The trade off is that it's a lot more complicated and has a bunch of additional moving parts they have to learn about and understand. There's no free lunch.
I think that stackless coroutines are better than stackful, in particular for Rust. Everything was done correctly by the Rust team.
Again, this is all fair and good, as long as people understand the tradeoff and make good technical decisions around it. If they all jump on the async bandwagon blind to the obvious limitations, we get where the Rust ecosystem is now.
Stackful coroutines don't require a preemptive runtime. I certainly hope that we didn't end up with colored functions in Rust because of such a misconception.
I've used stackful coroutines many times in many codebases. It never required or used a runtime or preemption. I'm not sure why having a runtime that preempts them would even be useful, since it defeats the reason most people use stackful coroutines in the first place.
Yes. I just noticed that Tokio was pulled into my program as a dependency. Again. It's not being used, but I'm using a crate which has a function I'm not using which imports reqwest, which imports h2, which imports tokio.
I ask as someone who uses java and is about to rewrite a bunch of code to be able to chuck the entire async paradigm into the trash can and use a blocking model but on virtual threads where blocking is ok.
What does a good async API look like?
Also how do you prevent it spreading throughout a codebase?
I am trying to design a scalable architecture pattern for multithreaded and async servers. My design is that IO threads split asynchronous events into two halves, "submit" and "handle". For example, system events from liburing or epoll are routed to other components. Those IO thread event loops run and block on epoll.poll/io_uring_wait_cqe.
For example, if you create a "tcp-connection" you can subscribe to async events that are "ready-for-writing" and "ready-for-reading". Ready-for-writing would take data out of a buffer (that was written to with a regular mutex) for the IO thread to send on EPOLLOUT or via io_uring_prep_writev.
We can use the LMAX Disruptor pattern - multiproducer multiconsumer ringbuffers to communicate events between threads. Your application or thread pool threads have their own event loops and they service these ringbuffers.
I am working on a syntax to describe async event firing sequences. It looks like a bash pipeline, I call it statelines:
initialstate1 initialstate2 = state1 | {state1a state1b state1c} {state2a state2b state2d} | state3
It first waits for "initialstate1" and "initialstate2" in any order, then it waits for "state1", then it waits for the states "state1a state1b state1c" and "state2a state2b state2d" in any order.
Edit: Of course, since this is what "unstable" means, right?
In the same sense that the lifetime of an object in a GC'd system has a lower bound of, "as long as it's referenced", sure. But that's nearly the opposite of what the borrow checker tries to do by statically bounding objects, at compile time.
> maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it
The opposite actually! I spent about a decade doing systems programming in C, C++, and Rust before writing a bunch of Haskell at my current job. The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
Arc isn't an end-run around the borrow checker. If you need mutable references to the data inside of Arc, you still need to use something like a Mutex or Atomic types as appropriate.
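A minimal sketch of that point: `Arc` only gives shared ownership, so mutation still has to go through something like a `Mutex`. `count_in_parallel` is a hypothetical helper for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` threads that each add 1 to a shared counter `increments` times.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    // `Arc<usize>` alone would be read-only; the `Mutex` is what permits mutation.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```

Dropping the `Mutex` and trying to write through the `Arc` directly is rejected at compile time, which is the borrow checker still doing its job.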
> The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
I have the opposite experience, actually. I was an early adopter of Go and championed Garbage Collection for a long time. Then as our Go platforms scaled, we spent increasing amounts of our time playing games to appease the garbage collector, minimize allocations, and otherwise shape the code to be kind to the garbage collector.
The Go GC situation has improved continuously over the years, but it's still common to see libraries compete to reduce allocations and add complexity like pools specifically to minimize GC burden.
It was great when we were small, but as the GC became a bigger part of our performance narrative it started to feel like a burden to constantly be structuring things in a way to appease the garbage collector. With Rust it's nice to be able to handle things more explicitly and, importantly, without having to explain to newcomers to the codebase why we made a lot of decisions to appease the GC that appear unnecessarily complex at first glance.
Rust will do a lot of invisible memory relocations under the covers. Which can work great in single threaded contexts. However, once you start talking about threading those invisible memory moves are a hazard. The moment shared memory comes into play everything just gets a whole lot harder with the rust async story.
Contrast that with a language like java or go. It's true that the compiler won't catch you when 2 threads access the same shared memory, but at the same time the mental burden around "Where is this in memory, how do I make sure it deallocates correctly, etc" just evaporates. A whole host of complex types are erased and the language simply cleans up stuff when nothing references it.
To me, it seems like GCs simply make a language better for concurrency. They generally solve a complex problem.
These are not the same.
The problem with GC'd systems is that you don't know when the GC will run and eat up your cpu cycles. It is impossible to determine when the memory will actually be freed in such systems. With ARC, you know exactly when you will release your last reference and that's when the resource is freed up.
In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache. It's hard to overstate how big of a deal this is. There's a reason people like ARC and stay away from GC when performance actually begins to matter. :)
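In Rust terms, the determinism claim can be checked directly with `Arc`: the destructor runs at the exact point the last reference is dropped, not at some later collection. A small sketch, where `Resource` and `drop_order` are hypothetical names:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical resource that records when its destructor has run.
struct Resource {
    freed: Arc<AtomicBool>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.freed.store(true, Ordering::SeqCst);
    }
}

/// Returns whether the resource was freed (after dropping one of two
/// references, after dropping the last reference).
fn drop_order() -> (bool, bool) {
    let freed = Arc::new(AtomicBool::new(false));
    let first = Arc::new(Resource { freed: Arc::clone(&freed) });
    let second = Arc::clone(&first);

    drop(first); // one reference still alive, destructor must not run yet
    let after_first = freed.load(Ordering::SeqCst);

    drop(second); // last reference gone, destructor runs right here
    let after_last = freed.load(Ordering::SeqCst);

    (after_first, after_last)
}

fn main() {
    println!("{:?}", drop_order());
}
```

What this does not show is the cost side of the argument: every clone and drop is an atomic operation, which is what the tracing-GC proponents in this thread are pointing at.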
The point about wrangling with Weak suggests that they're trying to build complex ownership structures (which, to be fair, would be easier to deal with in a single thread) which isn't really something easy to express in Rust in general. I use weak smart pointers exceedingly rarely. Outside of the first section (which isn't talking about async Rust specifically, it's just speaking about concurrency generally) channels aren't even mentioned. They're the main thing I use for communication between different parts of my program when writing async code and when interfacing between async and non-async code, plus the other signalling abstractions like Notify, semaphores, etc. Mutexes are slow and bottlenecky and shared state quickly gets complicated to manage, this has been known for ages. I think the problem might be more the `BIG_GLOBAL_STATIC_REF_OR_SIMILAR_HORROR` in the first place.
The comment about nothing stopping you from calling blocking code in an async context is valid, but it's relatively manageable and you can use `tokio::task::spawn_blocking` or similar when you must do it.
I think it's a fair assumption to say that the author is aware of what Arcs are and how they work. I believe their point is more so that because of how async works in Rust, users have to reach for Arc over normal RAII far more often than in sync code. So at a certain point, if you have a program where 90% of objects are refcounted, you might as well use a tracing GC and not have the overhead of many small heap allocations/frees plus atomic ops.
Perhaps there are in fact ways around Arc-ing things for the author's use cases. But in my (limited) experience with Rust async I've definitely run into things like this, and plenty of example code out there seems to do the same thing [1].
For what it's worth, I've definitely wondered whether a real tracing GC (e.g. [2]) could meaningfully speed up many common async applications like HTTP servers. I'd assume that other async use cases like embedded state machines would likely have pretty different performance characteristics, though.
[0] https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...
[1] https://tokio.rs/tokio/tutorial/shared-state
[2] https://manishearth.github.io/blog/2015/09/01/designing-a-gc...
Fair, but when reading an article like this I have to refer to what's written, not what we think the author knew but didn't write.
…on a server where you can have a ton of RAM. It's superior on client machines because it's friendlier to swapped out memory, which is why Swift doesn't have a GC.
Obviously it's not random. It's statically unknowable.
In many cases this means it's much cheaper than objects in languages with implicit reference counting.
You cannot run scoped fibers, forcing you to "Arc shit up", Pins are unusable without unsafe, and the tiniest change in an async function could make the future !Send across the entire codebase.
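For contrast, synchronous Rust does have a scoped construct: `std::thread::scope` lets spawned threads borrow from the caller's stack, because the scope guarantees they are joined before it returns. The sketch below (with a hypothetical `parallel_sum` helper) shows the shape that spawned async tasks lack; `tokio::spawn`, for instance, requires a `'static` future, which is where the `Arc` wrapping comes from.

```rust
use std::thread;

/// Sums the two halves of a borrowed slice on two threads.
/// No `Arc`, no cloning: the threads borrow `data` directly.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().sum::<u64>());
        let r = s.spawn(|| right.iter().sum::<u64>());
        // Both threads are guaranteed to finish before `scope` returns.
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("{}", parallel_sum(&data));
}
```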
A good candidate for this is Graal. It can compile (JIT/AOT) both WASM and also LLVM bitcode directly so Rust programs can have full hardware/OS access without WASM limitations, and in theory it could allow apps to fully benefit from the work done on Loom and async. The pieces are all there. The main issue is you need to virtualize IO so that it goes back into the JVM, so the JVM controls all the code on the stack at all times. I think Graal can do this but only in the enterprise edition. Then you'd be able to run ~millions of Rust threads.
As fun as it is to hate on JavaScript, it's really interesting to go back and watch Ryan Dahl's talk introducing Node.js to the world (https://www.youtube.com/watch?v=EeYvFl7li9E). He's pretty ambivalent about it being JavaScript. His main goal was to find an abstraction around the epoll() I/O event loop that didn't make him want to tear his eyes out, and he tried a bunch of other stuff first.
I don't think it's a "good" solution in the abstract, but in the concrete of "I have a dynamically-typed scripting language with already over a decade of development and many more years of development that will happen before the event-based stuff is really standard", it's nearly the only choice. Python's gevent was the only other thing I saw that kinda solved the problem, and I really liked it, but I'm not sure it's a sustainable model in the end as it involves writing a package that aggressively reaches into other packages to do its magic; it is a constant game of catch-up.
I do think it's a grave error in the 2020s to adopt async as the only model for a language, though. There are better choices. And I actually exclude Rust here, because async is not mandatory and not the only model; I think in some sense the community is making the error of not realizing that your task will never have more than maybe a hundred threads in it and a 2023 computer will chomp on that without you noticing. Don't scale for millions of concurrent tasks when you're only looking at a couple dozen max, no matter what language or environment you're in. Very common problem for programmers this decade. It may well be the most impactful premature optimization in programming I see today.
JS callbacks are indeed better than C callbacks because you can hold onto some state. Although I guess the capture is implicit rather than explicit, so some people might say it's more confusing.
I'm pretty sure Joyent adopted and funded node.js because they were doing lots of async code in C, and they liked the callback style in JavaScript better. It does match the kind of problems that Go is now being used for, and this was pre-Go.
But anyway it is interesting how nobody really talks about callbacks anymore. Seems like async/await has taken over in most languages, although I sorta agree with the parent that it could have been better if designed from scratch.
Agreed. JavaScript was actually my first language after TurboPascal in 1996.
I was also there listening to the first podcasts when node came out.
JavaScript is a very interesting language, especially with its prototype memory model. And the eventloop apart from the language is interesting as well. And it's no coincidence Apple went as far as baking optimizations for JavaScript primitive operations into the M1 microcode.
But I still think multithreading is best done by using blocking operations.
NIO can be implemented on top of blocking IO as far as I know but not the other way round.
Also, sidenote, I think JavaScript's only real failure is the lack of a canonical module/import system. That error led to countless re-implementations of buildsystems and tens of thousands of hours wasted debugging.
but I get it, you can always go back to the promises and callbacks if you want.
I actually think it was a great solution in JS/TS given it's a single threaded event loop. The lower level the language the worse of an abstraction it is though. So I think most of the complaints here about async Rust are valid.
An important distinction to make is that tokio Futures aren't 'static, you can instead only spawn (take advantage of the runtime's concurrency) 'static Futures.
> This implies that you can't statically guarantee that a future is cleaned up properly.
Futures need to be Pin'd to be poll()'d. Any `T: !Unpin` that's pinned must eventually call Drop on it [0]. A type is `!Unpin` if it transitively contains a `PhantomPinned`. Futures generated by the compiler's `async` feature are such, and you can stick this in your own manually defined Futures. This lets you assume `mem::forget` shenanigans are UB once poll()'d and is what allows for intrusive/self-referential Future libraries [1]. The future can still be leaked from being kept alive by an Arc/Rc, but as a library developer I don't think you can/would-care-to reasonably distinguish that from normal use.
[0]: https://doc.rust-lang.org/std/pin/#drop-guarantee
[1]: https://docs.rs/futures-intrusive/latest/futures_intrusive/
Would you prefer not to have internal mutability, not to have `Rc`, or have them but with infectious unsafe trait bounds, or something else?
If an API leaks memory, then I’d like it to be deemed unsafe. That way, leaking a future would be unsafe, so the borrow checker could infer (transitively) that freeing the future means that any references it had are now dead (as it can already infer when a synchronous function call returns).
Am I missing something subtle?
Edit: Rc with cycles would be a problem. I rarely intentionally use Rc though (certainly less often than I create a future).
Edit 2: maybe an auto trait could statically disallow Rc cycles somehow?
The fact that a function can perform asynchronous operations matters to me and I want it reflected in the type system. I want to design my system in such a way that the asynchronous parts are kept where they belong, and I want the type system's help in doing that. "May perform asynchronous operations" is a property a calling function inherits from its callee and it is correctly modelled as such. I don't want to call functions that I don't know this about.
Now you can make an argument that you don't want to design your code this way and that's great if you have another way to think about it all that leads to code that can be maintained and reasoned about equally well (or more so). But calling the classes of functions red and blue and pretending the distinction has no more meaning than that is not such an argument. It's empty nonsense.
"We" don't all agree on this.
async doesn't tell you whether the function performs asynchronous operations, despite the name. async is an implementation detail about how the function must be invoked.
As TFA correctly points out, there's nothing stopping you from calling a blocking function inside a future, and blocking the whole runtime thread.
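A small sketch of that failure mode, using only the standard library: the function below is marked `async`, never awaits anything, and blocks whichever thread polls it. `looks_async`, `NoopWaker` and `poll_once_timed` are hypothetical names; on a real runtime the blocked thread would be one of the executor's workers.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Waker that does nothing; the future below never needs to be woken.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Marked `async`, yet it performs a plain blocking sleep and never yields.
async fn looks_async() -> u32 {
    thread::sleep(Duration::from_millis(50));
    7
}

/// Polls the future exactly once and reports the result and how long that poll took.
fn poll_once_timed() -> (Poll<u32>, Duration) {
    let fut = pin!(looks_async());
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let start = Instant::now();
    let result = fut.poll(&mut cx);
    (result, start.elapsed())
}

fn main() {
    let (result, elapsed) = poll_once_timed();
    println!("{result:?} after {elapsed:?}");
}
```

The single `poll` call completes the future, but only after holding the polling thread for the whole sleep, and nothing in the signature or the type system flags it.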
Either way, all of these changes are really annoying to make. We want less of these annoyances, not more.
Futures aren't a fundamental CS mistake, they're a design decision. You may disagree with that decision, but the advantage Rust brings is that you don't need to worry about thread safety once your program actually compiles, at the cost of different code styles.
Neither asynchronous processing design is fundamentally wrong, they both have their strengths and weaknesses.
Why would that ever be an issue? Instances of those classes shouldn't be shared between virtual threads just the same as when using regular threads.
true, but DateTimeFormatter has been available since Java 8, released almost 10 years ago.
VirtualThreads will be available in Java tomorrow
Also: https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDa...
There is also nothing fundamentally bad with cooperative scheduling in scope of a single process.
The vast majority of them were already wrong. They only got more wrong.
You may just be used to knowing what code is "synchronous" and what isn't because it's been shoved into your face and you've adapted your thought process to it. In practice, "everything important is doing something 'asynchronously'" turns out to be the vast majority of what you need, and the vast majority of the mental energy you dedicate to splitting the world in two is a waste. For the little bit that remains, by all means use something specialized, but it's just not something that everyone, everywhere, needs to be doing all the time, any more than everyone everywhere should be manually allocating registers, or any more than programs need to have line numbers because otherwise how can they work? (One of my favorites because I remember having that conception myself.)
Can you elaborate?
1. inability to read an async result from a sync function, which is a legitimately major architectural limitation.
2. the author's opinion on what function syntax should look like (fully implicit, hiding how the functions are run).
And from this there is the endless confusion and drama.
Problem 1 is mostly limited to JS. Languages that have threads can "change colour" of their functions at will, so they don't suffer from the dramatic problem described in the article.
But people see that languages don't fit opinion 2, of having magic implicit syntax, and treat that as being as big a deal as the dead-end problem 1. But having two syntaxes is somewhere between a minor inconvenience and an actual feature. In systems programming it's very important which type of locks you use, so you really need to know what runs async.
I’m hesitant towards not distinguishing different things anymore and let the underlying system “figure it out”. I’m sure this could work as long as you’re on the happy path, but that’s not the only path there is.
This is like saying C++ allows for templates, and there's a big community around it. Sure, but it's the entire community.
I don't think it's "the entire community" at all. Dealing with futures across library calls is a pain and almost every library that can avoid it, will avoid it.
I try to avoid async code because of its annoying pain points and I rarely see any circumstances where spawning a new thread doesn't work. Sure, there's more overhead, and you need some kind of limiting factor to prevent spawning a billion of them, but async isn't really required in most circumstances.
It's like saying Go allows for generics. Very few people and libraries bother with them. Working with them is kind of a pain. They're there if you want to use them, but you generally don't.
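As an illustration of the "just spawn threads, with some limiting factor" approach mentioned above, here is a rough sketch of a fixed-size worker pool built from std channels. `run_bounded` is a hypothetical helper and squaring a number stands in for real work.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs the jobs on at most `workers` OS threads and returns results in job order.
fn run_bounded(jobs: Vec<u64>, workers: usize) -> Vec<u64> {
    let total = jobs.len();
    let (job_tx, job_rx) = mpsc::channel::<(usize, u64)>();
    let (res_tx, res_rx) = mpsc::channel::<(usize, u64)>();
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only while taking a job, not while working on it.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((index, n)) => res_tx.send((index, n * n)).unwrap(),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(res_tx); // only the workers hold result senders now

    for (index, n) in jobs.into_iter().enumerate() {
        job_tx.send((index, n)).unwrap();
    }
    drop(job_tx); // closing the queue lets idle workers exit

    let mut results = vec![0; total];
    for (index, value) in res_rx {
        results[index] = value;
    }
    for handle in handles {
        handle.join().unwrap();
    }
    results
}

fn main() {
    println!("{:?}", run_bounded(vec![1, 2, 3, 4, 5], 2));
}
```

The worker count is the limiting factor: no matter how many jobs arrive, only `workers` threads ever exist.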
Believe it or not, there's other types of things being built in Rust. Systems work, which I think Rust is more appropriate for.
Maybe async is the most popular concurrency construct there (I have no idea). But the entire population here is small.
FastComments pubsub system in Java takes less than 500mb of heap for like 100k subscribers.
But yes, you have to worry about object field count.
Thanks for the perspective.
    let x = || async {
        let file = std::fs::read_to_string("foo.txt")?;
        Ok::<_, Box<dyn std::error::Error>>(file)
    };
Your own work would be some CPU-intensive operations you can logically divide and conquer.
Others' work would be waiting for file I/O from the OS, waiting for a DB result set following a query, waiting for a gRPC response, etc.
Conceptually quite distinct, and there are demonstrated advantages and drawbacks to each. Right tool for right job and all that.
article says you can panic if you use the pattern you show. specifically, if you call `my_normal_code()` from an async context.
is the author just talking about a quirk in tokio? or is this sort of wrapping intrinsically dangerous somehow?
Preempting tasks is a single-core simulation of parallelism. I suspect there is confusion about what parallelism and concurrency are here: the terms are often used interchangeably (especially saying "concurrency" instead of "parallelism"), but they are definitely not interchangeable - or even, arguably, related at all. Concurrency, by definition, is concerned with continuations. If you remove continuations (async/await and/or futures/promises - depending on the language choices) then you aren't talking about concurrency any more.
Either way, you can use parallelism in Rust today - just use blocking APIs, locking, and threads. I don't get what the big deal is. You can even use concurrency and parallelism together, just use await/async across multiple threads.
I agree with the premise of the article, but the reasoning in this comment chain is something along the lines of "cats are horrible, because 5." The criticism is foreign to the entire subject matter.
Hoare's later paper introduced buffered channels to CSP.
So one can use it as synchronous passing, or queued passing.
And also with fibers/virtual threads (project loom) you can actually have a million threads using blocking hand-off on one machine. So the performance argument is kind of gone.
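The synchronous versus queued hand-off distinction above maps fairly directly onto Rust's `std::sync::mpsc::sync_channel`, where capacity 0 gives a rendezvous and capacity N gives a bounded queue. A short sketch with hypothetical helper names:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Counts how many messages `try_send` accepts while nobody is receiving.
fn accepted_without_receiver(capacity: usize) -> usize {
    // `_rx` is kept alive so the channel is full rather than disconnected.
    let (tx, _rx) = sync_channel::<u32>(capacity);
    let mut accepted = 0;
    while tx.try_send(0).is_ok() {
        accepted += 1;
    }
    accepted
}

/// Rendezvous hand-off: `send` on a zero-capacity channel completes only
/// once the receiver has taken the value.
fn rendezvous() -> u32 {
    let (tx, rx) = sync_channel::<u32>(0);
    let producer = thread::spawn(move || tx.send(42).unwrap());
    let value = rx.recv().unwrap();
    producer.join().unwrap();
    value
}

fn main() {
    println!("capacity 0 accepts {}", accepted_without_receiver(0));
    println!("capacity 3 accepts {}", accepted_without_receiver(3));
    println!("rendezvous delivered {}", rendezvous());
}
```

A zero-capacity channel accepts nothing without a waiting receiver, while a capacity of 3 queues three messages before reporting that it is full.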
Agreed. I don't hate on JS, in fact I think it's the best tool for the several very common use cases it targets, and I'll even defend the way objects work in it (i.e. lets me do what I want with minimal fuss). The import/require drama was annoying, though.
I miss his Twitch streams! https://www.twitch.tv/kunosstefano
I am neither a Rust guy nor a graphics guy, but I have some interest in what is missing in the ecosystem.
Yes. [1]
[1] https://www.reddit.com/r/rust_gamedev/comments/13qt6rq/were_...
The thing is, for many people, including me, Rust is actually a more productive language than Python or other dynamic languages. Actually writing Python was an endless source of pain for me - this was the only language where my code did not initially work as expected more times than it did. Where in Rust it works fine from the first go in 99% of cases after it compiles, which is a huge productivity boost. And quite surprisingly, even writing the code in Rust was faster for me, due to more reliable autocomplete / inline docs features of my IDE.
To some, it means getting something minimal working and running as quickly as possible, accepting that there will be bugs, and that a comprehensive test suite will have to be written later to suss them all out.
To others (myself included), it means I don't mind so much if the first running version takes a bit longer, if that means the code is a bit more solid and probably has fewer bugs. And on top of that, I won't have to write anywhere near as many tests, because the type system and compiler will ensure that some kinds of bugs just can't happen (not all, but some!).
And I'm sure it means yet other things to other people!
I should have stated that I'm comparing Rust to typed Python (or TypeScript or typed Racket or whatever). Typed Python gives you a type system that's about as good as Rust's, and the same kinds of autocompletion and inline documentation that you would get with Rust, while also freeing you from the constraints of (1) being forced to type every variable in your program upfront, (2) being forced to manage memory, and (3) no interactive shell/REPL/Jupyter notebooks - Rust simply can't compete against that.
Your experience would likely have been very different if you were using typed Python.
No, it absolutely does not.
Also consider that Python has a type system regardless of whether or not you use typing, and that type system does not change because you've put type annotations on your functions. It does allow you to validate quite a few more things before runtime, of course.
I look at it a little differently: I'm fine with the fact that I'm prototyping my code 10x more slowly (usually the slowdown factor is nowhere near that bad, though; I'd say sub-2x is more common) than in another language because I enjoy the fact that when my code compiles successfully, I know there are a bunch of classes of bugs my code just cannot have, and this wouldn't be the case if I used the so-called "faster development" language.
I also hate writing tests; in a language like Rust, I can get away with writing far fewer tests than in a language like Python, but have similar confidence about the correctness of the code.
Disclaimer: I've sort of bounced off of Rust 3 or so times and while I've created both long-running services in it as well as smaller tools I've basically mostly had a hard time (not enjoying it at all, feeling like I'm paying a lot in terms of development friction for very little gain, etc.) and if you're the type to write off most posts with "You just don't get it" this would probably just be one more on the pile. I would argue that I do understand the value of Rust, but I take issue with the idea that the cost is worth it in the majority of cases, and I think that there are 80% solutions that work better in practice for most cases.
From personal experience: You could be prototyping your code faster and get performance in simpler ways than dealing with the borrow checker by being able to express allocation patterns and memory usage in better, clearer ways instead and avoid both of the stated problems.
Odin (& Zig and other simpler languages) with access to these types of facilities are just an install away and are considerably easier to learn anyway. In fact, I think you could probably just learn both of them on top of what you're doing in Rust since the time investment is negligible compared to it in the long run.
With regards to the upsides in terms of writing code in a performance-aware manner:
- It's easier to look at a piece of code and confidently say it's not doing any odd or potentially bad things with regards to performance in both Odin and Zig
- Both languages emphasize custom allocators which are a great boon to both application simplicity, flexibility and performance (set up limited memory space temporarily and make sure we can never use more, set up entire arenas that can be reclaimed or reused entirely, segment your resources up in different allocators that can't possibly interfere with each other and have their own memory space guaranteed, etc.)
- No one can use one-at-a-time constructs like RAII/`Drop` behind your back so you don't have to worry about stupid magic happening when things go out of scope that might completely ruin your cache, etc.
To borrow an argument from Rust proponents, you should be thinking about these things (allocation patterns) anyway and you're doing yourself a disservice by leaving them up to magic or just doing them wrong. If your language can't do what Odin and Zig do (pass them around, and in Odin you can inherit them from the calling scope which coupled with passing them around gives you incredible freedom) then you probably should try one where you can and where the ecosystem is based on that assumption.
My personal experience with first Zig and later Odin is that they've provided the absolute most productive experience I've ever had when it comes to the code that I had to write. I had to write more code because both ecosystems are tiny and I don't really like extra dependencies regardless. Being able to actually write your dependencies yourself but have it be such a productive experience is liberating in so many ways.
Odin is my personal winner in the race between Odin and Zig. It's a very close race but there are some key features in Odin that make it win out in the end:
- There is an implicit `context` parameter primarily used for passing around an allocator, a temp-allocator and a logger that can be implicitly used for calls if you don't specify one. This makes your code less chatty and lets you talk only about the important things in some cases. I still prefer to be explicit about allocators in most plumbing but I'll set `context.allocator` to some appropriate choice for smaller programs in `main` and let it go
- We can have proper tagged unions as errors and the language is built around it. This gives you code that looks and behaves a lot like you'll be used to with `Result` and `Option` in Rust, with the same benefits.
- Errors are just values but the last value in a multiple-value-return function is understood as the error position if needed so we avoid the `if error != nil { ... }` that would otherwise exist if the language wasn't made for this. We can instead use proper error values (that can be tagged unions) and `or_return`, i.e.:
    doing_things :: proc() -> ParsingError {
        parsed_data := parse_config_file(filename) or_return
        ...
    }
If we wanted to inspect the error this would instead be:
    // The zero value for a union is `nil` by default and the language understands this
    ParsingError :: union {
        UnparsableHeader,
        UnparsableBody,
    }
    UnparsableHeader :: struct {
        ...
    }
    UnparsableBody :: struct {
        ...
    }
    doing_things :: proc() {
        parsed_data, parsing_error := parse_config_file(filename)
        // `p in parsing_error` here unpacks the tag of the union
        // Notably there are no actual "constructors" like in Haskell
        // and so a type can be part of many different unions with no syntax changes
        // for checking for it.
        switch p in parsing_error {
        case UnparsableHeader:
            // In this scope we have an `UnparsableHeader`
            function_that_deals_with_unparsable_header(p)
        case UnparsableBody:
            function_that_deals_with_unparsable_body(p)
        }
        ...
    }
- ZVI or "zero-value initialization" means that all values are by default zero-initialized and have to have zero-values. The entire language and ecosystem is built around this idea and it works terrifically to allow you to actually talk only about the things that are important, once again.
P.S. If you want to make games or the like, Odin has the absolute best ecosystem of any C alternative or C++ alternative out there, no contest. Largely this is because it ships with tons of game related bindings and also has language features dedicated entirely to dealing with vectors, matrices, etc., and is a joy to use for those things. I'd still put it forward as a winner with regards to most other areas but it really is an unfair race when it comes to games.
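For readers coming from the Rust side of this thread, the `or_return` and union-switch pattern described above corresponds roughly to `?` and `match` on an error enum. The sketch below is only an analogy: `parse_config`, its "header, newline, body" format, and the helper names are invented for illustration and are not a translation of the Odin code.

```rust
/// Error enum playing the role of the Odin `ParsingError` union.
#[derive(Debug, PartialEq)]
enum ParsingError {
    UnparsableHeader(String),
    UnparsableBody(String),
}

/// Hypothetical parser: expects a header line, a newline, then a non-empty body.
fn parse_config(text: &str) -> Result<(String, String), ParsingError> {
    let (header, body) = text
        .split_once('\n')
        .ok_or_else(|| ParsingError::UnparsableHeader(text.to_string()))?;
    if body.trim().is_empty() {
        return Err(ParsingError::UnparsableBody(body.to_string()));
    }
    Ok((header.to_string(), body.to_string()))
}

/// `?` propagates the error to the caller, much like `or_return`.
fn body_length(text: &str) -> Result<usize, ParsingError> {
    let (_header, body) = parse_config(text)?;
    Ok(body.len())
}

/// Inspecting the error instead of propagating it, like the `switch p in ...` form.
fn describe(text: &str) -> String {
    match body_length(text) {
        Ok(len) => format!("body has {len} bytes"),
        Err(ParsingError::UnparsableHeader(h)) => format!("bad header: {h}"),
        Err(ParsingError::UnparsableBody(_)) => "bad body".to_string(),
    }
}

fn main() {
    println!("{}", describe("name\nvalue"));
    println!("{}", describe("no newline"));
}
```

One difference worth noting: Rust variants are constructors tied to one enum, whereas the Odin comment points out that a type can belong to several unions without extra syntax.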
On the voodoo ridden land that is software development, we have plenty of clearly harmful fads that are much older than Rust and yet practiced everywhere.
Up to now, rust async has lasted for less time than the NoSQL craziness. I'm hard pressed to think of any large fad that lasted less than it.
Btw, the turnaround time is longer with a database, which often forms the foundation of a system. NoSQL bandwagoning was so destructive in part because of how long it looked like a good idea each time. Same with ORMs.
https://youtu.be/L7XIFC2SawY?si=qN7TNxZi-P05uXVa
It's basically a custom 3D multithreaded OSM renderer, and the assets are a custom binary format. Uses very little network bandwidth.
Hoping to have an update this year that shows the updated graphics. I wrote a UI framework to improve my productivity (live hot reloading of UI components written with HTML with one-way data binding). I had to do this because the game is gonna have so many UIs and I got tired of writing them in Java 8 style Java. Soon I can resume work on the game after sidewaysdata.com is doneish (also using the UI library to build the desktop/mobile timing application).
The "many UI" problem is large in Rust. Egui needs far too much Rust code per dialog box. Someone was working on a generator, but I haven't looked in on that project in a while.
Can you tell us which? Go, Haskell and the other usual suspects all have runtimes with automatic, transparent preemption.
If coroutines can be preempted then it introduces a requirement for concurrency control that otherwise doesn't need to exist and interferes with dynamic cache locality optimizations. These are some of the primary benefits of using stackful coroutines in this context.
Being able to interrupt a stackful coroutine has utility for dealing with an extremely slow or stuck thread but you want this to be zero-overhead unless the thread is actually stuck. In most system designs, the time required to traverse any pair of sequential yield points is well-bounded so things getting "stuck" is usually a bug.
Letting end-users inject arbitrary code into these paths at runtime does require the ability to interrupt the thread but even that is often handled explicitly by more nuanced means than random preemption. Sometimes "extremely slow" is correct and expected behavior, so you have to schedule around it.
Python also comes with 2 features that seem to be stackless coroutines with attached syntax ceremonies, but one of those 2 features is commonly used with a hefty runtime instead of being used for control flow. JavaScript comes with 2 features named similarly to those of Python, but only one of them seems to be "runtime-free" stackless coroutines.
Could you expand a bit? Why?
And since the cancellation logic runs on the cancellable thread, you can't really cancel a blocking operation. What you can do is to let it run to completion, check that it was canceled, and discard the value.
The cancellation thread is generally the one doing the `select` so it spawns the operation thread(s) and waits for (one of) their results (i.e. through a channel/event). The others which lose the race are sent the cancellation signal and optionally joined if they need to be (i.e. they use intrusive memory).
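A minimal sketch of that select-and-cancel pattern, using only std threads, a channel and an atomic flag. The `worker` and `race` names and the step counts are made up for illustration. Note that the loser only notices the flag between steps, which is the point made above: a single blocking step cannot be interrupted, only abandoned afterwards.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Hypothetical worker: does `steps` small blocking units of work and checks
/// the cancel flag between them. Returns true if it ran to completion.
fn worker(id: u32, steps: u32, cancel: Arc<AtomicBool>, tx: mpsc::Sender<u32>) -> bool {
    for _ in 0..steps {
        if cancel.load(Ordering::Relaxed) {
            return false; // observed cancellation at a check point
        }
        thread::sleep(Duration::from_millis(5)); // stands in for a blocking call
    }
    let _ = tx.send(id); // the receiver may already be gone; that is fine
    true
}

/// The "cancellation thread": spawns two workers, waits for the first result,
/// signals the rest to stop, then joins both.
/// Returns (winner id, fast worker finished, slow worker finished).
fn race() -> (u32, bool, bool) {
    let cancel = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();
    let fast = {
        let (c, t) = (Arc::clone(&cancel), tx.clone());
        thread::spawn(move || worker(1, 2, c, t))
    };
    let slow = {
        let (c, t) = (Arc::clone(&cancel), tx.clone());
        thread::spawn(move || worker(2, 400, c, t))
    };
    drop(tx);
    let winner = rx.recv().expect("at least one worker sends a result");
    cancel.store(true, Ordering::Relaxed); // tell the losers to stop
    let fast_finished = fast.join().unwrap();
    let slow_finished = slow.join().unwrap();
    (winner, fast_finished, slow_finished)
}

fn main() {
    println!("{:?}", race());
}
```

Joining the loser is optional in general; it is done here so the result can be inspected.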
The compiler already has knowledge that a function is being called as async - what prevents it from ensuring that a runtime is present when it does?
> blocking synchronously on an async task in an async runtime can result in deadlocks from task waiting on runtime IO polling but the waiting preventing the runtime from being polled
What prevents the runtime from preempting a task?
The runtime being a library instead of a language/compiler level feature. Custom runtimes are necessary for systems languages as they can have specialized constraints.
EDIT: Note that it's the presence of a supported runtime for the async operation (e.g. it relies on runtime-specific state like non-blocking IO, timers, priorities, etc.), not only the presence of any runtime.
> What prevents the runtime from preempting a task?
Memory efficient runtimes use stackless coroutines (think state machines) instead of stackful (think green threads / fibers). The latter comes with inefficiencies like trying to guess stack sizes and growing them on demand (either fixing pointers to them elsewhere or implementing a GC) so it's not always desirable.
To preempt the OS thread of a stackful coroutine (i.e. to catch synchronously blocking on something) you need to have a way to save its stack/registers in addition to its normal state machine context which is the worst of both worlds: double the state + the pointer stability issues from before.
This is why most stackful coroutine runtimes are cooperatively scheduled instead, requiring blocking opportunities to be annotated so the runtime can work around that to still make progress.
Ron Pressler (@pron) from Loom @ Java had an interesting talk on the Java Language Summit just recently, talking about Loom’s solution to the stack copying: https://youtu.be/6nRS6UiN7X0
> The runtime being a library instead of a language/compiler level feature. Custom runtimes are necessary for systems languages as they can have specialized constraints.
Compilers link against dynamic libraries all the time. What prevents the compiler from linking against a hypothetical libasync.so just like any other library? (alternatively, if you want to decouple your program from a particular async runtime, what prevents the language from defining a generic interface that async runtimes must implement, and then linking against that?)
- the C# implementation predates even Promises in JS, so it is not "the same implementation" and your implication that C# was inspired by JS as opposed to the other way around is false. More background: [0]
- Typescript works fine with the JS implementation so any differences aren't for type safety reasons, but largely because C# has a multithreaded event loop unlike JS
Also promises (or "futures" as they're called elsewhere) aren't unique to any language. They're used in lots of places that predate both C# and JS's use, for example the twisted framework in Python.
I also think it's much more common to see it in library / framework code and not in application code.
There is a time and a place for it.
I think I’m used to other languages providing a lot of these abstractions or having some framework that manages it all. The frameworks in Rust tend to be pretty low level (with a few notable exceptions) so perhaps that’s where it comes from.
Another thing, for me, was that I came from mostly writing TypeScript, which is the opposite: the base language is breezy without abstractions, and the type system equips you to strongly-type plain data and language features, so you'll have a great time if you stick to those
But yeah, it's been interesting to see how different the answers to these questions can be in different languages!
That makes abstractions far more useful and powerful, since you never need to do a cost-benefit analysis in your head, abstractions are just always a good idea in Rust.
For language designers I point out in my other comment that Anders authored both C# and TS. TS' influence on ES6 is documented publicly. Heck, TC39 has an open proposal to add type annotations to JS now!
As for users, any dev that touches frontend has to write JS, unless you're purely a mobile or desktop shop (even then, there's electron). So yes, I think tons of folks willingly write C# and JS/TS. I'm certainly one (though write more Python than both these days). Was I an early adopter of async in Python because of my familiarity with it in C#/JS? You bet I was. Maybe I'm "insane."
I don't think it's quite accurate to point to "invisible memory relocations" as the problem that pinning solves. In most cases, memory relocations in Rust are very explicit, by moving an owned value when it has no live references (if it has any references, the borrow checker will stop you), or calling mem::replace() or mem::swap(), or something along those lines.
Instead, the primary purpose of pinning is to mark these explicit relocations as unsafe for certain objects (that are referenced elsewhere by raw pointer), so that external users must promise not to relocate certain objects on pain of causing UB with your interface. In C/C++, or indeed in unsafe Rust, the same idea can be more trivially indicated by a comment such as /* Don't mess with this object until such-and-such other code is done using it! */. All pinning does is to enforce this rule at compile time for all safe code.
Basically the problem is that async blocks/fns/generators need to create a struct that holds all the local variables within them at any suspension/await/yield point. But local variables can contain references to other local variables, so there are parts of this struct that reference other parts of this struct. This creates two problems:
- once you create such self-references you can no longer move this struct. But moving a struct is safe, so you need some unsafe code that "promises" you this won't happen. `Pin` is a witness of such promise.
- in the memory model having an `&mut` reference to this struct means that it is the only way to access it. But this is no longer true for self referential structs, since there are other ways to access its contents, namely the fields corresponding to those local variables that reference other local variables. This is the problem that's still open.
Because when stackless coroutines run they don’t have access to the stack that existed when they were created. everything that used to be on the stack needs to get packaged up in a struct (this is what `async fn` does). However now everything that used to point to something else on the stack (which rust understands and is fine with) now points to something else within the “impl Future” struct. Hence you have self referential structs.
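To make the self-reference concrete, here is a hand-written stand-in for the kind of struct an `async fn` might compile to. `SelfRef` and its fields are hypothetical, and the real compiler-generated layout is not something you can name. This is a sketch of the idea rather than what rustc actually emits.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical stand-in for a generated future: `ptr` plays the role of a
/// local reference that points at another local (`buf`) in the same struct.
struct SelfRef {
    buf: [u8; 4],
    ptr: *const u8,      // points into `buf` once initialised
    _pin: PhantomPinned, // opts out of `Unpin`, so safe code cannot move it
}

impl SelfRef {
    fn new(buf: [u8; 4]) -> Pin<Box<SelfRef>> {
        let mut boxed = Box::pin(SelfRef {
            buf,
            ptr: std::ptr::null(),
            _pin: PhantomPinned,
        });
        let ptr = boxed.buf.as_ptr();
        // SAFETY: we only assign a field; the value is never moved out of the pin.
        unsafe { boxed.as_mut().get_unchecked_mut().ptr = ptr };
        boxed
    }

    /// Reads through the internal pointer.
    fn first(self: Pin<&Self>) -> u8 {
        // SAFETY: `ptr` points at `self.buf`, which cannot have moved because
        // `self` has been pinned since `new`.
        unsafe { *self.ptr }
    }

    /// True if the internal pointer still targets our own buffer.
    fn is_self_referential(self: Pin<&Self>) -> bool {
        self.ptr == self.buf.as_ptr()
    }
}

fn main() {
    let s = SelfRef::new([7, 1, 2, 3]);
    let moved = s; // moves only the Box pointer; the heap contents stay put
    println!("{} {}", moved.as_ref().first(), moved.as_ref().is_self_referential());
}
```

Without the `Pin`, a plain `SelfRef` value could be moved by safe code and `ptr` would dangle, which is the first problem listed above. The aliasing question around `&mut` is not addressed by this sketch.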
The new api lets you allocate "memory segments", which are byte arrays/C style structs. Such segments can be passed to native code easily or just used directly, deallocated with or without GC, bounds errors are blocked, use-after-free bugs are blocked, and segments can also be confined to a thread so races are also blocked (all at runtime though).
Unfortunately it only becomes available as a finalized non-preview API in Java 22, which is the release after the next one. In Java 21 it's available but behind a flag.
`Rc` and internal mutability together do allow creating cycles and thus leaking with only safe code. I suggest you read https://cglab.ca/~abeinges/blah/everyone-poops/ if you haven't already; it explains the historical reasons why `std::mem::forget` was changed to be safe.
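A small sketch of such a leak in entirely safe code. The `Node` type and the drop counter are invented for the example; the counter is only there so you can observe that the destructors never run.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical node with an optional strong link to another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>, // shared counter bumped by every destructor
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn make(drops: &Rc<Cell<u32>>) -> Rc<Node> {
    Rc::new(Node { next: RefCell::new(None), drops: Rc::clone(drops) })
}

/// Builds two nodes, optionally links them into a cycle, lets them go out of
/// scope, and reports how many destructors actually ran.
fn drops_after_scope(cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = make(&drops);
        let b = make(&drops);
        *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b
        if cycle {
            *b.next.borrow_mut() = Some(Rc::clone(&a)); // b -> a closes the cycle
        }
    }
    drops.get()
}

fn main() {
    println!("no cycle: {} drops", drops_after_scope(false));
    println!("cycle:    {} drops", drops_after_scope(true));
}
```

With the cycle, each node keeps the other's strong count above zero, so neither is ever freed. Using `Weak` for one direction is the usual way out.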
> Edit 2: maybe an auto trait could statically disallow Rc cycles somehow?
It could be done with an auto and implied trait that is implemented in the opposite case (the type can be safely leaked), but then all the ecosystem must be changed to use `?Leakable` and it would be a pain.
As a bonus, high-availability systems could require NotLeaky at the top of their event loop, precluding runtime memory leaks.
Edit: that wouldn’t work, since the future could be leaked by the caller… will read that reference.
This is used to keep track of task runtime quotas so they can yield as soon as possible afterward.
This is the same technique used in Go and many others for preemption. If you don't add this, futures that don't yield can run forever, stalling the system.
You are right that it is not strictly necessary, but in practice, it is so helpful as a guard against the yielding problem that it's ubiquitous.
> I certainly hope that we didn't end up with colored functions in Rust because of such a misconception.
Misconceptions are everywhere unfortunately!
https://tokio.rs/blog/2020-04-preemption#a-note-on-blocking
> Tokio does not, and will not attempt to detect blocking tasks and automatically compensate
You may be referring to this particular issue in Go https://github.com/golang/go/issues/10958 which I think was somewhat addressed a couple releases back.
This is honestly shocking to hear. I would think that if people had bugs in their programs they would want them to fail loudly so they can be fixed.
Folks would rather have every future time sliced so that other tasks get some CPU time in a ~fair way (after all, there is no concept of task priority in most runtimes).
But you're right: it isn't required, and you could sprinkle every loop of your code with yielding statements. But knowing when to yield is impossible for a future. If nothing else is running, it shouldn't yield. If many things are running but the problem space of the future is small, it probably shouldn't yield either, etc.
You simply do not have the necessary information in your future to make an informed decision. You need some global entity to keep track of everything and either yield for you or tell you when you should yield. Tokio does the former, Glommio does the latter.
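To show what a yield budget does to scheduling, here is a toy round-robin executor in plain std. Everything in it (`YieldNow`, `work`, `run_round_robin`, the fixed budget) is made up for the example. It is not how Tokio or Glommio implement their budgets; it only shows how interleaving changes with how often a task yields.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the toy executor below re-polls every task anyway.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns Pending exactly once, giving other tasks a turn.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical task: logs `units` steps and yields after every `budget` steps.
async fn work(name: char, units: u32, budget: u32, log: Rc<RefCell<String>>) {
    for i in 1..=units {
        log.borrow_mut().push(name);
        if i % budget == 0 {
            YieldNow(false).await;
        }
    }
}

/// Runs two tasks round-robin on one thread and returns the order of steps.
fn run_round_robin(budget: u32) -> String {
    let log = Rc::new(RefCell::new(String::new()));
    let mut queue: VecDeque<Pin<Box<dyn Future<Output = ()>>>> = VecDeque::new();
    queue.push_back(Box::pin(work('a', 4, budget, Rc::clone(&log))));
    queue.push_back(Box::pin(work('b', 4, budget, Rc::clone(&log))));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task); // not done yet: back of the line
        }
    }
    let out = log.borrow().clone();
    out
}

fn main() {
    println!("budget 2:   {}", run_round_robin(2));
    println!("budget 100: {}", run_round_robin(100));
}
```

With a budget larger than the task, task `a` never yields and `b` only starts once `a` is finished, which is the stalling problem described above.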
It gets even more complex when you add IO into the mix because you need to submit IO requests in a way that saturates the network/nvme drives/whatever. So if a future submits an IO request, it's probably advantageous to yield immediately afterward so that other futures may do so as well. That's how you maximize throughput. But as I said, that's a very hard problem to solve.
You cannot convert a function that calls async code into a sync function.
You can only convert an int to a float with significant caveats. It's not a general trivial conversion. More complicated types may not be convertible at all or behave in all sorts of exciting ways (including having arbitrary side effects).
The point is that none of that is different to async functions. Of course you have to know what to do with them for them to be useful, but there is no requirement for them to "infect" calling code.
It's great, but message passing it is not.
If you haven't seen this paper, I bet you'll find at least one or two new bugs that you didn't know about: https://songlh.github.io/paper/go-study.pdf
Rust gives you channels (both synchronous blocking channels and async channels), and they work great, there is nothing stopping you from using them.
I mostly agree. But I would wager that for a significant amount of people their first exposure to "async" is JS and not any number of other languages. And when you try to write async Rust the same way as you might write async JS, things just aren't that pretty.
Every executor (including tokio) provides a `spawn_local` function that spawns Futures on the current thread, so they don't need to be Send:
https://docs.rs/tokio/1.32.0/tokio/task/fn.spawn_local.html
I have used Rust async extensively, and it works great. I consider Rust's Future system to be superior to JS Promises.
There's also a written conversation you can find online where he disqualifies pretty much all of the mainstream languages from being OO.
A lot of people, like you, say that OO == ADTs. Or rather, whatever Simula, C++ and Java are doing. Some will say that inheritance is an integral part of it, others say it's all about interfaces.
But then there's people who say that Scheme and JavaScript are more object oriented than Java and C#. Or that when we're using channels or actors we're now _really_ doing OOP.
There's people who talk about patterns, SOLID, clean code and all sorts of things that you should be adhering to when structuring OO code.
Then there's people who say that OO is all about the mental model of the user and their ability to understand your program in terms of operational semantics. They should be able to understand it to a degree that they can manipulate and extend it themselves.
It's all very confusing.
This is pretty unlikely. See https://news.ycombinator.com/item?id=36879311.
That's publications though. Alan Kay says he used it in conversation in 1967: http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay...
There's probably also a distinction to be made between "object-oriented" and "object-oriented programming".
Right, including Smalltalk 76 and 80 onwards themselves. Remember Kay's statement "actually I made up the term object-oriented and I can tell you I did not have C++ in mind, so the important thing here is I have many of the same feelings about Smalltalk" (https://www.youtube.com/watch?v=oKg1hTOQXoY&t=636s); the reason he refers to Smalltalk this way in his 1997 talk was likely the fact that Smalltalk-80 has more in common with Simula 67 than his brain child Smalltalk-72. Ingalls explicitly refers to Simula 67 in his 2020 HOPL paper.
> and reject the things that make Smalltalk different
Which would mostly be its dynamic nature (Smalltalk-76 can be called the first dynamic OO language) and the use of runtime constructs instead of dedicated syntax for conditions and loops (as is the case, e.g., in Lisp). There are a lot of dynamic OO languages still in use today, e.g. Python. Also Smalltalk-80 descendants are still in use, e.g. Pharo.
I consider the definition used e.g. by IEEE as sufficiently strict, see e.g. https://ethw.org/Milestones:Object-Oriented_Programming,_196..., but - as you say - it's not the definition used by Kay.
> I have used Rust async extensively, and it works great. I consider Rust's Future system to be superior to JS Promises.
Sure, but it’s a major headache compared to Java VirtualThreads or goroutines
thread_local! exists, and you can just call spawn_local on each thread. You can even call spawn_local multiple times on the same thread if you want.
You can have some parts of your programs be multi-threaded, and then other parts of your program can be single-threaded, and the single-threaded and multi-threaded parts can communicate with an async channel...
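Leaving the executor out of it, here is a std-only sketch of the shape being described: each thread owns some `!Send` state through `thread_local!`, and the threads only talk to the rest of the program over a channel. `LOCAL`, `record` and `per_thread_counts` are hypothetical names, and no `spawn_local` is involved here. In a real program the thread-local state would be driven by a per-thread executor.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;

thread_local! {
    // Every thread gets its own independent instance. `Rc<RefCell<..>>` is
    // !Send, which is fine because it never leaves the thread that owns it.
    static LOCAL: Rc<RefCell<Vec<u32>>> = Rc::new(RefCell::new(Vec::new()));
}

/// Appends to the calling thread's private list and returns its new length.
fn record(v: u32) -> usize {
    LOCAL.with(|l| {
        l.borrow_mut().push(v);
        l.borrow().len()
    })
}

/// Spawns three threads; thread `n` records `n` values into its own state and
/// reports the final length over a channel. Returns the lengths ordered by `n`.
fn per_thread_counts() -> Vec<usize> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = (1..=3usize)
        .map(|n| {
            let tx = tx.clone();
            thread::spawn(move || {
                let mut len = 0;
                for v in 0..n as u32 {
                    len = record(v);
                }
                tx.send((n, len)).unwrap(); // only Send data crosses threads
            })
        })
        .collect();
    drop(tx);
    for h in handles {
        h.join().unwrap();
    }
    let mut results: Vec<(usize, usize)> = rx.iter().collect();
    results.sort();
    results.into_iter().map(|(_, len)| len).collect()
}

fn main() {
    println!("{:?}", per_thread_counts());
}
```

Each thread sees only its own list, so the lengths differ per thread even though they all go through the same `LOCAL` name.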
Rust gives you an exquisite amount of control over your programs, you are not "stuck" or "locked in", you have the flexibility to structure your code however you want, and do async however you want.
You just have to uphold the basic Rust guarantees (no data races, no memory corruption, no undefined behavior, etc.)
The abstractions in Rust are designed to always uphold those guarantees, so it's very easy to do.
No, you're not: you spawn a runtime on each thread and use spawn_local on each runtime. This is how actix-web works and it uses tokio under the hood.
I am genuinely asking because I have little formal background in CS, so "runtimes" and the actual low-level differences between, for instance, async and green threads mystify me. E.g., what makes them actually different from the "runtime" perspective?
While there are other runtimes that are always single-threaded, you can do it with tokio too. You can use a single-threaded tokio runtime and !Send tasks with LocalSet and spawn_local. There are a few rough edges, and the runtime internally uses atomics where a from-the-ground-up single threaded runtime wouldn't need them, but it works perfectly fine and I use single threaded tokio event loops in my programs because the tokio ecosystem is broader.
https://docs.rs/tokio/1.32.0/tokio/task/fn.spawn_local.html
There is a lot of misinformation in this thread, with people not knowing what they're talking about.
The reason for that is the compiler quality, the design tradeoffs and Go's GC implementation throughput are simply not there for it to ever be a good general purpose systems-programming-oriented language.
Go receives undeserved hype, for use cases C# and Java are much better at due to their superior GC implementations and codegen quality (with C# offering better lower level features like structs+generics and first-class C interop).
But yes, GC is very much not free and is an explicit tradeoff vs compile time + manual memory management.
Making strong statements without backing them up with hard facts is a sign of zealotry...
For example, you may be interested in this read: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...
Issues like these simply don't happen with GCs in modern JVM implementations or .NET (not saying they are perfect or don't have other shortcomings, but the sheer amount of developer hours invested in tuning and optimizing them far outstrips Go).
I don’t see how running into an OOM problem is necessarily a problem with the GC. That said, Java is a memory intensive language, it’s a trade off that Java is pretty up front about.
I don’t have a horse in this race but I would be quite surprised if Go’s GC implementation could even hold a candle to the ones found in C# and Java. They have spent literally decades of research and development, and god knows how much money (likely north of $1b), optimizing and refining their GC implementations. Go just simply lacks any of the sort of maturity and investment those languages have.
And in many use cases people are throwing Rust (and especially async Rust) on problems solved just fine with GC languages so the safety argument doesn’t apply there.
why do you believe this becomes the case with rust code?
I've been involved in Java, Python, PHP, Scala, C++, Rust, JS projects in my career. I think I'd notice a 5x speed difference in favor of Python if it existed. But I haven't.
For 1) It's common enough to have multiple runtimes in the same process, each setup possibly differently and running independently of each other. Often known as a "thread-per-core" architecture, this is the scheme used in apps focused on high IO perf like nginx, glommio, actix, etc.
For 2) runtime (libasync.so) implementations would have to cover a lot of aspects they may not need (async compute-focused runtimes like bevy don't need timers, priorities, or even IO) and expose a restrictive API (what's a good generic model for a runtime IO interface? something like io_uring, dpdk, or epoll? what about userspace networking as seen in seastar?). A pluggable runtime mainly works when the language has a smaller scope than "systems programming" like Ponylang or Golang.
As a side note; Rust tries to decouple the scheduling of Futures/tasks using its concept of Waker. This enables async implementations which only concern themselves with scheduling like synchronization primitives or basic sequencers/orchestration to be runtime-agnostic.
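A sketch of what that decoupling looks like with nothing but std. The future below (`OneShot`) knows nothing about the executor: it just stores whatever `Waker` it is handed and calls it when the value arrives. The executor (`block_on`) is a deliberately tiny park/unpark loop. All names here are invented for the example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal executor: poll, and park the thread until the waker fires.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return v,
            Poll::Pending => thread::park(), // spurious wakeups just re-poll
        }
    }
}

/// State shared between the future and the thread producing the value.
#[derive(Default)]
struct Shared {
    value: Option<u32>,
    waker: Option<Waker>,
}

/// Runtime-agnostic future: it only talks to the executor through the Waker.
struct OneShot(Arc<Mutex<Shared>>);

impl Future for OneShot {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut s = self.0.lock().unwrap();
        match s.value.take() {
            Some(v) => Poll::Ready(v),
            None => {
                s.waker = Some(cx.waker().clone()); // remember whom to notify
                Poll::Pending
            }
        }
    }
}

/// Starts a background thread that produces 42 after a short delay.
fn compute_elsewhere() -> OneShot {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let producer = Arc::clone(&shared);
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        let mut s = producer.lock().unwrap();
        s.value = Some(42);
        if let Some(w) = s.waker.take() {
            w.wake();
        }
    });
    OneShot(shared)
}

fn main() {
    println!("{}", block_on(compute_elsewhere()));
}
```

The same `OneShot` could in principle be awaited under any executor, since it never touches anything executor-specific. Futures that need I/O or timers are where the runtime coupling discussed above comes in.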
So then to tie this back to my earlier question - why does this make a difference between "async declared at function definition site" vs "async declared at function call site"?
Libraries have to be written against a specific async API (tokio vs async-std, to reference the linked Reddit thread) - that makes sense. But that doesn't change regardless of whether your code looks like `async fn foo() {...}` or `async foo();`. The compiler has ahead-of-time knowledge of both cases, as well...
[1] https://old.reddit.com/r/rust/comments/f10tcq/confusion_with...
It's more like "you notice when it happens". You don't know in advance when the last reference will be released (if you did, there would be no point in using reference counting).
> In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache.
It all depends on your access patterns. When ARC adjusts the reference counter, the object is invalidated in all other threads' caches. If this happens with high frequency, the cache misses absolutely demolish performance. GC simply does not have this problem.
> There's a reason people like ARC and stay away from GC when performance actually begins to matter.
If you're using a language without GC built in, you usually don't have a choice. When performance really begins to matter, people reach for things like hazard pointers.
A barista knows when a customer will pay for coffee (after they have placed their order). A barista does not know when that customer will walk in through the door.
> (if you did, there would be no point in using reference counting).
There’s a difference between being able to deduce when the last reference is dropped (for example, by profiling code) and not being able to tell anything about when something will happen.
A particular developer may not know when the last reference to an object is dropped, but they can find out. Nobody can guess when GC will come and take your cycles away.
> The cache misses absolutely demolish performance
With safe Rust, you shouldn’t be able to access memory that has been freed up. So cache misses on memory that has been released is not a problem in a language that prevents use-after-free bugs :)
> If you’re using a language without GC built in, you usually don’t have a choice.
I’m pretty sure the choice of using Rust was made precisely because GC isn’t a thing (in all places that love and use rust that is)
Sorry, no chance of deciphering that.
> There’s a difference between being able to deduce when the last reference is dropped (for example, by profiling code) and not being able to tell anything about when something will happen.
> A particular developer may not know when the last reference to an object is dropped, but they can find out.
The developer can figure out when the last reference to the object is dropped in that particular execution of the program, but not in the general sense, not anymore than they can in a GC'd language.
The only instance where they can point to a place in the code and with certainty say "the reference counted object that was created over there is always destroyed at this line" is in cases where reference counting was not needed in the first place.
> With safe Rust, you shouldn’t be able to access memory that has been freed up. So cache misses on memory that has been released is not a problem in a language that prevents use-after-free bugs :)
I'm not sure why you're talking about freed memory.
Say that thread A is looking at a reference-counted object. Thread B looks at the same object, and modifies the object's reference counter as part of doing this (to ensure that the object stays alive). By doing so, thread B has invalidated thread A's cache. Thread A has to spend time reloading its cache line the next time it accesses the object.
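The write to the shared counter is easy to make visible in Rust, even if the cache effect itself is not. In this sketch (function name made up), cloning an `Arc` bumps a counter that lives next to the data, while lending a plain borrow to scoped threads never touches it. It does not measure cache misses, it only shows where the shared write happens.

```rust
use std::sync::Arc;
use std::thread;

/// Returns (strong count while 4 clones exist, strong count while 4 scoped
/// threads merely borrow the same Arc).
fn counts() -> (usize, usize) {
    let data = Arc::new(vec![1u64, 2, 3]);

    // Each clone is an atomic read-modify-write on the shared counter,
    // i.e. a write to memory that every other holder also has cached.
    let clones: Vec<Arc<Vec<u64>>> = (0..4).map(|_| Arc::clone(&data)).collect();
    let with_clones = Arc::strong_count(&data);
    drop(clones);

    // Borrowing instead: the threads read the data, nobody writes the counter.
    let during_borrow = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..4 {
            handles.push(s.spawn(|| data.iter().sum::<u64>()));
        }
        let c = Arc::strong_count(&data);
        for h in handles {
            assert_eq!(h.join().unwrap(), 6);
        }
        c
    });

    (with_clones, during_borrow)
}

fn main() {
    println!("{:?}", counts());
}
```

Borrowing only works when the lifetime is statically known, which is exactly the case where reference counting was not needed in the first place.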
This is a performance issue that's inherent to reference counting.
> I’m pretty sure the choice of using Rust was made precisely because GC isn’t a thing (in all places that love and use rust that is)
Wanting to avoid "GC everywhere", yes. But Rust/C++ programs can have parts that would be better served by (tracing) garbage collection, but where they have to make do with reference counting, because garbage collection is not available.
But it also has a big disadvantage: it goes through the actual malloc for memory management, which is usually much less performant than a GC for various reasons.
Can you elaborate?
I've seen a couple of malloc implementations, and in all of them, free() is a cheap operation. It usually involves setting a bit somewhere and potentially merging with an adjacent free block if available/appropriate.
malloc() is the expensive call, but I don't see how a GC system can get around the same costs for similar reasons.
What am I missing?
- A moving (and ideally, generational) GC means that you can recompact the heap, making malloc() little more than a pointer bump.
- This also suggests subsequent allocations will have good locality, helping cache performance.
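For anyone who hasn't seen one, "pointer bump" allocation looks roughly like this. `Bump` is a hypothetical toy arena that hands out offsets into a fixed buffer. A real GC nursery does the same arithmetic but also has to trace and copy survivors, which is not modelled here.

```rust
/// Hypothetical bump arena over a fixed-size buffer.
struct Bump {
    buf: Vec<u8>,
    next: usize, // offset of the first free byte
}

impl Bump {
    fn with_capacity(cap: usize) -> Self {
        Bump { buf: vec![0; cap], next: 0 }
    }

    /// Reserves `size` bytes aligned to `align` (assumed to be a power of two)
    /// and returns the offset, or None if the arena is full.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let start = (self.next + align - 1) & !(align - 1); // round up
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        self.next = end; // the whole allocation is this one bump
        Some(start)
    }

    /// Frees everything at once, standing in for "the nursery was evacuated".
    fn reset(&mut self) {
        self.next = 0;
    }
}

fn main() {
    let mut arena = Bump::with_capacity(16);
    println!("{:?}", arena.alloc(3, 1));
    println!("{:?}", arena.alloc(4, 4));
    println!("{:?}", arena.alloc(8, 8));
    println!("{:?}", arena.alloc(1, 1));
    arena.reset();
    println!("{:?}", arena.alloc(16, 1));
}
```

Consecutive allocations land next to each other, which is where the locality argument comes from.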
Manual memory management isn't magically pause-free, you just get to express some opinion about where you take the pauses. And I'll contend that (A) most programmers aren't especially good at choosing when that should be, and (B) lots (most?) software cares about overall throughput, so long as max latency stays under some sane bound.
I've seen some benchmarks, but can't find them now, so maybe I am wrong about this.
> free() is a cheap operation. It usually involves setting a bit somewhere and potentially merging with an adjacent free block if available/appropriate.
There is some tree-like structure somewhere that lets malloc() locate this block. That structure has to be modified in parallel by many concurrent threads, which will likely need some locks, meaning the program operates outside of the CPU cache.
In JVM for example, GC is integrated into thread models, so they can have heap per thread, and also "free()" happens asynchronously, so doesn't block calling code. Additionally, malloc approaches usually suffer from memory fragmentation, while JVM GC is doing compactions all the time in background, tracks memory blocks generations, and many other optimizations.
In most languages, the current value in a for loop is bound to a fresh variable on each iteration rather than to a single shared one. The only languages I know of where that's not the case are Go and Python (JavaScript used to have this problem too with for(var ...); it was fixed with for(let ...)). So if you don't regularly write Go, it's easy to make this mistake.
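For comparison, here are the two behaviours side by side in Rust. The first function is what a Rust `for` loop gives you by default. The second deliberately emulates a single shared loop variable with `Rc<Cell<..>>` to reproduce the surprise. Both function names are made up.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Each closure captures its own copy of the loop value.
fn captured_values() -> Vec<u32> {
    let mut closures: Vec<Box<dyn Fn() -> u32>> = Vec::new();
    for i in 0..3 {
        // `move` copies the current `i` into the closure.
        closures.push(Box::new(move || i * 10));
    }
    closures.iter().map(|f| f()).collect()
}

/// Emulates one shared loop variable: every closure reads the same slot, so
/// by the time they run they all see the last value written.
fn shared_values() -> Vec<u32> {
    let slot = Rc::new(Cell::new(0));
    let mut closures: Vec<Box<dyn Fn() -> u32>> = Vec::new();
    for i in 0..3 {
        slot.set(i);
        let slot = Rc::clone(&slot); // same underlying cell, new handle
        closures.push(Box::new(move || slot.get() * 10));
    }
    closures.iter().map(|f| f()).collect()
}

fn main() {
    println!("{:?}", captured_values());
    println!("{:?}", shared_values());
}
```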
It does.
Problem is that there isn't the documentation, examples etc to help navigate the many options.
Why do you think you don't need to know of it? I want to know if the function I'm calling is going to make a network request. Just because I can have a programming language that hides that distinction from me doesn't mean I want that.
Ideally I want to have the fundamental behavior of any function I call encoded in the function signature. So if it's async, I know it's going to reach out to some external system for some period of time.
That has nothing to do with function coloring.
> Ideally I want to have the fundamental behavior of any function I call encoded in the function signature.
There is no distinction of async functions if you don't have function coloring that you can encode in type signatures.
Sure, in the same way that types have nothing to do with enforcing logical correctness of software.
> There is no distinction of async functions if you don't have function coloring that you can encode in type signatures.
What are you trying to say with this statement?
I don't want to be able to call a fallible function from an infallible one trivially; I want the compiler to force me to specify what exactly I'm going to do with an error if it happens. Likewise for async-from-sync, there are many ways I could call these: I can create a single-threaded executor and use it to drive the future to completion, or maybe I want to create a multithreaded executor, or maybe I expect the future to complete in a single poll and never suspend, and I don't even need a scheduler.
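The last of those options, "poll it once and expect it to be done", fits in a few lines of std. `poll_once` and `NoopWake` are names invented for this sketch. Returning `None` on `Pending` is just one possible policy, and a caller could equally panic or fall back to a real executor.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: we never intend to be woken.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future exactly once. Some(output) if it completed without
/// suspending, None if it would have needed a scheduler.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

/// An async fn that never awaits anything that suspends.
async fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    println!("{:?}", poll_once(add(2, 3)));
    println!("{:?}", poll_once(std::future::pending::<u32>()));
}
```

The sync caller has to pick one of these strategies explicitly, which is the point being made: the choice is visible in the code instead of being made for you.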
This is very false. Managed-memory languages don't require you to even think about lifetimes, let alone write them down.
Yes, I understand that this is for efficiency - but claiming that you have to think about lifetimes everywhere is just wrong, and irrelevant when discussing topics (prototyping/design work/scripting) where you don't care about efficiency.
Mostly you need to think about large and/or important objects, and avoid cycles, and avoid unneeded references to such objects that would live for too long. Such cases are few.
The silver lining is that if you make a mistake and a large object would have to live slightly longer, you won't have to wrangle with the lifetime checker for that small sliver of lifetime. But if you make a big mistake, nothing will warn you about a memory leak, before the prod monitoring does.
Can you provide concrete examples of this? I've literally never had a bug due to the nature of a memory-managed language.
Memory is only one of many types of resources applications use. Memory-managed languages do nothing to help you with those resources, and effectively managing those resources is way harder in those languages than in Rust or C++.
In both languages you have to rely on careful design, and then profile memory use and manage it.
However, Rust requires you to additionally reason about lifetimes explicitly. Again - great for performance, terrible for design, prototyping, and tools in non-resource-constrained environments*.
Sure, but in a lot of cases, these invariants can be trivially explained, or intuitive enough that it wouldn't even need explanation. While in Rust, you can easily spend a full day just explaining it to the compiler.
I remember spending literal _days_ tweaking intricate lifetimes and scopes just to promise Rust that some variables won't be used _after_ a thread finishes.
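For the simple version of that promise, `std::thread::scope` (in std since Rust 1.63) lets threads borrow locals directly, because the scope joins every thread before it returns. This is a minimal sketch with a made-up function, and it may well not cover the more intricate case described above.

```rust
use std::thread;

/// Sums a slice using two threads that borrow the caller's data directly,
/// with no Arc and no copies.
fn sum_in_halves(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both threads are guaranteed to be joined before `scope` returns,
        // so borrowing `left` and `right` is accepted by the borrow checker.
        let l = s.spawn(|| left.iter().sum::<u64>());
        let r = s.spawn(|| right.iter().sum::<u64>());
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data = vec![1, 2, 3, 4, 5];
    println!("{}", sum_in_halves(&data));
}
```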
Some things I even never managed to be able to express in Rust, even if trivial in C, so I just rely on having a C core library for the hot path, and use it from Rust.
Overall, performance sensitive lifetime and memory management in Rust (especially in multithreaded contexts) often comes down to:
1) Do it in _sane_ Rust, and copy everything all over the place, use fancy smart pointers, etc.
2) Do it in a performant manner, without useless copies, without over the top memory management, but prepare a week of frustrating development and a PhD in Rust idiosyncrasies.
Yup, this is correct - and the reason is because Rust forces you to care about efficiency concerns (lifetimes) everywhere. There's no option to "turn the borrow checker off" - which means that when you're in prototyping mode, you pay this huge productivity penalty for no benefit.
A language that was designed to be good at iteration would allow you to temporarily turn the borrow checker off, punch holes in your type system (e.g. with Python's "Any"), and manage memory for you - and then let you turn those features on again when you're starting to optimize and debug. (plus, an interactive shell and fast compilation times - that's non-negotiable) Rust was never designed to be good at prototyping.
I heard a saying a few years ago that I like - "it's designed to make hardware [rigid, inflexible programs], not software". (it's from Steve Yegge - I could track it down if I cared)
I fully acknowledge that I'm an "old school" system dev who's coming from the C world and not the JS world, so I probably have a certain bias because of that, but I genuinely can't understand how anybody could look at the mess that's Rust's async and think that it was a good design for a language that already had the reputation of being very complicated to write.
I tried to get it, I really did, but my god what a massive mess that is. And it contaminates everything it touches, too. I really love Rust and I do most of my coding in it these days, but every time I encounter async-heavy Rust code my jaw clenches and my vision blurs.
At least my clunky select "runtime" code can be safely contained in a couple functions while the rest of the code remains blissfully unaware of the magic going on under the hood.
Dear people coming from the JS world: give system threads and channels a try. I swear that a lot of the time it's vastly simpler and more elegant. There are very, very few practical problems where async is clearly superior (although plenty where it's arguably superior).
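To make the "threads and channels" suggestion concrete, here is a small sketch using only `std::thread` and `std::sync::mpsc`. The function name `squares_via_threads` is invented for illustration, and one OS thread per input is an assumption that only makes sense for small workloads:

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical example: squares each input on its own OS thread and
/// collects the results over a channel.
fn squares_via_threads(inputs: Vec<u32>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for n in inputs {
        let tx = tx.clone();
        thread::spawn(move || {
            // Ordinary blocking code; no async runtime involved.
            tx.send(n * n).unwrap();
        });
    }
    // Drop the original sender so the receiver ends once all workers are done.
    drop(tx);
    let mut results: Vec<u32> = rx.iter().collect();
    // Arrival order depends on scheduling, so sort for a stable result.
    results.sort_unstable();
    results
}

fn main() {
    println!("{:?}", squares_via_threads(vec![3, 1, 2])); // prints [1, 4, 9]
}
```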
Basically Rust Futures is what Go wishes it could have. Rust made the right choice in waiting and spending the time to design async right.
Colored functions is a debatable problem at best. I consider it a feature not a bug and it makes reasoning about programs easier at the expense of writing additional async/await keywords which is really a very minor annoyance.
On the other hand Go's need of using channels to do trivial and common tasks like communicating the result of an async task together with the lack of RAII and proper cleanup signaling in channels (you can very easily deadlock if nothing is attached on the other end of the channel), plus no compile time race detection - all that makes writing concurrent code harder.
Rust has changed a lot in the past 5 years, people just haven't noticed, so they assume that Rust is still an old outdated language.
Async is just hard. That’s it. It’s fundamentally difficult.
In my experience language implementations of async fall along two axes: clarity and control. C# is straightforward enough (having cribbed its async design off functional languages) but I find it scores low on the “clarity” scale and moderate-high in control, because you could control it, but it wasn't always clear.
JS is moderate-high clarity, low control: easy to understand, because all the knobs are set for you. Before it got async/await sugar, I’d have said it would have been low clarity, because I’ve seen the promise/callback hell people wrote when given rope.
Python is the bottom of the barrel for both clarity and control. It genuinely has to have the most awful and confusing async design I’ve ever seen.
I personally find Rust scores high in both clarity and control. Playing with the Glommio executor was what really solidified my understanding of how async works however.
What I realized, eventually, is that blocking is a beautiful thing. Embrace the thread of execution going to sleep, as another thread may now execute on the (single core at the time) CPU.
Now you have an organization problem, how to distribute threads across different tasks, some sequential, some parallel, some blocking, some nonblocking. Thread-per-request? Thread-per-connection?
And now a management problem. Spawning threads. Killing threads. Thread pools. Multithreaded logging. Exceptions and error handling.
Totally manageable in mild cases, and big wins in throughput, but scaling limits will present themselves.
I confront many of these tradeoffs in a fun little exercise I call "Miner Mover", implemented in Ruby using many different concurrency primitives here: https://github.com/rickhull/miner_mover
The other day, Intel revealed a processor with 66 thread support per core. 64 of those threads were called "slow", because there's no prefetching and speculative execution, as they are supposed to be waiting (mainly for memory, but networking could be another option). Perhaps very many cheap hardware threads is a way out of this.
Those objects are also virtually no problem in languages like Rust or C++. Those are local objects whose lifetimes are trivial and they are managed automatically with no additional effort from the developer.
The thing is, you think your code is safe and it most likely is, but mathematically speaking, what you are doing is difficult or even impossible to prove correct. It is akin to running an NP complete algorithm on a problem that is easier than NP. Most practical problem instances are easy to solve, but the worst case which can't be ruled out is utterly, utterly terrible, which forces you to use a more general solution than is actually necessary.
Since smart pointers became ubiquitous in C++, I've (personally) had only a handful of memory and lifetime issues. They were all deducible by looking at where we "escape hatched" and stored a raw ptr that was actually a unique pointer, or something similar. I'll take having one of those every 18 months over throwing away my entire language, toolchain, ecosystem and iteration times.
i can’t think of anything you can do in c that you can’t do in unsafe rust, and that has the advantage that you can both narrow it down to exactly where you need it and only there, and you can test it in miri to find bugs
(In particular, it's very easy to inadvertently trigger the footgun of converting a pointer to a reference, then back to a pointer, so that using the original pointer again can invalidate the new pointer.)
Extremely pointer-heavy code is entirely possible in unsafe Rust, but often it's far more difficult to correctly express what you want compared to C. With that in mind, a tightly-scoped core library in C can make a lot of sense; more lines of unsafe code in either language leave more room for bugs to slip in.
That is not my point.
There is a world between "you can do it" and "you will do it".
Some things in Rust are doable in theory, but end up being so insane to implement that you won't do it in practice. That is my point.
That’s not really true. The standard workaround for this is just to .clone() or Rc<RefCell<>> to unblock yourself, then come back later and fix it.
It is true that this needs to be done with some care otherwise you can end up infecting your whole codebase with the “workaround”. That comes with experience.
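For readers who haven't seen the workaround, this is roughly what it looks like. A sketch only; `Inventory` and `shared_inventory_demo` are invented names, not anything from the thread:

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Default)]
struct Inventory {
    items: Vec<String>,
}

/// Hypothetical demo: two "owners" share one inventory, and the borrow rules
/// are checked at run time instead of compile time.
fn shared_inventory_demo() -> usize {
    let shared = Rc::new(RefCell::new(Inventory::default()));
    let player_view = Rc::clone(&shared); // bumps a refcount, no deep copy
    let shop_view = Rc::clone(&shared);

    player_view.borrow_mut().items.push("sword".to_string());
    shop_view.borrow_mut().items.push("potion".to_string());

    // A deep `.clone()` of the data is the other common escape hatch.
    let snapshot: Vec<String> = shared.borrow().items.clone();
    snapshot.len()
}

fn main() {
    println!("{}", shared_inventory_demo()); // prints 2
}
```

The price is that overlapping `borrow_mut()` calls panic at run time, which is part of why this tends to be cleaned up later.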
It's a "workaround" precisely because the language does not support it. My statement is correct - you cannot turn the borrow-checker off, and you pay a significant productivity penalty for no benefit. "Rc" can't detect cycles. ".clone()" doesn't work for complex data structures.
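The point about cycles can be shown in a few lines. This is a sketch with an invented `Node` type: a strong back-reference keeps the count above zero forever, while the usual mitigation is a `Weak` back-pointer chosen by hand.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type with one strong link and one weak back-link.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    prev: RefCell<Weak<Node>>,
}

fn new_node() -> Rc<Node> {
    Rc::new(Node { next: RefCell::new(None), prev: RefCell::new(Weak::new()) })
}

/// Returns the strong count of `a` after `b` points back at it, either with
/// a `Weak` reference or with a second strong one (a cycle).
fn strong_count_after_link(use_weak: bool) -> usize {
    let a = new_node();
    let b = new_node();
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    if use_weak {
        *b.prev.borrow_mut() = Rc::downgrade(&a);
        // The back-pointer still works while `a` is alive.
        assert!(b.prev.borrow().upgrade().is_some());
    } else {
        // Strong back-reference: a -> b -> a. Neither count can reach zero,
        // so both nodes leak when `a` and `b` go out of scope.
        *b.next.borrow_mut() = Some(Rc::clone(&a));
    }
    Rc::strong_count(&a)
}

fn main() {
    println!("{}", strong_count_after_link(true)); // prints 1
    println!("{}", strong_count_after_link(false)); // prints 2
}
```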
Frankly I think this is a good thing! And I disagree with your "no benefit" assertion.
I don't like prototyping. Or rather, I don't like to characterize any phase of development as prototyping. In my experience it's very rare that the prototype actually gets thrown away and rewritten "the right way". And if and when it does happen, it happens years after the prototype has been running in production and there's a big scramble to rewrite it because the team has hit some sort of hard limit on fixing bugs or adding features that they can't overcome within the constraints of the prototype.
So I never prototype. I do things as "correctly" as possible from the get-go. And you know what? It doesn't really slow me down all that much. I personally don't want to be in the kind of markets where I can't add 10-20% onto a project schedule without failing. And I suspect markets where those sorts of time constraints matter are much rarer than most people tell themselves.
(And also consider that most projects are late anyway. I'd rather be late because I was spending more time to write better, safer code, than because I was frantically debugging issues in my prototype-quality code.)
Most bugs are elementary logic bugs expressible in every programming language.
Rust programmers don't iterate using unsafe because every single line of unsafe gives you more to think and worry about, not less. But they might iterate using more copying/cloning/state-sharing-with-ref-counting-and-RefCell than necessary, and clean up the ownership graph later if needed.
That's not iteration. That's debugging. "Iteration" includes design work. Rust's requirement to consider memory management and lifetimes actively interferes with design work with effectively zero contributions towards functional correctness (unlike types, which actually help you write less buggy code - but Rust's type system is not unique and is massively inferior to the likes of Haskell and Idris), let alone creating things.
I don't really agree with that. If you've decided on a design in Rust where you're constantly fighting with lifetimes (for example), that's a sign that you may have designed your data ownership wrong. And while it's not going to be the case all the time, it's possible that a similar design in another language would also be "wrong", but in ways that you don't find out until much later (when it's much harder to change).
> Rust's type system is not unique and is massively inferior to the likes of Haskell and Idris
Sure, but few people use Haskell or Idris in the real world for actual production code. Most companies would laugh me out of an interview if I told them I wanted to introduce Haskell or Idris into their production code base. That doesn't invalidate the fact that they have better type systems than Rust, but a language I can't/won't use for most things in most places isn't particularly useful to me.
I can see the argument that Rust encourages by its design a clean implementation to any given algorithm. But no language's design can guide you to finding a good algorithm for solving a given problem - you often need to quickly try out many different algorithms and see which works best for your constraints.
Rust adopted the stackless coroutine model for async tasks based on its constraints, such as having a minimal runtime by default, not requiring heap allocations left and right, and being amenable to aggressive optimizations such as inlining. The function coloring problem ("contamination") is an unfortunate consequence. The Rust devs are currently working on an effects system to fix this. Missing features such as standard async traits, async functions in traits, and executor-agnosticism are also valid complaints. Considering Rust's strict backwards compatibility guarantee, some of these will take a long time.
I like to think of Rust's "async story" as a good analogue to Rust's "story" in general. The Rust devs work hard to deliver backwards compatible, efficient, performant features at the cost of programmer comfort (ballooning complexity, edge cases that don't compile, etc.) and compile time, mainly. Of course, they try to resolve the regressions too, but there's only so much that can be done after the fact. Those are just the tradeoffs the Rust language embodies, and at this point I don't expect anything more or less. I like Rust too, but there are many reasons others may not. The still-developing ecosystem is a prominent one.
Anyway, while I have some issues with async around future composition and closures, I see people with the kind of super strong reaction here and just feel like I must not be seeing something. To me, it solves the job well, is comprehensible and relatively easy to work with, and remains performant at scale without too much fiddling.
> I fully acknowledge that I'm an "old school" system dev who's coming from the C world and not the JS world, so I probably have a certain bias because of that, but I genuinely can't understand how anybody could look at the mess that's Rust's async and think that it was a good design for a language that already had the reputation of being very complicated to write.
I'm in the same "old school" system dev category as you, and I think that modern languages have gone off the deep end, and I complained about async specifically in a recent comment on HN: https://news.ycombinator.com/item?id=37342711
> At least my clunky select "runtime" code can be safely contained in a couple functions while the rest of the code remains blissfully unaware of the magic going on under the hood.
And we could have had that for async as well, if languages were designed by the in-the-trenches industry developer, and not the "I think Haskell and Ocaml is great readability" academic crowd.
With async in particular, the most common implementation is to color the functions by qualifying the specific function as async, which IMO is exactly the wrong way to do it.
The correct way would be for the caller to mark a specific call as async.
IOW, which of the following is clearer to the reader at the point where `foo` is called?
Option 1: color the function
async function foo () {
// ...
}
...
let promise = foo ();
let bar = await promise;
Option 2: schedule any function
function foo () {
// ...
}
let sched_id = schedule foo ();
...
let bar = await sched_id;
Option 1 results in compilation errors for code in the call-stack that isn't async, results in needing two different functions (a wrapper for sync execution), and means that async only works for that specific function. Option 2 is more like how humans think - schedule this for later execution, when I'm done with my current job I'll wait for you if you haven't finished.
What if your example code is holding onto a thread that foo() is waiting to use?
Said another way, explain how you solved the problems of just synchronously waiting for async. If that just worked then we wouldn't need to proliferate the async/await through the stack.
Actually, Rust could still learn a lot from these languages. In Haskell, one declares the call site as async, rather than the function. OCaml 5 effect handlers would be an especially good fit for Rust and solve the "colouration" problem.
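The closest thing in plain Rust to marking the call site rather than the function is probably `std::thread::spawn` plus `join`. This is only an approximation of the idea (it uses a whole OS thread per call and blocks the caller at `join`, which an async runtime would not), and `foo` and `caller` are placeholder names:

```rust
use std::thread;

/// An ordinary, uncolored function.
fn foo(n: u64) -> u64 {
    (1..=n).sum()
}

fn caller() -> u64 {
    // "schedule foo()": the call site, not foo's signature, makes it concurrent.
    let handle = thread::spawn(|| foo(1_000));
    // ... other work could happen here ...
    let direct = foo(10); // the same function, called synchronously
    // "await sched_id": block until the scheduled call has finished.
    let scheduled = handle.join().unwrap();
    direct + scheduled
}

fn main() {
    println!("{}", caller()); // prints 500555
}
```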
In the meantime it is a little annoying to use, but I don’t mind designing against it by default. I feel less architecturally constrained if more syntactically constrained.
I've used Rust async extensively for years, and I consider it to be the cleanest and most well designed async system out of any language (and yes, I have used many languages besides Rust).
It really annoys me that something like this isn't built-in: https://github.com/mrkline/channel-drain
We could imagine extending this to arbitrary poll-able things. And now we have futures, kind of.
Any software that does a lot of fine-grained concurrent I/O has this issue. Database engines have been fighting this for many years, since they can pervasively block both on I/O and locking for data model concurrency control.
I guess if someone wants to use futures as if they were goroutines then it's not a bug, but this sort of presupposes that an opinionated runtime is already shooting signals at itself. Fundamentally the language gives you a primitive for switching execution between one context and another, and the premise of the program is probably that execution will switch back pretty quickly from work related to any single task.
I read the blog about this situation at https://tokio.rs/blog/2020-04-preemption which is equally baffling. The described problem cannot even happen in the "runtime" I'm currently using because io_uring won't just completely stop responding to other kinds of sqe's and only give you responses to a multishot accept when a lot of connections are coming in. I strongly suspect equivalent results are achievable with epoll.
"Async" in native code is cargo cult, unless you're trying to run on bare metal without OS support.
There is a reason why so many developers have chosen to do application level scheduling: No operating system has exposed viable async primitives to build this on the OS level. OS threads suck so everyone reinvents the wheel. See Java's "virtual threads", Go's goroutines, Erlang's processes, NodeJS async.
You don't seem to be aware what a context switch on an application level is. It is often as simple as a function call. There is no way that returning to the OS, running a generic scheduler that is supposed to deal with any possible application workload that needs to store all the registers and possibly flush the TLB if the OS makes the mistake of executing a different process first and then restore all the registers can be faster than simply calling the next function in the same address space.
Developers of these systems brag about how you can have millions of tasks active at the same time without breaking any sweat.
Most often when doing async you have a small number of tasks repeated many times, then you spin up one thread per CPU, and "randomly" assign each task as it comes in to a thread.
When doing GUI style programming you have a lot of different tasks and each task is done in exactly one thread.
So just like any other kind of scheduling? "Frequently" is also very subjective, and there are tradeoffs between throughput, latency, and especially tail latency. You can improve throughput and minimum latency by never preempting tasks, but it's bad for average, median, and tail latency when longer tasks starve others, otherwise SCHED_FIFO would be the default for Linux.
>I read the blog about this situation at https://tokio.rs/blog/2020-04-preemption which is equally baffling
You've misunderstood the problem somehow. There is definitely nothing about tokio (which uses epoll on Linux and can use io_uring) not responding in there. io_uring and epoll have nothing to do with it and can't avoid the problem: the problem is with code that can make progress and doesn't need to poll for anything. The problem isn't unique to Rust either, and it's going to exist in any cooperative multitasking system: if you rely on tasks to yield by themselves, some won't.
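A toy model of that last sentence, with no I/O at all. This is not how tokio works internally; it is a made-up round-robin scheduler (`Task`, `Step` and `run` are invented names) that shows why a task that never yields starves the others under cooperative scheduling:

```rust
use std::collections::VecDeque;

/// What a task reports back each time the scheduler runs it.
enum Step {
    Yielded,
    Done,
}

/// A toy task: `work` units to do, at most `chunk` units per turn.
struct Task {
    name: &'static str,
    work: u32,
    chunk: u32,
}

impl Task {
    fn run_one_turn(&mut self, log: &mut Vec<&'static str>) -> Step {
        let n = self.chunk.min(self.work);
        for _ in 0..n {
            log.push(self.name);
        }
        self.work -= n;
        if self.work == 0 { Step::Done } else { Step::Yielded }
    }
}

/// Round-robin with no preemption: a task keeps the "CPU" until it returns.
fn run(mut queue: VecDeque<Task>) -> Vec<&'static str> {
    let mut log = Vec::new();
    while let Some(mut task) = queue.pop_front() {
        if let Step::Yielded = task.run_one_turn(&mut log) {
            queue.push_back(task);
        }
    }
    log
}

fn main() {
    let polite = Task { name: "A", work: 3, chunk: 1 };
    let greedy = Task { name: "A", work: 3, chunk: 3 }; // never yields mid-work
    let b = || Task { name: "B", work: 3, chunk: 1 };
    println!("{:?}", run(VecDeque::from(vec![polite, b()]))); // A and B interleave
    println!("{:?}", run(VecDeque::from(vec![greedy, b()]))); // B waits for all of A
}
```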
Yes. Industries that care about latency take some pains to avoid this as well, of course.
> io_uring and epoll have nothing to do with it and can't avoid the problem: the problem is with code that can make progress and doesn't need to poll for anything.
They totally can though? If I write the exact same code that is called out as problematic in the post, my non-preemptive runtime will run a variety of tasks while non-preemptive tokio is claimed to run only one. This is because my `accept` method would either submit an "accept sqe" to io_uring and swap to the runtime or do nothing and swap to the runtime (in the case of a multishot accept). Then the runtime would continue processing all cqes in order received, not *only* the `accept` cqes. The tokio `accept` method and event loop could also avoid starving other tasks if the `accept` method was guaranteed to poll at least some portion of the time and all ready handlers from one poll were guaranteed to be called before polling again.
This sort of design solves the problem for any case of "My task that is performing I/O through my runtime is starving my other tasks." The remaining tasks that can starve other tasks are those that perform I/O by bypassing the runtime and those that spend a long time performing computations with no I/O. The former thing sounds like self-sabotage by the user, but unfortunately the latter thing probably requires the user to spend some effort on designing their program.
> The problem isn't unique to Rust either, and it's going to exist in any cooperative multitasking system: if you rely on tasks to yield by themselves, some won't.
If we leave the obvious defects in our software, we will continue running software with obvious defects in it, yes.
Since the advent of Java in mid-90s I hear about superiority of its VM, yet my observations from the ops PoV claim otherwise. So I suspect a huge hoax...
Hey btw, you're saying "Java is _memory intensive_", like it would magically explain everything. Let's get to that more deeply. Why is it so, dear Watson? Have you compared the memory consumption of the same algo and pretty much similar data structures between languages? Why does Java have to be such a memory hog? Why is its class loading so slow, too? Are these the qualities of a superior VM design and zillions of man-hours invested? huh?
By the way, if the code implementing functionality X needs N times more memory than the other language with gc, then however advanced that gc would be (need to find a proof for that btw), it wouldn't catch up speedwise, because it simply needs to move around more. So simple.
Marketing is not a silver bullet for success and the tech industry is full of examples of exactly that. The truth is that Sun was able to promote Java so heavily because it was found to be useful.
> Since the advent of Java in mid-90s I hear about superiority of its VM, yet my observations from the ops PoV claim otherwise.
The landscape of the 90s certainly made a VM language appealing. And compared to the options of that day it's hardly any wonder.
> So I suspect a huge hoax...
It's you versus a plurality, if not majority, of the entire enterprise software market. Of course that's not to say that Java doesn't have problems or that the JVM is perfect, but is it so hard to believe that Java got something right? Is it honestly more believable that everyone else is caught up in a collective delusion?
> Hey btw, you're saying "Java is _memory intensive_", like it would magically explain everything.
It's not that Java is necessarily memory intensive, but that a lot of Java performance tuning is focused towards optimizing throughput performance, not memory utilization. Cleaning out a large heap occasionally is in general better than cleaning out a smaller one more frequently.
> By the way, if the code implementing functionality X needs N times more memory than the other language with gc, then however advanced that gc would be (need to find a proof for that btw), it wouldn't catch up speedwise, because it simply needs to move around more. So simple
It's not so simple. First of all, the choice of a large heap is not mandated by Java, it's a trade off that developers are making. Second of all, GC performance issues only manifest when code is generating a lot of garbage, and believe it or not, Java can be written to vastly minimize the garbage produced. And last of all, Java GCs like Shenandoah have a max GC pause time of less than 1ms for heaps up to 16TiB.
Anyway, at the end of the day no one is going to take Go away from you. Personally I don't have a horse in this race. That said, the fact is that Java GCs are far more configurable, sophisticated, and advanced than anything Go has (and likely ever will). IMO, Go came at a point in time where there was a niche to exploit, but that niche is shrinking.
But I think you are avoiding a direct answer to the question of why Java needs so much memory in the first place. You talk about "developer's choice for a big heap"; first, I don't think it is their choice, but the consequence of the fact that such a big heap is needed at all for typical code. Why?
Let's code a basic https endpoint using a typical popular framework returning some simple json data. The usual stuff. Why will it consume 5x - 10x more memory in Java? And if one says it's just an unrealistic microbenchmark, things get worse when coding more real stuff.
Btw, having more knobs for a gc is not necessarily a good thing, if it means that there are no fire-and-forget good defaults. If an engineer continuously needs to get his head around these knobs to have a non-crashing app, then we have a problem. Or rather - ops have a problem, and some programmers are, unfortunately, disconnected from the ops realm. Have you been working together with ops guys? On prod, ofc?
If you can get away with smart pointers and such, life is beautiful, nothing wrong there!
The debate here is rather for the cases where you cannot afford such things.
If your comments aren't relevant to writing a game engine, then they're not relevant to this thread.
This is false. This "thread" is not "about" anything. The top-level comment was about writing a game engine, and various replies to that thread deviated from that topic to a greater or lesser extent. Nobody has the authority to decide what a thread is "about".
Additionally, the actual article under consideration is about Rust's design in general. That makes my comments more on topic than one about game engines in particular, and so it should be pretty clear that if you're going to assume anything about my comments, then it would not be that they're about game engines.
Case in point, I once wrote a program to take a 360 degree image and rotate it so that the horizon followed the horizontal line along the middle, and it faced north. I wrote it in python first and running it on a 2k image took on the order of 5 minutes. I rewrote it in rust and it took on the order of 200ms.
Could I iterate in Python faster? Yes, but the end result was useless.
This thread, and many other threads about Rust, are filled with people arguing the exact opposite - that Rust is a good, productive language for high-level application development. I agree with you, there's relatively little overlap - that's what I'm arguing for!
These can also be quite narrow: Rc is a zero-cost abstraction for refcounting with both strong and weak references allocated with the object on the heap. You cannot implement that same thing more efficiently, but you can implement something different but similar that is both faster and lighter than Rc. You can make a CheapRc that only has strong counts, and that will be both lighter and faster by a tiny amount, or a SeparateRc that stores the counts separately on the heap, which offers cheaper conversions to/from Rc.
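To give an idea of what such a `CheapRc` might look like, here is a rough single-threaded sketch. It is not a real crate; the layout and method names are assumptions, and it skips things `std::rc::Rc` handles, such as refcount overflow checks and unsized types:

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

struct Inner<T> {
    strong: Cell<usize>, // the only counter; no weak count is stored
    value: T,
}

/// Hypothetical refcounted pointer with strong counts only.
struct CheapRc<T> {
    ptr: NonNull<Inner<T>>,
    _owns: PhantomData<Inner<T>>,
}

impl<T> CheapRc<T> {
    fn new(value: T) -> Self {
        let boxed = Box::new(Inner { strong: Cell::new(1), value });
        CheapRc { ptr: NonNull::from(Box::leak(boxed)), _owns: PhantomData }
    }

    fn strong_count(this: &Self) -> usize {
        this.inner().strong.get()
    }

    fn inner(&self) -> &Inner<T> {
        // SAFETY: the allocation stays alive while any CheapRc points at it.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Clone for CheapRc<T> {
    fn clone(&self) -> Self {
        let inner = self.inner();
        inner.strong.set(inner.strong.get() + 1);
        CheapRc { ptr: self.ptr, _owns: PhantomData }
    }
}

impl<T> Deref for CheapRc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner().value
    }
}

impl<T> Drop for CheapRc<T> {
    fn drop(&mut self) {
        let remaining = self.inner().strong.get() - 1;
        self.inner().strong.set(remaining);
        if remaining == 0 {
            // SAFETY: this was the last handle, so nothing else can reach the
            // allocation; rebuild the Box to drop the value and free it.
            unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
        }
    }
}

fn main() {
    let a = CheapRc::new(String::from("hi"));
    let b = a.clone();
    println!("{} {}", *b, CheapRc::strong_count(&a)); // prints "hi 2"
}
```

Whether the saved counter is measurable in practice would need a benchmark; the sketch only shows that the narrower contract leaves less to store and update.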
We're talking about the comparison between using an abstraction vs not using an abstraction.
When I said "doesn't have a runtime cost", I meant "the abstraction doesn't have a runtime cost compared to not using the abstraction".
If you want your computer to do anything useful, then you have to write code, and that code has a runtime cost.
That runtime cost is unavoidable, it is a simple necessity of the computer doing useful work, regardless of whether you use an abstraction or not.
Whenever you create or use an abstraction, you do a cost-benefit analysis in your head: "does this abstraction provide enough value to justify the EXTRA cost of the abstraction?"
But if there is no extra cost, then the abstraction is free, it is truly zero cost, because the code needed to be written no matter what, and the abstraction is the same speed as not using the abstraction. So there is no cost-benefit analysis, because the abstraction is always worth it.
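Iterator adaptors are the usual example given for this in Rust. The two functions below (names invented here) compute the same thing; in optimized builds they are generally expected to compile to comparable machine code, though that is something to confirm with a benchmark or by reading the assembly rather than take on faith:

```rust
/// Hand-written loop: no abstraction beyond indexing.
fn sum_even_squares_loop(xs: &[u32]) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < xs.len() {
        if xs[i] % 2 == 0 {
            total += xs[i] * xs[i];
        }
        i += 1;
    }
    total
}

/// The same computation through iterator adaptors.
fn sum_even_squares_iter(xs: &[u32]) -> u32 {
    xs.iter().filter(|x| **x % 2 == 0).map(|x| x * x).sum()
}

fn main() {
    let xs = [1, 2, 3, 4];
    println!("{}", sum_even_squares_loop(&xs)); // prints 20
    println!("{}", sum_even_squares_iter(&xs)); // prints 20
}
```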
So I assume you mean "implementation complexity" but that's irrelevant, because that cost only needs to be paid once, and then you put the abstraction into a crate, and then millions of people can benefit from that abstraction.
No abstraction is perfect. Every abstraction, when encountered by a user, requires them to ask "what does this do?", because they don't have the implementation in front of their eyes
This may be an easy question to answer- maybe it maps very obviously to a pattern or domain concept they already know, or maybe they've seen this exact abstraction before and just have to recall it
It may be slightly harder- a new but well-documented concept, or a concept that's intuitive but complex, or a concept that's simple but poorly-named
Or it may be very hard- a badly-designed abstraction, or one that's impossible to understand without understanding the entire system
But the simplest, most elegant, most intuitive abstraction in the world has nonzero cognitive cost. We abstract despite the cost, when that cost is smaller than the cost of not abstracting.
The "zero-cost" phrase is deceptive. There's a non-zero cognitive cost to the author and all subsequent readers. A proliferation of abstractions increases the cost of every other abstraction further due to complex interactions. This is true in all languages where the community has embraced the idea of abstraction without moderation.
Rust embraces zero to low cost abstraction at the machine performance level, although to get reflective or runtime adaptive abstractions you end up losing some of that zero cost as you need to start boxing and moving things into heaps and using vtables, etc. IMO this is where rust is weakest and most complex.
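A small illustration of where that cost shows up, using an invented `Shape` trait. The generic version is resolved at compile time but accepts only one concrete type per slice; the `dyn` version mixes types at the price of a heap box and a vtable call per element:

```rust
// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.0 * self.1 }
}

/// Static dispatch: monomorphized per concrete type, one type per slice.
fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Dynamic dispatch: mixed types behind boxes, called through a vtable.
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    println!("{}", total_area_static(&[Square(2.0), Square(3.0)])); // prints 13
    let mixed: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("{}", total_area_dyn(&mixed)); // prints 10
}
```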
No, the cognitive cost of a particular abstraction relative to all other abstractions under consideration can be negative.
The option of not using any abstraction doesn’t exist. If you disagree with that then I think we have to go back one step and ask what an abstraction even is.
Developers of anything resembling complex scripts (for the time) had to manually break these cycles by setting to null the attributes of the DOM node that had references to any JS objects.
Douglas Crockford has a little writeup here[0] with a heavy-handed solution, but it was better than doing it by hand if you were worried another developer would come along and add something and forget to remove it.
Other memory managed languages also have to deal with the occasional sharp corners. Most of the time, this can be avoided by knowing to clean up resources properly, but some are easier to fall for than others.
Oracle has a write up on hunting Java memory leaks [1]. Microsoft has a similar, but less detailed article here[2]
Of course, sometimes a "leak" is really a feature. One notorious example is variable shadowing in the bad old days of JS prior to the advent of strict mode. I forget the name of the company, but someone's launch was ruined because a variable referencing a shopping cart wasn't declared with `var` and was treated as a global variable, causing concurrent viewers to accidentally get other users' shopping cart data as node runs in a single main thread, and concurrency was handled only by node's event loop.
[0] https://www.crockford.com/javascript/memory/leak.html
[1] https://docs.oracle.com/en/java/javase/17/troubleshoot/troub...
[2] https://learn.microsoft.com/en-us/dotnet/core/diagnostics/de...
To be more precise: this is a bug, that was fixable, in the runtime, not in user applications that would run on top of it.
Assume a well-designed memory-safe language and implementation. What kinds of memory hazards are there?
Notwithstanding the rest of your comment, this doesn't seem like a good example of the problem, since most GCs have a complete view of their memory.
In an ideal world, we could have a GC that reclaimed all unused memory, but that turns out to be impossible because of the halting problem. So, we settle for GCs that reclaim only unreachable memory, which is a strict subset of unused memory. Unused reachable memory is a leak.
On type system theory, Rust still has quite a lot of catching up to do compared to theorem provers, and even those aren't without issues.
Yes and no. You're gonna write far fewer tests in a language like Rust than in a language like Python. In Python you'll have to write tests to eliminate the possibility of bugs that the Rust compiler can eliminate for you. I would much rather just write logic tests.
> Most bugs are elementary logic bugs expressible in every programming language.
I don't think that's true. I would expect that most bugs are around memory safety, type confusion, or concurrency issues (data races and other race conditions).
In modern C++, memory safety and type confusion aren’t common sources of bugs in my experience. The standard idiomatic design patterns virtually guarantee this. The kinds of concurrency issues that tend to cause bugs can happen in any language, including Rust. Modern C++, for all its deficiencies, has an excellent type safety story, sometimes better than Rust. It doesn’t require the language to provide it though, which is both a blessing and a curse.
I do love Rust and found a number of very valid uses for it but its async story leaves a lot to be desired. I don't enjoy writing it though I do enjoy the results.
Abstractions allow systems to scale. Without them, it would be impossible to work on a system that's 1M lines of code long, because you'd have to read and understand all 1M lines before doing anything.
Why? It isn't solved for async functions, is it? Just because the async is propagated up the call-stack doesn't mean that the call can't deadlock, does it?
Deadlocks aren't solved for a purely synchronous callstack either - A grabbing a resource, then calling B which calls C which calls A ...
Deadlocks are potentially there whether or not you mix sync/async. All that colored functions will get you is the ability to ignore the deadlock because that entire call-stack is stuck.
> If that just worked then we wouldn't need to proliferate the async/await through the stack.
It's why I called it a leaky abstraction.
Maybe I'm misunderstanding what you are saying. I use the word "_implementation_type_" below to mean "either implemented as option 1 or option 2 from my post above."
With current asynchronous implementations (like JS, Rust, etc), any time you use `await` or similar, that statement may never return due to a deadlock in the callstack (A is awaiting B which is awaiting C which is awaiting A).
And if you never `await`, then deadlocking is irrelevant to the _implementation_type_ anyway.
So I am trying to understand what you mean by "it cannot deadlock in this way" - in what way do you mean? async functions can accidentally await on each other without knowing it, which is the deadlock I am talking about.
I think I might understand better if you gave me an example call-chain that, in option 1, sidesteps the deadlock, and in option 2, deadlocks.
Except if you go to a GC language, but then you’re prototyping different kinds of things than you’d probably pick Rust for.
I agree that being able to use `async` inside of traits would be very useful, and hopefully we will get it soon.
> generally needing more capability to existentially quantify Future types without penalty
Could you clarify what you mean by that? Both `impl Future` and `dyn Future` exist, do they not work for your use case?
> Async function types are a mess to write out.
Are you talking about this?
fn foo() -> impl Future<Output = u32>
Or this?
async fn foo() -> u32
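For what it's worth, the two spellings behave the same at the call site. Here is a minimal sketch using only the standard library; `block_on`, `NoopWake`, `foo_impl` and `foo_async` are names made up for illustration, and the busy-polling executor is a toy, not how a real runtime works.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for futures that never wait on I/O.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (hypothetical): polls a single future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Written-out form: a plain fn returning an opaque future type.
fn foo_impl() -> impl Future<Output = u32> {
    async { 42 }
}

// Sugared form: the compiler generates the opaque future type.
async fn foo_async() -> u32 {
    42
}

fn main() {
    assert_eq!(block_on(foo_impl()), 42);
    assert_eq!(block_on(foo_async()), 42);
}
```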
> More control over heap allocations in async/await futures (we currently have to Box/Pin more often than necessary).
I'm curious about your code that needs to extensively Box. In my experience Boxing is normally just done 1 time when spawning the Future.
> Async drop.
That would be useful, but I wouldn't call the lack of it "half-baked", since no other mainstream language has it either. It's just a nice-to-have.
> Better cancellation.
What do you mean by that? All Futures/Streams/etc. support cancellation out of the box: dropping one cancels it, with no extra code needed.
If you want really explicit control you can use something like `abortable`, which gives you an AbortHandle, and then you can call `handle.abort()`.
Rust has some of the best cancellation support out of any async language I've used.
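A rough std-only sketch of what "automatic" means here: once a future has been polled and is suspended, dropping it runs the destructors of whatever it was holding. `Never`, `Cleanup`, `task` and `cancel_by_drop` are hypothetical names for this example, and `Never` just stands in for a pending I/O operation.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Future that is never ready, standing in for I/O that has not completed.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

// Guard whose Drop records that cleanup ran.
struct Cleanup(Rc<Cell<bool>>);
impl Drop for Cleanup {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

async fn task(cleaned: Rc<Cell<bool>>) {
    let _guard = Cleanup(cleaned);
    Never.await; // the task stays suspended at this yield point
}

// Returns (was the task pending after one poll, did cleanup run after the drop).
fn cancel_by_drop() -> (bool, bool) {
    let cleaned = Rc::new(Cell::new(false));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(task(cleaned.clone()));
    let pending = fut.as_mut().poll(&mut cx).is_pending();
    drop(fut); // cancellation: the suspended task's locals are dropped
    (pending, cleaned.get())
}

fn main() {
    assert_eq!(cancel_by_drop(), (true, true));
}
```

Note that the cleanup here is synchronous; anything that itself needs to await is outside what this mechanism covers.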
> Async iteration.
Nicer syntax for Streams would be cool, but the combinators do a good job already, and StreamExt already has an API similar to Iterator's.
It'd be very nice to be able to use `impl` in more locations, representing a type which need not be known to the user but is constant. This is a common occurrence and may let us write code like `fn foo(f: impl Fn() -> impl Future)` or maybe even eventually syntax sugar like `fn foo(f: impl async Fn())` which would be ideal.
Re: Boxing
I find that a common technique needed to make abstractions around futures work is to Box::pin things regularly. This isn't always an issue, but it's frequent enough that it's annoying. Moreover, it's not strictly necessary given knowledge of the future type; it's again more a matter of Rust's minimal existential types.
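To make the complaint concrete, here is a small sketch of where the Box::pin tends to creep in: two async fns have two different anonymous types, so a function that may return either has to erase them behind `dyn Future`. The names `fetch`, `from_cache`, `from_network`, `BoxFuture` and the toy `block_on` are all invented for this example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (hypothetical): polls a single future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Type-erased, heap-allocated future.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

async fn from_cache() -> u32 {
    1
}

async fn from_network() -> u32 {
    2
}

// The branches have different concrete future types, so `impl Future`
// cannot be the return type here; each branch is boxed and erased instead.
fn fetch(cached: bool) -> BoxFuture<u32> {
    if cached {
        Box::pin(from_cache())
    } else {
        Box::pin(from_network())
    }
}

fn main() {
    assert_eq!(block_on(fetch(true)), 1);
    assert_eq!(block_on(fetch(false)), 2);
}
```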
Re: async drop and cancellation.
It's not always possible to have good guarantees about the cleanup of resources in async contexts. You can use abort, but that will just cause the next yield point to not return and then the Drops to run. So now you're reliant on Drops working. I usually build in a "kind" shutdown with a timer before aborting in light of this.
C# has a version of this with their CancellationTokens. They're possible to get wrong and it's easy to fail to cancel promptly, but by convention it's also easy to pass a cancellation request and let tasks do resource cleanup before dying.
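The token convention can be sketched in a few lines. This is a simplified illustration using a thread and an atomic flag rather than an async task, and `CancelToken` and `worker` are hypothetical names, not the C# API or anything from a Rust crate.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical minimal token: a shared flag that the worker checks cooperatively.
#[derive(Clone, Default)]
struct CancelToken(Arc<AtomicBool>);

impl CancelToken {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

// The worker checks the token between units of work, then cleans up on its own terms.
fn worker(token: CancelToken) -> &'static str {
    while !token.is_cancelled() {
        thread::yield_now(); // stand-in for one unit of work
    }
    // Resource cleanup would go here, before the task ends.
    "cleaned up"
}

fn main() {
    let token = CancelToken::default();
    let for_worker = token.clone();
    let handle = thread::spawn(move || worker(for_worker));
    token.cancel(); // a request, not a forced abort
    assert_eq!(handle.join().unwrap(), "cleaned up");
}
```

The failure mode mentioned above shows up directly: if the worker's unit of work is long and never checks the flag, cancellation is not prompt.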
Re: Async iteration
Nicer syntax is definitely the thing. Futures without async/await also could just be done with combinators, but at the same time it wasn't popular or easy until the syntax was in place. I think there's a lot of leverage in getting good syntax and exploring the space of streams more fully.
Golang supports running asynchronous code in defers, similar to Zig when it still had async.
Async-drop gets upgraded from a nice-to-have into an efficiency concern as the current scheme of "finish your cancellation in Drop" doesn't support borrowed memory in completion-based APIs like Windows IOCP, Linux io_uring, etc. You have to resort to managed/owned memory to make it work in safe Rust which adds unnecessary inefficiency. The other alternatives are blocking in Drop or some language feature to statically guarantee a Future isn't cancelled once started/initially polled.
Yeah, there's your misunderstanding, you've got it backwards. The problem being described occurs when I/O isn't happening because it isn't needed, there isn't a problem when I/O does need to happen.
Think of buffered reading of a file, maybe a small one that fully fits into the buffer, and reading it one byte at a time. Reading the first byte will block and go through epoll/io_uring/kqueue to fill the buffer and other tasks can run, but subsequent calls won't and they can return immediately without ever needing to touch the poller. Or maybe it's waiting on a channel in a loop, but the producer of that channel pushed more content onto it before the consumer was done so no blocking is needed.
You can solve this by never writing tasks that can take "a lot" of time, or "continue", whatever that means, but that's pretty inefficient in its own right. If my theoretical file reading task is explicitly yielding to the runtime on every byte by calling yield(), it is going to be very slow. You're not going to go through io_uring for every single byte of a file individually when running "while next_byte = async_read_next_byte(file) {}" code in any language if you have heap memory available to buffer it.
I assumed that users would issue reads of like megabytes at a time and usually receive less. Does the example of reading from a socket in the blog post presuppose a gigabyte-sized buffer? It sounds like a bigger problem with the program is the per-connection memory overhead in that case.
The proposal is obviously not to yield 1 million times before returning a 1 meg buffer or to call read(2) passing a buffer length of 1, is this trolling? The proposal is also not some imaginary pie-in-the-sky idea; it's currently trading millions of dollars of derivatives daily on a single thread.
>I'm not familiar with tokio so I did not know that it maintained buffers in userspace
Most async runtimes are going to do buffering on some level, for efficiency if nothing else. It's not strictly required but you've had an unusual experience if you've never seen buffering.
>filled them before the user called read()
Where did you get this idea? Since you seem to be quick to accuse others of it, this does seem like trolling. At the very least it's completely out of nowhere.
>it could still have read() yield and return the contents of the buffer.
If I call a read_one_byte, read_line, or read(N) method and it returns past the end of the requested content that would be a problem.
>I assumed that users would issue reads of like megabytes at a time and usually receive less.
Reading from a channel is the other easy example, if files were hard to follow. The channel read might be implemented as a quick atomic check to see if something is available and consume it, only yielding to the runtime if it needs to wait. If a producer on the other end is producing things faster than the consumer can consume them, the consuming task will never yield. You can implement a channel read method that always yields, but again, that'd be slow.
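A toy version of that, assuming a pre-filled queue stands in for a channel whose producer is always ahead of the consumer. `drain`, `YieldNow` and `run_counting` are hypothetical names; the executor only counts how often the task hands control back.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Yields to the executor exactly once, then completes.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// Consumer task: the queue always has data, so the "read" never has to wait.
async fn drain(mut queue: VecDeque<u8>, always_yield: bool) -> usize {
    let mut consumed = 0;
    while queue.pop_front().is_some() {
        if always_yield {
            YieldNow(false).await; // explicit trip back to the executor
        }
        consumed += 1;
    }
    consumed
}

// Runs a future to completion; returns its output and how often it returned Pending.
fn run_counting<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut yields = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return (value, yields),
            Poll::Pending => yields += 1,
        }
    }
}

fn main() {
    let queue: VecDeque<u8> = (0..100u8).collect();
    // Fast path only: the executor never regains control until the task is done.
    assert_eq!(run_counting(drain(queue.clone(), false)), (100, 0));
    // Always-yield variant: one round trip per item.
    assert_eq!(run_counting(drain(queue, true)), (100, 100));
}
```

With the fast path, any other task on the same thread would wait for all 100 items; with the always-yield variant it gets a turn after each item, at the cost of a scheduler round trip per item.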
>The proposal is obviously not to yield 1 million times before returning a 1 meg buffer, is this trolling
No, giving an illustrative example is not trolling, even if I kept the numbers simple to make it easy to follow. But your flailing about with the idea of requiring gigabyte sized buffers probably is.
>Rust embraces abstractions because Rust abstractions are zero-cost. So you can liberally create them and use them without paying a runtime cost.
>you never need to do a cost-benefit analysis in your head, abstractions are just always a good idea in Rust
Again though, and ignoring that, "zero-cost abstraction" can be very narrow and context specific, so you really don't need to go out of your way to find "costly" abstractions in Rust. As an example, if you have any uses of Rc that don't use weak references, then Rc is not zero-cost for those uses. This is rarely something to bother about, but rarely is not never, and it's going to be more common the more abstractions you roll yourself.
A is synchronously waiting on B which is awaiting C which could complete but never gets scheduled because A is holding onto the only thread. It's a very common situation when you mix sync and async and you're working in a single threaded context, like UI programming with async. Of course it can also cause starvation and deadlock in a multithreaded context as well but the single thread makes the pitfall obvious.
That's specifically why I called it a Leaky Abstraction in my first post on this: too many people are confusing a particular implementation of asynchronous function calls with the concept of asynchronous function calls.
I'm complaining about how the mainstream languages have implemented async function calls, and how poorly they have done so. Pointing out problems with their implementation doesn't make me rethink my position.
Besides JavaScript, it's also a common problem in C# when you force synchronous execution of an async Task. I'm fairly sure it's a problem in any language that would allow an async call to wait for a thread that could be waiting for it.
I really can't imagine how your proposed syntax could work unless the synchronous calls could be pre-empted, in which case, why even have async/await at all?
But I look forward to your implementation.
So does Rust. You can run async code inside `drop`.
> If you only want this functionality for rapid iteration/prototyping, which was what you originally said, then leaking memory in those circumstances is not such a problem.
There are use cases for wanting your language to be productive outside of prototyping, such as scripting (which I explicitly mentioned earlier in this thread[1] - omission here was not intentional), and quickly setting up tools (such as long-running web services) that don't need to be fast, but should not leak memory.
"Use Rust, but turn the borrow checker off" is inadequate.
Maybe you can read the linked post again? The problem in the example in the post is that data keeps coming from the network. If you were to strace the program, you would see it calling read(2) repeatedly. The runtime chooses to starve all other tasks as long as these reads return more than 0 bytes. This is obviously not the only option available.
I apologize for charitably assuming that you were correct in the rest of my reply and attempting to fill in the necessary circumstances which would have made you correct.
This is just mundane non-blocking sockets. If the socket never needs to block, it won't yield. Why go through epoll/uring unless it returns EWOULDBLOCK?
It's an implementation issue, because "running on only a single thread" is an artificial constraint imposed by the implementation. There is nothing in the concept of async functions, coroutines, etc. that has the constraint "must run on the same thread as the sync waiting call".
An "abstraction" isn't really one when it requires knowledge of a particular implementation. Async in JS, Rust, C#, etc. all require that the programmer knows how many threads are running at a given time (namely, you need to know that there is only one thread).
> But I look forward to your implementation.
Thank you :-)[1]. I actually am working (when I get the time, here and there) on a language for grug-brained developers like myself.
One implementation of "async without colored functions" I am considering is simply executing all async calls for a particular host thread on a separate dedicated thread that only ever schedules async functions for that host thread. This sidesteps your issue and makes colored functions pointless.
This is one possible way to sidestep the specific example deadlock you brought up. There's probably more.
[1] I'm working on a charitable interpretation of your words, i.e. you really would look forward to an implementation that sidesteps the issues I am whining about.
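A very rough sketch of the dedicated-thread idea, to show why the sync wait stops being a problem. It assumes plain closures as stand-ins for async calls and uses only std threads and channels; `Helper`, `start` and `call` are hypothetical names, not part of any existing runtime.

```rust
use std::sync::mpsc;
use std::thread;

type Job = Box<dyn FnOnce() + Send>;

// Hypothetical helper thread that runs every "async" call made by its host thread.
struct Helper {
    jobs: mpsc::Sender<Job>,
}

impl Helper {
    fn start() -> Self {
        let (jobs, queue) = mpsc::channel::<Job>();
        // The helper runs jobs until the Helper (and its sender) is dropped.
        thread::spawn(move || {
            for job in queue {
                job();
            }
        });
        Helper { jobs }
    }

    // Submit work and get back a handle the host can block on.
    fn call<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (result_tx, result_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller stopped waiting.
            let _ = result_tx.send(f());
        });
        self.jobs.send(job).expect("helper thread is alive");
        result_rx
    }
}

fn main() {
    let helper = Helper::start();
    let pending = helper.call(|| 6 * 7);
    // The host blocks synchronously, yet the call still makes progress,
    // because it never needed the host's thread in the first place.
    assert_eq!(pending.recv().unwrap(), 42);
}
```

This says nothing about continuations that must run on a specific thread (a UI thread, a thread with a bound GL context); that case would need something more than this sketch.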
I think the major disconnect is that I'm mostly familiar with UI and game programming. In these async discussions I see a lot of disregard for the use cases that async C# and JavaScript were built around. These languages have complex thread contexts so it's possible to run continuations on a UI thread or a specific native thread with a bound GL context that can communicate with the GPU.
I suppose supporting this use case is an implementation detail but I would suggest you dig into the challenge. I feel like this is a major friction point with using Go more widely, for example.
> But I'm glad you're at least on the same page now, about how checking if something is ready and yielding to the runtime are separate things.
I haven't ever said otherwise?
And this would be basically what you have to do in Go anyways - you need to explicitly use defer if you want code to run on destruction, with the caveat that in Go nothing stops you from forgetting to call it, whereas in Rust I can at least have a deterministic guard that panics if I forget to call the explicit destructor before the object goes out of scope.
BTW async drop is being worked on in Rust, so in the future this minor annoyance will be gone.
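The guard pattern looks roughly like this. It is a minimal sketch with a hypothetical `Connection` type; the real cleanup (which could be async or fallible) is elided, and the `panicking()` check is only there to avoid a double panic during unwinding.

```rust
// Hypothetical resource that must be closed explicitly.
struct Connection {
    closed: bool,
}

impl Connection {
    fn open() -> Self {
        Connection { closed: false }
    }

    // Explicit destructor: takes `self` by value, so the connection
    // cannot be used after it has been closed.
    fn close(mut self) {
        // Real cleanup would go here.
        self.closed = true;
        // `self` is dropped at the end of this function, with `closed` set.
    }
}

impl Drop for Connection {
    fn drop(&mut self) {
        // Fail loudly if the explicit destructor was skipped.
        if !self.closed && !std::thread::panicking() {
            panic!("Connection dropped without calling close()");
        }
    }
}

fn main() {
    // Correct use: no panic.
    Connection::open().close();

    // Forgetting close() panics as soon as the value goes out of scope.
    let forgot = std::panic::catch_unwind(|| {
        let _conn = Connection::open();
    });
    assert!(forgot.is_err());
}
```

The check happens at run time, not compile time, but it fires deterministically at the end of the scope instead of whenever a finalizer happens to run.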