Why you might want async in your project(notgull.net) |
I’ve worked deeply in an async rust codebase at a FAANG company. The vast majority chooses a dialect of async Rust which involves arcs, mutices, boxing etc everywhere, not to mention the giant dep tree of crates to do even menial things. The ones who try to use proper lifetimes etc are haunted by the compiler and give up after enough suffering.
Async was an extremely impressive demo that got partially accepted before knowing the implications. The remaining 10% turned out to be orders of magnitude more complex. (If you disagree, try to explain pin projection in simple terms.) The damage to the ecosystem from fragmentation is massive.
Look, maybe it was correct to skip green threads. But the layer of abstraction for async is too invasive. It would have been better to create a “runtime backend” contract - default would be the same as sync rust today (ie syscalls, threads, atomic ops etc – I mean it’s already half way there except it’s a bunch of conditional compilation for different targets). Then, alternative runtimes could have been built independently and plugged in without changing a line of code, it’d be all behind the scenes. We could have simple single-threaded concurrent runtimes for embedded and maybe wasm. Work stealing runtimes for web servers, and so on.
I’m not saying it would be easy or solve all use-cases on a short time scale with this approach. But I do believe it would have been possible, and better for both the runtime geeks and much better for the average user.
Pin projection is the process of getting a pinned reference to a struct's field from a pinned reference to the whole struct. Simple concept, but the APIs currently on offer for it (`unsafe` code or macro hackery) are very subpar.
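To make that concrete, here is a minimal std-only sketch of the `unsafe` flavour of pin projection. `CountPolls`, `project_inner`, `NoopWake` and `run_demo` are hypothetical names made up for illustration; in practice most people reach for a macro crate rather than writing this by hand.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper future that counts how often it has been polled.
struct CountPolls<F> {
    inner: F,   // treated as pinned whenever the whole struct is pinned
    polls: u32, // plain data, never needs pinning
}

impl<F> CountPolls<F> {
    /// The projection itself: Pin<&mut Self> -> Pin<&mut F>.
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut F> {
        // SAFETY: `inner` is never moved out of a pinned `CountPolls`,
        // and the struct has no Drop impl that could move it.
        unsafe { self.map_unchecked_mut(|this| &mut this.inner) }
    }
}

impl<F: Future> Future for CountPolls<F> {
    type Output = (F::Output, u32);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // SAFETY: only the counter is modified; `inner` stays where it is.
        unsafe { self.as_mut().get_unchecked_mut().polls += 1 };
        let polls = self.polls;
        self.project_inner().poll(cx).map(|out| (out, polls))
    }
}

/// Waker that does nothing, just enough to build a Context.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn run_demo() -> Poll<(i32, u32)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let fut = pin!(CountPolls { inner: std::future::ready(7), polls: 0 });
    let result = fut.poll(&mut cx);
    result
}

fn main() {
    println!("{:?}", run_demo());
}
```

The two `unsafe` blocks are the part people complain about: the compiler cannot check that `inner` is never moved, so the author has to uphold that by hand.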
My argument is more along the lines of: modularity is the (only) way to reduce complexity. We already have modular runtimes in other languages (project loom in Java, webassembly etc). Most people should not care about runtimes much. The ecosystem cost of async ended up being high. Thus, runtimes should be an implementation detail for most users.
Doesn’t mean Rome has to be rebuilt. Perhaps the async we have can be saved, but even so it involves biting the apple of actually defining precisely what a runtime is so that crate authors can think of them just like they think of allocators today (ie not at all).
I followed Rust in the very early days and definitely came away with the sense in this article. I would have said (and may have said to some people) that Graydon is really great, but that the exciting things about Rust weren't the things he liked or cared about; basically the expressivity and zero cost abstractions sections of this article.
But reading the article he linked about first class modules, I think that seems pretty good, and I think he's definitely right about making borrowing "second class" without explicit lifetimes (or at least discouraging them more so than the language does today), and about existential types (I'm always surprised I don't see these more in library APIs).
I also had no idea he wanted built in bignums. In pre-1.0 (and pre-cargo) rust, I created a very incomplete library for that, and would have loved to have it built in instead. Also yeah, decimal literals would be excellent.
But I didn't find the async vs. green threads section convincing. The green thread implementation wasn't a great fit at the time it existed, and I haven't seen anything since then that convinces me there was some great solution available to make it work better. Async isn't great in rust, but it's a much better fit, and I think it can be used well. I have hopes that best practices developing over time and maybe language features or changes can push people in a more sane direction of usage (once it becomes more clear what that should be).
Is the runtime something the compiler adds to the binary to make sure it is able to correctly interact with the system it is built for?
It seems like people argue that green threads require a runtime as if async doesn't? I don't understand the arguments on either side. In terms of what code looks like I far prefer being able to just declare green threads like golang does.
Honestly I wish I understood on a deep level, but I've been programming for 17+ years and the fact that I still don't implies to me that I never will.
> I don't understand the arguments on either side. In terms of what code looks like I far prefer being able to just declare green threads like golang does.
Under the hood `async` is sugar over a function such that the function returns a `Future<T>` instead of a `T`. What is done with that future is up to the caller.
In most cases this is handed off to a runtime (your choice of runtime, generally speaking) that will figure out how to execute it. You could also manually poll the future until it's complete, which does happen sometimes if you're manually implementing the Future trait.
If you have no async code you can simply avoid having an async runtime altogether, reducing the required runtime for an arbitrary program.
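As a rough illustration of "manually poll the future until it's complete", here is a std-only sketch. `NoopWake`, `YieldOnce`, `add_slowly` and `poll_to_completion` are made-up names, and the busy loop is a simplification: a real executor sleeps until the waker fires instead of spinning.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to construct a Context.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny future that returns Pending on the first poll and Ready on the second.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Calling this runs nothing; it only builds a value that implements Future.
async fn add_slowly(a: i32, b: i32) -> i32 {
    YieldOnce(false).await;
    a + b
}

/// Drive any future to completion by polling it in a loop.
fn poll_to_completion<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

fn main() {
    let fut = add_slowly(2, 3); // nothing has executed yet
    println!("{}", poll_to_completion(fut));
}
```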
> I far prefer being able to just declare green threads like golang does
This relies on an implicit runtime. That's fine - lots of Rust libraries that work the way you're suggesting will just assume a runtime exists.
That lets you write:
spawn(async {println!("hello from async");});
And, just like a goroutine, it will be scheduled for execution by the implicit runtime (or it will panic if that runtime is not there).
Note that this implicit runtime has to be there or you'll panic. This means that the reasonable behavior would be to always provide such a runtime, which would mean that even "sync" programs would need it. Or otherwise you'd need to somehow determine that no "async" code is ever actually called and statically remove it. That is a major reason why you wouldn't want this model in a language that tries to minimize its runtime.
> but I've been programming for 17+ years and the fact that I still don't implies to me that I never will.
I think it's just a matter of exposure. Try writing in more languages like C, C++, Rust, etc, and dig into these features.
Exactly. RAII works beautifully in regular Rust, so you create references with the static ownership rules and pass them around, before the value is dropped at a deterministic place. This is like the main value prop of Rust.
In async Rust OTOH (in fact regular threads as well) it’s much harder to use references when they normally would make sense. So instead of `&T` and `&mut T` you need `Arc<T>` and `Arc<Mutex<T>>`, respectively.
Then you lose both on performance (the initial blog post claimed that the pervasive arcing is worse than GC) but also UX. Arcs are much easier to leak, for instance.
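A small sketch of the substitution being described, using plain threads since the same `'static` pressure shows up there. The function names are hypothetical. The second function uses `std::thread::scope`, which lets threads borrow from the stack; tasks spawned onto a multi-threaded async runtime generally don't get that option.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// `thread::spawn` needs 'static data, so the shared value goes behind Arc<Mutex<_>>.
fn sum_with_arc_mutex(n_threads: usize) -> usize {
    let total = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n_threads)
        .map(|i| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += i;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

/// Scoped threads are joined before `scope` returns, so a plain borrow is enough.
fn sum_with_scoped_borrow(n_threads: usize) -> usize {
    let total = Mutex::new(0usize);
    thread::scope(|s| {
        for i in 0..n_threads {
            let total = &total; // ordinary reference, no Arc
            s.spawn(move || {
                *total.lock().unwrap() += i;
            });
        }
    });
    total.into_inner().unwrap()
}

fn main() {
    println!("{} {}", sum_with_arc_mutex(4), sum_with_scoped_borrow(4));
}
```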
You can use atomics if the data fits in a machine word. That's a lot faster than a full mutex.
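For instance, a shared counter can be an `AtomicUsize` rather than a `Mutex<usize>`. This is a minimal sketch with a hypothetical function name; it makes no claim about how large the speedup is on any given machine.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Each thread bumps the shared counter without ever taking a lock.
fn count_with_atomic(n_threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // All threads have been joined, so every increment is visible here.
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_with_atomic(4, 1000));
}
```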
And?
> not to mention the giant dep tree of crates to do even menial things.
Again, And?
I don't really care about having to pull in crates. That has always been how Rust does things - it prefers many small crates over fewer large crates. Async is no different.
And I don't care about Arc either. Writing `let x = blah()` is not much better than `let x = Arc::new(blah())`.
If you're talking about something else, like idk, maintaining mutability across multiple threads, yeah that's going to be more painful. It's also painful in most other languages and is generally avoided for that reason.
> The ones who try to use proper lifetimes etc are haunted by the compiler and give up after enough suffering.
You say "proper lifetimes" as if lifetimes are desirable. In async code they are not - your lifetime is often "arbitrary" and that's what an Arc gives you. The solution is, as mentioned, using an Arc or Box or Mutex.
> Async was an extremely impressive demo that got partially accepted before knowing the implications.
I think this is a totally ignorant characterization of async, which was years in the making, took lessons learned from decades of async in other languages, and was frankly led by some of the most knowledgeable people in regards to these sorts of systems.
> (If you disagree, try to explain pin projection in simple terms.)
The vast majority of people will never have to know what a pin projection is, let alone how it works. It rarely comes up, and virtually only if you're writing libraries. I could explain it but I see no reason to do so here (it is not complicated at all, `Pin` is probably the harder one to explain).
> The damage to the ecosystem from fragmentation is massive.
It's not even noticeable lol like, what? What fragmentation? I've never run into an issue of fragmentation and I've written 100s of thousands of lines of Rust.
> It would have been better to
How nice to sit on the sidelines and throw out a paragraph sized proposal. Everything looks great when you hand wave away the complexity of the problem space.
Async Rust isn't perfect (I frankly don't think there is a "perfect" solution, that should not be contentious I hope) and I welcome criticism, but your post is totally unconstructive and unsubstantial.
That being said I still have a deep dislike of having to read through nested Arc Mutexes or whatever to figure out what the code does in principle before I figure out what is going on in detail with the ownership.
I know there are no perfect solutions and there are trade-offs to be made, but I wish there was a way to have it more readable.
So instead of this:
let s = Arc::new(Mutex::new(Something::new("foo")));
something a bit like this:
let s = Something::new("bar").arc().mutex();
Thinking too much, and in particular going with overcomplicated solutions from the very start because you "might" need them, is just bad engineering.
Also, even if I do need async in a certain place, doesn't mean I need to endure the limitations and complexity of async Rust everywhere in my codebase. I can just spawn a single executor and pass around messages over channels to do what requires async in async runtime, and what doesn't in normal and simpler (and better) blocking IO Rust.
You need async IO? Great. I also need it sometimes. But that doesn't explain the fact that every single thing in Rust ecosystem nowadays is async-only, or at best blocking wrapper over async-only. Because "async is web-scale, and blocking is not web-scale".
Edit: Also the "just use smol" comically misses the problem. Yeah, smol might be simpler to use than tokio (it is, I like it better personally), but most stuff is based on tokio. It's an uphill battle for the same reasons using blocking IO Rust is becoming an uphill battle. Only thing better than using async when you don't want to is having to use 3 flavors (executors) of async, when you didn't want to use any in the first place.
Everything would be perfect and no one would complain about async all the time if the community defaulted to blocking, interoperable Rust, and then projects would pull in async in the few places that do actually need async. But nobody wants to write a library that isn't "web-scale" anymore, so tough luck.
It's not a problem with tokio either. The author's point is specifically about the multi-threaded tokio runtime that allows tasks to be moved between worker threads, which is why it requires the tasks to be Send + 'static. Alternatively you can either a) create a single-threaded tokio runtime instead which will remove the need for tasks to be Send, or b) use a LocalSet within the current worker that will scope all tasks to that LocalSet's lifetime so they will not need to be Send or 'static.
If you go the single-threaded tokio runtime route, that doesn't mean you're limited to one worker total. You can create your own pseudo-multi-threaded tokio runtime by creating multiple OS threads and running one single-threaded tokio runtime on each. This will be similar to the real multi-threaded tokio runtime except it doesn't support moving tasks between workers, which means it won't require the tasks to be Send. This is also what the author's smol example does. But note that allowing tasks to migrate between workers prevents hotspots, so there are pros and cons to both approaches.
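Here is a rough std-only sketch of the "one single-threaded executor per OS thread" idea. It is not tokio: `block_on`, `ThreadWaker`, `YieldOnce`, `work` and `run_per_thread` are all stand-ins invented for illustration. The point is only that a future holding an `Rc` (so not `Send`) is fine as long as it is created and polled on the same thread.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor. Note: no `Send` bound on `F`.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(), // sleep until woken
        }
    }
}

/// Returns Pending once so the task really suspends.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Holds an Rc across an await point, so this future is not Send.
async fn work(id: usize) -> usize {
    let local = Rc::new(Cell::new(id));
    YieldOnce(false).await;
    local.set(local.get() * 10);
    local.get()
}

/// One OS thread per worker, each running its own executor.
fn run_per_thread(n: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..n)
        .map(|id| thread::spawn(move || block_on(work(id))))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", run_per_thread(3));
}
```

Only `id` and the result cross the thread boundary; the future itself never does, which is why no `Send` bound is needed. The trade-off mentioned above still applies: nothing here rebalances work if one thread ends up with all the slow tasks.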
Looking back with like 30 years of hindsight it seems to me that Java’s greatest contribution to software reuse was efficient garbage collection; memory allocation is a global property of an application that can’t efficiently be localized as you might want a library to use a buffer it got from the client or vice versa and fighting with the borrow checker all the time to do that is just saying “i choose to not be able to develop applications above a certain level of complexity.”
In my experience (which, admittedly, is far less than the author, a developer of smol!) the answer to "I'm starting to do a lot of things at once" in Rust is usually to spin up a few worker threads and send messages between them to handle jobs, a la Ripgrep's beautiful implementation.
In a way, it seems like async Rust appears more often when you need to do io operations, and not so much when you just need to do work in parallel.
Of course, you surely can use async rust for work in parallel. But it's often easier to keep async out of it if you just need to split up some work across threads without bringing an entire async executor runtime into the mix.
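A generic sketch of that worker-threads-plus-messages shape using only `std::sync::mpsc`. This is not ripgrep's actual code; `square_all` and the squaring "job" are placeholders.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Fan jobs out to `n_workers` threads and collect results in input order.
fn square_all(inputs: Vec<u64>, n_workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<(usize, u64)>();
    let (res_tx, res_rx) = mpsc::channel::<(usize, u64)>();
    // std's Receiver is single-consumer, so workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let workers: Vec<_> = (0..n_workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // The lock is released at the end of this statement,
                // before the job itself runs.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((idx, x)) => res_tx.send((idx, x * x)).unwrap(),
                    Err(_) => break, // job channel closed: no more work
                }
            })
        })
        .collect();
    drop(res_tx); // only the workers hold result senders now

    let n = inputs.len();
    for (idx, x) in inputs.into_iter().enumerate() {
        job_tx.send((idx, x)).unwrap();
    }
    drop(job_tx); // signals the workers to exit once the queue drains

    let mut out = vec![0; n];
    for (idx, y) in res_rx {
        out[idx] = y;
    }
    for worker in workers {
        worker.join().unwrap();
    }
    out
}

fn main() {
    println!("{:?}", square_all(vec![1, 2, 3, 4], 2));
}
```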
I don't think async/await was poorly implemented in Rust - in fact, I think it avoids a lot of problems and pitfalls that could have happened. The complications arise because async/await is, kind of, ideologically antithetical to Rust's other goal of memory safety and single-writer. Rust really wants to have its cake (compile-time memory safety) and eat it too (async/await). And while you can criticize it, you have to admit they did a pretty good job given the circumstances.
Yep this makes sense to me.
If your workload is CPU-bound then context switching to make progress on 10 tasks concurrently, is going to be slower than doing them sequentially.
But if it’s IO-bound you will spend most of your time waiting, which you could use to make progress on the other tasks.
That's pretty simple. The primary goal of every software engineer is (or at least should be) ... no, not to learn a new cool technology, but to get the shit done. There are cases where async might be beneficial, but those cases are few and far between. In all other cases a simple thread model, or even a single thread works just fine without incurring extra mental overhead. As professionals we need to think not only if some technology is fun, but how much it actually costs our employer and about those who are going to maintain our "cool" code when we leave for better pastures. I know, I know, I sound like a grandpa (and I actually am).
If we presuppose that all software eventually develops its own async, and that we should therefore use async, would it not stand to reason that Greenspun's rule (all software eventually contains a Lisp) implies that we must also all use Lisp?
The implicit argument doesn't stand alone though. The author goes on to write:
> It happens like this: programs are naturally complicated. Even the simple, Unix-esque atomic programs can’t help but do two or three things at once. Okay, now you set it up so, instead of waiting on read or accept or whatnot, you register your file descriptors into poll and wait on that, then switching on the result of poll to figure out what you actually want to do.
The implication is clear. Even simple programs will eventually require async, and should therefore just use it right now. unix-esque in this paragraph is supposed to evoke ls or cat. Is your program really going to be simpler than cat? No? Then you apparently need async.
Maybe Rust isn’t a good tool for massively concurrent, userspace software - https://news.ycombinator.com/item?id=37435515 - Sept 2023 (567 comments)
Now we have async/await and I'm always happy to see it.
I use it in C# and JS with no friction or mental overhead required.
In C#, I can still use channels or threads if I want to as well. But async/await is great for any I/O heavy code.
> Eventually, two or three sockets becomes a hundred, or even an unlimited amount. Guess it’s time to bring in epoll! Or, if you want to be cross-platform, it’s now time to write a wrapper around that, kqueue and, if you’re brave, IOCP.
This feels like a straw man. Nobody is saying "don't use async; use epoll!". The alternative to async is traditional OS threads. This option is weirdly not mentioned in the article at all.
And yes they have a reputation for being very hard - and they can be - but Rust makes traditional multithreading MUCH easier than in C++. And I would argue that Rust's async is equally hard.
Rust makes traditional threading way easier than other languages, and traditional async way harder than other languages, enough that threads are arguably simpler.
In some ways it's worse because you have to explicitly add them, and I have yet to see any Rust APIs that actually use them (though there is a `cancellation` crate so at least some must be).
In other ways it's better because it gives you control and explicit visibility over the cancellation points.
Yet tokio is the de facto standard and everything links against it. It’s really annoying. Rust should have either put a runtime in the standard library or made it a lot easier to be runtime neutral.
* EDIT: corrected, thanks
EDIT: I originally incorrectly claimed that stjepang also created rather than maintained crossbeam, making the same mistake as I was correcting.
The first transformation is that every async block / fn compiles to a generator where `future.await` is essentially replaced by `loop { match future.poll() { Ready(value) => break value, Pending => yield } }`. ie either polling the inner future will resolve immediately, or it will return Pending and yield the generator, and the next time the generator is resumed it will go back to the start of the loop to poll the future again.
The second transformation is that every generator compiles to essentially an enum. Every variant of the enum represents one region of code between two `yield`s, and the data of that variant is all the local variables that are in scope in that region.
Putting both together:
async fn foo(i: i32, j: i32) -> i32 {
    sleep(5).await;
    i + j
}
... essentially compiles to:
fn foo(i: i32, j: i32) -> FooFuture {
    FooFuture::Step0 { i, j }
}
enum FooFuture {
    Step0 { i: i32, j: i32 },
    Step1 { i: i32, j: i32, sleep: SleepFuture },
    Step2,
}
impl Future for FooFuture {
    // Simplified: the real signature is
    // `poll(self: Pin<&mut Self>, cx: &mut Context<'_>)`.
    fn poll(self) -> Poll<i32> {
        loop {
            match self {
                // Nothing has run yet: start the sleep and move to the next state.
                Self::Step0 { i, j } => {
                    let sleep = sleep(5);
                    self = Self::Step1 { i, j, sleep };
                }
                // Suspended at the `.await`: poll the inner future.
                Self::Step1 { i, j, sleep } => {
                    let () = match sleep.poll() {
                        Poll::Ready(()) => (),
                        Poll::Pending => return Poll::Pending,
                    };
                    self = Self::Step2;
                    return Poll::Ready(i + j);
                }
                Self::Step2 => panic!("already run to completion"),
            }
        }
    }
}
"There is a common sentiment I’ve seen over and over in the Rust community that I think is ignorant at best and harmful at worst."
just refuses to read the rest? If you are actually trying to make a point to people that think differently than you, why antagonize them by telling them they don't know what they are talking about?
This is only partly true -- if you want to `spawn` a task on another thread then yes it has to be Send and 'static. But if you use `spawn_local`, it spawns on the same thread, and it doesn't have to be Send (still has to be 'static).
How could I unleash all the processors on my computer on this workload and allow them to correctly avoid repeated calculation of results of shared subtasks?
For example, I’m using an outbox: im::OrdMap<String, Array2<_>> and a situation might arise where one task could avoid repeating work on a subtask because that’s already in progress elsewhere by waiting for the key/value pair (so that process could do something else)
Would it be worth going to async for that?
How could a worker function know if some key in the outbox was already being calculated and it could work on something else?
How would you share an outbox like that across a bunch of rayon processes communicating with async?
(I’ll read smol docs and try to figure it out but this article made a lot of sense, thank you)
The orchestrator can then keep track of what's going on and avoid duplicating tasks and the workers don't need to worry about any global state.
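A rough sketch of what such an orchestrator could look like with plain threads and channels. Everything here is an assumption for illustration: `orchestrate`, the `expensive` stand-in (which just returns the key length), and a `HashMap<String, usize>` outbox in place of the `im::OrdMap<String, Array2<_>>` from the question.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc;
use std::thread;

/// Stand-in for the real subtask computation.
fn expensive(key: &str) -> usize {
    key.len()
}

/// Returns the filled outbox and how many subtasks were actually dispatched.
fn orchestrate(requests: &[&str], n_workers: usize) -> (HashMap<String, usize>, usize) {
    assert!(n_workers > 0, "need at least one worker");
    let (res_tx, res_rx) = mpsc::channel::<(String, usize)>();

    // One job channel per worker; workers hold no shared state.
    let mut job_txs = Vec::new();
    let mut workers = Vec::new();
    for _ in 0..n_workers {
        let (tx, rx) = mpsc::channel::<String>();
        let res_tx = res_tx.clone();
        job_txs.push(tx);
        workers.push(thread::spawn(move || {
            for key in rx {
                let value = expensive(&key);
                res_tx.send((key, value)).unwrap();
            }
        }));
    }
    drop(res_tx);

    // Only the orchestrator knows which keys are already in flight.
    let mut seen = HashSet::new();
    for &key in requests {
        if seen.insert(key) {
            job_txs[seen.len() % n_workers].send(key.to_string()).unwrap();
        }
    }
    drop(job_txs); // closing the job channels lets the workers exit

    let outbox: HashMap<String, usize> = res_rx.into_iter().collect();
    for worker in workers {
        worker.join().unwrap();
    }
    (outbox, seen.len())
}

fn main() {
    let (outbox, dispatched) = orchestrate(&["a", "bb", "a", "ccc", "bb"], 3);
    println!("{} results from {} dispatched subtasks", outbox.len(), dispatched);
}
```

No async is involved here; whether it would be worth adding depends on whether workers need to pick up other work while waiting on a key someone else is computing, which this sketch sidesteps by never handing out a duplicate.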
I’m tired of everyone implementing async on their own.
There is simplicity in avoiding that and having code that gets compiled to something that is straightforward and single threaded.
This means that any crate that uses IO will be bound to a limited number of Runtimes. Everything being Tokio-only is pretty bad (though Tokio itself is great), but here we are...
[0] https://github.com/bluejekyll/trust-dns/pull/1373#issuecomme...
It's more like "I want to be able to put timeouts in my code". 99% of why I want async is so that if something takes too long I can just stop that. That is incredibly hard to do without async.
Now that’s just plainly untrue.
Static lifetimes are also a large part of the rest of Rust's safety features (like statically enforced thread-safety).
A usable Rust-without-lifetimes would end up looking a lot more like Haskell than Go.
RAII works only for the simplest case: when your cleanup takes no parameters, when the cleanup doesn't perform async operations, etc. Rust has RAII but it's unusable in async because the drop method isn't itself async (and thus may block the whole thread if it does I/O)
Thus the complexity of handling memory is greater than that of other resources and the consequences of getting it not 100% right are frequently worse.
Has anyone built a collector that tracks multiple types of resources an object might consume? It seems possible.
The problem is that things like closing a socket are not just generic resources, a lot of the time nonmemory stuff has to be closed at a certain point in the program, for correctness, and you can't just let GC get to it whenever.
Why is that a "problem with GC"?
Abstracting away >90% of resource management (i.e. local memory) is a significant benefit.
It's like saying the "problem with timesharing OS" is that it doesn't address 100% of concurrency/parallelism needs.
I disagree that this criticism applies to Rust. For 99% of the cases, the idiomatic combination of borrow checking, Box and Arc gets back to a unified, global, compiler-enforced convention. I agree that there's a non-trivial initial skill hurdle, one that I also struggled with, but you only have to climb that once. I don't see that there's a limit to program complexity with these mechanisms.
Lol wut. The C++ resource management paradigm is RAII. If you write a library that doesn't use RAII, it's a bad library. Not a fault of the language.
widely used though. not sure if that counts as appreciation, but i think it's one of the highest forms.
it's not bad, not great either. i miss proper sum types, and i really lament the fact that static things are nearly impossible to mock, which prompts everyone to use DI for everything instead of static.
Your argument is looking at the advantages Java brought to development speed and entirely disregarding runtime speed
It's hip to hate on java, but at least do it from an informed position.
Java is extremely fast, which is why it's so popular for server code where performance matters.
That's not "fun", that's table stakes.
It’s been a long time since I did this in Rust. But why do you not have access to the sockets or at least a set_timeout method? Is it a higher level lib that omits such crucial features?
In Go, the super common net.Conn interface has deadline methods. Not everyone knows their importance but generally you have something like it piped through to the higher layers.
EDIT: Oh I see you replied to my other comment. Please disregard.
Async isn't a lark, it's a workhorse. The goal is not to write sexy code, it's to achieve better utilization (which is to say, save money).
But nobody will sell you just a CPU cycle. They come in bundles of varying size.
I recently heard a successful argument that we should take the pod that's 99% unutilized and double its CPU capacity so it can be 99.9% unutilized, that way we don't get paged when the data size spikes.
When I proposed we flatten those spikes since they're only 100ms wide it was shot down because "implementing a queueing architecture" wasn't worth the developer time.
I suppose you could call it a queueing architecture. I'd call it a for loop.
Exercising judgement about when to use or shirk an abstraction is a lot of what being a software engineer is about.
It adds complexity, but it's at the level where you don't have to think about it. If you're doing something advanced enough to where async is a leaky abstraction, you're probably doing something big enough to where you would want the advantages it offers.
If you're doing something simple, async is just a black box primitive that is pretty easy to use.
Furthermore, async rust can be run single threaded
> When you declare a path operation function with normal def instead of async def, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server).
https://fastapi.tiangolo.com/async/#path-operation-functions
OP either meant this, or its variation, such as async_to_sync and sync_to_async. https://github.com/django/asgiref/blob/main/asgiref/sync.py
Ofc this is a python example. I have no idea how it works in different languages.
When writing a Future that will block for 5 seconds you will need to find somewhere that you can put the code to block for 5 seconds. You don't technically need to even use an executor here.
There's no implication. Read what you quoted instead of digging for quick jabs. "Even the simple, Unix-esque atomic programs can’t help but do two or three things at once. Okay, now you set it up so, instead of waiting on read or accept or whatnot..."
>unix-esque in this paragraph is supposed to evoke ls or cat. Is your program really going to be simpler than cat? No? Then you apparently need async.
cat and ls don't do two or three things at once.
The cancellation crate hasn't been touched since 2016 and requires the running thread have a mechanism to be woken up. If you're in the middle of a read, you won't observe a wakeup unless you use an async-io function that can be timed out or interrupted.
This is no better than async/await, and `await` marks the cancellation points just as obviously.
That being said, there are also numerous crates for async Rust cancellation tokens that can be polled in parallel with a read, such that you can observe the cancel instantly and switch to a cleanup process instead of immediately cancelling everything.
This is where you have a reasonable trade off. I have accepted that async gives me more control over my code. For that I have to accept that blocking can slow down the app. After running async rust in production for over 2 years now I've not seen any blocking tasks block the executor. Maybe I'm just good but my experience is that my colleagues who come from C# generally don't make these mistakes either
Connection pooling is of course not without its hazards: scaling databases can be very difficult, and almost all of the production incidents I've dealt with involve a database running out of a resource (often connections). But for your garden variety web app, it certainly isn't a dichotomy between serializing all concurrent updates or losing atomicity.
This is extremely difficult. I mentioned elsewhere that the only way to kill a thread is through the pthread_cancel API, which is very dangerous.
Languages with larger runtimes can get around this because they have all sorts of things like global locks or implicit cooperative yielding. So they don't ever have to "kill" a thread, they actually just join it when it yields.
What if my work isn't related to socket timeouts? For example, downloading a multi-part file? I may want a very long socket timeout but still have a distinct timeout for the individual chunked operations. It might take 30 seconds to grab an entire file but if one chunk takes >3 seconds I may want to time out.
What if my work involves no IO at all?
The ability to cancel work in progress is extremely important to me.
Yeah that’s an issue. In Go they sync it (thread safe), which in Rust would translate to interior mutability.
> What if I want to do per-request timeouts?
Ah you’re right in http2 there can be multiple concurrent reqs per conn. Go still allows request based timeouts, but I wonder if that’s possible with the limited primitives in std. It’s also true that this is a case where the inner conn should not be exposed.
> I may want a very long socket timeout but still have a distinct timeout for the individual chunked operations.
Right! That’s typically done by extending the deadline for every chunk. Ie the user/caller needs a way to set timeouts.
> The ability to cancel work in progress is extremely important to me.
Yes for sure. I was just curious. Btw which libs are you referring to for network requests? I’d like to see their APIs.
Consider this: https://docs.rs/rusoto_s3/latest/rusoto_s3/trait.S3.html
You're given a trait and, understandably, this trait does not expose any sockets to you (it may not even be backed by sockets).
Also, if your cleanup takes parameters, you can just store them in the struct.
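A small sketch of that: the guard below carries everything its cleanup needs as fields, and `Drop` uses them. `TempRegistration` and the shared log are hypothetical, and the cleanup here is synchronous, so it does not address the async-drop concern.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Guard whose cleanup needs two "parameters": a name and somewhere to report to.
struct TempRegistration {
    name: String,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for TempRegistration {
    fn drop(&mut self) {
        // The stored fields play the role of cleanup arguments.
        self.log.borrow_mut().push(format!("unregister {}", self.name));
    }
}

fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = TempRegistration {
            name: "worker-1".to_string(),
            log: Rc::clone(&log),
        };
        log.borrow_mut().push("working".to_string());
    } // `_guard` is dropped here, at a deterministic point
    let entries = log.borrow().clone();
    entries
}

fn main() {
    println!("{:?}", demo());
}
```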
When dealing with async operations they tend to end up at a network boundary and thus the service enters distributed system land.
Now the async drop also has to handle the server crashing before the drop happens and at any time when it happens. Keeping that in mind trying to actually drop something becomes quite meaningless since you need to handle all other cases either way.
Personally I love Rust’s `fn foo(self, …)`, which is just like a regular method but consumes the value.
Deallocate by default is fine, but sometimes you need to run specific destructors (linear type style). I’ve long wished for an opt-out from implicit drop semantics for resource/handle types.
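To sketch the consuming-method pattern: `close(self)` takes the handle by value, so it can report an error to the caller and the handle cannot be used afterwards. `Handle` and its fields are invented for illustration. Since Rust has no opt-out from implicit drop, the `Drop` impl here can only warn when `close()` was skipped.

```rust
/// Hypothetical resource handle with a fallible cleanup step.
struct Handle {
    name: String,
    open: bool,
    fail_on_close: bool, // lets the demo simulate a failing flush
}

impl Handle {
    fn new(name: &str, fail_on_close: bool) -> Self {
        Handle { name: name.to_string(), open: true, fail_on_close }
    }

    /// Consumes the handle; the caller gets to see cleanup errors.
    fn close(mut self) -> Result<(), String> {
        self.open = false;
        if self.fail_on_close {
            Err(format!("flush failed for {}", self.name))
        } else {
            Ok(())
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Drop cannot return an error, so all it can do is complain.
        if self.open {
            eprintln!("warning: {} dropped without close()", self.name);
        }
    }
}

fn main() {
    let handle = Handle::new("a", false);
    println!("{:?}", handle.close());
    // `handle` has been moved; using it again would not compile.
    let _forgotten = Handle::new("b", false); // triggers the warning on drop
}
```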
[0]: https://play.rust-lang.org/?version=stable&mode=debug&editio...
Originally, they used pure reference counting GC, with finalizers used to clean up when freed. This was "fine", since RC is deterministic. Everything is freed when the last reference is deleted, nice and simple.
But reference counting can't detect reference cycles, so eventually they added a secondary tracing garbage collector to handle them. But tracing GC isn't deterministic anymore, so this also meant a shift to manual resource management.
That turned out to be embarrassing enough that context managers were eventually introduced to paper over it. But all four mechanisms still exist and "work" in the language today.
And that's just for IO. I mentioned elsewhere that you may want to cancel pure compute work.
You can see my point, I assume, that when your userspace program can cancel tasks natively it's much easier to work with?
> what's the difference between "framework" and async/await runtime,
Sure, in that in both cases you have the threads managed for you. But there's a difference between spawning a raw pthread, which will have no signal handlers/ cleanup hooks, and one managed by a framework where it can add all of those things and more.
Yes, the problem is that your thread would continue to perform work even if you stopped waiting on it.
To gracefully shut a thread down you need yielding of some kind.
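A minimal sketch of what "yielding of some kind" can look like with plain threads: the worker checks a shared flag between steps, and the caller sets it when its timeout expires. `run_with_deadline` and its parameters are hypothetical. A worker stuck inside a single long blocking call would never reach the check, which is the limitation being discussed.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Ok(steps) if the work finished in time, Err(steps_done) if it was cancelled.
fn run_with_deadline(total_steps: u64, step: Duration, timeout: Duration) -> Result<u64, u64> {
    let cancel = Arc::new(AtomicBool::new(false));
    let (done_tx, done_rx) = mpsc::channel::<()>();

    let worker = {
        let cancel = Arc::clone(&cancel);
        thread::spawn(move || {
            for done in 0..total_steps {
                // Cooperative cancellation point, checked once per step.
                if cancel.load(Ordering::Relaxed) {
                    return Err(done);
                }
                thread::sleep(step); // stand-in for one unit of real work
            }
            let _ = done_tx.send(());
            Ok(total_steps)
        })
    };

    // Without the flag, giving up here would leave the thread running.
    if done_rx.recv_timeout(timeout).is_err() {
        cancel.store(true, Ordering::Relaxed);
    }
    worker.join().unwrap()
}

fn main() {
    let fast = run_with_deadline(3, Duration::from_millis(1), Duration::from_secs(5));
    let slow = run_with_deadline(u64::MAX, Duration::from_millis(1), Duration::from_millis(50));
    println!("{:?} {:?}", fast, slow.is_err());
}
```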
Is this a new development? Last time I checked, every library seemed to be tied to a specific runtime (usually tokio).
Otherwise they both schedule a task to be executed with the implicit runtime - in Rust that may be tokio or something else, in Go that would be the Go runtime.
> they both schedule a task to be executed with the implicit runtime
To be pedantic, in rust the runtime is referenced with a global or thread-local variable, but it’s still explicit. This means crate authors can’t spawn tasks without depending on a runtime… unless there’s been recent developments.
https://docs.python.org/3/library/asyncio-task.html#asyncio....
For Python in general, no. For example, as far as I know Jython reuses the JVM's GC (and its unreliable finalizers with it).
It's also easy to introduce accidental cycles. For one, a traceback includes a reference to every frame on the call stack, so storing that somewhere on the stack would create an unintentional cycle!
Now I use close() methods for anything that needs to be closed. If I mess up and there's some obscure bug, hopefully GC will fix it, but it seems too brittle and easy to make mistakes with to rely on.
Also, it does not matter much anymore, the whole std-lib is full of exceptions-to-implement-multiple-return-values.
sealed interface Shape {}
record Square(int x) implements Shape {}
record Rectangle(int l, int w) implements Shape {}
record Circle(int r) implements Shape {}

double getArea(Shape s) {
    // Exhaustively checks for all alternatives.
    return switch (s) {
        case Square(var x) -> x * x;
        case Rectangle(var l, var w) -> l * w;
        case Circle(var r) -> Math.PI * r * r;
    };
}
This is a good article: https://mccue.dev/pages/11-1-21-smuggling-checked-exceptions
RAII is one method of cleanup but it doesn’t work in all situations. One that comes to mind is detecting errors in cleanup and passing them to the caller.
So it’s not right to call every library that doesn’t use RAII “bad.” There are other constraints, as well. Part of the strength of C++ is to give you a choice of paradigms.
Either you write code with good performance, which means that functions do take references and pointers sometimes, in which case you do have all of the usual lifetime issues. This is the proper way to use C++, and it's perfectly workable, but it's by no means automatic. That's the reality that my comment was referencing.
Or you live in a fantasy land where RAII solves everything, which leads to code where everything is copied all the time. I've lived in a codebase like this. It's the mindset that famously caused Chrome to allocate 25K individual strings for every key press: https://groups.google.com/a/chromium.org/g/chromium-dev/c/EU...
> strings being passed as char* (using c_str()) and then converted back to string
> Using a temporary set [...] only to call find on it to return true/false
> Not reserving space in a vector
c_str() isn't there for "good performance" to begin with; it's there for interfacing with C APIs. RAII or not, GC or not, you don't convert to/from C strings in C++ unless you have to.
The other items above have nothing to do with C++ or pointers; you'd get the same slowdowns in any language.
The language has come a long way since 2014. Notice what they said the solutions are:
> base::StringPiece [...]
a.k.a., C++17's std::string_view.
My argument was that for efficient code, you need to pass references or pointers, which means you do need to care about lifetimes.
And your argument is that's not true because we now have std::string_view? You do realize that it's just a pointer and a length, right? And that this means you need to consider how long the string_view is valid etc., just as carefully as you would for any other pointer?
> but it’s still explicit.
I guess it just comes down to your definition of explicit. There's a dependency, but from a caller's perspective it's implicit. It doesn't matter though, I think the point is clear enough.
I don't see anybody claiming this. The parent I see you initially replied to said "the C++ resource management paradigm is RAII", not "all lifetime issues are solved by RAII".
> My argument was that for efficient code, you need to pass references or pointers, which means you do need to care about lifetimes.
Of course you do. Nobody claimed you don't need to care about lifetimes. (Even in a GC'd language you still need to worry about not keeping objects alive for too long. See [1] for an example. It's just not a memory safety issue, is all.) The question was whether "every library or API you use" needs to have "a different cleanup convention" for performance reasons as you claimed, for which you cited the Chromium std::string incident as an example. What I was trying to point out was:
> that's not true because we now have std::string_view? You do realize that it's just a pointer and a length, right?
...because it's not merely a pointer and a length. It's both of those bundled into a single object (making it possible to drop them in place of a std::string much more easily), and a bunch of handy methods that obviate the ergonomic motivations for converting them back into std::string objects, hence preventing these issues. (Again, notice this isn't just me claiming this. The very link you yourself pointed to was pointing to StringPiece as the solution, not as the problem.)
So what you have left is just 0 conventions for cleanup, 1 convention for passing read-only views (string_view), 1 convention for passing read-write views (span), and 1 convention for passing ownership (the container). No need to deal with the myriad old C-style conventions like "don't forget to call free()", "keep calling with a larger buffer", "free this with delete[]", or whatever was there over a decade ago.
> And that this means you need to consider how long the string_view is valid etc., just as carefully as you would for any other pointer?
Again, nobody claimed you don't have to worry about lifetimes.
[1] https://nolanlawson.com/2020/02/19/fixing-memory-leaks-in-we...
OTOH, Rust's async model is based on polling. That means poll must never block; if no data is available, it has to register a wake callback instead. So there is no way to interrupt a rogue task, and all async functions have to rely on callbacks to wake them (welcome to Windows 3.1, only inside out!). The thread model is much more lax in this sense: even though my web server (akka-http) is based on futures, nothing prevents me from blocking inside my future, and in most cases I can get away with it. As I understand it, that's not possible in the Rust async model: I can only use non-blocking async functions inside an async function. So in reality you don't interrupt or clean up anything in Rust when a timeout happens, you simply abandon execution (i.e. stop polling). I wonder what happens with resources if there were allocated.
You can block, you're just going to block all other futures that are executed on that same underlying thread. But all sorts of things block, for loops block.
This is the same as Java, I believe. Akka also has special actors called IO Workers that are designed for blocking work - Rust has the same thing with `spawn_blocking`, which will place the work onto a dedicated threadpool.
> So in reality you don't interrupt or clean up anything in Rust when a timeout happens
You don't interrupt, you are yielded to. It's cooperative.
> I wonder what happens with resources if there were allocated.
When a Future is dropped its state is dropped as well, so they are all freed.
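A small std-only sketch of that claim, for anyone who wants to see it happen. `Resource`, `NoopWake` and `abandon_pending_future` are names invented for this example, and the future is polled by hand instead of by a real runtime; the point is only that dropping a suspended future runs the destructors of whatever it was holding.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical resource that records when its destructor runs.
struct Resource(Arc<AtomicBool>);

impl Drop for Resource {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once, then drops it while it is still pending.
/// Returns (released while suspended, released after the drop).
fn abandon_pending_future() -> (bool, bool) {
    let released = Arc::new(AtomicBool::new(false));
    let resource = Resource(Arc::clone(&released));

    let mut fut: Pin<Box<dyn Future<Output = ()>>> = Box::pin(async move {
        let _held = resource; // lives in the future's state across the await
        std::future::pending::<()>().await; // never completes
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Pending));

    let before = released.load(Ordering::SeqCst);
    drop(fut); // "stop polling": the state machine and everything in it is dropped
    let after = released.load(Ordering::SeqCst);
    (before, after)
}

fn main() {
    let (before, after) = abandon_pending_future();
    println!("released while suspended: {before}, released after drop: {after}");
}
```

This only covers cleanup that fits in a synchronous `Drop`; cleanup that itself needs to await something is a separate problem.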
Having said that, Erlang exists and is doing well, so async is as good as any model designed for special cases. But this discussion basically answers the question
> Why don’t people like async?
Because not everybody (which means a majority of developers) needs this complexity. And the upward poisoning means that I can't block in my function if my web server is based on async, which affects everybody who is using it.
This is the case in every language.
> So no, if you want to use async you shouldn't block.
Everything blocks. The dosage makes the poison.
> For loops? Nope, not if they take long time for the same reason
You would want to add a yield in your loop, yes. Async loops such as `while let Some(msg) = stream.next().await` work well for this, since every iteration passes through an await point.
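For a plain compute loop, adding a yield could look roughly like this. It is a sketch with only the standard library: `YieldNow` is a hand-written stand-in for what a runtime would normally provide (tokio, for example, has `tokio::task::yield_now`), and `run_counting_polls` is a toy executor that drives a single future and counts polls. `chunk` is assumed to be non-zero.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that returns Pending exactly once, then Ready.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask to be polled again right away, then hand control back.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Waker that does nothing; the toy executor below re-polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Sums 0..n, yielding to the executor every `chunk` iterations.
async fn sum_with_yields(n: u64, chunk: u64) -> u64 {
    let mut total = 0;
    for i in 0..n {
        total += i;
        if (i + 1) % chunk == 0 {
            YieldNow { yielded: false }.await;
        }
    }
    total
}

/// Toy single-future executor: polls until ready and counts the polls.
fn run_counting_polls<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
        // A real executor would run other ready tasks here.
    }
}

fn main() {
    let (total, polls) = run_counting_polls(sum_with_yields(10, 3));
    println!("total = {total}, polls = {polls}");
}
```

Each Pending return is a point where a real executor could run other tasks or drop this one, which is what makes a long loop cancellable.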
> And the upward poisoning means that I can't block in my function if my web server is based on async, which affects everybody who is using it.
To be clear, you can definitely block as much as you want in those frameworks, you just need to understand that you'll block the entire thread that's running your various futures. That's not that big of a deal, you'd have the exact same issue with a synchronous framework. Blocking in an OS thread still blocks all work on that thread, of course.