> Unfortunately, Tokio is notoriously difficult to learn due to its sophisticated abstractions.
IMO, this is largely due to the current state of the docs (which are going to be rewritten as soon as some API changes land).
The docs were written at a point where we were still trying to figure out how to present Tokio, and they ended up focusing on the wrong things.
The Tokio docs currently focus on a very high-level concept (`Service`), which is an RPC-like abstraction similar to Finagle.
The problem is that Tokio also includes a novel runtime model and future system, and the docs don't spend any time explaining this.
The next iteration of Tokio's docs is going to focus entirely on the "tokio-core" level, which is the reactor, runtime model, and TCP streams.
tl;dr, I think the main reason people have trouble learning Tokio is that the current docs are terrible.
> Aren’t abstractions supposed to make things easier to learn?
Tokio's goal is to provide abstractions that are as ergonomic as possible without adding runtime overhead. Tokio will never be as "easy" as high-level runtimes simply because we don't accept the overhead that comes with them.
The abstractions are also structured to help you avoid a lot of errors that tend to be introduced in asynchronous applications. For example, Tokio doesn't add any implicit buffering anywhere. A lot of other async libraries hide difficult details by adding unlimited buffering layers.
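To make the buffering point concrete, here is a minimal sketch using only the standard library. It is not Tokio's API, and `fill_bounded` is a made-up name for illustration. The idea is that a bounded channel tells the producer as soon as the buffer is full, where an unbounded one would just keep growing.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Tries to push `items` values into a channel that holds at most `capacity`
/// of them while nobody is reading. Returns (accepted, rejected).
fn fill_bounded(capacity: usize, items: usize) -> (usize, usize) {
    // `_rx` is kept alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    let mut rejected = 0;
    for i in 0..items {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // The buffer is full: the producer finds out right away and has
            // to decide what to do (wait, drop, shed load, ...).
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    (accepted, rejected)
}

fn main() {
    let (accepted, rejected) = fill_bounded(2, 5);
    println!("accepted {accepted}, rejected {rejected}");
}
```

With an unbounded `channel()` every send would succeed and the memory cost would only show up later, which is the kind of hidden detail the comment is warning about.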
My takeaway from working with Tokio is that it's a fairly low-level abstraction and doesn't do much to address the challenges of building networked _applications_. And this is OK.
We'll need higher-level layers that use Tokio, however, to address more specific use cases. I'll point to the nascent tower-grpc[1] library as something in this direction. I hope to see more things like this fall out of our work on Conduit[2].
Aren’t abstractions supposed to make things easier to learn? Something about the idea of “complex abstractions” seems wrong.
(Edit: this is not a criticism of Tokio, it’s a criticism of the OP’s characterization of “sophisticated abstractions” which IMO should reduce complexity)
Not always. Some abstractions are designed to make it easier to solve hard problems correctly than it would be without the abstraction.
For example, consider Rust's memory model. Many people criticize that model as difficult to learn. By comparison, you might argue that C's memory model is simpler to learn. Yet, the C approach to allocating, using, and freeing memory is highly error-prone. C programs historically have frequently had mistakes such as use-after-free errors, or buffer under/overflow/reuse errors. The high-profile OpenSSL Heartbleed vulnerability was an example of a weakness in C's memory model and memory handling abstractions [1].
Rust's memory model may be more difficult to learn than C's, but once learned, it provides an advantage in building correct software by ruling out certain classes of mistakes. (GC in languages like C#, Java, and Go can also prevent these mistakes, but comes with a runtime cost. Rust aims to provide zero-cost abstractions.)
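As a rough illustration of "ruling out a class of mistakes": Heartbleed was an over-read where the length claimed in a request was trusted. The sketch below is not how OpenSSL is structured, and `echo` is a hypothetical function, but it shows how a bounds-checked slice access turns that kind of request into an explicit `None` instead of a read past the buffer.

```rust
/// Returns the first `claimed_len` bytes of `payload`, or `None` if the
/// caller claims more bytes than were actually received.
fn echo(payload: &[u8], claimed_len: usize) -> Option<&[u8]> {
    // `get` checks the range against the slice length before reading.
    payload.get(..claimed_len)
}

fn main() {
    println!("{:?}", echo(b"bird", 4)); // honest request
    println!("{:?}", echo(b"bird", 500)); // over-long request is refused
}
```

Indexing with `payload[..claimed_len]` would also be checked, but it would panic rather than hand the decision back to the caller.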
Building correct async IO programs using kernel abstractions is difficult for reasons similar to why it's difficult to write correct programs with C's memory model. It's especially difficult if you want the async IO program to be portable across multiple OSes/kernels. I have not used Tokio, but I would guess that its Rust-powered abstractions will make it difficult or impossible to leak memory or sockets, or to fail to handle error cases that might arise when handling async IO.
[1] https://www.seancassidy.me/diagnosis-of-the-openssl-heartble...
vs Rust where I bash my head against it for 2 days then give up. I'm not smart enough for Rust, oh well.
Writing memory safe code without Rust is harder than using Rust’s abstractions to do the same task. If you agree with that then my comment stands.
Of course, schedulers are just complicated. Most of the time you don't think about how complicated your scheduler is, because it's either an OS primitive in the kernel or a language primitive in your language's runtime. But since Tokio is a library, and modular, it gets criticism for being complex that in my opinion is unfair.
[] To be more concrete: a future is essentially a state machine representing the stack state at any yield point; it can't (currently) contain lightweight references into itself because they'd be invalidated when the future is moved around. This means using borrowing in futures programs is often infeasible today. Solutions are in the works.
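A toy version of the "future as a state machine" idea, written by hand. This is a sketch and not the futures crate's actual trait: `Step`, `AddTwo`, and `run` are invented for illustration, and the "input" argument stands in for whatever event the future is waiting on. Each enum variant holds exactly the values that would be live on the stack at that yield point.

```rust
/// Hypothetical stand-in for a poll result; not the futures crate's type.
enum Step<T> {
    Pending,
    Done(T),
}

/// A hand-written "future" that adds two numbers, yielding until each arrives.
enum AddTwo {
    WaitingForFirst,
    WaitingForSecond { first: u32 },
    Finished,
}

impl AddTwo {
    /// Advances the state machine; `input` is `None` when nothing is ready yet.
    fn poll(&mut self, input: Option<u32>) -> Step<u32> {
        // Take the current state out so it can be matched by value.
        match (std::mem::replace(self, AddTwo::Finished), input) {
            (AddTwo::WaitingForFirst, Some(first)) => {
                *self = AddTwo::WaitingForSecond { first };
                Step::Pending
            }
            (AddTwo::WaitingForSecond { first }, Some(second)) => Step::Done(first + second),
            (state, None) => {
                *self = state; // nothing arrived, stay where we were
                Step::Pending
            }
            (AddTwo::Finished, Some(_)) => panic!("polled after completion"),
        }
    }
}

/// Feeds a sequence of events to the state machine and returns its result, if any.
fn run(inputs: &[Option<u32>]) -> Option<u32> {
    let mut fut = AddTwo::WaitingForFirst;
    for &input in inputs {
        if let Step::Done(sum) = fut.poll(input) {
            return Some(sum);
        }
    }
    None
}

fn main() {
    println!("{:?}", run(&[None, Some(2), None, Some(40)]));
}
```

Because the state holds plain values, moving `AddTwo` around is harmless. If `WaitingForSecond` held a reference to another field of the same enum, a move would leave that reference dangling, which is the restriction described above.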
If Tokio’s abstractions are seemingly increasing complexity, maybe they aren’t sophisticated abstractions.
This is a criticism of the OP, not Tokio.
Tokio doesn't subscribe to the Larry Wall philosophy of making the easy things easy and the hard things possible. It seems more focused on making the hard things as easy as possible without much regard for the easy things.
* Before anyone attacks this...yes, you can accomplish simple tasks in Tokio, but it requires learning a lot more concepts than should be necessary to accomplish that simple task.
I think the quote should be read this way: if you've only ever seen blocking IO, and have never seen async IO or deferred computation, you have to learn some things first, but this isn't unique to Tokio in particular.
Does it mean it only uses a single thread for IO notifications?
If yes, the performance won’t be exceptionally great, especially on servers with many CPU cores and fast network interfaces.
The underlying OS APIs (epoll, kqueue, and IOCP) do support multithreaded asynchronous IO, so that’s not some platform limitation.
Generally speaking, how to optimize concurrency for a network based application is pretty use case specific.
tl;dr, you can fully take advantage of many core systems w/ Tokio.
Does that library support such a use case? Or does it imply a 1-to-many relation between reactors and files/sockets? The latter doesn’t scale well.
This approach in Rust, as well as Rust's approach to memory management and other things, allows it to run without a runtime, which allows it to work for anything C does.
https://gist.github.com/tevino/3a4f4ec4ea9d0ca66d4f
This is a description (from dotGo2017) of how netpoll works in Go under the hood, if anyone is interested:
Rust had green threading in the initial phases but this was voted out.
And of course there is a huge amount of abstraction to build this up. It's just all hidden in the language runtime, so people can say with a straight face that there is "no abstraction and no cruft."
So...
> High I/O systems usually spawn multiple reactors, e.g. one reactor per CPU core.
Depends on what you are calling multiple reactors. If you mean a loop that responds to events and runs tasks, then yes. For example, you can plug in [this](http://github.com/carllerche/futures-pool) as the task executor and get a multi-threaded, work-stealing scheduler.
Or, maybe you are talking about OS level selectors (epoll), in which case you are going to run up against OS limitations.
Yes, about them, but I’m not sure which OS limitations you mean.
For Windows, it works fine from the very first NT 3.51 version of IOCP.
On Linux it indeed didn’t work in the very first version of epoll, but they fixed that by adding the EPOLLONESHOT flag and, more recently, the EPOLLEXCLUSIVE flag for accept(), which makes it possible to scale IO readiness notifications properly across multiple CPU cores.
However, I personally advise against it as I have found that deferring to the OS for scheduling results in poor thread affinity (your state gets bounced around threads unnecessarily).
You generally get better throughput by either running multiple fully isolated reactors (the Seastar approach) or running a single reactor which only flags tasks as ready to be executed, and then using a work-stealing thread pool to do the actual execution.
Both of these structures are easy to achieve with Tokio.
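A rough sketch of the second structure (one reactor that only flags tasks as ready, plus a pool that runs them), using only std threads and channels. None of this is Tokio's API: `run_ready_tasks` is a made-up name, the "work" is just doubling a number, and a shared locked queue stands in for a real work-stealing pool.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// One "reactor" thread marks task ids as ready; `workers` threads run them.
/// Returns the sum of all results so the outcome is deterministic.
fn run_ready_tasks(task_ids: Vec<u64>, workers: usize) -> u64 {
    let (ready_tx, ready_rx) = mpsc::channel::<u64>();
    let ready_rx = Arc::new(Mutex::new(ready_rx));
    let (done_tx, done_rx) = mpsc::channel::<u64>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let ready_rx = Arc::clone(&ready_rx);
            let done_tx = done_tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while taking the next ready task.
                let next = ready_rx.lock().unwrap().recv();
                match next {
                    Ok(id) => done_tx.send(id * 2).unwrap(), // the "work"
                    Err(_) => break, // reactor hung up, nothing left to run
                }
            })
        })
        .collect();
    drop(done_tx); // only the workers hold result senders now

    // The reactor: a real one would be driven by epoll/kqueue/IOCP events.
    let reactor = thread::spawn(move || {
        for id in task_ids {
            ready_tx.send(id).unwrap();
        }
    });

    reactor.join().unwrap();
    for handle in handles {
        handle.join().unwrap();
    }
    done_rx.iter().sum()
}

fn main() {
    println!("{}", run_ready_tasks(vec![1, 2, 3], 4));
}
```

The fully isolated variant would instead give each thread its own queue and its own set of sockets, with nothing shared between them.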
> Currently, only one thread can be dispatching a given event_base at a time. If you want to run events in multiple threads at once, you can either have a single event_base whose events add work to a work queue, or you can create multiple event_base objects.
Early Rust, pre-1.0, did have a runtime and a green-threads mechanism; they ripped it out because they recognized that they couldn't go everywhere they wanted to go if they kept it. And if they hadn't done that, I believe Rust would have been far less successful than it has been.
You'd have to write the lowest level of it in a language that wasn't Go. That language could be C, or Rust, or a pseudo-Go that didn't have many of the features people expect (including garbage collection and goroutines).
Over the last few years I have implemented dozens of async/event-driven networking libraries, and I wanted to do the same in Rust about 4 years ago. I guess I even started to work on one of the first async IO libraries for it (https://github.com/Matthias247/revbio). However, I also got frustrated with it quite quickly, since the ownership model makes the typical solutions for asynchronous programming super hard (e.g. callbacks or callback interfaces). Therefore I gave up on that (and also temporarily on the language).
However, some years and some Rust evolution later, my viewpoint on this has changed a bit:
- Asynchronous I/O is generally messy and leaves a lot of room for errors. E.g. each callback that is involved might cause reentrancy problems or invalidations, object lifetimes might not be well-defined or might not match expectations, etc. While most languages still allow it, it's hard to get fully right, especially when manual memory management is involved. Async I/O plus multithreading is mostly a recipe for disaster.
- Rust just puts these facts directly in our face, and wants you to go the extra mile to prove that things are working correctly. It's far from easy to figure out how to do this in a sane way. I think the Tokio authors did an awesome job of finding some primitive abstractions with the poll model that allow for async I/O and use Rust's type system for safety guarantees. It's a little bit awkward to use without syntactic sugar like async/await or coroutines, but I think that is in the nature of the problem.
- Trying to do async IO probably shows off the domain which is the most inconvenient in Rust (besides similar problems like object-oriented and callback-driven UI frameworks). Therefore it shouldn't be used as a general measure of how easy or complex Rust is to use.
I'm not the only one to make such comments. They made sure to try and address this in v2 of the book. It still sucks.
In the rare case where the lifetime check is too expensive you're free to turn the types into pointers and use an unsafe block, which gives you the exact same constraints as you have in C++.
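For anyone who hasn't seen it, dropping down to raw pointers looks roughly like the sketch below (`sum_via_raw_pointer` is just an illustrative name). Inside the `unsafe` block the compiler no longer checks the accesses, so the argument that they stay in bounds is on you, the same as in C++.

```rust
/// Sums a slice by walking it with a raw pointer instead of borrowed references.
fn sum_via_raw_pointer(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut ptr: *const i32 = values.as_ptr();
    for _ in 0..values.len() {
        // SAFETY: `ptr` starts at the first element and is advanced at most
        // `values.len()` times, so every read stays inside the slice and the
        // final value is at most one past the end, which is never read.
        unsafe {
            total += *ptr;
            ptr = ptr.add(1);
        }
    }
    total
}

fn main() {
    println!("{}", sum_via_raw_pointer(&[1, 2, 3]));
}
```

In practice `values.iter().sum()` does the same thing safely, so this is only worth it when the borrow checker really is in the way.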
Some problems are worth the time to solve them.
I find myself thinking more about data ownership, allocations, and type structure. C++ is more about testing correctness about memory ownership and trying to break existing assumptions as I code. Different skillsets entirely!
You may be able to produce C++ faster, but I’d choose maintaining a Rust codebase any day. Memory models require a lot of energy to maintain correctly, and Rust does the heavy lifting for you.
That said, when you need it, Rust does the right thing, both in keeping you from blowing off your foot and in making cross-platform development a breeze.
Rust simply makes memory-correctness something the compiler can check. Correct C++ programs are the same as correct Rust programs; the compiler simply isn’t enforcing it.
and the latter half of that statement is where things get problematic. Proving it to the compiler has repeatedly been too hard.
the real problem is telling when it's impossible, e.g. in recursive data structures, like linked lists and trees. that takes practice.
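For the simple cases, the shape that does go through the compiler is "each node owns the next one". A minimal sketch, with `Node`, `from_slice`, and `sum` as illustrative names:

```rust
/// A singly linked list where each node owns the rest of the list.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

/// Builds a list from a slice, preserving the order of the elements.
fn from_slice(values: &[i32]) -> Option<Box<Node>> {
    let mut head = None;
    // Build back to front so each new node can take ownership of the tail.
    for &value in values.iter().rev() {
        head = Some(Box::new(Node { value, next: head }));
    }
    head
}

/// Walks the list through shared references and adds up the values.
fn sum(mut cursor: &Option<Box<Node>>) -> i32 {
    let mut total = 0;
    while let Some(node) = cursor {
        total += node.value;
        cursor = &node.next;
    }
    total
}

fn main() {
    let list = from_slice(&[1, 2, 3]);
    println!("{}", sum(&list));
}
```

The hard part starts once a node needs to point back at its parent or at a sibling, because then no single owner exists and this layout no longer fits.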
fixing that felt good. the preceding 24 hours, not so much.
Data races are particularly nasty because any print statements or debug tracing can trigger a memory fence/reorder and make the problem disappear.
Nothing more frustrating than adding a printf only to see the issue no longer manifest.
However, for example, on the average desktop or mobile app, it's not clear to me (yet!) that it is worth the pain of writing in Rust.
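For comparison, here is a sketch of the shared-counter pattern in safe Rust (`count_in_parallel` is an illustrative name). The unsynchronized version, where every thread mutates a plain integer, is rejected at compile time, so you are pushed toward something like a `Mutex` or an atomic up front.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` threads that each add 1 to a shared counter `increments` times.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The lock guard is released at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```

This doesn't rule out every concurrency bug (deadlocks and logic races are still possible), only the unsynchronized memory access kind.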