Rewriting Rust (josephg.com)
I don't think any language helps verify that, and even in the ones that require it by spec, it's unclear whether it's happening. Maybe you didn't really write a tail-recursive function because of a helper that you expected to be inlined. I guess it's easy to notice if you try to blow the stack in a unit test though.
Yeah, it seems like a pretty easy feature to add. The compiler can pretty easily calculate the maximum stack size for every (bounded) call stack. It seems super useful to compute & expose that information - especially for embedded devices.
This sounds bad, but I wonder how many features have taken this long to include in other languages. Is this really as out of step as it sounds?
There is the move-fast-break-things mentality, but is that how you want to design a language?
Seems like we are missing some middle ground step, where there are good features, maybe even done, and stable, but they aren't getting worked into the main language.
Maybe a decision making problem.
I still wish the Python core team had abandoned the Python 3 experiment and gone with Python 2.x for life, warts and all. I learned to work with the warts, including the Unicode ones. I think a lot of us did.
Is frustration with Rust on the rise? I just started using Rust a few months ago and absolutely love it. I can't tell what's going on with the Rust Foundation, so I can only judge by reading sentiment. Nothing would kill my vibe harder than knowing smart people think the language isn't doing great :(
I am not sure what the OP is using, but with LSP I do get the error message in my editor (nvim) before any compiling (though I am pretty sure some checking is happening in the background).
> Compile-time Capabilities
Not sure how this makes any sense when Rust compiles to multiple targets. Should all libraries become aware of all the "capabilities" out there? Also, this can already be implemented using features while keeping things minimal.
> Comptime
I can't make sense of what the OP's issue is here.
> Make if-let expressions support logical AND. Its so simple, so obvious, and so useful. This should work: if let Some(x) = some_var && some_expr { }
The example makes no sense.
Great article apart from that.
The traditional idea "compiled language" usually means a language designed for mostly batch compilation -> the compiler is not a part of the (potential) execution runtime. "compile time" and "run time" are not the same. In Lisp it is allowed to be the same.
[1]: https://gavinhoward.com/2024/05/what-rust-got-wrong-on-forma...
javascript:(function(){var newSS, styles='* { background: white ! important; color: black !important } :link, :link * { color: #0000EE%20!important%20}%20:visited,%20:visited%20*%20{%20color:%20#551A8B%20!important%20}';%20if(document.createStyleSheet)%20{%20document.createStyleSheet("javascript:'"+styles+"'");%20}%20else%20{%20newSS=document.createElement('link');%20newSS.rel='stylesheet';%20newSS.href='data:text/css,'+escape(styles);%20document.getElementsByTagName("head")[0].appendChild(newSS);%20}%20})();
Someone just has to do it.
My wishlist:
* allow const fns in traits
* allow the usage of traits in const exprs. This would allow things like using iterators and From impls in const exprs, which right now is a huge limitation.
* allow defining associated type defaults in traits. This can already be worked around using macros somewhat effectively (see my supertrait crate) but real support would be preferable.
* allow eager expanding of proc macro and attribute macro input, perhaps by opting in with something like `#[proc_macro::expand(tokens)]` on the macro definition. Several core "macros" already take advantage of eager expansion, we peasants simply aren't allowed to write that sort of thing. As a side note, eager expansion is already possible for proc and attribute macros designed to work _within proc macro crates_, for example this which I believe is the first time this behavior was seen in the wild: https://github.com/paritytech/substrate/blob/0cbea5805e0f4ed...
* give build.rs full access to the arguments that were passed to cargo for the current build. Right now we can't even tell if it is a `cargo build` or a `cargo doc` or a `cargo test` and this ruins all sorts of opportunities to do useful things with build scripts
* we really need a `[doc-dependencies]` section in `Cargo.toml`
* give proc macros reliable access to the span / module / path of the macro invocation. Right now there are all sorts of projects that hack around this anyway by attempting to locate the invocation in the file system which is a terrible pattern.
* allow creating custom inner attributes. Right now core has plenty of inner attributes like `#![cfg(..)]` etc, and we have the syntax to define these, we simply aren't allowed to use custom attribute macros in that position
* revamp how proc macro crates are defined: remove the `lib.proc-macro = true` restriction, allowing any crate to export proc macros. Facilitate this by adding a `[proc-macro-dependencies]` section to `Cargo.toml` that separately handles proc-macro-specific dependencies. Proc macros themselves would have access to regular `[dependencies]` as well as `[proc-macro-dependencies]`, allowing proc macro crates to optionally export their parsing logic in case other proc macro crates wish to use this logic. This would also unblock allowing the use of the `$crate` keyword within proc macro expansions, solving the age old problem of "how do I make my proc macro reliably refer to a path from my crate when it is used downstream?"
* change macro_rules such that `#[macro_export]` exports the macro as an item at the current path so we can escape from this ridiculousness. Still allow the old "it exports from the root of the current crate" behavior, just deprecate it.
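To make the first two wishlist items concrete, here is a minimal sketch, assuming a stable compiler from around the time of this thread. A `for` loop goes through the `Iterator` trait, so it is rejected inside a `const fn`, and the usual workaround is an index-based `while` loop. The names `sum_squares` and `TOTAL` are hypothetical, made up for illustration.

```rust
// Hypothetical helper: sums the squares of a slice at compile time.
// `for v in values` would call `Iterator::next`, a trait method, which
// stable const evaluation did not accept, so we index manually instead.
const fn sum_squares(values: &[u32]) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < values.len() {
        total += values[i] * values[i];
        i += 1;
    }
    total
}

// Evaluated entirely at compile time.
const TOTAL: u32 = sum_squares(&[1, 2, 3]);

fn main() {
    println!("{TOTAL}");
}
```

With const trait support, the body could presumably be written with iterator adapters instead of manual indexing.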
And there's a lot of things that are weird or clunky
I honestly don't "get" the "no classes, just struct methods" thing. Sure, C++ is kinda like that, but the ergonomics are weird. I'd much rather have class/method declarations like most languages do.
Lifetimes are good but the implementation is meh. Most cases could do with a default lifetime.
Copy/borrow strictness is good to think about, but in most cases we don't care? Copy should probably be the default, and then you borrow in special cases.
That phone couldn't even send MMS.... You had to jailbreak it to be able to do normal stuff that the phones could do for ages back then.
Languages like C++ and Python are wildly successful, and I don't think anyone would call them perfect.
The dependency point is valid, but I'm not sure it is easily solvable in general. It doesn't seem like a Rust issue. See npm and Python's pip: blind trust is par for the course except in very rigorous environments.
Very popular lang that is actually very nicely designed and has very good ecosystem (compilers, tools like package manager, std lib)
I know I used to crush hard on Python and also got worried when there were dissonances within the Python Foundation. But as you progress, I assume the goings-on in certain language communities will take a back-seat to thinking deeply about how to solve the problems you are professionally tasked with. At least that's my experience.
As for Rust: It's gonna be around for a while. For the past months, I've been hearing a lot of chatter about how companies are using Rust for the first time in production settings and how their developers love it.
A lot of the complaints I see are not super well thought through. For example, a lot of people complain about async being too explicit (having a different "color" than non-async functions), but don't consider what the ramifications of having implicit await points actually are.
Even in this otherwise fine article, some of those desired Fn traits are not decidable (halting problem). There's a bit of a need to manage expectations.
There are definitely legitimate things to be desired from the language. I would love a `Move` trait, for example, which would ostensibly be much easier to deal with than the `Pin` API. I would love specialization to land in some form or another. I would love Macros 2.0 to land, although I don't think the proc-macro situation is as bad as the author presents it.
The current big thing that is happening in the compiler is the new trait solver[0], which should solve multiple problems with the current solver, both cases where it is too conservative, and cases where it contains soundness bugs (though very difficult to accidentally trigger). This has been multiple years in the making, and as I understand it, has taken up a lot of the team's bandwidth.
I personally like to follow the progress and status of the compiler on https://releases.rs/. There's a lot of good stuff that happens each release, still.
[0]: https://rustc-dev-guide.rust-lang.org/solve/trait-solving.ht...
To which many sensible people respond “I don’t want to think about monads either, but is the pain point really that bad?”
arg: impl Iterator<Item: Debug>
It would probably just be TS.
Some are things that will never be stable, because they're not a feature; as an example, https://github.com/rust-lang/rust/issues/90418
Yeah. This is someone who's frustrated that he doesn't wake up to headlines that read "Hey babe, new Rust feature just dropped".
If that's what he's looking for, he should probably switch to the Javascript ecosystem.
Smart people will always do that, I've found it's better to ignore the chatter and focus on your own experience.
https://github.com/rust-lang/cargo/issues/2644
It's a clusterfuck of people misdirecting the discussion, the maintainers completely missing the point, and in the end it's still not even been allowed to start.
Cargo can download-only, but it can't build only dependencies. If you, for whatever reason (ignoring the misleading Docker examples), want to build your dependencies separately from your main project build, you are SOL unless you want to use a third-party dependency to do so.
I actually have quite an opposite view: I think the Rust core team is 100% correct to make it very hard to add new "features" to the PL, in order to prevent the "language surface" from being bloated, inconsistent and unpredictable.
I've seen this happen before: I started out as a Swift fan, even though I had been working with Objective-C++ for years, considered it an awesome powerhouse, and did not really need a new PL for anything in particular in the world of iOS development. With time, Swift's insistence on introducing tons of new language "features" such as multiple, redundant function names, e.g., "isMultiple(of:)", multiple rules for parsing curly braces et al. to make the SwiftUI declarative paradigm possible, multiple rules for reference and value types and mutability thereof, multiple shorthand notations such as argument names inside closures, etc. - all that made me just dump Swift altogether. I would have to focus on Swift development exclusively just to keep up, which I was not willing to do.
Good ideas are "dime a dozen". Please keep Rust as lean as possible.
For example, you can write functions which return an impl Trait. And structs can contain arbitrary fields. But you can't write a struct which contains a value returned via impl Trait - because you can't name the type.
Or, I can write if a && b. And I can write if let Some(x) = x. But I can't combine those features together to write if let Some(x) = x && b.
I want things like this to be fixed. Do I want rust to be "bigger"? I mean, measured by the number of lines in the compiler, probably yeah? But measured from the point of view of "how complex is rust to learn and use", feature holes make the language more complex. Fixing these problems would make the language simpler to learn and simpler to use, because developers don't have to remember as much stuff. You can just program the obvious way.
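For readers who haven't hit these two holes, here is a rough sketch of the workarounds one typically ends up with on a compiler or edition without `if let` chains. All names (`evens`, `Holder`, `describe`) are hypothetical and only there for illustration.

```rust
// Hole 1: a struct field cannot name the type hidden behind `impl Trait`,
// so the struct has to take a generic parameter instead.
fn evens() -> impl Iterator<Item = u32> {
    (0u32..).step_by(2)
}

struct Holder<I: Iterator<Item = u32>> {
    inner: I,
}

impl<I: Iterator<Item = u32>> Holder<I> {
    fn next_value(&mut self) -> Option<u32> {
        self.inner.next()
    }
}

// Hole 2: without `if let Some(x) = value && verbose`, the extra condition
// ends up nested inside the `if let`.
fn describe(value: Option<i32>, verbose: bool) -> &'static str {
    if let Some(x) = value {
        if verbose && x > 0 {
            return "positive (verbose)";
        }
    }
    "other"
}

fn main() {
    let mut holder = Holder { inner: evens() };
    println!("{:?} {:?}", holder.next_value(), holder.next_value());
    println!("{}", describe(Some(3), true));
}
```

Neither workaround is terrible, but both are things you have to know about rather than things that follow from the features you already learned.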
Pin didn't take much work to implement in the standard library. But it's not a "lean" feature. It takes a massive cognitive burden to use - to say nothing of how complex code that uses it becomes. I'd rather have clean, simple, easy-to-read Rust code and a complex borrow checker than a simple compiler and a hard-to-use language.
> Pin didn't take much work to implement in the standard library. But its not a "lean" feature. It takes a massive cognitive burden to use - to say nothing of how complex code that uses it becomes. I'd rather clean, simple, easy to read rust code and a complex borrow checker than a simple compiler and a horrible language.
Your commentary on Pin in this post is even more sophomoric than the rest of it and mostly either wrong or off the point. I find this quite frustrating, especially since I wrote detailed posts explaining Pin and its development just a few months ago.
https://without.boats/blog/pin/ https://without.boats/blog/pinned-places/
You should have a look at Scala 3. Not saying that I'm perfectly happy with the direction of the language - but Scala really got those foundations right and made it so that it has few features, but they are very powerful and can be combined very well.
Rust took a lot of inspiration from Scala for a reason - but then Rust wants to achieve zero-cost abstraction and do high-performance, so it has to make compromises accordingly for good reasons. Some of those compromises affect the ergonomics of the language unfortunately.
I'll give an example - async traits. On the surface it seems fairly simple to add? I can say async fn, but for the longest time I couldn't say async fn inside a trait? It took years of work to solve all the thorny issues blocking this in a stable, backwards compatible way and finally ship it [1]. There is still more work to be done but the good news is that they're making good progress here!
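As a small illustration of what eventually shipped, here is a sketch of an `async fn` declared directly in a trait, which per the post linked as [1] works on stable since Rust 1.75. The `Greeter` trait, the `English` type, and the hand-rolled `block_on` with a no-op waker are all hypothetical stand-ins; real code would normally use an async runtime instead.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical trait: `async fn` is written directly in the trait.
trait Greeter {
    async fn greet(&self, name: &str) -> String;
}

struct English;

impl Greeter for English {
    async fn greet(&self, name: &str) -> String {
        format!("Hello, {name}!")
    }
}

// A waker that does nothing, enough to poll futures that never suspend.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable entry ignores the data pointer, so null is fine.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Minimal busy-polling executor, for illustration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    println!("{}", block_on(English.greet("world")));
}
```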
You pointed out one feature that Rust in Linux needs (no panics), but there are several more [2]. That list looks vast, because it is. It represents years of work completed and several more years of work in the Rust and Rust for Linux projects. It might seem reasonable to ask why we can't have it right now, but like Linus said recently "getting kernel Rust up to production levels will happen, but it will take years". [3] He also pointed out that the project to build Linux with clang took 10 years, so slow progress shouldn't discourage folks. The important thing is that the Rust project maintainers have publicly committed to working on it right now - "For 2024H2 we will work to close the largest gaps that block support (for adopting Rust in the kernel)". [4]
You dream of a language that could make bold breaking changes and mention Python 2.7 in passing. The Python 2/3 split was immensely painful and widely considered to be a mistake, even among the people who had advocated for it. The Rust project has a better mechanism for small, opt-in, breaking changes - the Edition system. That has worked well for the last 9 years and has led to tremendous adoption - more than doubling every year [5]. IMO there's no reason to fix what isn't broken.
I guess what I'm saying is, patience is the key here. Each release might not bring much because it only represents 6 weeks of work, but the cumulative effect of a year's worth of changes is pretty fantastic. Keep the faith.
[1] - https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-trait...
[2] - https://github.com/Rust-for-Linux/linux/issues/2
[3] - https://lwn.net/SubscriberLink/991062/b0df468b40b21f5d/
[4] - https://blog.rust-lang.org/2024/08/12/Project-goals.html
[5] - https://lib.rs/stats
If you want an edge over the people who are writing RFCs, don't write an RFC. Write a complete, production-ready implementation of your idea, with documentation and test cases, which can be cleanly merged into the tree.
Please by all means provide an implementation, but do write the RFC first. (Or in some cases smaller processes, such as the ACP process for a small standard-library addition.) Otherwise you may end up wasting a lot of effort, or having to rewrite the implementation. We are unlikely to accept a large feature, or even a medium feature, directly from a PR without an RFC.
I'd argue that this makes them pretty useless: if you just want a value that you can use like any other, then you can define a function that returns it and be done with it. Now we have another way to do it, and in theory it could do more, but that RFC has been stale for several years, nobody seems to be working on it, and I believe it's not even in nightly.
If the support would actually be good, we could just get rid of all the support crates we have in cryptography libraries (like the generic_array and typenum crates).
That said, I agree that the Rust team should be careful about adding features.
What is this way? I have been fighting with this problem for quite some time recently.
Alternatively: Rust is already the Wagyu of somewhat-mainstream PLs, don't keep adding fat until it's inedible.
Good ideas are rare and precious by definition.
At its core it's a pretty simple app. It watches for file changes, and re-runs the compiler. The implementation is less than 1000 lines of code. But what happens if I vendor the dependencies? It turns out, the deps add up to almost 4 million lines of Rust code, spread across 8000+ files. For a simple file-watcher.
I'm not excited about Rust because of cool features, I'm excited because it's a whole new CLASS of language (memory safe, no GC, production ready). Actually getting it into the places that matter is way more interesting to me than making it a better language. That's easier to achieve if people are comfortable that the project is being steered with a degree of caution.
Originally Rust was written in OCaml, but eventually it got rewritten in Rust
E.g. coroutines are stuck because they have some corner cases that are quite hard to resolve correctly, i.e. in the compiler there isn't a full implementation you could "just turn on", but an incomplete implementation which works okay for many cases but which you really can't turn on in stable. (At least this was the case last time I checked.) Similarly, function traits have been explicitly decided not to be stabilized as they are, for various technical reasons but also because they would change once you involve future features (like async coroutines). Sure, the part about return values not being associated types is mostly for backward compatibility, but in nearly all situations it's also just a small ergonomics drawback.
And sure, there are some backward-compatibility-related designs which people would have loved to do differently if they had had more time and resources at the point the decision was made. But most of them date from the very early Rust times, when the team was still much smaller and there were fewer resources for evaluating important decisions.
And sure, having a break which changes a bunch of older decisions, now that different choices can be made and people are more experienced, would be nice. But after how catastrophically badly Python 2 -> Python 3 went, and similar experiences in other languages, many people agree that having some rough corners is probably better than making a Rust 2.0. (And many of these things can't be done through Rust editions!)
In general, if you follow the Rust weekly newsletter, you can see that decisions on RFC acceptance, including for stabilization, are handled every week.
And sure, sometimes (quite too often) things take too long, but people/coordination/limited-time problems are often harder to solve than technical problems.
And sure, some old features are stuck (coroutines), but many "feature gates" aren't "implemented but stuck" features (e.g. some are things which aren't meant to ever be stabilized, some are abandoned features, and some features have multiple different feature gates).
Edit: nevermind, comment is here too: https://news.ycombinator.com/item?id=41655268
The author of the linked comment did extensive analysis on the synchronization primitives in various languages, then rewrote Rust's synchronization primitives like Mutex and RwLock on every major OS to use the underlying operating system primitives directly (like futex on Linux), making them faster and smaller and all-around better, and in the process, literally wrote a book on parallel programming in Rust (which is useful for non-Rust parallel programming as well): https://www.oreilly.com/library/view/rust-atomics-and/978109...
> Features like Coroutines. This RFC is 7 years old now.
We haven't been idling around for 7 years (either on that feature or in general). We've added asynchronous functions (which whole ecosystems and frameworks have arisen around), traits that can include asynchronous functions (which required extensive work), and many other features that are both useful in their own right and needed to get to more complex things like generators. Some of these features are also critical for being able to standardize things like `AsyncWrite` and `AsyncRead`. And we now have an implementation of generators available in nightly.
(There's some debate about whether we want the complexity of fully general coroutines, or if we want to stop at generators.)
Some features have progressed slower than others; for instance, we still have a lot of discussion ongoing for how to design the AsyncIterator trait (sometimes also referred to as Stream). There have absolutely been features that stalled out. But there's a lot of active work going on.
I always find it amusing to see, simultaneously, people complaining that the language isn't moving fast enough and other people complaining that the language is moving too fast.
> Function traits (effects)
We had a huge design exploration of these quite recently, right before RustConf this year. There's a challenging balance here between usability (fully general effect systems are complicated) and power (not having to write multiple different versions of functions for combinations of async/try/etc). We're enthusiastic about shipping a solution in this area, though. I don't know if we'll end up shipping an extensible effect system, but I think we're very likely to ship a system that allows you to write e.g. one function accepting a closure that works for every combination of async, try, and possibly const.
> Compile-time Capabilities
Sandboxing against malicious crates is an out-of-scope problem. You can't do this at the language level; you need some combination of a verifier and runtime sandbox. WebAssembly components are a much more likely solution here. But there's lots of interest in having capabilities for other reasons, for things like "what allocator should I use" or "what async runtime should I use" or "can I assume the platform is 64-bit" or similar. And we do want sandboxing of things like proc macros, not because of malice but to allow accurate caching that knows everything the proc macro depends on - with a sandbox, you know (for instance) exactly what files the proc macro read, so you can avoid re-running it if those files haven't changed.
> Rust doesn't have syntax to mark a struct field as being in a borrowed state. And we can't express the lifetime of y.
> Lets just extend the borrow checker and fix that!
> I don't know what the ideal syntax would be, but I'm sure we can come up with something.
This has never been a problem of syntax. It's a remarkably hard problem to make the borrow checker able to handle self-referential structures. We've had a couple of iterations of the borrow checker, each of which made it capable of understanding more and more things. At this point, I think the experts in this area have ideas of how to make the borrow checker understand self-referential structures, but it's still going to take a substantial amount of effort.
> This syntax could also be adapted to support partial borrows
We've known how to do partial borrows for quite a while, and we already support partial borrows in closure captures. The main blocker for supporting partial borrows in public APIs has been how to expose that to the type system in a forwards-compatible way that supports maintaining stable semantic versioning:
If you have a struct with private fields, how can you say "this method and that method can borrow from the struct at the same time" without exposing details that might break if you add a new private field?
Right now, leading candidates include some idea of named "borrow groups", so that you can define your own subsets of your struct without exposing what private fields those correspond to, and so that you can change the fields as long as you don't change which combinations of methods can hold borrows at the same time.
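To illustrate the gap being described, here is a small sketch. The `Counter` type and its accessors are hypothetical. Inside one function the borrow checker already tracks disjoint fields, but as soon as the borrows go through method signatures, each call borrows the whole struct.

```rust
struct Counter {
    hits: u32,
    log: Vec<String>,
}

impl Counter {
    fn hits_mut(&mut self) -> &mut u32 {
        &mut self.hits
    }
    fn log_mut(&mut self) -> &mut Vec<String> {
        &mut self.log
    }
}

fn record(counter: &mut Counter, msg: &str) {
    // Direct field access: two mutable borrows of disjoint fields are accepted.
    let hits = &mut counter.hits;
    let log = &mut counter.log;
    *hits += 1;
    log.push(format!("#{}: {msg}", *hits));

    // Through methods, each call borrows all of `counter`, so holding both
    // at once is rejected (error E0499):
    // let hits = counter.hits_mut();
    // let log = counter.log_mut();
    // *hits += 1;
}

fn main() {
    let mut counter = Counter { hits: 0, log: Vec::new() };
    record(&mut counter, "first");
    // Used one at a time, the accessors are fine.
    *counter.hits_mut() += 1;
    counter.log_mut().push("manual".to_string());
    println!("{} {:?}", counter.hits, counter.log);
}
```

Something like the "borrow groups" idea would, as I understand it, let `hits_mut` and `log_mut` declare that they touch different parts of the struct without naming the private fields.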
> Comptime
We're actively working on this in many different ways. It's not trivial, but there are many things we can and will do better here.
I recently wrote two RFCs in this area, to make macro_rules more powerful so you don't need proc macros as often.
And we're already talking about how to go even further and do more programmatic parsing using something closer to Rust constant evaluation. That's a very hard problem, though, particularly if you want the same flexibility of macro_rules that lets you write a macro and use it in the same crate. (Proc macros, by contrast, require you to write a separate crate, for a variety of reasons.)
> impl<T: Copy> for Range<T>.
This is already in progress. This is tied to a backwards-incompatible change to the range types, so it can only occur over an edition. (It would be possible to do it without that, but having Range implement both Iterator and Copy leads to some easy programming mistakes.)
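A quick sketch of what this looks like in practice today (the function name `sum_twice` is made up): because `Range` is itself an `Iterator` and not `Copy`, using it consumes it, and reusing the same bounds needs an explicit `clone()`.

```rust
use std::ops::Range;

fn sum_twice(range: Range<u32>) -> (u32, u32) {
    // `sum` takes the iterator by value, so the first use needs a clone
    // if we want to use the same range again afterwards.
    let first: u32 = range.clone().sum();
    let second: u32 = range.sum();
    (first, second)
}

fn main() {
    println!("{:?}", sum_twice(1..4));
}
```

The kind of mistake mentioned above would be, for example, advancing an implicit copy of a range and then being surprised that the original did not move.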
> Make if-let expressions support logical AND
We have an unstable feature for this already, and we're close to stabilizing it. We need to settle which one or both of two related features we want to ship, but otherwise, this is ready to go.
> But if I have a pointer, rust insists that I write (*myptr).x or, worse: (*(*myptr).p).y.
We've had multiple syntax proposals to improve this, including a postfix dereference operator and an operator to navigate from "pointer to struct" to "pointer to field of that struct". We don't currently have someone championing one of those proposals, but many of us are fairly enthusiastic about seeing one of them happen.
That said, there's also a danger of spending too much language weirdness budget here to buy more ergonomics, versus having people continue using the less ergonomic but more straightforward raw-pointer syntaxes we currently have. It's an open question whether adding more language surface area here would on balance be a win or a loss.
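For reference, this is roughly what the current raw-pointer syntax from the quoted complaint looks like in a complete example. The `Outer` and `Inner` types and `read_fields` are hypothetical, chosen to match the `(*myptr).x` and `(*(*myptr).p).y` shapes in the quote.

```rust
struct Inner {
    y: i32,
}

struct Outer {
    x: i32,
    p: *const Inner,
}

/// Reads `x` and `p->y` through a raw pointer.
///
/// # Safety
/// `myptr` and the `p` it contains must point to live, aligned values.
unsafe fn read_fields(myptr: *const Outer) -> (i32, i32) {
    // No auto-deref on raw pointers: every hop needs an explicit `*` and parens.
    unsafe { ((*myptr).x, (*(*myptr).p).y) }
}

fn main() {
    let inner = Inner { y: 7 };
    let outer = Outer { x: 3, p: &inner };
    // SAFETY: both pointers come from references that are still alive here.
    let (x, y) = unsafe { read_fields(&outer) };
    println!("{x} {y}");
}
```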
> Unfortunately, most of these changes would be incompatible with existing rust.
One of the wonderful things about Rust editions is that there's very little we can't change, if we have a sufficiently compelling design that people will want to adopt over an edition.
> The rust "unstable book" lists 700 different unstable features - which presumably are all implemented, but which have yet to be enabled in stable rust.
This is absolutely an issue; one of the big open projects we need to work on is going through all the existing unstable features and removing many that aren't likely to ever reach stabilization (typically either because nobody is working on them anymore or because they've been superseded).
Make it 70% of Rust in 10% of the code, similarly to what QBE[0] is doing with LLVM.
You'd probably be able to achieve that if you remove macros and some of the rarely-used features.
I've had a lot of talks with my management about that. For context, I'm on the Cargo team and have authored 11 RFCs (10 approved, 1 pending).
I feel like a lot of the pacing feels slow because:
- As the project matures, polishing what's there takes up a lot of effort
- Conversely, hitting local maxima where things are "just good enough" that individuals and companies don't feel the need to put effort into doing the last leg of work.
- Lack of coordinated teams (formerly Mozilla) doubling down on an idea to hash it out. Hopefully [Project Goals](https://rust-lang.github.io/rfcs/3614-project-goals.html) will help a little in this direction.
- As the project has grown, we've specialized a lot more, making it harder to develop a cross-team feature. It takes finesse to recruit someone from another team to help you finish out a cross-team feature. It also doesn't help we've not done a good job developing the cross-team communication channels to make up for this specialization. Again, Project Goals are trying to improve this. In-person conferences starting back up has also been a big help.
As for RFCs, we've been moving in the direction of choosing the level of process that's appropriate for a decision. Unsure how something will look? You just need approval from 2 members of the relevant team to start a nightly-only experiment to flesh out the idea in preparation for an RFC. In Cargo, many decisions don't need wide input and are just team votes on an Issue. RFCs drag out when there isn't someone from the team shepherding it through the process, the RFC covers too much and needs to be shrunk to better focus the conversation, too much is unknown and an experiment is needed instead, or it's cross-team and you need to know how to navigate the process to get the vote done (we want to improve this). As for things being approved but not completed, that's usually a "we need more help" problem.
You know, I would LOVE working on Rust (not just with Rust) and be a part of some of the core team(s).
But my impression is that nobody truly has any powerful agency over things, and even if you formulate a near-perfect RFC and a PR to go with it, things would still end with several people smarter than me saying "Oh this looks really neat, we should ponder it more and test it further and merge it!" and then it never happens.
That, plus I am not sure what the job stability situation is there.
* Supports Unions (TypeScript, Flow, Scala3, Hare)
* Supports GADTs
* Capable of targeting both preemptive userland concurrency (go, erlang, concurrent Haskell, concurrent OCaml) and cooperative (tinygo, nodejs, async-python, async-rust) without code changes
* Easily build without libc (CGO_ENABLED=0)
* No Backwards compatibility promise - This eliminates geriatrics
* Cleaner syntax, closer to Go, F#, or Python
* Graph-based Borrow Checker
* Add `try-finally` or `defer` support, `Drop` is too limiting, Async drop could help.
* Fix Remaining MIR Move Optimizations and Stack Efficiency
* Culture for explicit allocator passing like Zig
* `.unwrap()` is removed
But then I thought about it more. Whatever you call it - Pin or Move - the point is to say "this struct contains a borrowed field". But we never needed Pin for local variables in functions - even when they're borrowed - because the borrow checker understands what's going on. The "Pin" is implicit. Pin also doesn't describe all the other semantics of a borrowed value correctly - like how borrowed values are immutable.
I suspect if the borrow checker understood the semantics of borrowed struct fields (just like it does with local variables), then we might not need Pin or Move at all.
If we could go back in time and have the rust project decide to never implement async, I wonder what rust would look like today. There's a good chance the language & compiler would be much nicer as a result.
If withoutboats is right [1], then Rust would never have received the industry backing to be as successful as it is now.
[1]: https://without.boats/blog/why-async-rust/ especially the section "Organizational considerations"
The decision for `async` handed a lot of power to Amazon et al.
Capabilities to IO can be done by letting IO functions interrupt and call an effect handler, and the caller can specify the effect handler and do access control in there.
The whole Pin situation only exists because async/await was an afterthought and didn't work well with the existing language. async/await is an instance of effects.
I'm excited to start playing with a language that has a good effect system. I am hoping on Ante, but would also like to try Roc at some point.
-- Maybe you are living a million lifetimes in parallel right now and this one is the one devoted to working on compilers? Get to it! :-)
I think this is probably where all whitelist/capability proposal discussions end. There are going to be too many crates in that category for it to be useful.
A good first step (not sure if it's already taken tbh) would be to at least sandbox build execution. So that an attacker can't execute arbitrary code when your app is compiled.
The one point that stuck out for me is the comptime section. It approaches the topic from a security and supply-chain attacks angle, which is a way I never thought about it.
I think Rust might quickly run into the “negative trait” problem trying to get that working, while embracing an effect system like PureScript's might get you the goods in a “principled” way. Though I haven’t thought about this deeply.
Don't get me wrong: I'd like coroutines and a lot of other unstable/hidden features done as well. Function traits sound great, and I'd also like the whole Pin stuff to be easier (or gone?).
But please, "Let's just extend the borrow checker and fix that" sounds very demeaning. Like no one even tried? I am by far no expert, but I am very sure that it's not something you "just" go do.
I like most of the proposed features and improvements, and I mostly share the critique of the language, but I do not think the "why not just fix it?" attitude is helpful or warranted. There's tons of work, and only so many people & so much time.
There was a good blog post recently on Pin ergonomics, which I hope will lead somewhere good. It's not like they don't know that these things are difficult, and it's not like they're not trying to fix them, but generalised coroutines (for example) in the presence of lifetimes are absolutely monumentally difficult to get right, and they just can't afford to get it wrong. It's not like you can just nick the model from C#'s, because C# has a garbage collector.
As someone who has dabbled in compiler writing (i.e. I may be totally wrong), I believe that from a technical standpoint, modifying the borrow checker as proposed in the article (w.r.t. self-referential structs) is actually something you can "just do". The issues that come up are due to backwards compatibility and such, meaning it cannot be done in Rust without a new Rust edition (or by forking the compiler like in the article).
It's a bit restricted on how much you can do because they do promise compatibility with older crates, but it seems to be working out pretty well and that compatibility promise is part of why it does work.
Even if we put aside safety issues, each crate brings ~10 more dependencies by default (i.e. without any features turned on), which bloats compile times. Maybe it's better to be able to shard 3rd party crates, and not update them automatically at all?
The closest to a solution we have is dependency scanning against known CVEs.
Having per-crate permissions is, I think, the only way languages can evolve past this hell hole we call supply chain attacks. It’s not a silver bullet, there will be edge cases that can be bypassed and new problems it creates. But if it reduces the scope of where supply chains can attack and what they can do, then that’s still a massive win.
I also think you probably only need to restrict your dependencies. If you have a dep tree like this:
a
|-b
  |-c
Then if crate a decides b isn't trusted, c would inherit the same trust rules. This would allow crates to be refactored, but keep the list of rules needed in big projects to a minimum. You just have to add explicit rules for sub-crates which need more permissions. That's probably not a big deal in most cases.
(You might still, sometimes, want to be able to configure your project to allow privileged operations in c but not b. But that's an edge case. We'd just need to think through & add various options to Cargo.toml.)
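A minimal sketch of the inheritance rule described above, assuming c is a dependency of b (so the tree is a -> b -> c). The `Trust` enum, the `Dep` type and the `resolve` function are hypothetical names made up for illustration; nothing like this exists in Cargo today, and a real design would presumably track individual permissions rather than one trust flag.

```rust
use std::collections::HashMap;

/// Hypothetical trust levels; a real design would likely be per-permission.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Trust {
    Trusted,
    Untrusted,
}

/// A node in the dependency tree: a crate name plus its direct dependencies.
pub struct Dep {
    pub name: &'static str,
    pub deps: Vec<Dep>,
}

/// Walk the tree. A crate with an explicit rule uses that rule; every other
/// crate inherits whatever trust level its parent ended up with.
/// (A crate reachable through several parents is not handled here.)
pub fn resolve(
    node: &Dep,
    inherited: Trust,
    rules: &HashMap<&'static str, Trust>,
    out: &mut HashMap<&'static str, Trust>,
) {
    let trust = rules.get(node.name).copied().unwrap_or(inherited);
    out.insert(node.name, trust);
    for child in &node.deps {
        resolve(child, trust, rules, out);
    }
}
```

With a single explicit rule marking b as untrusted, c comes out untrusted as well, while a keeps the default. Adding a second rule for c would override the inherited value, which is the "edge case" configuration mentioned in the parenthesis.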
well... :-(
Actually, it's obvious that some authors might "turn evil" dumbly, by abusing some kind of privileged permissions. Fortunately, these kinds of supply-chain risks are "easily" identified because
1) the permissions are an "easy" risk indicator, so you can prioritize either pinning the library version (after validating it) or validating the new version
2) not so many libraries will use these permissions so you "have time" to focus on them
3) in these libraries, the permissions will tell you which system calls/bad effects are possible, which lets you narrow the scope of investigation even more
So, IMHO, permissions are not really the end of all but only a tiny step.
The real problem is "how can human-size be used to subvert the program?" For example: what happens if the returned size "forgets" or "adds" 100 bytes for files bigger than 1 KB? As a reminder, STUXNET was about some speed being a tiny bit faster than planned and shown...
> ast_nodes: Vec<&'Self::source str>,
Oh, that would be neat to replace the https://github.com/tommie/incrstruct I wrote for two-phase initialization. Unlike Ouroboros and self_cell, it uses traits so the self-references can be recreated after a move. Whether it's a good idea, I don't know, but the magic Ouroboros applies to my struct feels wrong. But I say that as someone coming from C++.
> if let Some(x) = some_var && some_expr { }
Coming from Go, I was surprised that something like
if let Some(x) = some_var; expr(x) { }
isn't a thing.
As I said in the post, you can also write this:
if let (Some(x), true) = (my_option, expr) {
But then it doesn't short-circuit. (expr is evaluated in all cases, even when the optional is None).
Both approaches are also weird. It'd be much better to just fix the language to make the obvious thing work.
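To make the short-circuiting difference concrete, here is a small sketch in stable Rust. The function names are made up for illustration. The tuple form has to build the tuple first, so the condition runs even when the option is None; nesting the `if`, or using a `match` guard, only evaluates the condition after the pattern has matched.

```rust
/// Tuple form: `expr()` is evaluated while building the tuple, so it runs
/// even when `opt` is `None`.
pub fn tuple_form(opt: Option<i32>, expr: impl Fn() -> bool) -> Option<i32> {
    if let (Some(x), true) = (opt, expr()) {
        Some(x)
    } else {
        None
    }
}

/// Nested form: `expr()` runs only after the pattern has matched.
pub fn nested_form(opt: Option<i32>, expr: impl Fn() -> bool) -> Option<i32> {
    if let Some(x) = opt {
        if expr() {
            return Some(x);
        }
    }
    None
}

/// `match` with a guard: also short-circuits, and the guard can use `x`.
pub fn guard_form(opt: Option<i32>, expr: impl Fn(i32) -> bool) -> Option<i32> {
    match opt {
        Some(x) if expr(x) => Some(x),
        _ => None,
    }
}
```

Passing a closure that counts its own calls shows the difference: `tuple_form(None, ..)` invokes it once, `nested_form(None, ..)` never does.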
The same thing in C++17:
if (auto x = something();
expr1(x)) {}
else if (expr2(x)) {}
It's really neat to have the variable scoped to the if/else-clause.
C/C++ are the only widely used languages without a popular npm-style package manager, and as a result most libraries are self-contained or have minimal, and often optional, dependencies. efsw [1] is a 7000-line (wc -l on the src directory) C++ FS watcher without dependencies.
The single-header libraries that are popular in the game programming space (stb_* [2], cgltf [3], etc) as well as of course Dear ImGui [4] have been some of the most pleasant ones I've ever worked with.
At this point I'm convinced that new package managers forbidding transitive dependencies would be an overall net gain. The biggest issue is large libraries that other ones justifiably depend on - OpenSSL, zlib, HTTP servers/clients, maybe even async runtimes. It's by no means an unsolvable problem, e.g. instead of having zlib as a transitive dependency:
1. a library can still hard-depend on zlib, and just force the user to install it manually.
2. a library can provide generic compress/decompress callbacks, that the user can implement with whatever.
3. the compress/decompress functionality can be made standard
[1] https://github.com/SpartanJ/efsw
[2] https://github.com/nothings/stb
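Option 2 in the list above (generic compress/decompress callbacks) could look roughly like this in Rust. Everything here is a hypothetical sketch: `Codec`, `BlobStore` and `RunLength` are invented names, and the run-length codec is only a toy standing in for whatever the user would actually plug in (zlib bindings, for example).

```rust
/// Hypothetical interface the library asks its user to implement, instead
/// of depending on zlib (or any other codec) itself.
pub trait Codec {
    fn compress(&self, input: &[u8]) -> Vec<u8>;
    fn decompress(&self, input: &[u8]) -> Vec<u8>;
}

/// Library side: stores blobs compressed with whatever codec it was given.
pub struct BlobStore<C: Codec> {
    codec: C,
    blobs: Vec<Vec<u8>>,
}

impl<C: Codec> BlobStore<C> {
    pub fn new(codec: C) -> Self {
        Self { codec, blobs: Vec::new() }
    }

    /// Compress and store `data`, returning its id.
    pub fn put(&mut self, data: &[u8]) -> usize {
        self.blobs.push(self.codec.compress(data));
        self.blobs.len() - 1
    }

    /// Decompress and return the blob with the given id, if any.
    pub fn get(&self, id: usize) -> Option<Vec<u8>> {
        self.blobs.get(id).map(|b| self.codec.decompress(b))
    }
}

/// User side: a toy run-length codec emitting (count, byte) pairs.
pub struct RunLength;

impl Codec for RunLength {
    fn compress(&self, input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut iter = input.iter().copied().peekable();
        while let Some(byte) = iter.next() {
            let mut count: u8 = 1;
            // Extend the run while the next byte matches and the count fits.
            while count < u8::MAX && iter.peek() == Some(&byte) {
                iter.next();
                count += 1;
            }
            out.push(count);
            out.push(byte);
        }
        out
    }

    fn decompress(&self, input: &[u8]) -> Vec<u8> {
        input
            .chunks_exact(2)
            .flat_map(|pair| std::iter::repeat(pair[1]).take(pair[0] as usize))
            .collect()
    }
}
```

The library crate would ship only `Codec` and `BlobStore`; the choice of compression implementation, and therefore the dependency, moves to the application that constructs the store.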
Mainstream game programming doesn't use C at all. (Source: I had been a gamedev for almost a decade, and I mostly dealt with C# and sometimes C++ for low-level stuff.) Even C++ has been out of fashion for at least a decade; anyone claiming that C++ is necessary for game programming is likely either an engine developer---a required, but very small portion of all gamedevs---or someone who hasn't done significant game programming recently.
Also, the reason that single-header libraries are rather popular in C is that otherwise they would be so, SO painful to use by modern standards. As a result, those libraries have to be much more carefully designed than normal libraries in either C or other languages, which contributes to their seemingly higher quality. (Source: Again, I have written sizable single-header libraries in C and am aware of many issues from doing so.) I don't think this approach is scalable in general.
If you ignore the OS, then sure. Most C/C++ codebases aren't really portable however. They're tied to UNIX, Windows or macOS, and often some specific version range of those, because they use so many APIs from the base OS. Include those and you're up to millions of lines too.
1. This doesn't mean that C++'s fragmented hellscape of package management is a good thing.
2. "inevitably"? No. This confuses the causation.
3. This comment conflates culture with tooling. Sure, they are related, but not perfectly so.
This only works for extremely simple cases. Beyond toy examples, you have to glue together two full-blown APIs with a bunch of stuff not aligning at all.
[Edit] And for completeness, Microsoft's Windows crate is 630 thousand lines, though that goes way beyond simple bindings, and actually provides wrappers to make its use more idiomatic.
Composition is an essential part of software development, and it crosses package boundaries.
How would banishing inter-package composition be a net gain?
Also I am no expert, but I think file-watchers are definitely not simple at all, especially if they are multi-platform.
Language                     files         blank       comment          code
-------------------------------------------------------------------------------
C                                4           154           163           880
Bourne Shell                     2            74            28           536
C/C++ Header                     4            21            66            70
Markdown                         1            21             0            37
YAML                             1             0             0            14
-------------------------------------------------------------------------------
SUM:                            12           270           257          1537
-------------------------------------------------------------------------------
including a well-designed CLI.
entr supports BSD, Mac OS, and Linux (even WSL). So that's several platforms in <2k lines of code. By using MATHEMATICS and EXTRAPOLATION we find that non-WSL Windows file-watching must take four million minus two thousand equals calculate calculate 3998000 lines of code. Ahem.
Though to be fair, cargo watch probably does more than just file-watching. (Should it? Is it worth the complexity? I guess that depends on where you land on the worse-is-better discussion.)
Forgive me if I'm making a very bold claim, but I think cross-platform file watching should not require this much code. It's 32x larger than the Linux memory management subsystem.
Since everyone depends on the standard library this will just mean everyone will depend on even more lines of code. You are decreasing the number of nominal dependencies but increasing how much code those amount to.
Moreover the moment the stdlib's bundled dependency is not enough there are two problems:
- it can't be changed because that would be a breaking change, so you're stuck with the old bad implementation;
- you will have to use an alternative implementation in another crate, so now you're back at the starting situation except with another dependency bundled in the stdlib.
Just look at the dependency situation with the python stdlib, e.g. how many versions of urllib there are.
I don't really know much about Rust, but I got curious and had a look at the file watching apis for windows/linux/macos and it really didn't seem that complicated. Maybe a bit fiddly, but I have a hard time imagining how it could take more than 500 lines of code.
I would love to know where the hard part is if anyone knows of a good blog post or video about it.
And since xz we know resourceful and patient attackers are reality and not just "it might happen".
Sorry but sprawling transitive micro-dependencies are not sustainable. It's convenient and many modern projects right now utilize it but they require a high-trust environment and we don't have that anymore, unfortunately.
All code is built on mountains of dependencies that by their nature will do more than what you are using them for. For example, part of cargo watch is to bring in a win32 API wrapper library (which is just autogenerated bindings for win32 calls). Of course that thing is going to be massive while watch is using only a sliver of it in the case it's built for windows.
The standard library for pretty much any language will have millions of lines of code, that's not scary even though your apps likely only use a fraction of what's offered.
And have you ever glanced at C++'s boost library? That thing is monstrously big yet most devs using it are going to really only grab a few of the extensions.
The alternative is the npm hellscape where you have a package for "isOdd" and a package for "is even" that can break the entire ecosystem if the owner is disgruntled because everything depends on them.
Having fewer larger dependencies maintained and relied on by multiple people is much more ideal and where rust mostly finds itself.
The is-odd and is-even packages are in no way situated to break the ecosystem. They're helper functions that their author (Jon Schlinkert) used as dependencies in one of his other packages (micromatch) 10 years ago, and consequently show up as transitive dependencies in antiquated versions of micromatch. No one actually depends on this package indirectly in 2024 (not even the author himself), and very few packages ever depended on it directly. Micromatch is largely obsolete given the fact that Node has built in globbing support now [1][2]. We have to let some of these NPM memes go.
[1] https://nodejs.org/docs/latest-v22.x/api/path.html#pathmatch...
[2] https://nodejs.org/docs/latest-v22.x/api/fs.html#fspromisesg...
This used to be true 5-10 years ago. The js ecosystem moves fast and much has been done to fix the dependency sprawl.
It seems that most dependencies of cargo-watch are pulled from three direct requirements: clap, cargo_metadata and watchexec. Clap would pull lots of CLI things that would be naturally platform-dependent, while cargo_metadata will surely pull most serde stuff. Watchexec does have room for improvement though, because it depends on command-group (maintained in the same org) which unconditionally requires Tokio! Who would have expected that? Once watchexec is improved in that respect, however, I think these requirements are indeed necessary for the project's goal and any further dependency removal will probably come with some downsides.
A bigger problem here is that you can't easily fix other crates' excessive dependencies. Watchexec can be surely improved, but what if other crates are stuck at the older version of watchexec? There are some cases where you can just tweak Cargo.lock to get things aligned, but generally you can't do that. You have to live with excessive and/or duplicate dependencies (not a huge problem by itself, so it's default for most people) or work around with `[patch]` sections. (Cargo is actually in a better shape given that the second option is even possible at all!) In my opinion there should be some easy way to define a "stand-in" for given version of crate, so that such dependency issues can be more systematically worked around. But any such solution would be a huge research problem for any existing package manager.
That, and the maven repository is moderated. Unlike crates.io.
Crates.io is a real problem. No namespaces, basically unmoderated, tons of abandoned stuff. Version hell like you're talking about.
I have a hard time taking it at all seriously as a professional tool. And it's only going to get worse.
If I were starting a Rust project from scratch inside a commercial company at this point, I'd use Bazel or Buck or GN/Ninja and vendored dependencies. No Cargo, no crates.io.
I wish crates that used Windows stuff wouldn't enable it by default.
The fact that nothing has changed in the NPM and Python worlds indicates that market forces pressure the decision makers to prefer the more risky approach, which prioritizes growth and fast iteration.
whether those factors impact how you view the result of linecount is subjective
also as one of the other commenters mentioned, cargo watch does more than just file watching
Other than people who care about relatively obscure concerns like distro packaging, nobody is impeded in their work in any practical way by crates having a lot of transitive dependencies.
That sounds like a massive security problem to me. All it would take is one popular crate to get hacked / bribed / taken over and we're all done for. Giving thousands of strangers the ability to run arbitrary code on my computer is a profoundly stupid risk.
Especially given it's unnecessary. 99% of crates don't need the ability to execute arbitrary syscalls. Why allow that by default?
This, more than any other issue, is I think what prevents Rust adoption outside of companies that are more liberal w.r.t. dependencies, i.e. the big tech and web parts of the economy.
This is actually one positive in my view behind the rather unwieldy process of using dependencies and building C/C++ projects. There's a much bigger culture of care and minimalism w.r.t. choosing to take on a dependency in open source projects.
Fwiw, the capabilities feature described in the post would go a very long way towards alleviating this issue.
And people are still calling it "obscure concerns"...
I write in Clojure and I take great pains to avoid introducing dependencies. Contrary to the popular mantra, I will sometimes implement functionality instead of using a library, when the functionality is simple, or when the intersection area with the application is large (e.g. the library doesn't bring as many benefits as just using a "black box"). I will work to reduce my dependencies, and I will also carefully check if a library isn't just simple "glue code" (for example, for underlying Java functionality).
This approach can be used with any language, it just needs to be pervasive in the culture.
Node has improved greatly in the last two years. They always had native JSON support. Now they have a native test runner, watch, and fetch, are working on a permission system à la deno, have added WebSockets, and are working on a native SQLite driver. All of this makes it a really attractive platform for prototyping which scales from hello world without any dependencies to production.
Good luck experimenting with Rust without pulling half the internet with it.
E: and they’re working on native TS support
If I am making a small greenhouse I can buy steel profiles and not care about what steel they are made from. If I am building a house I actually want a specific standardized profile because my structure's calculations rely on that. My house will collapse if they don't hold. If I am building a jet engine part I want a specific alloy and all the component metals and foundry details, and will reject it if the provenance is not known or suitable[1].
If I am doing my own small script for personal purposes I don't care much about packaging and libraries, just that it accomplishes my immediate task in my environment. If I have a small tetris application I also don't care much about libraries, or their reliability. If I have a business selling my application and I am liable for its performance and security I damn sure want to know all about my potential liabilities and mitigate them.
[1] https://www.usatoday.com/story/travel/airline-news/2024/06/1...
Some of us have licensing restrictions we have to adhere to.
Some of us are very concerned about security and the potential problems of unaudited or unmoderated code that comes in through a long dependency chain.
Hard learned lessons through years of dealing with this kind of thing: good software projects try to minimize the size of their impact crater.
It's entirely possible to use Rust with other build systems, with vendored dependencies.
Crates.io is a blight. But the language is fine.
One thing is to decide to vendor everything - that's your prerogative - but it's very likely that pulling everything in also pulls in tons of stuff that you aren't using, because recursively vendoring dependencies means you are also pulling in dev-dependencies, optional dependencies (including default-off features), and so on.
For the things you do use, is it the number of crates that is the problem, or the amount of code? Because if the alternative is to develop it in-house, then...
The alternative here is to include a lot of things in the standard library that don't belong there, because people seem to exclude standard libraries from their auditing, which is reasonable. Why is it not just as reasonable to exclude certain widespread ecosystem crates from auditing?
(Putting aside the question whether or not that pulls in dev dependencies, that watching files can easily have OS-specific aspects so you might have different dependencies on different OSes, that neither lines nor, even less, files are a good measurement of complexity, and that these dependencies involve a lot of code from features of dependencies which aren't used and, due to Rust being compiled in a reasonable way, are reliably not included in the final binary in most cases. Also ignoring that cargo-watch isn't implementing file watching itself; it's in many aspects a wrapper around watchexec, which makes it much "thinner" than it would be otherwise.)
What if that is needed for a reliable robust ecosystem?
I mean, I know, it sounds absurd but give it some thought.
I wouldn't want every library to reinvent the wheel again and again for all kinds of things, so I would want them to use dependencies. I also would want them to use robust, tested, mature and maintained dependencies. Naturally this applies transitively. But which libraries become "robust, tested, mature and maintained": those which just provide a small, good-enough-for-you subset of a functionality, or those which support the full functionality, making them usable for a wider range of use cases?
And with that in mind let's look at cargo-watch.
First, it's a CLI tool, so with the points above in mind you would need a good choice of a CLI parser, so you use e.g. clap. But at this point you are already pulling in a _huge_ number of lines of code, the majority of which will be dead-code eliminated. Though you don't have much choice: you don't want to reinvent the wheel, and for a CLI library to be widely successful (often needed for it to be long-term tested, maintained and e.g. forked if the maintainers disappear etc.) it needs to cover all widely needed CLI library features, not just the subset you use.
Then you need to handle configs, so you include dotenvy. You have a desktop notification sending feature, again no reason to reinvent that, so you pull in rust-notify. Handling paths in a cross-platform manner has tricky edge cases, so camino and shell-escape get pulled in. You do log warnings, so log+stderrlog get pulled in, which for message coloring and similar pull in atty and termcolor, even though they probably just need a small subset of atty. But again, no reason to reinvent the wheel, especially for things as iffy/bug-prone as reliable tty handling across many different ttys. Lastly, watching files is harder than it seems and the notify library already implements it, so we use that. Wait, it's quite low level, and there is watchexec which provides exactly the interface we need, so we use that (and if we did not, we would still use most or all of watchexec's dependencies).
And ignoring watchexec (around which the discussion would become more complex), with the standards above you wouldn't want to reimplement the functionality of any of these libraries yourself. It's not even about implementation effort, but stuff like overlooking edge cases, maintainability etc.
And while you definitely can make a point that in some aspects you can and maybe should reduce some dependencies etc., this isn't IMHO changing the general conclusion: You need most of these dependencies if you want to conform with the standards pointed out above.
And tbh. I have seen way way way too many cases of projects shaving off dependencies, adding "more compact wheel reinventions" for their subset and then running into all kinds of bugs half a year later. Sometimes leading to the partial reimplementations becoming bigger and bigger until they weren't much smaller than the original project.
Don't get me wrong, there definitely are cases of (things you use from) dependencies being too small to make it worth it (e.g. left pad) or, more commonly, it takes more time (short term) to find a good library and review it than to reimplement it yourself (but long term it's quite often a bad idea).
So idk if the issue is transitive dependencies, or too many dependencies, at all.
BUT I think there are issues wrt. handling software supply chain aspects. But that is a different kind of problem with different solutions. And sure, not having dependencies avoids that problem, somewhat, but it's just replacing it IMHO with a different, equally bad problem.
I'm curious as I don't know Go but it often gets mentioned here on HN as very lightweight.
(A quick googling finds https://pkg.go.dev/search?q=watch which makes me think that it's not any different?)
All that, despite JS being much older than rust, and much more widely used. Javascript also has several production implementations - which presumably all need to agree to implement any new features.
Javascript had a period of stagnation around ES5. The difference seems to be that the ecmascript standards committee got their act together.
History repeated itself, and now Typescript has even more popularity than CoffeeScript ever did, so if the ecma committee is still on their act, they're probably working on figuring out how to adopt types into Javascript as well.
More relevant to this argument, is the question if a similar endeavor would work for Rust. Are the features you're describing so life changing that people would work in a transpiled language that had them? For CoffeeScript, from my perspective at least, it was just the arrow functions. All the sugar on top just sealed the deal.
The assumption that "[Rust] stagnation" is due to some kind of "Rust committee inefficiencies" might be incorrect.
Rust and Ada have similar goals and target use cases, but different advantages and strengths.
In my opinion, Rust's biggest innovations are 1) borrow checking and "mutation XOR sharing" built into the language, effectively removing the need for manual memory management or garbage collection, 2) Async/Await in a low-level systems language, and 3) Superb tooling via cargo, clippy, built-in unit tests, and the crates ecosystem (in a systems programming language!) Rust may not have been the first with these features, but it did make them popular together in a way that works amazingly well. It is a new class of language due to the use of the borrow checker to avoid memory safety problems.
Ada's strengths are its 1) powerful type system (custom integer types, use of any enumerated type as an index, etc.), 2) perfect fit for embedded programming with representation clauses, the real-time systems annex, and the high integrity systems annex, 3) built-in Design-by-Contract preconditions, postconditions, and invariants, and 4) Tasking built into the language / run-time. Compared to Rust, Ada feels a bit clunky and the tooling varies greatly from one Ada implementation to another. However, for some work, Ada is the only choice because Rust does not have sufficiently qualified toolchains yet. (Hopefully soon . . .)
Both languages have great foreign function interfaces and are relatively easy to use with C compared to some other programming languages. Having done a fair bit of C programming in the past, today I would always choose Rust over C or C++ when given the choice.
Also, Zig might be a nice modern language, but it is not an option if you're aiming for memory safety.
One could also argue Rust's unsafe blocks will be harder to reason about bugs in than Zig code. And if you don't need any unsafe blocks it might not be an application best suited to Zig or Rust.
> I’m not saying the author is wrong here, just pointing out how a complex language somehow needs to be even more complicated. Spoiler: it doesn’t.
True. But I think a lot of rust's complexity budget is spent in the wrong places. For example, the way Pin & futures interact adds a crazy amount of complexity to the language. And I think at least some of that complexity is unnecessary. As an example, I'd like a rust-like language which doesn't have Pin at all.
I suspect there's also ways the borrow checker could be simplified, in both syntax and implementation. But I haven't thought enough about it to have anything concrete.
I don't think there's much we can do about any of that now short of forking the language. But I can certainly dream.
Rust won't be the last language invented which uses a borrow checker. I look forward to the next generation of these ideas. I think there's probably a lot of ways to improve things without making a bigger language.
Unfortunately that attracts the worst types. And their crapness and damage potential is sometimes not realised until it’s way too late.
I see some drama associated with Rust, but it's usually around people resisting its usage or adoption (the recent kerfuffle about Rust for Linux, for example), and not really that common within the community. But I could be missing something?
Zig is great, but it just isn't production ready.
https://news.ycombinator.com/item?id=36122270 https://news.ycombinator.com/item?id=29343573 https://news.ycombinator.com/item?id=29351837
The Ashley "Kill All Men" Williams drama was pretty bad. She had a relationship with a core Rust board member at the time so they added her on just because. Any discussion about her addition to the board was censored immediately, reddit mods removed and banned any topics and users mentioning her, etc.
Also, Zig is set to release 1.0 beta in November.
Specializations allow unsound behavior in safe Rust, which is exactly what nightly was supposed to catch.
There are only two kinds of languages: the ones people complain about and the ones nobody uses.
Much of the drama around Rust (and almost every other large programming language) is a problem of scale, not implementation. The more funding you wish for, the more drama it will indubitably create.
(This is not a diss on Zig at all, I love its approach!)
Ah, like Scala you mean?
We've had a lot of talk about sandboxing of proc-macros and build scripts. Of course, more declarative macros, delegating `-sys` crate logic to a shared library, and `cfg(version)` / `cfg(accessible)` will remove a lot of the need for user versions of these. However, that all ignores runtime. The more I think about it, the more cackle's "ACLs" [0] seem like the way to go as a way for extensible tracking of operations and auditing their use in your dependency tree, whether through a proc-macro, a build script, or runtime code.
I heard that `cargo-redpen` is developing into a tool to audit calls though I'm imagining something higher level like cackle.
> I always find it amusing to see, simultaneously, people complaining that the language isn't moving fast enough and other people complaining that the language is moving too fast.
I think people complain that rust is a big language, and they don't want it to be bigger. But keeping the current half-baked async implementation doesn't make the language smaller or simpler. It just makes the language worse.
> The main blocker for supporting partial borrows in public APIs has been how to expose that to the type system in a forwards-compatible way that supports maintaining stable semantic versioning
I'd love it if this feature shipped, even if it only works (for now) within a single crate. I've never had this be a problem in my crate's public API. But it comes up constantly while programming.
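For readers who haven't hit this, here is a small sketch of the partial-borrow limitation inside a single crate. The `Recorder` type and its methods are invented for illustration. Within one function body the borrow checker tracks fields separately, but a method signature like `&self` or `&mut self` always borrows the whole struct.

```rust
/// Hypothetical example type: a running total plus a log of messages.
pub struct Recorder {
    pub total: u64,
    pub log: Vec<String>,
}

impl Recorder {
    /// Borrows all of `self`, although it only reads `log`.
    pub fn last_message(&self) -> Option<&String> {
        self.log.last()
    }

    /// Borrows all of `self` mutably, although it only writes `total`.
    pub fn add(&mut self, n: u64) {
        self.total += n;
    }

    // Rejected today if uncommented (error[E0502]): `last` keeps all of
    // `*self` borrowed, so the `&mut self` call in between conflicts.
    //
    // pub fn add_and_peek_rejected(&mut self, n: u64) -> Option<&String> {
    //     let last = self.last_message();
    //     self.add(n);
    //     last
    // }

    /// Accepted: with direct field access the two borrows are visibly
    /// disjoint (`log` is read, `total` is written).
    pub fn add_and_peek(&mut self, n: u64) -> Option<&String> {
        let last = self.log.last();
        self.total += n;
        last
    }
}
```

The usual workarounds today are inlining the field accesses as above, or splitting the struct so each method borrows a smaller piece. A partial-borrow feature would let the two helper methods be used together directly.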
> Sandboxing against malicious crates is an out-of-scope problem. You can't do this at the language level; you need some combination of a verifier and runtime sandbox.
Why not?
If I call a function that contains no unsafe 3rd party code in its call tree, and which doesn't issue any syscalls, that function can already only access & interact with passed parameters, local variables and locally in-scope globals. Am I missing something? Because that already looks like a sandbox, of sorts, to me.
Is there any reason we couldn't harden the walls of that sandbox and make it usable as a security boundary? Most crates in my dependency tree are small, and made entirely of safe code. And the functions in those libraries I call don't issue any syscalls already anyway. Seems to me like adding some compile-time checks to enforce that going forward would be easy. And it would dramatically reduce the supply chain security risk.
Mind explaining your disagreement a little more? It seems like a clear win to me.
I can't disagree more.
In fact, I think that the current state of async Rust is the best implementation of async in any language.
To get Pin stuff out of the way: it is indeed more complicated than it could be (because reverse compatibility etc), but when was the last time you needed to write a poll implementation manually? Between runtimes (tokio/embassy) and utility crates, there is very little need to write raw futures. Combinators, tasks, and channels are more than enough for the overwhelming majority of problems, and even in their current state they give us more power than the Python or JS ecosystems.
But then there's everything else.
Async Rust is correct and well-defined. The way cancellation, concurrent awaiting, and exceptions work in languages like JS and Python is incredibly messy (eg [1]) and there are very few people who even think about that. Rust in its typical fashion frontloads this complexity, which leads to more people thinking and talking about it, but that's a good thing.
Async Rust is clearly separated from sync Rust (probably an extension of the previous point). This is good because it lets us reason about IO and write code that won't be preempted in an observable way, unlike with Go or Erlang. For example, having a sync function we can stuff things into thread locals and be sure that they won't leak into another future.
Async Rust has already enabled incredibly performant systems. Cloudflare's Pingora runs on Tokio, processing a large fraction of internet traffic while being much safer and better defined than nginx-style async. Same abstractions work in Datadog's glommio, a completely different runtime architecture.
Async Rust made Embassy possible, a genuine breakthrough in embedded programming. Zero overhead, safe, predictable async on microcontrollers is something that was almost impossible before and was solved with much heavier and more complex RTOSes.
"Async Rust bad" feels like a meme at this point, a meme with not much behind it. Async Rust is already incredibly powerful and well-designed.
[1]: https://neopythonic.blogspot.com/2022/10/reasoning-about-asy...
I believe you are proposing language-based security (langsec), which seemed very promising at first, but the current consensus is that it still has to be accompanied by other measures. One big reason is that virtually no practical language implementation is fully specified.
As an example, let's say that we only have fixed-size integer variables and simple functions with no other control constructs. Integers wrap around and division by zero yields zero, so no integer operation can trap. So it should be easy to check for the infinite recursion and declare that the program would never trap otherwise, right? No! A large enough number of nested but otherwise distinct function calls would eventually overflow the stack and cause a trap or anything else. But this notion of "stack" is highly specific to the implementation, so the provable safety essentially implies that you have formalized all such implementation-specific notions in advance. Possible but extremely difficult in practice.
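The arithmetic rules of that toy language can be mimicked in Rust. This is only a sketch: the 32-bit width and the function names `add` and `div` are assumptions, since the comment does not pin down either.

```rust
/// Addition that wraps around on overflow instead of trapping.
fn add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Division that yields zero for a zero divisor. `wrapping_div` also covers
/// the one overflowing case, i32::MIN / -1, so no input can trap.
fn div(a: i32, b: i32) -> i32 {
    if b == 0 {
        0
    } else {
        a.wrapping_div(b)
    }
}
```

Both functions are total over their inputs, yet nothing in their signatures says how much stack a deep chain of calls to them would need. That is the implementation-specific part the comment is pointing at.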
The "verifier and runtime sandbox" mentioned here is one solution to get around this difficulty. Instead of being able to understand the full language, the verifier is only able to understand a very reduced subset and the compiler is expected (but not guaranteed) to return something that would pass the verifier. A complex enough verifier would be able to guarantee that it is safe to execute even without a sandbox, but a verifier combined with a runtime sandbox is much simpler and more practical.
This applies to both suggestions ("fork" and "don't use it").
> The real problem is "how can human-size be used to subvert the program ?" For example: what happens if the returned size "forgets" or "adds" 100 bytes to files bigger than 1 KB ? As a reminder, STUXNET was about some speed a tiny bit faster than planned and shown...
I read this argument in a similar vein to the argument against rust's unsafe blocks. "Look, C code will always need some amount of unsafe. So why bother sandboxing it?"
But in practice, having explicit unsafe blocks has been a massive win for safety in the language. You can opt out of it at any time - but most people never need to!
A 90% solution doesn't solve the problem entirely. But it does solve 90% of the problem. And that's pretty bloody good if you ask me! Sure - my safe rust decompression library could still maliciously inject code in files that it decompresses. But having checks like this would still reduce the security surface area by a huge amount.
Less implicit trust in random crate authors is a good thing. I don't want thousands of crate authors to be allowed to execute totally arbitrary code on my machine! The current situation is ridiculous.
As a frequent contributor to a number of crates, this isn’t really true. Also, most popular crates actively deny use of unsafe.
It'd be good to track capabilities needed by libraries, so similarly to unsafe code, risky portions needing careful review are constrained and highlighted in some way.
The solution was `Pin<T>` et al., which gives a way to make some value immovable in memory.
An equivalent yet simpler version of this system could be integrated into the borrow checker (this was a proposed solution for Rust), but as I said before, it would not be backwards-compatible, hence the need for `Pin`.
a
|-b
| |-c
|
|-c
|-d
|-c
I may not have read carefully, but what happens if you allow crate X to write files, and it gets compromised? Should we set restrictions on a per-call basis instead? I see we may catch those situations when a crate starts reading/writing when it hadn't before, or in an unexpected place, if we set restrictions per call, but this only limits the attack surface; it doesn't eliminate it.
...It may actually make 3rd party libraries such a big bureaucratic pain, that users will minimize their usage.
Yeah, these are important details to figure out. But let's not let perfect be the enemy of good here. We're arguing about what brand of lock to buy for the back door of the house, when the front door is currently wide open.
After all - I think most programs will allow 1 or, usually, 0 crates to write files anyway. Limiting the attack surface to 1% of where it is today is a huge win, even if it's not perfect.
When it comes to files, the privileged operation should really be opening a file (at some path). Writing to a file handle you've been given is relatively a much safer operation. Files should usually be (only) opened by the main process. Then the opened file handle can be handed to any functions / structs in 3rd party crates which need access.
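A minimal sketch of that handle-passing style, using only the standard library. `write_report` is a hypothetical stand-in for a function from a third-party crate; note that today this is only a convention, since nothing stops such a function from calling `std::fs` itself. Enforcing that is what the compile-time checks discussed in this thread would add.

```rust
use std::io::{self, Write};

/// Hypothetical third-party function: it receives an already-open writer.
/// No path and no filesystem API appears anywhere in its signature.
fn write_report<W: Write>(mut out: W, rows: &[(&str, u64)]) -> io::Result<()> {
    for (name, size) in rows {
        // One tab-separated line per row.
        writeln!(out, "{}\t{}", name, size)?;
    }
    out.flush()
}
```

The main program would open the file itself (for example with `std::fs::File::create`) and pass the handle in. Because the parameter is any `Write`, the same function can be exercised against an in-memory `Vec<u8>` without touching the filesystem.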
Game programming changed a lot, parent is talking about stuff older than 10 yrs
There was a lot of PC gaming in C/C++, and "Engine" were developed together with games for the most part. Think all the Doom and Quake saga
That's what he's talking about
I've also been thinking that we should go over our stabilized RFCs and add an appendix to all of them documenting how the current implementation diverges from the original proposal.
Stages let you build consensus at more points. You could imagine RFCs go through stages of consensus:
First a proposal phase, where agreement is found that this is a problem worth solving and on the broad strokes of the idea. This would be similar to the summary and motivation sections of current RFCs.
Then an implementation phase, where a design is sketched out. I think the timing between this stage and the next is one of the more interesting/weak parts of this proposal, but the point is mostly that these things don't have to be strictly done in order, just that you make the points at which consensus to move forward is found in order.
Next, there'd be a design review phase. This would be a specific proposal based on feedback from the implementation, kind of like the Detailed Design section of the current RFC process.
Finally, you'd accept the design, and the RFC would move into a terminal "it's done" stage.
Anyway, just throwing that out there. You all still have a ton of stuff to do, of course, but this is something I really wish I could have actually pushed for back in the day. I think it would make it easier to move on bigger things, and give outsiders a much better idea of how close an RFC is to becoming reality.
RFCs are useful in a multi-vendor situation. They are permanent documents that constitute a kind of standard. For instance internet RFCs, or Scheme SRFIs.
If the Rust RFC is something you jettison once it is implemented, then it's actually something that belongs in a bug database as a feature request, which can be marked closed when done (after which nobody ever looks at it again except as a historic reference). Someone implementing the same thing in a new implementation would not be looking at it, but at the real documentation (which was written as part of implementing the ticket, and maintained afterward).
For multiple reasons. First, because it's not actually obvious from a PR what the overall design and big picture is. And second, because we want to decide about that design and big picture before someone does too much implementation work, lest that work need redoing because the design changed (or lest that work go to waste because we rejected the RFC).
What you're describing is a problem with how Cargo does vendoring, and yes, it's awful. It should not be called vendoring, it is just "local mirroring", which is not the same thing.
But Rust can work just fine without Cargo or Crates.io.
As for Zig, I hope they make it. I think I kind of see why people are excited about it, but fundamentally the reason I'm not super hyped is that it doesn't seem to really enable anything new. It's far more expressive than C, but it doesn't make it easier to manage inherent complexity (to my understanding - haven't played with it a lot).
For example, the async problem still exists and still has absolutely no viable path forward, or even MVP approach.
I don’t know anything that hasn’t been made simpler and easier over time.
We have hundreds of languages made to please the corporate overlords.
Can't we just have one language that's actually nice to use?
Nah, you still have those dependencies, they're just integrated in your interpreter. That has advantages (you're now only trusting a single source) and disadvantages (you always get all the goodies and the associated risks with that, even if you don't need them).
Also, if you just have a really well defined problem, it's easy to just whip out 10-50 lines to solve the issue and be done with it
Ashley is just one out of many, unfortunately. Other former and current top contributors share similar qualities. Those qualities tend to trigger unnecessary explosions like last year's https://www.reddit.com/r/rust/comments/13vbd9v/on_the_rustco....
They’re much better.
I don't know what you had in mind for "sponsored" but others would disagree and say both C and C++ were "sponsored by AT&T Bell Labs" because the people who created them (Dennis Ritchie, Bjarne Stroustrup) were employees of AT&T. Analogous to Rob Pike, et al. of Go Language being employed/sponsored at Google.
Some solutions can definitely make a situation net-worse/net-more-complex, and occasionally we get advances in theory which can change the watermark on complexity, but it never _goes away_.
Ever implemented a cyclic linked list or graph in Rust?
Author here. We could make it a language problem by having the language sandbox dependencies by default. Seems like an easy win to me. Technical solutions are almost always easier to implement than social solutions.
> It's throwing the baby and bathwater into lava.
Is it really so controversial to want to be able to limit the access that utility crates like humansize or serde have to make arbitrary syscalls on my computer?
Seems to me like we could get pretty far with just compile-time checks - and that would have no impact whatsoever on the compiled code (or its performance).
I don't understand your criticism.
And that's not even getting into the problem that it's a fairly controversial feature, since people are worried about terrible, hard to track specialisation trees. (See, inheritance.)
There is already a proposal for how to prevent unsound specializations [0], but it requires a lot of support from the trait solver, hence why I said it's blocked on it.
[0]: https://smallcultfollowing.com/babysteps/blog/2018/02/09/max...
> Javascript has a quite different use-case audience than Rust.
Eh. That sounds like a "just so" explanation to me. Linus Torvalds doesn't work on the rust compiler.
I think I could make much more convincing arguments that javascript should move slower than rust - given there's so many large language runtime projects. (V8, Safari, Javascript, Node, Deno, Bun, etc etc). But evidently, that isn't the case.
I'm open to the reason for rust's slow development being that the language developers want the language to move slowly. That's fine. But, I personally don't want that. I've been waiting for generators to ship for 7 years. Or TAIT to appear - again, for years. I'd much rather rust moved faster.
Of course I attribute all of this to the process & team which makes these decisions. What else is there? What else has any effect on the development of rust?
Some of the "cutting-edge PL features" I want are things like function effects - which would allow you to (at compile time) mark that a function cannot panic.
This is something the linux kernel has been asking for from rust for years. I think our interests are aligned.
Which is not exactly the same as wanting everybody to rewrite everything in Rust, but I suppose it's the sort of thing that annoys nineteen999.
There are also a lot of devs rewriting things in Rust for their own entertainment or whatever, which I think is the main source of the "rewrite everything in Rust" meme.
There are still plenty of constrained environments, architectures not yet supported, a lack of mature libraries for 2D/3D graphics amongst other things, that make Rust not a good fit yet for many projects where C/C++ already works. When Rust gets there and it and its community mature a bit, we will all cheer. Until then ... we'll just get back to work.
Fundamentally, I don't want my type to be generic over all implementations. I want a concrete type. I want the type returned by one specific, often private, function.
But, nope. Not today. Maybe with TAIT, whenever that ships.
Could this problem be reformulated to use associated types, rather than full generics?
I want to make another struct which slowly consumes that impl Iterator. So, the second struct is struct MyStruct { iter: (iterator returned from that function above) }.
Unfortunately, the iterator type doesn’t have a name. So currently that means it’s impossible to put this iterator in a struct.
This makes anything returned via impl Iterator (and impl Future and so on) second class citizens in rust. You are very limited in the ways you can use these objects compared to normal values.
My code is filled with hand written iterators that I could construct much more easily with map/filter/fold and friends. But I make them the long way simply so the resulting type has a name. The current situation is very silly.
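To make the complaint concrete, here is a sketch of the two workarounds available on stable Rust today. All names (`evens`, `Consumer`, `BoxedConsumer`) are made up for illustration.

```rust
/// Stand-in for a function whose concrete return type cannot be named.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

/// Workaround 1: make the struct generic over the iterator type.
/// The type parameter then leaks into every signature mentioning `Consumer`.
struct Consumer<I: Iterator<Item = u32>> {
    iter: I,
    total: u32,
}

impl<I: Iterator<Item = u32>> Consumer<I> {
    fn new(iter: I) -> Self {
        Consumer { iter, total: 0 }
    }

    /// Pull one item and return the running total, or None when exhausted.
    fn step(&mut self) -> Option<u32> {
        let n = self.iter.next()?;
        self.total += n;
        Some(self.total)
    }
}

/// Workaround 2: erase the type behind a boxed trait object,
/// paying for a heap allocation and dynamic dispatch.
struct BoxedConsumer {
    iter: Box<dyn Iterator<Item = u32>>,
}
```

Neither gives the struct a field of the one specific type returned by `evens`, which is what the comment is asking for; writing the iterator by hand is the remaining option.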
Hahahaha hard disagree. Last year I implemented the braid protocol (a custom streaming protocol using HTTP) in javascript in less than an hour and about 30 lines of code. Then I spent 2 weeks trying to do the same thing in rust - writing hundreds of lines of code in the process and I couldn't get it to work. Eventually I gave up.
I got it working recently - but only by borrowing some wild tricks from reading the source code of tokio, that I never would have thought of on my own.
> To get Pin stuff out of the way: it is indeed more complicated than it could be (because reverse compatibility etc), but when was the last time you needed to write a poll implementation manually?
Last week, while writing a simple networked database application. Again I needed to produce an async stream, and that's impossible using async fn.
In my experience, that kind of difference boils down to a combination of three things.
- Comparing apples and oranges. For example, Box makes pinning trivial (you can just move in and out of Pin no problem), but oftentimes people new to Rust try to prematurely optimise and eliminate a single pointer lookup. If that's the case, were you really writing the same thing in JS and in Rust?
- An extension to the previous point, the behaviour is usually different. What would happen in your JS implementation if two streams were awaited concurrently, one received a message, and the other had to be cancelled? What if one threw an exception? In Rust, you're forced to think about those things from the start. In JS, you're coding the happy path.
- Trying to reproduce the exact same architecture even if it's awkward or inefficient. For example, it's really really easy to use a stream wrapper [1] to produce a stream from a channel, but then the architecture gets very different.
> Again I needed to produce an async stream, and thats impossible using async fn
I strongly recommend a channel instead. There's also async_stream [2], but channels are simpler and cleaner.
Over two years of writing embedded, web, and CLI rust I didn't have to write a raw future once.
[1] https://docs.rs/tokio-stream/latest/tokio_stream/wrappers/in...
Often. Pin and Poll contribute to the problem of having a two-tiered ecosystem: people who can use async and people who can contribute to async internals. That's a problem I'd love to see fixed.
This is one of the reasons we've spent such a long time working on things like async-function-in-trait (AFIT), so that traits like AsyncRead/AsyncBufRead/AsyncWrite/etc can use that rather than needing Pin/Poll. (And if you need to bridge to things using Poll, it's always possible to use Poll inside an async fn; see things like https://doc.rust-lang.org/std/future/fn.poll_fn.html .)
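A rough sketch of that `poll_fn` bridge, using only the standard library. `Countdown` and its `poll_value` method are hypothetical stand-ins for some poll-style API, and `block_on` is a toy busy-polling executor included only to keep the example self-contained; a real program would use a runtime instead.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough here because `block_on` polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for illustration only: polls the future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Hypothetical poll-style source: pending `remaining` times, then ready.
struct Countdown {
    remaining: u32,
}

impl Countdown {
    fn poll_value(&mut self, cx: &mut Context<'_>) -> Poll<u32> {
        if self.remaining == 0 {
            Poll::Ready(42)
        } else {
            self.remaining -= 1;
            // Ask to be polled again before reporting Pending.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// The bridge: a plain async fn that awaits the poll-style API via `poll_fn`.
async fn read_value(src: &mut Countdown) -> u32 {
    poll_fn(|cx| src.poll_value(cx)).await
}
```

With a `Countdown { remaining: 2 }`, `block_on(read_value(&mut src))` polls three times before the value comes back. No `Pin` appears in `read_value` itself; the pinning is confined to the executor.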
Pointer lookups are cheap-ish, but allocating can be extremely expensive if you do it everywhere. I've seen plenty of lazy, allocation & clone heavy rust code end up running much slower than the equivalent javascript. I assume for this reason.
But in this case, I couldn't get it working even when putting Box<> all over the place.
> What would happen in your JS implementation if two streams were awaited concurrently, one received a message, and the other had to be cancelled? What if one threw an exception? In Rust, you're forced to think about those things from the start. In JS, you're coding the happy path.
I implemented error handling in the javascript code. That was easy - since async generators in javascript support try-catch. Javascript doesn't support concurrent execution - so that problem doesn't exist there.
Did multithreading contribute to javascript being easier to write than rust? Who cares? I had a problem to solve, and javascript made that trivial. Rust made it a total nightmare.
I didn't know about the stream wrappers when I started coding this up. That was how I eventually found the answer to this problem: I read that code, then adapted their approach.
And by the way, have you read the code in those wrappers? It's wild how they glue manual Future implementations and async functions together (with some clever Boxes) to make it work. It blew my mind how complex this code needs to be in order for it to work at all.
> Over two years of writing embedded, web, and CLI rust I didn't have to write a raw future once.
I'm happy for you, and I wish I had the same experience. Streams are bread and butter for my work (CRDTs, distributed systems and collaborative editing). And at this rate? Proper support for streams in rust is probably a decade away.
[1] https://rust-lang.github.io/rfcs/2497-if-let-chains.html
Me too. But that often makes my logic worse. It's quite common to want to share the same code in the else branch, for example:
if let Some(val) = opt {
    if expr {
        // ....
    } else {
        do_complex_stuff();
    }
} else {
    do_complex_stuff();
}
I don't want to copy+paste that complex code in two places. And you can fix that in turn with a named block & named break statements or a function - but .... whyyyyy? Why is this even a thing? The solution is so obvious. The RFC you linked to fix it is fine, and 6 years old now. My sister has a child in grade school who was born after that RFC was drafted.
Yes, you can work around this hole in the language any number of ways with ugly code. But I don't want to do that. I want to use a good language.
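For reference, a sketch of two of those workarounds as they can be written without if-let chains. The function names and the string return values are placeholders chosen so the branches are observable.

```rust
/// Hypothetical stand-in for the shared fallback logic.
fn do_complex_stuff() -> &'static str {
    "fallback"
}

/// Workaround 1: a match guard folds both fallback paths into one arm.
fn with_guard(opt: Option<i32>, expr: bool) -> &'static str {
    match opt {
        Some(_val) if expr => "main path",
        _ => do_complex_stuff(),
    }
}

/// Workaround 2: a labeled block with an early `break`, so the fallback
/// is written once at the end of the block.
fn with_labeled_block(opt: Option<i32>, expr: bool) -> &'static str {
    'done: {
        if let Some(_val) = opt {
            if expr {
                break 'done "main path";
            }
        }
        do_complex_stuff()
    }
}
```

Both call `do_complex_stuff` from a single place. The guard version reads fine for one condition but gets awkward once several `if let` bindings need to be combined, which is the case the RFC targets.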
Just look at how many downloads some of those packages have today.
Look at the dependency tree for a next or nuxt app.
What the js world did is make their build systems somewhat sane, what with not needing babel in every project anymore.
Looks ok to me: https://npmgraph.js.org/?q=next
Ironically, most of the dependencies are actually Rust crates used by swc and turbopack [1][2]. Try running `cargo tree` on either of those crates, it's enlightening to say the least. And of course, Node has a built in file watcher, and even the most popular third party package for file watching (Chokidar) has a single dependency [3].
[1] https://github.com/vercel/next.js/blob/07a55e03a31b16da1d085...
[2] https://github.com/swc-project/swc/blob/b94a0e1fd2b900b05c5f...
React is a dependency of every next application, but I don’t see it there.
Maybe next, the singular package, has fewer dependencies than a project made with next.
I posted this in some other thread:
I am not a Rust expert but the thing with the standard library is that it only has peer dependencies on itself and they are all synced to the same version. Meaning if you only use the std lib you:
1) Will never include two different versions of the same peer dependency because of incompatible version requirements.
2) Will usually not have two dependencies relying on two different peer-dependencies that do the same thing. This can still happen for deprecated std lib features, but tends to be a much lesser issue.
These two issues are usually the ones that cause dependency size explosion in projects.
And it's not a cut-and-dried feature to add. Function effects would add a lot of cognitive load for the developer, along with more implicit bounds, which increases the risk of accidental API-breaking changes. You talk about the compiler implicitly adding the bounds to functions, but what happens when I now add a line in my function that allocates when before it didn't? I just broke my API unless I was also defensively testing all implicit bounds. And if I was testing all implicit bounds, can the language no longer add new bounds? Reversing that and requiring the callee to defensively declare all bounds is a borderline non-starter because it'd be such a huge burden to write any function or refactor anything.
That's probably the least convincing of your examples. My understanding is that effects systems can get complicated fast, and there's no consensus yet on what a good general purpose implementation should look like, never mind a specific implementation for Rust.
This means that if a bundled dependency in the stdlib is ever found to have some design issue that requires breaking changes to fix, then you're out of luck. As you said the stdlib could deprecate the old version and add a new one, but then you're just making problem 2) worse by forcing everyone to include the old deprecated dependency too! Or you could use a third-party implementation, especially if the stdlib doesn't have the features you need, but even then you will still be including the stdlib version in your dependency graph!
Ultimately IMO bundling dependencies in the stdlib just makes the problem worse over time, though it can raise awareness about how to better handle them.
Most dependency management systems do that, but large projects often end up pulling multiple different major versions of (often very large) dependencies.
> 2) worse by forcing everyone to include the old deprecated dependency too!
Like I said I am no expert on Rust, but I assume that Rust can eliminate stdlib dead-code from the runtime? So unused deprecated features shouldn't be included on every build? Also deprecated features often are modified to use the new implementation under the hood, which reduces the code duplication problem.
> Bundling dependencies in the stdlib "solves" the problem by making new major versions impossible.
Yes, which is a feature. For example Go is very annoying about this not only on the stdlib. https://go.dev/doc/go1compat a lot of 3rd party libs follow this principle as well.
I bring up Go a lot but I actually don't like the language that much; it just gets some pragmatic things right.
I am not saying everything should be in the stdlib, but I tend to think that the stdlib should be fairly big and tackle most common problems.
AFAIK what Maven does is an exclusion of dependency edges, which is technically an unsafe thing to do. Cargo [patch] is a replacement of dependency vertices without affecting any edges. (Maven surely has a plugin to do that, but it's not built-in.) They are different things to start with.
Also I believe that the edge exclusion as done by Maven is (not just "technically", but) really unsafe and only supported due to the lack of better alternatives. Edges are conceptually dependent on the incoming vertex, so it should be that vertex's responsibility to override problematic edges. An arbitrary removal of edges (or vertices) is much harder to track and many other systems have related pains from that.
What I'm proposing here is therefore the extension of Cargo's vertex replacement: you should be able to share such replacements so that they can be dealt with systematically. If my transitive dependencies contain some crate X with two different versions 1.0 and 2.0 (say), I should be able to write an adapter from 2.0 to 1.0 or vice versa, and ideally such an adapter should be available from the crate author or from the community. I don't think Maven ever tried any such systematic solution.
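As a very rough sketch of what such an adapter might look like at the type level: the two modules below stand in for versions 1.0 and 2.0 of a hypothetical crate X, and every type and function name is invented for illustration. How Cargo would discover and apply such an adapter is not specified here.

```rust
/// Stand-in for crate X version 1.0 (hypothetical API).
mod x_v1 {
    #[derive(Debug, PartialEq)]
    pub struct Rgb(pub u8, pub u8, pub u8);
}

/// Stand-in for crate X version 2.0, which added an alpha channel (hypothetical).
mod x_v2 {
    #[derive(Debug, PartialEq)]
    pub struct Rgba {
        pub r: u8,
        pub g: u8,
        pub b: u8,
        pub a: u8,
    }
}

/// Adapter 2.0 -> 1.0. Lossy: the alpha channel is dropped.
fn v2_to_v1(c: x_v2::Rgba) -> x_v1::Rgb {
    x_v1::Rgb(c.r, c.g, c.b)
}

/// Adapter 1.0 -> 2.0. Fills in a fully opaque alpha.
fn v1_to_v2(c: x_v1::Rgb) -> x_v2::Rgba {
    x_v2::Rgba { r: c.0, g: c.1, b: c.2, a: 255 }
}
```

Free functions are used rather than `From` impls because an adapter published as its own crate would own neither type, and the orphan rule would reject the trait impl. An adapter shipped by the crate author inside version 2.0 would not have that restriction.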
> That, and the maven repository is moderated. Unlike crates.io.
Only the central repository is moderated by Maven. Maven is not much better than Cargo once you have more remote repositories.
> Crates.io is a real problem. No namespaces, basically unmoderated, tons of abandoned stuff. Version hell like you're talking about.
Namespaces are not a solution for name squatting: a namespace is just another identifier that can be squatted. If you are worried about squatting, the only effective solution is sandboxing; everything else is just moving the goalposts.
The very existence of remote repositories also means that you can't always moderate everything and get rid of abandoned stuff. You have to trust repositories, just as you have to trust crates on crates.io today.
[1] https://www.sonatype.com/blog/the-history-of-maven-central-a...
The problems crates.io struggles with have never been an issue with Maven, regardless of how creatively you try to redefine words.
That's a fact. Deal with it.
Sorry, I have a problem with the word "just" in tech.
Rust makes it easy to use third-party dependencies, and if you don't want to use third-party dependencies, then you're no worse off than in C.
A C++ library author is much more likely to just implement a small feature themselves rather than look for another 3rd party library for it. Adding dependencies to your library is a more involved and manual process, so most authors would do it very selectively.
Saying that - a C++ library might depend on Boost and its 14 million LOC. Obviously it's not all being included in the final binary.
This is conflating Javascript and Rust. Unlike Javascript, Rust does not have a culture of "microdependencies". Crates that get pulled in tend to be providing quite a bit more than "just a small feature", and reimplementing them from scratch every time would be needlessly redundant and result in worse code overall.
[1] https://nextjs.org/docs/getting-started/installation#manual-...
But there's no reason such a "feature" requires bundling dependencies in the stdlib. As you mention 3rd party Go libs manage to do this perfectly fine.
> but I tend to think that the stdlib should be fairly big and tackle most common problems.
I tend to disagree with this, because the way to tackle those common problems will likely change in the future, but the stdlib will be stuck with it for eternity. I would rather have some community-standard 3rd party crate that you can replace in the future when it grows old. See also "Where modules go to die" https://leancrew.com/all-this/2012/04/where-modules-go-to-di...
A challenging architectural problem that several of us are trying to get someone nerdsniped into: inverting the dependency tree, such that you first check what symbols exist in a large crate like windows, then go to all the crates depending on it and see what they actually consume, then go back and only compile the bits needed for those symbols.
That'd be a massive improvement to compilation time, but it's a complicated change. You'd have to either do a two-pass compilation (first to get the symbol list, then again to compile the needed symbols) or leave that instance of the compiler running and feed the list of needed symbols back into it.
I'm not buying this, sorry. Yes, typos and other deceptive things are possible, but having this authority data would allow tools to then use this signal. Not having it seems strictly worse.
JS does support concurrent execution; Promise.all is an example. Without it, JS async would make little sense. The problem very much exists there, and try-catch is only a surface-level answer. As you can see here [1], the interaction of cancellation and async in JS is at least as complex as in Rust, if not more so.
By the way, multithreading has little to do with Pin. I presume you're thinking of Send bounds.
"To work at all" is very dismissive. Those wrappers are complex, but very well abstracted, well defined, and robust, the complexity is essential. Again, look at [1], JS async is hardly less complex, but also much more vague and ill-defined.
The apples-to-apples comparison I’m making here is: “I sit down at my computer with the goal of solving this problem using code. How long before I have a robust solution using the tool at hand?”. Of course the internals of rust and JavaScript’s Future/promise implementations are different. And the resulting performance will be different. That’s what makes the comparison interesting.
It’s like - you could say it’s an apples to oranges comparison to compare walking and driving. They’re so different! But if I want to visit my mum tomorrow, I’m going to take all those variables into account and decide. One of those choices will be strictly better for my use case.
Rust came off terribly in the comparison I made here. I love rust to bits in other ways, but dealing with async streams in rust is currently extremely difficult. Even the core maintainers agree that this part of the language is unfinished.
There are countless obscure holes in rustc, LLVM, and linkers, because they were never meant to be a security barrier against the code they compile. This doesn't affect normal programs, because the exploits are impossible to write by accident, but they are possible to write on purpose.
---
Secondly, it's not 1000 crates from 1000 people. Rust projects tend to split themselves into dozens of micro packages. It's almost like splitting code across multiple .c files, except they're visible in Cargo. Many packages are from a few prolific authors and rust-lang members.
The risk is there, but it's not as outsized as it seems.
Maintainers of your distro do not review code they pull in for security, and the libraries you link to have their own transitive dependencies from hundreds of people, but you usually just don't see them: https://wiki.alopex.li/LetsBeRealAboutDependencies
Rust has cargo-vet and cargo-crev for vetting of dependencies. It's actually much easier to review code of small single-purpose packages.
For compile time, there’s a big difference between needing the attacker to exploit the compiler vs literally just use the standard API (both in terms of difficulty of implementation and ease of spotting what should look like fairly weird code). And there’s a big difference between runtime rust vs compile time rust - there’s no reason that cargo can’t sandbox build.rs execution (not what josephg brought up but honestly my bigger concern).
There is a legitimate risk of runtime supply chain attacks and I don’t see why you wouldn’t want to have facilities within Rust to help you force contractually what code is and isn’t able to do when you invoke it as a way to enforce a top-level audit. Even though rust today doesn’t support it doesn’t make it a bad idea or one that can’t be elegantly integrated into today’s rust.
But beyond that, if you don't review the code, then the rest matters very little. Sandboxed build.rs can still inject code that will escape as soon as you test your code (I don't believe people are diligent enough to always strictly isolate these environments despite the inconvenience). It can attack the linker, and people don't even file CVEs for linkers, because they're expected to get only trusted inputs.
Static access permissions per dependency are generally insufficient, because an untrusted dependency is very likely to find some gadget to use by combining trusted deps, e.g. use trusted serde to deserialize some other trusted type that will do I/O, and such indirection is very hard to stop without having fully capability-based sandbox. But in Rust there's no VM to mediate access between modules or the OS, and isolation purely at the source code level is evidently impossible to get right given the complexity of the type system, and LLVM's love for undefined behavior. The soundness holes are documented all over rustc and LLVM bug trackers, including some WONTFIXes. LLVM cares about performance and compatibility first, including concerns of non-Rust languages. "Just don't write weirdly broken code that insists on hitting a paradox in the optimizer" is a valid answer for LLVM where it was never designed to be a security barrier against code that is both untrusted and expected to have maximum performance and direct low-level hardware access at the same time.
And that's just for sandbox escapes. Malware in deps can do damage in the program without crossing any barriers. Anything auth-adjacent can let an attacker in. Parsers and serializers can manipulate data. Any data structure or string library could inject malicious data that will cross the boundaries and e.g. alter file paths or cause XSS.
Can you give some examples? What ways are there to write safe rust code & do nasty things, affecting other parts of the binary?
Is there any reason bugs like this in LLVM / rustc couldn't be, simply, fixed as they're found?
They can be fixed, but as always, there’s a lot of work to do. The bug that the above package relies on has never been seen in the wild, only from handcrafted code to invoke it, and so is less of a priority than other things.
And some fixes are harder than others. If a fix is going to be a lot of work, but is very obscure, it’s likely to exist for a long time.
Seems like awareness about this threat vector is becoming more widespread, but I don't hear much discussion trickling through the grapevine re: solutions.
The harder bit is annotating things - while you can protect against std::fs, it’s likely harder to guarantee that malicious code doesn’t just call syscalls directly via assembly. There’s too many escapes possible which is why I suspect no one has particularly championed this idea.
So? Panics or traps from stack overflows don't allow 3rd party code to write to arbitrary files on my filesystem. Nor does integer overflow.
Maybe there's some clever layered attack which could pull off something like that. But, fine! Right now the state is "anyone in any crate can trivially do anything to my computer". Limiting the granted permission to only allowing panics, infinite loops, integer overflows and stack overflows sounds like a big win to me!
If people do figure out ways to turn a stack overflow in safe rust into RCE, well, that was already a soundness hole in the language. Let's fix it.
But that was about the general language-based security, and you are correct that this particular case wouldn't matter much for Cargo. I only used this example in order to show that fully verifying language-based security is very hard in general. Even Coq, a well-known proof verifier with a solid type theory and implementation, suffered from some bug that allowed `false` to be proved [1]. It's just that hard---not really feasible.
If you want to prevent stack overflows, the compiler can calculate the maximum stack space needed by any call tree. (Well, so long as the code in question isn't recursive - but again, that could be enforced at compile time.)
That seems like something that could be checked statically. Alternatively, the kernel could dynamically allocate exactly the right amount of stack space for its own threads.
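To make the idea concrete, here is a minimal sketch of that calculation over a toy call graph. The `Func` record, the frame sizes and the function names are all made up for illustration; a real compiler would work from its own call graph and frame layout. It returns `None` as soon as it finds recursion, which is the case that would have to be rejected at compile time.

```rust
use std::collections::HashMap;

/// Hypothetical per-function record: stack bytes for one frame, plus direct callees.
struct Func {
    frame_bytes: u64,
    callees: Vec<&'static str>,
}

/// Worst-case stack usage when entering `name`, or `None` if the call
/// tree is recursive (unbounded) or mentions an unknown function.
fn max_stack(
    graph: &HashMap<&'static str, Func>,
    name: &str,
    on_path: &mut Vec<String>,
) -> Option<u64> {
    // A function that is already on the current call path means recursion.
    if on_path.iter().any(|n| n.as_str() == name) {
        return None;
    }
    let func = graph.get(name)?;
    on_path.push(name.to_string());
    let mut deepest = 0u64;
    for callee in &func.callees {
        match max_stack(graph, callee, on_path) {
            Some(depth) => deepest = deepest.max(depth),
            None => {
                on_path.pop();
                return None;
            }
        }
    }
    on_path.pop();
    // This frame plus the most expensive chain of callees below it.
    Some(func.frame_bytes + deepest)
}

/// Made-up example graph: main calls parse and render, render calls blit.
fn demo_graph() -> HashMap<&'static str, Func> {
    let mut g = HashMap::new();
    g.insert("main", Func { frame_bytes: 64, callees: vec!["parse", "render"] });
    g.insert("parse", Func { frame_bytes: 128, callees: vec![] });
    g.insert("render", Func { frame_bytes: 256, callees: vec!["blit"] });
    g.insert("blit", Func { frame_bytes: 32, callees: vec![] });
    g
}
```

Calling `max_stack(&demo_graph(), "main", &mut Vec::new())` walks main, then render, then blit as the deepest chain. Indirect calls and function pointers are ignored here, and those are exactly what makes the real problem harder.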
The issue of build-time security is somewhat separate, and it actually seems easier to tackle strongly. There have been proposals floated around to make proc macros use wasm and run in a sandbox, and IMO Rust should absolutely move in this direction.
This is one of the value propositions of Roc
- it's perfectly possible to be a successful user of the async ecosystem as it is now while building great software;
- this two-tiered phenomenon is not unique to Rust, JS and Python struggle with it just as much (if not more due to less refined and messier design). As an example, [1] is elegant, but complex, and I'm less sure it's correct compared to a gnarly async Rust future, because the underlying async semantics are in flux.
Of course I'd love for the remaining snags (like AFIT) to go away, and simplified Pin story or better APIs would be great, but this negativity around async Rust is just wrong. It's a massive success already and should be celebrated.
[1]: https://github.com/florimondmanca/aiometer/blob/master/src/a...
Absolutely; to be clear, I think async Rust has been a massive success, and has a lot of painfully rough edges. The rough edges don't invalidate the massive success, and the massive success doesn't invalidate the painfully rough edges.
Yes, I do want a good programming language. I agree it's relatively minor compared to the other issues I talked about in the blog post.
In what way is this at odds with other aspects of the language? If it's at odds with the rest of the language, why is this feature being added?
The line has to be drawn somewhere. And that line is much more reasonable when you can trust large trillion dollar backed standard libraries from the likes of Go or .NET, in contrast to a fragmented ecosystem from other languages.
What good is vendoring 4 million lines of code if I have to review them anyway at least once? I'd rather have a strong MSFT/GOOGL standard library which I can rely upon and not have to audit, thank you very much.
Rust may not have "left pad" type micro-dependencies, but it definitely has a dependency culture. Looking at `cargo tree` for the medium size project I'm working on, the deepest dependency branch goes to 12 layers deep. There's obviously a lot of duplicates - most dependency trees eventually end with the same few common libraries - but it does require work to audit those and understand their risk profile and code quality.
It absolutely does by the C/C++ standards. Last time I checked the zed editor had 1000+ dependencies. That amount of crates usually results in at least 300-400 separately maintained projects by running 'cargo supply-chain'. This is an absurd number.
Not to mention C/C++ dependency situation is a low bar to clear.
- ok, a single ext4 file inode changes, and its filename matches my hardcoded string
- oh, you don’t want to match against just changes to “package.json” but you want to match against a regex? voila, now you need a regex engine
- what about handling a directory rename? should that trigger matches on all files in the renamed directory?
- should the file watcher be triggered once per file, or just every 5ms? turns out this depends on your use case
- how do symlinks fit into this story?
- let’s say i want to handle once every 5ms- how do i actually wait for 5ms? do i yield the thread? do i allow other async contexts to execute while i’m waiting? how do those contexts know when to execute and when to yield back to me? now you have an async runtime with timers
- how does buffering work? are there limits on how many file change events can be buffered? do i dynamically allocate more memory as more file changes get buffered? now you need a vector/arraylist implementation
And this is before you look at what this looks like on different platforms, or if you want polling fallbacks.
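As a rough illustration of the "once per file or every 5ms" and buffering questions in the list above, here is a small sketch of a debouncer. Everything in it is hypothetical: the `Change` type, the millisecond timestamps and the `max_batch` cap are stand-ins, and it works on an already-collected, time-ordered slice instead of a live OS event source.

```rust
use std::collections::BTreeSet;

/// Hypothetical file-change event: a timestamp in milliseconds and a path.
#[derive(Debug, Clone, PartialEq)]
struct Change {
    at_ms: u64,
    path: String,
}

/// Coalesce time-ordered events into batches. A batch opens at its first
/// event and absorbs everything that arrives within `window_ms` of it;
/// repeated paths collapse to one entry, and a batch is flushed early once
/// it holds `max_batch` distinct paths (a crude bound on buffering).
fn debounce(events: &[Change], window_ms: u64, max_batch: usize) -> Vec<Vec<String>> {
    let mut batches = Vec::new();
    let mut current: BTreeSet<String> = BTreeSet::new();
    let mut start: Option<u64> = None;

    for ev in events {
        let expired = matches!(start, Some(s) if ev.at_ms.saturating_sub(s) >= window_ms);
        if !current.is_empty() && (expired || current.len() >= max_batch) {
            batches.push(std::mem::take(&mut current).into_iter().collect());
            start = None;
        }
        if start.is_none() {
            start = Some(ev.at_ms);
        }
        current.insert(ev.path.clone());
    }
    if !current.is_empty() {
        batches.push(current.into_iter().collect());
    }
    batches
}
```

Even this toy version already needs an ordered set, a growable buffer and a policy for what to do when the buffer fills up, which is the point the list is making.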
Can you do it with fewer dependencies? Probably, if you start making hard tradeoffs and adding even more complexity about what features you activate - but that only adds lines of code, it doesn’t remove them.
What you describe is ideologically nice, but in practice it’s over-optimizing for a goal that most people don’t really care about.
- proper CLI support, with help messages, subcommands and so on
- support for reading cargo's metadata
- logging
- support for dotenv files
- proper shell escaping support
- and it seems also support for colored terminal writing.
Moreover, both watchexec and cargo-watch end up depending on winapi, which includes bindings for a lot of Windows APIs, some of which might be needed and some of which might not be.
This could also be worse if the official windows crate by Microsoft was used (or maybe it's already used due to some dependency, I haven't checked), since that's gigantic.
Meaning if you only use the std lib you:
1) Will never include two different versions of the same peer dependency because of incompatible version requirements.
2) Will usually not have two dependencies relying on two different peer-dependencies that do the same thing. This can still happen for deprecated std lib features, but tends to be a much lesser issue.
These two issues are usually the ones that cause dependency size explosion in projects.
FWIW I checked out the nightly toolchain, and it looks like the stdlib is less than 400k SLoC. So literally 10x smaller.
BSD, Mac OS and Linux share the same interface that approximates POSIX---so it only supports a single platform with different variants. Its CLI is not well-designed, it's just a fixed unconditional terminal sequence that doesn't even look at $TERM, and its options have no long counterpart (probably because it couldn't use getopt_long, which is a GNU extension). And cargo-watch actually parses the `cargo metadata` JSON output (guess what's required for parsing JSON in C) and deals with ignore patterns which are consistent in syntax (guess what's required for doing that besides fnmatch).
And I don't even mean to say that the supposed figure of 4M LoC is all required. In fact, while the problem itself does exist, I don't think that figure is accurate at all, given the massive `windows` crate was blindly counted towards it. I guess the faithful reproduction of cargo-watch without any external library will take about 20--50K lines of code in Rust and in C. But doing it in C would be much more painful and you will instead cut requirements.
Certainly nothing on the order of MLOC. Ditto for other features you listed.
> it's just a fixed unconditional terminal sequence
Are you referring to the clear feature? Yes, it's fixed. It's also pretty standard in that regard. It's optional so if it breaks (probably on xterm because it's weird but that's about it) you don't have to use it and can just issue a clear command manually as part of whatever you're running in the TTY it gives you. Honestly I don't think the feature is even really needed. I highly doubt cargo-watch needs to do anything with TERM so I am not sure why you mention it (spamming colours everywhere is eye candy not a feature).
But more importantly, this is just a convenience feature and not part of the "CLI". Not supporting long options isn't indicative of a poorly designed CLI. However, adding long option support without any dependencies is only a couple of hundred lines of C.
> And cargo-watch actually parses the `cargo metadata` JSON output
Which is unnecessary and entirely cargo specific. Meanwhile you can achieve the same effect with entr by just chaining it with an appropriate jq invocation. entr is more flexible by not having this feature.
> (guess what's required for parsing JSON in C)
Not really anywhere near as many lines as you seem to think.
> deals with ignore patterns which are consistent in syntax (guess what's required for doing that besides from fnmatch).
Again, entr doesn't deal with ignore patterns because it allows the end user to decide how to handle this themselves. It takes a list of filenames via stdin. This is not a design problem, it's just a design choice. It makes it more flexible. But again, if you wanted to write this in C, it's only another couple of hundred lines.
From my experience doing windows development, windows support probably isn't as painful as you seem to think.
All in all, I imagine it would take under 10k to have all the features you seem to care about AND nothing non-eye-candy would have to be cut (although for the eye candy, it's not exactly hideously difficult to parse terminfo. the terminfo crate for rust is pretty small (3.2k SLOC) and it would actually be that small (or smaller) if it didn't over-engineer the fuck out of the problem by using the nom, fnv, and phf crates given we're parsing terminfo not genetic information and doing it once at program startup not 10000 times per second).
Yes, I think trying to golf the problem is probably not appropriate. But 4M LoC is fucking ridiculous by any metric. 1M would still be ridiculous. 100k would also be ridiculous 50k is still pretty ridiculous.
You are correct, but that's about the only divergence that matters in this context. As I've noted elsewhere, you can't even safely use `char*` for file names in Windows; it should be `wchar_t*` in order to avoid any encoding problem.
> Are you referring to the clear feature? Yes, it's fixed. It's also pretty standard in that regard.
At the very least it should have checked for TTY in advance. I'm not even interested in terminfo (which should go die).
> spamming colours everywhere is eye candy not a feature
Agreed that "spamming" is a real problem, provided that you don't treat any amount of color as spamming.
> Which is unnecessary and entirely cargo specific. Meanwhile you can achieve the same effect with entr by just chaining it with an appropriate jq invocation. entr is more flexible by not having this feature.
Cargo-watch was strictly designed for Cargo users, which would obviously want to watch some Cargo workspace. Entr just happens to be not designed for this use case. And jq is much larger than entr, so you should instead consider the size of entr + jq by that logic.
> Not really anywhere near as many lines as you seem to think.
Yeah, my estimate is about 300 lines of code with a carefully chosen set of interface. But you have to ensure that it is indeed correct yourself, and JSON is already known for its sloppily worded standard and varying implementation [1]. That's what is actually required.
[1] https://seriot.ch/projects/parsing_json.html
> Yes, I think trying to golf the problem is probably not appropriate. But 4M LoC is fucking ridiculous by any metric. 1M would still be ridiculous.
And that 4M LoC is fucking ridiculous because it includes all `#[cfg]`-ignored lines in various crates including most of 2.2M LoC in the `windows` crate. That figure is just fucking incorrect and not relevant!
> 100k would also be ridiculous 50k is still pretty ridiculous.
And for this part, you would be correct if I didn't say the "faithful" reproduction. I'm totally sure that some thousand lines of Rust code should be enough to deliver a functionally identical program, but that's short of the faithful reproduction. This faithfulness issue actually occurs in many comparisons between Rust and C/C++; even the simple "Hello, world!" program does a different thing in Rust and in C because Rust panics when it couldn't write the whole text for example. 50K is just a safety margin for such subtle differences. (I can for example imagine some Unicode stuffs around...)
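A small sketch of the "Hello, world!" difference mentioned above: `println!` panics if the write to stdout fails, so a version that wants to handle the failure itself has to go through `writeln!` and look at the `io::Result`. The `Broken` writer below is a made-up stand-in for a closed pipe or full disk.

```rust
use std::io::{self, Write};

/// Write the greeting to any writer and report failure to the caller
/// instead of panicking.
fn greet<W: Write>(out: &mut W) -> io::Result<()> {
    writeln!(out, "Hello, world!")
}

/// Hypothetical writer that always fails, simulating a closed pipe.
struct Broken;

impl Write for Broken {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "pipe closed"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

With a real program you would pass `&mut io::stdout()` to `greet` and decide what a failed write should mean. A C `printf` that ignores its return value makes neither choice explicit.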
Do these OSs share file watch interfaces? Linux itself has, last I checked, three incompatible file watch APIs.
Rebuild project if a source file is modified or added to the src/ directory:
$ while sleep 0.1; do ls src/*.rb | entr -d make; done
Though I shudder to think of the amount of code needed to rewrite that in a compiled language while sticking to the principle of not reinventing anything remotely wheel-shaped. (Btw the libinotify/src is like 2.3kloc, inotify cli <1.2kloc.)
You joke, but Windows support is the main (probably the only?) reason why cargo-watch is huge. Rust ecosystem has some weird shit when interacting with Windows.
And the bugs should simply get fixed.
Maven Central people nuked the artifact that may have caused confusion, and if the owners try anything like that again, it's likely their domain will be banned from publishing.
If that's the hill you want to die on, good luck.
Hence the requirement to also limit / ban `unsafe` in untrusted code. I mean, if you can poke raw memory, the game is up. But most utility crates don't need unsafe code.
> Package scope is typically too coarse - a package might export multiple different pieces of related functionality and you’d want to be able to use the “safe” parts you audited
Yeah; I'm imagining a combination of "I give these permissions to this package" in Cargo.toml. And then at compile time, the compiler only checks the call tree of any functions I actually call. It's fine if a crate has utility methods that access std::fs, so long as they're never actually called by my program.
I think you’d be surprised by how much code has a transitive unsafe somewhere in the call chain. For example, RefCell and Mutex would need unsafe and I think you’d agree those are “safe constructs” that you would want available to “utility” code that shouldn’t have filesystem access. So now you have to go and reenable constructs that use unsafe that should be allowed anyway. It’s a massively difficult undertaking.
Having easier runtime mechanisms for dropping filesystem permissions would definitely be better. Something like you are required to do filesystem access through an ownership token that determines what you can access and you can specify the “none” token for most code and even do a dynamic downgrade. There’s some such facilities on Linux but they’re quite primitive - it’s process wide and once dropped you can never regain that permission. That’s why the model is to isolate the different parts into separate processes since that’s how OSes scope permissions but it’s super hard and a lot of boilerplate to do something that feels like it should be easy.
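Here is a rough sketch of what such an ownership token could look like as a plain library type. `FsCap` and all of its methods are invented for illustration; nothing like this exists in std. It also only helps if the untrusted code has no other route to `std::fs`, which is the hard part being discussed.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Hypothetical capability token: `None` means no filesystem access at all,
/// `Some(root)` means access to paths under `root`.
pub struct FsCap {
    root: Option<PathBuf>,
}

impl FsCap {
    /// The "none" token handed to most code.
    pub fn none() -> FsCap {
        FsCap { root: None }
    }

    /// A token scoped to one directory tree.
    pub fn rooted(dir: &str) -> FsCap {
        FsCap { root: Some(PathBuf::from(dir)) }
    }

    /// Dynamic downgrade: derive a token for a subdirectory only.
    pub fn narrow(&self, sub: &str) -> FsCap {
        FsCap { root: self.root.as_ref().map(|r| r.join(sub)) }
    }

    /// Purely lexical check. A real design would also need to deal with
    /// symlinks; here we only reject `..` components outright.
    pub fn allows(&self, path: &Path) -> bool {
        let no_parent = !path.components().any(|c| c == Component::ParentDir);
        match &self.root {
            Some(root) => no_parent && path.starts_with(root),
            None => false,
        }
    }

    /// All reads go through the token, so a caller without one cannot read.
    pub fn read(&self, path: &Path) -> io::Result<String> {
        if !self.allows(path) {
            return Err(io::Error::new(io::ErrorKind::PermissionDenied, "no capability"));
        }
        std::fs::read_to_string(path)
    }
}
```

Unlike the process-wide Linux mechanisms mentioned above, a downgrade here is just a new value: the caller keeps its broader token and passes the narrowed one down.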
RefCell and Mutex have safe wrappers. If you stick to the safe APIs of those types, it should be impossible to read / write to arbitrary memory.
I think we just don't want untrusted code itself using unsafe. We could easily allow a way to whitelist trusted crates, even when they appear deep in the call tree. This would also be useful for things like tokio, and maybe pin_project and others.
To me, this sounds as if the Pin concept is so difficult to understand that it's hard to even formulate correct criticism about it.
I get that Pin serves a critical need related to generators and async, and in that it was a stroke of genius. But you as the creator of Pin might not be the right person to judge how difficult Pin is for the more average developers among us.
True. How long should that process take? A month? A year? Two years?
I ask because this feature has been talked about since I started using rust - which (I just checked) was at the start of 2017. That's nearly 8 years ago now.
6 years ago this RFC was written: https://rust-lang.github.io/rfcs/2497-if-let-chains.html - which fixes my issues. But it still hasn't shipped.
Do I have too high expectations? Is 6 years too quick? Maybe, a decade is a reasonable amount of time to spend, to really talk through the options? Apparently 433 people contributed to Rust 1.81. Is that not enough people? Do we need more people, maybe? Would that help?
Yes, I do feel piqued by the glacial progress. I don't care about the || operator here - since I don't have any instinct for what that should do. And complex match expressions are already covered by match, anyway.
Rust doesn't do the obvious thing, in an obvious, common situation. If you ask me, this isn't the kind of problem that should take over 6 years to solve.
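For anyone who hasn't run into it, this is roughly the shape of the problem. The function below is a made-up example written in the nested style that compiles without the feature; the comment shows how the same logic would read with the chain syntax from the RFC linked above.

```rust
/// Accept a user-supplied port only if it is present, parses as a u16,
/// and is outside the privileged range. (Hypothetical example function.)
fn user_port(input: Option<&str>) -> Option<u16> {
    // Without chains: one level of nesting per condition.
    if let Some(text) = input {
        if let Ok(port) = text.parse::<u16>() {
            if port >= 1024 {
                return Some(port);
            }
        }
    }
    None

    // With if-let chains (RFC 2497 syntax) the body would be a single test:
    //
    //     if let Some(text) = input
    //         && let Ok(port) = text.parse::<u16>()
    //         && port >= 1024
    //     {
    //         return Some(port);
    //     }
}
```

The nesting can be avoided with `and_then` and `filter` in a simple case like this, but that stops working nicely once the branches need early returns or borrow from each other.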
> Your commentary on Pin in this post is even more sophomoric than the rest of it and mostly either wrong or off the point. I find this quite frustrating, especially since I wrote detailed posts explaining Pin and its development just a few months ago.
If I'm totally off base, I'd appreciate more details and less personal insults.
I've certainly given Pin an honest go. I've used Pin. I've read the documentation, gotten confused and read everything again. I've struggled to write code using it, given up, then come back to it and ultimately overcame my struggles. I've boxed so many things. So many things.
The thing I've struggled with the most was writing a custom async stream wrapper around a value that changes over time. I used tokio's RwLock and broadcast channel to publish changes. My Future needed a self-referential type (because I need to hold a RwLockGuard across an async boundary). So I couldn't just write a simple, custom struct. But I also couldn't use an async function, because I needed to implement the stream trait.
As far as I can tell, the only way to make that code work was to glue async fn and Futures together in a weird frankenstruct. (Is this a common pattern? For all the essays about Pin and Future out there, I haven't heard anyone talk about this.) I got the idea from how tokio implements their own stream adaptor for broadcast streams[1]. And with that, I got this hairy piece of code working.
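To show the shape of that pattern without pulling in tokio, here is a std-only sketch. `MiniStream` is a hypothetical stand-in for the real `Stream` trait, `next_value` stands in for "await the channel, then read under the lock", and the no-op waker exists only so the example can be polled by hand. It is not the code from the linked tokio adaptor, just the same idea: a hand-written `poll_next` that stores and re-arms a boxed future created by an async fn.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the `Stream` trait from the async ecosystem.
trait MiniStream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Stand-in for the async work that produces the next item.
async fn next_value(prev: u32) -> u32 {
    prev + 1
}

/// The "frankenstruct": a hand-written stream holding a boxed async fn future.
struct Counter {
    limit: u32,
    pending: Option<Pin<Box<dyn Future<Output = u32>>>>,
}

impl Counter {
    fn new(limit: u32) -> Self {
        Counter { limit, pending: Some(Box::pin(next_value(0))) }
    }
}

impl MiniStream for Counter {
    type Item = u32;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        // The future is boxed, so `Counter` itself is `Unpin` and `get_mut` is allowed.
        let this = self.get_mut();
        let Some(fut) = this.pending.as_mut() else {
            return Poll::Ready(None); // already finished
        };
        match fut.as_mut().poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(v) if v > this.limit => {
                this.pending = None;
                Poll::Ready(None)
            }
            Poll::Ready(v) => {
                // Re-arm: build a fresh future for the next item.
                this.pending = Some(Box::pin(next_value(v)));
                Poll::Ready(Some(v))
            }
        }
    }
}

/// Waker that does nothing; enough to drive futures that never return Pending.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll the stream by hand until it ends or would block.
fn collect_all(mut stream: Counter) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    while let Poll::Ready(Some(v)) = Pin::new(&mut stream).poll_next(&mut cx) {
        out.push(v);
    }
    out
}
```

The boxing is what lets the inner future be self-referential while the outer struct stays movable, at the cost of one allocation per item.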
But who knows? I've written hundreds of lines of code on top of Pin. Not thousands. Maybe I still don't truly get it. I've read plenty of blog posts, with all sorts of ideas about Pin being about a place, or about a value, or a life philosophy. But yes, I haven't yet read the 9000 words of essay you linked. Maybe if I do so I'll finally, finally be enlightened.
But I doubt it. I think Pin is hard. If it was simple, you wouldn't have written 9000 words talking about it. As you say:
> Unfortunately, [pin] has also been one of the least accessible and most misunderstood elements of async Rust.
Pin foists all its complexity onto the programmer. And for that reason, I think it's a bad design. Maybe it was the best option at the time. But if we're still talking about it years later - if it's still confusing people so long after its introduction - then it's a bad part of the language.
I also suspect there are way simpler designs which could solve the problems that pin solves. Maybe I'm an idiot, and I'm not the guy who'll figure those designs out. But in that case, I'd really like to inspire smarter people than me to think about it. There's gotta be a simpler approach. It would be incredibly sad if people are still struggling with Pin long after I'm dead.
[1] https://github.com/tokio-rs/tokio/blob/master/tokio-stream/s...
The state of async Rust is not better because no one hired me to finish it past the MVP. I have solutions to all of your problems (implementing a stream with async/await, making Pin easier to use, etc). Since I am not working on it the project has spun its wheels on goofy ideas and gotten almost no work done in this space for years. I agree this is a bad situation. I've devoted a lot of my free time in the past year to explaining what I think the project should do, and it's slowly starting to move in that direction.
My understanding is that if let chaining is stalled because some within the project want to pretend there's a solution where a pattern matching operator could actually be a boolean expression. I agree that stalling things forever on the idea that there will magically be a perfect solution that has every desirable property in the future is a bad pattern of behavior that the Rust project exhibits. Tony Hoare had this insightful thing to say:
> One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies.
> The first method is far more difficult. It demands the same skill, devotion, insight, and even inspiration as the discovery of the simple physical laws which underlie the complex phenomena of nature. It also requires a willingness to accept objectives which are limited by physical, logical, and technological constraints, and to accept a compromise when conflicting objectives cannot be met. No committee will ever do this until it is too late.
But appreciation does little to temper my frustration. Watching the rust project spin its wheels has dulled any enthusiasm I might have once had for its open, consensus based processes. I could get involved - but I worry I'd be yet another commenter making long issue threads even longer. I don't think Rust has a "not enough cooks in the kitchen" shaped problem.
I love that quote. I agree with it - at some point, like with Pin and the 'foo.await' vs 'await foo' discussion - you just have to pick an answer, any answer, and move forward. But the siren song of that "simple and elegant" solution still calls. Alan Kay once made a similar observation. He pointed out that it took humanity thousands of years (and two geniuses) to invent calculus. And now we teach it to 8th grade children. How remarkable. Clearly, the right point of view is worth 50 IQ points.
I look forward to reading your blog posts on the topic. I suspect there's lots of workable solutions out there in the infinite solution space. Research is always harder and slower than I think it should be. And this is very much a research question.
You seem very convinced that replacing Pin with Move would be a mistake. Maybe! I wouldn't be surprised if the Move vs Pin question is a red herring. I suspect there's an entirely different approach which would work much better - something like, as I said in my post, attacking the problem by changing the borrow checker. Something like that. Maybe that wouldn't be viable for rust. That's fine. There will be more languages following in its footsteps. I want them to be as good as possible.
And I swear, there's a better answer here somewhere.
I can feel it.
God bless you
That's very intriguing. Do you have any examples? Willing to learn more.
If you want a feature that everyone complains about, like Pin or async rust, yes, that is how long that process should take.
If you don't want a feature that everyone uses as their stock example for why language designers are drooling morons, and the feature has any amount of complexity to it, then the process should probably take over a decade.
There's a commonality to the features you're complaining about, and it's things where the desire to push a MVP that satisfied some, but not all, use cases overrode the time necessary to fully understand the consequences of decisions not just to implement the feature but its necessary interactions with other features, present and future.
I do appreciate the irony, though, of you starting out complaining about Rust moving too slowly before launching into detailed criticism of a feature that most agree is (at least in part) the result of Rust moving too quickly.
Is Pin the result of moving too quickly? Maybe.
Personally, I’m not convinced that it’s generally possible to explore the design space properly by having long conversations. At some point, you have to ship. Figure out if it’s a good idea with your feet. Just like pin did.
I don’t claim to be smarter than anyone on the rust team who worked on this feature before it was launched. Only, now it’s launched and people have used it, I think we should go back to the drawing board and keep looking for other approaches.
As someone who has worked a lot to get if let chains stabilized (but so far not achieved the goal), there are surprisingly few blockers: only an ICE. But the ICE fix requires doing some breaking changes, so it's being phased in as part of the 2024 edition. The alternative to doing breaking changes would be to make if let chains different from if let, which wouldn't be nice.
Hopefully we'll have stable if let chains soon-ish. But note that nowadays on Rust, it's mostly volunteers working on the language, so things might not be as fast any more.
In any case, writing a language from scratch is going to be ten times more involved than targeting nightly Rust where if let chains are available.
?? Then why did the language team put it on the 2024 roadmap? Am I looking at something different? (Specifically under the 'Express yourself more easily' (1) goal, which links to the RFC issue (2)).
It certainly looks like the implementation is both complete and unblocked, and actively used.
It looks more like the issue is (despite being put on the roadmap and broadly approved as a feature), being argued about because of the alternative proposal for 'is' syntax.
ie. If you want to generalize then yes, there are features which are difficult to implement (yeah, I'll just make a Move trait... yeah... No. It's not that easy).
BUT.
That's not a problem.
A lot of clever folk can work through issues like that and find solutions for that kind of problem.
The real problem is that RFCs like this end up in the nebulous 'maybe maybe' bin, where they're implemented, have people who want them, have people who use them, have, broadly, the approval of the lang team (It's on the roadmap).
...but then, they sit there.
For months. Or years. While people argue about it.
It's kind of shit.
If you're not going to do it, make the call, close the RFC. Say "we're not doing this". Bin the code.
Or... merge it into stable.
Someone has to make the call on stuff like this, and it's not happening.
This seems to happen to a fair few RFCs to a greater or lesser extent, but this one is particularly egregious in my opinion.
[1] - https://lang-team.rust-lang.org/roadmaps/roadmap-2024.html#t... [2] - https://github.com/rust-lang/rust/issues/53667
Also, why would pinned be a syntactic sugar for Pin and not the other way around?
Async is a good example of a complex feature that needs a fairly detailed blog post to understand the nuances. Pretty much any language with coroutines of some sort will have 1 or many blog posts going into great detail explaining exactly how those things work.
Similarly, assuming Rust added HKT, that would also require a series of blog posts to explain as the concept itself is foreign to most programmers.
Async is a great example of this problem. It is way more cumbersome in Rust than it could be, in a different universe where Rust concurrency made different choices.
Sometimes you need knowledge to understand things.
Here are some relevant posts:
https://without.boats/blog/a-four-year-plan/
https://without.boats/blog/poll-next/
Yes, this is true. But I think the overhead of writing that kind of code would not be as enormous as 30k lines or anything in that order.
> At the very least it should have checked for TTY in advance. I'm not even interested in terminfo (which should go die).
Maybe. It's an explicit option you must pass. It's often useful to be able to override isatty decisions when you want to embed terminal escapes in output to something like less. But for clear it's debatable.
I would say it's fine as it is.
Also, if isatty is "the very least" what else do you propose?
> Agreed that "spamming" is a real problem, provided that you don't treat any amount of color as spamming.
I treat any amount of color as spamming when alternative options exist. Colours are useful for: syntax highlighting, additional information from ls. Not for telling you that a new line of text is available for you to read in your terminal.
There are many things where colours are completely superfluous but are not over-used. I still think that colours should be the exception not the rule.
> Cargo-watch was strictly designed for Cargo users, which would obviously want to watch some Cargo workspace. Entr just happens to be not designed for this use case. And jq is much larger than entr, so you should instead consider the size of entr + jq by that logic.
Yes jq is larger than entr. But it's not 3.9M SLOC. It also has many features that cargo-watch doesn't. If you wanted something cargo specific you could just write something specific to that in not very much code at all. The point is that the combination of jq and entr can do more than cargo-watch with less code.
> and JSON is already known for its sloppily worded standard and varying implementation [1]. That's what is actually required.
I hope you can agree that no number of millions of lines of code can fix JSON being trash. What would solve JSON being trash is if people stopped using it. But that's also not going to happen. So we are just going to have to deal with JSON being trash.
> And for this part, you would be correct if I didn't say the "faithful" reproduction. I'm totally sure that some thousand lines of Rust code should be enough to deliver a functionally identical program, but that's short of the faithful reproduction. This faithfulness issue actually occurs in many comparisons between Rust and C/C++; even the simple "Hello, world!" program does a different thing in Rust and in C because Rust panics when it couldn't write the whole text for example. 50K is just a safety margin for such subtle differences. (I can for example imagine some Unicode stuffs around...)
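To make the "Hello, world!" point in the quote above concrete: Rust's `println!` panics if the write to stdout fails, while the usual C version ignores the return value of `printf`. A minimal sketch of the non-panicking alternative, assuming a hypothetical helper `write_greeting` and a made-up `FailingWriter` that stands in for a closed pipe:

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes the greeting and reports any I/O failure
/// to the caller instead of panicking the way `println!` does.
pub fn write_greeting<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"Hello, world!\n")?;
    out.flush()
}

/// Made-up writer that always fails, standing in for a closed pipe or a full disk.
pub struct FailingWriter;

impl Write for FailingWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(
            io::ErrorKind::BrokenPipe,
            "simulated write failure",
        ))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Calling `write_greeting(&mut std::io::stdout().lock())` from `main` and deciding what to do with the `Err` is roughly the choice the C program never makes explicitly.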
Regardless of all the obstacles, I put my money on 20k max in Rust with everything vendored, including writing your own Windows bindings.
But neither of us has time for that.
In terms of UX it's just moving the burden to the user, who may not be aware of that problem or even the existence of `-c`. The default matters.
> I treat any amount of color as spamming when alternative options exist. Colours are useful for: syntax highlighting, additional information from ls. Not for telling you that a new line of text is available for you to read in your terminal.
I'm a bit more lenient but agree on broad points. The bare terminal is too bad for UX, which is why I'm generous about any attempt to improve UX (but not color spamming).
I'm more cautious about emojis than colors by the way, because they are inherently colored while you can't easily customize emojis themselves. They are much more annoying than mere colors.
> It also has many features that cargo-watch doesn't. If you wanted something cargo specific you could just write something specific to that in not very much code at all. The point is that the combination of jq and entr can do more than cargo-watch with less code.
I think you have been sidetracked then, as the very starting point was about cargo-watch being apparently too large. It's too large partly because of bloated dependencies but also because dependencies are composed instead of being inlined. Your point shifted from no dependencies (or no compositions as an extension) to minimal compositions, at least I feel so. If that's your true point I have no real objection.
> I hope you can agree that no number of millions of lines of code can fix JSON being trash. What would solve JSON being trash is if people stopped using it. But that's also not going to happen. So we are just going to have to deal with JSON being trash.
Absolutely agreed. JSON only survived because of the immense popularity of JS and good timing, and continues to thrive because of that initial momentum. It's not even hard to slightly amend JSON to make it much better... (I even designed a well-defined JSON superset many years ago!)
The default is no clear.
It could be the default is to clear and then I would agree that an isatty check would be necessary. But an isatty check for an explicit option here would be as weird as an isatty check for --color=always for something like ls.
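The distinction drawn here (an explicit option overrides the isatty check, the way `--color=always` does for ls) can be sketched roughly as follows. `EscapeMode`, `should_emit_escapes`, and `clear_prefix` are hypothetical names for illustration and are not taken from cargo-watch or entr; the only real API used is `std::io::IsTerminal` from the standard library (Rust 1.70+).

```rust
use std::io::IsTerminal;

/// Hypothetical three-way setting, modeled on the `--color=auto|always|never` convention.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EscapeMode {
    Auto,
    Always,
    Never,
}

/// Pure decision: an explicit choice wins, and only `Auto` consults the tty check.
pub fn should_emit_escapes(mode: EscapeMode, is_tty: bool) -> bool {
    match mode {
        EscapeMode::Always => true,
        EscapeMode::Never => false,
        EscapeMode::Auto => is_tty,
    }
}

/// Returns the ANSI "clear screen, cursor home" sequence, or an empty string
/// when escapes should not be written to stdout.
pub fn clear_prefix(mode: EscapeMode) -> &'static str {
    let is_tty = std::io::stdout().is_terminal();
    if should_emit_escapes(mode, is_tty) {
        "\x1b[2J\x1b[H"
    } else {
        ""
    }
}
```

Under this sketch, an explicit "clear" flag maps to `Always` and skips the tty check, while a clear-by-default design would map to `Auto`, which is where the isatty check becomes necessary.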
> The bare terminal is too bad for UX
I think it depends on the task and the person. You wouldn't see me doing image editing, 3d modelling, audio mastering, or web browsing in a terminal. But for things which do not suffer for it (a surprising number of tasks) it's strictly better UX than a GUI equivalent.
> emojis
Yes, I dislike these. I especially remember when suddenly my terminal would colour emojis because someone felt it was a good idea to add that to some library as a default. :(
> I think you have been sidetracked then, as the very starting point was about cargo-watch being apparently too large. It's too large partly because of bloated dependencies but also because dependencies are composed instead of being inlined. Your point shifted from no dependencies (or no compositions as an extension) to minimal compositions, at least I feel so. If that's your true point I have no real objection.
Well no, I think you can build a cargo-watch equivalent (with a bit of jank) from disparate utilities running in a shell script and still have fewer total lines.
And sure, the line count is a bit inflated with a lot of things not being compiled into the final binary. But the problem we're discussing here is whether it's worth depending on a bunch of things when all you're using is one or two functions.
As I understand it, whenever you do anything with Windows, you pull in hideous quantities of code for wrapping entire swathes of Windows. Why can't this be split up more, so that if all I want is e.g. file watching, I get just file watching? I know Windows has some basic things you inevitably always need, but surely this isn't enough to make up 2M SLOC. I've written C code for Windows and yes, it's painful, but it's not 2M SLOC of boilerplate painful.
Large complex dependency graphs are obviously not a problem for the compiler, it can chug away, remove unnecessary shit, and get you a binary. They're usually not a big problem for binary size (although they can still lead to some inflation). But they are a massive issue for being able to work on the codebase (long compilation times) or review the codebase (huge amounts of complexity, even when code isn't called, you need to rule out that it's not called).
And huge complex dependency graphs where you're doing something relatively trivial (and honestly file watching isn't re-implementing cat but it's not a web browser or an OS) should just be discouraged.
We both agree that you can get this done in under 50k lines. That's much easier to manage from an auditing point of view than 4M lines of code, even if 3.9M lines end up compiled out.
Indeed, so why waste it reinventing the wheel instead of using a high quality third party package?
Or arguing about highly suspect LoC numbers pulled out of thin air as if it’s not an apples-to-oranges comparison, for that matter.
After reading through the wiki about pi calculus and looking up the few languages that support it, I would be pretty shocked to find that a language that adds a pi calculus feature wouldn't need several blog posts explaining what it is and how to understand it.
Go's concurrency model is fundamentally lexical closures[1] and threading, with channels layered on top. Lexical closing is, after all, how channels are initially "passed" to a goroutine, and for better or worse it's not actually a common pattern to pass channels through channels. And but for Go hiding some of the lower-level facilities needed for thread scheduling, you could fully implement channels atop Go's lexical closures and threading.
I think the similarity to pi calculus is mostly coincidence, or perhaps convergent evolution. The choice not to make goroutines referenceable as objects, and the fact that channels can be communicated over channels, makes for a superficial similarity. But the former (the lack of threads as first-class objects) comes from the fact that, though the concurrency model is obviously threading, the Go designers didn't want people to focus on threads per se; it also conveniently sidesteps contentious issues like thread cancellation (though it made synchronous coroutines problematic to implement, as the GC has no way to know when a coroutine has been abandoned). And the ability to pass channels through channels is just consistent design: any object can be passed through a channel.
[1] Shared reference--non-copying, non-moving--closures. Though Go's motto is "share memory by communicating" as opposed to "communicate by sharing memory", Go comes to the former by way of the latter.
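The claim that channels can be layered on top of closures and threading can be sketched in Rust rather than Go (this thread's main language). `Chan` and `squares` are hypothetical names. This is an unbounded queue, not a faithful copy of Go's unbuffered rendezvous channels, and the closure captures an `Arc` by move rather than by shared reference as a Go closure would.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical minimal channel: a queue guarded by a mutex, plus a condvar
/// so that receivers can sleep until a value arrives.
pub struct Chan<T> {
    queue: Mutex<VecDeque<T>>,
    ready: Condvar,
}

impl<T> Chan<T> {
    pub fn new() -> Arc<Self> {
        Arc::new(Chan {
            queue: Mutex::new(VecDeque::new()),
            ready: Condvar::new(),
        })
    }

    /// Enqueue a value and wake one waiting receiver.
    pub fn send(&self, value: T) {
        self.queue.lock().unwrap().push_back(value);
        self.ready.notify_one();
    }

    /// Block until a value is available, then dequeue it.
    pub fn recv(&self) -> T {
        let mut queue = self.queue.lock().unwrap();
        loop {
            if let Some(value) = queue.pop_front() {
                return value;
            }
            queue = self.ready.wait(queue).unwrap();
        }
    }
}

/// The channel reaches the worker thread only through closure capture.
pub fn squares(n: u32) -> Vec<u32> {
    let chan = Chan::new();
    let tx = Arc::clone(&chan);
    let worker = thread::spawn(move || {
        for i in 1..=n {
            tx.send(i * i);
        }
    });
    let out: Vec<u32> = (0..n).map(|_| chan.recv()).collect();
    worker.join().unwrap();
    out
}
```

With a single producer the queue preserves send order, so `squares(4)` yields the squares of 1 through 4 in order.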
As noted in my other comments, I'm very conscious of this problem and tend to avoid excess dependencies when I can do the work myself with a bit of effort. I don't even use itertools (a popular convenience library that extends `Iterator`), because I normally want only a few of its methods (`Itertools::format` is one of the things I really miss) and I can write them without making other aspects worse. But I'm also in the minority: I tend to depend only on "big" dependencies that are not sensible to write myself, while others are much more liberal, and I do think that cargo-watch is already optimal in the number of such dependencies. More responsibilities and challenges remain for library authors, whose decisions directly contribute to the problem.
[1] I haven't actually checked the number of lines under this assumption, but I recall that it exceeds at least 100K lines of code, and probably much larger.
To respond to the OP: Go's concurrency model absolutely has multiple blog posts written about it explaining how it works. It's actually a little funny that OP thought Go was based on pi calculus when it was actually based on CSP. That goes to my original disagreement. Good features need explanation, and they don't become "bad" just because they require blog posts.
[1] https://go.dev/blog/codelab-share
[2] https://en.wikipedia.org/wiki/Actor_model_and_process_calcul...