Rust RAII is better than the Haskell bracket pattern (snoyman.com)
[1] https://arxiv.org/abs/1710.09756 [2] https://www.youtube.com/watch?v=t0mhvd3-60Y
The explicit purpose of Haskell is to be a basis for research into functional language design (edit: among other purposes). By "explicit purpose" I mean exactly that... people got together in 1987 to come up with a language for research. Haskell was never supposed to ossify into some kind of "finished product", it was built exactly so people could experiment with things like linear types. If you want to just write libraries and get stuff done with a more or less fixed language, you probably want to be writing OCaml.
I mean, just look at the list of GHC extensions... there are something like a hundred of them! The list is growing longer every year. https://downloads.haskell.org/~ghc/latest/docs/html/users_gu...
Linear types are the perfect example of a feature that belongs in the core language, or at the very least into a core intermediate language. They are expressive, in that you can encode a lot of high-level design ideas into linear types. You can compile a number of complicated front-end language features into a linearly typed intermediate language. Linear types have clear semantics and can even be used for improved code generation. If we ignore the messy question of how best to expose linear types in a high-level language then this is just an all around win-win situation...
Unless, of course, you're implying it's very haskellish to implement libraries with huge usability gotchas (of which ResourceT was one until the Ghosts of Departed Proofs paper reminded us we can reuse the machinery of ST), then I totally agree.
Therefore the RAII style wouldn't really work in Haskell. The current bracket approach is still better than RAII in Haskell.
That said, the ST-style trick of a phantom type variable is pretty well-known. Unfortunately not many people knew the same trick can be used for non-ST as well. I feel like as a community we should be encouraging this style more often.
UPDATE: I wrote the original comment with the incorrect assumption that drop functions will always be called in Rust. This is wrong. Please see child comments.
This is a classical liveness vs. safety dualism. "Something good will eventually happen" and "nothing bad will ever happen" are promises whose solutions are often in conflict with one another.
The general problem — to make transactional state changes and transactional control flow (i.e. expectations about these state changes) match up precisely — is not easy to solve in the general case, especially once you move on to things that are less trivial than simple resource acquisition/release matching.
Python's "with" clause, and the way it interacts with exceptions, is the only system I've seen that gets this right for the nested case.
The mechanical point of this article is pretty clear:
- it's possible to be unsafe in both Haskell and Rust when dealing with resource cleanup
- Rust does a bit of a better job in the general case, though it has its own warts (see the other comments; it's hard to deal with issues during `drop`-triggered cleanup)
I want to make a muddier meta point -- Rust is the best systems language to date (does anyone know a better one I can look at?).
- The person who wrote this article, Michael Snoyman[0], is mainly a Haskell developer; he's the lead developer behind arguably the most popular web framework, Yesod[1].
- Haskell developers generally have a higher standard for type systems, and spend a lot of time (whether they should or not) thinking about correctness due to the pro-activity of the compiler.
- These are the kind of people you want trying to use/enjoy your language, if only because they will create/disseminate patterns/insight that make programming safer and easier for everyone down the line -- research languages (Haskell is actually probably tied for the least "researchy" these days in the ML camp) are the Mercedes-Benzes of the programming world -- the safety features trickle down from there.
- Rust is not an ML family language -- it's a systems language
- People who write Haskell on a daily basis are finding their way to Rust, because it has a pretty great type system
When was the last time you saw a systems language with a type system so good that people who are into type systems were working with it? When was the last time you saw a systems language that scaled comfortably and gracefully from embedded systems to web services? When have you last seen a systems language with such a helpful, vibrant, excited community (TBH I don't think this can last), backed by an organization with values like Mozilla's?
You owe it to yourself to check it out. As far as I see it, Rust has two main problems:
- Learning curve for one of its main features (ownership/borrowing)
- Readability/Ergonomics (sigils, etc. can make Rust hard to read)
Admittedly, I never gave D[2] a proper shake, and I've heard it's good, but the safety and the emphasis on zero-cost abstractions that Rust offers make D a non-starter for me. Rust is smart so I can be dumb. C++ had its chance and it just has too much cruft for not enough upside -- there's so much struggle required to modernize, to make decisions that Rust has had from the beginning (because it's so new). It might be the more stable choice for an x-hundred-person corporate project today or next month, but I can't imagine a future where Rust isn't the premier backend/systems language for performance-critical programs (and even those that are not critical) in the next ~5 years.
I'll go even one step further and say that I think that how much Rust forces you to think about ownership/borrowing and how memory is shared around your application is important. Just as Haskell might force you to think about types more closely/methodically (and you're often better for it), Rust's brand of pain seems instructive.
[1]: https://www.yesodweb.com/
[2]: https://dlang.org/
The linked post is interesting, because I didn't realise "RAII is a much better way of managing resources than destructors" was controversial. It absolutely is, RAII is fast, predictable, and flexible. It's also one of the tradeoffs some languages make to achieve more flexibility in their design by enabling performant automatic garbage collection that doesn't require perfect escape analysis.
Which .NET is finally arriving to, thanks to Midori outcomes. And Java might eventually get there as well, depending on how Project Valhalla ends up.
As for languages like Haskell, a mix of bracket and linear types might be the way to go.
The linked article was comparing RAII with the bracket approach, not the destructor approach.
This isn't useless because memory allocation can happen during destruction/exit, e.g. to write some data to the filesystem.
Suppose you have a container with a billion objects. The container's destructor iterates over each object, doing some housekeeping that requires making a copy and then deleting the original before moving on to the next object.
That requires memory equivalent to one additional object because an original is destroyed following each copy. Stop deallocating memory during destruction/exit and the total memory required doubles, because you have all the copies but still all the originals.
There are also some helpful things that happen during deallocation. For example, glibc has double free detection, which strongly implies potential UAF but it's only detected if the second free() actually gets called.
However, this is different from the bracket pattern that the article is talking about. No one in the Haskell community advocates cleaning up resources (like file descriptors, etc.) using only destructors.
thank god they do this. how many times did I have to manually force linux to release sockets because badly coded C programs which opened sockets forgot to release them causing them to hang up for ~5 minutes after the process ended. With proper RAII classes this does not happen.
Surely what objects are meant to do is call the shutdown(2) syscall - or the shutdown(3) C library function - on the socket in their destructor or whatever to prevent that. But I don't think the same applies for memory: once the process is destroyed, the kernel should reclaim all memory in the process page tables automatically. Otherwise you'd end up with a pretty trivial way of disabling the system by exhausting all the memory...
There's also no guarantee that Rust/C++ destructors will be called. It's certainly less of an issue than depending on the GC being called, but if you need absolute correctness, then you shouldn't rely on destructors.
If you allocate an object on the heap with `new` then its destructor isn't called automatically unless you make it so through some other mechanism, but the GP comment clearly wasn't claiming that.
There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
> forget is not marked as unsafe, because Rust's safety guarantees do not include a guarantee that destructors will always run.
This was a problem in Rust when scoped threads relied on destructors being run.
I'm not sure it's possible to force any code to be run (e.g. a process can be terminated at any time) although a closure might offer slightly stronger guarantees in some situations.
Your point about this being difficult to solve in the general case is true, it's just worth pointing out Rust intends to do that hard thing anyway.
You can still call drop on it manually to release it earlier, though.
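To make the manual-release point concrete, here is a minimal sketch using a `Mutex` guard from the standard library. The `bump` function is a made-up name for illustration; the only real API involved is `std::mem::drop` (in the prelude as `drop`) and `Mutex::lock`/`try_lock`.

```rust
use std::sync::Mutex;

/// Hypothetical helper: increments the counter and releases the lock early.
fn bump(counter: &Mutex<u32>) -> u32 {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    let seen = *guard;
    // Release the lock now instead of waiting for the end of the scope.
    drop(guard);
    // The lock is free again, so taking it a second time succeeds.
    assert!(counter.try_lock().is_ok());
    seen
}

fn main() {
    let counter = Mutex::new(0);
    println!("counter is now {}", bump(&counter));
}
```

After `drop(guard)` the compiler also rejects any further use of `guard`, so an early release cannot turn into a use-after-release.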
No. Rust's ownership model solves it for trivial cases, at the cost of making it hard to do other things (such as sharing references past the lifetime of the owner without resorting to Rc<T> or Arc<T>, at which point you don't really have lifetime guarantees anymore).
The essential limitation of Rust is that (without resorting to Rc<T> and Arc<T>, which would put you back to square one) it is conceptually limited to the equivalent of reference counting with a maximum reference count of 1. In order to make this work, Rust needs move semantics (and the ability to prove that an alias has a lifetime that is a subset of the lifetime of the original object) and may even sometimes have to copy objects, because it can never actually increase the (purely fictitious) reference count after object creation.
This inherent limitation makes a lot of things hard (or at least hard to do without copying or explicit reference counting). Structural sharing in general, hash consing, persistent data structures, global and shared caches, cyclic data structures, and so forth.
In short, you have the problem with shared references less, because Rust makes it hard to share data in the first place, for better or worse. (Again, unless you resort to reference counting, and then you get the issue back in full force.)
Keeping a debug reference at end of transaction (via a shared-reference type, since a non-shared RAII reference type could never get into that state) isn't a coding error, it's a design error -- the developer intentionally requested contradictory things. That is solved by using weak references if you don't want a debug tool to force an object to stay alive.
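A rough sketch of the weak-reference approach, assuming single-threaded `Rc`/`Weak` from the standard library. `Connection` and `DebugProbe` are hypothetical types invented for this illustration.

```rust
use std::rc::{Rc, Weak};

/// Hypothetical resource owned by the application.
struct Connection {
    name: String,
}

/// Hypothetical debug tool that observes a connection without owning it.
struct DebugProbe {
    target: Weak<Connection>,
}

impl DebugProbe {
    fn describe(&self) -> String {
        // upgrade() only succeeds while a strong reference still exists.
        match self.target.upgrade() {
            Some(conn) => format!("{} is alive", conn.name),
            None => String::from("connection already released"),
        }
    }
}

fn main() {
    let conn = Rc::new(Connection { name: String::from("db") });
    let probe = DebugProbe { target: Rc::downgrade(&conn) };
    println!("{}", probe.describe());
    drop(conn); // last strong reference gone; the probe does not keep it alive
    println!("{}", probe.describe());
}
```

The probe has to handle the `None` case explicitly, which is the "contradictory request" made visible in the types.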
As a simple example, you may still want to access a resource after it has been released. Closing a network connection, for example, does not mean that accessing it is invalid. The connection may still have state (such as statistics collected or whether a non-blocking close was clean) that is perfectly legal to access after release (and in fact may only be consistent/observable afterwards).
The Eiffel FILE class [1], for example, allows you to call `is_closed` at any time (as well as the various `is_open` functions). This is necessary because `not is_closed` is evaluated as a precondition for many other operations.
A more complex example is a resource that is shared by many threads. Whether that resource is valid is often not a question of whether a reference is reachable, but a function of complex distributed state. Sometimes this can be solved by atomic reference counting, but even then atomic reference counting is expensive.
[1] https://archive.eiffel.com/products/base/classes/kernel/file...
That is unclear. Currently, `File::drop` ignores all errors and drops them on the floor ([unix], [windows]). This is a concern both long-standing and ongoing[0].
AFAIK discussion has gone no further than https://github.com/rust-lang-nursery/api-guidelines/issues/6...
[unix]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[windows]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[0] https://www.reddit.com/r/rust/comments/5o8zk7/using_stdfsfil...
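One way to avoid depending on what `File::drop` does with errors is to perform the fallible steps explicitly before the handle goes out of scope. This is only a sketch: `write_checked` is a made-up helper, and the final close that happens on drop can still fail without being reported.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical helper: writes `data` and reports failures instead of
/// leaving them to `Drop`.
fn write_checked(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    writer.write_all(data)?;
    // Flush the buffer ourselves: errors hit while dropping a BufWriter are ignored.
    writer.flush()?;
    // Ask the OS to persist the data and surface any error it reports.
    writer.get_ref().sync_all()?;
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("raii_write_checked_demo.txt");
    write_checked(&path, b"hello")?;
    println!("wrote {}", path.display());
    Ok(())
}
```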
Letting this slide for this long is a very bad sign. I’ve been a big Rust fan for my hobby projects, but the whole point of Rust is effortless correctness and safety. The more I encounter bugs and issues that have no near term solution planned, the more confidence I must admit I’m losing in their bug vs feature work prioritization scheme.
For example, it seems sometimes that Rust management would rather focus on cool new language enhancements / rewrite projects, than fix major bugs (sometimes even major borrow checker bugs, or random segfaults created in correct programs).
The destructor would instead be in charge of performing the rollback actions on an uncommitted transaction, if any. Rollback cannot fail, and indeed the system must preserve integrity even if it is not performed, as there is no guarantee that the process will not be killed externally.
Of course if you do not care about data integrity, swallowing errors in close is perfectly acceptable.
Edit: in general destructors should only be used to maintain the internal integrity of the process itself (freeing memory, closing fds, maintaining coherency of internal data structures), not of external data or the whole system. It is fine to do external cleanup (removing temporary files, clearing committed transaction logs, unsubscribing from remote sources, releasing system-wide locks, etc.), but it should always be understood to be a best-effort job.
A reliable system needs to be able to continue in all circumstances (replaying or rolling back transactions on restart, cleaning up leftover data, heartbeating and timing out on connections and subscriptions, using lock-free algorithms or robust locks for memory shared between processes, etc.).
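The commit/rollback split described above might look roughly like this in Rust. Everything here is hypothetical (`Store`, `Transaction`, the `rollbacks` counter, and the "empty item" validation rule); the in-memory store just stands in for a real database so the sketch stays runnable.

```rust
/// Hypothetical in-memory store standing in for a real database.
struct Store {
    committed: Vec<String>,
    rollbacks: u32,
}

struct Transaction<'a> {
    store: &'a mut Store,
    pending: Vec<String>,
}

impl<'a> Transaction<'a> {
    fn begin(store: &'a mut Store) -> Self {
        Transaction { store, pending: Vec::new() }
    }

    fn put(&mut self, item: &str) {
        self.pending.push(item.to_string());
    }

    /// The step that can fail is explicit and returns a Result.
    fn commit(mut self) -> Result<(), String> {
        if self.pending.iter().any(|item| item.is_empty()) {
            // `self` is dropped on return, which triggers the rollback below.
            return Err(String::from("empty item rejected"));
        }
        let items = std::mem::take(&mut self.pending);
        self.store.committed.extend(items);
        Ok(())
    }
}

impl Drop for Transaction<'_> {
    fn drop(&mut self) {
        // Best-effort rollback of anything uncommitted; this path cannot fail.
        if !self.pending.is_empty() {
            self.pending.clear();
            self.store.rollbacks += 1;
        }
    }
}

fn main() {
    let mut store = Store { committed: Vec::new(), rollbacks: 0 };
    {
        let mut tx = Transaction::begin(&mut store);
        tx.put("a");
        tx.commit().expect("commit failed");
    }
    {
        let mut tx = Transaction::begin(&mut store);
        tx.put("b");
        // No commit: dropping `tx` rolls back.
    }
    println!("committed={:?} rollbacks={}", store.committed, store.rollbacks);
}
```

Because `commit` takes `self` by value, a committed transaction cannot be used again, and the only thing left for `Drop` is the infallible rollback.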
Also I guess in Haskell there is more expectation that the type system should prevent you from expressing runtime errors.
I don't know if there's an elegant way to solve this. If Rust had exceptions you could use them, but then again in C++ it's often explicitly discouraged to throw in destructors because you could end up in a bad situation if you throw an exception while propagating another one. How does Python's "with" handle that?
Much as in C++, this is not really allowed: drop runs during panic unwinding, and a panic during a panic will hard-abort the entire program.
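A sketch of how a destructor can avoid the double-panic abort, assuming the default unwinding panic strategy. `Flusher` and the `SKIPPED` counter are invented for the example; `std::thread::panicking()` and `std::panic::catch_unwind` are the real standard-library calls. Panicking from `drop` at all is questionable style, so treat the `else` branch as illustration only.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts cleanups that were skipped because a panic was already in flight.
static SKIPPED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical resource whose cleanup can fail.
struct Flusher {
    cleanup_fails: bool,
}

impl Drop for Flusher {
    fn drop(&mut self) {
        if !self.cleanup_fails {
            return;
        }
        if std::thread::panicking() {
            // Already unwinding: a second panic here would abort the process,
            // so just record the failure.
            SKIPPED.fetch_add(1, Ordering::SeqCst);
        } else {
            panic!("cleanup failed");
        }
    }
}

fn main() {
    let result: Result<(), _> = panic::catch_unwind(|| {
        let _flusher = Flusher { cleanup_fails: true };
        panic!("first failure"); // unwinding now runs Flusher::drop
    });
    assert!(result.is_err());
    println!("skipped cleanups: {}", SKIPPED.load(Ordering::SeqCst));
}
```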
> I don't know if there's an elegant way to solve this.
I don't really think there is. Maybe opt-in linear types could be added. That would be at the cost of convenience (the compiler would require explicitly closing every file and handling the result, you could not just forget about it and expect it to be closed) but it would fix the issue and would slightly reduce the holding lifetime of resources.
Furthermore, for convenience we could imagine a wrapper pointer converting a linear structure into an affine one.
> How does Python's "with" handle that?
You'll get the exception from `__exit__` chaining to whatever exception triggered it (if any). Exceptions are the normal error-handling mechanism of Python so it's not out of place.
It could take a callback. Then for any given file handle, if you don't care that the write failed you can ignore it; if you care but can't sensibly respond, you can panic; if you can sensibly respond you can do it inline or schedule work to be done somewhere with a longer lifetime than the file handle.
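The callback idea could be sketched like this. `CloseGuard` is a hypothetical wrapper, not anything in the standard library, and the `close` closure stands in for whatever fallible release step the real handle would perform.

```rust
/// Hypothetical guard: runs `close` when dropped and passes any error to `on_error`.
struct CloseGuard<C, E>
where
    C: FnMut() -> Result<(), String>,
    E: FnMut(String),
{
    close: C,
    on_error: E,
}

impl<C, E> CloseGuard<C, E>
where
    C: FnMut() -> Result<(), String>,
    E: FnMut(String),
{
    fn new(close: C, on_error: E) -> Self {
        CloseGuard { close, on_error }
    }
}

impl<C, E> Drop for CloseGuard<C, E>
where
    C: FnMut() -> Result<(), String>,
    E: FnMut(String),
{
    fn drop(&mut self) {
        // The caller decided up front what a failed close should mean.
        if let Err(e) = (self.close)() {
            (self.on_error)(e);
        }
    }
}

fn main() {
    let mut errors = Vec::new();
    {
        // Stand-in for a close() that fails, e.g. a delayed write error.
        let _guard = CloseGuard::new(
            || Err(String::from("disk full")),
            |e| errors.push(e),
        );
    }
    println!("errors seen at close: {:?}", errors);
}
```

The callback can ignore the error, panic, or queue work somewhere longer-lived, which covers the three cases listed above.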
With RAII in C++ there's no visual difference between dumb data objects and objects like locks that are created and held on to mainly to cause implicit side effects.
In Rust this also prevents the compiler from dropping objects early - everything must be held until the end of its scope for the 0.1% of cases where you're RAII managing some externally visible resource. In those cases I would like the programmer to denote "The exact lifetime of this object is important", so the reader knows where to pay attention.
Additionally, part of Rust's core ideas is that the compiler has your back with this kind of thing, so there's less need for comments that say "CAUTION HERE BE DRAGONS." Those things can still be useful for understanding details of your code, of course, but they aren't needed to ensure that things are memory safe. That's what the compiler is for!
It is done properly in other languages as well, especially if they allow for trailing lambdas.
    fn leak() {
        // Create a 1KiB heap-allocated vector
        let b = Box::new(vec![0u8; 1024]);
        // Turn it into a raw pointer
        let _p = Box::into_raw(b);
        // Then leak the pointer: nothing ever calls Box::from_raw(_p),
        // so the allocation is never freed
    }
Obviously that's kind of blatant, but there are more subtle ways to leak memory. Memory leaks aren't considered unsafe, so even though they're undesirable the compiler doesn't guarantee you won't have any. Reference cycles when using Rc<T> are a big one, but generally it's pretty hard to cause leaks by accident. I've only run into one instance of leaking memory outside of unsafe code, and that was caused by a library issue.
Granted, the ownership/borrowing semantics of rust make this a lot harder, but anything that uses Rc/Arc can easily fall prey to it — you can use those to create a reference cycle.
If you mean unintentional leaks then that is a harder problem. Others have noted Arc and Rc leaks, but thread locals may (or may not) leak too[0].
[0]: https://doc.rust-lang.org/std/thread/struct.LocalKey.html#pl...
It has no such static checking because it was deemed to reduce expressiveness, while not impacting memory safety.
> Rust's safety guarantees do not include a guarantee that destructors will always run. For example, a program can create a reference cycle using Rc, or call process::exit to exit without running destructors. Thus, allowing mem::forget from safe code does not fundamentally change Rust's safety guarantees.
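Two of the cases named in that quote are easy to reproduce in safe code. In the sketch below, `Noisy`, `Node`, and `count_drops` are made-up names; the counter only exists so the number of destructor runs can be observed.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Bumps a shared counter when dropped so destructor runs are observable.
struct Noisy {
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Node that can point at another node, which allows a reference cycle.
struct Node {
    _payload: Noisy,
    next: RefCell<Option<Rc<Node>>>,
}

/// Creates four `Noisy` values and returns how many destructors actually ran.
fn count_drops() -> u32 {
    let drops = Rc::new(Cell::new(0));

    // 1. Normal scope exit: the destructor runs.
    {
        let _n = Noisy { drops: drops.clone() };
    }

    // 2. mem::forget is safe, and the destructor never runs.
    std::mem::forget(Noisy { drops: drops.clone() });

    // 3. Rc cycle a -> b -> a: the strong counts never reach zero.
    let a = Rc::new(Node {
        _payload: Noisy { drops: drops.clone() },
        next: RefCell::new(None),
    });
    let b = Rc::new(Node {
        _payload: Noisy { drops: drops.clone() },
        next: RefCell::new(Some(a.clone())),
    });
    *a.next.borrow_mut() = Some(b.clone());
    drop(a);
    drop(b);

    drops.get()
}

fn main() {
    println!("destructors run: {} of 4", count_drops());
}
```

Only the first value is ever dropped; the other three are leaked without any `unsafe`.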
Have a look at ATS[1]; it supports many features that are available in Rust, and lets you build proofs about your code's behaviour. It's quite type-annotation heavy though, IIRC, but it's very efficient.
[1] : http://www.ats-lang.org
I think it is a better venue, one which can help many applications simultaneously, while linear types won't.
I do not oppose inferring linear use at core and/or intermediate representation (GRIN allowed for that and more). I just do not see their utility at the high level, in the language that is visible to the user.
So library approach thrives.
It is quite possible you may need to have RAII somewhere in Haskell code and that's where things like parametrized monads are good: http://blog.sigfpe.com/2009/02/beyond-monads.html
It is a library and I keep saying that what is usually programming language feature is just a library in Haskell.
Constructors are orthogonal. The job of a constructor is to construct your object given that the space for the object is already allocated. This could be on the stack, where allocation means bumping the stack pointer, or in-place in preallocated storage (like std::vector), or the result of calling `operator new`. Simply using the `new` syntax does both as a shorthand.
Similarly the job of a destructor is to destruct your object without deallocating it. One can in-place destruct without deallocating, or destruct and then deallocate implicitly when the stack pointer is adjusted, or not at all. The `delete` syntax does both destruction and deallocation as a convenience.
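The same separation of construction/destruction from allocation can be shown in Rust, as a loose analogue of placement new and an explicit destructor call. This is a sketch under that analogy; `Tracked` and `reuse_slot` are hypothetical names, while `MaybeUninit::write` and `ptr::drop_in_place` are real standard-library APIs.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;

/// Counts its own destructor runs through a shared cell.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Constructs and destroys a value twice in the same stack slot.
fn reuse_slot(drops: &Cell<u32>) {
    // Storage only: no Tracked value exists yet.
    let mut slot: MaybeUninit<Tracked<'_>> = MaybeUninit::uninit();
    for _ in 0..2 {
        // "Construct" into existing storage.
        slot.write(Tracked { drops });
        // "Destruct" without releasing the storage.
        // SAFETY: the slot was initialised by the write just above and is
        // not read again before the next write.
        unsafe { ptr::drop_in_place(slot.as_mut_ptr()) };
    }
    // The slot itself goes away with the stack frame; MaybeUninit runs no destructor.
}

fn main() {
    let drops = Cell::new(0);
    reuse_slot(&drops);
    println!("destructor ran {} times for one allocation", drops.get());
}
```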
These other resources still need an in-memory representation to track and reference resources, so you can't really separate them.
Lifetimes really have no runtime effect at all. They only exist to prove things about the program at compile time. So all the types and function signatures get assembled together, and then a constraint solver gets run over the whole thing. As long as it returns "yes a solution exists", then no one really cares about the details of the solution. The benefit of non-lexical lifetimes is to weaken the constraints on the system, so that code that used to appear invalid now appears valid. But I believe it will have no effect on any existing code. (There's a Rust compiler reimplementation somewhere that doesn't even check lifetimes, since you can always use the standard compiler in testing.)
It depends on exactly what you mean by this.
In a nutshell, "non-lexical lifetimes" means "things go out of scope after their last use. Drop implies a use at the end of the current lexical scope."
Dropping Drop types earlier ("eager drop") was desired, but has significant problems, including "a large body of unsafe code exists in production which relies on knowing when Drop types go out of scope and changing this behavior may cause a ton of unsoundness in existing code."
Lifetimes are a language you use to help the compiler prove that all of your references will be valid. If it's unable to prove that, it will throw up an error. That doesn't prove that your references were wrong - it just says that they _might_ be wrong, and the compiler won't allow that possibility. Non-lexical lifetimes just provide the ability to prove more references valid and thus allow more code to compile - code that was already fine, but that the compiler couldn't figure out was fine.
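A small sketch of both halves of this: a borrow that ends at its last use, and a `Drop` type that is still dropped at the closing brace. `Guard` and `demo` are hypothetical names, and the log exists only to make the drop point visible.

```rust
use std::cell::RefCell;

/// Logs its name when dropped so the drop point is observable.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn demo() -> (usize, Vec<&'static str>) {
    let log = RefCell::new(Vec::new());
    let len;
    {
        let mut scores = vec![1, 2, 3];
        let first = &scores[0];
        let seen = *first; // last use of `first`: under NLL the borrow ends here
        scores.push(seen + 3); // accepted under NLL; the older lexical rules rejected it
        len = scores.len();

        let _guard = Guard { name: "guard dropped", log: &log };
        log.borrow_mut().push("end of block reached");
        // `_guard` has no further uses, yet it is dropped at the closing brace.
    }
    (len, log.into_inner())
}

fn main() {
    let (len, log) = demo();
    println!("len={} log={:?}", len, log);
}
```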
Right, I didn't really consider that a "drawback" because I'm in the camp that considers that panic! shouldn't unwind but actually abort the process here and there anyway. But you're right that if you rely on the default unwinding behavior panic!ing in destructors is a very bad idea.
You do rely on the default unwinding behavior anyway at least for `cargo test`: the test framework depends on being able to catch the unwinds from `assert_eq!` and similar.
Iirc Oleg Kiselyov implemented proper delimited continuations in ocaml as a library, without touching the runtime or compiler. Something similar has been done in Haskell.
I doubt fully dependent types can be implemented in Haskell without extra help from GHC. There has been lots of work in the area, and last time I checked you could simulate DT to some degree, but it never was as powerful as the dependent types in Idris. IIRC there were some edge cases where the typing became undecidable.
To clarify this, the library you're talking about implements most of the functionality in C, reusing the runtime's exception mechanism. So it doesn't require any upstream change to compiler or runtime, but it also can't be implemented in pure OCaml.
For Haskell it is however possible. There is a neat paper by among others Kent Dybvig.
You can easily simulate that using parametrized monad: http://blog.sigfpe.com/2009/02/beyond-monads.html
E.g., hClose will have type like (Has listIn h, listOut ~ Del h listIn) => h -> ParamMonad listIn listOut () and hGetLine will result in type much like this one: (Has list h) => h -> ParamMonad list list String
It is not perfect: you still may have a reference to the handle after its use and you may be tempted to introduce it somehow back and get a run-time error; you also would struggle juggling two handles for copying files, for example (for this you may have to use the ST parametrization trick).
But anyway, you really do not need linear types all that often (they are cumbersome - I tried to develop a language with linear types, and the type checking rules and types get unwieldy very soon) and when you do, you can have a library. Just like STM, parsers, web content generation/consumption, etc. Linear types do not warrant a language change.
Zeroing on exit would be more secure, but significantly slower -- you want to exit quickly, so you can potentially start a replacement program, which would be expected to, at least sometimes, take time to allocate the same amount of memory. If it does allocate the whole amount immediately, it's not necessarily any slower in total time between zeroing at exit or on mapping; but if there's enough time for the pages to get zeroed in the background, that reduces the amount of time waiting for the kernel to do things.
It seems the jury is out on the benefits from a performance perspective (DragonflyBSD took out background zeroing, saying they were unable to observe a performance difference, so simpler code is better).
Start with Joe Duffy blog posts about Midori architecture.
http://joeduffyblog.com/2015/11/03/blogging-about-midori/
Then hop on to his talks.
"Safe Systems Programming in C# and .NET"
https://www.infoq.com/presentations/csharp-systems-programmi...
"RustConf 2017 - Closing Keynote: Safe Systems Software and the Future of Computing by Joe Duffy"
https://www.youtube.com/watch?v=EVm938gMWl0
Then you can watch "Inside .NET Native" from Channel 9
https://channel9.msdn.com/Shows/Going+Deep/Inside-NET-Native
Finally there are the specs and related discussions that led up to the C# 7.3 design.
https://github.com/dotnet/corefxlab/tree/master/docs/specs
The TL;DR; version, basically async/await, the UWP AOT compiler, improved handling of value types, spans (aka slices), improved GC (TryStartNoGCRegion()) have their roots in System C# used in Midori.
Also there are some influences of Singularity, namely Bartok and MDIL, on the WP 8.x AOT compiler, but that is no longer relevant.
The issue is when you produce an API that contains objects with destructors. Since you are handing these entities off to unknown code, you cannot ensure that they will be dropped. This was a problem in scoped threads in Rust.
In which case in rust you cannot be sure that "the drop" will be called?
See the very excellent http://cglab.ca/~abeinges/blah/everyone-poops/
Not the parent, but it is trivial to write C++ and Rust examples in which destructors of variables with block scope are not called. The standard libraries of both languages even come with utilities to do this:
C++ structs:

    #include <iostream>
    #include <new>
    #include <type_traits>

    struct Foo {
        Foo() { std::cout << "Foo()" << std::endl; }
        ~Foo() { std::cout << "~Foo()" << std::endl; }
    };

    {
        // raw, suitably aligned storage; no Foo exists yet
        std::aligned_storage<sizeof(Foo), alignof(Foo)>::type foo;
        new (&foo) Foo;
        /* destructor never called even though a Foo
           lives in block scope and its storage is
           freed
        */
    }

C++ unions:

    union Bar {
        Foo foo;
        Bar() { new (&foo) Foo; }
        ~Bar() {}  // a union destructor does not destroy its members
    };

    {
        Bar bar;
        /* ~Foo() never called */
    }

Rust:

    struct Foo;

    impl Drop for Foo {
        fn drop(&mut self) {
            println!("drop!");
        }
    }

    {
        let _ = std::mem::ManuallyDrop::<Foo>::new(Foo);
        /* destructor never called */
    }

etc.
> There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
This is pretty much why it is impossible for a programming language to guarantee that destructors will be called.
Might seem trivial, but even when you have automatic storage, any of the things you mention can happen, such that destructors won't be reached.
In general, C++, Rust, etc. cannot guarantee that destructors will be called, because it is also trivial to make that impossible once you start using the heap (e.g. a `shared_ptr` cycle will never be freed).
well, the problem with non-RAII solutions is that you depend on the whims and talent of the programmer to call shutdown at some point. With a RAII solution like in C++ or Rust you know that if your socket opened successfully, a call to close will necessarily be issued.
yes, the class has to be written only once - and I have personally never had to write it except in that one group project in school, since I use libraries that handle it - e.g. boost.asio or Qt Network.
If you are in C, even if you use an abstraction layer, you have to remember to call a _free / _destroy-like function every time you write some code that uses sockets.
My preferred semantics would have been early drops by default, and a must_drop annotation similar to must_use, to say that objects like RwLockReadGuard should be explicitly dropped or moved.
I think the reason for this might be that, in Haskell, a function starting with 'with' is, by convention, using the bracket pattern, and the way that you might use such a function would be very similar in structure to the Python way.
Something that is often said about C++ is that, you're only ever using 10% of the language, but that everyone uses a different 10% and it's true, but it's true of every language to differing degrees. Everyone has their own way of forming programs, just like everyone has their own slightly different style of playing chess, cooking or forming sentences.
When you have a well developed style, you will quickly spot any deviations from it. At that point, it doesn't matter if your style was forced on you by the language or whether it's just a convention that you use.
It's certainly true that Haskellers expect a lot from the type system, even compared to other static languages, let alone Python.
I'm not very familiar with Haskell but it seems like you'd get used to the type system telling you everything you need to know. But in this case it doesn't. In Python world we talk about 'pythonic/unpythonic'... it seems like it's maybe quite unhaskellish to have to rely on a naming convention and remembering not to use the return value of the function?
I would guess that's why the article and many of the other comments here focused on how you could express this behaviour in Haskell's type system, where you'd expect it.
In short: type system > syntax sugar > naming convention
Haskell is more similar than you realise; it's the difference between this:

    withSomeResource $ \resource -> do
      someFunctionOn resource

and this:

    with some_resource() as resource:
        some_function_on(resource)

> I'm not very familiar with Haskell but it seems like you'd get used to the type system telling you everything you need to know
As an outsider, you might expect a type-error to mean that you made a logic error; in practice it usually means you made a typo.
What happens is that the type system forces you to write things in a certain way. You internalize its rules and it moulds your style. You don't try random things until they stick, you write code expecting it to work and knowing why it should, just like you would in Python. It's just that more of your reasoning is being verified. "Verified" is the operative word here - the type system doesn't tell how to do anything.
> it seems like it's maybe quite unhaskellish to have to rely on a naming convention and remembering not to use the return value of the function?
The Python equivalent of the problem here would be:

    current_resource = a_resource
    with some_resource() as resource:
        current_resource = resource
    current_resource.some_method()

So it's not that using the return value of the withSomeResource function is a problem, it's the resource escaping from the scope where it is valid.
I think the crux of our discussion is about checked vs unchecked constraints.
When you work on (successful) large codebases, whether in a static or dynamically typed language, there are always rules about style (and I mean this in a broader way than how your code is laid out). For example, in large Python projects, there might be rules about when it is acceptable to monkey-patch. These rules make reasoning about the behaviour of these programs possible without having to read through everything.
Large Haskell projects also have these rules, but Haskellers like to enforce at least some of them using the type system. It takes effort to encode these rules in the type system and it is more difficult to write code that demonstrably follows the rules than implicitly follows them, but the reward for this effort is that it gives you some assurance that the rules are actually being followed everywhere.
For some rules this extra effort makes sense, and for others it doesn't. The type system is just another way to communicate intent. Writing the best Haskell doesn't necessarily mean writing the most straitjacketed Haskell, with everything typed as tightly as possible, but it does give you that option. Beginners often fall into the trap of wanting to try out the new-and-shiny and making everything stricter than is helpful.
For one-man projects, there's really no advantage to Haskell over Python (with the caveat that you may not remember all of the intricacies of your code in six months and using Haskell you may have encoded more of your assumptions in the type system).
with some_resource() as resource:
    some_function_on(resource)
Is that broken? If some_function_on saves the resource, yes. If it just temporarily uses it, no.
I don't think the claim that it's syntactically obvious in Python is correct. In both cases the typical syntax helps a little but it's easy to get wrong.
It is the case that "the typical syntax" is a little more enforced by Python-the-language.
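A small sketch of why the call site alone doesn't tell you. The function names and the `saved` list are hypothetical; `io.StringIO` is used only because it is a standard object that refuses writes once closed.

```python
import io

saved = []  # hypothetical long-lived state, e.g. a cache or registry


def uses_temporarily(resource):
    # Only touches the resource during the call; nothing outlives the block.
    resource.write("hello")
    return resource.getvalue()


def saves_for_later(resource):
    # Looks identical at the call site, but keeps a reference around.
    saved.append(resource)


def run():
    with io.StringIO() as resource:
        value = uses_temporarily(resource)  # fine
        saves_for_later(resource)           # also "works" at this point...
    # ...but the saved reference now points at a closed buffer.
    try:
        saved[-1].write("again")
    except ValueError as exc:
        return value, str(exc)
    return value, None
```

Both calls inside the with block look the same. Whether the code is broken depends entirely on what the callee does with its argument.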
I think Haskell does have a good model for bringing together practical application of theoretical research.
Parent's comment is spreading the myth that Haskell is an academic language. It's not wrong but it's not Haskell's only stated purpose or utility by far.
If it sounds like I'm saying that Haskell is not useful for boring, line-of-business programs then I wasn't clear... Haskell is a research language, yes, but not exclusively so. But I'm confused why it's objectionable to spread a "myth" if that myth is, in your words, "not wrong". The stated purpose of Haskell, when it was created, is a matter of historical record.
> It should be suitable for teaching, research, and applications, including building large systems.
This, to me, means that we are not going to freeze the language, and sacrifice research, in order to support business applications. That would go against the goals of the language.
Doing everything as a library seems "un-Haskellish" to me because there's an ongoing and vibrant community that's doing research into things like type theory, which can't be done as libraries, and kicking that group of people off the Haskell platform just to support business applications would be a failure of Haskell as a language.
Haskell can support both groups.
I think your post was unclear and supported that myth. After reading your reply I understand better what you meant!
I agree -- extensions do seem to be working rather well. I hope the new Haskell standard, Haskell2020, will include some of them into the language proper!
I'm looking forward to seeing how linear types work/interact with the rest of the language.
Compare language features and Haskell's approach: Erlang and distributed-process, goroutines and channels and Control.Concurrent(.Chan), (D)STM is a library, Control.Applicative and Control.Monad for many things hardly expressible in any other language, etc, etc.
Linear types, I am afraid, would go the way implicit parameters went - their use is cumbersome and they really do not help much with day-to-day programming and when they are needed they can be perfectly implemented with a library.
Please edit swipes like that out of your comments here. The rest is fine and stands on its own.
But Rc would not work if the drop was not guaranteed to be called.
Rust has never been about proving correctness. Yes, correctness is a goal, but it is subservient to other goals, depending on details.
Furthermore, it's not clear that this can really be implemented in a reasonable way, see https://news.ycombinator.com/item?id=18175838
> it seems sometimes that Rust management would rather focus on cool new language enhancements
In this comment, you're complaining that we haven't implemented a "cool new language enhancement." This is at odds with your desire stated here.
Perhaps I misread this documentation; if so, I don’t think I’d be alone here. I don’t see any particular mention that dropping an fs::File could lead to data loss, and I had generally assumed major edge cases like ‘data loss from a file system library’ would be documented.
Which is why RAII in a language without exceptions is inappropriate for a resource which has a status on closeout.
I'm quite comfortable stating that non-lexical lifetimes and async I/O, for instance, are far more important. The number of users who benefit from those two features is multiple orders of magnitude greater than the number of users who care about whether the official opinion of the guidelines subteam is that close() should take &self or &mut self. The Rust team would be doing a disservice to users if it focused on small issues like that (this isn't even a bug we're talking about, it's guidance around conventions!) instead of the biggest complaints that come up constantly.
I may call it a bug, and you may call it undocumented silent data loss behavior; either way, we’re talking about the same thing. Silent data loss from undocumented behavior is not good, wouldn’t you agree?
I certainly was not aware that drops in Rust could throw away potentially serious error codes. Now I have to go re-audit the correctness of all my Rust code that uses the file system (at the very least).
If the behavior was documented I would not consider this a bug. That said, perhaps I’m missing something in the documentation (my apologies if that’s the case), but I did just re-read the fs::File docs and see no mentions or precautions about potential data loss when a File is dropped.
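The thread is about Rust's fs::File, but the shape of the problem can be sketched in Python, the language already used for examples above. Everything here is hypothetical: `FlakySink` stands in for a writer whose final flush fails, and `DropStyle` stands in for scope-based cleanup that has nowhere to report that failure. This is an analogy, not a description of what Rust's standard library does internally.

```python
class FlakySink:
    """Hypothetical writer whose buffered data fails to flush on close."""

    def __init__(self):
        self.buffer = []
        self.closed = False

    def write(self, data):
        self.buffer.append(data)  # buffered; nothing has been persisted yet

    def close(self):
        if self.closed:
            return
        self.closed = True
        # Simulate the OS reporting an error on the final flush/close.
        raise OSError("disk full while flushing %d chunk(s)" % len(self.buffer))


class DropStyle:
    """Scope-based cleanup that has nowhere to report a close error."""

    def __init__(self, sink):
        self.sink = sink

    def __enter__(self):
        return self.sink

    def __exit__(self, exc_type, exc, tb):
        try:
            self.sink.close()
        except OSError:
            pass  # the error is discarded, like a destructor that cannot fail
        return False


def write_with_implicit_close():
    with DropStyle(FlakySink()) as sink:
        sink.write("important data")
    return "looks like success"


def write_with_explicit_close():
    sink = FlakySink()
    sink.write("important data")
    try:
        sink.close()  # the caller sees the failure and can react
    except OSError as err:
        return "failed: %s" % err
    return "success"
```

The implicit version reports success even though nothing was persisted. In Rust the analogous explicit step would be calling something like File::sync_all (or flush on a buffered writer) and checking the returned result before the value goes out of scope.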
This is a thing people say, but I think it's misleading. Reference counting can increase the lifetime of an object, but borrowing cannot. I've seen this really trip up beginners.
> This inherent limitation makes a lot of things hard
It can make them different, which can be hard, but these things are already hard. And some people think it can make them easier or even better; see Bodil Stokke's work on persistent data structures in Rust.
I'm not sure I follow.
The only reference-counted language I've used is (pre-ARC) Objective-C. There, it was a very common idiom to "borrow" objects - so common that it didn't even have a name. There were just objects you "retained" (that is, staked a claim on), and ones you didn't.
Maybe there's a pitfall to how the "automatic" part of automatic reference counting is typically implemented?
* You have an object. You call retain on it. You have a count of one.
* You also have a pointer to that object. The "borrow" in your analogy.
* You return this pointer, and stash it somewhere. The object still has a count of one, so it's still live, so this is okay.
* Later in your program, you use that pointer to call release.
Here, we've only ever had a reference count of one, but our object has lived across arbitrary inner scopes. In Rust, this would not work, unless you dropped into unsafe.
Obviously, with Arc and autoretain this kind of code doesn't get written anymore, I would hope. And even without, it wouldn't be guaranteed, so you'd want the "borrow" to actually bump the refcount. But Rust is about guaranteeing that it can't.
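The retain/stash/release steps above can be simulated with an explicit counter. This is only a sketch: `Counted`, `stash`, and the two functions are hypothetical names, and Python's own memory management is ignored in favour of a hand-written `count` field so the bookkeeping is visible.

```python
class Counted:
    """Hypothetical manually reference-counted object (pre-ARC style)."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.live = False

    def retain(self):
        self.count += 1
        self.live = True
        return self

    def release(self):
        assert self.live, "release on a dead object"
        self.count -= 1
        if self.count == 0:
            self.live = False  # stands in for deallocation


stash = {}


def inner_scope():
    obj = Counted("buffer").retain()  # count is 1
    stash["borrowed"] = obj           # plain pointer copy, count stays 1
    return obj.count


def much_later():
    obj = stash.pop("borrowed")
    was_live = obj.live               # still live: nobody released it
    obj.release()                     # count drops to 0 here
    return was_live, obj.live, obj.count
```

The count never exceeds one, yet the object outlives the scope that created it, because the un-counted pointer is what ends up controlling the lifetime.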
"With a maximum reference count of 1." As the reference count becomes 1 upon object creation, it cannot really be increased further. Hence, only operations that keep the (virtual) reference count at 1 or reduce it to 0 are allowed.
My point here is that you inherently cannot do things where you cannot prove that this virtual reference count can be capped at 1.
It also only refers to ownership, not borrowing, and both are equally important.
Beyond that, what I'm saying is something more meta: It doesn't really matter if this analogy is spot-on or not; it's got enough wiggle room in it that I've seen it trip up beginners. Maybe that's because they misunderstand the analogy, but given that its point is to convey understanding, that means that it isn't a great analogy, in my experience. YMMV.
1: directly, of course; this also depends on the language.
Your analysis of the trade offs is fine, but you claim that Rust only solves this problem for "trivial" cases. If that's true, then most of the Rust code I've written is trivial. To me, that pretty thoroughly weakens your dismissal here, at least in my case.
... yes, I know. And is presumably what you referred to as "trivial." But this in fact comprises the vast majority of Rust code I've written. So you can call it trivial if you want, but as I said, it significantly reduces the weight of your dismissal.
There's plenty of Rust code I've written that makes use of Arc/Rc, specifically for cases you've called out (global caches, structural sharing, etc.) but it's nowhere near ubiquitous. So what I'm trying to say is that your representation of the problems that Rust solves is at best misleading, as supported by my experience writing a not insignificant amount of Rust.
So in other words, sure, you can call most of my code "trivial," but on the other hand, I can say that the problems posed by you in your top-level comment are actually solved in most of my code, regardless of whether you think it's trivial or not.
So, it sounds to me like it's not necessarily that Rust's model is fundamentally different from "ref counting with a limit of 1", at least in terms of how you should be managing your memory, so much as that the language doesn't let you do some things that you really shouldn't be doing in the first place.
Sometimes it felt like Objective C wouldn't just let you point a gun at your foot, it would actively cheer you on while you did it.
Throwing exceptions isn't a particularly good solution either, for the same reason. Exceptions are hard to reliably handle when you can't easily reason about where they will be thrown from.
There are plenty of cases where you'd prefer an application to crash upon an unhandled write error, rather than silently losing data that could be highly important and irrecoverable.
(Of course, actually handling the errors is preferred above both.)
> Destructors aren't supposed to fail.
But isn't the whole point of this discussion that the destructor of fs::File (and probably any other buffered IO writer) can and does fail in some cases?
And the choices to handle such failed destructors are: blockingly retry until the problem goes away or just plain ignore it. Either way you can't rely on destructors for persistence/durability in case of a crash or power loss.
Not in the programs I write. I grant that perhaps this should be configurable.
I think it could take a callback, so it could note the failure somewhere if I care about it.
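One way the callback idea could look, sketched in Python rather than Rust. `closing_with_callback` and `FailingFile` are hypothetical names invented for this illustration, not an existing API in either language.

```python
from contextlib import contextmanager


class FailingFile:
    """Hypothetical file whose close always fails."""

    def close(self):
        raise OSError("close failed")


@contextmanager
def closing_with_callback(resource, on_close_error):
    try:
        yield resource
    finally:
        try:
            resource.close()
        except OSError as err:
            # Hand the failure to the caller's hook instead of dropping it.
            on_close_error(err)


def demo():
    failures = []
    with closing_with_callback(FailingFile(), failures.append):
        pass  # normal use of the resource would go here
    return [str(err) for err in failures]
```

Here `demo()` returns `["close failed"]`: the cleanup still happens automatically at scope exit, but the error is recorded somewhere the program can inspect.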
I addressed borrowing above. Borrowing is proving lifetime subset properties and that you therefore can avoid increasing the virtual reference count.
And this is not about whether this is useful for beginners. It is to illustrate inherent limitations of the approach.
Right, so what I'm saying is, the description of borrowing doesn't really fit in with the reference counting aspect of the analogy, so it ends up being separate from it.
> And this is not about whether this is useful for beginners.
Right, that was my point. :)
I mean, in the end, do what you'd like. All I'm saying is that I've seen this analogy lead to tons of confusion. YMMV.