Slightly different perspective from Grokking Simplicity[1]: functional programming is not about avoiding mutation because "mutation is bad." In fact, mutation is usually the desired result, but care must be taken because mutation depends on when and how many times it's called.
So good FP isn't about avoiding impure functions; instead it's about giving extra care to them. After all, the purpose of all software is to cause some type of mutation/effect (flip pixels on a screen, save bits to storage, send email, etc). Impure functions like these depend on the time they are called, so they are the most difficult to get right.
So Grokking Simplicity would probably say this:
1. Avoid pre-mature optimization. The overhead from FP is usually not significant, given the speed of today's computers. Also performance gains unlocked by FP may counter any performance losses.
2. If optimization via mutation is required, push it as far outside and as late as possible, keeping the "core" functionally pure and immutable.
This is similar to Functional Core, Imperative Shell[2]; and perhaps similar to options 1 or 2 from the article.
(Of course, if you really need impure functions for eg performance, `unsafePerformIO` has you covered in Haskell.)
People should try to stop thinking of mutation as something to be avoided, and start thinking of it as something to be managed.
Mutating state is good. That's usually the whole point.
What's bad is when you create "accidental state" that goes unmanaged or that requires an infeasible effort to maintain. What you want is a source of truth for any given bit of mutable state, plus a mechanism to update anything that depends on that state when it is changed.
The second part is where functional programming shines -- you have a simple way to just recompute all the derived things. And since it's presumably the same way the derived things were computed in the first place, you don't have to worry about its logic getting out-of-sync.
However, nobody had any experience with it. Now we do. And I think what that experience generally says is that it's a bit overkill. We can do better. Like Rust. Or possibly linear types, though that is I think much more speculative right now. Or other choices. I like mutable islands in a generally immutable/unshared global space as a design point myself, as mutability's main problem is that it gets exponentially more complicated to deal with as the domain of mutability grows, but if you confine mutability into lots of little domains that don't cross (ideally enforced by compiler, but not necessarily) it really isn't that scary.
It was a necessary step in the evolution of programming ideas, but it's an awful lot to ask that it be The One True Idea for all time, in all places, and that nobody in the intervening decades could come up with anything that was in any way an improvement in any problem space.
> However, nobody had any experience with it. Now we do.
Working intimately with C++ in the '90s, immutability as a concept was neither considered counterculture nor without significant experience employing it. At that time and in C++, it was commonly known as "const correctness" and was a key code review topic.
Go back another decade or two when K&R C ruled the land and that's a different story ;-).
Or OCaml perhaps ? They have taken a very pragmatic approach to fp and allow mutations because thats the pragmatic choice sometimes.
That's pretty easy to do in eg Haskell or even Rust.
I feel that early texts were good at this. Turtle Geometry is my personal favorite book in this vein. I seem to recall we spent a long time going over how to double buffer graphics so that you could be working on one buffer while letting the system draw the other. Not sure what texts we used for that, back in the day.
Later texts, though, go through a lot of hurdles to hide the fact that things are actively changing. The entire point is to change things.
We've other 'entire points' along the way.
Allocation and freeing of memory are fundamental to computing. We don't do malloc and free any more.
Control-flow (selecting which instruction to follow next) is also fundamental. We don't to goto anymore.
Thinking this further, this is also performance related. I think there is an interesting relationship between FP and DOD:
A technique of data oriented design is to keep state minimal and lazily derive data when you actually need it, which may involve recomputing things. The rationale is that compressed, normalized data requires less fetching from memory and computation on it is faster.
In contrast caching and buffering, both of which are heavily stateful and require a lot of additional memory, are often necessary, because they minimize inherently slow operations that are out of your control. Those kinds of things are often best implemented as (computational) objects with encapsulated, internal state and small, general interfaces, like OO has taught us.
But once the data in your control, this mindset has to be flipped on its head. You want to model your in-memory data not that differently from how you'd model for databases: Neatly aligned, normalized data, with computed columns, views and queries to get richer answers.
Interestingly if you follow this approach, then code starts to look more similar to functional code, because you potentially need the whole context to derive values from it and a lot less like independent objects that send messages to each other.
That should read, “plus one mechanism”. Another place where people get into trouble is thinking “a” means >= 1 instead of exactly one.
When the state has multiple entities trying and failing to manage it consistently is when things get bad. Functional tends to make for more friction which discourages doing a lot of state, which to some extent controls the superlinear complexity of state management by making each piece dearer.
Yup. Rust can't abstract over mutability. For owned values, it defaults to exclusive ownership, and needs explicit Rc<T> and clone() to share them. For references, in practice it requires making separate `foo()` and `foo_mut()` functions for each type of the loan.
Even though this sounds horribly clunky, it works okay in practice. Ability to temporarily borrow exclusively owned objects as either shared or exclusive-mutable adds enough flexibility.
Rust is known for being difficult, but I think a lot of that is due to lack of GC. Rust can't make any reference live longer, and programmers have to manually get scopes of loans right, or manually use Rc<RefCell> to have a mini DIY GC. Perhaps a shared XOR mutable with a GC could be the best of both?
> For example, let's say you're iterating over some structure and collecting your results in a sequence. The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use.
Well, no, this is straight confusion between what’s expressed by the program and what’s compiled. The idiomatic code in Ocaml will end up generating machine code which is as performant than using mutable array.
The fact that most programming languages don’t give enough semantic information for their compiler to do a good job doesn’t mean it necessary has to be so. Functional programmers just trust that their compiler will properly optimize their code.
It gets fairly obvious when you realise that most Ocaml developers switch to using array when they want to benefit from unboxed floats.
The whole article is secretly about Haskell and fails to come to the obvious conclusion: Haskell choice of segregating mutations in special types and use monads was an interesting and fruitful research topic but ultimately proved to be a terrible choice when it comes to language design (my opinion obviously not some absolute truth but I think the many fairly convoluted tricks haskellers pull to somehow reintroduce mutations support it). The solution is simple: stop using Haskell.
* https://hackage.haskell.org/package/bluefin-0.0.2.0/docs/Blu...
* https://hackage.haskell.org/package/bluefin-0.0.6.0/docs/Blu...
The downside in Tcl is that if you refactor some code suddenly you can add a new reference and drop into accidentally quadratic territory because now everything is being copied. This leads to some cute/ugly hacks that are only explainable in terms of interpreter implementation details, purely to reduce a refcount in the right spot.
some_func $value[set value “”]
where the [set value “”] bit reduces the refcount. There’s also a fairly widespread idiom of using the K combinator for this[1]: some_func [K $value [set value “”]]
It’s one of those things that has become second-nature to people in-the-know, but is a total headscratcher otherwise.[1]: https://wiki.tcl-lang.org/page/K#c2a6014c2d129837889d8a8000d...
I think Roc is doing this.
Granule has uniqueness types like Clean built onto a linear type system, which offers some additional advantages.
> The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use. But if you asked an OCaml programmer, they would almost certainly use a linked list instead.
I agree with this sentiment. However, OCaml does have mutable arrays that are both efficient and convenient to use. Why would a programmer prefer a list over them? In my opinion, the main benefit of lists in this context is that they allow pattern matching and inductive reasoning. To make functional programming languages more suited for array programming, we would thus need something like View Patterns for arrays.
A related issue is that mutation can actually be slower than fresh allocations in OCaml. The reason for this is that the garbage collector is optimized for immutable datastructures and has both a very fast minor heap that makes allocations cheap and expensive tracking for references that do not go from younger to older elements. See: https://dev.realworldocaml.org/garbage-collector.html#scroll...
> Unfortunately, this makes it impossible to use any standard functions like map on linear values and either makes linearity nearly useless or inevitably creates a parallel, incomplete universe of functions that also work on linear values.
You can implement polymorphism over linearity: this is done in Frank Pfenning's SNAX language and planned for the uniqueness types in a branch of OCaml.
> This might sound a little dangerous since accidentally holding on to a reference could turn a linear time algorithm quadratic
No, the in-place reuse optimization does not affect the asymptotic time complexity. But it can indeed change the performance drastically if a value is no longer shared since copies are needed then.
> A tracing garbage collector just doesn't give you this sort of information.
It is possible to add One-bit Reference Counts to a garbage collector, see https://gitlab.haskell.org/ghc/ghc/-/issues/23943
> for now even these struggle to keep up with tracing garbage collectors even when factoring in automatic reuse analysis.
I investigated the linked benchmarks for a while. The gap between Koka and Haskell is smaller than described in that initial comment, but a tuned GHC is indeed a bit faster than Koka on that benchmark.
> And if you only need mutation locally inside a function, using ST makes your code fundamentally more imperative in a way that really forces you to change your programming style. This isn't great either and doesn't exactly help with readability, so the mental overhead is rarely worth it.
What?? You are explicitly opting into writing mutation code. Of course that's going to change your programming code. It is expected to be different. Readability is increased because it clearly delineates a different coding style. And it's not even that different from other monadic code, even when you compare to non-mutating monadic code.
Terrible article.
Discus language: http://discus-lang.org/
Thesis: https://benl.ouroborus.net/papers/2010-impure/lippmeier-impu...
The thesis is the more interesting of those two links IMHO. The intro is chapter 1 that starts at page 17 of the pdf. It has one of the better critiques of Haskell that I've seen, and explains why uncontrolled mutation is not the answer. Reference types ala ML aren't the answer either, in his view.
> For example, let's say you're iterating over some structure and collecting your results in a sequence. The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use.
> But if you asked an OCaml programmer, they would almost certainly use a linked list instead.
What? One of OCaml’s most notable features as a functional programming language is how it was designed to support mutation (see “the value restriction”, e.g. https://stackoverflow.com/questions/22507448/the-value-restr...) In my own OCaml programs I used mutation whenever appropriate (my only complaint would be that I wish there were a little more syntactic sugar around e.g. hash table access).
I wanted to like this post but it seems like low-effort clickbait.
Especially if the approach isn't really Copy on Write, but Copy only when someone might want to use the old value. Default to trying to mutate in place, if you can prove that is safe.
For most locals, that should be rather doable, and it would be a pretty big gain. For function parameters it probably gets hairy though.
Because it doesn’t do copy-on-read, you have to know whether there are references other than yours that can read the data. A single bit “at some time there were at least two references to it” doesn’t suffice, as it would mean you can’t detect when the last reference goes away, so it would leak memory (lots of it)
> A lot of the same benefit (but not all) should be available based on static analysis right?
That’s an (very important) implementation detail that makes reference counting perform reasonably well. You don’t want increase-decrease cycles in tight loops, for example.
What does "always copy" mean? What and why would you copy?
The author mentions linear types. This is a bit of a pet peeve of mine because, while very useful, linear types are not the concept that many people think they are and they are not implemented in rust (and neither are affine types). What rust implements is referred to as a "uniqueness type".
The difference has to do with how they deal with what linear-types-people call "exponentials". A linear type, as the article mentions, is the type of a value must be "consumed" exactly once (where consuming means passing it into a function, returning it, or sometimes destructuring it). Of course, in this mad world of ours we sometimes need to consume a value more than once, and indeed a language with only linear types would not be turing-complete. This escape hatch is called an "exponential", I guess because exponential is kind of like the opposite of linear. A value with an exponential type can be used as often as you want. It is essentially most types in most programming languages.
IF a function expects a value with a linear type, can you pass an a value with an exponential type to it? The answer is that you can. Try this in linear haskell if you don't believe me. A function taking a value with a linear type just says "I consume this value exactly once, but you can pass me whatever you want". The restriction is that values with linear types can only be passed to functions that expect linear types. A value with a linear type must be consumed exactly once, so you certainly can't pass it to a function that expects a value with an exponential type, because it might use that value twice. In other words, a linear type is a restriction on the callee.
Those familiar with rust will notice that this is not how rust works. If a function takes T, and you have &T, you just cannot call that function. (Ignore Clone for now.) However, in the world of linear types, this would be allowed. This makes linear types not useful for the article's purposes, although they are still very useful for other things.
What rust wants is a constraint not provided by linear types. Where linear types are a restriction on the callee, rust wants to be able to restrict the caller. It wants to be able to restrict the caller in such a way that it can know that there are no other references to a variable that's been passed into a function. People call this a "uniqueness type" because you can say "I want this type to be 'unique'" (where 'unique' means not-aliased). Honestly this name doesn't make a lot of sense, but it makes a little more sense when you think of it in terms of references. If a reference is unique, then it means that no other reference that points to the same object (which is the requirement rust imposes on mutable references). So while a linear type allows you to pass a non-linear variable to a function that expects a linear one, rust doesn't allow you to pass a non-unique variable to a function that expects a unique one.
And adding this requirement to mutations resolves 90% of the issues that make mutability annoying. Mutability becomes challenging to understand when:
1. You have multiple references pointing to the same data in memory. 2. You change the data using one of these references. 3. As a result, the data appears to have changed when accessed through any of the other references.
This simply cannot happen when mutability requires values to not be aliased.
I blame Haskell and the sudden fixation on absolute purity that manifested just as the VC-startup set decided it was the next secret sauce, to the point of literally redefining what "functional programming" meant in the first place.
I think that fixation has forced a lot of focus on solving for "how do we make Haskell do real work" instead of "how do we make programming in general more predictable and functional", and so the latter fight got lost to "well we bolted lambdas into Java somehow, so it's functional now".
Bullseye.
EDIT: Found this: https://github.com/HigherOrderCO/HVM1/blob/master/guide/HOW....
Let's use the fibonacci sequence as an example. Let's write it the classic, elegant way: f(n) = f(n-1) + f(n-2). Gorgeous. It's the sum of the two previous. With the caveat that f(n=0|1) = n. In Python:
# fib for basic b's
def fib(n):
## Base case
if n == 0 or n == 1:
return n
return fib(n-1) + fib(n-2)
Right off the bat, performance is O(n)=n*2. Every call to f(n-1) will also need to compute f(n-2) anyways! It's a mess. But since Python passes arrays and dictionaries as pointers (cough, sorry! I meant to say references) it's super easy to memoize: # optimize-pilled memoize-chad version
def fib(n, saved={}):
if n in saved:
return saved[n]
if n == 0 or n == 1:
saved[n] = n
else:
saved[n] = fib(n-1) + fib(n-2)
return saved[n]
Okay, now our version is nearly as fast as the iterative approach.This is the normal pattern in most languages, memoizing otherwise "pure" functions is easy because you can reference a shared object using references, right? Even with multithreading, we're fine, since we have shared memory.
Okay, but in Hoon, there are no pointers! Well, there kinda are. The operating system lets you update the "subject" of your Urbit (the context in which your programs run), and you can do this via the filesystem (Clay) or daemons (Gall agents, which have their own state kind of).
But to do this within a simple function, not relying on fancy OS features? It's totally possible, but a huge pain the Aslan.
First, here's our bog-standard fib in Hoon:
|= n=@ud
?: (lte n 1)
n
%+ add
$(n (dec n))
$(n (sub n 2))
Now, I memoize on the way down, by calculating just f(n-1) and memoizing those values, to acquire f(n-2): :- %say
|= [* [n=@ud ~] [cache=(map @ud @ud) ~]]
:- %noun
^- [sum=@ud cache=(map @ud @ud)]
=/ has-n (~(get by cache) n)
?~ has-n
?: (lte n 1)
[n (~(put by cache) n n)]
=/ minus-1 $(n (dec n))
=/ minus-2
=/ search (~(get by cache.minus-1) (sub n 2))
?~ search 0
(need search)
:- (add sum.minus-1 minus-2)
(~(put by cache.minus-1) n (add sum.minus-1 minus-2))
[(need has-n) cache]
and that works in the Dojo: > =fib-8 +fib 8
> sum.fib-8
21
but it sure is easier in Python! And I'm not picking on Hoon here, it's just pure functional programming that makes you think this way - which as a hacker is fun, but in practice is kinda inconvenient.I even wonder how much faster I actually made things. Let's see:
> =old now
> =res +fib 18
> sum.res
2.584
> (sub now old)
1.688.849.860.263.936
:: now with the non-memoized code...
> =before now
> +fib 18
2.584
> (sub now before)
1.125.899.906.842.624
Ha! My super improved memoized code is actually slower! That's because computing the copies of the map costs more than just recurring a bunch. This math should change if I try to compute a bigger fib number...Wait. Nevermind. My memoized version is faster. I tested it with the Unix time command. It's just that Urbit Dojo has a wierd way of handling time that doesn't match my intuition. Oh well, I guess I can learn how that works. But my point is, thinking is hard, and in Python or JS or C I only have to think in terms of values and pointers. And yes, that comes with subtle bugs where you think you have a value but you really have a pointer! But most of the time it's pretty easy.
Btw sorry for rambling on with this trivial nonsense - I'm a devops guy so this is probably super boring and basic for all you master hn swe's. But it's just a tiny example of the constant frustrations I've had trying to do things that would be super simple if I could just grab a reference and modify something in memory, which for better or worse, is how every imperative language implicitly does things.
FP proponents spotted a small number of problems which arose in certain specific OOP implementations and stubbornly decided that OOP itself was to blame.
Some FP proponents have pointed out that passing around instance references to different parts of the code can lead to 'spooky action at a distance.' For example, if references to the same child instance are held by two different parent modules and they both invoke state-mutating methods on the child instance, it becomes difficult to know which of the two parent modules was responsible for the mutation of state within the child instance...
The mistake of FP proponents is that they failed to recognize the real culprit of the flaws that they've identified. In the case of 'spooky action at a distance', the culprit is pass-by-reference.
If you keep that in mind and consider what OOP pioneers such as Alan Kay have said about OOP "It's about messaging," it becomes an irrefutable fact that this flaw has nothing to do with OOP but is merely a flaw in specific implementations of OOP... Flawed implementations which neglected the messaging aspect of OOP.
To summarize it simply; with OOP, you're not supposed to pass around instances to each other, especially not references. What you're supposed to pass around are messages. The state should be fully encapsulated. Messages are not state, instances are state. Instances shouldn't be moved around across multiple modules, messages should.
If you approach OOP with these principles in mind, and make the effort to architect your systems in such a way, you will see it solves all the problems that FP claims to solve abd it doesn't introduce any of its problems... Which are numerous and would require an entire article to enumerate.
Rust quietly has several other features in order to improve the quality-of-life of its ownership model. Two examples: if you have a mutable reference then Rust will coerce it to an immutable reference if one is required, and if you have a mutable reference then Rust will transparently re-borrow it when calling functions that accept mutable references in order to allow you to use the mutable reference more than once despite the fact that they do not implement Copy.
However, I don't think a GC will catch on in Rust in its current form. Rust's userbase likes it as a no-GC language. Plus a 3rd party library GC wrapper type can't save you from having to learn ownership and lifetimes used by literally everything else. Once you invest time to learn the "zero-cost" references, a runtime Gc is less appealing, and won't be as ergonomic than the built-in references.
Swift, OCaml, and Mojo are trying to add some subset of Rust-like ownership and borrowing, but in a simpler form.
Since it's not needed and it's massively worse than reference counting (assuming you only change reference counts when essential and use borrowing normally) due to the absurd behavior of scanning most of the heap at arbitrary times, there is no Rust GC crate in widespread use.
I get what you're trying to say, but that is provably false. As great as the OCaml compiler is, it currently is not capable of the aggressive optimizations that GHC can do with lists.
More often than not, the compiler mostly won't have enough static assertions to reliably generate machine code like that in a real world application (unless explicit mutation is used, of course).
> Functional programmers just trust that their compiler will properly optimize their code.
Precisely. This is why having safe local mutation as a language level feature can give more control to the programmer. We no longer have to rely on the compiler to correctly guess whether a routine is better expressed as an array or a cons list.
> The whole article is secretly about Haskell.
and ML, Koka, Clean, Mercury. The article is about allowing local mutation without breaking referential transparency at the language level.
"Stop using haskell" is a very shallow conclusion, IMO.
This cannot be true in general. There are machine code patterns for which arrays are faster than linked lists. The OCaml compiler, great as it is, won't turn linked list source code into array machine code. Therefore, there is idiomatic code in OCaml that will not be as performant as arrays.
> It gets fairly obvious when you realise that most Ocaml developers switch to using array when they want to benefit from unboxed floats.
This is one example why your statement above is not true.
You are misreading my comment. I’m not intentionally contradicting myself in two paragraphs next to each other (I’m not always the brightest but still).
The point is that contrary to what the article states ML developers are not avoiding mutations because they are uneasy to use but because they trust their compiler when they know it will do good. Proof is that in other case they will use mutations when it makes sense to do so because the compiler does not do a good job.
The first paragraph refers to the specific case my comment quotes just before: data structure traversal and storage of elements in a set.
Why not? If the compiler can see that you have a short-lived local linked list and are using it in a way for which an array would be faster, why would it not do the same thing that an array would do?
"The fact that most programming languages don’t give enough semantic information for their compiler to do a good job doesn’t mean it necessary has to be so. Functional programmers just trust that their compiler will properly optimize their code."
> Well, no, this is straight confusion between what’s expressed by the program and what’s compiled. The idiomatic code in Ocaml will end up generating machine code which is as performant than using mutable array.
I disagree with. There are different ways to get close to the performance of `Array.map` with lists (best case scenario you don't care about order and can use `List.rev_map`), but you will always have memory (and GC) overhead and so lists are strictly inferior to arrays for the presented use case.
this forces me to optimize the f# code by thinking in terms of low-memory, mutable data structures when interfacing with external libraries.
I like to think we helped with this several years ago when making official guidance: https://learn.microsoft.com/en-us/dotnet/fsharp/style-guide/...
But this is not true, is it? I thought that Haskell solution for side effects was just tagging the functions with side effects, and make the compiler handle functions with side effects as regular programs, no laziness, no memoization, no reordering. 'Monad' is just a mathematical structure that regular programs obey, also 'do notation' is just a style that allows Haskell programmers to write imperative style code.
Come'on FP hackers, prove me wrong!
Was the article updated since you wrote this? I don’t see the text you’re referring to.
> Well, no, this is straight confusion between what’s expressed by the program and what’s compiled.
You’re getting at an important point here, but then you seem to fall into this same trap when you write:
> … the many fairly convoluted tricks haskellers pull to somehow reintroduce mutations
Monads started out as a way to represent the semantics of effects in a mathematical context, to support formal representations of the semantics of programming languages that were more tractable from the perspective of analysis and proofs. Even mainstream compilers ended up using related techniques, like static single assignment, for which an equivalence to continuation-passing style exists, and they did this for the same kinds of reasons: tractability of analysis and to support automated transformations.
The use of monads for writing ordinary code - as opposed to language semantics - in Haskell exploited these techniques, allowing effects to be expressed in a purely functional way. But at its root, this is a rigorous way of expressing scenarios that require effects, it’s not just some sort of “convoluted trick”. There are benefits to doing this that go beyond just a hack to implement effects in a pure language.
Which is why it’s unlikely that people who understand these issues will “stop using Haskell”, despite the learning curve barrier it seems to cause (arguably because people tend to learn to program in ad-hoc ways, which Dijkstra notoriously bemoaned.) But many of the most powerful languages have such a barrier, it just takes different forms depending on the nature of the language.
I mean, that statement is untrue but even then I wasn't talking about monads here (monads are not a convoluted trick as far I'm concerned). I was thinking of lenses.
> Which is why it’s unlikely that people who understand these issues will “stop using Haskell”
Plenty of people who value being able to analyse program and do proof don't use Haskell. The heart of the debate is "Is being pure worth the cost?".
While everyone has to trust their compiler will make reasonable optimizations to some extent, there becomes a certain point of complexity where it becomes difficult to intuitively know if a "sufficiently smart compiler"[1] will properly optimize which is problematic.
I realize you're arguing Haskell is worse than Ocaml in this regard, but I'd argue it's harder to reason about how functional code will be translated into machine code than comparable procedural code in general.
I think it's funny that they make you write linked list code as a metaphor for generators, but it seems like it should be the other way round.
(Also, it has exceptions which are a bad language feature, and typed throws which are a worse one.)
As an aside, I watched an Alexis King stream (which I can't now find) in which she did a deep dive into effect systems and said something along the lines of: algebraic effect systems should not change their behaviour depending on nesting order e.g. Either<State<>> vs State<Either<>>.
Does bluefin have a particular philosophy about how to approach this?
> I'll be sure to check it out
Great! If you have any questions or thoughts then feel free to file an issue on the repo (https://github.com/tomjaguarpaw/bluefin/issues/new).
https://hackage.haskell.org/package/mtl-2.3.1/docs/Control-M...
You're right, through a stroke of luck it's possible to use `ExceptT e ST r` and either handle the exception part first, to get `ST r`, or handle the ST part first to get `Either e r`, so in that sense you can "mix" exceptions and ST. That doesn't work for all transformers though. If you have `Stream (Of a) ST r` then you must consume the stream first. You can't run the ST part an get a pure stream `Stream (Of a) Identity r`. So in that sense ST can't be mixed with other effects. Bluefin does allow you to do that, though.
Compare ExceptT e (StateT m a) and StateT (ExceptT e m a): if you just want your computation to have state and exceptions the difference shouldn’t matter.
> Those familiar with rust will notice that this is not how rust works. If a function takes T, and you have &T, you just cannot call that function. (Ignore Clone for now.)
I think this is wrong. An exponential type in Rust is a type that implements `Copy`. The analogy in Rust is:
fn linear_fun<T>(x: T) {
// ...
}
fn main() {
let foo = 5;
linear_fun(foo);
println!(foo);
}
And that compiles fine: `foo` is implicitly copied to maintain that `linear_fun` owns its parameter.You can ignore `Clone`, but ignoring `Copy` destroys the premise, because without it Rust has no exponential types at all.
EDIT: I agree Rust solves the issue of mutability fairly well. Furthermore, I think practical linear types can be added to a Rust-like type system with Vale's (https://vale.dev/) Higher RAII, where a "linear type" is an affine type that can't be implicitly dropped outside of its declaring module.
I don't know if this is what Vale does, but to enforce "can't be implicitly dropped outside of its declaring module" in Rust I would add two changes:
- Whenever the compiler tries to insert implicit drop code for a linear type outside of its declaring module, it instead raises an error.
- Type parameters get an implicit `Affine` auto-trait, like `Sized`. If a type parameter is `?Affine`, the compiler will refuse to insert implicit drop code. Standard library generic parameters will be `?Affine` wherever possible, e.g. containers like `Vec` and `HashSet` will have `T: ?Affine`, but the methods that could implicitly destroy an element like `HashSet::insert` will have plain `T`.
it somewhat strains the analogy because rust is implemented in a very elegant way (where references can be used multiple times because they implement Copy), but the analogy to exponentials in rust would be references. Just imagine clone and copy aren’t a thing, and that references have a special case that allow them to be used multiple times while owned values can be used at most once. The thing to note is that if you have an owned value, you can pass a reference to it to as many functions as you want (so long as those functions expect references). But if you have a reference, you can’t pass it to a function that expects an owned value unless the type provides you a way to make a copy.
You can imagine starting with this and then building up to rust in a nice way. You first implement passing an owned value as a move. Then first add types that can be used multiple times because they are still valid after a move. (the Copy trait.) And then you make references Copy since they meet that criteria.
With Fibonacci numbers, you cab just compute two if them outright:
def fib_(n: int) -> tuple[int, int]:
if n == 0:
return (1, 1)
prev, this = fib_(n - 1)
return (this, this + prev)
def fib(n): return fib_(n)[0]
Now there is no mutable state that survives between function calls, the performance is linear.With true memoization though accessing a previously computed value.would be constant time.
This is precisely what I did in my Hoon solution :) However, I wasn't aware that this approach is the standard way, and I'm glad to have learned that! Thanks
++ fib
|= a=@
^- @
~+
?: (lte a 1) a
%+ add
$(a (sub a 2))
$(a (sub a 1))
Try e.g. (fib 100) (don't try it without the ~+)The compiled code is itself memoizable at the VM execution level. This memo cache is transient within one system event (i.e., pressing enter after (fib 100) to get the result).
Thanks so much for this enlightening comment :3
Although I'm curious why Hoon doesn't just detect and cache identical computations by default. I guess it's a tradeoff, since using ~+ is more memory intensive, and you don't always want that either. Especially is Urbit itself is already fairly memory intensive.
Svelte has these stores that are globally reactive. The reactivity is convenient, but the stores can also be globally updated. This could result in chaos that I wished to corral.
So I tweaked stores so they remained globally reactive, but could only be directly updated internally (from "inside the nation").
To update the the state from outside the nation, an event (message) must be sent to the nation state, which handles the direct update.
It's interesting how FP community took one issue which affects many OOP implementations and decided to throw the baby out with the bathwater... Completely ignoring the reason why OOP was invented in the first place.
I've worked on several FP projects. First of all, they're never pure FP projects because that's just not possible. Those which were close to pure FP were a tangled mess of curried functions. Many modules had weird technical names with unclear responsibilities, most features had to traverse many different files/functions with lots of unrelated data passing through various modules with detailed child state which had to be passed down from high up in the 'module' hierarchy and traverse many layers to finally get to the module which would actually use that state. Higher level modules exposed complex functions with complex input and output parameters (had to concern themselves with the highly specific state requirements of child and parent modules; including configurations!) which would have to be modified for every single high level AND low-level requirement change... I'm not surprised why big companies who use FP started using monorepos; FP makes monorepos necessary because every small change requires updating a large part of the code hierarchy because everything is intertwined with everything else (tight coupling). The higher level components have to know exactly what state the child components expect because the data comes from a central place and must be passed down.
The alternative approach would be to have each child sync itself with a global store; but then, it implies that you no longer have a single owner for each piece of state; thus, you might end up with different children which have overlapping responsibilities which update the same part of the global state, creating 'spooky action at a distance' which makes it hard to trace where the change originated... This takes us back to the very reason why FP community invented FP in the first place. In other words, FP doesn't solve the 'spooky action at a distance' problem which was its entire raison d'etre.
Frameworks like React had to invent a whole host of technical concepts and tools to help manage and track complex state changes... E.g. Redux, Saga, etc... But then we still end up with a situation where all our components are tightly intertwined with the specific implementation of that specific framework since the number of variations of how people use Redux/Sagas, etc... across different projects/companies is substantial. This means that components built for one project are generally not immediately compatible with components built for a different project. If the two projects use a different global store mechanism or they use the same mechanism but have different names for the same actions (to dispatch state changes), the components will not be compatible across different projects without a lot of refactoring to bring the component into alignment with both projects. This isn't modular! This isn't easy to read! It's not maintainable!
Control-flow is an awkward choice there, as goto is not necessarily fundamental for how control-flow is run for a lot of code. And I have absolutely used labeled break/continue in Java before for a control loop that ran great until people tried to refactor to use more indirect control.
I also think it is interesting as I greatly prefer code where you can do basic left/right and top/down reading to know what is intended by the code.
At any rate, my original intent was to discuss code that is controlling something works really well if you embrace a metaphor for the code you are in.
class Foo {
var foo = [Int]()
var bar: [Int] {
get {
foo
}
set {
foo = newValue
}
}
}
let obj = Foo()
Calling `obj.foo.append(i)` in a loop takes linear time, while `obj.bar.append(1)` is quadratic time. `obj.foo.append()` does a borrow operation resulting in there never being more than one reference at a time, while `obj.bar.append()` does a get followed by a set, meaning that there's always two references to the array and every append does a copy on write. `let bar = obj.bar; for i = 0..<max { bar.append(i) }; obj.bar = bar` would do just a single CoW.Usually of course your computed properties actually do something so this difference feels less surprising. Swift does offer the undocumented `_modify` property accessor to let computed operations do borrows, but making it an official feature is waiting for noncopyable types to be finalized.
Because it doesn't. Doesn't mean it couldn't, if it tried hard enough. But it doesn't, as a statement of current fact.
What has to repeat is an identical subject (which will have a shape like [argument-to-fib other-local-context standard-library]) and formula (the compiled code for fib). Pretty much the only time this will ever happen is something recursing into itself. Most tail recursion doesn't reevaluate the exact same arguments multiple times. It just so happens that naive fib's exponential self-recursion into itself twice does do that.
So it wouldn't be useful to stick a ~+ on, say, factorial. Except! If you're, say, computing every factorial from 1 to 100 in a loop, it comes in handy - because now you can reuse your computation of, say, 50! when computing 51!, so it's just one multiplication.
But most Nock reductions by volume are not anything that repeats usefully.
"Call-by-need (or lazy) languages, such as Haskell, wear a hair shirt because their evaluation order is deliberately unspecified. Suppose that we were to extend Haskell by adding side-effecting “functions” such as printChar. Now consider this list
xs = [printChar 'a', printChar 'b']
(The square brackets and commas denote a list in Haskell.) What on earth might this mean? In SML, evaluating this binding would print 'a' followed by 'b'. But in Haskell, the calls to printChar will only be executed if the elements of the list are evaluated. For example, if the only use of xs is in the call (length xs), then nothing at all will be printed, because length does not touch the elements of the list.The bottom line is that laziness and side effects are, from a practical point of view, incompatible. If you want to use a lazy language, it pretty much has to be a purely functional language; if you want to use side effects, you had better use a strict language.
For a long time this situation was rather embarrassing for the lazy community: even the input/output story for purely-functional languages was weak and unconvincing, let alone error recovery, concurrency, etc. Over the last few years, a surprising solution has emerged: the monad. I say “surprising” because anything with as exotic a name as “monad” — derived from category theory, one of the most abstract branches of mathematics — is unlikely to be very useful to red-blooded programmers. But one of the joys of functional programming is the way in which apparently-exotic theory can have a direct and practical application, and the monadic story is a good example. Using monads we have found how to structure programs that perform input/output so that we can, in effect, do imperative programming where that is what we want, and only where we want. Indeed, the IO monad is the unifying theme of these notes."
https://www.microsoft.com/en-us/research/wp-content/uploads/...
Haskell was actually widely successful in showing that, no doubt about that.
First, obviously you have unsafePerformIO, so that everything can have side-effects.
Second, you have side-effects like using memory or using the CPU. You are not supposed to worry about those. Though a more serious side effect you do have to worry about is non-termination. Haskell doesn't track that in its type system.
You are right that the way input/output is handled can be described not as _side-effects_ but as _effects_ of interpreting values of the IO datatype.
There's a reason why Monads aren't exactly monadic and why IO was the original monad in GHC -as well as why non-lazy, non-super-pure languages never really go for Monads
The way JavaScript handles async is pretty close to monadic. Error handling via the local equivalent of Maybe / Either is monadic. Tuples are monadic. Sequences can be monadic. Etc. Some languages have a flexible enough type system to expose this (like Haskell), some don't. Some like Rust generally don't expose monads to the type system, but their users are aware enough of the shared monadic structure that you can see it reflected in the naming conventions for functions that do essentially the same thing (in a monadic sense) but for different structures.
Sure, ultimately the code mutates values in place, but the specific actions are modeled as pure operations
You could craft hand written assembly code which will be faster than optimised C code most of the time yet you don't. Plenty of programmers are perfectly writing imperative code in Java doing a ton of necessary boxing and unboxing. It's all a trade off between performance and usability. The fact is that the Ocaml compiler does a good enough job with functional code that its performance is actually comparable to imperative solutions most of the time.
This is called Single Static Assignment form and its denial of mutation is crucial to optimization, from common expression removal, to efficient register allocation, and all sorts of control flow analysis.
You probably wanted to write something like 'If you compile your _mutating_ program with LLVM, [...]'?
Now that hardware angle has not been very successful on the whole, and we are left with languages that end up feeling a bit out of place on the hardware we have today.
Another thing to note is that there is a lot of untapped potential in fb compilers. It’s suffering from underinvestment.
Can you elaborate? You can't stop people simulating exceptions with sum types, and if you have exceptions, why wouldn't you want them to be typed?
What is presumably talked about is that Haskell also has actual exceptions, generated by calling e.g. `error` or `undefined`.
The semantics of these are.. interesting, mostly thanks to lazy evaluation. For example, `fst (5, error "second") ` is safe to evaluate because the second half of the tuple is a thunk and does not get evaluated. Additionally, there is, to my knowledge, no way to handle exceptions in pure code, presumably due to the undefined evaluation order.
That said, I'm not sure what the alternative would be, because a function like !! (list indexing) can fail and dealing with its fallibility would be a big burden on the programmer.
Yes, exactly.
> The semantics of these are.. interesting, mostly thanks to lazy evaluation. For example, `fst (5, error "second") ` is safe to evaluate because the second half of the tuple is a thunk and does not get evaluated
Correct, and the semantics of loops is also.. interesting. For example `fst (5, last [1..])` is also safe to evaluate.
> What is presumably talked about is that Haskell also has actual exceptions, generated by calling e.g. `error` or `undefined`.
Well, I'm not sure, that's why I asked. I'm trying to understand what astrange meant by "it has exceptions which are a bad language feature, and typed throws which are a worse one". (Throwing exceptions from pure code should be left to such cases, that are impossible to recover from, in my opinion.)
> there is, to my knowledge, no way to handle exceptions in pure code, presumably due to the undefined evaluation order
Correct
> That said, I'm not sure what the alternative would be, because a function like !! (list indexing) can fail and dealing with its fallibility would be a big burden on the programmer.
Indeed. Even more so, what is one supposed to do when an invariant has been violated due to a programming error and there's no way to make progress? Haskell's exceptions are essential. It's even better when they're used in a well typed, well scoped manner, such as provided by my effect library Bluefin
https://hackage.haskell.org/package/bluefin-0.0.6.0/docs/Blu...
That's why I wanted to understand more about what astrange meant. It doesn't match my understanding!
It will do a good job, yes. Will it do the best possible job compared to some other algorithm or data structure? It can't. Not in general.
And maybe not in the specific case either:
> The first paragraph refers to the specific case my comment quotes just before: data structure traversal and storage of elements in a set.
So, this: https://ocaml.org/play#code=bW9kdWxlIE15X2R5bmFycmF5ID0gc3Ry...
$ for len in 5_000_000 10_000_000 25_000_000; do echo "-- ${len} elements --"; ./a.out list $len; ./a.out dynarray $len; ./a.out my_dynarray $len; echo; done
-- 5_000_000 elements --
list: 0.191551 sec
list: 0.196947 sec
list: 0.192806 sec
dynarray: 0.301362 sec
dynarray: 0.268592 sec
dynarray: 0.266118 sec
my dynarray: 0.163004 sec
my dynarray: 0.142986 sec
my dynarray: 0.143634 sec
-- 10_000_000 elements --
list: 0.377447 sec
list: 0.367951 sec
list: 0.312575 sec
dynarray: 0.607158 sec
dynarray: 0.582378 sec
dynarray: 0.538621 sec
my dynarray: 0.319705 sec
my dynarray: 0.296607 sec
my dynarray: 0.286634 sec
-- 25_000_000 elements --
list: 0.971244 sec
list: 0.953493 sec
list: 0.922049 sec
dynarray: 1.515892 sec
dynarray: 1.319543 sec
dynarray: 1.328461 sec
my dynarray: 1.119322 sec
my dynarray: 0.971288 sec
my dynarray: 0.973556 sec
-- 50_000_000 elements --
list: 1.852812 sec
list: 1.848514 sec
list: 1.505391 sec
dynarray: 3.065143 sec
dynarray: 2.941400 sec
dynarray: 2.672760 sec
my dynarray: 2.115499 sec
my dynarray: 1.963535 sec
my dynarray: 1.995470 sec
-- 75_000_000 elements --
list: 2.942536 sec
list: 2.910063 sec
list: 2.354291 sec
dynarray: 4.567284 sec
dynarray: 4.342670 sec
dynarray: 3.979809 sec
my dynarray: 2.528073 sec
my dynarray: 2.225738 sec
my dynarray: 2.226844 sec
A simple dynamic array implementation (my_dynarray) beats a list over a wide range of lengths. But not at all lengths! OCaml's built-in Dynarray is not competitive, but that's because it wants to make certain strong guarantees.To be clear, I agree with your general point that we can do just fine writing nice clean pure functional OCaml code for most of our code and can hand-optimize where needed. But your very specific claims rub me the wrong way.
if d.length = Array.length d.values then begin
d.values <- Array.(append d.values (make (length d.values) x))
end;
the array reallocation should actually be: if d.length = Array.length d.values then begin
let new_array = Array.make (Array.length d.values * 2) x in
Array.blit d.values 0 new_array 0 (Array.length d.values);
d.values <- new_array
end;
otherwise we allocate about a third more memory than needed. It's telling that even with this performance bug the dynamic array was broadly better than lists.New results for the previously slowest cases:
-- 25_000_000 elements --
list: 0.977002 sec
list: 0.963903 sec
list: 0.950473 sec
dynarray: 1.476165 sec
dynarray: 1.281724 sec
dynarray: 1.343222 sec
my dynarray: 0.872558 sec
my dynarray: 0.755902 sec
my dynarray: 0.753746 sec
-- 50_000_000 elements --
list: 1.914777 sec
list: 1.886989 sec
list: 1.542614 sec
dynarray: 2.922376 sec
dynarray: 2.783559 sec
dynarray: 2.537473 sec
my dynarray: 1.725873 sec
my dynarray: 1.545252 sec
my dynarray: 1.515591 sec
-- 75_000_000 elements --
list: 2.827154 sec
list: 2.835789 sec
list: 2.318733 sec
dynarray: 4.354404 sec
dynarray: 4.150271 sec
dynarray: 3.781488 sec
my dynarray: 1.887360 sec
my dynarray: 1.929286 sec
my dynarray: 1.814873 sec
This turns an uneasy head-to-head into a clear win for dynamic arrays. Honestly, how could it be otherwise?It can't be otherwise if you're just assuming a straightforward compilation model where your written roughly reflects the assembly code that's generated. This just isn't the case with better compilers though.
For instance, Haskell can often perform rewrites or fusion passes that entirely eliminates the loop and all intermediate data structures, effectively giving a near-infinite speedup compared to alternatives. You typically can't perform such optimizations when side-effects are in the middle of the computation, for numerous reasons.
Also, List.map is not optimised in Ocaml. It uses a naive implementation and not a tail call. You have to use rev_map to get good perf.
The exact wording in the article is "let's say you're iterating over some structure and collecting your results in a sequence". You are interpreting a lot into this. Also, your description is of a map (reversed, and expressed as a fold). Anyway, where is your benchmark?
> Also, List.map is not optimised in Ocaml. It uses a naive implementation and not a tail call.
You are again contradicting yourself. Previously you were praising OCaml's optimization capabilities and now you are questioning them. Specifically in a case where there is a magic optimization that is explicitly motivated by List.map: https://ocaml.org/manual/5.2/tail_mod_cons.html . A magic optimization implemented using, guess what, mutation.
let rec f evens odds = function
| [] -> (evens, odds)
| x :: xs ->
if x mod 2 = 0 then f (x :: evens) odds xs
else f evens (x :: odds) xs
OCaml optimizes this fairly well and will compil it down to a single loop with two variables. If you reverse the lists there is going to be additional loops (and corresponding allocations).However OCaml also provides the "tail mod cons" optimization that allows to get some of the benefits of TCO without needing to reverse the list (this is implemented as a program transformation that uses mutability under the hood), and that one will only work if you are building a single list.
But sure, in the `fold` scenario where you don't know the number of results in advance (you are more likely to know if you use imperative data structures, e.g. `Hashtbl.length` is constant-time whereas `Map.cardinal` is not), lists might be faster than growing arrays with copies. They are still going to use more memory, and they are unlikely to to be faster than a rope-like structure that grows with no copies.
A parallel `Array.map` still computes a sequence, even though it may not compute in sequence.
But... you don't just want that. You almost certainly care whether state changes are discarded when an exception is thrown. I don't claim that the types there are the most obvious or natural way to specify that, but there is a meaningful difference that shouldn't be handwaved away.
If effect A is invoked before effect B, effect A should happen and stay that way: if I update state before an exception is thrown, the update should persist. If I send a network message before an exception is thrown, the network message is still sent. If I launch the nukes before an exception is thrown, the missiles are still flying.
If I want to batch up state updates to only happen together as a unit, I'll use a transaction effect.
Also, other ways of organizing dynamic not-quite-an-array-but-not-quite-a-linked-list data structures exist. It could be a dynamic array (or linked list) of dynamic arrays to eliminate repeated copying of the oldest elements.
(You are right on the general hand-waving level that such optimizations are sometimes possible. But there are no fuseable intermediate data structures in this particular problem.)
I haven't used Bluefin, but don't Bluefin exceptions suffer from the same issue where it, essentially, 'infects' the program flow? For example, `fst (5, error "second")` seems like it would translate to `fst $ (5,) <$> throw e "second"`, where you would also need to pull an `e` from somewhere. More realistic, something like head would probably be quite awkward:
head e [] = throw e "head: empty list" head e (x:xs) = pure x
You now need to pass in the exception handle, and the return value is now `Eff es a` instead of `a`. This means that you cannot just use the result, so you will likely need to fill your program with monadic stuff like do-notation, <$> and <$>, complicating the program flow and likely also reducing laziness.
fst (5, error "second")
and head [] = error "empty list"
head (x:xs) = x
are simply not things you should be writing. They're not good program design and it's not a weakness that Bluefin doesn't support them unaltered.The OCaml compiler disagrees with you ;)
It also won't ever turn a cons operation into a dynamic array append.
* Errors must be explicitly listed as part of the function signature. Checked exceptions and the equivalent for exceptions but they are rarely used in practice. I think Android uses them, but they were so unpopular in C++ that they removed them from the language!
* The syntax to catch and handle errors is very different and more more verbose for exceptions. It can also make flow control a real pain in some languages where you can't declare a variable outside the try body (e.g. references in C++).
* Result errors need to be explicitly handled whereas exceptions are silently propagated by default.
Even though they're similar enough that you could translate one to the other in most cases, they're different enough that saying one is "an emulation" of the other is just stupid.
In my experience Result-based handling is far superior with two exceptions:
1. In functional code like map & filter where it can become quite awkward to explicitly deal with returning errors.
2. It's hard to get a stack trace from where the Err was created rather than from where it was unwrapped. Less of a problem with exceptions which record a stack trace from where they were thrown (in most languages anyway - C++ is an annoying exception).
I did say that indeed. Am I to conclude doing so was stupid? If so I would find that very rude.
(For what it's worth I was trying to understand what astrange meant by "it has exceptions which are a bad language feature, and typed throws which are a worse one" and offering that characterization as a way of trying to tease out exactly what he/she meant. Your response contains many interesting points and I would otherwise be interested in discussing with you further, but I'm not too inclined to now that you have suggested you might think I'm stupid.)
In principle, static analysis could identify unhandled exceptions, then trace the exception, then make that information available to the top-level "Err returned from main" handler. In practice, that's never going to happen in Rust.
The problems with GC are threefold, and why you might not want it in a systems language:
1. GC requires more memory than strictly necessary to be efficient (usually about 1.5x - 2x the amount you absolutely need). You're basically trading runtime efficiency for memory.
2. GC performance is harder to predict and reason about than certain other allocation strategies
3. GC languages tend to encourage excessive heap allocation for various reasons, ending up with much more junk than a typical Rust or C program that has a similar amount of entities
Note that item 2 is the one that's least understood. The best part about GCs is that they make heap allocation trivial, and they make de-allocation a no-op. In contrast, both malloc() and free() are extremely complex and costly operations. The GC does impose a cost on every pointer write, similar to (but typically less than) the overhead of Arc<T> over a T*, but that has a very uniform and predictable cost. The problem of unpredictability only comes in the collection phase, and is mostly related to (a) when the collection happens, (b) how much data actually has to be scanned (how many live objects are present on the heap and stack), and (c) what type of collection needs to happen (is it enough to collect from this thread's young generation, or do you need to collect all generations from all threads).
Note that many of these problems are in fact solvable, and there actually exist GCs with constant predictable collection times, suitable even for realtime applications (which malloc/free don't support). They are very sophisticated technology that no one is distributing for free though (e.g. you have to pay Azul for a realtime-compatible Java).
I've worked on tiny embedded systems (.net micro framework) where for a given usage pattern the GC was perfectly predictable, as it should be.
Assuming you use a set of slabs of fixed size objects and keep free objects in a linked list, both malloc and free are trivial O(1) operations.
Destructors with cascading deletions can take time bounded only by memory allocation, but you can solve that for instance by destroying them on a separate thread, or having a linked list of objects to be destroyed and destroying a constant number/memory size of them on each allocation.
I'm not sure what you mean by this being the least understood. It seems like it's very well understood: GC introduces latency if you want good throughput, or it reduces throughput if you want excellent latency (sub-100us is possible).
Of course, that doesn't mean you can predict what specific throughput or latency properties your specific program will have, except for GCs that have maximum bounded latency guarantees.
What do you mean? Aren't GCs both less efficient at run time and use more memory?
Take the extreme case of a program where every allocation is permanent. The GC allocation will be much much faster than malloc() (GC allocation is normally just bumping a pointer to the end of the heap, while malloc() typically does a lot more work to segregate objects by size etc), and collection will never run. So, overall, the program will be significantly faster.
Edit: More realistically, a program that typically produces 1:1 live mmeory:junk, but occasionally spikes, will consume 0 GC time in normal operation (the GC will just let the junk sit there), while free() needs to run on all that junk as soon as it was created, otherwise it leaks forever.
Also, the fact that GCs can defer collection to later than the object going out of scope can often make them more efficient than a traditional free(). For example, if you have a large graph data structure that is now junk, free() needs to do work that's O(size of graph) to free all the nodes. A copying collector, the most popular design, never even looks at the graph, so it frees it in O(1). Edit: of course, the converse is also true: if the graph survives N collections, the copying GC has to do O(N * size) work, where malloc()/free() do nothing at all.
At this point I must mention that paged memory can get you just enough movability that catastrophic fragmentation of this kind is avoided[2] with high probability. Paging tricks can be seriously expensive on a modern multicore processor, though, so I’m not sure this is the way forward. (The paper report only 1% perf overhead for Firefox; I don’t know if that means Firefox is less multithreaded than it could be or what.)
Manual memory management as a solution to the fragmentation problem trades that off, not knowing anything about when free might be called, and so has to lean towards optimising space and immediate time rather than throughout. But there’s still a memory manager behind the scenes that has to deal with fragmentation as well; there’s no get out of jail free card for that, and that complexity is still hidden.
(Helpful memory usage disciplines like arenas/pools have their desirable properties for the same reasons: it’s a discipline on when you free memory in order to avoid fragmentation.)
So, there’s a performance hit of some kind with optional tuning that takes no expertise. Many people would go for those tradeoffs for some applications. Especially if they got to keep using safe, no-GC code in their libraries with their efficiency benefits.
I curate a list of what kinds of ownership people actually want: https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f...
It's been 6 years since I first posted it publicly, and neither I nor anyone giving suggestions has ever actually found a use for GC.
Rust uses plenty of reference counting. You could replace most of that with a GC; depending on your GC implementation and your reference counting implementation, and your requirements in terms of latency and throughput and ability to handle cycles.
> Since it's not needed and it's massively worse than reference counting (assuming you only change reference counts when essential and use borrowing normally) due to the absurd behavior of scanning most of the heap at arbitrary times, there is no Rust GC crate in widespread use.
There are choices. You can have real time garbage collectors. And if you want to handle cycles, you need some kind of tracing anyway.
Also, if you only care about throughput and not about latency, garbage collection can easily be faster than reference counting.
If you don't allow cycles, you can also take advantage of that in your garbage collection. See how Erlang does it. (In Erlang, the garbage collector relocates your objects so that they are in topological order, so all references only point forwards in memory. You can combine that with generational gc, too, of course.)
Lol, what? Maybe don’t go asserting stuff you clearly know little about. Reference counting is a fine tradeoff for manual memory-managed languages, but it is absolutely smoked out of the water by a tracing GC on most counts. It’s almost like JVM, V8, etc engineers know a thing about the topic and don’t have RC for a good reason.
Tracing GC doesn’t burden the mutator threads with additional work, almost everything can be done in parallel, resulting in vastly better throughput. Imagine dropping the last reference to a huge graph, one can actually observe it when exiting a c++ program, it might hang for a few seconds before returning control to you, as all the destructors are recursively called, serially, on the program thread, literally pointer by pointer jumping across the heap, the very thing you are so afraid of. And I didn’t even get to the atomic part, bumping a number up or down with synchronization between CPUs is literally the slowest operation you can do on a modern machine. Tracing GCs elegantly avoid all these problems at the price of some memory overhead. None of these GC algorithms (yes, RC is a GC) is a silver bullet, but let’s not joke ourselves.
If you use reference counting properly in a well-designed language then it's obviously better than GC since it's rarely used, fast, simple, local and needs no arbitrary heuristics.
The destructor cascades are only a problem for latency and potential stack overflow and can be solved by having custom destructors for recursive structures that queue nodes for destruction, or using arena allocators if applicable.
Like, as I explicitly wrote, it is probably the correct choice for low-level languages close to the metal, that want easy compatibility with other languages through FFI. But the method itself has still got a much slower throughput than a tracing GC, when used in a similar manner. Anything else is a useless comparison, like is a bicycle better than a tank.
What they lose is that they introduce also a certain guaranteed loss of available cpu time, due to optimizing for latency and real-time predictability.
Plus, the free list has to occasionally be walked to actually free pages back to the OS. If you don't, then memory is never freed by free(), it is only marked for potential reuse.
There are several popular implementations of malloc (and their corresponding free), and they are all quite complex and have different tradeoffs. And, in fact, none of them is any more suitable for high performance or realtime code than a GC is. The golden rule for writing this type of code is to just never call malloc/free in critical sections.
And I will note that probably the most commonly used malloc, the one in GNU's glibc, actually uses Linux locks, not atomics. Which means that you are virtually guaranteed to deadlock if you try to use malloc() after fork() but before exec() in a multi-threaded process.
Which means calculating how long free going to take for a given object is incredibly difficult. Which is my entire point. Of course it's not non-deterministic unless someone 's destructor call s some random code but it is not easy to calculate and in the real world is basically unpredictable.
Spitting up a separate thread and doing an unknown amount of work is no more predictable than doing on the main thread.
Fundamentally destructors can do whatever the hell they want which means you never know how long destruction is going to take.
And of course if you're freeing up an object that has a linked list inside of it and you need to clear out the link list, now you're jumping all around memory and that's never fun, and the run time sure as heck is not constant based on what needs to be paged in and out and what exists in cache lines where.
I'm not saying these problems are unsolvable for a given use case and there are good reasons why custom allocators abound throughout the industry, My main point is that manual memory management does not necessarily lead easy to predict run times for freeing up memory.
This especially true because a lot of developers think that malloc and free are just magic somehow.
In reality, garbage collectors often aren't that complicated, and when it comes to large complicated user applications, you're going to be spending a lot of time in the allocator and deallocator no matter what memory management schema you choose to use.
(Edit: and then there's fragmentation which is the death of many manually managed memory systems! A naively Java or c-sharp application can run for far longer than a naively written C++ application!)
You can run your garbage collector on a separate thread, too. Or use a real time garbage collector.
That is correct, but the issue is not with reference counting, but rather with having unnecessary extremely frequent RC/GC operations.
Once frequency is reduced to only necessary operations (which could be none at all for many programs), reference counting wins since its cost is proportional to the number of operations, while GC has fixed but large costs.
PDP-11 unix (without split instruction/data space) had 32kB of memory for userland process, total, plus another 24kB for kernel (8kB are reserved for I/O devices).
The memory management interface that gave us malloc() and free() ultimately boiled down to brk() and sbrk() which simply set a pointer describing where stack and heap met in memory. Using many small programs, you'd have as little as possible of that 32kB used by code, and prefer to simply allocate using bump pointer and never free (better reuse object pools, which is also a technique used with GCs). A program in a pipeline would mostly allocate some stuff at the start plus buffers, and while some of the early bits might become garbage it doesn't really matter because you can't share parts of your 32kB block (or at best you can use 8kB segments). So you do "missile-style GC" and just use a bump-pointer allocator (brk() + sbrk() + sizeof combo), then exit to OS when your work is done and the entire memory space gets cleaned.
EDIT: Some GCs (like in later versions of Symbolics Lisp Machines, or presently in SBCL behind some experimental options) provided multiple arena support explicitly to be able to force-declare everything allocated in an area as garbage.
And in fact C# has always supported "value objects", which do get allocated in-place, Java is adding the same support probably in the next release, and Go actually defaults to it.