Why Rust's ownership/borrowing is hard

Why Rust's ownership/borrowing is hard (softwaremaniacs.org)

122 points by wkornewald 10 years ago | 64 comments

jerf 10 years ago |

"Rust's ownership/borrowing system is hard because it creates a whole new class of side effects."

I'd submit it doesn't create them... it reveals them. They've always been there. Almost every other language fails to shine the light on them, but that doesn't mean they aren't there. All GC'ed languages still have issues of ownership, especially if threaded, and all non-GC'ed languages have all the issues Rust has... it's just that the language doesn't help you.

dpc_pw 10 years ago | |

After coding in Rust, I got paranoid about writing any C. I always knew the issues of ownership, concurrent access, allocation etc. are there, and I had to think about them anyway, but after explicitly dealing with them in Rust all the time, I see them now so clearly. They're everywhere, and without language support it's so easy to overlook e.g. iterator invalidation issues, or aliasing.

Sure, in practice, if you keep your code neat and well designed, and hopefully not too big, it's hard to really trigger such bugs, but in a bigger codebase it's so easy to overlook them...
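To make the iterator invalidation case concrete, here is a minimal sketch. The helper name `append_evens` and the data are made up for illustration. The commented-out loop is the shape that compiles silently in C or C++ and that Rust rejects; the live code is one possible way to restructure it.

```rust
// Hypothetical helper: appends a copy of every even element to the same vector.
fn append_evens(values: &mut Vec<i32>) {
    // Rejected by the borrow checker: the iterator holds a shared borrow of
    // `values` while `push` needs an exclusive one. In C++ the equivalent loop
    // compiles and may invalidate the iterator when the vector reallocates.
    //
    // for v in values.iter() {
    //     if v % 2 == 0 {
    //         values.push(*v);
    //     }
    // }

    // One way to satisfy the checker: finish reading before starting to write.
    let evens: Vec<i32> = values.iter().copied().filter(|v| v % 2 == 0).collect();
    values.extend(evens);
}

fn main() {
    let mut values = vec![1, 2, 3, 4];
    append_evens(&mut values);
    println!("{:?}", values);
}
```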

gnuvince 10 years ago | | |

Worse, when in C you finally convince yourself that a particular piece of code is free of ownership issues, another programmer can come in, add a single line of code or do a "helpful" refactoring and suddenly the issues are back.

klodolph 10 years ago | |

Don't be so quick to say that Rust is doing the "right thing" here. An early example in the article shows one of the big Rust gotchas: you want to borrow a reference to a part of a structure, but if you encapsulate that in a function which takes a reference to the enclosing structure, the entire enclosing structure is borrowed. This isn't a revealed side effect, this is an invented side effect that is an artifact of the type system.
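A minimal sketch of the gotcha described here, using a made-up `Player` struct and helper names (none of them come from the article). Borrowing two fields directly is accepted because the compiler can see they are disjoint; going through a method that takes `&mut self` borrows the whole struct for as long as the returned reference is in use.

```rust
// Hypothetical struct, for illustration only.
struct Player {
    name: String,
    score: u32,
}

impl Player {
    // Borrows all of `self` mutably, even though only `name` is touched.
    fn name_mut(&mut self) -> &mut String {
        &mut self.name
    }
}

// Free function that asks only for the part it needs.
fn bump(score: &mut u32) {
    *score += 1;
}

fn main() {
    let mut p = Player { name: String::from("ann"), score: 0 };

    // Direct field access: two disjoint borrows, accepted.
    let name = &mut p.name;
    bump(&mut p.score);
    name.push('!');

    // Through the method, the whole of `p` stays mutably borrowed while
    // `name` is still in use, so the commented-out call would be rejected.
    let name = p.name_mut();
    // bump(&mut p.score); // rejected: `p` is already mutably borrowed
    name.push('?');

    // Once `name` is no longer used, the borrow ends and this is fine again.
    bump(&mut p.score);

    println!("{} {}", p.name, p.score);
}
```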

nostrademons 10 years ago | | |

I thought that was one of the most fascinating parts - Rust's borrow-checker enforces the Law of Demeter and Principle of Least Privilege as a side-effect.

Code that takes a full structure when it only needs to operate on a part of the structure is badly designed. It's not conveying the full information about the data that it actually needs, which means that unexpected dependencies can crop up, implicit in the body of the function, as the code is modified later on. This is behind a lot of long-term maintenance messes; I remember a few multi-year projects at Google to break up "data whales" where a single class had become a dumping ground for all the information needed within a request.

Thing is, we all do it, because taking a reference to a general object and then pulling out the specific parts you need means that you don't have to change the function signature if the specific parts you need change. This saves a lot of work when you're iterating quickly and discovering new requirements. You're trading ease of modification now for difficulty of comprehension later, which is usually the economically wise choice for you but means that the people who come after you will have a mess to untangle.

This makes me think that Rust will be a very poor language for exploratory programming, but a very good one for programming-in-the-large, where you're building a massive system for requirements that are largely known.

yosefk 10 years ago | | |

I guess I like Rust (certainly more than I like C++), but I agree with you, and disagree with both sister comments, in that I think type systems do introduce artifacts which are then defended as "the right thing actually if you think about it the way the type system does" but they still are artifacts.

So today this helper function borrows one member and tomorrow it might need to borrow two, and every time you're supposed to change the interface to explain exactly what's happening. This is really tedious for a private helper and can be seriously problematic for a public interface, you don't do it in garbage-collected languages (which move the issue from the static type system into the dynamic run time system) and you don't do it in unsafe static languages (which move the issue from the static type system into your brain, which gets things right, some of the time.) The cost of moving it into the type system is as real as the benefit (not necessarily as big or small - it's hard to quantify these things and you need context - but certainly just as real) and it'd be great to see both acknowledged instead of having one or the other denied.

KMag 10 years ago | | |

However, there are real lifetime and thread-safety side effects of passing a reference to the outer structure when you only need a reference to the inner structure.

In a GC'd language without an effects system, passing in a reference to the outer structure would still prevent the GC from freeing the outer structure (depending on the ABI, even if you null out the passed outer reference inside the function, the value passed on the stack might be immutable) and would also mean that you need to be careful about later changes causing non-thread-safe mutations to the outer structure. If your language isn't GC'd and doesn't have an effects system, then you need to manually keep track of the borrowing.

These aren't artifacts of Rust's type system; they're genuine side effects that are present but more subtle in other languages.

asQuirreL 10 years ago | | |

Well, I would say it exposes a flaw in your interface, which would have given the function access to more information than it required. You know it's fine because you are aware of implementation details, but rust is telling you that your interface doesn't reflect that.

jerf 10 years ago | | |

I didn't say Rust is doing the right thing. And it may create other issues where it does not 100% correctly reveal the issue.

However, I'd still say that issues arising in sharing a composite structure in pieces have always existed. Rust's solution may or may not be correct, but the issue is not something it is creating. Adopting Rust does not mean adopting a brand new set of problems that never existed before. It means seeing them clearly for what is probably the first time.

Animats 10 years ago | |

Exactly. C programming has three big questions: "How big is it", "who releases it", and "who locks it". The language gives little help with any of those issues. C++ tries to address all three, but the mechanisms were all painfully retrofitted using templates and they leak.

In the example in the article, "is_origin(point)", the code for which is not shown, is clearly bogus. A function that's just a predicate should not consume its input. It should use read-only access by reference.
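As a rough sketch of the difference: the article does not show the body of `is_origin`, so the `Point` fields and both function bodies below are assumptions, and `is_origin_by_value` is a made-up name for the consuming variant.

```rust
// Assumed shape of the article's type; the field types are a guess.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Consuming version: the caller's value is moved in and is gone afterwards.
fn is_origin_by_value(p: Point) -> bool {
    p.x == 0 && p.y == 0
}

// Predicate version: read-only access by reference, the caller keeps the value.
fn is_origin(p: &Point) -> bool {
    p.x == 0 && p.y == 0
}

fn main() {
    let point = Point { x: 0, y: 0 };

    println!("{}", is_origin(&point));
    println!("{:?}", point); // still usable after a borrow

    println!("{}", is_origin_by_value(point));
    // println!("{:?}", point); // rejected: `point` was moved in the call above
}
```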

One big advantage of Rust is that, because the ownership checking is safe, you don't have to make copies of things just to simplify memory allocation control. In some C++ GUI libraries, strings are copied again and again to prevent memory allocation errors. Rust should be more efficient. It's going to be interesting to see if Servo puts a dent in browser memory consumption. It's insane that browsers now can need more than 1GB of RAM. There have to be multiple copies of the same data.

pcwalton 10 years ago | | |

> It's going to be interesting to see if Servo puts a dent in browser memory consumption. It's insane that browsers now can need more than 1GB of RAM. There have to be multiple copies of the same data.

Most of a browser's memory consumption usually consists of JS and DOM objects. In effect, the pages you're visiting are actually doing the bulk of the allocations, not your browser.

pron 10 years ago | |

> I'd submit it doesn't create them... it reveals them. They've always been there.

Hopefully, but not necessarily. Any (decidable) type system rejects some correct programs (i.e., it "uncovers" problems that are not actually there), and the borrow checker is no exception. You will write some correct Rust programs that the borrow-checker just can't verify. This means that you will need to explain yourself in more detail (through more work) to the type checker, even though there was no mistake in your program. I think this is a good thing (depending on your requirements), but it isn't free.
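One commonly cited example of this, sketched under assumptions: `get_or_default` is a made-up helper, and the commented-out version is a pattern that the borrow checker has historically rejected even though it never uses the map in two ways at once (newer analyses may eventually accept it). The live version does the same thing, phrased in a way the checker can verify.

```rust
use std::collections::HashMap;

// Historically rejected: the borrow from `get_mut` is returned on one path,
// so the checker treats `map` as still borrowed on the other path, where
// `insert` needs it.
//
// fn get_or_default(map: &mut HashMap<u32, String>, key: u32) -> &mut String {
//     if let Some(v) = map.get_mut(&key) {
//         return v;
//     }
//     map.insert(key, String::new());
//     map.get_mut(&key).unwrap()
// }

// Same behaviour via the entry API, which needs only one borrow of the map.
fn get_or_default(map: &mut HashMap<u32, String>, key: u32) -> &mut String {
    map.entry(key).or_insert_with(String::new)
}

fn main() {
    let mut map = HashMap::new();
    get_or_default(&mut map, 7).push_str("hi");
    get_or_default(&mut map, 7).push('!');
    println!("{:?}", map.get(&7));
}
```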

rntz 10 years ago | |

> All GC'ed languages still have issues of ownership, especially if threaded,

This is not true. Ownership is only relevant in the presence of mutability. In a GC'ed language where most or all data is immutable, one rarely needs to think about ownership. In Rust, one needs to think about it all the time.

Rust is certainly a leg up compared to trying to write multithreaded C or C++ code, but that doesn't mean its approach is free of drawbacks.

Empirically, even in Python - a mutable-by-default language - I find myself rarely having to think about ownership. That could be an artifact of the kind of Python programs I find myself writing, though. I'd be interested to hear other people's experiences on that front.

Gankro 10 years ago | | |

Your operating system, and the outside world, is a shared mutable resource -- particularly file systems and network connections. Even in Haskell, you can explicitly close a file (the docs recommend it!), and if you shared that file with anyone else they'll start erroring out.

That said, programming languages (on their own) can't really do anything about other processes/computers interacting poorly with yours! The ultimate semantic of ownership is still there, though.

There's also the ST Monad, which provides safe mutability in Haskell by enforcing that the mutated state doesn't escape a region of the program. This is literally the same idea as the borrow checker.

jerf 10 years ago | | |

Stipulated about immutability.

"I find myself rarely having to think about ownership. That could be an artifact of the kind of Python programs..."

Python still can have action-at-a-distance, where things unexpectedly change as a result of executing code. The single-threaded analog to a race condition is less severe because it's at least deterministic, but can still make programs difficult to understand.

sixbrx 10 years ago | | |

> Ownership is only relevant in the presence of mutability. In a GC'ed language where most or all data is immutable, one rarely needs to think about ownership.

If mutability isn't needed, then you would declare the self parameter as a simple (non-exclusive) borrow, and you can borrow from it or parts of it freely without interference from the compiler.
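A small sketch of that point, with a made-up `Rect` type: every method takes `&self`, so any number of shared borrows of the whole value or of its parts can be live at the same time.

```rust
// Hypothetical type, for illustration only.
struct Rect {
    width: u32,
    height: u32,
}

impl Rect {
    // Shared borrow of `self`, returning a shared borrow of one field.
    fn width(&self) -> &u32 {
        &self.width
    }

    // Another shared borrow; it can coexist with the one above.
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let r = Rect { width: 3, height: 4 };

    let w = r.width(); // shared borrow of part of `r`
    let whole = &r;    // shared borrow of all of `r`
    let a = r.area();  // yet another shared borrow, all still live together

    println!("{} {} {}", w, whole.height, a);
}
```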

hinkley 10 years ago | |

When people asked me why I used Java this was pretty much my answer. I got tired of having to design all of my code around figuring out who owned the arguments, and what happened when it wasn't always the same answer. I can call a method that borrows the value from one that takes ownership, but I can't do vice versa.

I have six other things that will kill my app faster than a memory leak, but I have to design this shit first? No thank you. Rust is on my list of things to learn in 2016 and I'm hoping its borrowing semantics will feel like a solution to this problem without having to sign up for nondeterministic application pauses in the bargain.

pdpi 10 years ago | |

Same as Haskell and IO, really. Every interesting language has side-effects, Haskell just forces you to be explicit about where you're using them.

jupp0r 10 years ago |

Coming from a mainly C++ background, I find that the Rust language makes best practices in languages with manual memory management (and to some degree also in garbage collected environments) explicit. This is a great property of the language and it definitely changed my C++ programming. You can see it as automation of the more boring aspects of code reviews.

Manishearth 10 years ago | |

I am from a more mixed background, but I have had my fair share of C++ before I learned Rust.

Now when I code C++ my Rust knowledge is a double-edged sword. On one hand, I have a much better idea on how to manage my data in C++. I had this discipline before learning Rust, but I didn't have explicit rules to it; it was just a ... nebulous bunch of ideas about how data works. Now it's explicit. On the other hand, I am absolutely terrified when writing C++ code (something I would do with ease in the past). Well, not exactly, but it's hard to accept somewhat-unsafe code (which is probably safe at a macro level -- i.e. safe when looked at in the context of its use) and while I can see that something is safe, I can also see how it could become unsafe. And I fret about it. Rust trains you to fret about it (more than you would in C++, that is), and simultaneously Rust handles it for you so you don't have to fret about it :) C++ doesn't handle it, but you still fret about it since Rust taught you to.

I guess it's a "Living is easy with eyes closed" thing :P

dikaiosune 10 years ago |

There were also some interesting comments yesterday when this was posted to the Rust subreddit:

https://www.reddit.com/r/rust/comments/45gcmh/why_rusts_owne...

gulpahum 10 years ago | |

There was one good point, which was also my conclusion:

"Given all that, I wonder if it makes sense to prefer plain old functions most of the time. Is that right, or am I overlooking something?"

The response was yes. Avoid impl methods which take a mutable self.

steveklabnik 10 years ago | | |

I wouldn't characterize it this way. There was a whole thread on just this question: https://www.reddit.com/r/rust/comments/45j6ua/why_dont_we_pr...

viperscape 10 years ago |

Try a temporary variable for state, then do the consuming call in a scoped let block.

cm3 10 years ago |

Why is move the default? In many code bases the number of immutable references outweighs that of pointers.

jonreem 10 years ago | |

In early rust, you actually did need to explicitly write `move` to move a value. However, this was extremely annoying as you move values a lot so moving was changed to be the default, which is far more tolerable.

EDIT: There still is a `move` keyword, but it is used to indicate that closures should take ownership of their environment vs. just borrow values from it, not to move individual values.
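A short sketch of what `move` does on a closure today. The names `make_greeter`, `data` and `local` are made up for illustration.

```rust
use std::thread;

// `move` makes the returned closure own `name`. Without it the closure would
// only borrow a local that is dropped at the end of this function, and the
// compiler would reject that.
fn make_greeter(name: String) -> impl Fn() -> String {
    move || format!("hello, {}", name)
}

fn main() {
    let greet = make_greeter(String::from("world"));
    println!("{}", greet());

    // A spawned thread may outlive the current scope, so it has to own
    // whatever it captures.
    let data = vec![1, 2, 3];
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    println!("{}", handle.join().unwrap());

    // Without `move`, a closure just borrows, and the original stays usable.
    let local = String::from("abc");
    let len = || local.len();
    println!("{} {}", len(), local);
}
```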

steveklabnik 10 years ago | |

That way the syntax is the same everywhere. Reference by default would look different, and then you'd need a sigil for move, etc.

pcwalton said that back in the 0.1 days this was actually implemented, and it was very confusing.

cm3 10 years ago | | |

I remember it but the language changed so often that I didn't recall this right away. At least it's tried and scrapped and not based on some opinion.

dikaiosune 10 years ago | |

For me at least, it's nice to have a language feature which tells me "BTW, the borrow checker is about to start caring about how long this memory is around." As opposed to the opposite, which is that everything you ever do will trigger the lifetime checks for passing arguments, and you would have to explicitly tell it to go away and that you want ownership to move.

Edit: I'm referring to the `&` operator which creates a reference/pointer to the memory it precedes.

cm3 10 years ago | | |

You mean it's forcing you to realize this and restructure your code to free resources early like an eager collector?

imtringued 10 years ago |

> Nothing in the experience of most programmers would prepare them to `point` suddenly stopping working after being passed to `is_origin()`!

I'm not even a Rust programmer and only read three paragraphs about the borrow checker and I instantly saw that point is moved.

jorgecurio 10 years ago |

Should I learn Rust in 2016? What are you guys building with it and why Rust in particular?