-1
It's IO, CPU and Memory hungry, and it's distributed.
C is fast because it's close to how CPU and memory actually work. Go gives you 95+% of that plus easy to learn, easy to use language. A new person could start contributing useful features and bug fixes immediately. A senior person could get C-level performance.
More and more of our code is moved from C to Go, with very little performance penalty, but with a lot more safety and ease of use.
Our customers benefit, and our company makes more money.
In the end, that's what software is about.
Out-of-order execution, cache hierarchies, branch prediction, virtual memory, pipelining, vector instructions, ILP, NUMA are all pretty transparent to the C spec.
Trying to accommodat hardware quirks with C feels like blackbox engineering. It's certainly better than with managed languages but still....
I don't know of any other way of manipulating the cpu as closely as C and assembly. Do you?
I said C is "close to" how a computer works. C is an abstraction, and most (all?) cpu architectures pretends to be c machines. There was a talk (Kelly?) where even assembly is nowhere close to what a cpu actually does.
I don't know of any other way of manipulating the cpu as closely as C and assembly. Do you?
Don't feel too silly. Russ Cox, one of the technical leads on the Go language, made the same mistake in the regexp package of the standard library.
After profiling for a bit, I discovered that suddenly a lot of time was spent in isinf on Clang and no time in GCC… Clang was emitting a function call where GCC wasn’t. I happened to randomly change isinf to std::isinf (it’s a random habit of mine to put std:: in front of these C functions). Suddenly the regression disappeared! I guess on Clang only std::isinf was a compiler intrinsic while GCC recognized both? Anyway, that’s my small-change optimization story.
And did you learn your lesson about making random changes that "shouldn't matter" without proving they don't matter? :)
I find that once I spend the time to make these changes correctly, they are not worth the time to make correctly.
- The Go programming language spec https://go.dev/ref/spec
- Effective Go https://go.dev/doc/effective_go
- Advanced Go concurrency patterns https://go.dev/talks/2013/advconc.slide#1
- Plus many more talks/slides https://go.dev/talks/
Changing Ruleset from being []Rule to []*Rule, which would mean we no longer need to explicitly take a reference to the rule.
Returning a Rule rather than a *Rule. This would still copy the Rule, but it should stay on the stack instead of moving to the heap.
> However, both of these would have resulted in a breaking change as this method is part of the public API.The problem with heap allocated objects could be due to the incorrect public API.
The change that improves performance also gives out pointers to the actual elements of Ruleset itself permitting the caller to change the contents of Ruleset which wasn't possible before the speed-up. Perhaps you're already aware since change to []*Rule was being considered.
It's debatable what API guarantees existed around this though; most of the time this would be unspecified.
As long as the global slice is never mutated, the current approach is probably fine, but it is definitely a semantic change to the code.
I’m also fairly sure Go uses RVO here too, which cuts down on the number of times the object is copied around, but again, it’s irrelevant to the discussion of heap allocations. Copying the object isn’t the performance problem here, needlessly allocating a very short-lived object on the heap over and over is.
I agree with you for the garbage collector. By design, a GC allows you to willy-nilly copy without thinking about the consequences.
It's best to treat each struct as a "value" or "pointer" type, and use one or the other consistently for each type. This mostly avoids the need to use & in the first place
It's not that value semantics can't be better (they most assuredly can be), or that reference semantics don't cause their own complexity problems, but rather that so often we thoughtlessly imply/impose value semantics through interfaces in ways that negatively impact performance; getting interfaces wrong is a much tougher bell to unring.
The vast majority of my mental energy when I define an interface in C++ is carefully thinking through a combination of ownership contracts and value vs. reference semantics that I can mostly ignore in languages with implicit reference semantics. While occasionally ignoring those contracts while developing in Java/Python/whatever comes back to bite me, the problem isn't nearly as common or problematic as when I unintentionally impose value semantics in a language that allows me to.
if match || err != nil {
return rule, err
}
Translating this code to actual logic takes too much thought and is too fragile. Is that an error path or a success path? It’s both! The logic is “if we found a rule or if there was an error then return a tuple that hopefully indicates the outcome”. If any further code were to be added in this block, it would have to be validated for the success and the error case.But this only makes any sense at all if one is okay with reading Go result returns in their full generality. A canonical Go function returns either Success(value) or Error(err not nil, meaningless auxiliary value). And this code has “meaningless auxiliary value” != nil! In fact, it’s a pointer that likely escapes further into unrelated error handling code and thus complicates and kind of lifetime or escape analysis.
I don’t use Go, but if I did, I think this part of the language would be my biggest peeve. Go has very little explicit error handling; fine, that’s a reasonable design decision. But Go’s error handing is incorrectly typed, and that is IMO not a reasonable design.
If something previously took 4s now takes 2s then it's 100% faster.
Think of driving 10miles. If you drive at 20mph then it takes 30 minutes. If you drive twice as fast, 40mph, it takes 15 minutes.
40mph is 100% faster than 20mph.
Half the time is twice as fast!
I tried all of the obvious things, but the offender ended up being a call to allocate and fill a `struct tm` object from a string representation of a date. This doesn't have any obvious reasons (to me) that it would cause cache invalidation, etc, so I'm a little in the dark.
Still, replacing this four line block improved single threaded performance by 5x, and fixed the threaded behavior, so on the whole it is now ~70x faster and parses about 400mb of csv per second.
Yep, with values that take a lot of memory, it's faster to pass pointers/references around than it is to pass the values around, because it is less bytes to copy.
Of course there is more to such a decision than just performance, because if the code makes changes to the value which are not meant to be persisted, then one wants to be working with a copy of the value, not a pointer to the value. So one should take care if simply switching some code from values to pointers-to-values.
All of these things are things that coders with more experience of languages that use such semantics kinda know already, almost as second nature, since the first day they got caught out by them. But everyone is learning, to various degrees, and we all have to start somewhere (i.e. knowing little to nothing).
That's a very nice feature! I wonder if compilers for other languages have something similar.
Of course sometimes pointers are faster, or much more convenient. But as a rule of thumb: don't use pointers unless you've got a specific reason for them. This applies even more so if you're creating a lot of pointers (like in a loop, or a function that gets called very frequently).
Looks like that's been gone for awhile in favor of C++ 11 stuff, which I don't really like:
https://google.github.io/styleguide/cppguide.html#Copyable_M...
A lot of good software was written in that style, but it has grown bureaucratic over time, and as the C++ language evolved
May I ask, is that theme custom or available somewhere? I really enjoyed it
Moving a single character from one place to another. :-)
A good explanation of why "fire the developers with the lowest 50% of lines added" is an idiotic thing to do: this sort of deep analysis takes a lot of time and expertise, and frequently results in tiny changes.
Looks a bit like https://newcss.net/ or Water CSS
func (r Ruleset) Match(path string) (*Rule, error)
to: func (r *Ruleset) Match(path string) (*Rule, error) type Ruleset []Rule
The original code creates a local copy of a rule and explicitly returns a pointer to that. Taking the ruleset by address wouldn't change that issue.There are valid use cases for wanting to take a copy, and then pass along a pointer of the copy. Perhaps to go through a series of modification methods that don't touch the original. I'd sure hate it if the compiler tried to outsmart me on that and changed the behavior away from what I'd written.
The trap here is that everything is passed by reference (pointer), but the intermediate local value is, well, a value (a copy).
Rule is not a gigantic monster struct (it's 72 bytes), chances are returning it by value would not have been an issue.
Anyway I would say there is an issue with Go here: it's way too easy to copy out of a slice.
I spend most of my time in a JVM language of one flavor or another, and when I was learning Go, the first thing that stuck out at me was, "why would I ever want the compiler to invisibly copy a data structure for me?"
I suppose the primary reason is to prevent the callee from modifying the caller's data out from under them; unless you pass a reference value, you know the callee cannot modify your data.
But, as someone who leans heavily into "everything should be as immutable as possible," the second thing that stuck out at me was "wait, a struct can't have const fields?"
When I write code, it's common to have references to immutable classes thrown around with wild abandon, heedless of ownership, threads, or good taste, because the data just can't change. But that's a paradigm that Go simply doesn't support.
If there's anything I wish languages with implicit reference semantics would adopt, it's implicit immutability. I wish Java would be so much nicer with keyword that is half way between "final" and "volatile" that means, "yes, you can actually mutate this" and then make final semantics the default for fields & variables.
You might get a kick out of Virgil. It's easy (and terse!) to define immutable classes and you can have immutable ADTs too. (plus tuples, generics with separate typechecking, etc).
I think that’s true. Expensive copies should never have been implicit. There was a story some time ago about a single keypress in the address bar of Chrome causing thousands of memory allocations. The culprit: lots of std::string arguments up and down the call stack.
Rust gets this right, with the hindsight of C++’s example: “a = b” is a move operation by default and clone() is always explicit, except for plain data types where copying is literally memcpy — and those are clearly marked as such by the type system.
Oh, I miss it every time. ;-)
I will say though that some newer languages seem to have a confused idea about how to offer mixed semantics. A bunch of them tie semantics to types. The ideal interface can vary by usage context. It's hard enough getting the semantics right as the callee (as opposed to caller), let alone when you're defining a type that will be used by who knows how many interfaces.
> I guess I always figured the solution was "value semantics with better education / tooling".
I've always thought much the same, but I have slowly come to appreciate that it's more than just education & tooling. Even with good education & tooling, there's a cognitive load that comes with getting interfaces right that for the general case is just not worth it.
To me, this sounds like this is it. Explicit is better than implicit is a very useful truism
Rider does this already for C#.
That is to say, I think I mostly am agreeing with you. In Java, objects are always passed by reference, never by value, and never implicitly copied. But Java doesn't have any value types other than primitives. When I added ADTs to Virgil, I wanted to get the best of both worlds; objects are pass by reference, like Java, and ADTs are immutable, identity-less, so they are deep compared but never deep copied. (ADTs in Virgil can also sometimes be flattened, so they don't necessarily result in heap allocation).
I'd have to take a look at Virgil to appreciate your approach, but I'm always leery of implicit value vs. reference semantics tied to types (aside from the whole array fiasco, easily the ugliest part of Java's type system). So often the particular semantics you want are driven by the context of what you're doing, rather than the what you're doing it with.
There's no performance cost to value semantics, so of course not.
> The copy is cheap. The lifetime tracking is not because it forces a heap allocation and creates additional GC pressure. In fact, assuming Rule is small, if Match returned by value, the code would be similarly as fast.
I'm referring more to how this stuff seeps in without the programmer realizing it. It's the implicit nature of all this behaviour that is the problem.
Esit: Also, pointers into slices will probably leave you sad. You get a pointer to the slice storage, not a pointer to the thing. And if the slice resizes your pointer mow references a dead slice. Basically, pointers and slices are not friends. Unless you have a slice of pointers, which idiomatic go avoids unless there is a decent reason for it :)
PHP, for example, has explicit references. If you have an `$arr1=array(1,2,3)` and an `$arr2 = $arr1`, that second array is a full copy of the first array, and updating $arr1 does nothing to $arr2. Similarly, `function update_array($arr) { $arr[0] = 'cake'; }` called with `$arr1` will create a copy of $arr1 for use inside that function. Any changes to `$arr` inside the function only apply to that copy, not to $arr1. Unless you explicitly tell PHP you want to work with references, by using `function update_array(&$arr) { ... }` instead. PHP uses value semantics.
JS, on the other hand, uses implicit reference semantics. If you have a `const arr1 = [1,2,3];` then `arr1` is an alias, simply pointing to a memory location, and declaring a `const arr2 = arr1`, makes arr2 also be an alias for that same memory location. They both reference the same thing, but neither "is" the thing. Similarly, if you have a `function updateArray(arr) { arr[0] = 'cake'; }` then that `arr` is also just an alias pointing to the memory location of whatever you passed in, and changes to `arr` change the underlying data, so whether you try to access that through `arr1`, `arr2` and `arr`, it's the same thing. JS uses implicit reference semantics.
(But note that many languages with implicit reference semantics make exceptions for immutable primitives like numbers or strings)
If your language has implicit reference semantics, “x = y” will cause x and y to refer to the same object. If it has value semantics, x will be a copy of y.
"Value semantics": you pass the value itself around, which means you're shallow-copying it all the time. That's what you get when you pass or return a bare non-interface type in Go.
PHP is a bit of a mess, objects are passed by reference (= implicit reference semantics) but arrays are passed by value, and you can opt them into pass-by-reference. You can also pass non-arrays by reference, though I don't think that's very common.
b = a
b[0] = 3
print(a)
What does the above print? If the language implements reference semantics, it prints [3, 2]. If it implements value semantics, it prints [1, 2].
In languages like C, C++, Go, Rust… references are more explicit. If you want a pointer to an object, you have to &, or something similar.
It gets a bit fuzzy.
Yeesh
In a normal value-semantics system if you have a pointer you just copy the pointer. Obviously if the langage doesn’t have your back it also means you’ve now fucked up your ownership, but you’re in C so that was to be expected.
Had the author been looking at the type information within the syntax of the code the profile output may not have been a surprise. Perhaps the problem would never have existed in the first place.
If you were forced to stop and think what type to declare, I bet you'd write "var rule *Rule". Even if you don't think deeply and just look at the return type.
And then if you assigned "r[i]" to "rule", you'd get a type error.
Nevertheless, the convention is that if a function returns (value, err), and err != nil, the value is discarded (I think of it as "undefined"). So the code is conventional.
But Go is a garbage collected language, and there is so such thing as “discarding” a pointer. Either it’s there or it isn’t, and this kind of leak has side effects. I find it baffling that the language designers and the community consider this acceptable.
(One thing I really like about Rust is that you can’t play fast and loose with lifetimes like this. If you have function taking &'a Vec<T> and you return &'a T, you can’t arbitrarily “discard” and leak that reference up the call chain. You need to genuinely get rid of it by the time the vector is gone.)
if err != nil {
return nil, err
}
if match {
return rule, nil
}Now of course it's important to read the documentation, but a language with sum types (or exceptions) would have used a separate type to indicate an error condition plus useful information on that error.
The optimization would only work if you had a way to tell the compiler that some values are constant/immutable.
In Rust you can write a function that returns the pointer of one element of a slice. You can also write a function that returns the pointer to a heap-allocated copy of an element of the slice. The two functions would have different signatures.
The compiler would also prevent mutation of the slice as long as there are any references to individual elements of the slice being passed around.
When it's a pointer to a copy, no such implicit dependencies occur.
In this specific case, that technique would be whole-program dataflow analysis. Given a Golang function that passes out references-to-copies-of owned data, you could actually determine for certain — at least in the default static-binary linkage mode — whether these two properties hold universally within the resulting binary:
1. whether no caller of the function will ever try to do anything that would cause data within their copy of the struct to be modified;
2. whether the owner of the data will never modify the data of the original struct in such a way that, if the copy were elided, the changes would be "seen" by any reads done in any of the callers. (The owner could still modify internal metadata within the struct for its own use, as long as such internal metadata is 1. in private fields, 2. where all callers live outside the package defining the struct, making those fields inaccessible; and 3. the fields are never accessed by any struct methods called by borrowers of the struct — keeping in mind that such methods can be defined outside the package by caller code.)
If you could prove both of these properties (using dataflow analysis), then you could safely elide the copy within the function, turning the return of a reference-to-a-copy-of-X into a return of a reference-to-X.
(And, in fact, if you can only prove the second property universally, and the first property in specific instances, then you can still elide the copy from the function itself; but you'd also generate a wrapper function that calls said function [receiving a reference-to-X], copies, and so returns a reference-to-a-copy-of-X; and then, for any call-site where the first property doesn't hold — i.e. callers whose transitive call-graph will ever modify the data — you'd replace the call to the original function with a call to the wrapper. So "safe" connected caller sub-graphs would receive references, while "unsafe" connected caller sub-graphs would receive copies.)
The frontend may or may not have its own optimizations and logs tho e.g. rustc now has MIR optimizations (https://rustc-dev-guide.rust-lang.org/mir/optimizations.html) but while you can dump MIR data (https://rustc-dev-guide.rust-lang.org/mir/debugging.html) I don't remember seeing an optimisation log.
At the end of the day, I think it's more likely that you take a look at the assembly and infer problems from there if the profiler doesn't tell you straight. An other difference is the kind of decisions the compiler makes e.g. while a compiler can optimize away allocations in "manual allocation" languages (https://godbolt.org/z/5nEo7xjEr) the allocations are plainly visible, so if they're trivially avoidable... you'll just avoid them.
Using Rust as an example, you'd have something like this:
pub fn match_(&self, path: &str) -> Result<&Rule, Error> {
for rule in self.0.iter() {
if rule.match_(path)? {
return Ok(rule);
}
}
Err(Error)
}
You couldn't miss an allocation, because the return type would have to change, and you'd need to perform the copy out: pub fn match_(&self, path: &str) -> Result<Box<Rule>, Error> {
for rule in self.0.iter() {
if rule.match_(path)? {
return Ok(Box::new(rule.clone()));
}
}
Err(Error)
} -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
and -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation
The generated logs can be seen with JITWatch [1].Notably, if you're defaulting to values, you may still have a bunch of allocations when you try to implement interfaces, which usually (always?) requires implicitly boxing values; however, if you pass a pointer into a function that takes an interface, I don't think it gets moved to the heap (but I'm not sure, which is why Go programmers need to be comfortable with profiling the escape analyzer and also why it would be great if Go actually had explicit semantics for allocations).
Isn't it still possible for the value to escape here? For example the callee could stick it until a global data structure.
In fact it seems like a pointer passed to a function would need to be on the heap "by default" unless the compiler can prove that it doesn't escape.
Do you have evidence for this claim? AFAIK the Go compiler does escape analysis, and allocates pointers that don't escape on the stack.
IMHO they should have used pointers everywhere they didn’t want value semantics with deep copies.
In my head, the ideal flow would be an annotation on your library function/class which triggers an IntelliJ suggestion for downstream users affected by your breaking change to run the automated refactor for them. Kinda like a more helpful @Deprecated.
The maven plug-in seems to use the recipes directly as dependencies: https://github.com/openrewrite/rewrite-maven-plugin
In our internal situation the parent Pom could already control this plug-in definition including versions of the recipes. At the very least the recipes could follow the same version as the artefact.
I thought I saw somewhere where the recipe was bundled with the artefact itself. That is very neat for simple usecase.
However, it suffers from the same flaw as the native-image configuration for GraalVM in artifacts. Sometimes the configuration/recipe needs to change. Eg because of new insights or because it is incompatible with new versions of GraalVM/open rewrite. So I suppose having an extra version dimension Could solve this. Then one can always depend on the latest version of the recipe for that artifact.
70% more stuff in the same time is 17 things in 100 time (100/17 =5.9 time per thing);
70% faster is 10 things in 30 time (30/10 =3 time per thing) - or, I would argue incorrectly, perhaps said to mean 10 things in 70 time (70/10 =7 time per thing).
You're describing "in 70% less time", not "70% faster".
explicit implicit
good * < *
v v v
bad * > *
Good implicit is better than good explicit. (If all is good, go for implicit.)Bad explicit is better than bad implicit. (If all is bad, go for explicit; don't hide bad explicit with bad implicit.)
Good explicit or implicit is better than bad explicit or implicit.
I see the OP has solved the problem by removing any references to how much faster from the article title!
You're right about non-linear algorithms though. If an O(n^2) algorithm is 2x / 100% faster, it can't process 100% more items in the same time, but I'd understand it to mean taking half the time for the same n.
$arr1 = [1,2,3]; // $arr1 is a pointer to a zval array [1,2,3] and refcount:1
$arr2 = $arr1; // $arr1 and $arr2 are pointers to the same zval array but incremented refcount to 2.
$arr2[] = 4; // at this point, $arr2 creates a new zval array, copies over the data from the first, sets the recount to 1, and appends int(4). The [1,2,3] array has its refcount decremented to 1 as it's still referenced by $arr1.
[1] http://hengrui-li.blogspot.com/2011/08/php-copy-on-write-how...
CoW is not semantics, it’s a way of implementing value semantics which avoids unnecessary defensive copies.
This stuff seeps in without the programmer realizing it because Go made a deliberate design decision to have automatic lifetime management. In other words, this is a feature of the language and not a bug. The only way for this to not seep in is if Go forced programmers to specify most lifetimes, which would make the language much more cumbersome to use.
I.e., this is not a value-vs-reference semantics issue. It's a manual-lifetime-management vs automatic-lifetime-management issue. The solution is to either 1) write in a language with more explicit lifetimes if performance is that important, or 2) profile your code and reduce heap allocations, which is what the person who wrote the article did.
I would disagree about this stuff seeping in because of automatic lifetime management. The equivalent code in Java, Smalltalk, etc., would either not have the performance problem or it would be much more obvious that the code wasn't as efficient as it could be.
Thanks, no worries.
You're right that Java doesn't have this performance problem because it (apart from primitives) has no implicit copies. (So the code wouldn't copy, it would return a reference, hence no allocation.)
I think what you're trying to say is: programs written languages with value semantics can implicitly create more objects, which (in some cases) have additional allocation/lifetime tracking in languages with automatic lifetime management. So maybe one solution is to prevent copies in most circumstances, and when you do need to copy, make them explicit.
But I think this conflicts with Go's philosophy as well. First, in highly concurrent programs, copies are necessary. A mutex/atomic will become a bottleneck; sharing is bad for performance. This means that copies should be part of most types in the language. (If they aren't, you'll run into issues like not being able to pickle when using multiprocessing in Python [1].)
OK, so we need copies. But clearly, I shouldn't have to write `a = b.Copy()` if b is an integer. That's too verbose. And if everything has a `.Copy()` call, that leads to visual noise and it's not clear when copies are actually expensive. So what set of "blessed" types should be implicitly copyable? Java picks primitives, but this means it's basically impossible to create an int-like object with value semantics. I think this is bad. C++ is at the other extreme with "(almost) all objects are implicitly copyable" but some people dislike how forgetting a & on a std::vector type might lead to a huge copy.
Go has chosen a reasonable middle ground -- dynamically sized types like arrays and maps are not implicitly copyable, but "cheap" objects like structs are implicitly copyable. This means that when you see a Copy call, it's meaningful. But this means you run into these situations when an implicit copy interacting with lifetimes/garbage collection causes a slowdown. But I don't see any other alternative. Getting rid of implicit copies for cheap types will cause visual noise. Getting rid of GC makes your language harder to use. Both of these alternatives seem worse to me.
To be clear, I'm not a huge fan of Go. Especially around the lack of monadic error handling. But I think they have done a decent job wrt. their design goals.
[1] https://stackoverflow.com/questions/8804830/python-multiproc...
Have fun fighting the borrow checker on that one.
The borrow checker ensures that you either do it right or you find another way that is safe.
For example it may be totally fine to help allocate a result, not all programs need to be optimized to the bone. Just return a boxed value. If you need to share it, wrap it in an Rc or Arc.
One problem I see with it is that rust chooses to make the most efficient way of programming also the most "simple looking", and so it lines up incentives in such a way that people will unnecessarily try to avoid using an extra Arc just because it looks like unnecessary clutter.
When you learn to embrace your boxing you'll learn that the borrow checker is not your enemy.
If you try to read 1000 bytes, you might get “800 bytes read, EOF” as a result. The worst part is that os.File won’t ever do this!
I like Go's multiple returns and error checking by default, but it definitely should have been implemented with some sort of "Result/Error" type union instead. Go made so many good decisions, that I am frankly amazed they made this one really bad one.
Doesn't bother me at all.
Only if the calling function does something with the pointer (which it generally won't, if err is non-nil). If the calling code does do something with the pointer even when err is non-nil, then either
(a) the value of the pointer is meaningful, and this is fine; or
(b) there's a logic error in the calling code, which is a far more serious bug than a potential memory leak.
So I don't really see the problem here.
At least Go doesn’t use RAII, so inadvertently pinning a pointer won’t keep a socket open or a lock held.
The following toy example doesn't leak memory. Are you thinking of some other pattern that would leak memory? If so, what is it? I can only think of patterns where a logic error is also involved. That is, where the meaningless value of 'val' is used in code paths where err != nil.
func alwaysError() (*int, error) {
var dummy int
// We return &dummy even though it's a meaningless value.
// Will this cause a memory leak?
return &dummy, fmt.Errorf("An error")
}
func caller() {
val, err := alwaysError()
if err != nil {
fmt.Printf("Error\n")
// It won't! Because the value pointed to by 'val'
// can be GCed from this point on.
return
}
// never get here
fmt.Printf("Value %v\n", val)
//
// ...lots of code that uses *val...
//
}
You might think that you'd get a leak if the error branch was also long-lived, but Go's GC seems to be precise with respect to conditional branching (as far as I can tell by experimenting with runtime.SetFinalizer). That is, as long as the error branch doesn't refer to 'val', then the value pointed to by 'val' can be collected once the error branch is entered (and before 'caller' returns).Could you Imagine a Java where you have a `Map` and a `MutableMap` and that's what you put at your API? I'd make it SO much clearer how safe any individual API is to call.
Mutable is not part of the Java meta-language used for typing.
ImmutableMap implements Map
https://guava.dev/releases/23.0/api/docs/com/google/common/c...
and throws exceptions on mutation.
https://guava.dev/releases/23.0/api/docs/src-html/com/google...
I'm unfamiliar with this rule (and not finding anything good to google). Can you elaborate?
I can't really think of a scenario where an immutable datastructure isn't a subset of actions against a mutable datastructure.
Conceptually, you can have constness in your subtype-system (as long as you are sticking with interfaces (methods), as Liskov's subtyping model does, and aren't inherting potentially-mutable fields).
MutableMap and ImmutableMap are both subtypes of a hypothetical ReadableMap. ImmutableMap is the same as ReadableMap, but has an informal contract that subclasses shouldn't add mutability.
That's assuming a 64-bit CPU (which admittedly seems like a reasonable assumption. The nice thing about the abstraction though is that there's nothing preventing the runtime from applying value semantics for those trivial small-object cases where they're obviously more efficient.
Curious about what you mean here. This sounds like C#'s class/struct distinction to me.
In a language where destruction of an object often releases external resources (C++, for example, or Python or Rust to a lesser extent), this effect would be more dramatic. Imagine someone doing this in Python:
def func():
f = tmpfile(…)
err = do something
return f, err
This style would be absurd and wrong — it keeps f alive to long.> History constraint (the "history rule"). Objects are regarded as being modifiable only through their methods (encapsulation). Because subtypes may introduce methods that are not present in the supertype, the introduction of these methods may allow state changes in the subtype that are not permissible in the supertype. The history constraint prohibits this. It was the novel element introduced by Liskov and Wing. A violation of this constraint can be exemplified by defining a mutable point as a subtype of an immutable point. This is a violation of the history constraint, because in the history of the immutable point, the state is always the same after creation, so it cannot include the history of a mutable point in general. Fields added to the subtype may however be safely modified because they are not observable through the supertype methods. Thus, one can define a circle with immutable center and mutable radius as a subtype of an immutable point without violating the history constraint.
[1]: https://en.wikipedia.org/wiki/Liskov_substitution_principle#...
This is also a violation of LSP and the open close principle.
Consider a `List` with an `add` function. What would you do with that `add` function to make an `ImmutableList` subtype?
But allocations aren't the same as copies, and the argument for reference semantics has always been that implicit copies are problematic. In your std::string example, having that many String copies in a Java program would be similarly terrible (and this sometimes happens by accident because of abstraction layers that hide all the copying going on under the covers).
I do think Rust gets a lot of stuff right, but Rust's cognitive load is broadly recognized. I tend to see it as C++ with a lot fewer foot guns. ;-)
Java programs make "ridiculous amounts of implicit allocations" because allocations are cheap in Java. And they need to be cheap because Java doesn't have value semantics so it leans hard on escape analysis + cheap allocations.
I agree with the rest of your comment, although I think most of Rust's "cognitive load" amounts to borrow-checker-vs-garbage-collection. You could envision a Rust with explicit allocations and a GC, and that language would have a "cognitive load" approaching that of Go while also being a fair bit more performant insofar as people can much more easily reason about allocations and thus performance.
Yes, but that's kind of the point, right? Implicit allocation isn't really a problem because a runtime that optimizes the allocations magically for you is a lot easier to build than a runtime that optimizes whether you really need to be copying objects as much as you do.
Java has a tremendously good GC, so can cope with lots of allocations. Go has an OK one, so needs some help (but mollifying it often pays dividends elsewhere in locality and memory usage too). C++ has your default system heap, good luck.
Note that a move can still do a copy; in fact, Rust is kinda notorious for generating more on-stack memory copy operations than C++. It’s slowly improving, but it can still be surprisingly bad in some cases.
what do you mean by this? If I say `let x = 5; let y = x;` in rust, that's a "plain data type copy" of a stack value, but memcpy is usually used to copy heap memory. What connection between copying of primitive simple stack values and memcpy are you suggesting here?
Try playing with memcpy on Godbolt and you'll find that the compiler will compile the memcpy to a single mov instruction when the size is small, and some movdqu/movups when the size is slightly large, and only a function call when the size is huge.
> memcpy is usually used to copy heap memory
memcpy is often used in low-level serialization / deserialization code since you can't just cast a buffer pointer to a uint32_t pointer and dereference that; the solution is memcpy between variables that are often both on the stack.
They're just using 'memcpy' as a shorthand for saying the bitpattern is blitted. Semantically, that's like a memcpy. The point is, there are no expensive allocations, nor do any embedded pointer fields need adjusted, etc.
I think the confusion here is that there isn't always a literal call to memcpy for copying small types like ints in the emitted code, but it's always doing something with the same effect and maybe sometimes using an actual memcpy (probably when copying arrays?).
Also something interesting is that memcpy is used for copying data between stack variables in C sometimes when you need to convert some type to another one without using a cast.
That seems like a good thing. If you're handed a map to const values you can't just go "imma gunna mutate them anyway".
Casting const away and then mutating is a footgun as you are in undefined behavior territory as soon as your pointer/reference is not just const itself but points to a const variable.
But still it doesn't look very nice in practice. I think that in that case, it does make much more sense for a mutable type to be a subtype of the immutable one.
The immutable threw me off track. It's just that the supertype would lack mutation operations. (in a sense immutable would be the default for every type).
Thank you.
As far as I know, Java's (default) runtime gives cheap allocations at the cost of long GC pause times.
> than a runtime that optimizes whether you really need to be copying objects as much as you do
It's not "copying", it's "allocating", and avoiding allocations isn't that much work (and frankly I'm surprised it's such a minor problem that no one has bothered to build an IDE plugin that highlights these allocation points automatically--or at least I haven't heard of such a thing). Anyway, "a runtime that minimizes allocations" is just an escape analyzer and Java has one of these too, and IIRC it's a lot more sophisticated than Go's (but it's also a lot harder to reason about as a consequence).
"long GC pause times" is kind of vague, so I guess you could be correct, but in practice there's a LOT of different ways the memory management can be handled, many of which are deemed "pauseless GC" (though the term is somewhat misleading).
My statement was considering that reality though. While not true for some use cases, in the vast majority of cases, the runtime optimizes the allocations more than sufficiently.
> It's not "copying", it's "allocating"
Allocators can do a pretty good job of minimizing the overhead of allocation, to the point the amortized cost isn't much more than a single machine instruction. Allocating gigabytes of memory quickly is possible. Copying the data can be a lot more work, and often objects have copy semantics that add a lot more additional work.
> Anyway, "a runtime that minimizes allocations" is just an escape analyzer and Java has one of these too, and IIRC it's a lot more sophisticated than Go's (but it's also a lot harder to reason about as a consequence).
I think you're implicitly saying "a runtime that minimizes heap allocations" there, in which case I'd agree.
Yes, but I'm pretty sure those "pauseless GC" schemes impose other tradeoffs.
> My statement was considering that reality though. While not true for some use cases, in the vast majority of cases, the runtime optimizes the allocations more than sufficiently.
I'm not sure I follow. The same could be said for Go--in the vast majority of cases, Go's tradeoffs (slow allocations, low latency / non-moving GC) are also suitable.
> Allocators can do a pretty good job of minimizing the overhead of allocation, to the point the amortized cost isn't much more than a single machine instruction.
As far as I know, speeding up allocations to this degree requires a moving GC which imposes a bunch of other constraints (including copying a bunch of memory around).
> Allocating gigabytes of memory quickly is possible. Copying the data can be a lot more work, and often objects have copy semantics that add a lot more additional work.
Yes, but the bottleneck here wasn't the copying, it was the allocations. And if you optimized away allocation cost entirely such that only the copy cost remained, that cost would be so small that the OP would never have bothered to profile because copying small objects like this is so cheap compared to everything else (even if it is expensive compared to bump allocating).
> I think you're implicitly saying "a runtime that minimizes heap allocations" there, in which case I'd agree.
Yes, the allocator and GC are concerned with heap allocations and not stack allocations. I'm using "allocations" as a shorthand for "heap allocations".
I think what really throws people off here is that getting good performance out of a Java application involves some skills which are alien to C++ programmers, and vice versa. You take an experienced C++ programmer and drop them into a Java codebase, they may have a very poor sense of what is expensive and what is cheap. Vice versa… experienced Java programmers don’t do well in C++ either.
The result is that you have precious few people with any significant, real-world experience fixing performance issues in both languages.
I’ve had no issues with Java 17+ under heavy allocation/garbage collection (data encryption pipeline I haven’t tuned to reuse buffers yet), and it’s pause times are on the order of a handful of milliseconds, without meaningful tuning. I think it’s doing something like a GB/s of garbage collection.
And the jvm in question is doing a LOT more than just this, so it’s coping with millions of allocated objects at the same time.
std::unordered_map<std::string, const std::string> m;
m["foo"] = "bar";
std::unordered_map<std::string, std::string> m2 = m;
doesn't compile.Or if you (or something you call) mutate it trough another reference/pointer. Sometimes I wish there was a stronger const where the compiler (and programmer) could actually assume the object is not mutated while that reference/pointer is alive. Of course checking that is no longer doable with just the type system.
> You can't even easily copy your way out.
That depends on the type. std::unordered_maps could provide a copy constructor that does the conversion.
And you can do the copy via iterators, although that is probably not as efficient as it could be because it needs to re-hash the keys.
std::unordered_map<std::string, std::string> m2(m.begin(), m.end());Capable of handling TB sized heaps with micro seconds pauses.
If they changed the default GC, a lot of folks would freak out, even if it was objectively better in nearly every situation. Because someone, somewhere is relying on some weird bug in G1, or whatever and now their software behaves differently, and it’s a problem.
Give it a couple more major revs, and it might still happen. G1 became the default in what, Java 9?