Never create Ruby strings longer than 23 characters(patshaughnessy.net) |
Never create Ruby strings longer than 23 characters(patshaughnessy.net) |
One interesting corollary is that moving short strings in an implementation that does this could actually be ever so slightly (negligibly) slower than moving long ones (since byte copies are slower than word copies). But generally, this is a free lunch optimisation and can save you hundreds of megs of memory when writing programs dealing with millions of short strings.
[1] http://llvm.org/svn/llvm-project/libcxx/trunk/include/string - search for "union"
Also (pedantic):
#define RSTRING_EMBED_LEN_MAX ((int)((sizeof(VALUE)*3)/sizeof(char)-1))
sizeof(char) is always 1, so that division is superfluous.http://stackoverflow.com/q/4562249/25981
sizeof(char) is always defined to be one. This can't be altered by a conforming compiler.
comeon peopleeeee, a bit of an arbitrary internal standard, no?
i understand the point about "CONCLUSION: it doesn't matter for a few strings!" but.... comeonnnn it must matter on some level, otherwise why is Rails such a pain in the ass to optimize? these things must add up...
And even if you'd assume web development as the only purpose, there's a lot of strings that are shorter than 23 characters: Header for request and responses, form fields passed by the client (usernames, passwords, ...), field names passed in hashes, table and column names, template and file names, URLs or even the occasional, totally rare string in a json structure. It's an optimization with major gain and little loss.
Conclusion: "Don’t worry! I don’t think you should refactor all your code to be sure you have strings of length 23 or less."
struct RString {
struct RBasic basic;
union {
struct {
long len;
char *ptr;
union {
long capa;
VALUE shared;
} aux;
} heap;
char ary[];
} as;
};
/* apologies if I messed up the syntax here */
#define RSTRING_EMBED_LEN_MAX (sizeof(((RString*)(0))->as) - 1)
Then you can even use the padding the compiler added, if any, plus you can add more things to heap and the embed length will grow automatically.On a related note, I've found a much less comprehensive (but still useful) guide to Python internals: http://tech.blog.aknin.name/category/my-projects/pythons-inn....
Also, I would guess the performance gains (from skipping malloc) would wash out the longer your average string gets -- even if the huge stack use doesn't kill your performance for some other reason (blowing the d-cache?).
Ruby experts can correct me if I'm wrong, but when Ruby sees a name like "str2" it looks it up in a table, which points it to the RString structure. From there, it can follow the pointer to the actual array of characters, which in this case is only stored once.
In fact, the diagram is simply wrong. This was rectified by the author in an article two weeks later:
http://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-s...
VALUE seems to be unsigned int defined via "typedef uintptr_t VALUE;" and "typedef unsigned __int64 uintptr_t;"
But why is it calculated like that I don't get. Anyone can explain?
What I don't know is why they don't simply use sizeof(heap) as the buffer size.
https://github.com/ruby/ruby/blob/8f77cfb308061ff49de0a47e82...
Note the `as` union. The `heap` version has three VALUE-sized entries, so RSTRING_EMBED_LEN_MAX is calculated accordingly, with the -1 to account for the null terminator.
Edit: I missed that part of that was another union, removed what I said about it being off on 32 bit.
I still don't understand why they go so roundabout by dividing by one and casting to int...
When programmers don't know in advance how long name/email/input/whatever field is going to be - they just use the magic "power of two" length :)
So 32 (or 33) in this case would be more reasonable.
#define RSTRING_EMBED_LEN_MAX ((int)((sizeof(VALUE)*3)/sizeof(char)-1))
23 wasn't chosen, it was calculated to be the size that would be required for a struct describing a heap string, and will actually be a different number for different architectures. Choosing to make it bigger would add unnecessary overhead to the RString struct.They did know. And there are many more cutoffs than powers of two, depending on storage backend.
If this is intended to sit on the stack, which I find highly unlikely (especially given the timings that seem to be the delta between one malloc and two, and would be much more significant if it were a stack allocation versus a heap allocation. This is not comparable to small string optimizations for the stack in C++), maybe. But otherwise it seems like a poorly considered hack.
The string type could as easily have been dynamically allocated based upon the length of the string, where the ptr by default points inside that same allocated block. If the string is expanded it can then be realloced and the string alloced somewhere else. No waste, a single allocation, etc.
allocation of the RString (which becomes larger and thus more difficult to malloc), and the 23 string bytes that will sit unused for longer strings.
I got the distinct impression (backed up by an actual code snippet defining the max embedded string size) that the 23 byte limit was calculated to exactly match the size of the data that would otherwise have to be stored for a heap string anyway. Thus, it doesn't actually take any extra space in the struct, and those 23 bytes do not go unused in other strings.So this holds (on a 64-bit machine) 8-bytes for the pointer, 8-bytes for the length (string not null terminated), and 8-bytes for the capacity. Alternately, via a union, it stores 24 bytes of string (null terminated). It knows whether it is a or b via a separate flag that it holds separately in RBasic.
I retract my jab about memory loss, but it still sounds rather terrible. Every bit of code dealing with strings needs to validate flags on every use to determine what it is dealing with, alternate between length specified or null terminated, etc. Ugh.
The key point is that for short strings, you're not adding it to the cost of the long string, but instead overwriting the data that you would track the long string.
In other words, the memory layout you get is this:
short:
[Rbasic]['s', 't', 'r', 'i', 'n', 'g', '\0']
long:
[Rbasic][ length ][ dataptr ][capa or value]
Which leads to another interesting question: How does the optimization interact with null in the middle of short strings, since only long strings have a length stored? Does Ruby check for embedded null when creating a string, and disable this optimization?I understand that there's many places where strings are shorter than 23 characters and I never meant to say that this optimization was superfluous. I actually think it's pretty clever.
And it doesn't need to alternate between length specified or null terminated - they're all length specified (or how do you think short Ruby strings are also able to store ASCII NUL)
And I could write books about why Rails is such a pain to optimize but this is not it: This is an optimisation over the generic case of allocating the buffer elsewhere. An optimisation you also happens to find in C++ implementations for example.
Btw, the fact that you don't understand it doesn't make it arbitrary. Hint: it's not arbitrary at all.
#define RSTRING_EMBED_LEN_MAX ((int)((sizeof(VALUE)*3)/sizeof(char)-1))
explain "3" in terms of your comp sci 101 divine computer law... Ok I get itttt "len/ptr/capa/shared". I've never written an interpreter so I can't rewrite the implementation myself, but I've worked with Rails enough to know that Ruby's is terribly inefficient, and these are problems created in it's abstraction from C, not inherent to C. Maybe this RString case is a minor issue, but it seems to me to lack some rigorousness, to be a performance leak. It's one of those things that makes you wonder "What Would Linus Do?". Although I do understand the points about including it AT ALL to be "free gain, little loss" and the language can't be C, but sometimes I don't know....
Just admit that maybe something is wrong here. Please. you'll be the first Rubyist ever, the dawn of a golden era. I've been intrigued more and more by the Lisp community because many of those devs seem to have a philosophical unwillingness to compromise performance and proper engineering, whereas in Ruby people get mad at you for not embracing the compromises!! Geez.
Hint: writing "Hint:" and then saying something really obvious that you learned from the article we all just read makes you sound craaaaaazy obnoxious
(Incidentally, it's issues like this that make me thank my lucky stars I do my web work on the JVM now.)
However, in this case, the number 23 comes from the amount of space that would be needed to store the smallest string in heap memory after taking into account malloc's bookkeeping info and whatnot.
As an aside, if you already knew the answer, why go on a rant that makes you seem ignorant?
I've heard of cases where Ruby performance supposedly increased tremendously because of raising some memory / garbage collection limits, and in today's world I always wondered why they don't try harder to make the sacrifice on memory footprint in order to gain speed. As a platform that is to this day routinely blasted for being so damn slow (I'm sure a lot of this falls on Rails' shoulders especially since ActiveRecord still was glacial when I stopped following... ~1.3 seconds of Ruby processing to render simple pages IIRC) why not start inserting some performance hacks or re-implementing with performance as top priority?
The answer to your question is that I'm sociopathic on the internet.
As for slowness, there are many, many obvious targets much higher up the list than String. Slow method dispatch being the primary culprit.
You're right, there must be a lot more low-hanging fruit out there. Something interesting though... isn't the Ruby interpreter just a standard (RubySpec)? JRuby (and Rubinius I think?) are other, better-performing implementations, correct? I've been thinking about trying JRuby since I was looking into Grails and thought "eh why not just try JRuby on Rails".
but i've kinda gone in the other direction anyway and am more interested in Clojure > Ruby, especially since Clojure "frameworks" are just dead simple libs. It's weird because RoR was around first (I think, maybe some overlap in development...) but it seems that now that Clojure and Scala exist, RoR is kindof like a bad compromise between the two.
Though Clojure is kindof minimal and not everyone is singing Scala's praises...