How to Think About Variables in C(denniskubes.com) |
This might have been mildly interesting if there had been the assembly for a few different architectures (x86, MIPS, ARM, PowerPC, etc) showing how the C code was translated to assembler for each. And could have been very interesting with an additional discussion of memory barriers and atomic operations in C and their relation to assignments and pointers.
As someone who has had difficulty picking up real programming languages, and has only found some marginal success due to Obj-C's ARC feature, I can tell you this puts everything I've read into much better perspective.
Try not to be so negative, man, I think it's clear you weren't even the intended target anyways.
It is almost always the first language ported to any system, almost every computer science program at least covers the basics, it has been in 1st/2nd place on the TIOBE index for over a decade, it's the 5th most popular language on GitHub by commits, and it is over 40 years old.
But I'm willing to accept there might be people on Hacker News who don't know C; that's why I gave suggestions to the author to expand on the content and make it interesting to a wider audience. That was the point of my post.
So a better formulation might be: "C provides an abstraction layer on top of a computer's memory model and instruction set that will allow your code to be portable between different machine architectures, but only if you play strictly by the rules."
By the way, the classic K&R book explains the fundamentals of C pretty well. If you really want to understand C, I'd recommend reading it cover to cover (it's pretty short).
I surely hope not.
If you read the C standard, you'll notice it doesn't talk much about "memory" (the word only appears 13 times in C99); it mostly talks about "objects" (mentioned 735 times in C99). These objects aren't OO-objects -- obviously C doesn't have OOP built in -- but rather all the basic types like int, float, struct, etc are objects. When you declare a variable like "int x", you are creating an object.
C's aliasing rules dictate that you can only access an object through an lvalue of that object's actual type (or a compatible one; character types such as unsigned char are the notable exception). This is why it is dangerous to think of the assignment operator as a simple memory-copying operation. If assignment were a simple memcpy, you could do something like this:
int x = 5;
// BAD: undefined behavior, violates aliasing.
short y = *(short*)&x;
If a variable were just a memory address and assignment were just a memory copy, this would be a valid operation. But the right way to think of it is that a variable is a storage object whose address can be taken, and a dereference is an operation that reads a storage object. A pointer isn't a generic memory-reading facility; it must actually point to a valid storage object of the pointer's type (or to NULL).
If you do want to read and write arbitrary objects in memory, you can always use memcpy():
int x = 5;
short y;
// This is fine, and smart C compilers optimize away the
// function call.
memcpy(&y, &x, sizeof(y));
The size of a type is just one of its many attributes. Even if, for example, "long", "float", and "void* " happen to have the same size, they're still very distinct types.
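To make the memcpy point concrete, here is a minimal sketch of the two ways of looking at an object's bytes that stay within the rules: copying them out with memcpy, and reading them through an unsigned char pointer. The function names are made up for illustration, and which bytes come "first" depends on the platform's byte order.

```c
#include <stddef.h>
#include <string.h>

/* Copies the first sizeof(short) bytes of x's representation into a short.
   Defined behavior, but the value you get depends on byte order. */
short leading_bytes_as_short(int x)
{
    short y;
    memcpy(&y, &x, sizeof y);
    return y;
}

/* Character types may alias any object, so reading one byte of an int
   through an unsigned char pointer is also allowed. The caller must keep
   i below sizeof(int). */
unsigned char byte_at(const int *p, size_t i)
{
    return ((const unsigned char *)p)[i];
}
```

Neither function casts the int pointer to a short pointer, which is the part the aliasing rules object to.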
"Integer data types are defined in the limits.h file. Float data types are defined via macros in the floats.h file."
Integer and floating-point types are defined by the compiler, guided by the hardware and the ABI for the platform. <limits.h> and <float.h> document the characteristics of the predefined numeric types.
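As a rough illustration of what those two headers actually provide, the sketch below prints a handful of the macros they define. The function name is hypothetical; the macros themselves are standard. The output differs between platforms, which is the point.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Prints a few characteristics of the built-in numeric types.
   The headers only describe the types; the compiler defines them. */
void print_numeric_limits(void)
{
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("int range     = [%d, %d]\n", INT_MIN, INT_MAX);
    printf("long max      = %ld\n", LONG_MAX);
    printf("float digits  = %d, epsilon = %g\n", FLT_DIG, (double)FLT_EPSILON);
    printf("double digits = %d, max = %g\n", DBL_DIG, DBL_MAX);
}
```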
"A pointer doesn’t hold a memory address, it holds a number that represents a memory address."
Sure, and a floating-point object is ultimately just a collection of bits -- but that's hardly the best way to think about either of them. Integers and pointers (addresses) are logically very distinct things, even if they happen to have similar representations. For example, the addresses of two distinct variables have no defined relationship to each other (other than being unequal); just evaluating (&x < &y) has undefined behavior.
C lets you get away with a lot of type-unsafe stuff, particularly if you resort to pointer casts, but it's fundamentally much more strongly typed than the author seems to think it is.
2 &x = 20; // this doesn't work
3 * (&x) = 20; // this does work
Why does line 2 &x not work but line 3 does? Because &x returns a pointer, a number representing a memory address. This is an important distinction. A pointer doesn’t hold a memory address, it holds a number that represents a memory address.
=======
No, that is not why. Note that the following does work:
int * x = 0;
and the following works, though typically yields a warning:
int * x = 20;
Line 2 fails because & doesn't give back an l-value.
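A small sketch of the lvalue distinction, with a hypothetical function name: the result of & is a value you can store or dereference, but not something you can assign to, while the result of * designates the object itself.

```c
/* Returns the final value of x after two stores made through its address. */
int store_via_address(void)
{
    int x = 10;

    /* &x = 20;   would not compile: &x is a value, not an lvalue */
    *(&x) = 20;   /* fine: dereferencing yields an lvalue for x */

    int *p = &x;  /* the pointer value can be stored like any other value */
    *p += 1;      /* and dereferenced later to reach the same object */

    return x;
}
```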
Definitely not true. More like, "it will have an address, if you take the address with the & operator". Otherwise, the compiler is quite free to store locals in registers.
As stated in the post.
"In most assembly languages, data types don’t exist. You operate on bytes and offsets."
This is just not true.
Most assembly languages (I learned on PDP-11 assembler, which I remember best, but what I say is true of 68000 and x86 too) have a notion of a byte, but also integers of various word lengths, and floating point numbers.
In fact, some registers are in effect designated as "pointers" for various kinds of conventional indirect addressing (the instruction pointer, the register holding the stack pointer, and others).
In this sense, C is even closer to assembly than you indicate, because the data types are so analogous.
What exactly is the "syntactic sugar" that hides the idea that names can have addresses? Structs? Some specific kind of expression? Array index syntax? The names themselves?
As an example, an IPv4 address is 32 bits. Don't convert it to a string and put it in a varchar(64) in your database when you are optimizing for space (I actually saw this once). And yes, the DB had an inet type, but no one knew how to use it, what it was or why it mattered.
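For what it's worth, the 32-bit view of an IPv4 address looks something like the sketch below. The helper names are made up, and the byte layout (first octet in the most significant byte) is an assumption chosen for readability; real code would more likely lean on the platform's socket API.

```c
#include <stdint.h>
#include <stdio.h>

/* Packs four octets into one 32-bit value, first octet most significant. */
uint32_t ipv4_pack(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Writes dotted-quad text into buf (16 bytes is always enough).
   Returns what snprintf returns: the length of the formatted text. */
int ipv4_format(uint32_t addr, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xFFu),
                    (unsigned)((addr >> 16) & 0xFFu),
                    (unsigned)((addr >> 8) & 0xFFu),
                    (unsigned)(addr & 0xFFu));
}
```

Four bytes versus up to fifteen characters plus overhead is the space difference being complained about.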
int r = ((int (*)())startAddress)(); // Wheeee!
http://en.wikipedia.org/wiki/Lie-to-children
> A lie-to-children, sometimes referred to as a Wittgenstein's ladder (see below), is an expression that describes the simplification of technical or difficult-to-understand material for consumption by children. The word "children" should not be taken literally, but as encompassing anyone in the process of learning about a given topic, regardless of age. [snip] Because life and its aspects can be extremely difficult to understand without experience, to present a full level of complexity to a student or child all at once can be overwhelming. Hence elementary explanations tend to be simple, concise, or simply "wrong" — but in a way that attempts to make the lesson more understandable.
OK, the very first sentence of this piece falls flat on its face when you begin to think about how a computer actually handles getting data into and out of the parts of the CPU that actually do the work of modifying data according to the opcodes in flight.
In specific, C is meant to be a pleasant syntax to sling data around a large, flat address space, where the assumption is that every part of the address space can be treated like any other, with no special consideration given to some locations being faster than others. (The 'register' keyword mucked with this a bit, but approximately nobody uses it anymore in new code. Just as well, because good compilers ignore it anyway; more below.)
This is horribly, hilariously wrong when you learn about cache hierarchy, and becomes even more wrong when you throw an OS implementing virtual memory and a disk cache into the picture. C doesn't have any way to refer to cache; you can't tell the compiler 'store this in cache' because that would break the abstraction C enforces.
So we loop back around: C enforces the abstraction for a good reason; namely, compilers are better than humans at scheduling memory use in practically every case, and in the few cases they aren't, you're doing something hardware-specific enough you'll need to drop into assembly anyway. This is also the reason the 'register' keyword is a no-op and has been for decades. Compilers can schedule registers better than humans because compilers know more about all of the optimizations in play, and when they can't, you'll have to drop into assembly anyway.
TL;DR: This is a basic introductory post. Nitpicking it for things that compilers take care of for you anyway is pointless.
It's a valid operation regardless of whether a standards body says it's not.
uint32 x = 5;
uint16 y = *(uint16*)&x;
The effect is to set y to the first two bytes of memory from x. Values assigned to x are serialized into memory in either big endian or little endian order. Those are the only two cases you have to account for. Quake 3 engine has a macro for the above operation which produces the same value of y on all platforms. This is useful for serializing x to disk, then loading it later (and possibly on a different architecture).
One source of confusion is that int and short are essentially, for all intents and purposes, undefined -- they are of course defined by the standards, but their implementation is allowed to vary so much that no programmer can make any assumptions about their size (in bytes) at runtime.
int8, int16, int32, int64 are all explicit and force the compiler (and the hardware) to obey the wishes of the programmer. This is, I think, the right approach. People make much ado about the fact that "a byte isn't necessarily 8 bits" and "the only assumption you can make about a short is that it's smaller than an int, and larger than a char", etc, which is probably unnecessary mental effort.
"Bytes are 8 bits. Here are four bytes. Here's the value that the four bytes store. Copy two of the four bytes to this other spot (adjusting for endianness appropriately via a macro)."
You typically don't want a memcpy in situations like this due to endianness.
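If the goal is a byte layout that is the same on every platform, one common approach (not the Quake 3 macro, just a sketch with made-up names) is to build the bytes with shifts and masks. That sidesteps both the pointer cast and the endianness question, since shifts operate on values rather than on memory.

```c
#include <stdint.h>

/* Serializes v as four bytes, least significant byte first,
   regardless of the host's byte order. */
void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Inverse of store_le32. */
uint32_t load_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* The low 16 bits of a value, with no memory access involved. */
uint16_t low16(uint32_t v)
{
    return (uint16_t)(v & 0xFFFFu);
}
```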
The reason it's useful to explicitly "break the rules" like this is because it's important to know what assumptions you in fact can rely on, regardless of what standards bodies have to say about it. Because at that point you can do incredible things such as http://www.codercorner.com/RadixSortRevisited.htm
inline float fabs(float x){
return (float&) ((unsigned int&)x)&0x7fffffff ;
}
The reason this is incredible and awesome (rather than horrible and dangerous) is because it enabled game developers to achieve a more impressive product for end users, because they were able to do more with the CPU resources that were available at the time.
It's of course not so relevant nowadays, since it's reasonable to assume that most gamers have at least a core 2 duo. But it's one of those things that isn't relevant until suddenly it is -- you're in some situation that requires sorting millions of floats, and your dataset simply demands more performance than your compiler typically gives you. Then suddenly you find you can do amazing things like this, and surprise people with how effectively you can use a modern CPU.
(Although, the modern antidote to "I need to sort millions of floats quickly" is to use SSE, not to sort floats as integers. Yet that's even more evidence that it's better to understand the capabilities of the hardware.)
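The same sign-bit trick can be written without the reference casts. This is a sketch rather than the code from the linked article: it assumes float is a 32-bit IEEE 754 value (the static assertion only checks the size), uses a made-up function name, and moves the bits with memcpy, which compilers generally turn into a plain register move.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Clears the sign bit of an IEEE 754 single-precision value. */
float fabs_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* read the representation */
    bits &= UINT32_C(0x7fffffff);     /* drop the sign bit */
    memcpy(&x, &bits, sizeof x);      /* write it back */
    return x;
}
```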
Whoa there, cowboy. You may not feel personally beholden to standards bodies, but compiler vendors are following their lead. The major compilers are getting more and more aggressive about optimizing away undefined behavior every year.
> The effect is to set y to the first two bytes of memory from x.
No, it's really not. It's undefined behavior and the compiler is free to do absolutely whatever it wants.
> One source of confusion is that int and short are essentially, for all intents and purposes, undefined -- they are of course defined by the standards, but their implementation is allowed to vary so much that no programmer can make any assumptions about their size (in bytes) at runtime.
I agree with this, and have made this argument before: http://blog.reverberate.org/2013/03/cc-gripe-1-integer-types...
But this is an entirely separate issue.
Given that compilers do break when programmers violate aliasing rules, you should recheck what assumptions you think you can rely on. Non-strict aliasing is not one of them. Unless you want to slow everything down with compiler-specific flags like -fno-strict-aliasing.
uint8_t foo[4]; *(uint32_t*)foo = 0;
Besides even without strict aliasing, the above is not at all guaranteed to work since not all architectures support unaligned loads. (and if you think "well but no one uses them, just like no one uses 1's complement architectures anymore", keep in mind that this includes ARM)(also use stdint types already)
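A sketch of the alternative, with hypothetical helper names: going through memcpy makes no claim about the alignment of the byte buffer, so it works at any offset, and it does not run into the aliasing rules either.

```c
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit value at dst in host byte order; dst may be unaligned. */
void write_u32(uint8_t *dst, uint32_t v)
{
    memcpy(dst, &v, sizeof v);
}

/* Loads a 32-bit value from src in host byte order; src may be unaligned. */
uint32_t read_u32(const uint8_t *src)
{
    uint32_t v;
    memcpy(&v, src, sizeof v);
    return v;
}
```

On targets that do allow unaligned access, compilers typically reduce each call to a single load or store.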
At least in C99, the compiler doesn't need to support exact-width integer types.
>People make much ado about the fact that "a byte isn't necessarily 8 bits"
Well, POSIX.1-2004 requires that CHAR_BIT == 8.
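If code really does depend on 8-bit bytes and exact-width types, one way to make that explicit is to have the build fail where the assumption doesn't hold. A minimal sketch (the function name is made up; the macros are standard):

```c
#include <limits.h>
#include <stdint.h>

#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

/* UINT32_MAX is only defined when the implementation provides uint32_t. */
#ifndef UINT32_MAX
#error "this code needs an exact-width 32-bit unsigned type"
#endif

/* Width of uint32_t in bits, computed from the checked assumptions. */
unsigned u32_width_bits(void)
{
    return (unsigned)(sizeof(uint32_t) * CHAR_BIT);
}
```

Where exact widths aren't actually required, uint_least32_t and uint_fast32_t are always available in C99.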
All the world's a VAX, sure. Don't mind the next generation of hardware coming down the pike and the next wave of compiler optimizations.
If you keep track of which boxes are and are not runtime memory cells, that should be enough to work out any particular C pointer problem except the pointer-array almost-equivalence mess.
When it comes to understanding memory in C, another important aspect is understanding how linkers and loaders work. Also, it's good to know something about calling conventions.
Also, when you get to manually allocated heap data (which this article doesn't cover) you don't have to worry about deallocations... usually.
Variables? What state? Everything is puuuuuuuuure.
In Python:
Everything is an object (numbers, true/false values, strings, etc), some are mutable and some are not. Variables are temporary labels on objects (think of them as hard links).
In Rust/C++:
There are various types of boxes / smart pointers (shared, unique, heap, etc), and unsafe / raw pointers should be avoided when possible.
In C:
Not every variable has a data type, e.g. void or function pointers.
A void pointer has type "void* "; a function pointer also has some appropriate type.
Not every object has a type (e.g., a chunk of memory allocated by `malloc()`), but if "variable" means "object created by a declaration", then yes, every object has a type.
uint8_t foo[4]; *(uint32_t*)foo = 0;
Besides even without strict aliasing, the above is not at all guaranteed to work since not all architectures support unaligned loads.
So, the interesting thing about this example is that it does work. It's in fact very, very difficult to find a platform where that example won't work (i.e. crashes the program). For example, any C library involving image manipulation is likely going to have code similar to what you've described, and those libraries work on almost every platform.
Standards are a good and useful thing. All I'm saying is that it's important to know which rules you can safely violate.
No, it isn't. Many ARM processors will bus error on that code if (foo & 3) != 0. I believe PowerPC doesn't do unaligned word reads either...
It quite often has to do with the memory controller and not with the particular processor, though I believe x86 has to support unaligned reads. I've certainly worked first hand with ARMs that did not support it.
Would
uint8_t foo[4]; *(uint32_t*)(&foo[0]) = 0;
also result in a bus error? Why?
...which actually is exactly how I found out first-hand that it doesn't always work. If you only ever test on x86 you'll never catch it. You might not even catch it on ARM if you're lucky.
Which is the point - that compilers can and do make use of almost all undefined behavior of C for optimizations, which one developer might not catch because their current compiler happened to work. Then a new version is released that can find and exploit more undefined behavior. And strict aliasing is one of those rules you can't safely violate.
The point is that compilers do some specific thing, regardless of the fact that the standards bodies say they're free to reboot your computer.
As long as all you care about is x86/x86_64/PowerPC (and probably ARM as well), then you can trust that the compiler is going to generate code which copies the first two bytes of x into the memory occupied by y.
That's the thing that haberman is trying to tell you, you can't trust that any more, even with architectures you think you know. What you said was true about 10 years ago, but things have changed. Go read about "-fno-strict-aliasing" [1].
#include <stdio.h>
#include <stdint.h>
void f(uint32_t *x, uint16_t *y) {
    *x = 5;
    printf("%d\n", *y);
}
int main() {
    uint32_t x = 10;
    f(&x, (uint16_t*)&x);
}

*x = 5;
__sync_synchronize();
printf("%d\n", *y);
http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins....
The reason this example is fundamentally different from my example is because mine doesn't create two objects that point to the same memory. In such situations, memory barriers are necessary. Also, your program won't work on different platforms due to endianness.
Notice I keep saying "if the address is unaligned". The insidious part is that it probably will work for a while since it's likely that your "foo" array will happen to be aligned. But add one uint8_t variable to your structure or stack frame or wherever "foo" is defined and things could shift and suddenly it starts causing bus errors. It can be a very annoying type of heisenbug.
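To see how one extra member can shift things, here is a sketch with two hypothetical structs and an alignment check. It assumes C11 for _Alignof and that converting a pointer to uintptr_t preserves the low address bits, which holds on mainstream platforms but is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* foo starts at offset 0, so it is as aligned as the struct itself. */
struct packet_a { uint8_t foo[4]; };

/* One extra byte in front pushes foo to an odd offset. */
struct packet_b { uint8_t tag; uint8_t foo[4]; };

/* Nonzero if p is suitably aligned for a uint32_t access. */
int is_aligned_for_u32(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}
```

Comparing offsetof(struct packet_a, foo) with offsetof(struct packet_b, foo) shows the shift without having to wait for a bus error.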
And bus errors are actually a good thing. I believe I've used hardware (an ARM or an SH2, can't remember) where the memory controller just ignored the last 2 bits during whole word reads and writes (which works fine as long as you only read aligned words). So if you run your code on that hardware it doesn't give you an error, it just subtly "corrupts" your data. Yay!
The memory barrier "fixed" this program similarly to how a cruise missile "fixes" a termite problem. It was just a coincidence and it was the wrong tool for the job.
A tool doesn't have a purpose. It has capabilities, and understanding why something works (and why it can be relied upon) is all that matters.
The problem with my program is that it casts an int32_t pointer to int16_t pointer. The correct fix is to not do that. "Fixing" the problem with a memory barrier is a step in the wrong direction.
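One way "not doing that" might look, as a sketch with a made-up function name: keep a single pointer of the object's real type and get the 16-bit view by arithmetic on the value. There is only one way to reach the object, so the compiler has nothing to reorder, and the result is the same on big and little endian machines.

```c
#include <stdint.h>

/* Stores 5 through x, then returns the low 16 bits of the stored value. */
uint16_t store_and_read_low16(uint32_t *x)
{
    *x = 5;
    return (uint16_t)(*x & 0xFFFFu);
}
```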
My point is that it is guaranteed to work. A memory barrier guarantees that all memory operations before the barrier take effect before any operations after the barrier.
I think this whole exchange is fascinating because it illustrates two completely different philosophies to hacking. Both are equally valid. I tend to prefer yours because it tends to result in shorter programs. Yet this is just a programmer convention. The machines do not care.
Yet there are some instances where my philosophy -- understanding which rules may be safely ignored -- has paid off. For example, if your invalid program were in a closed-source library which I was forced to interface with, then the program can't simply be fixed. In that case, a memory barrier would probably be the cleanest workaround.
It's an unfortunate fact that this type of situation -- broken third-party code that can't be fixed and can't be replaced -- is quite common in the field. It seems like it's an important skill for an engineer to know how to handle such situations.
EDIT: By the way, Scrybe Music looks really cool!
There is a time and place to break the rules, but it is a calculated risk. It can only be considered "safe" if you make assumptions about your environment (platform, toolchain, etc). You're vulnerable if any of those assumptions change. The things people considered "safe" 10 years ago aren't "safe" any more. But the people who followed the rules never have to change their approach.
For what it's worth, a cheaper barrier in this case (if you were going to take that route) is just a compiler barrier like __asm__ __volatile__ (""); (see: http://en.wikipedia.org/wiki/Memory_barrier#Out-of-order_exe...). There's no need to emit an actual CPU barrier.
Thanks about Scribe; it's a labour of love.
It looks like the following code would work:
#include <stdio.h>
#include <stdint.h>
void f(volatile uint32_t *x, volatile uint16_t *y) {
    *x = 5;
    printf("%d\n", *y);
}
int main() {
    volatile uint32_t x = 10;
    f(&x, (volatile uint16_t*)&x);
}
And the compiler is guaranteed (*) to issue a store op on *x = 5 and then a load op on *y, but the code looks pretty ugly.
(*) assuming no alignment problems