Essential C (2003) [pdf](cslibrary.stanford.edu) |
Example: http://www.cs.stanford.edu/cslibrary/PointerFunCBig.avi returns 404. :(
Does 'advanceed' mean 'more advanced than advanced?' [1]
Although some feel like it’s too opinionated to belong in this article, I really appreciated the above. Edit: To be clear, the author is advocating for simpler syntax here to increase program readability. This could be taken other ways.
Ugh. He's talking about inline use of post-increment and pre-increment (i.e. x++ and ++x) here. This is perfectly readable to a C programmer, and sidestepping them actually makes the code harder to understand.
for(i=n; i-- > 0 ;)
{ /* operate on a[i] */ }
Converting i--; to a statement at the start of the block makes it less clear that it's part of the iteration idiom rather than an ad hoc adjustment that's specific to this particular logic. There are other examples, but they're either more involved or statementification is less obviously wrong.
return x++;
:)
x = *stack--; // pop 'x' off of the stack
*++stack = y; // push 'y' onto the stack
This way is simple, direct, and it avoids inconsistent state.
https://www.youtube.com/playlist?list=PLhQjrBD2T381L3iZyDTxR...
Sure it doesn't get in details about the language but you get the essential and the videos are great.
Topics:
* Best practices for C function signatures (caller allocates (which size?), callee allocates (where? which allocator?))
* Memory Ownership Models
* Borrowing
* Reference Counting
* Garbage Collectors and C-Libraries providing this functionality
* Interning Objects (Strings)
* RAII [1]? And its benefits/flaws
[1] https://en.wikipedia.org/wiki/Resource_acquisition_is_initia...
Context: I feel that understanding the C memory primitives is not that hard (stack variables, malloc/free, C++'s new). But how to use them is devilishly tricky. I have seen little information about this.
This generally discusses the lack of RAII in C (towards the end), and what to do about it:
https://floooh.github.io/2019/09/27/modern-c-for-cpp-peeps.h...
...and this presents a (reasonably runtime-safe) general memory management strategy using tagged-index-handles instead of pointers:
https://floooh.github.io/2018/06/17/handles-vs-pointers.html
The gist is basically:
- don't allocate small chunks of memory all over the code base, instead move memory management into few centralized systems, and let those systems own all memory they allocate
- don't use pointers as public "object references", instead use "tagged index handles"
- don't use "owning pointers" at all, use pointers only as short-lived "immutable borrow references"
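To make the gist above concrete, here is a minimal sketch of a tagged-index-handle pool. It is not code from the linked posts: the names (`Pool`, `Handle`, `pool_alloc`, `pool_lookup`, `pool_free`), the fixed capacity, and the `int` payload are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 4u

/* A handle is a slot index plus a generation tag, not a pointer. */
typedef struct { uint16_t index; uint16_t generation; } Handle;

typedef struct {
    int      value;       /* payload, owned by the pool */
    uint16_t generation;  /* bumped on free so old handles go stale */
    bool     in_use;
} Slot;

/* Must be zero-initialized before use, e.g. Pool pool = {0}; */
typedef struct { Slot slots[POOL_CAPACITY]; } Pool;

static const Handle INVALID_HANDLE = { UINT16_MAX, 0 };

/* Claims a free slot; returns INVALID_HANDLE if the pool is full. */
Handle pool_alloc(Pool *pool, int value)
{
    for (uint16_t i = 0; i < POOL_CAPACITY; i++) {
        Slot *s = &pool->slots[i];
        if (!s->in_use) {
            s->in_use = true;
            s->value = value;
            return (Handle){ i, s->generation };
        }
    }
    return INVALID_HANDLE;
}

/* Resolves a handle to a pointer meant to be used briefly and not stored.
   Returns NULL if the handle is out of range, freed, or stale. */
int *pool_lookup(Pool *pool, Handle h)
{
    if (h.index >= POOL_CAPACITY) {
        return NULL;
    }
    Slot *s = &pool->slots[h.index];
    if (!s->in_use || s->generation != h.generation) {
        return NULL;
    }
    return &s->value;
}

/* Releases the slot. The generation bump makes every outstanding copy of
   the handle fail the check in pool_lookup (until the 16-bit tag wraps). */
void pool_free(Pool *pool, Handle h)
{
    if (pool_lookup(pool, h) != NULL) {
        Slot *s = &pool->slots[h.index];
        s->in_use = false;
        s->generation++;
    }
}
```

The point of the design is that a dangling handle turns into a NULL lookup that can be checked, instead of a dangling pointer that cannot.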
A previous HN discussion on that book:
https://news.ycombinator.com/item?id=15624521
EDIT: That earlier discussion has an excellent first post. Quoting:
"It bothers me so much that very few books (Kernighan) talk about WHY. WHY. WHY is a variable needed? WHY is a function needed? WHY do we use OOP? Every single book out there jumps straight into explaining objects, how to create them, constructors, blah blah blah. No one fricking talks about what's the point of all this?
Teaching syntax is like muscle memory for learning Guitar. It is trivial and simply takes time. Syntax - everyone can learn and it is only one part of learning how to code. Concepts are explained on their own without building upon it.
[... A list with learning resources the poster finds great ...]
This is learning how to produce music. Not learning the F chord. Teaching how to code is fundamentally broken and very few books/courses do it well."
I never really got to a point of learning Haskell or Lisp up until recently; it was always "I can do everything with C/C++/Java/Python", and I could. But the thing is, it was only after learning Lisp that I really got the hang of thinking in a top-down manner (recursively), and it took Haskell to teach me composition intuitively, which could then be extended to my main language (C++). I understand that syntax doesn't matter much, but fwiw I still think in terms of Lisp syntax when writing recursive code in C++/C. So yeah, take that for what you will.
> In particular, if you are designing a function that will be implemented on several different machines, it is a good idea to use typedefs to set up types like Int32 for 32 bit int and Int16 for 16 bit int.
Use <stdint.h> please
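For anyone who hasn't used it, a minimal sketch of what the `<stdint.h>` suggestion looks like in practice. The two helper functions are made up for illustration; only the `uint16_t`/`uint32_t` type names come from the standard header (C99, on implementations that provide the exact-width types).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs two 16-bit values into one 32-bit word.
   The exact-width types come from <stdint.h>, so no hand-rolled
   Int16/Int32 typedefs are needed. */
uint32_t pack_u16_pair(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | (uint32_t)lo;
}

/* Hypothetical helper: recovers the high 16 bits again. */
uint16_t unpack_hi(uint32_t word)
{
    return (uint16_t)(word >> 16);
}
```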
> The char constant 'A' is really just a synonym for the ordinary integer value 65 which is the ASCII value
Not always, especially right after you came off a paragraph explaining how different machines have implementation-specific behaviors
> The compiler can do whatever it wants in overflow situations -- typically the high order bits just vanish.
This is a good time to explain what undefined behavior actually means
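As a sketch of what that means for everyday code: signed overflow is undefined, so the check has to happen before the addition rather than after it. `checked_add_int` is a made-up name, not a standard function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b only if the result fits in an int.
   Returns true and stores the sum in *out on success, false otherwise.
   The range test runs first, so the overflowing expression a + b is
   never evaluated. */
bool checked_add_int(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

Testing `a + b < a` after the fact does not work for signed types, because a compiler may assume the overflow never happens and drop the test.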
> The // comment form is so handy that many C compilers now also support it, although it is not technically part of the C language.
Part of the language since C99
> C does not have a distinct boolean type
_Bool since C99
> Relying on the difference between the pre and post variations of these operators is a classic area of C programmer ego showmanship.
I'm fine with you mentioning that this can be tricky, but this is more opinion than I am comfortable with in an introductory text
> The value 0 is false, anything else is true. The operators evaluate left to right and stop as soon as the truth or falsity of the expression can be deduced. (Such operators are called "short circuiting") In ANSI C, these are furthermore guaranteed to use 1 to represent true, and not just some random non-zero bit pattern.
Under the assumption that there are no boolean types from earlier, this is not true
> The do-while is an unpopular area of the language, most everyone tries to use the straight while if at all possible.
I would argue that people use do-while more than they need to
> I generally stick the * on the left with the type.
Not a problem, but :(
> The & operator is one of the ways that pointers are set to point to things. The & operator computes a pointer to the argument to its right. The argument can be any variable which takes up space in the stack or heap
And constants/globals
> To avoid buffer overflow attacks, production code should check the size of the data first, to make sure it fits in the destination string. See the strlcpy() function in Appendix A.
strlcpy is non-standard and probably not what you want
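One standard-library alternative is `snprintf`, sketched below. This is just one option, and `copy_string` is a made-up wrapper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which holds dst_size bytes.
   dst is always NUL-terminated. Returns false if src did not fit and
   was truncated. */
bool copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf returns the length the full output would have had,
       not counting the terminating NUL. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Whether silent truncation is acceptable at all is a separate question; the return value at least lets the caller decide.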
> The programmer is allowed to cast any pointer type to any other pointer type like this to change the code the compiler generates.
> p = (int * ) ( ((char * )p) + 12); // [Some spaces added by me to prevent Hacker News from eating the formatting]
Only in some very specific cases…
> Because the block pointer returned by malloc() is a void* (i.e. it makes no claim about the type of its pointee), a cast will probably be required when storing the void* pointer into a regular typed pointer.
Casting malloc is never required (and I would say usually not a good thing to do)
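A minimal sketch of the cast-free form (C only; C++ does require the cast). `make_filled` is a made-up helper for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an array of n ints, each set to value.
   Returns NULL on allocation failure. The caller owns the result and
   must free() it. */
int *make_filled(size_t n, int value)
{
    /* No cast: in C, void* converts implicitly to int*.
       sizeof *a follows the pointee type if the declaration changes. */
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = value;
    }
    return a;
}
```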
https://www.amazon.com/Primer-Plus-5th-Stephen-Prata/dp/0672...
My version was older than the Amazon version as I learned in 1987.
> Under the assumption that there are no boolean types from earlier, this is not true
Actually, I believe _Bool is guaranteed to use 0 for false, and any non-0 value is stored as 1 for true. Arithmetic on _Bool is also guaranteed, based on those values.
For example, I believe the standard guarantees:
_Bool x = 255;
assert(x == 1);
size_t y = 10 + x;
assert(y == 11);
> Not a problem, but :(
In a declaration, * is a type modifier. E.g. `int a;` declares a variable of type "int" named "a". `int* a;` declares a variable of type "pointer to int" named "a".
The only time this doesn't work is if you stick multiple declarations on the same line. That's annoying to me, because it means you're breaking the "declare variables as close to their first use as possible" practice. It's not K&R C any more, you don't need to declare everything all at once at the top of the scope.
Also, to prove that `*` in a declaration is part of the type, note that K&R style function declarations (no argument names, just types) are still valid C (though I'd strongly discourage their use). So `void func(int* a, int* b);` is identical to `void func(int*, int*);`. It's very different from `void func(int, int);` that you'd get if you assume the `*` goes with the `a`.
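The multiple-declarations caveat can be checked directly. A small sketch using C11 `_Generic`; the macro and function names are made up for illustration.

```c
#include <assert.h>

/* Evaluates to 1 if x has type int*, 0 if it has type int (C11 _Generic). */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, int: 0)

/* In "int* p, q;" the * attaches to p only. */
int first_declarator_is_pointer(void)
{
    int* p = 0, q = 0;
    (void)q;
    return IS_INT_POINTER(p);   /* p is int* */
}

int second_declarator_is_pointer(void)
{
    int* p = 0, q = 0;
    (void)p;
    return IS_INT_POINTER(q);   /* q is a plain int */
}
```

So `int* p, q;` gives one pointer and one int, which is the case where the "star goes with the type" reading misleads.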
Can you elaborate on this one? I thought && and || expressions always evaluated to 0 or 1.
Converting any scalar value (that includes pointers) to _Bool yields 0 or 1 (false or true).
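A quick sketch of both points, with made-up function names: the logical operators always produce an `int` that is exactly 0 or 1, and conversion to `_Bool` (spelled `bool` via `<stdbool.h>`) normalizes any scalar the same way.

```c
#include <assert.h>
#include <stdbool.h>

/* && and || yield an int that is exactly 0 or 1,
   whatever non-zero values the operands hold. */
int logical_and(int a, int b) { return a && b; }
int logical_or(int a, int b)  { return a || b; }

/* Converting a scalar (here a pointer) to bool gives 0 or 1. */
bool is_set(const void *p) { return (bool)p; }
```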
edit: Ah unsigned underflow. :O
for (size_t i = n-1; i < n; --i) { /* operate on a[i] */ }
It works fine (unsigned overflow is well defined) but it's even less clear.
for(i = n-1; i >= 0; i--)
{ /* operate on a[i] */ }
breaks if i is unsigned, like a size_t.
For someone who doesn't have the operator precedence rules memorized, it isn't clear whether the above code means this:
x = *stack;
stack--;
or this:
stack--;
x = *stack;
Combining those two operations into one line is a trade-off I will never agree with. And I'm a fan of C myself:
https://gist.github.com/cellularmitosis/3327379b151445c602ad...
https://gist.github.com/cellularmitosis/d8d4034c82b0ef817913...
The two-liner is actually the one which is simpler and more direct, as it requires less knowledge of operator precedence rules. The one-liner and two-liner compile to the same number of instructions, so I don't see how either "avoids inconsistent state".
Many expert-level C programmers tend towards one-liners. Here's an example from the original "Red book":
c = ((((i&0x8)==0)^((j&0x8))==0))*255;
nooooo don't do it sadpanda.jpg
It's not about performance, or thread safety, or anything like that; it's about having a coherent mental model of the code. A statement should, if possible, represent a single, complete operation. Invariants should not be violated by a statement, with respect to its environment. (This is more true for 'push' than 'pop'.) One way of solving that is to bundle the 'push' and 'pop' operations up into functions; someone else in this thread did that. But why bother with the mental overhead of a function call when you could just represent the operation directly? To be sure, there are cases where the abstraction is warranted, but a two- or three-line stack operation isn't abstraction, it's just indirection.
> For someone who doesn't have the operator precedence rules memorized, it isn't clear whether the above code means [snipped] or [snipped]
> The two-liner [...] requires less knowledge of operator precedence rules
It's not operator precedence—that's a separate issue; despite having implemented c operator precedence, I don't know all of them by heart—but simply behaviour of pre- and post-increment/decrement operations. It's even mnemonic—when the increment symbol goes before the thing being incremented, the increment happens first; else after—but even if not, it's a fairly basic language feature.
Even beyond that, though, it's an idiom. Code is not written in a vacuum. Patterns of pre- and post-increment fall into common use over time and become part of an established lexicon which is not specified anywhere. Natural language works the same way. Nothing wrong with that.
> It's even mnemonic—when the increment symbol goes before the thing being incremented, the increment happens first; else after—but even if not, it's a fairly basic language feature.
I think you missed the issue.
This is 100% about operator precedence, and has nothing to do with the decrement operator being in front of or behind the variable.
This expression:
*stack--
means either this:
(*stack)--
or this:
*(stack--)
depending on the operator precedence rules.
If this is the layout of memory:
~~~~~~
stack-1: | 52 |
stack: | 23 |
stack+1: | 19 |
~~~~~~
(* stack)-- evaluates to 22, while *(stack--) evaluates to 52.
int pop_int ()
{
int x = *stack;
--stack;
return x;
}
void push_int(int x)
{
++stack;
*stack = x;
}
Genuine questions:
- Is this worse?
- How does the state get inconsistent?
Right, yes. I got confused by your example, because the example is definitely about pre- vs post-increment. My point about idioms still stands, though.
> (* stack)-- evaluates to 22, while * (stack--) evaluates to 52.
Actually, (* stack)-- evaluates to 23, but changes *stack to 22 :)
Thanks, I just realized I had that wrong :) https://pastebin.com/n7sHzW3p
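For anyone who wants to check the two parses against the memory layout in the diagram (52, 23, 19, with stack pointing at the 23), here is a small sketch. The struct and function names are made up for illustration. Postfix `--` binds tighter than unary `*`, so the unparenthesized `*stack--` is the first case.

```c
#include <assert.h>

struct Outcome {
    int value;       /* what the expression itself evaluated to */
    int top_after;   /* *stack once the statement has finished */
    int slot_after;  /* contents of the original slot afterwards */
};

/* *stack-- parses as *(stack--): read the current slot, then move the pointer. */
struct Outcome pointer_decrement(void)
{
    int mem[3] = {52, 23, 19};
    int *stack = &mem[1];
    int x = *stack--;
    struct Outcome r = { x, *stack, mem[1] };
    return r;
}

/* (*stack)-- decrements the pointed-to int and leaves the pointer alone. */
struct Outcome value_decrement(void)
{
    int mem[3] = {52, 23, 19};
    int *stack = &mem[1];
    int x = (*stack)--;
    struct Outcome r = { x, *stack, mem[1] };
    return r;
}
```

Both expressions evaluate to 23, because post-decrement yields the old value either way. What differs is the side effect: the first leaves stack pointing at 52 with the 23 untouched, the second leaves stack in place and changes the 23 to 22.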
> The type are [...] size_t which is the unsigned integral type of the result of the sizeof operator
(Emphasis mine.)
> The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object ([expr.sizeof]).
It's not guaranteed to have a full negative range, only to be able to represent -1.
Use ptrdiff_t as a signed size type.
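To tie this back to the reverse-loop examples earlier in the thread, a minimal sketch of the same loop with an unsigned and a signed index. Function names are made up, and the signed version assumes n fits in `ptrdiff_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Walks a[n-1] .. a[0] with an unsigned index. The test i-- > 0 compares
   the old value of i and then decrements, so the loop also terminates
   cleanly when n == 0. */
long sum_reverse_unsigned(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = n; i-- > 0; ) {
        total += a[i];
    }
    return total;
}

/* Same walk with a signed index: i >= 0 is a meaningful test here,
   whereas with size_t it would always be true. */
long sum_reverse_signed(const int *a, ptrdiff_t n)
{
    long total = 0;
    for (ptrdiff_t i = n - 1; i >= 0; i--) {
        total += a[i];
    }
    return total;
}
```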
Relying on post-increment? Make sure it's a one line block that is totally unbraced with only single letter variable names if you do it because otherwise it's just faux-macho C and that's /weak/.
I think you're projecting. The point being made was that when you're writing a simple stack (as you often might do in C, since the standard library and the language itself conspire against providing you one) and you don't have the overhead to write multiple functions to wrap it up (vertical space is an issue when you make more than one of these–trust me, I used to write Java and every thing about it was just a papercut in verbosity), the post- and pre-increment versions are concise, idiomatic, and–to be honest–more clear simply because they use the operators in the way that they are meant to be used. I can glance at them and see, OK, this one gives me whatever the stack is pointing to and then makes it point to the next element; this one first moves the pointer to the next element (which is free) and sets it. All in one line. There's nothing to show off here, this is just how you write C; those operators exist for exactly this purpose (and IMO single letter variable names are generally only a good idea in the smallest of scopes, and I personally use braces even when optional).
Umm, no? Indices are ordinals[0], forming the canonical/nominal well-ordering of a collection such as an array.
> an ordinal number, or ordinal, is one generalization of the concept of a natural number that is used to describe a way to arrange a (possibly infinite) collection of objects in order, one after another. [...] Ordinal numbers are thus the "labels" needed to arrange collections of objects in order.
In C an object can be larger than PTRDIFF_MAX, a real possibility in modern 32-bit environments. (Some libc's have been modified to fail malloc invocations that large, but mmap can suffice.) Because pointer subtraction is represented as ptrdiff_t, the expression &a[n] - a could produce undefined behavior where n is > PTRDIFF_MAX. But a + n is well defined behavior for all positive n (signed or unsigned) as long as the size of a is >= n.
There's an asymmetry between pointer-pointer arithmetic and pointer-integer arithmetic; they behave differently and have different semantics. Pointers are a powerful concept, but like most powerful concepts the abstraction can leak and produce aberrations. I realize opinions vary on whether to prefer signed vs unsigned indices and object sizes (IME, the camps tend to split into C vs C++ programmers), but the choice shouldn't be predicated on the semantics of C pointers because those semantics alone don't favor one over the other.
Parent* p = container_of(field,Parent,pa_somefield);
access(p->pa_otherfield);
You'd usually define container_of using subtraction (not negative offset per se):
#define container_of(FIELD,TYPE,MEMB) ({ \
const typeof( ((TYPE*)0)->MEMB )* _mptr = (FIELD); \
(TYPE*)( (char*)_mptr - __builtin_offsetof(TYPE,MEMB) ); \
})
but you shouldn't actually be using that directly, because that's what the macro is for.
int abc(int a, int b, int c)
{
}
I can do postincrement. I learned C the macho way. We all still have to read that crap. Now I know better when I'm writing it. I strongly disagree that
a = *stack--;
*++stack = b;
is better in any way beyond "I'm a macho C guy" than
a = pop_int();
push_int(b);
https://en.wikipedia.org/wiki/Duff%27s_device
It's fun when you first see it. Sure.
I agree with saagarjha, there is nothing unclear or "crap" about using basic operators in an idiomatic way.
But this is as much beside the point under discussion as global pointers you raise.
Post-increment is an artefact from PDP-11 assembler and maps to a single instruction there. That's where it came from quite directly. It's completely unnecessary. Most modern languages find it useless enough they remove it. Python goes fine without it relying on +=, for example. (Although some do repeat C mistakes when basing their syntax on C, e.g. the unbraced, single-line block that serves only to add a non-zero probability of introducing a future bug, with the benefit of precisely nothing. Hi Walter! Larry Wall cops flack for Perl syntax but he did not copy that.)
Post increment is hardly the end of the world it just isn't useful. It doesn't help readability. It can harm it. As a question of taste I find it lacking.
But hey, everyone else uses it, and Duff's device is fun to read so go with them, knock yourself out.