#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#define YDUMMY(suffix, size) char dummy##suffix[size]
#define XDUMMY(suffix, size) YDUMMY(suffix, size)
#define PAD(size) XDUMMY(__COUNTER__, size)
struct ExplicitLayoutStruct {
union {
struct __attribute__((packed)) { PAD(3); uint32_t foo; };
struct __attribute__((packed)) { PAD(5); uint16_t bar; };
struct __attribute__((packed)) { PAD(13); uint64_t baz; };
};
};
int main(void) {
// offset foo = 3
// offset bar = 5
// offset baz = 13
printf("offset foo = %zu\n", offsetof(struct ExplicitLayoutStruct, foo));
printf("offset bar = %zu\n", offsetof(struct ExplicitLayoutStruct, bar));
printf("offset baz = %zu\n", offsetof(struct ExplicitLayoutStruct, baz));
return 0;
}
A macro is effectively preprocessing facilitated by the language. You could always preprocess externally if you wanted, and there's nothing stopping you from doing that in the "Powerful Language (TM)" either.
Now whether people use macros and preprocessing usefully is another question, but not one to which the answer is "abolish macros for more language features". When used correctly, macros ARE power.
But if -sadly- you must use C, metaprogramming using macros is not a terrible thing.
That is, C style enums don’t have to have a name but “type safe” (enum class) ones do. One classic use is to name an otherwise boolean option in a function signature; there’s typically no need to otherwise name it.
C++ incompatibly requires a name for all struct and class declarations, again a waste when you will only have a single object of a given type.
I don't, either. Such were in D from 2000 or so.
I also don't understand why `class` in C++ sits in the tag name space. I wrote Bjarne in the 1980s asking him to remove it from the tag name space, as the tag name space is an abomination. He replied that there was too much water under that bridge.
D doesn't have the tag name space, and in 20 years not a single person has asked for it.
This did cause some trouble for me with ImportC to support things like:
struct S { ... };
int S;
but I found a way. Although such code is an abomination. I've only seen it in the wild in system .h files.
You're right about "enum class", but anonymous classes and structs are perfectly valid in C++:
https://github.com/floooh/sokol-samples/blob/bfb30ea00b5948f...
(also note the 'inplace initialization' which follows the state struct definition using C99's designated initialization)
(I'm the author of the linked-to article.)
https://shafik.github.io/c++/undefined%20behavior/2019/05/11...
C++ namespaces are a way to avoid library A's symbol "cow" clashing with library B's symbol "cow" without everything being named library_a_cow and library_b_cow all over the place which is annoying. I agree C would be nicer with such a namespace feature.
However this technique is about what happens when you realise your structure members x and y should be inside a sub-structure position, and you want both:
d = calculate_distance(s.x, s.y); // Old code
and
d = calculate_distance(s.position.x, s.position.y); // New
... to work while you transition to this naming.
The story is different in C++, but in practice many compilers support it the same as in C. Especially for games, where VC++ (PC, Xbox) and Clang (PS4/PS5) are the most commonly used compilers, it also works as expected. The trick is to only use type punning for trivial structs that don't invoke complications like con/de-structors or operators. The GP's example of a Vec3 struct that puns float x,y,z with float[3] is a very common one in games.
You don't need eval(), you've got strcpy()!
I think that C++ is better than C, but C is not that bad, even for large projects.
Sure, and operating systems have been written in assembly too. The question is whether it would be better than just sufficient if Linux were written in C++, today (i.e. C++17 or 20, not something old). Switching now probably wouldn't be feasible (even ignoring technical reasons, the kernel developer community is familiar with the C codebase and code standards and bought into it), but if Linux were started today, would it be a better choice?
Maybe the answer is still no and C would still be chosen, but the choice today is very different than it was when Linux was started. Of course, maybe Rust or something would be chosen today instead.
> C++ would probably be very easy
Not necessarily. Besides some small(?) problems due to C++ allowing "more magic optimizations" than C, they would have to switch to a subset of C++, and you would need to communicate to all contributors that a lot of C++ things are not allowed. It might be easier to simply not use C++. I mean, if it were that easy, the kernel likely would have switched.
I've had far more success hard-firewalling C++ into its own box where programmers can use whatever they can get running than trying to limit people to subsets.
The union is only as big as necessary to hold its largest data member. The other data members are allocated in the same bytes as part of that largest member. The details of that allocation are implementation-defined but all non-static data members will have the same address (since C++14). It's undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.
What 6.5.2.3 simplifies is the use of unions of the type:
struct A { int type; DataA a; };
struct B { int type; DataB b; };
union U { A a; B b; };
U u;
switch (u.a.type) ...
It's not what is being used here.
std::variant is designed to deprecate all legitimate uses of union
It also may be true that one can't replace every use of the preprocessor. That shouldn't stop replacing what one can.
Everybody says that. Everybody believes it. And everybody goes to town making a rat's nest with macros, just like that snarl of cables under my desk that resist all attempts to make it nice.
Myself included. I've even written an article about clever C macros. Look, ma! I was so proud of myself.
But then I got older. I started replacing the macros in my C code with regular code. It turns out they weren't that necessary at all. I liked the C code a lot better when it didn't have a single # in it other than #include.
His tl;dr being that Rust feels very much like a proper systems programming language, and more of a « better C » than C++. I don't entirely know what to make of it, but my instinct is that something like C++ with such an opportunity space for baroque concoctions (leading to an obsession with design patterns) is just playing with fire.
I suppose migration is another possible use case for the union trick, and for that case C++ inline namespaces can be used as part of an implementation that achieves a broadly similar goal, but in a completely different way. As tialaramex notes, with inline namespaces you still end up with two different types.
C has polymorphism. Inheritance-based virtual dispatch is just one kind of polymorphism. It's common to wire up polymorphism in C with bespoke data structures using tagged unions or function pointers. Changing an implementation at link time is even a form of polymorphism.
I never truly appreciated how polymorphism can take so many different forms!
An example of the former is so-called "expression templates" in C++. I've seen them used to create a regular expression language using C++ expression templates. The author was quite proud of them, and indeed they were very clever.
However nice the execution, the concept was terrible. There was no way to visually tell that some ordinary code was actually doing regular expressions.
C++ expression templates had their day in the sun, but fortunately they seem to have been thrown onto the trash pile of sounds-like-a-good-idea-but-oops.
(I wrote an article showing how to do expression templates in D, mainly to answer criticisms that D couldn't do it, not because it was a good idea.)
> I'm referring to both syntax based (AST) macros ...
This surprises me greatly. Various lisps are among the most powerful languages I know of and a large part of the reason is macros coupled with their ability to execute arbitrary code at compile time (which itself uses additional macros, which in turn invoke more code, and so on). What's your take on this?
(Continuations are also pretty nice ...)
I've seen this happen with assembler macro languages, too.
> most powerful
It's like putting a 1000 hp motor in a car. Its main use is to wreck the car and kill the driver.
BTW, D is the first language of its type (curly brace static compilation) to be able to execute arbitrary code at compile time. It started as kind of "let's see what happens if I implement this", and it spawned an explosion of creativity. It has since been adopted by other languages.
The speaker was proud and beaming for showing how powerful C++ is and the audience was in awe.
I was incredulous! Jaw open! The "solution" was horrible with a bunch of workarounds for a bunch of shortcomings. It was a "the emperor has no clothes" moment for me.
Boost Spirit may be better today with newer C++ features; I don't know.
But it was done just like you'd do it in C++. The same thing, just with D syntax.
Actually, they use it themselves. [0]
But it's also not used <to have namespacing> but to <improve on cross-field memory operation>.
I think C++ though is adding them.
What I'd like in C is designated function parameters.
// these are the same
bar(.a = 10, .b = 12);
bar(.b = 12, .a = 10);
struct bar_arguments {
int a, b;
};
int bar(struct bar_arguments args) { return 2*args.a + args.b;}
#define bar(...) bar((struct bar_arguments) {__VA_ARGS__})
// usage (will print 32 three times)
printf("%d\n", bar(10, 12));
printf("%d\n", bar(.a = 10, .b = 12));
printf("%d\n", bar(.b = 12, .a = 10));
The main drawback is that all parameters are now optional: it will not complain if you forget to assign all parameters, it will silently set them to 0 :-/
printf("%d\n", bar(10));
printf("%d\n", bar(.a = 10));
printf("%d\n", bar(.b = 12));
will print 20, 20 and 12.
You can change those "default values", but then calling the function with regular positional parameters is impaired :-/
I don't understand. How would struct or class initialization be any different from simply doing, say, `for (auto& a : { x, y, z }) frob (a);` which is perfectly legal?
I don’t use extensions, even convenient ones, as I have to be able to run my code on a variety of compilers. If you don’t have to do that, some extensions (like this one) are really handy.
> An unnamed enumeration that does not have a typedef name for linkage purposes ([dcl.typedef]) and that has a first enumerator is denoted, for linkage purposes ([basic.link]), by its underlying type and its first enumerator; such an enumeration is said to have an enumerator as a name for linkage purposes.
And for classes/structs, [class.pre](https://eel.is/c++draft/class.pre#def:class,unnamed) has explicit wording:
> A class-specifier whose class-head omits the class-head-name defines an unnamed class.
So both are entirely fine (and likewise, unions are too).
Note that my links are for the current draft, but I just checked and this was already the case as far back as C++11. So I wonder where this persistent myth seems to come from.
The part of enums you quoted was C-compatible enums; anonymous scoped enums are explicitly forbidden: "The optional enum-head-name shall not be omitted in the declaration of a scoped enumeration" (dcl.enum 2).
Sigh. I will send in a clarification at least on the class/struct/union side. Ideally the grammar would be fixed rather than that paragraph.
The draft I looked at is https://timsong-cpp.github.io/cppwp/n4868/ (2020-10-18, shortly after the standard was approved).
https://www.yodaiken.com/2018/06/07/torvalds-on-aliasing/
Sure, it's compiler-specific, but I'm already using `__attribute__((packed))` anyways.
Secondly this technique does something different. The C hack doesn't touch the old code. But this "inline namespace" trick means old code has to explicitly opt into this backward compatibility fix or else it might blow up.
Lastly, I didn't try this, but presumably you did. Are the two separately namespaced classes the "same thing" as far as type checking is concerned? A vital feature of this union trick is that it's just one structure, it type checks as the same structure because it is the same structure. At a glance, I think the C++ solution results in two types with similar names, so that would fail type checking.
Yes, inline namespaces were only introduced in C++11, about 10 years ago. Now let's dive into the article.
"Learning that you can use unions in C for grouping things into namespaces"
Grouping into namespaces, so when did C++ get said feature?
ANSI/ISO C++98 was released to the world in September 1998, which makes around 23 years, or 24 years if we consider the release of C++ compilers already supporting it the year before, like Borland C++.
This C hack definitely does touch old code, as it requires the code to be written to take advantage of the technique, and the code is touched again when changes to the structs are required.
And naturally recompilation.
With inline namespaces, assuming recompilation, you can naturally also change which set of identifiers and type aliases are visible by default.
> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’)
This is broader than the common initial subsequence clause, and allows punning between completely different types, e.g. int, char[4], and float.
You might ask, what is the point of the "common initial subsequence" rule then? It's to allow certain accesses that don't go directly through the union, so the compiler doesn't know for sure whether there's a union involved. Only problem is that all major compilers completely ignore this rule. [1] (But they do implement the first clause I mentioned, where the accesses do go through the union.)
[1] https://stackoverflow.com/questions/34616086/union-punning-s...
The C standard references "struct or union" all over the place because the two are so similar. The distinction is of course made clear in multiple places, but one that seems relevant here is:
> As discussed in 6.2.5, a structure is a type consisting of a sequence of members, whose storage is allocated in an ordered sequence, and a union is a type consisting of a sequence of members whose storage overlap. (ISO/IEC 9899:201x, §6.7.2.1, #6)
That's it. There's nothing about undefined behavior if you access one member and then another later. In fact there's even a paragraph which mentions doing just that:
> The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bitfield, then to the unit in which it resides), and vice versa. (ISO/IEC 9899:201x, §6.7.2.1, #16)
A pointer to the union points to each of its members, and can be dereferenced to access it.
std::variant is not used in C; C and C++ are two different languages.
Correct me if I'm wrong, but there is no part of the C spec that says this:
When initializing a union member that is smaller than the largest member, the remaining bytes will always automatically be initialized to zero.
If I'm right then the following caveat must be added to your statement:
> A pointer to the union points to each of its members, and can be dereferenced to access it.
... if and only if the member which was originally initialized is at least as large as the other member being accessed.
In other words, if you write your program in a way that ensures it will only compile when all union members are exactly the same size, and you have mandatory tooling to make sure that any changes to said union follow the same rule by force of compilation errors, then and only then can you claim what you claimed without the threat of undefined behavior.
static obj& some_call (obj& o, enum struct { abandon, save } disposition) { ... };
This is a common case (and should be more common) to avoid using an obscure boolean flag, which can lead to bugs. It shouldn't need a name.
An anonymous namespace just means the name itself won't leak out; under C++ rules I need the name even to specify the enum tag, which is absurd.
The point is to prevent the “mysterious bool arguments” class of error.
The question is whether ADL could infer the scope of the enum, just as template instantiation can now infer the right thing and doesn't always need the <T> notation.
That's pretty strange, considering e.g. this paper from quite some time ago: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p022... which is written as if anonymous structs always were a thing.
I wonder if there isn't a deep confusion somewhere where "anonymous" and "unnamed" mean different things to different persons.
Nope. C considers that a.field2 is still a perfectly reasonable name for well... a.field2, even though a.sub.field2 is now also a name for that same field. The old code that cares about a.field2 works, no changes. You can recompile it, or not, either way, still a.field2
New code (or, sure, rewritten code if you have the budget to go around rewriting all your code) gets to talk about the new-fangled a.sub and it all works together.
Whereas with the C++ namespace hack that doesn't work.
Which is fine -- no reason your C++ namespace hack isn't great for whatever you wanted that for, and this hack is great for what it wanted to do. But where we end up with this thread is your claim that since Bjarne was pestered into adding the namespace feature to C++ in about 1990 this C hack isn't necessary for C++ even though the two are orthogonal.
Yes C++ has namespaces. Yes that's a good feature. No it doesn't help you solve this problem even with the later "inline namespace" feature.
> Grouping into namespaces, so when did C++ get said feature?
What you've done there, and perhaps in this whole thread, is assume that your context is the only context. There is some irony in the fact that this is the sort of problem namespace features in programming languages often aim to prevent. std::cmp::Ordering is very different from std::sync::atomic::Ordering
In your context "namespace" means the C++ feature. But in the author's context, as a C programmer, it just meant the plain fact that C considers a.foo to be a field in a, while a.b.foo is a field in b, which is in turn a field in a, and these names are separate, they don't shadow, they don't clash. The same way member names from different classes are in separate namespaces.
The correct way to access packed structs is through memcpy, just like you'd access any other potentially unaligned object.
Packed is technically not undefined behaviour, but it is certainly a trap, especially because compiler-detection macros lead people to write defines that select the packed attribute automatically per compiler. The fallback case for an unrecognized compiler is then just left empty, meaning the code compiles but no longer does what you think.
It is a terrible thing. It's possible to do without them, and you'll like your code better. Your symbolic debugger and syntax directed editor will work properly. The poor schlub who has to fix the bugs in your code after you leave will be grateful. Your spouse will be happy and your children will prosper.
For example,
#define foo(x) ((x) + 1)
replace with:
int foo(int x) { return x + 1; }
The compiler will inline it for you. There's no penalty.
#define FOO 3
Replace with:
enum { FOO = 3 };
or:
const int FOO = 3;
Replace:
#if FOO == 3
bar();
#endif
with:
if (FOO == 3) bar();
The optimizer will delete bar() if FOO is a manifest constant that is not 3.
According to this stance, any code that's suppressed by the C preprocessor should either be written in an if {} statement so that it will at least continue to compile as the surrounding code changes, or be replaced with comments describing what it does (or did), if it's important enough to keep track of.
Can't really think of many good counterarguments to this. Machine dependence might be one, but then you could argue that the preprocessor is being used to cover up for an inadequate HAL.
Conditional compilation is an optimization, not a semantic intent.
> non-optimized debug builds are affected
The code size will be larger. Doesn't matter.
The optimizer will delete bar() if FOO is a manifest constant that is not 3.
Yes, but the compiler won't delete it and will complain if bar is not defined. if constexpr is not a direct replacement for an #if macro.
extern void bar();
will declare it to the satisfaction of the compiler. The linker won't complain if the compiler removes the call to it.
/* example.h */
void f(enum struct { x, y } arg);
/* example.c */
void f(enum struct { x, y } arg) {
/* do something with arg */
}
…then you've just created a function with two different overloads based on distinct anonymous types which just happen to be spelled the same way. Without a type name I don't see any way you could define a function whose prototype would be compatible with the forward declaration. You also have conflicting definitions of "x" and "y" with the same names and scope but different types. Perhaps with GNU extensions you could use typeof(x) for the argument and avoid the conflict, but that isn't standard C++.
"Somebody once told me that in basketball you can't hold the ball and run. I got a basketball and tried it and it worked just fine. He obviously didn't understand basketball." https://blog.regehr.org/archives/213
Different GCC versions aren't randomly going to change documented behavior. And when they accidentally do, they will consider it a bug.
[1] https://stackoverflow.com/questions/11639947/is-type-punning...
[2] https://stackoverflow.com/questions/25664848/unions-and-type...
This is not true.
> what it really means is implementation defined
Implementation-defined behavior is a thing in the standard and is separate from undefined behavior.
void my_bar_class_type::bar(my_bar_type<my_bar_type_2>::my_bar_inside_type parameter);
and you are calling something like myObject->bar(mySecondObject.getWhatever());
You have to mock up like everything
This is not (typically) the case. It would be like saying that you need to write lots of templates to get things done in D. Metaprogramming is certainly very nice to have but it's not a requirement for the vast majority of tasks.
It's important to note that Lisps are an entire family of languages; some implementations are batteries included while others are extremely minimal. Where things can get a bit confusing is that many macro implementations are so seamless that significant pieces of core language functionality are built in them. Schemes tend to take this to an extreme, with many constructs that I would consider essential to productive use of the language provided as SRFIs.
> macros then become your personal undocumented wacky language
That's Doing It Wrong™. You could as well argue to remove goto from a language because sometimes people abuse it and write spaghetti. C++ has operator overloading. D has alias this. If (for example) a DSL is the appropriate tool then being able to use macros to integrate it seamlessly into the host language is a good thing.
Right, but the temptation to do it is irresistible.
> That's Doing It Wrong™
Of course it's doing it wrong. The point is, that seems to always happen because the temptation is irresistible.
> goto
I rarely see a goto anymore. It just doesn't have the temptation that macros do.
> alias this
Has turned out to be a mistake.
> integrate it seamlessly into the host language is a good thing
Supporting the creation of embedded DSLs is a good thing. Hijacking the syntax of the language to create your own language is a bad thing. I've seen it over and over, it never works out very well. It's one of those things you just have to experience to realize it.
D's support for DSLs comes from its ability to manipulate string literals at compile time, generate new strings, and mixin those strings into the code. This is clearly distinguishable in the source code from ASTs.
I'd like to suggest that I think you might be missing some perspective here. You say that misuse of macros always happens and that you've seen it over and over. Yet if you explore the Scheme ecosystem you might notice that significant parts of any given implementation often take the form of macros. Racket in particular fully embraces the idea of the programmer mixing customized languages together and while examples of bad code certainly exist it seems to work out quite well on the whole.
To be clear, I do appreciate having easy access to tools that are simple and safe. I just also like having seamless access to and interop with a set of powerful ones that don't try to protect me from my own poor decisions. I shouldn't need to do extra work to make use of an alternative more powerful tool for a small part of a project. At that point it becomes very tempting to drop the safer tool altogether in favor of the more powerful one just to avoid the obviously needless and therefore particularly irritating overhead.
I much prefer the approach of providing limited language subsets that can be opted into and out of in a targeted manner. Having the compiler enforce a simple one by default provides a set of guard rails without getting in the way when it matters.
If I could write the majority of my code in something resembling Go and just a small bit of it in an alternative dialect with expressive power comparable to Common Lisp that would be ideal. To that end, I'm a huge fan of features like @system, @safe, and @nogc in D while very much disliking the need to use string mixins to write a DSL, the various restrictions placed on CTFE behavior, and other similar things.
I don't know much about D compile time evaluation. How is it better than macros/templates?
Also, what do you think about Haskell's and Rust's approach of generics with typeclass/trait bounds?
CTFE isn't better than templates, it's a completely different tool. CTFE computes a result at compile time as though you had written a literal in the source code. Templates generate blocks of specialized code on the fly based on various parameters (typically types). They solve different problems.
What does D do, then?
Why can't it have semantic intent? I frequently use it for cross-platform adjustments, and those don't usually compile on the defined-away platform.
Then you're doing it wrong :-/
All these can be made to work.
P.S. compiling successfully is not the same thing as linking successfully. Think stubs and deciding which files to link together.
And who is to say I'm doing it wrong? Just because conditional compilation CAN lead to a rat nest doesn't mean it MUST. And if it happens to be well-organized and works as is, there is no problem.
> What does D do, then?
Most mainstream languages, D included, very intentionally don't provide any features that could potentially be used to extend the language itself on the fly. (At least not in a straightforward manner. Obviously the C preprocessor kind of sort of facilitates a bit of this.)
As a counterexample, Rust does provide some of this in the form of procedural macros but doesn't provide (to the best of my knowledge) an equivalent to Lisp reader macros.
Rewriting it the way I suggest tends to force a cleanup of all that. You'll like the results.
BTW, if you want to try transitioning to the next level, try getting rid of all the `if` statements, too!
Oh, have I heard that before. Here's what happens, over the years, to such code:
1. #ifdef's on the wrong feature. For example, #ifdef on operating system for a CPU feature.
2. Overly complex #if expressions, that steadily get worse.
3. Fixing support for X that subtly breaks F and Q support. It goes unnoticed because your build/test machines are not F and Q.
4. People having no idea what #defines to set on the command line, nor what the ones that are set are doing (if anything).
5. People having no idea how to fold in support for new configurations.
6. Code will #ifdef on predefined macros that have little or no documentation on what they're for or under exactly what circumstances they are set. If the writer even bothered to research that. gcc predefines what, 400 macros?
> Just because conditional compilation CAN lead to a rat nest doesn't mean it MUST.
Jyust yew wite, 'enry 'iggins, jyust yew wite!
Over how many years? Some of my projects are 30+ years old and support a few dozen combinations of OSes, compilers, and architectures, and the method I described hasn't led to a rat's nest yet.
> #ifdef's on the wrong feature. For example, #ifdef on operating system for a CPU feature
Sure, same thing happens when you decide whether to compile in your stubs or not. But if I make a mistake, the code usually won't compile on the configuration in question; if you make a similar mistake, it'll compile with your stub, it might even link, but you'll get odd runtime behavior. Sounds worse.
> Overly complex #if expressions, that steadily get worse
All code gets worse over time and bit rots if you don't maintain it; I don't see things as necessarily worse in this area.
> Fixing support for X that subtly breaks F and Q support. It goes unnoticed because your build/test machines are not F and Q
If you aren't testing on tier one configurations, they shouldn't be tier one. And if a tier three configuration breaks when you do finally get around to testing it, it's tier three, so it gets fixed when it can be and there's no problem. Similar to normal code, I'd say.
> People having no idea what #defines to set on the command line, nor what the ones that are set are doing (if anything)
> People having no idea how to fold in support for new configurations
These problems exist for normal code as well. Someone has to own the build configurations, and they have to be maintained, like anything else.
> Code will #ifdef on predefined macros that have little or no documentation on what they're for or under exactly what circumstances they are set. If the writer even bothered to research that. gcc predefines what, 400 macros?
We only use a handful, and they are fairly well documented. But even if they weren't, since the code works on all tier one platforms, we'd know if something was changed out from underneath us because it was underspecified or accidentally relied upon.
But again the same can happen with normal code and system headers, compiler extensions, etc - I don't see why pre-processor macros have to be singled out here.
> Jyust yew wite, 'enry 'iggins, jyust yew wite!
How long?