Cost of enum-to-string: C++26 reflection vs. the old ways (vittorioromeo.com)

96 points by sagacity 27 days ago | 179 comments

vanderZwan 27 days ago |

> The header is the cost. Not the reflection. The reflection algorithm is fast – asymptotically ~0.07 ms per enumerator, essentially the same as the hand-rolled switch in the X-macro version (~0.06 ms). What makes reflection look expensive is <meta>: just including it costs ~155 ms per TU over the baseline.

So speaking of old ways, I'm not a C++ dev, but a while ago I saw someone comment that they still organize their C++ projects using tips from John Lakos' Large-Scale C++ Software Design from 1997, and that their compile times are incredibly fast. So I decided to find a digital copy on the high seas and read it out of historical curiosity. While I didn't finish it, one wild thing stood out to me: he advised using redundant external include guards around every include, e.g.

     #ifndef INCLUDED_MATH
     #include <cmath>
     #define INCLUDED_MATH
     #endif

The reason being that (in 1997) every include required the preprocessor to open the file just to check for an include guard and then read it all the way to the end to find the closing #endif, potentially causing O(N^2) disk read overhead (if anyone feels like verifying this, it's explained on pages 85 to 87).
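To make the mechanics concrete, here is a minimal single-file sketch of the two kinds of guard. The header name `point.h`, the macro `INCLUDED_POINT`, and the helper `manhattan` are hypothetical, and the header body is inlined so the snippet compiles on its own. It assumes the header defines the guard macro itself, which is the usual convention.

```cpp
#include <cassert>

// --- What a hypothetical header "point.h" would contain ---------------------
// Internal guard: protects the header body against double inclusion, but a
// naive preprocessor still has to open the file to discover the guard.
#ifndef INCLUDED_POINT
#define INCLUDED_POINT
struct Point {
    int x;
    int y;
};
#endif

// --- What a client file would contain ---------------------------------------
// Redundant external guard: INCLUDED_POINT is already defined above, so the
// preprocessor skips this group and never tries to open "point.h".
#ifndef INCLUDED_POINT
#include "point.h"
#endif

// Small helper (hypothetical) so the sketch has something to check.
constexpr int manhattan(Point p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

Note that the snippet compiles even though no `point.h` exists on disk: the external guard means the `#include` line is skipped before any file lookup happens, which is exactly the disk access the advice was trying to avoid.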

Again, that was in 1997. I have no idea what mitigations for this problem exist in compilers by now, but I hope at least a few, right?

This conclusion is making me wonder if following that advice still would have a positive impact on compile times today after all though. Surely not, right? Can anyone more knowledgeable about this comment on that?

SuperV1234 27 days ago | |

This cost is not significant nowadays; what dominates is the frontend/parsing time.

You can also use `#pragma once` which works everywhere, is nicer, and technically needs less work by the compiler, but compilers have optimized for include guards since a long time ago.

Some random measurements I found: https://github.com/Return-To-The-Roots/s25client/issues/1073

vanderZwan 27 days ago | | |

Yes, I've heard that before, but comments like this one in your linked issue still make me wonder:

> at least for gcc and Visual Studio using #pragma once has a significant impact. The fact is, the compiler does not need to continue parsing the whole file when reaching a #pragma once. otherwise the compiler always needs to do it even if the include guard afterwards will avoid double processing of the content afterwards.

As written, the explanations for these optimizations suggest that both "pragma once" and the include guard optimization still require opening and closing the file each time an include is encountered, even if you bail after parsing the first line. Is that overhead zero? Or are the optimizations explained poorly and is repeatedly opening/closing the file also avoided?

Either way, do you know what causes the slowdown as a result of including <meta>?

daemin 27 days ago | |

What I found (so far on MSVC) is that #pragma once does only process the file once, whereas include guards still open the file each time it is included. It takes almost no time to do so, but it still appears on the traces.

I'm going to experiment with other compilers and figure out how they handle it.

schaefer 25 days ago | |

>...from John Lakos' Large-scale C++ software design from 1997...

I'll just point out that Lakos updated his work with a new edition in 2019:

Large-Scale C++ Volume I: Process and Architecture

and there's scattered evidence that Volume II might be published in Feb. 2027 [1]

Large-Scale C++ Volume II: Design and Implementation

[1]: https://www.amazon.co.uk/Large-Scale-Implementation-Addison-...

vanderZwan 25 days ago | | |

Oh nice, thanks for the tip! Don't know if I can justify picking up a copy given that I do not work with C++ at all nor with large-scale systems. But I know a few people who might be interested.

MaxBarraclough 26 days ago | |

I've not been diligently keeping up with C++ recently but there's a C++20 feature called modules. Per Wikipedia, they're somewhat like precompiled headers.

https://en.wikipedia.org/wiki/Modules_(C%2B%2B)

sagacity 27 days ago |

Oof, that first example (the idiomatic C++26 way) looks so foreign if you're mostly used to C++11.

jsd1982 27 days ago |

I think the conclusion section should indicate that its claims are based entirely on GCC 16's behavior and current implementation. We should avoid generalizing one compiler's behavior and performance. Curious how this same test would behave once clang ships C++26 reflection.

SuperV1234 27 days ago | |

I explicitly mentioned in the benchmarking section that GCC 16.1 was the compiler used. Do you think I need to add a disclaimer in the conclusion section as well?

Regardless, I don't think things are going to differ much with Clang. Without PCH/modules, standard header inclusion is still the "slow part" of C++ compilation, regardless of the compiler used and the standard library used (libstdc++ vs libc++). `#include` is fundamentally the same on any modern compiler.

Because the reflection feature itself seems quite fast on GCC (compared to the cost of the header), I predict the results will be similar on Clang as well.

bluGill 27 days ago | |

I was thinking the same thing. Modules are still not widely used, it is a reasonable guess that there are a lot of optimization opportunities left.

SuperV1234 27 days ago | | |

That is true, but on the other hand Modules were standardized more than 6 years ago.

Promises and claims have been made for longer than that on how Modules would improve compilation times and make everyone's lives easier. In 2026, I have yet to see any real evidence of that, especially when PCH + unity builds are much easier to use (except on damn Bazel, which supports neither) and deliver great results.

If after 6+ years of development Modules are still so far behind, it is fair to question if the problem is with the design/implementability of the feature itself.

pjmlp 26 days ago | |

Or VC++, if ever. It has the best modules support, but it is still trailing behind in C++23.

w4rh4wk5 27 days ago |

I've been wondering about the debuggability of code using reflection. X-Macros are quite annoying to step through in most debuggers, though it is possible. While the code in the first example is evaluated fully at compile-time, how would you approach debugging it?

HarHarVeryFunny 27 days ago |

No doubt reflection has been built with other use cases in mind, but it sure would have been nice just to have std::to_string(enum)

bluGill 27 days ago | |

C++ conference speakers (including keynotes) are now begging everyone to stop using enum to string as their example. While it is a simple and easy to understand example, reflection is for much more interesting problems. I can't think of any other example that I would type into a comment box or put on a slide.

maccard 27 days ago | | |

Serialization is the canonical example. Being able to turn

    struct MyStruct {
      int val = 42;
      string name = "my name";
    };

into

    {
      "val": 42, // if JSON had integers, and comments of course
      "name": "my name",
    }

is incredibly powerful. If reflection supported attributes (I can't believe it shipped without, honestly), then you could also mark members as [[ignore]] and skip them.

cogman10 27 days ago | | |

It comes up pretty frequently in Java: serialization/deserialization, adding capabilities based on type, adding new capabilities to a type, general tuning (for example, adding a timing or logging call onto methods).

Almost all the Java web frameworks are giant balls of reflection. Name a function the right way or add the right magic annotation and the framework will autowire it correctly.

It's a pretty powerful tool. (IDK if C++'s reflection is as capable, but this is what was enabled by Java's reflection).

surajrmal 27 days ago | | |

Any of the derive traits Rust has are a good demo.

theICEBeardk 27 days ago | | |

I mean a readable implementation of tuple with minimal overhead is a great case for me (went from around 1.6k lines to approximately 250 lines). I wrote an implementation, including the normally difficult-to-implement tuple_cat, based on C++26 within a few hours.

My favorite thing is that I will get to remove and replace most of the cryptic template recursion stuff I have with "template for" and maybe a bit of reflection. Debugging the unrolled stuff will be a joy in comparison.

randusername 27 days ago |

I can't imagine myself using reflection much, but maybe it will eliminate a lot of feature proposals bogging down the committee and they can focus on harder problems.

It would be cool if the stated goal of C++29 was compile times.

w4rh4wk5 27 days ago | |

I'd argue reflection is very much a feature for libraries. You wouldn't use it directly, but your JSON / YAML serializer is then built on top of it. So are your bindings for scripting engines like Lua.

SuperV1234 27 days ago | | |

You can already automatically serialize/deserialize arbitrarily nested structs since C++17 (using Boost.PFR). Since C++20, you can also serialize/deserialize the struct data member names automatically.
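For readers wondering how this is possible without language-level reflection, here is a minimal C++17 sketch of the general idea: count an aggregate's fields by testing how many initializers it accepts, then unpack them with structured bindings. This is not Boost.PFR's actual code or API; `AnyField`, `count_fields`, `for_each_field`, `to_value_list`, and `MyStruct` are all hypothetical names, the sketch stops at three fields, and it only yields values (not member names).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Converts to anything; only ever used inside decltype, so it needs no body.
struct AnyField {
    template <class T>
    operator T() const;
};

// Fallback: chosen once T cannot be brace-initialized with one more field.
template <class T, class... Args>
constexpr std::size_t count_fields(long) {
    return sizeof...(Args);
}

// Preferred overload: viable only while T{Args..., AnyField} is well-formed.
template <class T, class... Args>
constexpr auto count_fields(int)
    -> decltype(T{Args{}..., AnyField{}}, std::size_t{}) {
    return count_fields<T, Args..., AnyField>(0);
}

// Visit every field via structured bindings (this sketch stops at 3 fields).
template <class T, class F>
void for_each_field(const T& obj, F&& visit) {
    constexpr std::size_t n = count_fields<T>(0);
    static_assert(n >= 1 && n <= 3, "sketch only handles 1 to 3 fields");
    if constexpr (n == 1) {
        const auto& [a] = obj;
        visit(a);
    } else if constexpr (n == 2) {
        const auto& [a, b] = obj;
        visit(a); visit(b);
    } else {
        const auto& [a, b, c] = obj;
        visit(a); visit(b); visit(c);
    }
}

// Hypothetical aggregate; simple field types keep the counting trick unambiguous.
struct MyStruct {
    int val = 42;
    double ratio = 0.5;
    const char* name = "my name";
};

// Serialize the field values, in declaration order, as a bracketed list.
template <class T>
std::string to_value_list(const T& obj) {
    std::ostringstream out;
    bool first = true;
    for_each_field(obj, [&](const auto& field) {
        out << (first ? "[" : ", ") << field;
        first = false;
    });
    out << "]";
    return out.str();
}
```

Adding or removing a field in `MyStruct` changes the output without touching `to_value_list`, which is the property being discussed. A production library has to cover many more arities and corner cases than this sketch does.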

For many useful use cases, you don't need C++26 reflection at all. E.g. https://www.linkedin.com/posts/vittorioromeo_cpp-gamedev-ref...

bluGill 27 days ago | | |

There are a lot of things that are very very important for a tiny niche. In any non-trivial project you will end up with a lot of custom libraries and some of them really benefit from some obscure feature that no place else in your project would want.

agentultra 27 days ago | | |

Also nice for UI tooling: game tools, debuggers, etc. Being able to pull apart a struct and display it on screen without having to patch the UI tool every time you change the struct is pretty nice.

gpderetta 25 days ago |

I don't particularly mind the ^^ and [::] sigils, but the 'template for (constexpr auto ...)' is a bit ugly and hard to explain to a beginner.

But interestingly the code can be improved. The issue is that meta::info[1] is a pure compile time object so in the original code we need to statically unroll the loop of the vector that contains it so that we can splice it in in the loop body. But if we convert it to our own objects, then we can use a plain for loop.

  template<class T>
  constexpr static inline auto reflect_type = ^^T; // not really necessary

  template <typename T>
    requires std::is_enum_v<T>
  constexpr std::string_view to_enum_string(T val)
  {
     struct my_string_view { const char * ptr; size_t sz = strlen(ptr); }; 
     static constexpr auto meta = std::define_static_array(
          std::meta::enumerators_of(reflect_type<T>)
        | std::ranges::views::transform(
        [](auto e) { 
          return std::pair{my_string_view{define_static_string(std::meta::identifier_of(e))}, extract<T>(e)}; 
        }));

     for (auto [name, value] : meta)        
     {
        if (val == value) { return std::string_view{name.ptr, name.sz}; }
     }
     return "<unknown>";
  }

This actually generates less code bloat since, if the array is large, it will use a plain loop instead of always unrolling. Also, the meta array can now be used as a lookup table for dense enums, which I don't think is doable with the original version. Supposedly GCC should be able to convert an if chain into a switch statement, but it doesn't seem to trigger here [edit: scratch that: GCC does the switch conversion for the original version].

define_static{_array,_string} still feel like unnecessary magic, but hopefully they are only transient and we will be able to use std::vectors directly. Also, somehow GCC doesn't let me use std::string_view and I had to introduce a helper string type.

edit: I literally learned everything I know about static reflection in the last 24 hours. It is complicated, but not that complicated.

[1] Not sure why, I suspect they want to avoid being constrained by ABI.

cv5005 27 days ago |

Never quite understood why people are so obsessed with meta programming capabilities in a language, be it templates, comptime, macros, whatever.

I program mostly in C, if I need 'meta' programming I just write another C program that processes C source code (I've written a simple C parser), then in my build script I build in two stages, build meta program, run it, build rest of program.

Simple, effective, debuggable (the meta program is just normal C), infinite capabilities - can nest this to arbitrary depths, need meta-meta programming? Make a program that generates a meta program.
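As a rough illustration of the two-stage idea applied to this thread's enum-to-string example, here is a minimal generator sketch. It is written in C++ rather than C to match the rest of the thread, the function name `generate_to_string` is hypothetical, and the enumerator list is passed in directly where a real meta program would get it from parsing the source.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical generator: given an enum's name and its enumerators, return the
// C++ source text of a to_string function for that enum.
std::string generate_to_string(const std::string& enum_name,
                               const std::vector<std::string>& enumerators) {
    std::string out;
    out += "const char* to_string(" + enum_name + " value) {\n";
    out += "    switch (value) {\n";
    for (const std::string& e : enumerators) {
        // One case per enumerator, returning its spelling as a string literal.
        out += "    case " + enum_name + "::" + e + ": return \"" + e + "\";\n";
    }
    out += "    }\n";
    out += "    return \"<unknown>\";\n";
    out += "}\n";
    return out;
}
```

In a two-stage build, stage one would compile and run a program like this and write the result to a generated file; stage two would compile the rest of the project with that file included. The generated code is ordinary source, so it can be read and stepped through like anything else.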

miguel_martin 27 days ago |

I agree with some others in this thread: this example is not great, but I get why it was used: to compare with X-macros. How about something that would require code-generation e.g. via libclang?

For example, what does https://miguelmartin.com/blog/nim2-review#implementing-a-sim... look like with C++26's std::meta::info?

My guess is: libclang is more suited for this situation if you care about compile times, even if Python is used.

Panzerschrek 26 days ago |

It's misleading to call it "cost". In the C++ world only runtime cost matters. If using reflection allows generating faster code, it doesn't matter how long it takes to compile.

Moldoteck 26 days ago | |

Our company doesn't compile on push on the server. It only does so when approved by a subset of people. The reason is we have a limited number of servers and a compile takes about 40 min per variation. It's very annoying considering that at my previous job a compile took about 10 min in total (the project was better organized, plus better servers) and there wasn't a limit at all, so we compiled on each push to Gerrit.

I'm now trying to migrate from msbuild to cmake+sccache+PCH for std libraries while also trimming unnecessary includes to reduce suffering in the future, if not for me then at least for future developers. So I would say compile time is important for development. It causes other limitations too (bug fixing turns into one huge commit with several fixes squashed together to avoid recompiles, which messes up git history, and context switching gets slower when developing several features in parallel).

pjmlp 26 days ago | |

It has a direct impact on the amount of emails and slack messages I get to reply to.

SuperV1234 26 days ago | |

Utter BS. Compilation times matter for productivity, developer motivation, iteration speed, CI turnaround time, and so on.

I'm sure you wouldn't say "it doesn't matter how long it takes to compile" if it took days. So where do you draw the line? Regardless, it matters.

Panzerschrek 26 days ago | | |

Even days of compilation may be an acceptable price for good optimization, as long as debug builds or builds with minimal optimizations are fast enough.

mentos 27 days ago |

Curious to see if Epic Games ever refactors their reflection in Unreal Engine to use C++26 reflection or not.

LugosFergus 27 days ago | |

That'll never happen. The engine's entire serialization system is built around their custom reflection layer and UHT. Not to mention how this would affect licensees. PLUS, they just laid off a bunch of people, and the leftovers are focused on Tim's Verse fiasco. I hate to use jargon here, but there's no "business value" to switching.

EDIT: and based on these compilation time results, this would be a major setback for building the engine, which already takes an eternity.

mentos 26 days ago | | |

Yea from my discussion/research with ChatGPT it seems compilation times would suffer.

drzaiusx11 26 days ago |

No surprise here that the macro + char* approach wins hands down. I'm not really an active C++ user but I did use a VERY similar trick in my custom C code generator DSL (written in Ruby) just this week. Easy and no "magic" involved.

dataflow 27 days ago |

I don't see how a library like Enchantum could handle everything reflection does. (How) does it figure out duplicate enum values, for example? And (how) does it discover arbitrarily large, discontiguous ranges? And (how) does it do these on MSVC?

SuperV1234 27 days ago | |

In short, it probes enum values in a pre-defined range (e.g. [-256; 256]), and parses the `__PRETTY_FUNCTION__` macro at compile-time to extract the name of the enumerator.
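A minimal sketch of that probing technique, assuming GCC or Clang (MSVC would need `__FUNCSIG__` and different parsing). This is not Enchantum's or magic_enum's actual code; `enumerator_name`, `probe_names`, `probed_to_string`, the `Color` enum, and the small [-8; 8] range are all hypothetical, and the string parsing relies on how these two compilers happen to format `__PRETTY_FUNCTION__`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

enum class Color { Red, Green, Blue };

// Extract the enumerator's spelling from the compiler-generated signature.
// For a named value the signature contains e.g. "V = Color::Red"; for a value
// with no enumerator it contains a cast such as "V = (Color)5".
template <auto V>
constexpr std::string_view enumerator_name() {
    std::string_view s = __PRETTY_FUNCTION__;
    const auto start = s.find("V = ");
    if (start == std::string_view::npos) return {};
    s.remove_prefix(start + 4);
    s = s.substr(0, s.find_first_of(";]"));
    if (s.empty() || s.front() == '(') return {};  // no enumerator here
    const auto colons = s.rfind("::");
    if (colons != std::string_view::npos) s.remove_prefix(colons + 2);
    return s;
}

// Probe every value in [Min, Min + sizeof...(I)) at compile time.
template <class E, int Min, std::size_t... I>
constexpr std::array<std::string_view, sizeof...(I)>
probe_names(std::index_sequence<I...>) {
    return {{enumerator_name<static_cast<E>(Min + static_cast<int>(I))>()...}};
}

// Look up the name of `value`; values outside the probed range are unknown.
template <class E, int Min = -8, int Max = 8>
constexpr std::string_view probed_to_string(E value) {
    constexpr auto table = probe_names<E, Min>(
        std::make_index_sequence<static_cast<std::size_t>(Max - Min + 1)>{});
    const int v = static_cast<int>(value);
    if (v < Min || v > Max) return "<unknown>";
    const std::string_view name = table[static_cast<std::size_t>(v - Min)];
    return name.empty() ? std::string_view{"<unknown>"} : name;
}
```

The sketch also shows where the limitations come from: every value in the range costs a template instantiation whether or not an enumerator exists there, and anything outside the range is invisible.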

Once you have that in place, you can easily detect duplicates, etc...

Of course, there are major limitations, as it's all a big hack: https://github.com/ZXShady/enchantum/blob/main/docs/limitati...

Similarly interesting is Boost.PFR, which gives you reflection superpowers since C++14: https://github.com/boostorg/pfr

psyclobe 25 days ago |

I made a magic_enum abi stub for this: https://github.com/psyclobe/magic-enum-reflect

king_geedorah 27 days ago |

Another win for X macros and for C style in general, though the author didn’t declare it as such.

SuperV1234 27 days ago | |

Author here. It isn't a clear "win" at all, there are tradeoffs to each approach.

spacechild1 27 days ago | |

The downside is, of course, that it's ugly and very awkward to use.

That's the essence of C++: you're basically trading ergonomics for compile times.

uecker 27 days ago | | |

Are X macros awkward? I find them very straightforward and clear.
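For anyone who hasn't seen the pattern, a minimal sketch of an X-macro enum-to-string looks roughly like this. The names `COLOR_LIST` and `Color` are hypothetical and not taken from the article's benchmark code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enumerator list: each X(...) entry names one enumerator.
#define COLOR_LIST(X) \
    X(Red)            \
    X(Green)          \
    X(Blue)

// First expansion: declare the enum itself.
enum class Color {
#define X(name) name,
    COLOR_LIST(X)
#undef X
};

// Second expansion: generate one switch case per enumerator.
constexpr const char* to_string(Color c) {
    switch (c) {
#define X(name) case Color::name: return #name;
        COLOR_LIST(X)
#undef X
    }
    return "<unknown>";
}
```

The list is written once and expanded twice, so adding an enumerator means touching a single line. The tradeoff is that the enum declaration is no longer a plain `enum` that tools and readers can see directly.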

psyclobe 27 days ago | |

Pretty much. Was hoping it would've been a 'reflection slam dunk' but no... same 'ol same 'ol.

psyclobe 27 days ago |

Man, that sucks. I was looking forward to some kind of speed improvement. Using magic enum atm and I guess we'll continue to do so.

C++ build times are a hard pill to swallow when migrating from C. This is just another reason we'll probably stick to writing C at the company where I work. It's like asking someone to give up instant compilation for cleaner, easier to read apps?

Also now that we have cleanup handlers in C (destructors) even less of a reason to move...

TZubiri 27 days ago |

"Enum to string"

We've come full circle huh?

Why do you need this, logging? In that case I would rather reflect the logging statement to print any variable name, or hell, just write out the string.

If saving for db, maybe store as string, there's more incentive for an enum in the db, if that's a string you might as well. At any rate it doesn't seem a great idea to depend on a variable name, imagine changing a variable name and stuff breaks.

SuperV1234 27 days ago | |

Logging, debugging, auto-generation of UIs/editors, etc... This is an extremely common operation and for a good reason.