C++ Exceptions: Under the Hood (2013) (monkeywritescode.blogspot.com)

115 points by arcatek 4 years ago | 111 comments

Working with and implementing C++ exceptions for 30 years now, including implementing exception handling for Windows, DOS extenders, and Posix (all very different), and then re-implementing them for D, I have sadly come to the conclusion that exceptions are a giant mistake.

1. they are very hard to understand all the way down

2. they are largely undocumented in how they're implemented

3. they are slow when thrown

4. they are slow when not thrown

5. it is hard to write exception-safe code

6. very few understand how to write exception-safe code

7. there is no such thing as zero-cost exception handling

8. optimizers just give up trying to do flow analysis in try-exception blocks

9. consider double-fault exceptions - there's something that shows how rotten it is

10. has anyone yet found a legitimate use for throwing an `int`?

I have quit using exceptions in my own code, making everything 'nothrow'. I regret propagating exception handling into D. Constructors that may throw are an abomination. Destructors that throw are even worse.

WalterBright 4 years ago | |

P.S. I implemented Structured Exception Handling for Win32, with help from a couple very smart people. Microsoft completely changed it for Win64, and the documentation on it is completely unhelpful to nonexistent.[1] I simply gave up on it. D on Win64 uses the exception handling mechanism I invented for the 32 bit DOS extender from the old Zortech days. It works fine, except that it cannot interact with C++ exceptions thrown by VC++ code.

[1] I attended a presentation on it by MS soon after Win64 came out. All I could think of at the end was "what's a cubit". I understood exactly nothing about it.

Iwan-Zotow 4 years ago | | |

Ah, Zortech C++. That indeed brings back some memories.

Walter, what do you think about Herb Sutter's "new" C++ exceptions (which are basically about throwing an int/a word)?

grandinj 4 years ago | |

On the other hand

(*) they solve a handful of use-cases really really well.

(*) if you are writing relatively decent C++, most code is pretty much exception safe already

(*) lots of abstractions are dangerous when mis-used

(*) they are sufficiently low cost that they almost never show up in the fairly extensive perf profiling I do on a large real-world application (LibreOffice). And LibreOffice throws exceptions __a lot__

(*) Except for toolchain writers, nobody cares how they are implemented

tialaramex 4 years ago | | |

> LibreOffice throws exceptions __a lot__

They're terrible for this. C++ Exceptions are a perfectly good exception mechanism, but what C++ programmers are trained to do with them isn't exceptions but error handling and they're not suitable for that.

"I tried to create the file but it already existed" is an error - and now you're going to write the unhappy path code, this is the wrong place to have exceptions.

"I tried to create the file but the OS now says the abstract concept of files is alien to it" is an exception. You are not prepared for this eventuality, the best you can do is explain to the user as best possible what happened and hope a human knows what to do.

dataflow 4 years ago | |

Don't you need exceptions though? How do you terminate arbitrary operations without exceptions?

Like say you call an algorithm (like std::sort) and during a callback (e.g. in the comparator) you decide to cancel the operation (perhaps user-requested). With exceptions it's easy; you just throw an exception and then catch it. No need to touch or even know the intermediate callers. But without exceptions what do you do? You have to go modify or reimplement the source code of every intermediate function, which is a giant waste of effort at best, and in reality a likely vector for introducing code duplication, brittleness, and bugs.
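The cancellation pattern described above can be sketched roughly as follows. This is only an illustration: `SortCancelled`, `sort_with_budget`, and the comparison budget are hypothetical names standing in for whatever "the user asked to cancel" would mean in a real program.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical tag type used only to signal cancellation.
struct SortCancelled {};

// Sorts `values`, but gives up once the comparator has been called more than
// `max_comparisons` times. Returns true if the sort ran to completion.
bool sort_with_budget(std::vector<int>& values, long max_comparisons) {
    assert(max_comparisons >= 0);
    long comparisons = 0;
    try {
        std::sort(values.begin(), values.end(), [&](int a, int b) {
            if (++comparisons > max_comparisons) {
                throw SortCancelled{};  // unwinds straight through std::sort
            }
            return a < b;
        });
        return true;
    } catch (const SortCancelled&) {
        // The contents of `values` should be treated as unspecified here.
        return false;
    }
}
```

Note that nothing inside `std::sort` had to change, which is the convenience being argued for. The price is the comment in the catch block: the caller cannot assume much about the half-sorted range.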

ilammy 4 years ago | | |

But with exceptions what do you do? You have to go modify or reimplement the source code of every intermediate function to be correct and safe when an exception is thrown at every point where it can be thrown, which is a giant waste of effort at best, and in reality a likely vector for introducing code duplication, brittleness, and bugs.

The point is, retrofitting exceptions onto existing codebase is a lot of pain.

Interruptible functions have the API they have because they have been designed with exceptions in mind for interruptions. If there were no exceptions, the callbacks would have had a different API. A special return value could be used to signal an interruption.
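As a rough sketch of what such a "special return value" API might look like, assuming a hypothetical `Visit` enum and `for_each_until` helper (neither comes from any real library):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical callback protocol: the visitor says whether to keep going.
enum class Visit { Continue, Stop };

// Calls `visit` on each element until it returns Visit::Stop.
// Returns the number of elements that were visited.
template <typename Visitor>
std::size_t for_each_until(const std::vector<int>& items, Visitor visit) {
    std::size_t visited = 0;
    for (int item : items) {
        ++visited;
        if (visit(item) == Visit::Stop) {
            break;  // the interruption is an ordinary return path
        }
    }
    return visited;
}
```

Here the interruption is part of the function's contract, so every intermediate caller has to forward it explicitly. That is exactly the trade-off being debated in this subthread.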

ribit 4 years ago | | |

I agree that a modern high-level programming language needs an ergonomic error model. But C++ exceptions are not the only way to go. You can have an error model whose ergonomics are similar to (or even better than) those of C++ exceptions while not having any of the drawbacks (like an extremely complicated runtime stack, slow exception handling, messed up control flow etc.). Basically, in my personal opinion, any error handling that involves automatic stack unwinding is a failure.

tux3 4 years ago | |

What do you think of languages that use sum types for error handling, but can still unwind in a few scenarios?

Reasonable compromise, or should we get rid of all unwinding always? And if so, do we abort() or do we ask users to handle any and all possible errors?

AndyKelley 4 years ago | | |

Now that I've been using Zig's error handling language primitives for some time, I've come to realize what the paradigm really is: a way to encode a "forwards" and a "backwards" at the same time, for the same block of code.

The usual way control flow progresses is forwards, but when an error occurs, it goes backwards, over the defers and errdefers.

C++ exception handling and other languages with destructors force you to do this declaratively, but then don't give enough control over exactly the situations they matter in: setup and teardown.

Meanwhile with explicit control, you just encode exactly what happens in the "backwards" control flow. No surprises, no trying to figure out what happens based on declarative rules. Once I figured this out, I was able to use it to simplify the logic of some things in the self-hosted compiler that are extremely error prone in the C++ implementation:

* Lazy source locations: passing in "none" for a source location, and then handling the "error.SourceLocationNeeded" and then doing the expensive calculations to find the source locations before retrying the operation.

* Generic instantiations: returning "error.GenericPoison" for when a type parameter cannot be determined without information from the callsite. In this case, the analysis is cleanly aborted and function marked as generic.

I'm pleased with how this turned out, and I've started to think of other languages in terms of how they map to this "forwards" and "backwards" control flow concept.

WalterBright 4 years ago | | |

I've read the proposals for it. It certainly looks good, yet exception handling looked good 30 years ago, too. I haven't used sum types myself, and often it takes years to discern whether things are really good ideas or not.

What I personally use is the "poisoning" technique. This involves marking an object as being in an error state, much like a floating point value can be in a NaN state. Any operation on a poisoned object produces another poisoned object, until eventually this is dealt with at some point in the program.

I've had satisfactory success with this technique. It does have a lot of parallels with the sum type method.
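A minimal sketch of the poisoning idea, written in C++ for illustration. The `Checked` type and its helpers are hypothetical and are not taken from any of the commenter's code. They only show how an error state can flow through ordinary operations the way a NaN does.

```cpp
#include <cassert>

// Hypothetical value type: an int that can be "poisoned", loosely analogous
// to a floating-point NaN.
struct Checked {
    int value;
    bool poisoned;
};

Checked ok(int v) { return {v, false}; }
Checked make_poisoned() { return {0, true}; }

// Division poisons its result on a zero divisor or a poisoned operand.
// (Integer overflow is ignored in this sketch.)
Checked divide(Checked a, Checked b) {
    if (a.poisoned || b.poisoned || b.value == 0) return make_poisoned();
    return ok(a.value / b.value);
}

// Any operation on a poisoned operand yields another poisoned value.
Checked add(Checked a, Checked b) {
    if (a.poisoned || b.poisoned) return make_poisoned();
    return ok(a.value + b.value);
}
```

With this shape, `add(divide(ok(10), ok(0)), ok(5))` comes back poisoned, and the caller checks the flag once at the end instead of after every step.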

mbrubeck 4 years ago | | |

Rust is one such language. I wrote a bit about exception safety in Rust here: https://users.rust-lang.org/t/c-pitfalls-hard-to-avoid-that-...

In short, while the problem is mitigated somewhat compared to C++, it's still one of the most common causes of bugs in unsafe Rust code.

Rust programs can choose to abort on all panics, rather than unwind. Firefox does this, for example.

otabdeveloper4 4 years ago | |

Forcing the programmer to manually write stack unwinding code is not a solution.

That's like saying "garbage collection is slow and complex, just use malloc() and free() instead".

jstimpfle 4 years ago | | |

Nobody said to manually write stack unwinding code or to only use malloc() and pair each of them with free().

There are other very good solutions that involve explicit structure. For example

- do not free things at all, just reserve a big chunk of address space and let the OS populate it as needed. When the process quits the OS frees everything automatically.

- do the same thing for parts of the program but implement the "OS part" in the program itself. There are variations of this known by terms such as "memory arena" or "pools". Basically, just take care to group allocations by end of lifetime. Then you can free everything in one go without tracking each lifetime individually in a stack frame (which is insane).
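For readers who have not seen the second option, here is a minimal sketch of a fixed-size bump arena. The `Arena` class is hypothetical and deliberately simplified: it never runs destructors, so it only suits trivially destructible data, and it assumes alignments no stricter than `alignof(std::max_align_t)`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical fixed-capacity bump allocator.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns `size` bytes aligned to `align` (a power of two),
    // or nullptr if the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return buffer_.get() + start;
    }

    // Ends the lifetime of every allocation in one go.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

Everything allocated from one `Arena` shares an end of lifetime, so cleanup is a single `reset()` (or the arena's own destruction) rather than one `free()` per object.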

blub 4 years ago | |

Those look like problems for compiler implementers (a tiny subset of users) or those writing code with very tight performance requirements (a large amount of C++ code has no such requirements). In spite of the reasons given, exceptions are successfully used (in C++ too) for error handling, because they can be much nicer than shuffling error codes/result types up the stack.

Really, as an end-user the issue with exceptions in C++ is another:

a) it's impossible to figure out what throws by looking at code.

b) it's (nearly) impossible to ensure that something doesn't throw

This means on one hand that one has to assume that any code can throw and manage resources appropriately, which is by now known and there are well-established idioms around it. On the other hand though it also means that the silliest error from a tiny library can bubble up into the event loop/main function and terminate an application.

Swift's syntax for exceptions illustrates what I mean, even though Swift does not unwind the stack.

neeeeees 4 years ago | |

> they are slow when not thrown

I thought the not-thrown case was pretty much zero-cost, at least in newer versions of Clang... How much of a slow down are we talking here?

WalterBright 4 years ago | | |

The main reason is the optimizer abandons trying to figure out flow-of-control when half the expressions can throw and present a path to the catch blocks. Furthermore, this all inhibits en-registering variables, because exception unwinding doesn't restore registers.

If you want your code to be fast, use 'nothrow' everywhere.

I don't know about newer versions of Clang, but I recall Chandler Carruth mentioning that LLVM abandons much optimization across EH blocks as infeasible.

WalterBright 4 years ago | |

At least with D a thrown exception must be a subtype of `Throwable`, i.e. you can't throw an `int`. Oh, and then there is the incredibly confusing throwing of an object that itself can throw.

minipci1321 4 years ago | |

> 10. has anyone yet found a legitimate use for throwing an `int`?

I use that a lot in constexpr computations -- to stop the compilation, I usually do 'throw __LINE__'.

-- Using a more complex type is not warranted -- there is no catching end in constexpr.

-- And in case the same routine ends up called non-constexpr, it will be easy to identify the place that called 'throw' -- line numbers are unique without additional effort. Just don't put two throws on the same line.
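A small sketch of this trick, assuming C++14 or later. The function names are made up for illustration.

```cpp
#include <cassert>

// Hypothetical constexpr helper. Reaching the `throw` during constant
// evaluation makes the program ill-formed, so compilation stops there.
constexpr int checked_divide(int numerator, int denominator) {
    if (denominator == 0) {
        throw __LINE__;  // an int is enough: nothing can catch it at compile time
    }
    return numerator / denominator;
}

// Fine: the throw is never evaluated.
static_assert(checked_divide(10, 2) == 5, "constant evaluation succeeds");

// Would not compile, because the throw is reached in a constant expression:
// constexpr int bad = checked_divide(1, 0);

// At run time the same function throws normally. This helper returns the
// line number carried by the exception, or 0 if nothing was thrown.
int thrown_line(int numerator, int denominator) {
    try {
        checked_divide(numerator, denominator);
        return 0;
    } catch (int line) {
        return line;
    }
}
```

The `catch (int line)` handler also shows the objection raised in the reply below: it would just as happily catch an `int` thrown by unrelated code for an unrelated reason.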

WalterBright 4 years ago | | |

That works until you incorporate code that throws errno.

omegalulw 4 years ago | |

+1 plain old return error codes and the related modern status codes are the way to go. Lots of people say this is more work. I would say your comment does a good job explaining why that work is immensely useful.

josefx 4 years ago | | |

> plain old return error codes and the related modern status codes are the way to go

Why yes I just love to get a "Error one of the billion files this application tried to load wasn't available ErrorCode: ERR_MISSING_FILE_FUCK_WHO_KNOWS_WHICH". What I like about exceptions is that they make information that can't be encoded in a 32 bit integer value available to top level error handlers.
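To make that concrete, here is a sketch of an exception type that carries the offending path up to a top-level handler. `MissingFileError`, `load_file`, and `load_all` are hypothetical names, and the "loader" is a stand-in that never touches the filesystem.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exception type that records which file was missing.
class MissingFileError : public std::runtime_error {
public:
    explicit MissingFileError(std::string path)
        : std::runtime_error("missing file: " + path), path_(std::move(path)) {}

    const std::string& path() const noexcept { return path_; }

private:
    std::string path_;
};

// Stand-in for a real loader: it "fails" for any path containing "missing".
void load_file(const std::string& path) {
    if (path.find("missing") != std::string::npos) {
        throw MissingFileError(path);
    }
}

// Top-level handler: returns a message for the log, or "" on success.
std::string load_all(const std::vector<std::string>& paths) {
    try {
        for (const auto& path : paths) load_file(path);
        return "";
    } catch (const MissingFileError& e) {
        return e.what();  // includes the path, not just an error number
    }
}
```

A status-code design can carry the same context if the status is a struct rather than a bare integer, so this sketch illustrates the commenter's point without settling the argument.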

stinos 4 years ago | |

> I have quit using exceptions in my own code, making everything 'nothrow'.

Assuming not all code you use is your own, how does this work in combination with other code (like the STL) which is not nothrow?

WalterBright 4 years ago | | |

Generic code (like you'd find in a library) is usually done with templates. Templates in D infer `nothrow`, giving them the advantage of being implicitly `nothrow` when their arguments are also nothrow. Inferring attributes this way is a major way D works.
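For comparison only, and not as D code: C++ templates do not infer this automatically, but a similar effect can be spelled out by hand with a conditional `noexcept`. The names below are hypothetical.

```cpp
#include <cassert>

// noexcept exactly when calling f(42) cannot throw.
template <typename F>
int call_with_42(F f) noexcept(noexcept(f(42))) {
    return f(42);
}

// A callable that promises not to throw.
struct Doubler {
    int operator()(int x) const noexcept { return 2 * x; }
};

// A callable that makes no such promise.
struct Picky {
    int operator()(int x) const {
        if (x < 0) throw x;
        return x;
    }
};

static_assert(noexcept(call_with_42(Doubler{})), "inherits noexcept from Doubler");
static_assert(!noexcept(call_with_42(Picky{})), "Picky may throw, so the wrapper may too");
```

The difference from what the comment describes is that here the author of the template has to write the condition explicitly for every function.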

renox 4 years ago | |

That's funny: I thought you were a proponent of exceptions because I read (a lot of years ago) a mail from you which said "who is going to check that printf failed"?

And this remains true: a lot of things can fail (arithmetic operations, every IO, etc.), so the error system must be very "lean", otherwise the "happy path" is drowned in the error propagation/handling code.

freesoftware 4 years ago | |

I'd suggest using Lisp, which is an underrated yet powerful language.

[Why?] https://gigamonkeys.com/book/introduction-why-lisp.html

[Exceptions] https://gigamonkeys.com/book/beyond-exception-handling-condi...

cjfd 4 years ago | |

"has anyone yet found a legitimate use for throwing an `int`?"

I could imagine throwing an int when writing a shell utility, throwing the return value of main as an int, but I suppose doing that usefully would be pretty rare. Usually one cares more about whether a shell utility succeeded than about the precise reason it failed. So, while I could imagine doing that, I don't see myself going for that option very often.

WalterBright 4 years ago | | |

Yes, but you couldn't mix that code with code that throws an `int` for other porpoises. It becomes a global straightjacket for your code.

Const-me 4 years ago | |

> 10. has anyone yet found a legitimate use for throwing an `int`?

When an integer is the only value I need in the catch handler, I sometimes throw them, but only negative integers.

https://docs.microsoft.com/en-us/openspecs/windows_protocols...

saagarjha 4 years ago | |

> has anyone yet found a legitimate use for throwing an `int`?

Not sure if you consider this legitimate, but I have seen code that throws an errno.

daenz 4 years ago | |

Can you describe your ideal error handling mechanisms? Or at least other mechanisms that feel more correct?

WalterBright 4 years ago | | |

One technique I try first is to write code that cannot fail. For example, a sort function should never fail.

Consider the case of running out of memory. One option is to pre-allocate all the memory the algorithm will need, then it can't run out of memory. Another option is to regard out-of-memory as a fatal error, not one that needs to be thrown and caught.

Another example is UTF-8 processing. Early on, I did the obvious when invalid UTF-8 sequences were discovered - throw an exception. But this got in the way of high speed string processing (exceptions, even in the happy path, are slow). But what does one do anyway with such input? abort the display of the text? Nope. The bad sequence gets replaced with the Unicode "replacement character". This turns out to be common practice, and now my UTF-8 processing code cannot fail! And it's smaller and faster, too.
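A sketch of a decoder in that spirit, written in C++ for illustration. It is not the commenter's implementation, and its resynchronisation policy (emit one U+FFFD, then retry from the next byte) is a simplifying assumption that may produce a different number of replacement characters than other decoders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr char32_t kReplacement = 0xFFFD;  // Unicode replacement character

// Decodes UTF-8 into code points. Malformed input never produces an error:
// each bad byte becomes U+FFFD. (Allocation failure aside, this cannot fail.)
std::vector<char32_t> decode_utf8_lossy(const std::string& input) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char lead = static_cast<unsigned char>(input[i]);
        std::size_t extra = 0;  // continuation bytes expected
        char32_t cp = 0;        // code point being assembled
        char32_t lowest = 0;    // smallest value allowed for this length
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; extra = 1; lowest = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; extra = 2; lowest = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; extra = 3; lowest = 0x10000;
        } else {
            out.push_back(kReplacement);  // stray continuation or invalid lead
            ++i;
            continue;
        }

        bool ok = i + extra < input.size();  // false if the sequence is truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }

        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (!ok || cp < lowest || cp > 0x10FFFF || surrogate) {
            out.push_back(kReplacement);  // overlong, out of range, or malformed
            ++i;                          // resynchronise one byte later
        } else {
            out.push_back(cp);
            i += extra + 1;
        }
    }
    return out;
}
```

The function has no error return and no `throw`, so callers have no failure path to handle.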

It's a fun challenge to figure out how to organize the program so it can't fail.

WalterBright 4 years ago | | |

I'm not going to recommend any technique I don't personally have years of experience with. Too many times a paper that makes something look great tends to have fatal flaws that only emerge years later. Sort of like WW1 strategies that sounded good but in practice produced only mud and dead bodies.

As I mentioned in another comment, I've had good success in the trenches with the poisoning technique.

gHosts 4 years ago | | |

I would go with the Erlang approach. Just die FFS. Let the process monitor restart you if you deserve to live.

jokoon 4 years ago | |

Who invented exceptions in the first place? Did they first appear in Java, encouraging C++ to imitate a Java feature?

josefx 4 years ago | | |

Software wise it seems to originate from Lisp[1].

[1]https://en.wikipedia.org/wiki/Exception_handling#History

nicolasbrailo 4 years ago |

Author here; worth noting this article was written a decade ago, and while the concepts it describes are probably still useful (or so I'm told) the text is starting to show its age. Most notably, the examples are completely broken for x86-64, as I didn't have a 64-bit processor when writing this.

cecilpl2 4 years ago |

> When the personality function doesn't know what to do it will invoke the default exception handler, meaning that in most cases throwing from a nothrow method will end up calling std::terminate.

This is an interesting tidbit that cost me a week of debugging recently - a try/catch block at the top of the call stack wasn't catching an exception.

We set up an exception handler that calls main in a try/catch block, so that any thrown exceptions can be caught, processed, and dispatched to our crash-logging system.

But destructors are marked nothrow by default. So we had a case where an object was destroyed, and about 10 levels down from its destructor some other system threw an exception, intending it to be caught by the top-level catch block.

But during stack unwinding we passed through the nothrow destructor and std::terminate got called before unwinding got to the top-level try/catch.
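The rule behind this can be checked at compile time, and one common mitigation is to keep the exception from ever leaving the destructor. The sketch below uses hypothetical names (`Session`, `flush_to_disk`, `g_cleanup_failed`) and is not the commenter's actual fix.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Stand-in for the "other system" that throws somewhere below the destructor.
void flush_to_disk() { throw std::runtime_error("disk full"); }

// Hypothetical place to record a cleanup failure for later reporting.
bool g_cleanup_failed = false;

struct Session {
    // No exception specification written, so this destructor is noexcept.
    ~Session() {
        try {
            flush_to_disk();
        } catch (const std::exception&) {
            // Handle it here: letting it escape would call std::terminate
            // before any outer catch block gets a chance to run.
            g_cleanup_failed = true;
        }
    }
};

static_assert(std::is_nothrow_destructible<Session>::value,
              "destructors are noexcept unless declared otherwise");

// Creates and destroys a Session, then reports whether cleanup failed.
bool run_session() {
    g_cleanup_failed = false;
    {
        Session session;
    }  // destructor runs here and swallows the exception
    return g_cleanup_failed;
}
```

Declaring the destructor `noexcept(false)` would let the exception propagate instead, but that has hazards of its own when the destructor runs during unwinding from another exception.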

jcelerier 4 years ago | |

How can that take a week to debug? gdb would stop at the std::terminate call in your dtor, and `catch throw` would allow you to see exactly where the exception was thrown.

cecilpl2 4 years ago | | |

Well first of all I wasn't using gdb since it's not available on the platform this code was running on.

Second, the std::terminate call doesn't get called from the dtor, it gets called from the stdc runtime in the call frame of the throw (with OS code in between). The stack isn't actually unwound at this point, it's more like the stdc runtime is walking up the call stack looking for a landing pad at each frame.

Third, I didn't know about how this all worked, so I was trying to piece it all together for the first time.

Yes, I saw the throw happen. But the symptom was then that the program just... terminated.

adzm 4 years ago |

Implementation in Windows is quite different, especially x86 vs x86-64, which has much less overhead. It also gets quite complicated with the Windows built-in Structured Exceptions/asynchronous exceptions, which can be translated into C++ exceptions -- and there is even more complexity due to the different compiler options that handle these!

AshamedCaptain 4 years ago |

[2013]ish if I remember.

Also note the ABI for C++ exceptions followed by G++ et al is actually documented as part of the Itanium C++ ABI:

https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html

gpderetta 4 years ago | |

The doc is actually incomplete, it refers to implementation defined ABI entry points for the actual unwinding. The actual unwind tables and compensation opcodes are, IIRC, part of the DWARF standard, but last time I looked at it that standard was also incomplete and left a lot unspecified. Many of the details (including bugs that have now become part of the ABI) are basically folklore.

IIRC Ian Lance Taylor had a series of blog posts that did shed light on a lot of these details.

zabzonk 4 years ago |

Under the hood of GCC specifically.

arcatek 4 years ago | |

That's the article I used when I implemented exceptions in an LLVM-based compiler, so it's applicable to more than just GCC.

nicolasbrailo 4 years ago | | |

Is your work public? If my article was useful, I'd love to have a look at what you did!

cogman10 4 years ago | |

Doesn't GCC support multiple exception handling options?

mhh__ 4 years ago | | |

I'm not sure exactly what you mean but there was a switch to DWARF EH ages ago (GCC 2?)

zabzonk 4 years ago | | |

I'm not sure what your point is - mine was that the original post is specifically about GCC, not Standard C++.

pjmlp 4 years ago |

Very interesting read, however it is under the hood on a specific implementation.

monkeycantype 4 years ago |

Just dropping by to say hi to a fellow c++ing monkey

nicolasbrailo 4 years ago | |