ABI – Now or Never [pdf](open-std.org)
a. Having a real binary compatibility story is beneath C++, but
b. The accidental ABI compatibility that exists today is too widely adopted to break.
But platforms define the final memory and calling conventions so that can’t be part of any language spec - this is not unique to C++.
Windows has its own ABI, which it has had for a long time, so they can't change it; on x86 Windows it will always be that.
For example, C++11 broke ABI at this level by changing the representation of std::string.
Applications talk via sockets to reduce coupling, thus fragility.
The reference is a 1h YouTube video.
Simple attempts to fix don't really work. Not even sure an ABI break will be enough, but it would at least be a minimum requirement.
But unique_ptr in public interfaces feels like a code smell. I have done it, but not proudly.
I wrote up a reddit post for a possible workaround for removing the overhead. It's standard C++, no ABI break is required. It's not without caveats though: https://www.reddit.com/r/cpp/comments/do8l2p/working_around_...
Compilers have an attribute to remove this overhead, but it's an ABI break to do it.
Here's a simple example. Suppose we define std::string with the following layout (for simplicity I'm removing the template stuff, SSO, etc.):
class string {
public:
    // various methods here...
    size_t size() const { return len_; }
private:
    char *data_;
    size_t len_;
    size_t capacity_;
};
When a user calls .size() on a string, the compiler will emit some inlined instructions that access the len_ field at offset +8 bytes into the class (assuming a 64-bit system).
Now suppose we modify our implementation of std::string, and we want to change the order of the len_ and capacity_ fields, so the new order is: data_, capacity_, len_. If an executable or library links against the STL and isn't recompiled, it will have inlined instructions that are now reading the wrong field (capacity_).
This is what we mean by the C++ ABI. This is a simple example, but there are a lot of other changes that can break ABI this way.
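To make the stale-offset problem concrete, here is a minimal sketch. The names string_v1, string_v2 and size_as_compiled_against_v1 are made up for illustration and are not from any real standard library. The helper simulates what an old binary has baked in after inlining size(): a load from a fixed byte offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical "old" layout, matching the simplified class above.
struct string_v1 {
    char*       data_;
    std::size_t len_;
    std::size_t capacity_;
};

// Hypothetical "new" layout: len_ and capacity_ have swapped places.
struct string_v2 {
    char*       data_;
    std::size_t capacity_;
    std::size_t len_;
};

// Simulates client code that inlined size() against the v1 layout:
// it reads a size_t at the byte offset where len_ lived in v1,
// no matter which layout the object really has.
inline std::size_t size_as_compiled_against_v1(const void* s) {
    std::size_t out;
    std::memcpy(&out,
                static_cast<const char*>(s) + offsetof(string_v1, len_),
                sizeof out);
    return out;
}
```

Handing a string_v2 object to size_as_compiled_against_v1 gives back capacity_ rather than len_, because capacity_ now sits at the offset where len_ used to be. Nothing fails at link time or load time; the old code just quietly reads the wrong field.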
Nevertheless, certain language changes can force a breaking change to any existing ABI (or even all of them), and the C++ committee does not work in a vacuum. They work with existing implementations and must agree with implementers before making changes to the standard.
For example, there was a change to the definition of std::string in C++11 that forced a break in all commonly used ABIs (MSVC, Itanium at least). This was deemed necessary, but the cost of it to real-world programs has proven higher than anticipated, and may be a regretted decision (it apparently still causes problems and requires special flags even today).
That said, in general ABI compatibility is expected of all changes unless there is a very good reason to break it. Security seems like a good one, performance alas not. I assume std::string got the small string optimization because they were already going to break ABI for the threading issues.
Of course it doesn't help that C++ is still C-like, so implementations and uses of data layout get compiled directly into the client application :-/
#include <memory>

struct iface
{
    virtual ~iface() { }
    virtual void doSomething() = 0;
    static std::unique_ptr<iface> create();
};
I don't know good ways to replace unique_ptr here. Requiring library users to #include the concrete implementation is not good: it inflates compilation time, pollutes namespaces, and pollutes the IDE's autocompletion DB.
You can go C-style, i.e. pass a double-pointer argument to the factory, or return a raw pointer. But then the user needs to remember to destroy the object.
But it would often be better to make a move-only value type, and return that instead.
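A rough sketch of what such a move-only value type could look like, using the pimpl idiom. The class name widget and the calls() accessor are invented for the example, and both the "header" and ".cpp" parts are shown in one block only so it compiles on its own. Note it gives up the runtime polymorphism that the iface version has.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// --- what would go in the public header ---
class widget {                        // hypothetical name
public:
    widget();                         // takes the place of iface::create()
    ~widget();
    widget(widget&&) noexcept;
    widget& operator=(widget&&) noexcept;
    widget(const widget&) = delete;   // move-only
    widget& operator=(const widget&) = delete;

    void doSomething();
    int calls() const;                // hypothetical, just for the demo

private:
    struct impl;                      // only declared here; users never see the definition
    std::unique_ptr<impl> p_;         // private detail, not part of any signature
};

// --- what would go in the library's .cpp ---
struct widget::impl {
    int calls = 0;
};

widget::widget() : p_(std::make_unique<impl>()) {}
widget::~widget() = default;          // defined where impl is a complete type
widget::widget(widget&&) noexcept = default;
widget& widget::operator=(widget&&) noexcept = default;

// Precondition: *this has not been moved from (p_ would be null).
void widget::doSomething() { ++p_->calls; }
int widget::calls() const { return p_->calls; }
```

The user writes `widget w; w.doSomething();` and can `std::move` it around, with no smart pointer in any function signature and nothing to remember to destroy. The unique_ptr member is still visible in the header, so the class layout does still depend on it being pointer-sized.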
On another platform, libstdc++ is mostly backward compatible, within reason.
The C++ standard is not "officially" concerned with stability, but in practice people on the committee care a lot (because some major implementations care a lot), so some modifications are rejected because they would break the ABI currently used in practice.
Customers who needed stability were staying on ancient compilers. MS probably would rather have them using new versions, and exercising new features.