Exploring Polymorphism in C: Lessons from Linux and FFmpeg's Code Design (2019)(leandromoreira.com) |
struct dict dict_driver_ldap = {
    .name = "ldap",
    .v = {
        .init = ldap_dict_init,
        .deinit = ldap_dict_deinit,
        .wait = ldap_dict_wait,
        .lookup = ldap_dict_lookup,
        .lookup_async = ldap_dict_lookup_async,
        .switch_ioloop = ldap_dict_switch_ioloop,
    }
};
defines the virtual function table for the LDAP module, and any other subsystem that looks things up via the abstract dict interface can consequently be configured to use the ldap service without concrete knowledge of it. (Those interested in a deeper dive might start at https://github.com/dovecot/core/blob/main/src/lib-dict/dict-...)
I remember being impressed by this approach, so I shamelessly copied it for my programming game: https://github.com/dividuum/infon/blob/master/renderer.h :)
ffmpeg -i input.wav -filter_complex " [0:a]asplit=2[a1][a2]; [a1]lowpass=f=500[a1_low]; [a2]highpass=f=500[a2_high]; [a1_low]volume=0.5[a1_low_vol]; [a2_high]volume=1.5[a2_high_vol]; [a1_low_vol][a2_high_vol]amix=inputs=2[a_mixed]; [a_mixed]aecho=0.8:0.9:1000:0.3[a_reverb] " -map "[a_reverb]" output.wav
That said, keeping those interfaces clean and consistent as the codebase grows (and ages) takes some real dedication.
I also recently joined the mailing lists, and it’s been awesome to get a step closer to the pulse of the project. I recommend it if you want to casually get more exposure to the breadth of the project.
I'm surprised that this article never mentioned the term.
C++ programmers will know this pattern from the `virtual` keyword.
This paragraph completely derailed me — I’m not familiar with golang, but `interface` in Java is like `@protocol` in Objective-C — it defines an interface without an implementation for the class to implement, decoupling it entirely from the implementation. Seems to be exactly the same thing?
Transitively, it most definitely uses OO techniques. Furthermore, by having such a clean C ffi (in both directions) it allows for the weaving of the Lua based OO techniques back into C code.
https://stackoverflow.com/questions/15832301/understanding-c...
*int (*encode)(*int);
Why not compile your snippets? Heads up to the author.
Benno Rice gave a fantastic talk a few years ago called "What UNIX Cost Us," which he starts off by showing how to write some USB device code in macOS, Windows, and Linux. It only takes a few minutes to demonstrate how pretending that everything is a file can be a pretty poor abstraction, and result in far more confusing code, which is why everyone ends up using libusb instead of sysfs.
"Everything is a file" is not a bad abstraction for some things. It feels like Linux went the route of a golden hammer here.
The specific reason I mentioned it was because his initial example was about how much more ceremony and boilerplate is needed when you need to pretend that USB interfaces are actually magic files and directories.
int (func)().
Maybe you meant: int * (*func)(void)?
Don't mean to be pedantic. Just wanted to point it out so you can fix it.
Other than that, yeah doing by hand what C++ and Objective-C do automatically.
Secondly, Apple and Microsoft do just fine with Objective-C and C++ for their video codecs, without having to manually implement OOP in C.
Also, this is a nice way to get the damn banana without getting lost in the jungle.
- instead of setting the same function pointers on structs over and over again, point to a shared (singleton) struct named "vtable" which keeps track of all function pointers for this "type" of structs
- create a factory function that allocates memory for the struct, initializes fields ("vtable" included), let's call it a "constructor"
- make sure all function signatures in the shared struct start with a pointer to the original struct as the first parameter, a good name for this argument would be "this"
- encode parameter types in the function name to support overloading, e.g. "func1_int_int"
- call functions in the form of "obj->vtable->func1_int_int(obj, param1, param2)"
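For anyone who wants to see the steps in the list above spelled out, here is a minimal sketch in plain C. The names (shape, shape_vtable, rect_new, tri_new) are made up for illustration and do not come from the article, Linux, or FFmpeg. The name-mangling step for overloading is left out, since each function here has a single signature.

```c
#include <assert.h>
#include <stdlib.h>

struct shape; /* forward declaration so the vtable can refer to it */

/* One shared table of function pointers per "type" of struct. */
struct shape_vtable {
    int  (*area)(const struct shape *this);
    void (*destroy)(struct shape *this);
};

struct shape {
    const struct shape_vtable *vtable; /* set once by the constructor */
    int w, h;
};

static int  rect_area(const struct shape *this) { return this->w * this->h; }
static int  tri_area(const struct shape *this)  { return this->w * this->h / 2; }
static void shape_destroy(struct shape *this)   { free(this); }

/* The shared (singleton) vtables: one per type, not one per object. */
static const struct shape_vtable rect_vtable = { .area = rect_area, .destroy = shape_destroy };
static const struct shape_vtable tri_vtable  = { .area = tri_area,  .destroy = shape_destroy };

/* The "constructor": allocates, then initializes fields, vtable included. */
static struct shape *shape_new(const struct shape_vtable *vtable, int w, int h) {
    assert(vtable != NULL);
    struct shape *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->vtable = vtable;
    s->w = w;
    s->h = h;
    return s;
}

struct shape *rect_new(int w, int h) { return shape_new(&rect_vtable, w, h); }
struct shape *tri_new(int w, int h)  { return shape_new(&tri_vtable, w, h); }
```

A call then takes the form r->vtable->area(r), and r->vtable->destroy(r) releases the object. Note that "this" is an ordinary identifier in C, so the sketch will not compile as C++.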
[1]: https://learn.microsoft.com/en-us/windows/win32/com/com-tech...
[2]: https://www.codeproject.com/Articles/13601/COM-in-plain-C
No, that is essentially what Linux does in this article (and by the looks of it also ffmpeg).
struct file does not have a bunch of pointers to functions, it has a pointer to a struct file_operations, and that is set to a (usually / always?) const global struct defined by a filesystem.
As you can see, the function types of the pointers in that file_operations struct take a struct file pointer as the first argument. This is not a hard and fast rule in Linux, arguments even to such ops structures are normally added as required not just-in-case (in part because ABI stability is not a high priority). Also the name is not mangled like that because it would be silly. But otherwise that's what these are, a "real" vtable.
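To make the shape of that arrangement concrete, here is a small sketch under stated assumptions. my_file, my_file_ops, my_read and my_write are hypothetical names, not the kernel's types or API, and the "return -1 when an op is missing" convention is just something this sketch picks. It only shows the layout described above: the object holds one pointer to a const global ops struct, and each op takes the object as its first argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct my_file;

/* Hypothetical ops table, loosely modeled on the idea of file_operations. */
struct my_file_ops {
    long (*read)(struct my_file *f, char *buf, size_t len);
    long (*write)(struct my_file *f, const char *buf, size_t len); /* optional */
};

struct my_file {
    const struct my_file_ops *ops; /* one pointer per object, not one per function */
    const char *data;              /* NUL-terminated backing string */
    size_t pos;                    /* current read offset into data */
};

/* Read op of a read-only "filesystem": copies from data, advancing pos. */
static long ro_read(struct my_file *f, char *buf, size_t len) {
    size_t left = strlen(f->data) - f->pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return (long)len;
}

/* A const global table; .write stays NULL because nothing here can be written. */
static const struct my_file_ops ro_ops = { .read = ro_read };

/* Generic entry points: callers never need to know which ops table is behind f. */
long my_read(struct my_file *f, char *buf, size_t len) {
    assert(f != NULL && f->ops != NULL);
    if (f->ops->read == NULL)
        return -1; /* operation not supported */
    return f->ops->read(f, buf, len);
}

long my_write(struct my_file *f, const char *buf, size_t len) {
    assert(f != NULL && f->ops != NULL);
    if (f->ops->write == NULL)
        return -1; /* operation not supported */
    return f->ops->write(f, buf, len);
}
```

Leaving an entry NULL and checking it in the generic wrapper is one way to express "this type does not implement that operation" without a base-class default.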
Surely this kind of thing came before C++ or the name vtable? The Unix V4 source code contains pointers to functions (one in the file name lookup code, even), though not in a struct but passed as an argument. "Object oriented" languages and techniques must have first congealed out of existing practices with earlier languages, you would think.
The Power of Interoperability: Why Objects Are Inevitable
Manifold is a very interesting project that adds a lot of useful features to Java (operator overloading, extension classes, and a whole bunch more). I don't know if it's smart to use it in production code because you basically go from writing Java to writing Manifold, but I still think it's a fun project to experiment with.
1: https://github.com/SpongePowered/Mixin/wiki/Introduction-to-...
Go can't declare adherence up front, and in my view that’s a problem. Most of the time, explicitly stating your intent is best, for both humans reading the code and tools analyzing it. That said, structural typing has its moments, like when you need type-safe bridging without extra boilerplate.
Which makes a million times more sense to me, because realistically when do you ever have a structure that usefully implements an interface without being aware of it?? The common use-case is to implement an existing interface (in which case might as well enforce adherence to the interface at declaration point), not to plug an implementation into an unrelated functionality that happens to expect the right interface.
A signature declaration resembled an abstract base class. The target class did not have to inherit the signature: just have functions with matching names and types.
The user of the class could cast a pointer to an instance of the class to a pointer to a compatible signature. Code not knowing anything about the class could indirectly call all the functions through the signature pointer.
Or, in math speak, OOP means polymorphism but polymorphism doesn't mean OOP.
That is why they are here to stay, and even all mainstream FP and LP languages offer features that provide similar capabilities, even if they get other names for the same thing.
It is like saying an artifact is useless only because it gets named differently in English and Chinese.
> Manifold is a Java compiler plugin, its features include Metaprogramming, Properties, Extension Methods, Operator Overloading, Templates, a Preprocessor, and more.
Neat tool. It is like having a programmable compiler built into your language.
If "Two" didn't have a "name: string" member, then the error would be on the call to "test".
interface Foo {
  name: string
}
class One implements Foo {
  constructor(public name: string) {}
}
class Two {
  constructor(public name: string) {}
}
function test(thing: Foo): void {
  //...
}
test(new One('joe'));
test(new Two('jane'));
Also the article doesn’t actually mention OOP. You can use polymorphism without fully buying into OOP (like Go does).
The great thing about C is its interoperability, which is why it’s the go to language for things like codecs, device drivers, kernel modules, etc.
Additionally Metal is implemented in Objective-C, with Swift and C++ bindings.
I don’t know Go, to be frank. I just had a very shallow look at it once because of an interview, and apart from the big names behind it, it didn’t shine in any obvious way. But maybe that’s also aligned with the "boring" tech label it seems associated with (that is, in a positive manner for those who praise it).
I wonder if there are many cases where C++ will devirtualize and Rust won't.
But then again Rust devs are more likely to use static dispatch via generics if performance is critical.
That's pretty powerful:
• any type can be used in dynamic contexts, e.g. implement Debug print for an int or a Point, without paying cost of a vtable for each instance.
• you don't rely on, and don't fight, devirtualization. You can decide to selectively use dynamic dispatch in some places to reduce code size, without committing to it everywhere.
• you can write your own trait with your own virtual methods for a foreign type, and the type doesn't have to opt-in to this.
Put another way, in C++ the dynamic dispatch is implicit, so you might write code which (read literally) has dynamic dispatch but the optimizer will devirtualize it. However in Rust dynamic dispatch is explicit, so, you just would not write the dynamic dispatch - it's not really relevant whether an optimizer would "fix" that if you went out of your way to get it wrong. It's an idiomatic difference I'd say.
I'm not sure I follow - pretty much 99% of usage of C++ in the last, like, 20 years has been around making sure that you get static dispatch with polymorphism through templates. It's exceedingly uncommon to see the `virtual` keyword unless you have, say, some DLL-based run-time plug-in system going on.
var _ AssertedInterface = &MyType{}
That virtually never happens. Seriously, what would be the odds? It’s so much more usual to purposefully implement an interface (e.g. a small wrapper around the writer thingy that has the expected interface) than to use something that happens to fit the expected interface by pure chance.
It’s not a structural vs. nominal problem but something else: TypeScript is structural but has the implements keyword, so that interface compliance is checked at the declaration, not at the point of use. You don’t have to use it and it will work just like Go, but I found that in 99% of cases it’s what I want: the whole point of me writing this class is that I need an interface implementation, so I might as well enforce it at this point.
For those used to the language it was seen as "lighter" and easier to add OO like abstractions to your C usage than bog down in the weight and inconsistencies of (early) C++
Since very little is ever removed from C++, all the inconsistencies in C++ are still there.
Without signatures, we have to use some kind of delegating shim which takes the virtual function calls, and calls the real object. It could be a smart pointer.
With signatures, we don't use smart pointers, just "pointer to signature" pointers. However, I suspect those pointers had to be fat! Because, surely, to delegate the signature function calls to the correct functions in the target object class, we need some vtable-like entity. The signatures feature must generate such a vtable-like table for every combination of signature and target class. But target object has no space reserved in it for that table pointer. The obvious solution is a two-word pointer which holds a pointer to the object, and a pointer to the signature dispatch table specific to the signature type and target object's class.
If we can use concepts to do this, with a smart pointer that ends up being two words (e.g. pointer to its own vtable, and a pointer to the target object), we have broken even in that regard.
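The two-word pointer speculated about above can be sketched by hand in C. This is not how the old signatures feature was actually implemented (I don't know that); it is only an illustration of the idea, and every name in it (speaker_table, speaker_ref, duck_as_speaker and so on) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Target types: no vtable pointer is reserved inside them. */
struct duck { int quacks; };
struct dog  { int barks; };

/* The "signature": the operations a caller needs, taking an untyped object. */
struct speaker_table {
    const char *(*speak)(void *obj);
};

/* The fat pointer: one word for the object, one for the dispatch table. */
struct speaker_ref {
    void *obj;
    const struct speaker_table *table;
};

static const char *duck_speak(void *obj) { ((struct duck *)obj)->quacks++; return "quack"; }
static const char *dog_speak(void *obj)  { ((struct dog *)obj)->barks++;  return "auau"; }

/* One table per (signature, target type) combination. */
static const struct speaker_table duck_speaker = { .speak = duck_speak };
static const struct speaker_table dog_speaker  = { .speak = dog_speak };

/* The "cast" from a concrete pointer to a signature pointer. */
struct speaker_ref duck_as_speaker(struct duck *d) {
    struct speaker_ref r = { d, &duck_speaker };
    return r;
}

struct speaker_ref dog_as_speaker(struct dog *d) {
    struct speaker_ref r = { d, &dog_speaker };
    return r;
}

/* Code that knows only the signature, not duck or dog. */
const char *speaker_speak(struct speaker_ref r) {
    assert(r.obj != NULL && r.table != NULL);
    return r.table->speak(r.obj);
}
```

The target structs stay exactly as they were, and the dispatch cost moves into the reference, which is passed by value as two words.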
#include <iostream>
using namespace std;

template <typename T>
concept Speaker = requires (T t) {
    t.speak();
};

class Duck {
public:
    void speak() const {
        cout << "quack";
    }
};

class Dog {
public:
    void speak() const {
        cout << "auau";
    }
};

class Cat {
public:
    void speak() const {
        cout << "miau";
    }
};

template<Speaker T>
void speaking_animal(const T& animal) {
    animal.speak();
    cout << "\n\n";
}

template<Speaker... T>
void speaking_farm(const T&... animals) {
    auto space_adder = [&](auto creature) -> void {
        creature.speak();
        cout << " ";
    };
    (space_adder(animals), ...);
}

int main() {
    Duck duck;
    Dog dog;
    Cat cat;
    speaking_animal(duck);
    speaking_animal(dog);
    speaking_animal(cat);
    speaking_farm(duck, dog, cat);
}
Live example: https://godbolt.org/z/vPhf13xEh
C++'s vtables are also, in my experience, especially bad compared to Objective-C or COM ones (MSVC btw generates vtables specifically aligned for use with COM, IIRC). Mind you it's been 15 years since I touched that part of crazy.
It is a simplification of OLE, and by the time the idea came up to use that approach, there were tons of OLE code since Windows 3.1.
By the way, it hasn't gone away: after how Longhorn went down, it became the main API delivery mechanism on Windows. Sadly, improving the tooling has never been a priority, other than half-finished attempts.
Moreover, everything here can be done without a concept.
This version of the code builds with g++ -std=c++17. We just get worse diagnostics if we try to use something as a Speaker which doesn't conform.
#include <iostream>
using namespace std;

class Duck {
public:
    void speak() const {
        cout << "quack";
    }
};

class Dog {
public:
    void speak() const {
        cout << "auau";
    }
};

class Cat {
public:
    void speak() const {
        cout << "miau";
    }
};

template<typename T>
void speaking_animal(const T& animal) {
    animal.speak();
    cout << "\n\n";
}

template<typename... T>
void speaking_farm(const T&... animals) {
    auto space_adder = [&](auto creature) -> void {
        creature.speak();
        cout << " ";
    };
    (space_adder(animals), ...);
}

int main() {
    Duck duck;
    Dog dog;
    Cat cat;
    speaking_animal(duck);
    speaking_animal(dog);
    speaking_animal(cat);
    speaking_farm(duck, dog, cat);
}
I was thinking more of something along these lines. But note the double indirection: we end up passing the smart pointer animal_pointer by reference. We achieve the "signature thing" though, in that we take these animal objects and effectively get them to conform to the common animal_pointer abstract base without their cooperation.
#include <iostream>
using namespace std;

class Duck {
public:
    void speak() const { cout << "quack"; }
};

class Dog {
public:
    void speak() const { cout << "auau"; }
};

class Cat {
public:
    void speak() const { cout << "miau"; }
};

class animal_pointer {
public:
    virtual void speak() const = 0;
};

template <typename T> class animal_pointer_impl : public animal_pointer {
private:
    T *obj;
public:
    animal_pointer_impl(T *o) : obj(o) { }
    virtual void speak() const { obj->speak(); }
};

void animal_api(const animal_pointer &p)
{
    p.speak();
    cout << '\n';
}

int main() {
    Duck duck;
    Dog dog;
    Cat cat;
    animal_pointer_impl<Duck> p0(&duck);
    animal_pointer_impl<Dog> p1(&dog);
    animal_pointer_impl<Cat> p2(&cat);
    animal_api(p0);
    animal_api(p1);
    animal_api(p2);
}
animal_api is a regular function, which represents some external API that we don't get to recompile.
Also now you have to build enough of a C API to expose the features, extra annoying when you want the API to be fast so it better not involve extra levels of indirection through marshalling (hello, KDE SMOKE)
At some point you're either dealing with limited non-C++ API, or you might find yourself doing a lot of the work twice.
int val;
bool valueSet = getFoo(&val);
if (valueSet) {}
printf("%d", val); // oops
The bug can be avoided entirely with C++:
if (int val; getFoo(&val)) // if you control getFoo this could be a reference, which makes the null check in getFoo a compile-time check
{}
printf("%d", val); // this doesn't compile.
In C++ we can declare a variable in the while or if statement:
https://en.cppreference.com/w/cpp/language/while
https://en.cppreference.com/w/cpp/language/if
Its value is the value of the decision. This is not possible with C [1].
Since C++17 the if condition can contain an initializer: Ctrl+F if statements with initializer
https://en.cppreference.com/w/cpp/language/if
Which sounds like the same thing? Now you can declare a variable whose value is not directly evaluated; you can also compare it in a condition. I think both are neat features of C++, without adding complexity.
[1] Also not possible with Java.
Also in Java: https://www.geeksforgeeks.org/for-loop-java-important-points...
{ int val; if (getFoo(&val)) {
...
}}
Both ways of expressing this are weird, but stating that this can't be achieved with C is dishonest in my opinion.
{
    int val;
    if (getFoo(&val)) {
    }
    printf("%d", val);
}
The bug is still possible, as you've introduced an extra scope that doesn't exist in the C++ version.
Also, this was one example. There are plenty of other examples.
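For reference, a compilable sketch of the block-scoped C variant from this subthread. get_foo and foo_or are hypothetical stand-ins for getFoo and its caller (the extra input parameter exists only so the example has something to succeed or fail on). It shows that the extra braces do keep val from leaking past the block, and also that any use of val after the if but inside the braces would still compile, which is the remaining gap compared with the C++17 form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for getFoo: succeeds only for even inputs. */
static bool get_foo(int input, int *out) {
    assert(out != NULL); /* the runtime null check a C++ reference would remove */
    if (input % 2 != 0)
        return false;    /* *out is deliberately left untouched on failure */
    *out = input / 2;
    return true;
}

/* Returns the value produced by get_foo, or fallback if it fails. */
int foo_or(int input, int fallback) {
    int result = fallback;
    {
        int val; /* visible only inside this block */
        if (get_foo(input, &val)) {
            result = val; /* val is known to be set here */
        }
        /* Reading val HERE would still compile, and would be the bug. */
    }
    /* Referring to val here is a compile error: it is out of scope. */
    return result;
}
```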
But I agree, the C++ if (init; cond) thing was new to me.
If you rewrote it in a more modern way and changed the API
std::optional<int> getFoo();
if (auto val = getFoo()) {}
There are lots of improvements over C’s type system: std::array, span, view, hell even a _string_ class
The problem is all the folks that insist on coding in C++ as if it were C, ignoring all those C++ improvements over C.
"In the strict mathematical sense, C isn't a subset of C++. There are programs that are valid C but not valid C++ and even a few ways of writing code that has a different meaning in C and C++. However, C++ supports every programming technique supported by C. Every C program can be written in essentially the same way in C++ with the same run-time and space efficiency. It is not uncommon to be able to convert tens of thousands of lines of ANSI C to C-style C++ in a few hours. Thus, C++ is as much a superset of ANSI C as ANSI C is a superset of K&R C and much as ISO C++ is a superset of C++ as it existed in 1985.
Well written C tends to be legal C++ also. For example, every example in Kernighan & Ritchie: "The C Programming Language (2nd Edition)" is also a C++ program. "
That is rather dated: they do things like explicitly cast the void* pointer returned by malloc, but point out in the appendix that ANSI C dropped the cast requirement for pointer conversions involving void. C++ does not allow implicit void conversions to this day.
The "well written" remark is relevant.
Many style guides will consider implicit void conversions not well written C.
Naturally we are now on C23, and almost every C developer considers language extensions as being C, so whatever.
all the time -- I call it "crowd-sourcing intelligence" :-)
> Well written C tends to be legal C++ also
and python2 can be written to be compatible with python3, but neither is a subset of the other
if (Foo* f = GetPtr(); f->HasValue()) {} // wrong; f can be null
vs
if (Foo* f = GetPtr(); f && f->HasValue()) {}
Is probably the biggest pitfall. Especially if you're used to this:
if (Foo* f = GetPtr())
{
    f->DoTheThing(); // this is perfectly safe.
}
So using casts that hid implicit int declarations for years is "well written"?
> Many style guides will consider implicit void conversions not well written C.
I could not find a "Many" guide; the Linux kernel and ffmpeg seem to advocate against pointless casts.
> Naturally we are now on C23,
So irrelevant to the creation time of ffmpeg and only applicable to intentionally non portable libraries.
Is it possible that you are confusing while and for statements?
My post was about while and if, and you repeatedly bring up for. The for statement is another statement.
Good luck to WG14 (or maybe a faction within it?) as they seem to have decided to go make their own C++ competitor now, it's a weird time to do that, but everybody needs a hobby.
I hope you realize that this example does not need any complex C++ features.