How can C Programs be so Reliable? (2008) (tratt.net)
It makes you exceptionally paranoid about failure states and practically requires a bit of thought and planning before attempting any non-trivial change.
The mindset of "it's fine to ignore all error conditions and let the default exception handler print a stack trace to the user" results in software that is annoying to the user.
That is the reason it should be used as a learning tool. So that you know the nitty gritty details without anyone "managing" it away from you.
Like running in a weight vest.
Edit: ah, perhaps you meant that, in addition to using raw C, one should also learn how to use static analyzers and the like.
The "purpose" here is to instill a sense of paranoia around error handling, so that the resulting program from the PoV of the user appears to handle whatever bizarre combination of inputs the user put in.
It's the difference between telling the user "Failed to open foo.txt (file not found). Do you want to create it?" and ending the program with a 100-line stack trace with "FileNotFoundException" buried somewhere in there.
As the article says, checked exceptions are not the solution here.
I've literally never seen, in a professional working environment, exception languages (Java, C#, Python, etc) actually check that the file they tried opening was actually opened, and if it wasn't, directing the user with a sensible message that allowed the user/operator to fix the problem.
In C, the very first time that you fail to check that `fopen` returned non-NULL, the program crashes. Then you check if it returned NULL, and need to put something in the handling code, so you look at what `errno` has, or convert `errno` to a string.
I will bet good money that you could grab the nearest C#/Java/Python/JS/etc engineer to you, ask them to find the most recent code they wrote that opened and read/wrote a file, and you'll find that there is literally no code to direct the user if the file-open failed. The default runtime handler steps in and vomits a stack trace onto the screen.
In C, you are forced to perform the NULL-check, or crash. Sure, many devs are simply going to have a no-op code-path for the error cases, doing `if ((inf = fopen(...)) != NULL) { DoSomethingWith(inf);}`, but proceeding on success is a code-smell and easy to visually spot as an error.
The exception languages make it virtually impossible to spot the code-smell of unhandled (or improperly handled) exceptions, and they make skipping the handling easy, because the dev can just read the stack trace, create the file needed, and proceed with programming the rest of the app.
What a good program must do when a file open failure is encountered is direct the user in some way that they can fix the problem. For example "file doesn't exist. Create it [button], Choose a file [button]", or "Permission denied. Try running as a different user.", or "File is locked. Are you already running $PROGRAM?", or "$FOO is a directory. Specify a file.".
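To make that concrete, here is a minimal sketch of the idea in C. It assumes a POSIX system, where a failed `fopen` sets `errno` and `ENOENT`, `EACCES` and `EISDIR` exist (ISO C alone guarantees neither). The helper names `open_failure_advice` and `open_or_explain` are made up for illustration, and the "file is locked" case is left out because there is no single portable `errno` value for it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map an errno value from a failed fopen() to advice
 * the user can act on. Returns NULL when there is no specific advice. */
const char *open_failure_advice(int err)
{
    switch (err) {
    case ENOENT: return "File does not exist. Create it or choose another file.";
    case EACCES: return "Permission denied. Try running as a different user.";
    case EISDIR: return "That path is a directory. Specify a file.";
    default:     return NULL;
    }
}

/* Hypothetical wrapper: open a file, or print an actionable message to
 * stderr and return NULL so the caller can decide what to do next. */
FILE *open_or_explain(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        int err = errno; /* save it before another call can overwrite it */
        const char *advice = open_failure_advice(err);
        fprintf(stderr, "Failed to open %s: %s\n",
                path, advice != NULL ? advice : strerror(err));
    }
    return fp;
}
```

A caller would use `open_or_explain("foo.txt", "r")` and treat a NULL result as "the user has already been told what to fix", rather than letting the failure surface as a crash or a stack trace.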
[EDIT: Yes, seeing a stack trace in a shipped product is one of my personal bugbears that I feel very strongly about. If it's a stack trace for an expected error (like failure to open/read/write a file) I absolutely do get annoyed by this public display of laziness. And yes, this is one of those hills I'll die on before I leave it!]
Even the simplest command-line programs annoy me no end when the application simply dumps a stack trace to the screen. Sure, I can dig into it, but the average user is going to ask for help on stackoverflow, just to figure out what must be done to fix the error.
I believe languages like Pascal and Python are good for introducing you to algorithms. After that, I agree you would need to dive into languages such as C, Rust and, why not, assembly language to get a better understanding of your machine.
In our case, after one year of coding with Pascal, we spent the remaining years in focusing on C and C++ (Builder)
After that, it's time to move to pragmatic choices (job market requirements in terms of development stack).
It was not a problem for me as I was learning Turbo C and Visual C++ 5/6.0 a year or two beforehand.
Everyone else in the class, though, were sooooo frustrated with their "89 errors, 103 Warnings" all because they forgot to add a semicolon in the code.
Truth is, they were not getting any closer to understanding how to write code or care about its quality, which is what leads to proper planning, etc. They would just keep changing something to reduce the errors/warnings.
Personally, I think every person has their own journey into the world of programming. For me, I was happy for it to be C, with a bit of Pascal and Visual Basic. For someone else, perhaps Scheme and Javascript. Another maybe Java.
Some developers/programmers, in my opinion, are not for C. Controversial, I know.
How is that better for developing a sense of paranoia around error states?
"Throw it at the wall and see what sticks" does not exactly lead to "extreme paranoia managing errors".
Today developers don't build that skill, so you see applications that just fail silently everywhere or produce nonsense errors. The better your developer tools are, the less your developers need to learn UX skills to be able to do their work.
If you run this
    use std::fs::File;
    pub fn main() {
        let fp = File::open("test").unwrap();
    }
You get this:
    thread 'main' panicked at /app/example.rs:11:33:
    called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
The language incentivizes you to propagate errors up to main and then just let it fail.
The consequences of doing something incorrect or nonportable is sometimes that the expected behavior occurs. This can be validated by testing on the couple of platforms (or just one) that the program supports, and kept working.
Another thing we need to consider is that reliable is not the same thing as robust, let alone secure. A program that appears reliable to a user who uses it "as directed", giving it the expected kinds of inputs, will not necessarily appear reliable to a tester looking for ways to break it, or to a cracker trying to exploit it.
A truly reliable program not only handles all its functional use cases according to its requirements, but is impervious to deliberate misuse, no matter how clever.
Security flaws are found in reliable, well-debugged programs used in production by millions.
A practical example of this is word size issues: A program that casts pointers to ints everywhere is perfectly reliable on 32-bit machines, but it will die horribly on any LP64 machine, which covers most 64-bit machines. Related are endianness issues, which is why projects have tended to stop supporting big-endian systems: They're just too rare to scrounge up anymore, and unless you're actively testing on them, bugs can slip in which will not be caught on little-endian hardware.
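A small sketch of the word-size point, assuming a platform that provides `uintptr_t` (it is optional in ISO C, though mainstream compilers supply it). The function names are hypothetical. The first function shows the conversion that stays intact on both 32-bit and LP64 targets; the second reports whether a plain `int` is wide enough to hold a pointer on the machine the code is compiled for.

```c
#include <stdint.h>

/* Round-trips a pointer through uintptr_t, the integer type intended for
 * holding pointer values. Returns 1 if the pointer survives unchanged. */
int roundtrip_uintptr_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}

/* Reports whether int has at least as many bytes as a pointer here.
 * Typically 1 on ILP32 targets and 0 on LP64 targets, where int is
 * 4 bytes and pointers are 8. */
int int_can_hold_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}
```

Code that stores pointers in `int` only works where the second function would return 1, which is why it can pass every test on a 32-bit machine and still break after a 64-bit port.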
Similarly, OS developers stop supporting architectures when they can no longer find working examples of them. This is because emulators have bugs, and without a source of truth (working hardware) it's very hard to determine if a bug you just found is in the OS or the emulator; add unreliable hardware to that and things just get worse. Bob Supnik (former DEC VP, creator of SimH) has a PDF:
You can do the same thing with tagged unions to implement a poor man's sum types. It is significantly more verbose than in a language that has syntactic support for this, but you get similar compile time safety guarantees.
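For readers who have not seen the pattern, a minimal sketch of a tagged union in C follows. The type and function names are made up for illustration. The compile-time help is partial: GCC and Clang warn (via `-Wswitch`, part of `-Wall`) when a `switch` over the enum misses a variant and has no `default`, but nothing stops code from reading the wrong union member.

```c
/* Hypothetical sum type: a result is either a value or an error code. */
enum result_tag { RESULT_OK, RESULT_ERR };

struct result {
    enum result_tag tag;  /* says which union member is valid */
    union {
        long value;       /* valid when tag == RESULT_OK */
        int  err;         /* valid when tag == RESULT_ERR */
    } as;
};

/* Constructors keep the tag and the payload in sync. */
struct result result_ok(long v)
{
    struct result r;
    r.tag = RESULT_OK;
    r.as.value = v;
    return r;
}

struct result result_err(int e)
{
    struct result r;
    r.tag = RESULT_ERR;
    r.as.err = e;
    return r;
}

/* Consumer: switch on the tag with no default case, so a compiler with
 * -Wswitch enabled can flag a variant that was added but not handled. */
long result_value_or(struct result r, long fallback)
{
    switch (r.tag) {
    case RESULT_OK:  return r.as.value;
    case RESULT_ERR: return fallback;
    }
    return fallback; /* only reached if tag holds an invalid value */
}
```

The verbosity is visible here: three helper functions and a struct for what a language with built-in sum types expresses in a couple of lines.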
Completely opposite experience here. C is great for explorative coding because it's just structs and functions.
There's no agonizing about whether some piece of code should go here or better there, wrapped in this or that concept or high level abstraction.
Instead you start with an empty screen and incrementally build your ideas from small building blocks, much like in Lisp or Forth.
Completely disagree. The lack of screwing around selecting abstractions forces you to make something productive right away and not stress about refactor.
Retrospective: https://cacm.acm.org/opinion/retrospective-an-axiomatic-basi...
Here is a pdf of the retrospective along with the original paper : https://harrymoreno.com/assets/greatPapersInCompSci/2.2_-_An...
How Did Software Get So Reliable Without Proof? C.A.R. Hoare 1996 (?) http://users.csc.calpoly.edu/~gfisher/classes/509/handouts/h...
That's something I can understand because when I wanted to buy a motorbike I was advised to ride a bicycle first since it's more difficult to control. Except that the White House called recently for companies to not use non memory safe languages such as C to build software.
Just sensationalist journalism.
Interpreting this as "stop using C/C++" isn't much of a stretch. Yes, it is not a demand. Anticipating such a demand isn't a bad bet, however.
Who is an authority, anyhow? The White House is citing NIST, DHS, Microsoft, Cambridge DSCT, Google and others. Whom do you offer?
I don't like this myself. We're rapidly building tools that could conceivably solve memory safety in C/C++ code bases. I don't want C pilloried by Authority and its group thinking ways.
https://stackoverflow.blog/2024/03/04/in-rust-we-trust-white...
In 6 hours, I will share the link to the official and related White House PDF document.
What language are you talking about? Go to godbolt and try that with any of the compilers there for C or C++.
It is also hard to handle errors more meaningfully than instantly terminating the process at the first whiff of something going sideways.
And once you do write such a thing, try making automated tests to exercise it!!
How many programs actually check the return value of close() ?
Sure, this sounds a bit Linux/POSIX specific. There are only a few billion devices running such code, perhaps I’m overreacting…
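As a sketch of what checking the close step looks like, here is the stdio analogue using `fclose` rather than POSIX `close`. The helper name `save_text` is hypothetical. The reason the check matters is that `fclose` flushes buffered data, so a write that appeared to succeed can first report its failure at close time.

```c
#include <stdio.h>

/* Hypothetical helper: write text to path. Returns 0 only if opening,
 * writing and closing all succeeded, and -1 otherwise. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int ok = (fputs(text, fp) != EOF);

    /* fclose() flushes the buffer, so a full disk or I/O error may only
     * show up here. The stream is closed either way. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller that ignores the return value is back to the situation described above: the program reports success while the data may never have reached the disk.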
I think a critical difference is that in C the program is more liable to simply crash if errors aren't correctly handled, whereas in Java/Python/etc the program can just log a stack trace and keep on truckin', even if the bug is actually quite severe. In some cases a crash is preferable - e.g. if something goes wrong in a text editor while saving data, it's a lot better for the user if the program crashes versus the alternative where the editor runs as normal but saving doesn't work. Crashes in C also bring more urgency for developers to actually fix the bug compared to a try/catch in Python that simply buries it "until I get a chance to debug it properly." (But crashing also leads to a lot of frustration when the error wasn't that important and the C program should have just kept going.)
Exceptions are exceptionally good at error handling - they always do the correct default (bubbling up if not handled, bringing a stacktrace with them, and by default they auto-unwrap the correct return value, not making the actual business logic hard to decipher), plus they make error handling possible on as wide scope as needed (try block vs a single return value).
I absolutely fail to see how a “random” C program would fare better. I’m sure the errno state is not checked at every line, as it is a trivial human error to leave that out. You can’t forget exceptions, which is how it should be!
If anything, that java/python text editor will catch every exception at the top level, save the file and exit with an error message and it will be the C program that either randomly crashes, or ignores some exceptional state.
One bug took several months to track down. This is cognitive dissonance at its finest.
2) C programs, at least the ones we use now, are a product of a lot of use and debugging
3) It didn't take too many debugging sessions as a C programmer to learn to program a bit more carefully.
4) and the more it gets used, the more error codes it encounters, and the more robust the handling gets. I think a dirty secret of software engineering isn't that the most complicated/heavily used code gets the most and most useful comments, it's that it also gets the most error handling/detection code, and for the vast majority of non-core loop code: error handling ..isn't.
No. Original programmers that just happened to use C were better, not the other way around.
So yes. C programmers were better.
Exactly. Sorry for my unclear wording.
> A year or two after I'd joined the Labs, I [Rob Pike] was pair programming with Ken Thompson on an on-the-fly compiler for a little interactive graphics language designed by Gerard Holzmann. I was the faster typist, so I was at the keyboard and Ken was standing behind me as we programmed. We were working fast, and things broke, often visibly—it was a graphics language, after all. When something went wrong, I'd reflexively start to dig in to the problem, examining stack traces, sticking in print statements, invoking a debugger, and so on. But Ken would just stand and think, ignoring me and the code we'd just written. After a while I noticed a pattern: Ken would often understand the problem before I would, and would suddenly announce, "I know what's wrong." He was usually correct. I realized that Ken was building a mental model of the code and when something broke it was an error in the model. By thinking about how that problem could happen, he'd intuit where the model was wrong or where our code must not be satisfying the model.
It can tell you that i = i++ increased i by 2, for instance. That might not even be true of another instance of i = i++ in the same object file being debugged.
There is no substitute for knowing what the C will do before it is run.
Additionally, of course anything can be done with enough time and effort but the costs add up when you are doing things manually rather than letting the compiler handle it. Compiler has had many person-years of effort spent in ensuring the output is correct, can we do the same for all the code we write?
For example, if I'm coding in C# it's easier for me to understand the impact of passing around resources that need to be disposed, and good patterns to handle that, after Rust has made me lose hair over this concept.
You can tell people to be careful drivers all you want, but what really saves lives is airbags, crumple zones, and seatbelts.
> You can tell people to be careful drivers all you want, but what really saves lives is airbags, crumple zones, and seatbelts.
Could just be that the frequency of repeated accidents (and the related injuries) is too low to instill paranoia. The analogy is not the same as with programming in C, where the frequency is "multiple times a day", and not "less than once in a lifetime".
IOW, I feel that
> The fact that so many of the issues still persist suggests it [extreme paranoia] does not.
is inaccurate.
    #include <stdio.h>
    int main() {
        int x = 5;
        int y = &x;
        FILE *f = x * y;
        fputs("hello", f);
    }
I mean, you said:
>> Knowing the type of everything won’t save you
but the compiler is trying to save you! You have to actively work against it in order to hang yourself, and you blame the language?
Now it's an exact line number with a little description of what went wrong and sometimes a suggestion on how to fix it.
In my experience, C developers sometimes just dislike exceptions in other languages because exceptions defy C code path expectations. That was my first instinct when going from C to other languages. And so additional arguments against exceptions are put forth, including performance issues (which are real) and this idea of paranoia being better than language tooling (which is rather suspect imo).
I feel you are mischaracterising my position, which was to have graduating students work in C for a non-trivial amount of time before they moved on to a new language.
The argument that paranoia is better than language tooling is entirely absent from my arguments.
While writing code it's trivial to ask yourself if the next statement will include a call to a function which is not defined within a compilation unit under your control.
If it will, then you lookup documentation for that function to determine what it expects and how it can fail. Most of my functions which interact with such functions look like long sequences of this:
    /* close the file */
    ret = -1;
    do {
        errno = 0;
        ret = close(fildes);
    } while (0 != ret && EINTR == errno);
    if (0 != ret) {
        perror("Error");
        goto off_ramp;
    }
I've caused plenty of bugs in my career, but I can say with confidence that 0 of them had to do with ignoring/skipping proper error handling. You have to do so intentionally.
Then a few years ago I started to write code in a primitive language without exception handling. I miss exceptions now.
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.
> Eschew flamebait. Avoid generic tangents.
In particular you seem to be responding to a strawman, since nowhere did I say C error handling was better than structured exceptions. The parent asked what was functionally different between laziness around checking error codes versus laziness around try/catching.
> in C the program is more liable to simply crash
I simply disagree with this statement, as silent failure is also very common in C, which is probably the worst option (especially since it may cause memory safety issues that may not even materialize until much later).
No need to take my comment that seriously though, I had no bad intention whatsoever.
They were exceptional because they were exceptional, not because they were C developers.
C is typically described as a "low-level" programming language, where the "low-level" normally refers to the supposed distance from the language to the actual hardware. But as many incidents with UB demonstrate, this distance is still considerably larger than expected. I think there is another sense of the word "low-level", which is the amount of abstractions that are either built into the language or allowed for users, and C doesn't have a lot of them.
Combined together they represent related but distinct axes of controllability, and C only achieves a modest level of controllability in one axis but not in another. The ideal language with controllability in comparison should minimize the distance to the machine and maximize an amount of abstraction to control anything below the language instead.
Learn assembly TOO, not instead. I did, as part of computer architecture course. Very valuable. I think you should learn everything from transistor level up if you want to do serious programming. I don't think you need to actually use it, but it's sometimes very handy to know those things.
Hosted C does in the form of the standard library. C does have a freestanding variant though and its book keeping is generally limited to knowing struct member offsets.
That's usually true, but you can write C++ and Rust code that maps just as closely to machine code. Most C code exhibits that only because you can't have enough abstractions to disrupt that mapping. It is good when you do need that kind of correspondence, but most applications rarely need it, and even performance-sensitive applications don't need it all the time. C does give you a knob, but that knob is stuck in a lower but not lowest position.
Java had a very bad model of checked exceptions (the OP was written in 2008). A correct way is to make it a part of the type system, though it doesn't have to be a sum type like Rust, and make any error-related code path as convenient as possible to use.
> In C, you are forced to perform the NULL-check, or crash.
You don't necessarily crash if you failed to perform a NULL check! That's literally the single biggest problem with C's undefined behaviors. C looks like forcing checks only because most programmers do understand crashes are bad, so they do prepare for trivial or demonstrated crashes. But that's not guaranteed, and they can't easily prepare for non-crash failures without additional tools.
In the case of using the result from `fopen`, I don't know of a platform where a dereferencing of NULL (which happens in a separate translation unit, which is already compiled and linked, and will not be subject to LTO and other optimisations) within the various read/write/seek/tell functions doesn't result in an immediate crash.
I fully admit that this is applicable only to this particular example, and to all the functions in the stdlib. Everywhere else (code you wrote, that will be subject to aggressive optimisation, for example), you may not necessarily crash on a NULL dereference.
In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.
> In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.
Paranoia isn't a cure, however. A good programmer will, and arguably should, develop an instinct to avoid C for most cases instead. I too have written tons of C code, and yet I feel really uneasy about using C at all. I can't believe that C merely induces the sense of paranoia.
Although in a very different context, I have seen "dereferencing" a null pointer in C++ not crash immediately, if you dereference it to call a nonvirtual class member function, e.g.,
t->foo();
Depending on how this gets compiled and the implementation of `foo()`, the segfault may not come at the line above, where technically `t` is being dereferenced. It may come inside `foo`, or somewhere further down the call chain. The resulting crash may not even manifest as a segfault.
In C++, Visual Studio will bombard you with all sorts of silly C++ Core Guideline advice if you try to write simple and straightforward code.
Finally, the standard libraries and 3rd party libraries might also get in the way if 'simple and straightforward' clashes with the idiomatic style of the language.
I'd only be worried if I were in the business of selling software written in C/C++ to the government but a few campaign donations to politicians would probably get that fixed.
There are many advanced systems in government written in C++ that would be impractical to deliver in any other language even if you were starting from scratch today. Extreme data intensity and throughput requirements are actually causing systems written in memory safe languages like Java to be replaced with C++ systems currently. Rust is often not a good fit for the software architectures required unless you are comfortable writing a lot of awkward unsafe code.
The government is pragmatic about programming languages, not ideological. They use both Rust and C++ in new systems but not for the same purpose; both have unique strengths in certain roles that ideologues are loath to acknowledge. I use Rust and C++ the same way.
If you have to write "a lot of awkward unsafe code" in Rust, you're doing something wrong. I use C++ rather than Rust, but any time I'm implementing any sort of non-trivial data structure or inter-thread communication, I encapsulate it in a "safe" interface that is impossible to misuse (at least in debug mode) and clearly delineates the boundary that e.g. a memory safety or data race audit would have to cover. Minimizing the volume and surface area of "unsafe" code is a generally useful heuristic in any systems language.
I don't think this is possible at all in C, which doesn't have classes, and the sophisticated following of pointers to find a method.
The base case here is user-space virtual memory, which is de rigueur for high-scale data intensive applications. Objects not only don't have a fixed address over their lifetime even if you never (logically) move them, they often don't have an address at all, and when they next have an actual memory address it may materialize in another thread/process's address space. And hardware can own references to and operate on this memory outside the object model (e.g. DMA). The silicon doesn't respect the programming language's concept of an object because it doesn't know objects exist. You have to design non-trivial async scheduling and memory fix-up mechanics to make all of this reasonably transparent at the object level. It is actually a pretty elegant model, complex compiler negotiations notwithstanding.
And of course, the reason we put up with the implementation complexity is that there is a huge gap in scalability/performance between this and the alternatives. Same reason people use thread-per-core software architectures. It works around some fairly deep limitations in the silicon and OS.