Deconstructing K&R C Is Dead (2015) (c.learncodethehardway.org)
It's an older home wiring technology that works fine for years if undisturbed, is still present and working OK in homes all over, was invented in the early days of electrified homes, requires considerable skill to install properly, tends to be unsafe if not handled skillfully, is expensive and delicate to modify, has no hidden components, allows interesting wiring layouts because conductors are separated, ...
One could go on with the obvious parallels. (I learned on a PDP-11.)
- A base ten number system
- Lack of useful structure in the symbols and names for numerals, and lots of weird inconsistencies in number names
- Inconsistent, confusing, and arbitrary names/notation for basic mathematical operators and functions
- Use of inferior Gibbs/Heaviside vector algebra instead of Clifford/geometric algebra
- Very poor notational conventions in many advanced math/physics fields
- A highly irregular calendar
- Poorly designed measurement systems
- English spelling
- Very distorted dominant world map projections
- Most nutrition “science”, including federal dietary guidelines
- Bogus forensic “science” used to imprison innocent people
- The methodology and writing style used in political science
- Many essentially debunked economic models which continue to be taught
- A legal system chock full of incidental complexity and inconsistencies
- Inadequate species taxonomies
- Poor color models used in art/design
- Even worse, specification of colors using proprietary, arbitrary Pantone chips
- Lots of poor/obsolete metrics used for evaluating lighting
- Audio mastering with heavy-handed dynamic range compression
- Lectures as primary pedagogy in high school/college
- Grammar drills as a method for teaching foreign languages
- Modern zoning requirements in many countries
- Many unsafe and inefficient street design requirements
- The rigid design of modern shoes (let’s not even start on heels)
- Terrible user interfaces for most household appliances
- Mediocre user interfaces for many musical instruments
- An inefficient and dangerous typewriter / computer keyboard (which persists on tiny phone screens!?)
- Unhealthy design of office furniture, car/airplane seats, child strollers, etc.
- .....
Some of this stuff is decades old. Some is thousands of years old.
It's much more like comparing crawling (machine code), walking (assembly), bicycling (C), and higher-level languages (faster to write, more built-in safety features, etc.).
Each is a good fit for a given role, and sometimes you need to get through tight spaces where using one of the lower impact methods is more effective; or maybe you just can't afford something 'nicer'.
Use the correct tool for the job.
This is a rather strange and insulting article. I'm not sure why Zed can't help "old programmers" nor do I understand why he's angered that individuals know about undefined behavior in C. Is there any background to this or did he have the misfortune of being insulted on IRC?
Edit -- I googled for a bit and discovered this was in response to someone doing a pretty good job technically reviewing the book for free! http://hentenaar.com/dont-learn-c-the-wrong-way Perhaps the title was a bit inflammatory.
Zed's rebuttal is at https://zedshaw.com/2015/09/28/taking-down-tim-hentenaar/ and is a great example of how not to react to constructive criticism. My favorite part is his safercopy function and the lack of size_t.
And finally, to leave us all with a quote from Zed's rebuttal:
"Over this next week I’m going to systematically take down more of my detractors as I’ve collected a large amount of information on them, their actual skill levels, and how they treat beginners. Stay tuned for more."
Wow.
All I can say is the order of topics, the choice of topics and the quoted explanations would make for a very confused beginner. Especially the crusade he seems to have against strings and functions called incorrectly. That makes me think he should be teaching the language, not the language he wishes it were. Of course these are selective quotations so I can't draw too many conclusions.
Going on my time teaching C, I wouldn't even mention Duff's device or safer, better strings at this level. There's better ways to introduce defensive programming, along with a discussion of the pros and cons.
Oh, I'm past 50 so am clearly "doomed" and beyond help. Not that I'm sure what I need help with. Oh well. :)
If you read the first part of that same sentence, it should give you a clue.
I disagree. Low level languages, especially C, are the easiest to master. K&R book is the only book you need to read to know everything about C. All you need after you understand the fundamentals is a bit of discipline.
C++ on the other hand is extremely difficult to master. Just have a look at the rules for Rvalue references and you will see what I mean.
It may be easier for a complete novice to write some code that doesn't crash in C++ than it is in C, but not to master it, or even to be good at it.
I am a huge C fan but this is not true at all. C has tons of pitfalls, especially with modern UB-aggressive optimizing compilers. There are a lot of rules you need to be aware of that are not naturally-occurring results of the fundamentals.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf is the most recent draft before the official, purchase-only C11 was published according to http://www.open-std.org/jtc1/sc22/wg14/www/standards. I don't know if it's identical, but it should be close, and it's free.
Absolutely not.
The problem with C is that even a C master can't necessarily write correct code, because C is a very programmer-unfriendly language, making developers remember to do various actions manually and perform error-prone calculations.
C++ is definitely harder to master (after many years, I can't say I master every corner of the language), but it's much easier to write correct code in C++ and it will be just as fast, run on as many platforms, etc, etc.
C lost this battle a long time ago, it's surviving because of nostalgia, still having good street cred and inertia. The number of domains where one must use C is shrinking and now that we also have Go and Rust this will accelerate. All for the better, really.
But man, the guy is insecure to the point of requiring therapy or something. He seems obsessed with his image and status, and the slightest criticism will cause him to lash out in an immature and ridiculous manner. Past rants have him making lewd comments about penis-sizes and challenging others to a physical fight[1].
It's a shame, because if he just relaxed a bit and took criticism gracefully, he'd probably find himself to be a bit more valuable to the community and employers, and would actually be a pretty decent dude. Instead, his writing seems to reek of a constant need to validate and defend himself.
This is probably an unfair comparison, but I can't help but think of Terry Davis: a brilliant programmer hindered by mental issues. Schizophrenia is obviously not the same as insecurity, but I think the situation here is somewhat similar.
[0] http://programming-motherfucker.com/
[1] http://harmful.cat-v.org/software/ruby/rails/is-a-ghetto
I do agree that we should be moving away from C and C++, though. It's pretty simple, really: C was a pretty good language in 1978. We didn't know a lot of things in 1978 that we do now in 2016. It now makes sense to revisit those decisions in light of nearly 40 years of practice. The so-called "PL Renaissance" has given us a whole host of new languages which have steadily chipped away at the dominance of C and C++, and I think this is a healthy trend that ought to continue.
> "You're right, but you're wrong that their code is bad." I cannot fathom how a group of people who are supposedly so intelligent and geared toward rational thought can hold in their head the idea that I can be wrong, and also right at the same time.
Zed, you're right, period. But I think you probably just hurt people's feelings because they revere Kernighan and Ritchie and this is one prominent item of their legacy.
> But C? C's dead. It's the language for old programmers who want to debate section A.6.2 paragraph 4 of the undefined behavior of pointers. Good riddance. I'm going to go learn Go (or Rust, or Swift, or anything else).
Amen. The union of those three are likely to address all use cases that C handled in the past.
BTW the blog post would be clearer if titled: " 'Deconstructing K&R C' is dead". Gotta love mixing up C with natural language operator precedence ambiguity. :)
Not a single actual quote from any of his detractors, for the reader to judge for himself or herself whether their criticisms have any validity.
The categorical declaration of "I cannot help old programmers," without providing the evidence he has for this claim. Lots of name calling, though.
No link to the original content, to determine for ourselves whether or not it was fair to K&R's work.
I suppose Zed just meant this to be personally cathartic, and didn't realize he posted it on a public web site where other people can read it?
Yes. I can't figure out exactly what he's ranting about. He writes "I will make it clear that my version of C is limited and odd on purpose because it makes my code safe." Does this mean he defined a safer subset of C? (There are lots of those. I've taken a crack at that myself [1], but it's politically hopeless. Rust is the way forward.)
Why would anyone want to write K&R C today? It's awful. It didn't even check function parameter types. Struct fields were just offsets; you could use one on a pointer of the wrong type and the compiler wouldn't complain. (Considering that Pascal predated C by some years, and had a sane type system, this was kind of lame. But they were trying to compile in 64K of 16 bit words in one pass. That was an adequate excuse in the 1970s.) The first ANSI C at least had a sane type system.
[1] http://www.animats.com/papers/languages/safearraysforc43.pdf
He's done these kinds of rants repeatedly. It's his counterproductive style. I can't judge his arguments on a technical level (I do think his introductory guides to various languages are excellent), but these kinds of rants surely alienate more people than they persuade?
There is nothing wrong with carefully crafted C code for applications where it is the best-suited tool. Sure, there are sharp edges. True, you can write crappy, security-nightmare code.
You do make some good points. I agree Go is fantastic. Rust is coming along as well. However, C still runs the world. That's not changing anytime soon. Not with the explosion of IoT and GPU-type devices. And, hello, Linux kernel and all the glorious command-line tools on *nix.
Try using Go or Rust (love both, x2 for Go) to allocate say a hundred GB of memory for some huge/fast in-memory data processing. Let me know how far you get.
Your rant is as polarizing as those who are blind to C's flaws (yes, there are a few). Stop saying "don't write C", that's just childish. Rather, what about "let's write better, less security flaw prone C."
As an engineer, one ought to choose wisely when choosing tools. This means pros and cons and balanced unemotional decision making. Not a holy war against a given tool.
And I am a professional programmer.
Let's do C where C makes sense.
(Edit: fixed typos)
Good riddance.
Unfortunate he uses this categorization. The problem is a mindset that can exist in any generation.
Z: K&R's strcpy is broken, e.g., you can forget to null-terminate the string. Mine is safer.
Others: It's not broken, of course it'll do something unpredictable if you break its preconditions.
Z: strcpy is still broken.
Others: Your function will break too if you pass it the wrong length.
Z: This cannot happen, K&R strcpy is broken, mine is safe.
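For anyone who wants to see the two positions side by side, here is a minimal sketch in C. It is not the code from the book or from K&R: kr_copy only mirrors the classic copy loop, and bounded_copy is a hypothetical name for a length-limited variant. Note that the bounded version is still only as safe as the size the caller passes in, which is the point the "Others" side keeps making.

```c
#include <stddef.h>

/* K&R-style copy: trusts that src is NUL-terminated and that dst is
 * large enough. Breaking either precondition is undefined behavior. */
void kr_copy(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Hypothetical bounded variant (name and signature are assumptions).
 * Writes at most dst_size bytes, always terminates dst when
 * dst_size > 0, and returns the number of characters copied.
 * A wrong dst_size still breaks it. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    if (dst_size == 0)
        return 0;

    /* Leave room for the terminator. */
    while (i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

Using size_t for the length avoids the signed/unsigned trouble that comes with an int length parameter.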
I wish that was true, but you will be surprised how many things you use everyday are written in C. Even the ones you would never imagine.
Node.js for example, a large part is in C. Redis, C. Memcached, C. PHP itself is written in C.
Make of that what you will, but it seems to me that given all of the other ways that C can blow up due to programmer error, it seems reasonable to expect programmers to pass a valid string to a string function.
Mind you, we're talking about the stdlib here. You can swap this stuff out. Some people do: djb is a fairly well-known example.
a. I haven't written a program in C in over 10 years. I wrote software 5 days a week for those 10 years.
b. I wouldn't want to write a program in C now.
c. The first "high level" programming language I learned was C, from a book (not K&R C), while travelling in Asia, without a computer. It taught me well, but I immediately went on to other languages.
e. I can't shake the idea that there is some value to knowing that low level stuff, even though I don't use it much myself.
Maybe linux kernel hackers will keep it alive. I know game programmers use it a lot as well. But for the majority of us, it's kind of an arcane skill now.
That's fine. Perhaps the kinds of programs you have been writing are not a good fit for what C is great at doing. That does not take away from C or its use for appropriate work.
When all else fails... come back and say this again ! But for the time being ignorance be bliss.
"But C? C's dead. It's the language for old programmers who want to debate section A.6.2 paragraph 4 of the undefined behavior of pointers"
Someone has to build the low-level stuff. Dear boys in too-tight pants and a hippie mustache: your high-level things and gluten-free snacks do not grow on trees.
Some of us were already doing it in much better languages, before C had any meaning outside AT&T walls.
Next time before getting pissed off about the response you get, think what could it be that you have said or done that may have triggered it.
You can document its shortcomings, its dangers and all the headache-inducing choices. But while you're doing that, people all over the world are building wonderful and terrible things with it.
So you're moving on to Go or Rust? Great! Good choices! But remember that there are people who may disagree and be wrong and also do something interesting with that wrongness.
If anyone is interested in what he removed, you can find it here: https://web.archive.org/web/20150101224641/http://c.learncod...
Zed is frustrating sometimes.
https://web.archive.org/web/20150106191636/http://c.learncod...
The Plan 9 dialect of C is another example. There is a portable mk package, which includes the core libs (libbio, libutf, etc., which also served as core libs for earlier versions of Golang), that lets you appreciate what C was supposed to be.
I would paraphrase - attention seeking by attacking classics is a poor style.
"I’ve more or less kept my mouth shut about some of the dumb and plain evil stuff that goes on in the Rails community. As things would happen though I’d take notes, collect logs, and started writing this little essay. As soon as I was stable and didn’t need Ruby on Rails to survive I told myself I’d revamp my blog and expose these fucks."
and:
"After Mongrel I couldn’t get a gang of monkeys to rape me, so forget any jobs. Sure people would contact me for their tiny little start-ups, but I’d eventually catch on that they just want to use me to implement their ideas. Their ideas were horrendously lame. I swear if someone says they’re starting a social network I’m gonna beat them with the heel of my shoe."
So that is very much his style of writing.
We detached this subthread from https://news.ycombinator.com/item?id=11727718 and marked it off-topic.
I'll take it as dead when the Linux kernel, or its futuristic replacement, is written in something other than C.
If you are talking about at the user-space level, then yes I can see that. But you shouldn't assume your single use case, higher level user space apps, is the only use case.
There's no argument that the Linux kernel is currently written in C. But that doesn't prove that nothing exists that can replace C.
Right now C is only the tried and true solution. The rest are possibilities only.
I don't think Zed's doing anything wrong. He's saying what he thinks needs to be said, he's challenging the complacent, and he's not pulling any punches. If you don't like his attitude there's plenty of other people to listen to. I appreciate that he's out there making noise, getting people to re-think their assumptions about programming.
If you live life by particular principles sometimes you have to take the hard road. You can't argue it hasn't been an interesting path.
Screw passion, I'd rather have discipline and a healthy interest instead.
>Has he at least mellowed out since 2007?
Yes
I don't have an opinion on the actual topic, but whether someone's goal is others' perception of them, or they are just poorly optimizing it as a proxy for their sense of self worth, attacking every criticism head on could undermine how others perceive them, or it could waste their time and mental energy compared to other things they actually care about more than the measure of their achievements based on others' perceptions.
The fact that C arrays decay to pointers without any bounds is single-handedly responsible for a huge chunk, possibly even the majority, of all RCEs, worms, malware, and exploits. Ever. In the history of computing.
It was a bad design.
It was a bad design in 1978.
It was known to be a bad design in 1978.
Other languages knew that checking array bounds was important, including for security. The internet made the impact of using C much more devastating but people were exploiting buffer overflows in the 80s to great effect. Some of C's predecessors/contemporaries passed a length as the first part of an array so bounds-checking was possible, though that has the downside of not being able to pass slices of an array without copying.
C could have included an arrayref type that was a length + base pointer, and let array l-values decay to an arrayref instead of a pointer. Then taking a slice of an array would not require copying elements. You could still take the address of an individual element. This would not have required much work to implement, even in 1978! Maybe the first compilers didn't insert array bounds checks, but at least the entire design wouldn't preclude them. Let's say you even spell arrayref as []. It would mean sizeof() works on arrays passed to functions.
    void wat(int[] values) {
        for (int i = 0; i < sizeof(values); i++) {
            printf("look ma, no buffer overflows! %d", values[i]);
        }
    }
(Yes, I know this is not K&R syntax)
Maybe you can forgive C for the stupid header compilation model (why let the compiler do what you can make the programmer do by hand?). You can understand why they might not have foreseen the need for namespaces. K&R didn't invent the macro system so that's not even their fault.
What is unforgivable is the horribly stupid design of C's arrays.
I actually think it would be beneficial if the standards committee added arrayref now. It won't fix all the busted C code but at least you could start improving the #1 problem. Compilers could eventually adopt a flag to prohibit arrays from decaying directly to pointers. You'd probably have to introduce lengthof() to avoid confusion and use some other syntax to declare one, maybe array(int) or something.
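Since C has no such type today, the nearest thing is to carry the length by hand. A minimal sketch of that idea in plain C99 follows; int_arrayref, ARRAYREF_OF, and the helper functions are all hypothetical names for illustration, not anything from the standard or a proposal. The macro only works where the argument is still a real array, which is exactly the information that is lost once it decays to a pointer.

```c
#include <stddef.h>

/* Hypothetical "arrayref": a base pointer plus an element count. */
typedef struct {
    int   *base;
    size_t len;
} int_arrayref;

/* Build an arrayref while the array's length is still known.
 * Only valid for true arrays, not for pointers. */
#define ARRAYREF_OF(arr) \
    ((int_arrayref){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Slice without copying; returns an empty ref if the range is invalid. */
int_arrayref arrayref_slice(int_arrayref a, size_t start, size_t count)
{
    int_arrayref empty = { NULL, 0 };

    if (start > a.len || count > a.len - start)
        return empty;
    return (int_arrayref){ a.base + start, count };
}

/* Checked access: returns 1 and stores the element, or 0 if out of range. */
int arrayref_get(int_arrayref a, size_t i, int *out)
{
    if (i >= a.len)
        return 0;
    *out = a.base[i];
    return 1;
}

/* The loop bound travels with the pointer, so it cannot run off the end. */
long arrayref_sum(int_arrayref a)
{
    long total = 0;

    for (size_t i = 0; i < a.len; i++)
        total += a.base[i];
    return total;
}
```

The obvious weakness is that nothing stops a caller from filling in the struct with a wrong length, which is why language-level support would be stronger than a convention like this.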
When C was designed, and even today, there are systems without pipelining, where it is expensive (in time) to de-reference a memory address and follow that pointer.
I do not argue that the design you suggest would be safer, and even have advantages for slicing; but that's really not the kind of program that C was intended to service writing.
Also, C is supposed to scale down to //really// simple systems. Systems that lack indirect addressing modes, caches, MMUs, etc. It is literally intended to be a thin veneer over actual assembly for those systems, which is why so many operations are specified in terms of /minimum standard unit size/ (for portability of that almost machine code between systems).
What you advocate is more like what C++ actually /should/ have been; a reason to use something more than C to gain advances in safety and ease of design.
This model enables binary-only distribution of libraries, you get the code as a .a (or lib, .so, .dll or whatever) and the API declaration as a header file.
You can write code against a library without having the library, using only the header. You can't do the final linking of course, but you can write the code.
The alternative, I guess, would be to embed this information in the library itself, and have the compiler extract it, which sounds as if it would have been scary from a performance point of view 40 years ago (and also somewhat hard).
I used to think it was hopeless, especially as each new language that came out required garbage collection or worse targeted the JVM. Perhaps cloud services will motivate people to fix this stuff since now computing costs are a hard line item on the books.
We surely did know that Burroughs was selling an operating system written in ESPOL, later NEWP in 1961. Nowadays Unisys still sells them as MCP.
We did know that the Flex machine was written in ALGOL 68RS in 1980.
We did know that VME was written in S3 in 1970.
We did know that Pilot was written in Mesa in 1977.
We did know that Lilith was written in Modula-2 around 1980.
There are lots of other examples.
The main difference was that UNIX and consequently C, source code were available for free because AT&T could not sell it, while people had to pay for the other ones or they were behind research walls.
There's still a lot of C code out there, and a lot of new C code still being written.
I don't agree: it's been a long process, but the trend is unmistakable. It's hard to remember now, but in the early '90s C and C++ were completely dominant. Nowadays they're much more specialized: you're as likely to build your company on Java or even Python/Ruby as you are to build it on C++. People talk about how it's hard to hire C++ engineers nowadays, while in the '90s "C++ engineer" was pretty much synonymous with "programmer". And so on.
We also have nearly 40 years of infrastructure built on C, which needs to be maintained and updated.
This is the same old argument advocating for rewriting everything from scratch just because someone somewhere managed to develop a new flavor of the month.
There are plenty of reasons why the whole world still has a heavy demand for COBOL and FORTRAN developers, and the development of new flavor of the month isn't a good enough reason to eliminate this demand.
I'm not saying rewrite everything for no reason. I'm saying that there are reasons, and we've gotten a very good idea of what those are over the last 40 years.
But C++ won't be easy to replace, and I'm not sure it needs to be, since rewrites are highly risky, time consuming and disruptive. With some luck and depending on how the language evolves we might be moving from C++ to a safer C++.
But using the C++ features that make it safer than C is only an option in small security motivated teams.
Sadly the majority of C++ teams, at least in the enterprise space, tend to use it as "C with classes", thus voiding most improvements the language has to offer over plain C.
I think that C should rapidly be moving toward obsolescence, and I hold K&R in great esteem.
This is hilarious, because programmers by and large love to pride themselves about being stoic, logical, and practical in lieu of letting emotion dictate what they do.
Since when do programmers give a shit if people's precious fee-fees contradict what is technically correct? (The best kind of correct!)
Programmers, at least the ones I've seen in my life, are not from Vulcan :). In other words, we humans are all driven by our emotions, like it or not. The problem is that some people choose to believe that they are purely rational beings, and therefore always right.
Well then, in that case shouldn't you really refer to it as K&&R?
I'm currently working on a couple of bugfixes for a Rust program I wrote last year which regularly allocates north of 500GB of RAM per-node on a cluster. It's wicked fast (regularly matching or beating comparable workloads implemented in C/C++), and Rust's ergonomics and safety guarantees made it very easy to extract much greater amounts of parallelism than the previous C++ version had, while never once having to chase down a bug from memory corruption, data races, or iterator invalidation.
Um, what's wrong with that in Rust?
> Rather, what about "let's write better, less security flaw prone C."
We've been trying this for the past 40 years and we've completely failed to stem the constant tide of new game-over security flaws. I think it's time to admit that if we couldn't do it in 40 years, we've failed.
> Try using Go or Rust (love both, x2 for Go) to allocate
> say a hundred GB of memory for some huge/fast in-memory
> data processing.
Why would this be a problem in Rust? It literally doesn't impose any overhead on memory consumption, at least not any that C doesn't (e.g. padding). Dropbox has clusters of machines that manage exabytes of data whose core is written in Rust.
There is no fundamental reason why this should be slower or harder in Rust. Rust generally compiles down to more or less the same code C does.
There are reasons why this could be slower in Go, but it really depends on what program you're writing, so it might even just work fine. If you don't hit the GC, for example (and Go gives you ample opportunities to not hit the GC), data processing should be quite fast. But it depends.
I'd love to hear real-world experiences with such systems in Go.
The busiest node traffic-wise had average GC time over the past 20min of 3.4ms every 54.5s. 95th percentile on GC time is 6.82ms.
That node is sitting at 36GB in-use right now, and has allocated (and freed) an additional 661GB over the past 20min.
Can't really speak to how fast this is vs other environments, but it's smooth sailing overall. /shrug
A consistent theme throughout the article is that he's actually more interested in teaching people to write C well than fight with pedants. He's not torching his book, he's updating it and removing the contentious chapter. "let's write better, less security flaw prone C." is exactly what he's trying to say - the "don't write C" bit at the end is more about it being a dinosaur than a childish huff, though there is a little of that in that comment.
I don't mean that's what you intended, but that's the direction it points in, which it isn't in the long-term interests of HN to allow.
Perhaps if you don't have an opinion on the actual topic, you should refrain from commenting on it if the best thing you have to add is a series of ill-advised ad hominems.
No idea how it compares with others; and not sure if it is representative, but to me that sounds pretty decent.
I doubt that. Kernels, drivers, embedded devices (not IoT), the GNU world, are all highly C oriented. Want to develop for a customer with an unknown Unix variant? Want to develop a tool everyone is going to use, on any of Linux/BSD/Solaris? C is the only option.
> but it's much easier to write correct code in C++ and it will be just as fast
Writing correct and fast C++ code at the same time was never an option; even today, with "safe" pointers, people are still confused how to correctly use shared_ptr<>.
> now that we also have Go and Rust this will accelerate
Some places where C is still a strong contender:
* good tooling - debuggers, memory leak detectors, years of experience with compilers on various platforms
* well understood language - C has dark corners and they are documented well
* interfacing with everything else - from devices to libraries and languages
Once upon a time most Unix software was written in C, shell, and awk. Then Perl came along. Did that diminish C? No. Then Java. Did Java diminish C? No. Then Python. Did Python diminish C? No. (You can throw C++ somewhere in there; not sure where. Though IME C++ use really seemed to explode with Windows developers migrating to Linux.)
In each case the universe of software expanded, but C was never diminished. People who think Rust, Go, or whatever will diminish C are ignorant of history. Of course, maybe the predictions will bear out. But I seriously doubt it, and it will be despite their underlying premises, not because of them. Rather, much more likely is an expanded ecosystem.
As I explained elsewhere in the thread, there's nothing intrinsic to the C standard which makes it unsafe. Compilers are free to add bounds checking at every point in the program; in most cases it would be just as cheap as in C++ or even Rust. It would require much rebuilding and retooling, but not much rewriting of existing software. (Relying on undefined behavior is dangerous not only because of optimizations, but because undefined behavior can also preclude automatic bounds checking.)
That C compilers don't do that is a function of 1) baggage and 2) other functional constraints, like strong ABI compatibility. But neither of those are set in stone. People who think C is hopelessly unsafe make the same mistake every C newbie (and some die-hard C-is-just-assembly people) do: conflating the language semantics with implementation and machine details.
People assumed that clang would quickly overcome GCC because it was so new and nimble. But clang still hasn't unequivocally really overtaken GCC, and certainly hasn't obsoleted GCC. Rather, the competition merely spurred GCC to evolve faster. I see much the same happening with C.
In the future, look to systems like OpenBSD, FreeBSD, and Alpine Linux, which are more free to upgrade their toolchain and runtime environments with backwards-incompatible changes, to field enhanced C environments with better bounds checking and mitigations. Approaches like stack canaries and ASLR are only the tip of the iceberg for what's possible.
> Compilers are free to add bounds checking at every
> point in the program; in most cases it would be just as
> cheap as in C++ or even Rust.
It would not be as cheap as in Rust because Rust uses an explicit standard library feature (iterators) to obviate the need for bounds checks in the vast majority of loops to begin with. But in C indexing is pervasive within loops, so you'd need to come up with much cleverer compilers that could manage to prove that bounds checks were unnecessary (compilers can already do this in some cases, for C/C++/Rust, but it's not perfect).
Likewise, one could make integer overflow in C well-defined, but this would also make C slower than Rust because the use of iterators means that Rust doesn't need to check for overflow on each loop iteration. Via language (or rather, library) features, Rust reclaims the performance that it otherwise would have lost to C by dint of being free of undefined behavior. I think you'd have a hard time doing this in C without rewriting every `for` loop in existence.
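To make the cost difference concrete, here is a rough C sketch of the two shapes of loop being compared. It is only an illustration under assumptions: checked_at stands in for a hypothetical compiler-inserted bounds check, and sum_checked_once imitates by hand what an iterator gives you, namely one range check up front and then a walk that cannot leave the range. Whether a real compiler could transform the first form into the second automatically is exactly the open question in this subthread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a compiler-inserted check on every a[i] (hypothetical). */
static int checked_at(const int *a, size_t len, size_t i)
{
    if (i >= len)
        abort();
    return a[i];
}

/* Index-based loop: one comparison against len per element,
 * unless the optimizer can prove it redundant. */
long sum_checked_each(const int *a, size_t len, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += checked_at(a, len, i);
    return total;
}

/* Iterator-like loop: validate the range once, then walk it.
 * No per-element bounds check is needed inside the loop. */
long sum_checked_once(const int *a, size_t len, size_t n)
{
    long total = 0;

    if (n > len)
        abort();
    for (const int *p = a, *end = a + n; p != end; p++)
        total += *p;
    return total;
}
```

Both functions return the same result for valid input; the difference is only in where the check happens.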
C++ is split between multiple factions. I'm doubtful that the one programming in C with classes is interested in learning e.g Rust.
Whereas the guy we're discussing seems to actually think that Paul Graham, billionaire, wakes up every morning thinking "how, today, can I enable the vicious internet slander campaign against Zed Shaw? (I am so intimidated by his genius)."
What would happen if I decided to pay you back for HN Paul?
What would happen if I started honestly reviewing your startups’ products?
If I just picked the worst ones, and then started tearing them in half?
What would happen if I went on every HN hiring post and started
posting dirt about the various shit HR practices your companies have?
What if I took all this writing and got my friends
you’ve fucked over to help me broadcast it?
What if I started posting this writing as replies to many of your comments?
What if I started offering to advise new coders, the millions I teach a year
(yes, millions Paul) to avoid all of your company’s startups?
What would happen if I just started putting anti-YC ads on my properties?
What if I started telling everyone how you take
7% and don’t give startups any real guidance?
What would happen if I started talking about the crazy bullshit I know
has happened at YC startups I’ve worked for and others have told me about?
Er… nothing?
The interesting question is what will mobile devices, the IoT and embedded devices in general be programmed in? C and C++ are popular choices today, so the trend is not really "unmistakable".
Everything else usually has other languages available as well, one just needs to search for what is out there.
Some of today's dominant platforms are developed mostly in Java (see Android), and web development targets the LAMP stack. This means that business is centered on ventures that exclude most languages, and not because the alternatives lack technical merit.
I'm sure it's possible to gather some individuals that are more than willing to badmouth Java and Python with a passion with the same ease we see here people complaining about C.
Oberon for ARM Cortex-M4, Cortex-M3 Microcontrollers and Xilinx FPGA Systems
http://www.astrobe.com/default.htm
Pascal and Basic for lots of micro and pico-processors
http://www.mikroe.com/compilers/
Ada,
http://www.ghs.com/products/ada_optimizing_compilers.html
http://www.ptc.com/developer-tools/apexada
Java for MCUs
A lot of the things rust brings to the table aren't always relevant on embedded platforms. Dynamic memory allocation on embedded is the exception, not the rule. Everything is statically allocated, so memory management is relatively simple -- everything sticks around forever.
The things that make C/C++ good for embedded are sorta what make it unfortunate for general purpose use. The things that make Rust/Go/Swift good for general purpose use make it unfortunate for embedded use.
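As an illustration of the "everything is statically allocated" style described above, here is a minimal sketch of a fixed-capacity buffer in C. The names and the capacity of 8 are hypothetical; the point is only that storage is reserved at build time and nothing is ever freed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 8           /* hypothetical capacity, fixed at build time */

/* All storage lives in static memory for the lifetime of the program. */
static int samples[MAX_SAMPLES];
static size_t sample_count = 0;

/* Returns 1 on success, 0 when the fixed buffer is already full. */
int sample_push(int value)
{
    if (sample_count >= MAX_SAMPLES)
        return 0;
    samples[sample_count++] = value;
    return 1;
}

/* Number of samples stored so far. */
size_t sample_len(void)
{
    return sample_count;
}

/* Reads a stored sample; an out-of-range index is a programming error. */
int sample_get(size_t i)
{
    assert(i < sample_count);
    return samples[i];
}
```

There is no ownership question to answer here, since nothing is ever released, which is the commenter's point about lifetimes being trivial in this setting.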
Rust is a better choice, and it's designed with this in mind. It's still young, though, and I think there might be some as-yet-unsolved issues (these are things I've vaguely heard of and could be totally off-base) like binary size, ease of dealing with raw pointers, etc.
If I was doing this sort of programming for a personal project, I'd probably try using Rust, because I like it.
Dunno about Swift, though IIRC the current reference implementation may also currently rely on GC.
Rust is a very promising language, but unless you plan to write everything from scratch, you have to depend on 3rd party driver implementations for most stuff. Databases for example. There isn't a single database vendor that has Rust drivers.
Rust is unsuitable as a replacement for C because its memory management is poorly thought out (i.e. it's a joke). Here are the relevant paragraphs from the Rust FAQ. Really?
"Rust avoids the need for GC through its system of ownership and borrowing, but that same system helps with a host of other problems, including resource management in general and concurrency.
For when single ownership does not suffice, Rust programs rely on the standard reference-counting smart pointer type, Rc, and its thread-safe counterpart, Arc, instead of GC.
We are however investigating optional garbage collection as a future extension. The goal is to enable smooth integration with garbage-collected runtimes, such as those offered by the Spidermonkey and V8 JavaScript engines. Finally, some people have investigated implementing pure Rust garbage collectors without compiler support."
No, it's a blatant ad hominem.
The guy invested his time and effort trying to improve the world by writing a technical book, which he then proceeded to give it away for free, and to this we see people like barbs replying with personal attacks accusing the author of being mentally disturbed to the point of requiring therapy.
This is a personal attack at its worst.
Perhaps the issue here is the C programming language and how teaching it can be improved, not what insults and personal attacks a random user online is able to throw at the author of a technical book.
People have more to learn from writing on undefined behavior than from puerile complaints regarding comments on penis sizes and ironic accusations of immaturity.
> The guy invested his time and effort trying to improve the world by writing a technical book, which he then proceeded to give it away for free
And I think this is certainly laudable, especially since they seem to have helped so many people. But he also called the Rails community "pricks, morons, assholes, and arrogant fucks who didn’t care about the art or the craft." and I think he should be held accountable for that, amongst other things.
I wrote a comment about the author's behaviour in public forums and in blogs, something I think he should be held accountable for, and something which I believe hurts both him and the communities he participates in. I believe this is relevant, and I'm entitled to discuss this here.
This is a discussion on a book on the C programming language written by someone, and here you are going full throttle on your personal vendetta against the author while saying absolutely nothing regarding the book or the programming language.
> Sure, I'm discussing his character
Precisely.
Go vent your frustrations somewhere else.
It's questionable whether people wanted that performance though, at least when it resulted in less security. About bounds checking in ALGOL 60: https://en.wikipedia.org/wiki/Bounds_checking
A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to—they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous.
There's no question about it, the "ANSI C Rationale" makes it very clear what they considered "the spirit of C"[1]:
> - Trust the programmer.
> - Don't prevent the programmer from doing what needs to be done.
> - Keep the language small and simple.
> - Provide only one way to do an operation.
> - Make it fast, even if it is not guaranteed to be portable.
> The last proverb needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.
> One of the goals of the Committee was to avoid interfering with the ability of translators to generate compact, efficient code. In several cases the Committee has introduced features to improve the possible efficiency of the generated code; for instance, floating point operations may be performed in single-precision if both operands are float rather than double.
[1] http://www.lysator.liu.se/c/rat/title.html Quoted section is found here: http://www.lysator.liu.se/c/rat/a.html#1
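The char-widening example in the quoted rationale can be observed directly. The sketch below, with hypothetical function names, reports whether plain `char` is signed on the current implementation and shows a widening that gives the same answer either way.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is a signed type on this implementation, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widens a char to int without depending on the signedness of plain char. */
int byte_value(char c)
{
    return (int)(unsigned char)c;
}

/* Demonstrates that plain widening depends on the implementation's choice. */
void check_char_widening(void)
{
    char c = (char)-1;
    int widened = c;    /* -1 where char is signed, UCHAR_MAX where unsigned */
    assert(widened == (plain_char_is_signed() ? -1 : UCHAR_MAX));
    assert(byte_value(c) == UCHAR_MAX);
}
```

Code that must behave identically across targets typically goes through `unsigned char`, as `byte_value` does, rather than relying on what the target happens to pick.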
-- http://www.memorymanagement.org/mmref/lang.html
Adding runtime bounds checking of automatic storage arrays (i.e. arrays on the stack) is relatively easy in C, at least until the compiler runs into illegal type punning. The real problem in implementing these compiler safeguards comes with crossing translation units, or with heap blocks. There's a reason languages like Rust and Go rely heavily on static linking and stack allocation; it's more difficult or more costly to implement those safeguards when the compiler can't see all the source code, or pointers pass through an opaque layer. Nothing in C precludes automatic bounds checking of all array access, via fat pointers or lookup tables. Fabrice Bellard's Tiny C compiler implemented precise bounds checking for both automatic and dynamic storage-allocated objects a decade before UBSan and ASan. Even deriving an invalid pointer crashed the app at the precise point where it happened. That widely-used C compilers don't do that is a strong hint there are other, real-world constraints in place.
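For readers unfamiliar with the term, a "fat pointer" carries its bounds along with the address. The following is a minimal hand-written sketch of the idea in C, with hypothetical type and function names; it is not how Tiny C or any particular compiler implements its checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fat pointer: base address plus element count. */
struct int_span {
    int *data;
    size_t len;
};

/* Checked read: returns 1 and stores the element, or 0 if i is out of range. */
int span_get(struct int_span s, size_t i, int *out)
{
    if (i >= s.len)
        return 0;
    *out = s.data[i];
    return 1;
}

/* Checked sub-span [start, start + count); an invalid range trips the assert. */
struct int_span span_slice(struct int_span s, size_t start, size_t count)
{
    assert(start <= s.len && count <= s.len - start);
    struct int_span r = { s.data + start, count };
    return r;
}
```

The difficulty the comment points at shows up as soon as a plain `int *` crosses a translation unit or comes back from an opaque library: the length has to come from somewhere, which is why the alternatives are an ABI change (fat pointers everywhere) or a side table keyed by address.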
Also, in a language like Java it's not uncommon to see people reinventing dynamic heap allocation using char arrays, susceptible to all the same overflow problems. When you see people doing that, that should be a hint that a language like C might work well.
I don't understand all the C hate. Then again, I have no problem employing various languages according to the task, or creating DSLs. I suppose if I was wedded to a single language or to the idea of a single language, C would look much worse to me.
This is untrue: Rust certainly does not do any optimisations linking statically by default, nor is there a difference between putting an array on the stack or on the heap. While it is true that code can benefit from whole-program optimisation, it isn't the default in either language, just like it isn't the default in C.
> That widely-used C compilers don't do that is a strong hint there are other, real-world constraints in place.
Yes. Those constraints are self-inflicted wounds caused by the fact that C wasn't designed for this. If you have a proper iterator API, a culture of unsigned array indexing, widespread use of a size_t equivalent instead of int for loops, etc. etc. these issues vanish.
This is C snake oil, sold by the C community that usually omits the fact that computers built a decade before the PDP-11, like the Burroughs, already had much better systems programming languages like ESPOL and NEWP, two Algol derivatives.
Algol 68 already had slices and modules (Algol68 RS) among many other nice features, running on early-'70s hardware.
There are a few other examples.
> like binary size, ease of dealing with raw pointers, etc
The majority of the size of typical Rust binaries is the huge amount of space (400 kb or so) that it takes to statically link jemalloc. But if you're building for a device that doesn't support dynamic allocation then you're not going to be including jemalloc, so binary size shouldn't be a problem.
As for raw pointers, they're exactly as capable as raw pointers in C, though they're deliberately more verbose as well, because even in embedded contexts one should be favoring references over raw pointers, since references are still fully checked for safety even in embedded mode and yet are represented by raw pointers at runtime and hence have zero runtime overhead.
BTW I am not implying that is Rust's fault, actually I can't think of a syntax that would make it less verbose, and I am a huge Rust fan.
something like C::function_from_c() is one option
In that case, you can simply not use the dynamic allocation features of Rust, just as you can simply not use malloc() in C.
> GC languages
Rust is not a GC'd language, it essentially uses RAII to deterministically determine at compilation time when memory will be freed.
> Dynamic memory allocation on embedded is the exception
Rust fully supports running entirely without dynamic allocation. There is a subset of the standard library defined explicitly for this purpose.
Right, I was referring to Go and Swift with the comment about GC. Since rust didn't have that problem, I mentioned why major adoption might not be forthcoming.
I understand that you can statically allocate all you want in Rust, but its memory safety, particularly with regards to object ownership, is one of its major selling points. Object ownership and lifetimes are trivial when everything is static.
The other arguable selling point to rust is its standard library, but much like C++ the standard library would be left aside in most embedded applications.
So it doesn't buy you much of anything at all, but it takes some work to setup, plus you are fighting the momentum that C/C++ has. Unless there is some other compelling reason to use it, I don't expect much adoption.
This is a library thing, not a language thing.
If single ownership is enough for you, go ahead and use it. But if you need a different memory management strategy, that is available too.
Rust, the language, provides a single clear memory management strategy. It also provides the ability to design your own abstractions for different strategies, and implements some of these in the stdlib.
C/C++ have refcounting and GC libraries too. Does that make them a joke?
> They really don't know what direction to go.
That's incorrect. A systems language needs to be flexible; it cannot dictate that everyone must do something a single way.
Does the presence of libraries for refcounting or even GC in C (most famously Boehm) mean that C has an incoherent story around memory?
> Rust's multiple methods of memory management makes it strange, at best,
> for programmers to decide how to build a program or an API.
It does not. Each has their place. Need single ownership? Use a type that has it. Need multiple ownership? Use a type that has it.
> Are the owners of Rust planning on adding a GC? Really?
Not really. There's a few different things here: first is integrations with other systems that have a GC. As an example, consider Servo: it has to interact with Spidermonkey's GC, since it interfaces with JavaScript code. Consider the opposite system: Rust embedded inside of another language, let's say Python, where you want to be able to talk to Python's GC for various reasons.
The second is something like Boehm: if a system wants to use GC for some reason, they have an interface to add a GC'd type. But Rust proper, the language, will not have GC. It's completely contradictory to the goals of the language.
To address this specifically: Current plans for Rust are mostly along the lines of adding the bare necessities in the stdlib to allow GC implementations to be written.
There are mostly-niche use-cases for having a GC in Rust. I've written some of the motivation here[1] (note that that blog post is about a pure library GC independent of Rust, which is different from what is planned; but the motivations are similar).
One major use case is if you want to talk to a language which has a GC. Say you're writing a native extension for Ruby or Node and want to deal with the GCd types within Rust code in a safe way without pausing the GC. Or if you're writing an interpreter for a GCd language. Or you're writing some code that deals with complicated cyclic graph-like data structures.
These are all pretty niche, but the workarounds in these cases aren't pretty so it's nice to have some form of GC capabilities in Rust. This is not a price you pay by default, and it's not something that affects anyone but the people who need these types. It will probably take the form of some low level APIs that use LLVM stack rooting to collect roots, and some traits in the stdlib, which can be used by an independent GC library (not part of the stdlib) or a language bindings library (also not part of the stdlib).
Rust itself will never get a GC as part of the language.
[1]: http://manishearth.github.io/blog/2015/09/01/designing-a-gc-...
LTO notwithstanding, once you add those more sophisticated constructs, iterating the language becomes more difficult. You don't hit upon the best method for implementing various types the first time, or the second time, or even the third time. glibc is backwards compatible for programs compiled over 15 years ago (GCC's fixinclude hacks notwithstanding). You'll never see that with Rust's or Go's standard library, just like you never saw that with C++.
My point wasn't that static linking was necessary. My point was that static linking is indicative of other tradeoffs that most people don't understand. Static linking isn't just about making packaging easier. It's also about making it easier to write and implement the compiler and standard environment.
My more abstract point is that people who think C is on its last legs don't understand the whole picture. There's nothing intrinsic to C that makes it unsafe. Fabrice's compiler was perfectly capable of implementing the C standard to the letter. What makes C unsafe are the requirements found in the niches where C exists, and those requirements don't magically disappear because the name of the language changes.
Rust supports unsafe code, but implementing code in Rust which is rigorously robust in the face of OOM situations, or where you need to implement use-case memory management strategies requires relying almost exclusively on unsafe code. (Try using Rust without boxing, for example, as is necessary if you want to catch OOM.) If you don't need those things, you probably don't need a low-level language, either. I love C, but I also love languages like Lua with lexical closures and stackless coroutines. To me, languages like Rust and even C++ exist at a middle ground that is very unappealing to me.
C isn't standing still, either. Strategies like SafeStack (see http://dslab.epfl.ch/proj/cpi/) can provide substantially the same safety guarantees as Rust in terms of real-world attack vectors, without having to modify any existing C software, and without giving up performance.
None of this is to say languages like Rust are useless. Just that the harms and inevitable demise of C per se are, IMHO, greatly exaggerated. And if and when a language like Rust grows in usage, I doubt it will supplant C so much as open and populate virgin territory.
That paper indicates that you do in fact give up performance, and the performance is comparable to existing SFI techniques. SafeStack itself is insufficient to prevent UAF problems with the heap. CPI prevents them, but with significant overhead. And you still don't get full memory safety.
It's not necessary, you can plug in a custom allocator that works differently and use boxing as usual.
There are plans for more robust custom allocator APIs that make this even easier to handle.
Also, really, even if Rust didn't have this, the situation wouldn't be worse than C. In C you have to malloc and free things manually. In Rust you can do that too. Rust's abort-on-OOM is an stdlib thing (which can be overridden as previously mentioned).
The performance hit is generally negligible, especially with abstractions like iterators in Rust that avoid them entirely, and standard optimisations that can lift the checks out of loops... optimisations that do not need any of the things you say that the compilers want. The cost of calling code in a different dynamic library (e.g. getting the dynamic symbol address and then doing the actual call) is going to be much greater than whatever bounds checks it does in almost all situations.
> There's a reason languages like Rust and Go rely heavily on static linking and stack allocation.
As I just said, this is factually false. Static linking is entirely orthogonal to bounds-checking optimisations (neither Rust nor Go do whole program optimisations when linking statically, so it can't be the motivation for it), as is putting data on the stack. GC seems even more irrelevant, especially to Rust which doesn't have one.
> My point was that static linking is indicative of other tradeoffs that most people don't understand. Static linking isn't just about making packaging easier.
But it isn't indicative! In Rust's case, linking statically is for packaging: the reason is the ABI is unstable, so dynamically linking is very annoying to manage and many of its benefits are inhibited.
> There's nothing intrinsic to C that makes it unsafe.
The forever-growing list of CVEs caused by basic mistakes in C code says otherwise. Things like overrunning a buffer or reusing a freed pointer are not at all caused by domain specific constraints, they're the price one pays for using 40 year old technology. You can see this in modern tools that try to assist with getting safer C: they are often using things that didn't exist when C was created. (And, don't get me wrong, C is here to stay, even if all new C development was stopped today, and so efforts to make it safer are very good, but at some point we have to face the reality of C/stop the C-apologism.)
> Fabrice's compiler was perfectly capable of implementing the C standard to the letter.
This is essentially meaningless for two connected reasons: the major problem with C is the holes in the standard (undefined behaviour)---not compiler bugs---and, people want fast code, they need optimisations, which often exploit undefined behaviour.
> (Try using Rust without boxing, for example, as is necessary if you want to catch OOM.)
Boxing or not is irrelevant to safety: using Box allows in fact more aggressive `unsafe` code (one can rely on address-stability to correctly sidestep the compiler's normal checks). Rust-the-language effectively knows nothing about the stack or heap when reasoning about safety: it does reason about stack scopes, but it doesn't care where the data is actually positioned in memory: Box<T> is isomorphic to a plain T in this respect.
In any case, the power of Rust is the ability to wrap code into safe abstractions: if there is a particular feature the standard library doesn't provide (yet), external libraries have the power to create APIs that have the same level of safety, maybe with a bit of `unsafe` internally. You can see this even in "use-case memory management" situations like a kernel: http://os.phil-opp.com/modifying-page-tables.html
You put your finger on the problem: "modern UB-aggressive optimising compilers". C, the language, is actually quite simple (if not easy). The crazy stuff that compiler writers have been doing recently while aggressively mis-reading the C standard is the problem and does make things very complicated.
Why "misreading"?
From 1.1:
"The X3J11 charter clearly mandates the Committee to codify common existing practice."
Their emphasis, not mine. So is there a mandate to use the definitions of the standard to invalidate common existing practice? Clearly not. Yet that is what is happening.
More from the standard (defining UB):
"Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behaviour."
Does it say "Undefined behaviour gives implementors license to add new optimisations that break existing programs"? Clearly and unambiguously not.
> More from the standard (defining UB):
Your quote is not from the normative text of the standard, but from the non-normative rationale. Note however that it explicitly says that programs that contain undefined behaviors are erroneous, and that the implementation is not required to emit diagnostics for the UB. Pretty clearly this allows implementations to optimize erroneous programs into whatever they think is funny this week.
The normative text of the standard is pretty unambiguous:
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
http://www.iso-9899.info/n1570.html#3.4.3
Utter nonsense. I use that word carefully, but in this case it is absolutely appropriate.
Compiler optimisations per an old but very useful definition aren't allowed to change the visible behaviour of programs (in terms of output, obviously they are allowed to change execution times).
For example, even just a couple of years ago the compilers I used would execute a loop that sums the first n integers. Nowadays compilers detect this and replace the loop with the result. While this isn't particularly useful, because probably the only reason you're summing the first n integers in a loop is to do some measurements, it is (a) a perfectly legal optimisation and (b) happened after 1990.
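A sketch of the transformation being described, written out by hand: the loop and a closed form that returns the same value. The function names are hypothetical, unsigned arithmetic is used so that nothing here can overflow into undefined behaviour, and the two forms are only claimed to agree while the true sum fits in `unsigned long`. Whether a given compiler actually performs this rewrite is not something the sketch demonstrates.

```c
#include <assert.h>

/* Sums 1..n with an explicit loop, as a programmer would write it. */
unsigned long sum_loop(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 0; i < n; i++)
        total += i + 1;
    return total;
}

/* Closed form n(n+1)/2, dividing the even factor first. */
unsigned long sum_closed_form(unsigned long n)
{
    return (n % 2 == 0) ? (n / 2) * (n + 1) : n * ((n + 1) / 2);
}

/* Checks that both forms agree for small n. */
void check_sums_agree(void)
{
    for (unsigned long n = 0; n <= 1000; n++)
        assert(sum_loop(n) == sum_closed_form(n));
}
```

Since both functions produce the same output for the same input, replacing one with the other changes only execution time, which is the sense of "optimisation" the comment is appealing to.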
Unsurprisingly, you left out the second part of the (later) definition:
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable
results, to behaving during translation or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).
Notably absent is "use the undefined behaviour to shave another 0.2% off my favourite benchmark".
No it doesn't say that. It says that they are either "nonportable" or "erroneous". I'll take "nonportable" for 400, please.
Maybe I live in a C reality distortion field. :)
    void free_circularly_linked_list(struct node *head) {
        struct node *tmp = head;
        do {
            struct node *next = tmp->next;
            free(tmp);
            tmp = next;
        } while (tmp != head);
    }
Can you spot the undefined behavior?
I've written up a demo with your code, running it through several analysers:
https://gist.github.com/technion/1b12c9b4581e915241d9483c5c2...
The tl;dr here is that tis-interpreter is a fantastic new tool, as it correctly complains about this.
Edit: I also note a departure from yesteryear, where every linting tool would only manage to complain about unchecked malloc() returns.
I wasn't aware of the tools that do this automatically. Just found one and it looks promising.
It is not part of the normative definition, which says "for which this International Standard imposes no requirements". In ISO standards, notes are without exception non-normative.
Although I think they really should add your proposed text as an additional example, as their current set of examples is evidently confusingly incomplete :-)
Options like GCC's -fwrapv/-ftrapv and -fno-strict-aliasing are examples of language extensions that are essentially implementation defined UB.
Edit: Of course you could argue that things where hardware differences are a likely motivation, such as signed integer overflow, ought not to be UB in the first place but instead left as implementation-defined in the standard. In that case your issue is with the C standard committee, not with implementers.
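For code that cannot rely on flags such as `-fwrapv` or `-ftrapv`, the usual portable approach is to test the operands before adding. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stores a + b in *out and returns 1 if the sum fits in int; returns 0
 * otherwise. The range test runs before the addition, so the signed
 * overflow itself never happens. */
int checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

This is the per-operation cost the earlier comments are arguing about: two comparisons and a branch on every addition unless the compiler can prove the range.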
    void free_circularly_linked_list(struct node *head) {
        struct node *tmp = head->next;
        while (1) {
            if (tmp == head) {
                /* Has to be a separate case since even assigning
                 * a dangling pointer is UB I believe? */
                free(tmp);
                break;
            } else {
                struct node *next = tmp->next;
                free(tmp);
                tmp = next;
            }
        }
    }

    void free_circularly_linked_list(struct node *head)
    {
        struct node *a = head->next;
        while (a != head) {
            struct node *b = a->next;
            free(a);
            a = b;
        }
        free(head);
    }
Great example, by the way!
Let's say head value is "10" and the memory at "10" is {..., next: "10"}
After the first iteration we will have:
Head: "10" Next: "10" Temp: "10"
With "10" pointing to freed memory. But why do we care? We are not dereferencing it, are we?
(I think I am missing something very obvious)
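On the question of why a comparison matters when nothing is dereferenced: as I read C11 6.2.4p2, the value of a pointer becomes indeterminate once the object it points to reaches the end of its lifetime, and Annex J.2 lists using such a value as undefined behaviour. In the original do/while version the first iteration frees the node `head` points to, so every later evaluation of `tmp != head` reads a pointer whose value is already indeterminate. The sketch below wraps the "free head last" variant from the thread in a small harness; `make_ring`, `free_ring`, and the one-field `struct node` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
};

/* Frees every node of a circular list and returns how many were freed.
 * head is freed last, so no pointer is read or compared after the
 * object it points to has been released. */
size_t free_ring(struct node *head)
{
    size_t freed = 0;
    struct node *a = head->next;
    while (a != head) {
        struct node *b = a->next;   /* read the link before freeing a */
        free(a);
        freed++;
        a = b;
    }
    free(head);
    return freed + 1;
}

/* Builds a circular list of n >= 1 nodes; returns NULL if malloc fails. */
struct node *make_ring(size_t n)
{
    assert(n >= 1);
    struct node *head = malloc(sizeof *head);
    if (head == NULL)
        return NULL;
    head->next = head;              /* a ring of one points at itself */
    for (size_t i = 1; i < n; i++) {
        struct node *p = malloc(sizeof *p);
        if (p == NULL) {
            free_ring(head);        /* the ring is well-formed at every step */
            return NULL;
        }
        p->next = head->next;       /* splice p in right after head */
        head->next = p;
    }
    return head;
}
```

In practice the broken version usually appears to work, because most implementations treat a stale pointer as an ordinary integer-like value, which is exactly why this class of bug is easy to miss without a tool such as the one mentioned above.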