C considered dangerous

45 points by johnramsden 7 years ago | 62 comments

rwmj 7 years ago |

> He asked: why is there no argument to memcpy() to specify the maximum destination length?

I'm confused by this. The third argument provides the destination length, so what good would a "maximum destination length" do? I guess he must mean that because the length is often computed, you'd need a fourth argument to ensure the length isn't greater than some sane upper bound. But you can easily fix that using an if statement around the memcpy.
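A minimal sketch of that "if statement around the memcpy", assuming a hypothetical helper named copy_checked that refuses the copy instead of overflowing (the name and the 0/-1 return convention are illustrative, not from the talk):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len bytes only if they fit in the destination.
 * Returns 0 on success, -1 if the copy was refused. */
int copy_checked(void *dst, size_t dst_size, const void *src, size_t len)
{
    if (len > dst_size) {
        return -1;              /* refuse rather than write past dst */
    }
    memcpy(dst, src, len);
    return 0;
}
```

The caller still has to check the return value, so the check is only as good as the discipline around it.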

vardump 7 years ago | |

Perhaps because the memory buffers might be of different sizes.

Maybe memcpy_oobp (out of bounds protection) signature could be:

  void *memcpy_oobp(void *dst, size_t dst_size, const void *src, size_t src_size);

Then again, I guess you could just as well do:

  memcpy(dst, src, min(dst_size, src_size));

But having to explicitly specify both destination and source sizes might have prevented a lot of buffer overwrite bugs.
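A rough sketch of the truncating variant, assuming a hypothetical function copy_truncating that takes both sizes and returns the number of bytes actually copied (the return value is an assumption added here so callers can notice truncation):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: copy at most dst_size bytes from src into dst.
 * Returns the number of bytes copied, which is less than src_size
 * when the source did not fit. */
size_t copy_truncating(void *dst, size_t dst_size,
                       const void *src, size_t src_size)
{
    size_t n = dst_size < src_size ? dst_size : src_size;
    memcpy(dst, src, n);
    return n;
}
```

Silent truncation avoids the overwrite but can still be a bug in its own right, so comparing the result against src_size is probably worth doing.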

sebcat 7 years ago | | |

> But having to explicitly specify both destination and source sizes might have prevented a lot of buffer overwrite bugs.

A good way to prevent this is to have a buffer abstraction, where the size is a property of the type, e.g.,

    typedef struct {
      size_t bytes_used;
      size_t capacity;
      void *data;
    } buf_t;

    int buf_init(buf_t *buf);
    void buf_cleanup(buf_t *buf);
    void buf_copy(buf_t *dst, buf_t *src);
    /* ... */

Of course, it doesn't prevent people from using memcpy directly.
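One way the functions could be filled in, as a sketch only. The initial capacity of 64 bytes is an assumption, and the copy is written as buf_copy_checked returning a status (unlike the void prototype above) so that an allocation failure can be reported:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
  size_t bytes_used;
  size_t capacity;
  void *data;
} buf_t;

enum { BUF_INITIAL_CAPACITY = 64 };   /* assumed starting size */

/* Returns 0 on success, -1 if allocation failed. */
int buf_init(buf_t *buf)
{
  buf->bytes_used = 0;
  buf->capacity = 0;
  buf->data = malloc(BUF_INITIAL_CAPACITY);
  if (buf->data == NULL) {
    return -1;
  }
  buf->capacity = BUF_INITIAL_CAPACITY;
  return 0;
}

void buf_cleanup(buf_t *buf)
{
  free(buf->data);
  buf->data = NULL;
  buf->bytes_used = 0;
  buf->capacity = 0;
}

/* Hypothetical variant of buf_copy: grows dst when needed.
 * Returns 0 on success, -1 on allocation failure (dst left unchanged). */
int buf_copy_checked(buf_t *dst, const buf_t *src)
{
  if (src->bytes_used > dst->capacity) {
    void *p = realloc(dst->data, src->bytes_used);
    if (p == NULL) {
      return -1;
    }
    dst->data = p;
    dst->capacity = src->bytes_used;
  }
  if (src->bytes_used > 0) {
    memcpy(dst->data, src->data, src->bytes_used);
  }
  dst->bytes_used = src->bytes_used;
  return 0;
}
```

Because the size travels with the pointer, the copy never has to trust a length computed at the call site.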

rwmj 7 years ago | | |

I guess so. One of the LWN comments mentions a Microsoft function memcpy_s defined as:

    errno_t memcpy_s(void *dest, size_t destSize, const void *src, size_t count);

which is effectively equivalent to your memcpy_oobp function.

However, the Microsoft function also returns an error code which must be checked (because count might be larger than destSize), thus providing another way for the programmer to screw up. I'm not sure if this is better or worse than just copying the min() as in your second example. It probably depends on the situation.

rurban 7 years ago | |

Just use memcpy_s. This has the destbuf size argument. It's even in C11 (as the optional Annex K), but you need safeclib or MSVC, as no libc cares about the safety annex.

deng 7 years ago |

Thankfully, compiler warnings and static analyzers have become much better in recent years. For instance, gcc can now warn about a missing 'break;' mentioned in the article (you need to add a special comment like '/* fall through */' if it's intentional). Also, clang-tidy is getting better with each release. I highly recommend using it, although the initial configuration will take some time, depending on the code base.
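For illustration, a small sketch of an intentional fall-through marked with the comment form mentioned above. The function steps_from is hypothetical; with GCC 7 or later, -Wimplicit-fallthrough (part of -Wextra) should warn if the comments are removed:

```c
/* Hypothetical example: count how many steps remain from a given stage.
 * Each case deliberately falls into the next one. */
int steps_from(int stage)
{
    int steps = 0;
    switch (stage) {
    case 0:
        steps++;
        /* fall through */
    case 1:
        steps++;
        /* fall through */
    case 2:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```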

xroche 7 years ago |

Alas! strlcpy and strlcat are still not present in glibc, despite numerous attempts, mainly for religious reasons (i.e. "BSD sucks").

And yes, having something like "if (strlcat(buffer, src, sizeof(buffer)) >= sizeof(buffer)) { abort(); }" is much better than a buffer overrun. But security does not always seem to be a real concern, compared to politics.
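Where strlcat is not available, a similar check can be sketched in standard C. The helper append_checked below is hypothetical and differs from strlcat in that it checks before copying and leaves the buffer untouched on failure, rather than truncating and reporting afterwards:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in buf,
 * where buf has room for size bytes in total.
 * Returns 0 on success, -1 if the result would not fit or buf is not
 * NUL-terminated within size bytes (buf is left unchanged). */
int append_checked(char *buf, size_t size, const char *src)
{
    const char *end = memchr(buf, '\0', size);
    if (end == NULL) {
        return -1;                      /* no terminator inside the buffer */
    }
    size_t used = (size_t)(end - buf);
    size_t need = strlen(src);
    if (need >= size - used) {
        return -1;                      /* no room for src plus the NUL */
    }
    memcpy(buf + used, src, need + 1);  /* copies the terminator too */
    return 0;
}
```

A caller that prefers the abort() behaviour can simply abort when this returns -1.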

yason 7 years ago |

C is dangerous partly because assembly language is dangerous. We will always need some layer on top of assembly that is mostly unchecked and reflects back to how cpu instructions work. This is probably something we must live with until we have processors with the notion of type checking.

C is dangerous partly because of swaths of undefined behaviour and loose typing. Eliminating much of the undefined behaviour, either by defining the behaviour or by forcing the compiler to refuse to compile it, could be of some help. There are still classes of undefined behaviour that cannot be worked around, but narrowing that down to a minimal set would make it easier to deal with. Strong typing would help build programs that won't compile unless they are correct at least in terms of types of values.

C is dangerous partly because of the stupid standard library, which isn't necessarily a core language problem as other libraries can be used. The standard library should be replaced with any of the sane libraries that different projects have written for themselves to avoid using libc. It's perfectly possible not to have minefields like memcpy() or strcpy(), or functions like strtok(), which introduces nice invisible access to internal static storage (fixed by a re-entrant variant like strtok_r()), or strtol(), which requires you to do multiple checks to determine how the function actually failed. The problem here is that if there are X standards, adding one to replace them all will make it X+1 standards.
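To make the strtol() point concrete, here is a sketch of the checks a careful caller ends up writing. The wrapper parse_long and its 0/-1 convention are hypothetical:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: parse a base-10 long from s.
 * Returns 0 and stores the value in *out on success, -1 on any failure. */
int parse_long(const char *s, long *out)
{
    char *end;
    errno = 0;                      /* strtol only sets errno on error */
    long value = strtol(s, &end, 10);
    if (end == s) {
        return -1;                  /* no digits were consumed */
    }
    if (errno == ERANGE) {
        return -1;                  /* value out of range for long */
    }
    if (*end != '\0') {
        return -1;                  /* trailing characters after the number */
    }
    *out = value;
    return 0;
}
```

Three separate conditions have to be tested to tell a valid result apart from the failure modes, which is the kind of interface the comment is complaining about.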

Yet, good programmers already avoid 99% of the problems by manually policing themselves. For them, C is simple, productive, and manageable in a lot more cases and domains than it is for the less experienced programmers.

pjmlp 7 years ago | |

Ironically, other systems programming languages developed outside AT&T walls since 1961 did not suffer from the majority of C's pain points regarding memory corruption.

I really wish Bell Labs had been allowed to sell UNIX.

IshKebab 7 years ago |

Terrible title. It's not remotely news that C is dangerous. This talk seems to be about ways of mitigating the dangers. Why not call it "Mitigating the dangers of C" or something else that is less of a tired cliche?

pjmlp 7 years ago | |

Because "Making C Less Dangerous" is the actual title of the talk, and "Towards less dangerous C" is part of the agenda?

fithisux 7 years ago |

The title is completely misleading.

xvilka 7 years ago |

Hopefully the Zig [1] language will become a better alternative to C in the coming years. I'm not talking about higher-level code, where Rust or Go can be a better choice.

[1] https://ziglang.org/

pjmlp 7 years ago | |

No language can become an alternative to C in the context of UNIX-like OSes because no one is going to rewrite them from scratch, given their symbiotic nature.

Even if the complete userspace of Aix, HP-UX, *BSD, GNU/Linux, OS X, iOS, Solaris, ... gets rewritten in something else, there will always be the kernel written in C.

Hence why improving C's lack of safety is so important to get a proper IT stack.

abainbridge 7 years ago | |

The problem with Zig is that they changed almost everything. I think there's a high risk they introduced new design problems that we won't know about fully until Zig has been used in anger for 10 years.

I've always felt that C is near the sweet spot. I'd rather see a minimal change to C that broke backwards compatibility (because it has to) and fixed the top ten simple problems.

amelius 7 years ago |

Why don't they use valgrind?

deng 7 years ago | |

The kernel has CONFIG_HAVE_DEBUG_KMEMLEAK.