Fuzzing Perl: A Tale of Two American Fuzzy Lops(geeknik.net) |
Fuzzing Perl: A Tale of Two American Fuzzy Lops(geeknik.net) |
> We study the relative efficiencies of the random and systematic approaches to automated software testing. Using a simple but realistic set of assumptions, we propose a general model for software testing and define sampling strategies for random (R) and systematic (S₀) testing, where each sampling is associated with a sampling cost: 1 and c units of time, respectively. The two most important goals of software testing are: (i) achieving in minimal time a given degree of confidence x in a program's correctness and (ii) discovering a maximal number of errors within a given time bound n̂. For both (i) and (ii), we show that there exists a bound on c beyond which R performs better than S₀ on the average. Moreover for (i), this bound depends asymptotically only on x. We also show that the efficiency of R can be fitted to the exponential curve. Using these results we design a hybrid strategy H that starts with R and switches to S₀ when S₀ is expected to discover more errors per unit time. In our experiments we find that H performs similarly or better than the most efficient of both and that S₀ may need to be significantly faster than our bounds suggest to retain efficiency over R.
Also, for (say) C programs of moderate size, I believe current program analysis techniques do not achieve anything near 100% branch coverage. In the KLEE paper they manage to get ~90% on average for the GNU coreutils using concolic execution – which is really impressive! But the coreutils are tiny programs. For larger programs the cost of symbolic execution is high and random testing can often get you more bugs faster.
I'm so glad new programming languages are making strides which prevent this sort of thing outright. They don't prevent all bugs, but they sure prevent some of the most damaging ones.
My first guess is that the two main things causing this is number of lines of code (attack surface), and being implemented in C. The reference implementation of Python is also in C. GHC is partially implemented in C. Being implemented in C still seems common, and C is prone to these kind of bugs. Perl is a complex language, giving the code base a large attack surface. If anything, I'd guess languages with smaller implementations have fewer bugs, since one of the most robust findings on these sorts of matters is that the number of bugs is proportional to the number of lines of code independently of other factors.
Simply the enforced restrictions on memory management. Of course some of those languages are implemented either in C, or in some language which implements that memory management. But if you implement a language with forced GC and bound checks (python for example), you have only two places where use-after-free and overflows can originate. (ignoring native extensions) It's much simpler to review / fix them globally rather than in each usage site. Putting other restrictions like strict type hierarchy, lifetimes, escape analysis, etc. in place makes the languages more immune by design.
Older/newer split doesn't make a huge amount of sense though. There are lisps, there's ADA, there are lots of other very old languages which avoid those issues by design.
Basically, I think the "newer languages" comment refers to the implementation language (C being the most common today, but probably not for more than a few more years) rather than the language itself (Perl in this case). Though I think you're entirely right that surface area plays a big role. I would be willing to bet that Perl 6 will be more secure (by some definition of "more" and "secure") than Perl 5, because the surface area of Perl 6 that is in C is much smaller (the VM, with most of the language itself being written in a subset of Perl called NQP, or Not Quite Perl).
I fixed the publicly reported bugs in 2 minutes. I cannot fix the other bugs since they were not reported to cperl (the perl5 fork which is doing the actual development of perl5). The perl5 security team is doing horrible work, so I would prefer to get the reports also, for independent and usually better fixes.
Brian Carpenter and Dan Collins provided excellent afl work lately for perl5.
2 of them were actually security relevant, one stack overflow, one heap overflow, all 6 issues already fixed in git.
Regarding your comment about "deserve security": yes. perl5 would deserve a bit security, but all they do is theatre. dozens of known security fixes are ignored.
You'll get some fluctuation, a bug may come up half an hour earlier or not. But results tend to be pretty reproducible. If you find a bug with a specific test and tool in x hours then the next time you try for at least x+1 hours you'll find it again.
KLEE result was interesting. I'm wondering about combining program analysis with semi-random testing that generates its values within the ranges, etc that come from program analysis. Might also spread the effort along the control flow graphs and such. Alternatively, as in dynamic analysis, do the kinds of instrumentation in software, annotations or compiler-level, that catch situations that are probably risky combined with input from random testing likely to set it off. Might speed up process but I imagine someone is already doing one or both of these.
Good example: There was a bug I found in openssl that Libfuzzer was unable to find. The Libfuzzer developer was quite interested in this and has now adopted new mutation strategies: https://github.com/google/sanitizers/issues/710
And since most of the old dynamic languages (and also the new conservatives ones) are refcounted, the most common error is use-after-free, and then buffer overflows (heap or stack).
off-by-one is common to all systems.
Since I know perl5 and perl6 internals inside out I wouldn't bet that perl6 is more secure than perl5 at all. The perl5 is more hairy and eroded over time, but the perl6 vm is not really battle-proof and way too slow to be serious.