LLVM merges SafeStack(github.com) |
LLVM merges SafeStack(github.com) |
LE: indeed, it's quite clear from the mentioned article (http://dslab.epfl.ch/pubs/cpi.pdf). So this provides great exploit protection.
> With SafeStack alone, an attacker can overwrite a function pointer on the heap or the unsafe stack and cause a program to call arbitrary location, which in turn might enable stack pivoting and return-oriented programming.
And you need additional features (such as CPI from the paper you and the commit message link to) for full protection.
It's used as a way to defeat DEP (Data Execution Prevention); with DEP the attacker can no longer write code into memory and then execute it, so instead they just set up the stack cleverly so they can carry out a return-oriented payload (most commonly, these payloads just disable DEP and then move on to a more traditional second stage).
More info:
The paper that introduced the name ROP (though some would argue that the techniques existed before this paper): https://cseweb.ucsd.edu/~hovav/dist/geometry.pdf
Wikipedia: https://en.wikipedia.org/wiki/Return-oriented_programming
"This is lower than the overhead of stack cookies, which are supported by LLVM and are commonly used today, yet the security guarantees of the safe stack are strictly stronger than stack cookies."
Win-win for everyone
> This is because objects that end up being moved to the regular (unsafe) stack are usually large arrays or variables that are used through multiple stack frames. Moving such objects away from the safe stack increases the locality of frequently accessed values on the stack, such as CPU register values temporarily stored on the stack, return ad- dresses, and small local variables.
I wonder if this speedup effectively hides the performance overhead of SafeStack.
Google will be able to do likewise for NaCL apps.
- Most real-world exploits these days are based on use-after-frees, heap buffer overflows, and other heap-related weirdness, rather than stack buffer overflows. It's nice that SafeStack mitigates that attack vector though (but if you disable stack canaries in favor of it, you actually reopen the door to exploit certain types of vulnerabilities...)
- A (the most?) common method to proceed from memory corruption to return-oriented programming is to redirect a virtual method call or other indirect jump to a stack pivot instruction. SafeStack alone does nothing to prevent this, so it doesn't prevent ROP.
- However, the general code-pointer indirection mechanisms described in the paper, of which SafeStack is an important component, could make ROP significantly harder, because you would only be able to jump to the starts of functions. This guarantee is similar to Windows's CFG (although the implementation is different), but SafeStack makes it harder to bypass by finding a pointer into the stack (either on the heap or via gadget).
- In practice, interoperation with unprotected OS libraries is likely to seriously compromise the security benefits of the combined scheme, because they will store pointers into the real stack, jump directly to code pointers on the heap, etc. JIT compilers are also likely to be problematic.
- In addition, there are more direct ways for an attacker to work around the protection, such as using as gadgets starts of functions that do some small operation and then proceed to a virtual call on an argument. The larger the application, the more possibilities for bypass there are.
- Still, "harder" is pretty good.
Edit: By the way, the point about function start gadgets makes questionable the paper's claim that "CPI guarantees the impossibility of any control-flow hijack attack based on memory corruptions." Also, if you want to guarantee rsp isn't leaked, it isn't enough to keep all pointers near it out of regular memory: they also have to be kept out of the stack itself, because functions with many (or variable) arguments will read them from the stack - at least, I don't see a claim in the paper about moving them - so subverting an indirect call to go to a function that takes more arguments than actually provided (or just changing a printf format string to have a lot of arguments) will cause whatever data's on the stack to be treated as arguments. Ditto registers that either can be used for arguments or are callee-saved. That means frame pointers have to be disabled or munged, and any cases where LLVM automatically generates temporary pointers for stack stores - which I've seen it do before - have to be addressed.
If you do move non-register arguments to the safe stack then the situation is improved, but you still have to watch out for temporaries left in argument registers.
All 5 of them?
No they won't, because NaCl is native x86. Unless you mean PNaCl, which is a different technology with much less adoption than NaCl.
So technically bother, then? NaCL programs are just as susceptible to buffer-overflow as conventional programs, its just they are better sandboxed. Exploit mitigation is a belt-and-braces thing, and I can't see why Google wouldn't be enabling this pass right now as we speak.
You have something compiled in NaCL like the Flash plugin and it can control the camera and stuff and you are hoping that an attacker can't feed it some malformed JPEG or something that makes it use the caps it has in a bad way etc.
Here's the only thread I could dig up on buffer overflows in NaCL: http://permalink.gmane.org/gmane.comp.security.nativeclient....
They might be able to apply that over all of android - and that will automatically apply over java based apps.From there it's just a matter of incentives, to rapidly create change in rest of the apps.
Now they already can, technically (via fat binaries, they've already been through multiple architectural transitions), the interesting part is they could now go through these transitions without developers having to be involved.
It also means the end of Universal binaries as Apple can thin the compiled app for each target device.
the PPC switch over was painful, as was the 32 versus 64 bit era. They want to avoid that in the future.
Here's a nice article describing it:
> This means that apps can automatically “take advantage of new processor capabilities we might be adding in the future, without you re-submitting to the store.”
http://thenextweb.com/apple/2015/06/17/apples-biggest-develo...
int *p = cond ? &a : &b;
...later enough that this isn't trivially optimized into two stores...
*p = 1;
will probably not be flagged (it depends on what optimization passes have run before the SafeStack pass), but will put the pointer in the stack or a register that may be saved.I imagine any user with over 1000 karma probably has at least a few tricks, and could mention off the top of their head a few things that help quite a bit when using the site. At the same time, I think it's best they aren't mentioned too often, as they offer a sort of natural advantage to those that have been around an while and contributed to discussions (it's the nature of many of these helpers that they actually help discussion), and putting those tools into the hands of a novice user may actually be detrimental to the site as a whole.
I play writing LLVM backends, and its entirely viable to even do things like rewrite float80 and other assumptions that a front-end has made for a particular target.
Use of SIMD intrinsics are a tougher nut, but I've actually been playing with them at an IR level for hobby stuff and I declare its not intractable.
How so? I recall it being amazingly painless.
By your analysis, Gmail could implement a feature that lets a sender run arbitrary JavaScript in the recipient's browser, and this would have no security impact as long as the JavaScript sandbox was not escaped. But in reality this would be a huge breach, because there are valuable things inside the sandbox that attackers should not have access to.
Put another way, this wouldn't help defend Chrome from NaCl, but it would help defend the NaCl app from it's clients. This would be in Google's interest to implement because it would make the platform more attractive to developers.
I see your point. I guess you're saying, there could be a photo editing app in which Alice can send pictures to Bob, and Mallory might send a malicious picture to Alice that coerces her client into betraying all its photos.
Why would Google want to not bother applying a belt-and-braces exploit mitigation that costs 0% CPU?
increment register 16
return
in the ambient machine language. Call that "dual use data".
ROP searches the memory for sufficient "dual use data" and then builds an ac-hoc compiler with "dual use data" as target language. Then the attack software compiles to "dual use data" and then runs the compiled code.Of course one may ask: can we always find enough "dual use data" to build a Turing-complete set of instructions as a compilation target. Turns out that with high probability that is the case.
The key novel idea in ROP is to use instruction sequences in unintended ways. ROP is a refinement of ret2libc, improving on it by returning into arbitrary locations in functions rather than their entry points. That, and of chaining together gadgets with returns. Hence the name.
I used mp3s and jpgs as extreme examples of data that was never intended to be executed, but still can be interpreted as code. In ROP, you don't care about the intended meaning of the bytes that make up "legitimate functions" (or any other data you may use) for it's unlikely to have the sought functionality. Instead you use you search for "dual use code" too, and piece together the functionality you need.
There will be ISAs sufficiently different that you cannot make bitcode generated when targetting one not be massaged to fit another, but they are not mainstream. The mainstream are all increasingly similar 64-bit targets.
I know, I play with a hobby llvm backend that retargets.
But NaCL is also used to isolate "built in" embeddables e.g. Flash, which I have used as an example of a NaCL plugin that comes with chrome in both the previous posts?
Another example is the voice activation plugin that got them into so much trouble recently...
Imagine you walk past your colleagues computers saying things like "let's google something naughty" loudly...
And now let's extend that to playing an audio snippet that invokes a stack smash? :)
Are not google-bundled plugins allowed to run even when the user doesn't allow third-party NaCL plugins to run?
Its belt and braces. Why wouldn't chrome bother to enable a pass that costs 0% runtime performance? After all the money and time Google have sunk into other runtime checks like ASan, why wouldn't they also enable this?
It made me think: what other 'exploit mitigations' do Google put into NaCL, even though its sandboxed? So I quickly gooogled to see if they use ASLR inside NaCL, for example. Here's what I found:
https://code.google.com/p/nativeclient/issues/detail?id=2962
So they added ASLR, and they made a nice point:
"for the threat model where a NaCl module might process untrusted input, it would be nice to provide ASLR for the NaCl module so untrusted input won't (easily) be able to take control of the NaCl module (if/when the NaCl module has bugs). (this is a different threat model that the usual one in NaCl, where the NaCl module is untrusted. here we are trying to make sure that a NaCl module, which is executing code on the behalf of some domain/web site, isn't easily pwned, even if the NaCl runtime is itself okay.)"
Unless you store your MP3s and jpegs in .text, the memory pages all that stuff is in are marked not executable and will only cause a crash if you jump to it. Regardless of whether the bytes make useful instructions.
Remember, x86 can be parsed differently depending on offset. You jump into the middle of a multibyte instruction you get an entirely different instruction stream. And x86 doesn't have any real protection against that.
It's possible to have executable data, but if you do, you generally have bigger problems: the exploit can simply write a complete first-stage into the data and execute it directly, and not bother going through return-oriented contortions.
The reality is that gadget harvesting is about analyzing program text --- actual binary machine instructions --- not about looking for ways to interpret JPEGs or MP3s or (I wrote DOCX and then PDF and then thought "huh bad examples") RTF files as instruction streams.
It's also true that you can exploit insane x86 encoding to synthesize unintended instructions, but that's (I think) less important than the simpler idea of taking whole assembled programs, harvesting very small subsequences, wiring them together with a forged series of stack frames, and achieving general computation.