LeftoverLocals: Listening to LLM responses through leaked GPU local memory (blog.trailofbits.com)
If your data is that sensitive, run it on dedicated hardware. Papering over this with mitigation over mitigation is a fool’s errand: both a genuine waste of compute resources and guaranteed to be a game of cat and mouse.
Notably:
> NVIDIA: confirmed that their devices are not currently impacted
> ARM: also confirmed that their devices are not currently impacted.
Does anyone know if NVIDIA's virtual GPUs improve the isolation at all?
> Apple: Despite multiple efforts to establish contact through CERT/CC, we only received a response from Apple on January 13, 2024.
> Apple did not respond or engage with us regarding the disclosure.
Well at least they are consistent at not giving a flying f*ck about working with bug reporters, no matter who you are. I have reported 5+ radars in the past and have never received any response, not even a confirmation.
Sorry. That was a long time ago.
Literally too incompetent to follow even basic security 101 practices. A time-shared device must be sanitized between users to prevent state leakage. There is no reason to believe that a security culture that clueless when developing a universally shared, high-criticality device can be trusted if they claim to do better elsewhere. Their process is either so incompetent or so inconsistent that their claims cannot be believed without external audits.
In this case: Apple, Qualcomm, AMD, Imagination.
Edit: Added Imagination as noted by reply.
This was definitely happening with OpenAI's web interface. It might have been happening via API calls too but it's been a while and I don't remember.
Nothing ever came of the report as far as I know.
These vulnerabilities will continue happening. What I don’t understand is how anybody can be surprised at this point. If anyone out there missed the first dozen instances of this: workloads on modern hardware can’t be isolated.
The gear was made to frag noobs in Counter-Strike at 300 FPS.
Notably Intel and NVIDIA were not impacted. I wonder if the security hardening that Google worked on with NVIDIA for Stadia helped prevent this.
Imagine an OS forgetting to replace your general-purpose registers across context switches. Only a rank incompetent and useless security process would let something like that get all the way through to deployment.
For example, Windows takes over control of GPU memory: it virtualizes it, allocates it to the various applications, zeroes it, and so on.
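To make the "zero it before handing it out" idea concrete, here is a toy Python sketch. It is not how Windows or any GPU driver actually does it; `ScrubbingPool` and `leaked_bytes` are made-up names for illustration, and the pool is just a `bytearray` standing in for a region of memory that gets reused by successive users.

```python
class ScrubbingPool:
    """Toy model of one memory region that is reused by successive users."""

    def __init__(self, size: int, scrub: bool = True) -> None:
        self._memory = bytearray(size)
        self._scrub = scrub

    def acquire(self) -> bytearray:
        # With scrubbing on, the region is zeroed before every hand-out,
        # so the next user never sees what the previous one wrote.
        if self._scrub:
            self._memory[:] = bytes(len(self._memory))
        return self._memory


def leaked_bytes(pool: ScrubbingPool, secret: bytes) -> bytes:
    """Write `secret` as one user, then read the same region as the next user."""
    victim = pool.acquire()
    if len(secret) > len(victim):
        raise ValueError("secret does not fit in the pool")
    victim[:len(secret)] = secret

    attacker = pool.acquire()
    return bytes(attacker[:len(secret)])


if __name__ == "__main__":
    print(leaked_bytes(ScrubbingPool(16, scrub=False), b"secret"))
    print(leaked_bytes(ScrubbingPool(16, scrub=True), b"secret"))
```

Without scrubbing, the second user reads back exactly what the first one wrote, which is the shape of the leak being discussed here. With scrubbing, it reads zeros, at the cost of touching the whole region on every hand-out.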
You’d be surprised by how many security issues exist in GPU drivers.
Just like someone can sue the maker of a trampoline made out of laptops when they cut themselves on them.
But if you are saying Amazon and other cloud providers should be sued, I agree, companies that use parts not fit for the application they are going for should be sued out of business.
That said, the existence of the parts isn’t a problem and I hope manufacturers keep making high performance parts for those of us who don’t use them for crazy and inappropriate things. Manufacturers could be slapped on the wrist for selling their consumer chips as server chips, but I think this is less of a problem because anyone who falls for it is already a walking catastrophe.