Aggressive Attack on PyPI Attempting to Deliver Rust Executable (blog.phylum.io)

148 points by iamspoilt 3 years ago | 102 comments

woodruffw 3 years ago |

I understand that this is meant to be an eye-popping press release (and implicitly a product spotlight), but some of these claims make me gag.

It's not an attack "on" PyPI, or even an attack at all: someone is just spamming the index with packages. There's no evidence that these packages are being downloaded by anyone at all, or that the person in question has made any serious effort to conceal their intentions (it's all stuffed in the setup script without any obfuscation, as the post says). The executable in question isn't even served through PyPI (for reasons that are unclear to me): it's downloaded by the dropper script. Ironically, serving the binary directly would probably raise fewer red flags.

Supply chain security is important; we should reserve phrases like "aggressive attack" for things that aren't script kiddie spam.

agolio 3 years ago | |

The most "aggressive" part is that those sweet package names like "colorslib" are being stolen.

komali2 3 years ago | | |

My biggest curiosity here is how they generated over a thousand package names ranging from feasible to interesting. I expected gibberish.

Lol, maybe, "chatgpt, give me a thousand feasible pypi package names"?

lelandbatey 3 years ago | | |

Thankfully, they're not actually being stolen because all the packages were already taken down; they're available for legitimate use again: https://pypi.org/project/colorslib/

asperous 3 years ago | |

I think it's a serious threat, especially with LLMs now, because people can make believable packages at scale. Not everyone vets their packages thoroughly.

codetrotter 3 years ago | | |

Speaking of LLMs. Since LLMs like to hallucinate every now and then, an LLM could also hallucinate names of packages that it tells people to install. And those packages could in turn have been squatted by malware authors.

And in this way, malicious packages may be unintentionally downloaded by users even when those malicious packages did not yet exist when the LLM was trained. Just because the hallucinated package name was randomly later taken by someone malicious.

woodruffw 3 years ago | | |

You've always been able to make "believable" packages at scale. PyPI doesn't enforce uniqueness: you can crank out malicious near-duplicates of any package you please.

freeqaz 3 years ago | | |

I agree that it is a threat. I don't think this instance is (it's too noisy).

I wrote a comment on the NPM thread earlier (https://news.ycombinator.com/threads?id=freeqaz) that I'll quote here:

> "While being flooded with spam is never good, it gets immediately noticed and mitigated. It's harder for open source projects to spot and stop rare one-offs"

This is the real problem that NPM and other ecosystems face. A determined attacker who is trying to "poison" a popular Open Source package just has to pose as a maintainer long enough to succeed[0]. Defeating these types of attacks will require rethinking how we establish trust in packages.

Projects like Deno are one approach (fork the ecosystem) while projects like Packj (mentioned elsewhere here), Socket.dev, and LunaTrace[1] are taking the other angle (make it harder to install malware).

It's hard to say which approach is better right away (probably a hybrid of both, realistically). It's just non-trivial to fix this in one fell swoop. It's messy.

0: https://www.trendmicro.com/vinfo/us/security/news/cybercrime...

1: https://github.com/lunasec-io/lunasec

wheelerof4te 3 years ago | | |

Me, I just use the stdlib and my local packages.

There's something beautiful in knowing you're using pure, clean Python. Much easier to install, also.

worik 3 years ago | |

No. This is very concerning.

Attacking a popular repository like this does not have to have a high hit rate.

"Script kiddie spam" is how computers get compromised. Unsophisticated mass attack.

This sort of thing, combined with woeful security and fragile systems, is causing havoc the world over.

ashishbijlani 3 years ago |

We've built Packj [1] to detect packages with install hooks, embedded binary blobs, and other such malicious/risky packages. It performs static/dynamic/metadata analysis to look for "suspicious" attributes.

1. https://github.com/ossillate-inc/packj

lrem 3 years ago | |

Why are these things riskier than the plain Python code you likely don't read, but go ahead and execute?

ashishbijlani 3 years ago | | |

A number of academic researchers (including us) have studied malware samples from past open-source supply chain attacks and identified code/metadata attributes that make packages vulnerable to such attacks. Packj scans for several such attributes to identify insecure or "weak links" in your software supply chain (e.g., missing or incorrect GitHub repo, very high version number, use of decode+exec, etc.). Full list here: https://github.com/ossillate-inc/packj/blob/main/packj/audit...
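
To make the "decode+exec" attribute concrete, here is a minimal sketch of how a static check for it could look. This is not Packj's actual implementation; the function names and the list of decode functions are assumptions made up for illustration.

```python
import ast

# Hypothetical watch lists for this sketch; not taken from Packj.
DECODE_FUNCS = {"b64decode", "decode", "fromhex", "decompress", "unhexlify"}
EXEC_FUNCS = {"exec", "eval"}


def _call_name(node: ast.Call) -> str:
    """Return the trailing name of the called function, e.g. 'b64decode'."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def find_decode_exec(source: str) -> list:
    """Return line numbers where exec()/eval() is called on a decoded value."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _call_name(node) in EXEC_FUNCS:
            # Look at every call nested inside the arguments of exec()/eval().
            for arg in node.args:
                nested = (n for n in ast.walk(arg) if isinstance(n, ast.Call))
                if any(_call_name(n) in DECODE_FUNCS for n in nested):
                    hits.append(node.lineno)
                    break
    return sorted(hits)
```

A check like this only catches the direct form, where the decode call sits inside the exec() arguments. Decoding into a variable first and executing it later would slip past it, so it is a heuristic signal and not proof either way.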

twodave 3 years ago | | |

It's relative, but I assume it's flagging for a certain class of known malicious patterns. There's nothing stopping you from writing malicious Python code, but essentially that script will only run while you expect it to in most cases unless it interacts with the OS in some way.

It doesn't make plain Python code you blindly execute any safer, but at least you've explicitly given those packages your trust. I believe this is more geared toward detecting compromises of those packages you have given that trust.

throwaway81523 3 years ago |

Wonder if that is related to the malware spamming of NPM that I saw something about last night.

Python used to have a "batteries included" philosophy which tried to put most important stuff into the distro, reducing the number of external dependencies any given app needed. They seem to have abandoned that now, leaving us to fend for ourselves against the malware.

NPM spam: https://www.scmagazine.com/analysis/devops/npm-repository-15...

wheelerof4te 3 years ago | |

"They seem to have abandoned that now, leaving us to fend for ourselves against the malware."

Yes, along with reducing the stdlib and directing us to PyPI for "alternatives".

throwaway81523 3 years ago | | |

Dumpster diving anyone? Npm always felt that way and PyPI is catching up.

belinder 3 years ago |

How is the rust part relevant?

throwthere 3 years ago | |

Chatgpt recommended it for the upvotes.

wheelerof4te 3 years ago | | |

"The most beloved programming language used to build and ship malware (PHOTO/VIDEO/NSFW)"

yabones 3 years ago | |

We like when malware is written in a memory-safe language.

dtgriscom 3 years ago | | |

Encrypt my files, but please don't waste my RAM while you do so.

mftb 3 years ago | |

If a payload is native it's potentially more of a problem than a script. If the payload had been C or C++, I wouldn't have been surprised if they had noted that too.

sidlls 3 years ago | |

We should only refer to Rust when it's included in positive events? How is it not relevant here? It was used to build executables to inject, likely for malicious purposes. Given its newness and all the other hype around it, I'd say it's very relevant.

puffoflogic 3 years ago | | |

Why not also be sure to mention what OS was used for the build, and what linker, and what file format, and what model of computer was used, and what its default ui language was set to? What is so special about programming language used that sets it apart from those other factors that could be mentioned?

baguettefurnace 3 years ago | |

just like if a tesla is involved in a car crash, headline must mention Tesla

butterNaN 3 years ago | | |

Isn't that because sometimes the tesla software might be at fault?

HL33tibCe7 3 years ago | |

It’s unusual for malware.

selfmodruntime 3 years ago | | |

Yes, but only for the time being. I've recently published a paper on the topic. Rust and Golang are getting immensely popular with malware authors.

almet 3 years ago |

It's still the same story: PyPI still doesn't have a way to automatically detect interactions with the network and the filesystem in submitted packages. It's a complex thing to do for sure, but that would be a welcome addition, I guess.
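
As a rough idea of the simplest static version of such a check, the sketch below parses a setup script and reports imports of modules that touch the network, filesystem, or processes. The watch list and function name are assumptions for illustration; nothing here reflects what PyPI actually runs.

```python
import ast

# Assumed watch list for this sketch; a real scanner would need a longer one.
RISKY_MODULES = {
    "socket": "network",
    "urllib": "network",
    "http": "network",
    "requests": "network",
    "subprocess": "process",
    "shutil": "filesystem",
    "os": "filesystem/process",
}


def risky_imports(source: str) -> dict:
    """Map each imported watch-list module to its risk category."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # 'urllib.request' -> 'urllib'
            if top in RISKY_MODULES:
                found[top] = RISKY_MODULES[top]
    return found
```

This shows why it is a complex thing to do: plenty of legitimate setup scripts import os, so the false-positive rate is high, and dynamic imports via `__import__` or `importlib` are not seen at all. Catching those would likely need dynamic analysis in a sandbox.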

blibble 3 years ago |

why does pypi/pip still not have namespacing?

Maven sorted this out 20 years ago

what's a bit sad is that the python packaging authority's survey from a few months ago seemed to be mostly interested in vision and mission statements

rather than building a functional set of tools

woodruffw 3 years ago | |

Namespacing is not a security boundary: it's a usability feature that helps users visually distinguish between packages that share the same name but different owners. I don't think it would meaningfully affect things like package index spam, which this is.

(This is not a reason not to add namespacing; just an observation that it's mostly irrelevant to contexts like this.)

blibble 3 years ago | | |

obviously, but it allows delegation of trust onto other systems (like the DNS)

example: the package named "aws" on pypi was created by some random guy and has been abandoned for years

if pypi/pip supported namespacing that would be info.randomdude.aws instead

and amazon's packages would be under com.amazon

not being able to namespace internal packages is another security issue that is substantially improved with proper namespacing

to be blunt: not supporting it at this point is reckless and irresponsible

(I note you're part of pypa!)

Riverheart 3 years ago | | |

It can be if you implement it to be so. Just let people create an allowlist of approved vendors for their organization or project from those namespaces. This handles not having to approve individual packages from trusted entities like Google, Microsoft, etc. Update the list when new vendors are needed. Reuse elsewhere as necessary.

Maybe the list can be hosted on an internal server for other employees to reuse. Hosting all the packages internally is overkill. Trusting the world by default is overkill.

Now "pip install gooogle/package"

"Hey User, gooogle/package is not from a trusted namespace. Did you mean google/package which is similar and trusted? Or would you like to add gooogle to your local trust file?"

The lack of any kind of curated feed that only lists verified or popular packages is a tragedy. There should be a reasonable way of allowing clients to protect themselves from a typo.
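
A minimal sketch of the trust-file check described above, assuming a hypothetical "namespace/package" syntax (pip has no such syntax today) and a made-up allowlist built from the vendors named in this thread:

```python
import difflib

# Hypothetical trust file contents; the namespaces are just the thread's examples.
TRUSTED_NAMESPACES = ("google", "microsoft", "amazon")


def check_requirement(spec: str, trusted=TRUSTED_NAMESPACES) -> str:
    """Check a 'namespace/package' spec against a local allowlist."""
    namespace, sep, package = spec.partition("/")
    if not sep or not namespace or not package:
        return f"{spec!r} is not in namespace/package form."
    if namespace in trusted:
        return f"OK: {spec} is from a trusted namespace."
    # Suggest the closest trusted namespace, if any is similar enough.
    close = difflib.get_close_matches(namespace, list(trusted), n=1, cutoff=0.8)
    if close:
        return (f"{spec} is not from a trusted namespace. "
                f"Did you mean {close[0]}/{package}?")
    return f"{spec} is not from a trusted namespace."
```

Calling `check_requirement("gooogle/package")` produces the "Did you mean google/package?" prompt. The 0.8 similarity cutoff is an arbitrary choice for the sketch; set it too low and unrelated namespaces get suggested, too high and real typos are missed.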

lelandbatey 3 years ago |

Interestingly, all the packages, even the ones from today, have all been taken down. So too have all the files that were being hosted on Dropbox.

photochemsyn 3 years ago |

Wow this site runs a lot of JavaScript, speaking of aggressive data collection.

https://blog.hubspot.com/website/data-mining

ianai 3 years ago |

Even animals in the wild agree to peace around the watering hole.

readthenotes1 3 years ago | |

Someone forgot to tell the crocodiles...

https://journals.plos.org/plosone/article?id=10.1371/journal...

eternalban 3 years ago | |

You forgot some of those animals have fangs.

It's like NYC's sidewalks. Compare pedestrian behavior at, say, SoHo (daylight) and, say, LES (nighttime). Amazingly enough, the partying and inebriated pedestrians at night all file politely in the correct bimodal L|R formation. During the day, it's a rather wild and somewhat uncivilized dynamic slalom formation. My theory: Fangs. The night creatures know someone potentially dangerous may be in their midst.

MadSudaca 3 years ago | |

The problem is assuming we’re better than animals.

ianai 3 years ago | | |

I was rolling the shame or guilt card.

timeon 3 years ago | |

By animals you mean something like mammals?

steponlego 3 years ago |

Yet another attack that requires the biggest malware vector, MS Windows. LOL

puffoflogic 3 years ago |

I read TFA three times and I still have no idea what they meant by "Rust stage 1 executables".

In these cases I frankly assume that they don't either.

fortran77 3 years ago |

But I thought Rust was supposed to be safe?!