Post-mortem of this weekend's NPM incident (blog.npmjs.org)

132 points by txmjs 8 years ago | 43 comments

bhuga 8 years ago |

This is a good post-mortem with clear, policy-based remediations. Nicely done.

I wonder why they are only preventing republishing for 24 hours. Is there a good reason to allow a package namespace to be recycled in less than, say, a week? Is it based on the assumption that the only case where it comes up is during an incident, and 24 hours is enough time to assume an incident will be resolved? I'm curious what went into that number.

smt88 8 years ago | |

Why allow namespace recycling at all? The potential harm is high and the potential benefit is some slight convenience.

If npm packages used a Github-style "author/package" format, name collision would never be an issue again.

colanderman 8 years ago | | |

When your code deployment model is effectively "download stuff from random websites", I feel like namespace recycling is the least of your worries.

(That is to say, trusting that any given named package that `npm install` downloads is what you think it is, is really no different than trusting `wget https://example.com/thecode.tgz`. Even if you verify that the domain hasn't switched hands, you have no guarantee that the author's pipeline wasn't compromised, or that the author didn't add malware themselves. There's a reason Debian, Red Hat et al. put a lot of effort into ensuring integrity of their repositories.)

Klathmon 8 years ago | | |

>If npm packages used a Github-style "author/package" format, name collision would never be an issue again.

They have that, and many are finally starting to take advantage of it (with babel being the most prominent with their latest version)

But this doesn't completely "fix" the problem, since the exact same conflicts can still happen with the "author" name (if someone takes "google" there are going to be some very upset Californians).
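
To make the point concrete, here is a minimal sketch of how an npm-style scoped name (`@scope/name`) splits into its two parts. The function name `split_package_name` and the unscoped example name are hypothetical, chosen only for illustration; this is not npm's own parser.

```python
from typing import Optional, Tuple


def split_package_name(full_name: str) -> Tuple[Optional[str], str]:
    """Split an npm-style name into (scope, package).

    Scoped names look like "@scope/name"; unscoped names have no scope.
    """
    if full_name.startswith("@"):
        scope, sep, package = full_name[1:].partition("/")
        if not sep or not scope or not package:
            raise ValueError(f"malformed scoped name: {full_name!r}")
        return scope, package
    return None, full_name


# The package part can repeat across scopes, but the scope string itself
# still lives in one global namespace, so collisions move up a level.
print(split_package_name("@babel/core"))
print(split_package_name("some-package"))
```

Scoping removes collisions between package names, but whoever registers a scope first still owns it, which is the conflict described above.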

Ajedi32 8 years ago | | |

Seems like they're only allowing name reuse in the case of spam packages. Not allowing name reuse in that case might result in lots of names being rendered permanently unusable by automated spambots.

Assuming no actual users are depending on packages which are literally just spam, I don't really see an issue with reusing the names of those packages.

pmarreck 8 years ago | | |

You already sound smarter than whoever leads the Node Package Mess.

JonathonW 8 years ago | | |

I'd like to see a package registry with (1) Github-style author-namespaced packages, and (2) package signing (i.e. if an author starts signing packages with a different key, I'd like to know about it). Maybe integrate the latter with Keybase to help users decide if they should trust a key.

I don't know how you gain any kind of critical mass trying to compete against a well-established registry like npmjs, though.
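
The "tell me if the signing key changes" idea in (2) can be sketched as trust-on-first-use key pinning. This is only an assumption about how such a check might look: `KeyPins` and `key_fingerprint` are hypothetical names, and the sketch only detects a key change; it does not verify signatures or talk to Keybase.

```python
import hashlib
from typing import Dict


def key_fingerprint(public_key: bytes) -> str:
    """Stable identifier for a public key: SHA-256 of its raw bytes."""
    return hashlib.sha256(public_key).hexdigest()


class KeyPins:
    """Remembers the first signing key seen for each author."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def check(self, author: str, public_key: bytes) -> str:
        """Return "first-seen", "match", or "changed" for this author's key."""
        fingerprint = key_fingerprint(public_key)
        pinned = self._pins.get(author)
        if pinned is None:
            # First contact: pin the key and trust it from here on.
            self._pins[author] = fingerprint
            return "first-seen"
        return "match" if pinned == fingerprint else "changed"
```

A client would surface the "changed" result to the user rather than silently installing the package.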

parenthephobia 8 years ago | | |

I think package URIs should include a secure hash of their contents.

Although you won't get updates without asking for them - I'm not sure that's a bad thing - you can be assured that you'll either get the package you were expecting or no package at all.
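
A rough sketch of what a hash-pinned package URI could look like, assuming SHA-256 and a made-up `#sha256=` fragment convention (both the convention and the function names are hypothetical):

```python
import hashlib


def make_pinned_uri(base_uri: str, content: bytes) -> str:
    """Append the SHA-256 of the package contents to its URI."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{base_uri}#sha256={digest}"


def verify_pinned(uri: str, content: bytes) -> bool:
    """True only if the downloaded bytes match the hash pinned in the URI."""
    _, sep, expected = uri.rpartition("#sha256=")
    if not sep:
        raise ValueError("URI carries no sha256 pin")
    return hashlib.sha256(content).hexdigest() == expected
```

If the name is recycled and different bytes come back, `verify_pinned` returns False and the install can be refused, which gives the "expected package or no package at all" behaviour.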

detaro 8 years ago | |

They allow deletion of packages for 24 hours without staff involvement, but is there anything said about a time limit on republishing after deletion?

bhuga 8 years ago | | |

From the response steps:

> Our first action, which began immediately after the incident concluded, was to implement a 24-hour cooldown on republication of any deleted package name.

josephorjoe 8 years ago |

So, a spammer uploaded something containing copied data from a legitimate user and npm deleted everything from that user. Oy.

Seems like npm might want to review the policy that allows stuff like that to happen.

Even if a user violates the spam policy (which, to be clear, it seems the affected user in this case did NOT do), that hardly seems to be appropriate grounds for deleting everything the user has ever published on npm.

That is a policy that is just begging for griefing.

detaro 8 years ago | |

> Seems like npm might want to review the policy that allows stuff like that to happen.

That's one of the things the post mentions as what they are doing.

DanBC 8 years ago | |

Are "joe jobs" still a thing?

https://en.wikipedia.org/wiki/Joe_job

> A joe job is a spamming technique that sends out unsolicited e-mails using spoofed sender data. Early joe jobs aimed at tarnishing the reputation of the apparent sender or inducing the recipients to take action against them [...]

daurnimator 8 years ago | | |

Yep. I had one against me mid last year.

cwmma 8 years ago | |

It wasn't a policy, it was a spam heuristic.

josephorjoe 8 years ago | | |

I meant the policy which allowed this to happen:

`In the course of reviewing and acting on spam reports, an npm staffer acted on this flag without further investigating the user and removed the user and all of their packages from the registry.`

Specifically, a policy that allows removing "all of [a user's] packages" based on something related to the user rather than on the packages themselves.

Feels like there should be a disconnect between decisions made about a 'user' and those made about a 'package'.

Once the package is published, there should be an understanding that the package belongs to npm and npm's users, even if the original publisher retains some authority over it.

And if there is cause to ban a user, it should not automatically mean that packages published by the user are affected (aside from removing whatever authority the user had).

cremp 8 years ago |

> we have policies against ever running SQL by hand against production databases—but in this case we were forced to do so

Uh... Add in the fact that staff are now trigger happy, since a single button can do a lot of damage.

dumbmatter 8 years ago |

> Our first action, which began immediately after the incident concluded, was to implement a 24-hour cooldown on republication of any deleted package name.

Why not infinity hours? I don't get it.

Vinnl 8 years ago | |

If it's a spam package that gets deleted, that would mean you'd quickly run out of available names.

h1d 8 years ago | | |

Why can't they just reuse when it is apparent the case is harmful (as in, people complain and check number of downloads and dependent packages) by blocking the name and disallow reuse for any other cases?

kylemuir 8 years ago |

> Our first action, which began immediately after the incident concluded, was to implement a 24-hour cooldown on republication of any deleted package name

I don't understand this. Why hard delete packages at all? Soft deleting feels like it would be easier and would stop people republishing with the same name.

They could also bake their warning process for dependent libraries (i.e. "this package is gone!") into the soft delete process.
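
A minimal sketch of the soft-delete idea, assuming an in-memory registry; `Registry` and `PackageGone` are hypothetical names, and this is not how npm stores packages:

```python
class PackageGone(Exception):
    """Raised when a name resolves to a soft-deleted package."""


class Registry:
    def __init__(self) -> None:
        # name -> {"owner": str, "deleted": bool}; records are never removed.
        self._records = {}

    def publish(self, name: str, owner: str) -> None:
        if name in self._records:
            # Applies to live and soft-deleted names alike, so a name is never recycled.
            raise ValueError(f"name already taken: {name}")
        self._records[name] = {"owner": owner, "deleted": False}

    def soft_delete(self, name: str) -> None:
        # Keep the record as a tombstone instead of freeing the name.
        self._records[name]["deleted"] = True

    def resolve(self, name: str) -> str:
        record = self._records[name]
        if record["deleted"]:
            # This is where a warning for dependents could be surfaced.
            raise PackageGone(f"this package is gone: {name}")
        return record["owner"]
```

Because the tombstone stays in place, a later `publish` under the same name fails, and anyone resolving the name gets an explicit "gone" error instead of a stranger's code.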

carsonreinke 8 years ago |

I feel like a project could help with this by identifying package importance from dependents and downloads.
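
One possible shape for such a score, purely as an illustration: the log scaling and the equal weighting of the two signals are assumptions, not anything npm or the post describes.

```python
import math


def importance(dependents: int, weekly_downloads: int) -> float:
    """Hypothetical importance score; log-scaled so huge download counts don't dominate."""
    return math.log10(1 + dependents) + math.log10(1 + weekly_downloads)


def rank_by_importance(packages: dict) -> list:
    """packages maps name -> (dependents, weekly_downloads); most important first."""
    return sorted(packages, key=lambda name: importance(*packages[name]), reverse=True)
```

A deletion tool could refuse to act, or require a second reviewer, once the score passes some threshold.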

carsonreinke 8 years ago | |

I think I might actually try this out.

kodablah 8 years ago |

> At the time of Saturday’s incident, however, we did not have a policy to publish placeholders for packages that were deleted if they were spam.

I see this acknowledgement, but I cannot find where they will remedy this by putting placeholders in place of spam removals. As a concession, maybe only placeholders for spam removals of packages that are older than X days or depended on (explicitly or transitively) by X packages. Did I miss where the remedy for this spam-removed-package-reuse was in the blog post?
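
The concession above could be expressed as a small rule. The thresholds are left as the "X" placeholders from the comment; the default values below are arbitrary and hypothetical.

```python
def needs_placeholder(age_days: int, dependents: int,
                      min_age_days: int = 30, min_dependents: int = 1) -> bool:
    """Leave a placeholder if the removed package was old enough or depended on.

    `dependents` is meant to count both explicit and transitive dependents.
    The default thresholds are illustrative only.
    """
    return age_days > min_age_days or dependents >= min_dependents
```

Fresh spam with no dependents would free its name, while anything established would keep a placeholder.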

avianlyric 8 years ago | |

They have added a 24-hour re-publishing cooldown for all package removals regardless of reason. Exceptions are made for the original publisher and npm staff.

Explained somewhere near the bottom of the post, basic rationale is that it gives them time to notice fuckups and fix them.
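
As a sketch of the rule as described here (24-hour cooldown, with exceptions for the original publisher and staff), assuming hypothetical names and a plain function rather than npm's actual implementation:

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=24)


def may_republish(deleted_at: datetime, now: datetime, requester: str,
                  original_publisher: str, is_staff: bool = False) -> bool:
    """Decide whether `requester` may publish under a recently deleted name."""
    if is_staff or requester == original_publisher:
        # Exceptions: staff and the original publisher skip the cooldown.
        return True
    return now - deleted_at >= COOLDOWN
```

Anyone else is locked out of the name until the window has elapsed, which is the time staff have to notice and undo a mistaken removal.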

kodablah 8 years ago | | |

This does not alleviate the issue where you can reuse package names. I suppose they believe what they mark as spam packages won't be used enough or is already bad enough that reusing the name is harmless. And they probably also believe that they can catch fuckups in a day. I don't think either is necessarily true; they only held in this case because it hit popular dep trees. But what happens when something is erroneously marked as spam that's not as popular and the downstream dependents don't realize in 24 hours? If the problem is that "placeholders" are too heavy, then they could be made lighter weight, or they could put some rules around when they will add them and when they won't.