GitHub Packages Is Down (githubstatus.com)
It’s acceptable for Debian mirrors to lag a few minutes or hours behind. The same thing is much harder to accept when the rate of change is much higher. Different requirements, different tradeoffs.
Say that version 1.2.1 of a library is released right as you do your build: that won't go into production within that 24-hour window anyway. If it is a security fix, then, like Debian, you pull that from another repository, which is under tighter control.
Given the OP, note that packages on crates.io don't (and can't) reference GitHub. Crates.io has its own storage, and the only way to upload a crate to crates.io is if 100% of its dependencies are also on crates.io.
/s
the user's take was "why don't they use GitHub packages?"
"still running today" doesn't mean 100.0% uptime.
What's perhaps even more surprising to me is that, after a track record of severe and frequent Microsoft-GitHub outages over the last three years, it is still a hard dependency for so much of the modern software stack.
This is where OpenAI is now feeling the effects of instability [1] on Azure since their recent outage. I expect them to also have issues like GitHub has every month.
Now even in the few areas where MS actually does have this pressure (e.g. Azure) they're struggling to make it part of the culture.
Poke, poke!
* GitHub's Packages Service is down
* GitHub Packages are down
The name is GitHub Packages. It's singular. The use of "is" is correct here. GitHub uses "is" in similar circumstances as well.
Today most companies need private package registries. Legacy networks are a resource drain. Nobody else uses your private packages, nor do you want anybody else to host a mirror, and authentication is required anyway.
Plus the idea that GitHub is hosting everything in a single datacenter is laughable on its face.
Personally I find the notion that GitHub is somehow magically superior to the rest of the entire internet a bit silly.
I’ve worked on distributed systems my entire career, and I have yet to find a single one that is completely immune to a datacenter outage. There is always some single point of failure that wasn't considered. Often it is even known: everyone has the “special” datacenter.
It's also true that “market forces” push for better cost optimisation, which can, in some cases, leave you insufficiently sized to cope with the outage of a whole DC. This is made worse by people who think cloud will solve it, because every customer will be doing the same thing as you during a zonal outage.
Regardless of that, you are basically suggesting that GitHub, as a centralised system, is better equipped to deal with the distribution of packages than a literal distribution of package repositories?
That’s odd to me, maybe not “laughable”, but certainly odd.
Heck, even in Asia I did not have trouble with finding a good mirror.
Problem solved(ish).
If that flow was broken there’s only a handful of things it could be, specifically Azure blob store or Azure MySQL, but both of those should have layers of redundancy. Public anonymous download bypasses almost everything else: no auth services, Rails monolith, or metered billing. It does emit some events to the message bus for metrics, but that’d affect much more than packages if there was an issue with it.
As far as I’m aware this is the first time anonymous public package download broke since I left a few years back.
EDIT: I was wrong, the crates index does use a GitHub repo
Anyone could just change their username/organization and break thousands/millions of builds.
The binary hosting at https://cache.nixos.org/ is independent of GitHub, and so are the old-style channels at https://channels.nixos.org/. The new-style flake registry used to be fetched from GitHub but has now been moved to https://channels.nixos.org/flake-registry.json. Admittedly in a new-style situation you’re likely to be using unlocked flake references that refer to GitHub (e.g. Nixpkgs), but it’s on you to lock them and pull them into your Nix store in that case.
Of course, you also get GitHub references for upstreams that host their code there, but that applies to almost any distro(’s build system) except the oldest of the old-timers which host the source for the whole distro on their own infrastructure, like Debian. (I happen to think the old-timers are right here, but that’s beside the point.)
This is actually the second (maybe third; I didn't even know about GitHub's outage a week and a half ago so idk how Nix was or was not affected) time in three months or less that a partial GitHub outage or GitHub change has taken down Homebrew while leaving Nix unaffected.
No, that's not what I'm saying. I'm explaining why "inferior"-quality alternatives sometimes win: the market prefers a different metric. In this case, ease of operation, ease of setup, and price are more important than sheer uptime.
But even so, at least the mirrorlist.txt file that appears in the mirror:// URI must be available for it to work, right?
https://manpages.debian.org/bullseye/apt/apt-transport-mirro...
You can still use it in vanilla Debian, but they don't make their mirror list available easily in the correct format, so you would have to basically curl + awk the URLs into a text file and use that.
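For what it's worth, the "curl + awk" step can be sketched in a few lines of Python instead. This is only a rough illustration: it assumes you have already downloaded some page or text file that contains the mirror URLs, and that the mirror file you want is one URI per line. The function names and the example.org/example.net hosts are made up for the example.

```python
import re
from typing import List

# Matches http(s) URLs up to the first whitespace, quote, or angle bracket.
_URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_mirror_urls(text: str) -> List[str]:
    """Return the unique http(s) URLs found in text, in first-seen order."""
    seen = set()
    urls = []
    for url in _URL_RE.findall(text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def to_mirrorlist(text: str) -> str:
    """Render the extracted URLs as a one-URI-per-line mirror file."""
    return "".join(url + "\n" for url in extract_mirror_urls(text))


if __name__ == "__main__":
    # Hypothetical input; a real mirror page will look different.
    sample = (
        'Site: <a href="http://ftp.example.org/debian/">ftp.example.org</a>\n'
        "https://mirror.example.net/debian/ (Europe)\n"
        "http://ftp.example.org/debian/\n"
    )
    print(to_mirrorlist(sample), end="")
```

You would still want to eyeball the result before pointing apt at it, since a regex like this happily picks up any URL on the page, not just mirrors.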
My guess is that Debian itself probably sees less than 1% of the traffic on their mirrors compared to Ubuntu and they haven't been as motivated to make this change.
Setting up your own mirrors for internal use isn’t overly difficult either, and it is definitely a trade-off as you pointed out.
However, it basically works for everyone, whether or not they are fully aware of it.
I have also run my own mirrors with minimal fuss. I haven’t had a business need to use GitHub packages, but I am glad it exists, as it is another tool to do a thing that needs doing in the right circumstances.
For crates.io specifically, the packages are stored in S3, whereas the index is currently stored as a bog-standard GitHub repo (not as a GitHub Package), and in the near future the crates.io index will also move to crates.io itself (https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...).
First, the index is very large, and it only ever gets larger over time. I just cloned and compressed the crates.io index (https://github.com/rust-lang/crates.io-index), which resulted in a 58 MB archive (note that I did remember to delete the .git directory).
Second, the index changes very often. Every time anyone ever publishes a new version of a package, that changes the index. For crates.io, this happens hundreds or thousands of times per day.
Third, the index is append-only.
Fourth, the index is extremely frequently requested. Any time the user manually asks for an update, or any time the user adds a new dependency, the local copy of the index needs to be updated.
Putting it all together, since the index is constantly changing and since users will constantly be asking for the latest version, this means that it would be very inefficient to serve the whole thing each time. Instead, a fine-grained solution is more efficient. In the early days of crates.io, this problem was solved by just storing the index in a git repo and letting git take care of fetching new diffs to the index (and the problem of "who pays for hosting" was solved by using GitHub). Now that the crates.io index is outgrowing this solution, it's moving to a more involved protocol where clients will not have local copies of the full index, but instead will only lazily fetch individual index entries as necessary, which is much faster (especially for fresh installs (including every CI run!)).
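To make "lazily fetch individual index entries" a bit more concrete, here is a small sketch of how a client could work out which single file to request for a given crate. It follows the directory layout described in Cargo's registry index documentation (1-, 2- and 3-character names get special directories, everything else is bucketed by its first four characters, and names are lowercased). The function names are mine, and the default base URL is an assumption about where the sparse index is served.

```python
def index_path(crate_name: str) -> str:
    """Return the relative path of a crate's entry in a Cargo-style index."""
    name = crate_name.lower()
    if not name:
        raise ValueError("crate name must not be empty")
    if len(name) == 1:
        return f"1/{name}"
    if len(name) == 2:
        return f"2/{name}"
    if len(name) == 3:
        return f"3/{name[0]}/{name}"
    # Longer names: first two characters, then the next two, then the name.
    return f"{name[:2]}/{name[2:4]}/{name}"


def sparse_url(crate_name: str, base: str = "https://index.crates.io") -> str:
    """Build the URL of one index entry (base URL is an assumption)."""
    return f"{base.rstrip('/')}/{index_path(crate_name)}"


if __name__ == "__main__":
    for crate in ("a", "cc", "syn", "serde"):
        print(crate, "->", sparse_url(crate))
```

The point is that a fresh CI run only has to make one small HTTP request per dependency it actually resolves, instead of cloning the whole index.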
I imagine it's easier to get people to mirror curated, signed packages than, effectively, random code
crates.io also uses GitHub as an OAuth provider (and it's currently the only one offered), so if that broke then people wouldn't be able to publish crates, though downloading existing ones would presumably still work since you don't have to log in to do that.
[1] https://github.com/rust-lang/cargo/blob/master/src/cargo/sou...
[1] https://rust-lang.github.io/rfcs/2789-sparse-index.html
[2] https://blog.rust-lang.org/2022/06/22/sparse-registry-testin...
[3] https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...
twice this year I've had to spend an hour or more with a user because a mirror was down. that's one more time than I've had to deal with a GitHub packages outage this year.
the latest was yesterday. they chose another pool of mirrors and the mirror continually chosen from that pool was down as well. finally I manually checked a mirror, made sure it was up and that signatures matched, then gave them that specific hostname.
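The manual check described above (find a mirror that is up and serves the right content) could be sketched roughly like this. Two loud assumptions: a plain SHA-256 comparison stands in for real signature verification, which it is not, and the `fetch` callable is a hypothetical hook that you would implement with whatever HTTP client you use. The mirror hostnames are made up.

```python
import hashlib
from typing import Callable, Iterable, Optional


def first_good_mirror(
    mirrors: Iterable[str],
    fetch: Callable[[str], bytes],
    expected_sha256: str,
) -> Optional[str]:
    """Return the first mirror whose fetched content matches the digest.

    `fetch` is a hypothetical hook: it takes a mirror name and returns the
    bytes of some known file, raising OSError if the mirror is unreachable.
    """
    for mirror in mirrors:
        try:
            data = fetch(mirror)
        except OSError:
            continue  # mirror is down or unreachable, try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return mirror
    return None


if __name__ == "__main__":
    good = b"release file contents"
    served = {"mirror-b.example.org": b"stale", "mirror-c.example.org": good}

    def fake_fetch(mirror: str) -> bytes:
        if mirror not in served:
            raise ConnectionError(f"{mirror} is down")
        return served[mirror]

    pool = ["mirror-a.example.org", "mirror-b.example.org", "mirror-c.example.org"]
    print(first_good_mirror(pool, fake_fetch, hashlib.sha256(good).hexdigest()))
```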
the Linux package distribution system is not better. it's just different.
anonymous stuff on GitHub is usually limited to 60 requests per hour per ip address. if you're authenticated, it's several hundred if not several thousand.
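If you are stuck on the anonymous limit, one option is to pace yourself client-side so you never hit it. Below is a minimal sketch of a sliding-window budget, using the 60-per-hour figure from the comment above as the default. It is not how GitHub itself counts requests, just a local guard, and the class name is made up.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestBudget:
    """Allow at most `limit` requests within any `window` seconds."""

    def __init__(
        self,
        limit: int = 60,
        window: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True if the budget allows it."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._limit:
            self._stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    budget = RequestBudget()
    allowed = sum(budget.try_acquire() for _ in range(100))
    print(f"{allowed} of 100 back-to-back requests allowed")
```

Note that this only helps for a single process; a whole office behind one IP address shares the same server-side limit regardless.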
In Python, we don't say "we don't host packages on a proprietary platform", we say "we have absolutely no clue where they are hosted and nobody audits them anyways, and we don't enforce package signing, and we'll just build from source with no build isolation whatsoever, unless you remember to specify an obscure command-line option when installing... and have a nice day!"
Package signing is, well... I suppose that's another lesson from the '90s people will learn about soon enough. With a web of trust as broad as Python or npm you'll just have everyone running around with signing keys and "trusting" any key they come across because none of it is built on personal relationships. When Archlinux asks me to confirm adding package keys, what am I going to do? Say no? I don't know these people, but I want my shit to work.
With systems like Python, I'd imagine that a solution to web of trust would be that some group of developers would organize a curated set of packages. So, for the cases where you need better security assurances, you'd use that. I mean, of course there's no guaranteed solution for the web of trust, but, in practical terms, something like that would be good enough for regulators.
There's already stuff like NumFOCUS. They don't particularly focus on the technical side of things, or endorsing more secure practices, but, in principle, they could. Maybe there will also be others once we have been bitten more times by some security breaches.
In the event of a rename, GitHub redirects you to the new name when you look up the old one.
so it's not quite as bad as you're imagining but still not great.
fortunately GitHub is starting to require 2FA for very popular projects (starting with NPM) because of supply chain attacks like what you describe.
Every office I've ever worked at has that one guy who is really good at tickling Google with scripts until it puts you all behind a CAPTCHA.