GitHub degraded performance – resolved (githubstatus.com)
When I get back from vacation we are moving our shit to the enterprise plan. $21/user-month is really not that big of a deal when you are running basically your entire business through the product.
I do agree that it's ridiculous to assume that we can manage Github's software better than their own engineers, but at the same time our infrastructure has proven itself to be extremely reliable over the last 4-5 years. Even hosting GH enterprise on public AWS/Azure is more ideal in my eyes now, because I can control the physical region and tenancy. There is an Azure datacenter within 100 miles of many of our home offices and I can ensure that our Github stack spins up there. Minimizing the amount of internet you have to transit to get to your applications can sidestep a lot of this stormy public cloud/internet weather bullshit.
This is exactly what we want though. We don't need the new fancy shit on a regular cadence. Issues, Code, PRs and 1 line checkbuild scripts are all we care about. Everything else is built into our software.
That by itself would be ridiculous, but there's more to it than that. Your GHE server won't have new code deployed to it hundreds of times per day the way github.com does. You probably won't be the target of ddos attacks either.
Very few of github.com's outages are the result of maintenance errors.
I know outages are frustrating, but how does 30 minutes before 10am ruin an entire day? Maybe you’re just being hyperbolic, but people take coffee breaks longer than that.
Not everyone is on pacific time. This impacted us right in the middle of a standup call and disrupted our planning for the day.
Also, the problem is more that you don't know how long the outage is going to last at first, so you start finding other ways to occupy your time. Through the lens of hindsight, yes we are certainly being hyperbolic in those cases where it was only 30 minutes.
Are we entering an era where, if we don't have hundreds of thousands of servers running 24/7 to host our services, with all the resource consumption and environmental implications that result, we will no longer be able to remain productive as a society? Is this gradually becoming a new baseline for humanity from which we cannot reasonably downsize?
It's worth noting that you can take almost any software from before the late 90s to early 2000s, depending on the vendor, that is still available, and with a layer of emulation get it running in minutes.
The vast majority of software being built today for end users simply will not function a short time from now because of aggressively built-in dependencies on cloud-based services, often with those dependencies designed to encourage customer lock-in and prevent piracy by forcing users to have active accounts and shifting core logic from endpoints to cloud services.
Even moving past licensing servers and account capabilities, tools like Grammarly ship much of their analysis to the cloud, same for most translation services. Many modern text-to-speech services are cloud-based as well (just look at how useless a modern cell phone becomes when you are without a data connection, for example).
I don't know what the statistics would look like, but I shudder to think how much of the world economy would grind to a halt if Amazon or another significant cloud provider had a sustained, multi-region outage (say 24-48 hours).
It's a god-damn mess, and we did it to ourselves.
We've been in this era for about 4 decades now. There are mainframes which do payment processing that, if they were to fail, would cause substantial harm to the global economy almost instantly.
You've no guarantees that your local'ish data centre is going to be hop-wise, route-wise and peering-wise any better than a DC 1500 miles away, in relation to your home or office ISP.
You're correct. In fact, as I type this reply my cloudflare diagnostics are indicating I am talking to a datacenter 200 miles further away than would otherwise be ideal. That said, it's still within an extremely reasonable distance. This is a "risk" I am willing to take. It's certainly a better starting point than guaranteed 70ms minimums.
Why? It's totally reasonable that your own GHE instance will have better uptime.
Running GitHub.com is much, much harder than a private instance (DB scale-out, load, ...).
Our in-house Gerrit infra and CI have had significantly better uptime than GitHub over the past year, but we have hundreds of users, not 60 million users and exabytes of storage :-)
If you can't work for a day just because GitHub is down, then there are bigger problems in your process than GitHub being down. I'm sorry if that sounds harsh, but you're either being hyperbolic or you have some real issues to fix in your team or organisation.
Not really. You can mess with stuff when it suits you with a risk for downtime. Hosting yourself has the same advantage as disabling auto-updates - you are in control of when to break stuff.
You might have lost a day today, but how many days have you gained thanks to these tools the last month?
Oh boy, this is going to be a fun day.
[0]https://news.ycombinator.com/from?site=githubstatus.com
[1]https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=fal...
Which is actually more than I expected, and seems like kind of too much.
I'm not sure I have any IDs to give to those API calls. Would have preferred something from the web UI.
I just went ahead and created new commits for them all. (Create an empty commit, or --amend to re-commit the last commit with a new timestamp).
Would prefer if there were an easier way to do it, but fighting with HTTP APIs I'm not familiar with, when they aren't immediately apparent and I'm not sure they'll work at all, was not that easier way.
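For reference, a minimal sketch of the two approaches described above (amend the last commit for a new timestamp, or add an empty commit), run against a throwaway repository. The repository location, identity, file name, and commit messages are placeholders for illustration, not anything from the thread, and whether a fresh commit actually re-triggers what you need on GitHub's side is an assumption to verify.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch can run anywhere (placeholder setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Approach 1: amend the last commit without touching its message or content.
# The committer timestamp is refreshed, so the commit gets a new hash.
old_tip=$(git rev-parse HEAD)
old_tree=$(git rev-parse "HEAD^{tree}")
sleep 1   # make sure the new timestamp lands in a different second
git commit -q --amend --no-edit
new_tip=$(git rev-parse HEAD)
new_tree=$(git rev-parse "HEAD^{tree}")

# Approach 2: add an empty commit on top; no history is rewritten.
git commit -q --allow-empty -m "Empty commit to re-trigger checks"
commit_count=$(git rev-list --count HEAD)

echo "amended: $old_tip -> $new_tip"
echo "commits on branch: $commit_count"
```

If the amended commit was already pushed, the branch has to be force-pushed afterwards (for example with `git push --force-with-lease`), which is why the empty commit is the less disruptive option on shared branches.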
You can rest at ease. Nothing that is mission-critical for the world's financial infrastructure is hosted within one of these sorts of facilities. Facebook and Netflix might go down for days, but your Amex will still work at any merchant with a functioning internet connection.
I have been inside of [financial services organization]'s datacenter in [some midwestern state] which was purpose-built for the IT load. The strategy is "failure is not an option". It's essentially one gigantic, redundant life-support system for one of the more sensitive computers on the planet. Amazon and Microsoft cannot afford to go to these lengths for the market they serve.
Financial services are important for moving money around, and processing electronic payments, but it doesn't matter how effectively you can process a wire if there are significant supply chain disruptions, and systems failures that take down the platforms that major retailers and distributors use for logistics.
Even if ATMs and bank networks remain up, what about the encashment and physical security services that those institutions rely on to move around actual physical money?
The economy is more than just financial services, and all of those financial services are just proxies for the real world goods that people need to survive for more than a few days in most urban centres.
What is really rough about GHE is that you can't choose a lot of the features or IMO baseline requirements like caching that you've probably come to expect from github.com, and may have been around for years. At least not until they can get GHE to parity with .com.
At the very least, they use just absolutely incompatible yaml files for their CI pipelines (in, of course, an incompatible location in the repo)
But probably the bigger obstacle would be their incompatible API (and incompatible auth to it); that means one cannot grab a cool "github bot/tool for doing $X" and expect it to do anything reasonable in GitLab
Ok, pick a ticket and do some work on it locally, when that's done do the same with another. I can go a full day without interacting with github because I'm working on a local branch. Make a note of what branches you need to push later. I can't possibly imagine throwing my arms up and saying the day is wasted because I have to work locally. It's completely unbelievable.
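The "make a note of what branches you need to push later" step can be left to git itself. A minimal sketch, assuming a throwaway local repository with a local bare repository standing in for the unreachable remote; the paths, identity, file name, and the `ticket-123` branch name are all hypothetical.

```shell
#!/bin/sh
set -eu

# Placeholder setup: a bare repo plays the role of the remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "dev@example.com"
git config user.name "Example Dev"
git remote add origin "$work/remote.git"

# One commit that is pushed and tracked before the "outage".
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git push -q -u origin HEAD

# Work done locally while the remote is unavailable.
git checkout -q -b ticket-123
echo "two" >> notes.txt
git commit -q -am "Work on ticket 123"

# List local branches that have no upstream or are ahead of it.
unpushed=$(git for-each-ref \
  --format='%(refname:short)|%(upstream:short)|%(upstream:track)' refs/heads |
  awk -F'|' '$2 == "" || $3 ~ /ahead/ { print $1 }')

echo "branches to push later:"
echo "$unpushed"
```

Branches that are fully pushed and up to date drop out of the list, so it only shows what still needs to go out once the remote is back.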
Based on my contact with GitLab's other built-in scanning tools, I wouldn't trust their vuln management further than I could throw it, so you're likely not missing much on that front
This is going to be my Slack status tomorrow.