Incident Report: Railway Blocked by Google Cloud [resolved] (status.railway.com)

547 points by aarondf 23 hours ago | 346 comments

Subsequent thread: Incident Report: May 19, 2026 – GCP Account Suspension - https://news.ycombinator.com/item?id=48204770

r721 15 hours ago |

>We have resolved this incident and a post mortem is available here.

>https://blog.railway.com/p/incident-report-may-19-2026-gcp-a...

>May 20, 07:57 UTC

https://status.railway.com/incident/I23M92U0

dang 7 hours ago | |

Thanks! The post-mortem is currently on the frontpage here:

Incident Report: May 19, 2026 – GCP Account Suspension - https://news.ycombinator.com/item?id=48204770

gcr 12 hours ago | |

> Railway owns our vendor choices, and we ultimately own this one. Your customers don't care whether the failure was Google or Railway; they see your product. Your uptime is our responsibility, and we'll keep delivering on it.

This is an excellent closing statement.

sschueller 14 hours ago | |

It should be possible to sue Google for damages in such cases. This isn't a network outage or service failure, which I would consider part of the ToS.

VladVladikoff 11 hours ago | | |

What if the reason for their stuff being shut down was a payment issue like an expired credit card or maxed credit account? Unless I missed it skim reading their post I don’t see any information anywhere about their communications with Google.

Cthulhu_ 10 hours ago | | |

It's always possible to sue, but Google has good terms of service and lawyers - I'm 99% confident that a lawsuit would end up nowhere.

redwood 10 hours ago | | |

I can assure you that Google will be giving them significant commercial incentives as an apology for this behind the scenes.

quentindanjou 14 hours ago | |

Railway say the incident is resolved, but many are still down (returning 502). On our side, we had to manually trigger a redeploy to fix it, but I believe it should have been triggered automatically by Railway. I can't understand how they can mark this as resolved while many are still down.

In total, down for >11 hours on our side.

dangoodmanUT 22 hours ago |

It has been 0 days since GCP has taken down a startup (again).

You see this at least once a year. Never heard of this from AWS or Azure.

In all seriousness, this is why we don't use them. They have the most ergonomic cloud of the big three, then absolutely murder it by having this kind of reputation.

somewhatgoated 21 hours ago | |

On the other hand i can’t remember when there was a serious outage on GCP, unlike AWS/Azure who seem to go down catastrophically a couple of times per year.

abofh 21 hours ago | | |

I've been in AWS for almost twenty years at this point. It's been a long time since I've seen a global outage of the data plane on anything. The control plane, especially the US-east-1 services? Yes - but if you're off of east-1, your outages are measured in missile strikes, not botched deployments.

adamtaylor_13 20 hours ago | | |

Perhaps you don't notice GCP outages because so few companies rely on them?

pixl97 21 hours ago | | |

GCP never goes down because they banned all their customers.

plandis 21 hours ago | | |

GCP has had outages. From a quick search it looks like they had a global outage less than a year ago:

https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1S...

JoRyGu 21 hours ago | | |

AWS goes down catastrophically but is back up in minutes/hours most of the time (as long as they aren't down because Iran blew up their data center). That's obviously REALLY bad for certain industries, but I suspect for the vast majority of their customers it's not a big deal. We've been able to isolate the damage almost every time just by having AZ failover in place and avoiding us-east-1 where we can.

corpoposter 21 hours ago | | |

IIRC the Paris datacenter flood took down a whole “region” and some data was permanently unrecoverable.

nemothekid 20 hours ago | | |

>On the other hand i can’t remember when there was a serious outage on GCP

They had a really bad global outage a year ago. At least with AWS outages are contained to a single region.

onion2k 17 hours ago | | |

You can't have 100% uptime. It's unfeasible, especially for a startup. You should be telling your customers that downtime might happen, sometimes for reasons beyond your control, and that if it does then you'll do your best to recover and to compensate them for the inconvenience. You should cultivate a relationship with your early customers that makes them feel bad for you when there's an outage rather than angry about how it impacts them. Maybe even go as far as firing the customers who give you a hard time over it. That way if your cloud provider falls over it's really annoying but not a big deal.

Your cloud provider blocking your business from running is far worse.

mlhpdx 17 hours ago | | |

None of the AWS “outages” have impacted us. They have either been regional, in which case we stand down the region (we run multiple hot regions), or didn’t involve things we need to maintain operation.

I can’t imagine AWS ever doing such a cascading delete. I mean, they have made deletion protection a difficult thing to ignore even for individual resources.

blobbers 21 hours ago | | |

Unfortunately, if everyone goes down people are understanding. If just _you_ go down, then it's oddly less forgivable.

manyatoms 20 hours ago | | |

How is blackhole-ing a customer not considered an outage?

Izikiel43 20 hours ago | | |

I still remember the one where they nuked all the storage of, I think, an Australian insurance company. Luckily the IT department had done a multi-cloud setup for backups.

devmor 21 hours ago | | |

There was a pretty bad one last summer - their IAM system got a bad update and it broke almost all GCP services for an hour or so, since every authenticated API call reaches out to IAM.

It had lasting effects for us for a little over 3 hours.

danesparza 21 hours ago | | |

You can read the parent post, right?

overfeed 21 hours ago | |

> Never heard of this from AWS or Azure.

AWS does it more efficiently; it takes down many startups at a time when us-east-1 goes down.

stingraycharles 21 hours ago | | |

That’s an entirely different type of problem, and avoidable by just using us-east-2 (I still don’t understand why people default to us-east-1 unless they require some highly specific services).

xavdid 20 hours ago | | |

If my cloud provider brings my startup down, it's my problem. If they bring all the startups down, that's their problem.

yandie 20 hours ago | | |

During the 5 years of my startup, we had only 1 outage due to AWS because we picked us-west-2 as the primary region. If anyone starting a company picks us-east-1 as the primary region, they should be fired. There's absolutely no reason to be in that region.

mgfist 20 hours ago | | |

And we all celebrate it since we can't do any work

Spooky23 20 hours ago | |

https://en.wikipedia.org/wiki/Timeline_of_Amazon_Web_Service...

Azure nerfed the front door of all Azure and O365 services last year.

All of these companies are great at what they do, and occasionally fuck up.

OsrsNeedsf2P 18 hours ago | |

AWS has throttled our service so badly that we couldn't operate. I was thinking of writing a blog post about how they stalled our growth for a month but it seems moot

rozap 21 hours ago | |

Yep, we also don't touch them for this same reason.

abrookewood 22 hours ago | |

Yep, agree 100%. Such a stupid move on their behalf.

jameson 22 hours ago | |

What was the reason GCP took down a startup previously?

__s 21 hours ago | | |

hn.algolia.com gcp blocked

https://news.ycombinator.com/item?id=46731498 https://news.ycombinator.com/item?id=33360416

Then I recall https://news.ycombinator.com/item?id=45798827

https://news.ycombinator.com/item?id=33737577

busterarm 21 hours ago | |

Hetzner and OVH also do this all the time.

It's AWS and Azure that are the outliers and tend not to care too much what their customers do with their infrastructure. AWS is perfectly fine with allowing me to run copies of 15 year old vulnerable AMIs copied from AMIs they've long since deprecated and removed. Even for removed features like NAT AMIs.

tjpnz 22 hours ago | |

AWS normally contacts you first.

kevin_nisbet 22 hours ago | | |

Do they?

The only anecdotal thing I've seen is we hired a vendor to do a pentest a few years ago, and they set up some stuff in an AWS account and that account got totally yeeted out of existence by AWS, if memory serves.

cherioo 22 hours ago | | |

They better do. What is google doing?

tardwrangler 19 hours ago |

Everyone is eager to point a finger at Google, but I've been a user of Railway for a while now, and I've seen enough nonsense to want to hear what GCP has to say about this before drawing any conclusions. Let's just say Railway has had problems like this before, and the way their team handles them does not inspire any confidence.

Regardless of how it happened, for me, this is the straw that broke the camel's back.

valgaze 20 hours ago |

May 2024 UniSuper incident: https://cloud.google.com/blog/products/infrastructure/detail...

https://www.unisuper.com.au/about-us/media-centre/2024/a-joi...

A joint statement from UniSuper CEO Peter Chun and Google Cloud CEO Thomas Kurian

8 May 2024

UniSuper and Google Cloud understand the disruption to services experienced by members has been extremely frustrating and disappointing. We extend our sincere apologies to all members.

While supporting UniSuper to bring its systems back online, Google Cloud has been conducting a root cause analysis.

Thomas Kurian has confirmed that the disruption arose from an unprecedented sequence of events, where an inadvertent misconfiguration during provisioning of UniSuper’s Private Cloud services ultimately resulted in the deletion of UniSuper’s Private Cloud subscription.

This is described as an isolated, “one-of-a-kind occurrence” that has never before occurred with any Google Cloud client globally. This should not have happened. Google Cloud has identified the sequence of events and taken measures to ensure it does not happen again.

Why did the outage last so long?

UniSuper had duplication across two geographies as protection against outages and data loss. However, the deletion of the Private Cloud subscription triggered deletion across both geographies.

Restoring the Private Cloud required significant coordination and effort between UniSuper and Google Cloud, including recovery of hundreds of virtual machines, databases, and applications.

binarycleric 22 hours ago |

How the heck do these things happen, especially with companies with huge monthly spend? At my last job we had some suspicious workloads running on AWS and our TAM reached out to us before taking any action. Who wants to bet this was some AI automation gone wrong and because GCP seems to be allergic to actually contacting a human to get a response, this just sits in some support queue that outsourced workers look at after a few hours just to give a canned response?

BitWiseVibe 22 hours ago |

As someone who runs some public APIs, the amount of spam from Railway IPs is insane. They have horrible abuse prevention. Hopefully this encourages them to improve their operations.

nikcub 20 hours ago | |

This is the conflict at the center of running a hosting company - make it easy to sign up and you get a lot of new users but also a lot of abuse.

Implement anti-abuse measures and you will hit some loud false positives (this may be the case with GCP here).

I don't envy anybody running a hosting co - the internet is a really ugly place under the surface.

edit: to add - AWS are really good here. Must be the ~30 years of retail fraud and abuse experience.

duckmysick 17 hours ago | | |

Hetzner is famously aggressive with their KYC (Know Your Customer) requirements, often locking new sign-ups and asking for photos of ID.

Damned if you do, damned if you don't.

bootsmann 17 hours ago | | |

Is it really a false positive if railway lets people run abusive services on GCP and then GCP consequently shuts them down?

edelbitter 19 hours ago | | |

I continue to receive phishing via AWS pretending to be Amazon. And not even the Unicode-lookalike shenanigans that my spam filter refuses for excessive mixed scripts, no; literally claiming to be Amazon as in: the company that operates the relay.

swyx 17 hours ago | | |

i wonder if DID or World (various ways of Proof of Human) can help solve this issue.

fjni 23 hours ago |

Wait… railway runs on GCP? Didn’t they make a whole thing about not “building a cloud on top of another cloud?”

Or did they just mean that they’re not renting VPSs but only metal from the cloud provider?

In my mind I was so excited that there was another provider not just paying one of the hyperscalers but at a minimum colocating and owning more of their stack. https://blog.railway.com/p/heroku-walked-railway-run

miniman1337 23 hours ago | |

from the blog linked via Wayback Machine. "From Day 1, we had this notion at the forefront.

The other notion that we have intuited is that you can’t build a cloud on another cloud. We have devoted years of practice running our own metal (and playing well with other clouds) to make sure that Railway’s business, which invariably becomes your customer’s business, is as rock solid as possible."

dlcarrier 20 hours ago | | |

I'm not familiar with Railway, so this might not make any sense, but it's possible they were using their own hardware but managing it with Google accounts. It's not uncommon for a company's offsite human-to-human communications to fail when there's a Google outage or ban, so it's not unexpected to have the same interference with human-to-machine or machine-to-machine communications.

MrDarcy 22 hours ago | | |

That’s strange, when I interviewed with the founder a few years ago he told me they were on AWS wanting to move to firecracker.

eoswald 23 hours ago | |

Yep, and this is why I'm pissed. They lied. They're completely dependent on GCP. So, I gotta do some research; I need something a little more stable (and less dependent on one company's whims) than this. This is bad for them, because it really strikes at the heart of their 'big claim': peaceful software deployments. This is chaos.

chatmasta 21 hours ago |

I thought Railway was building their own data centers? [0]

> The fact of the matter is, you simply cannot build a cloud on someone else’s cloud.

Indeed…

[0] https://blog.railway.com/p/launch-week-02-welcome

QuinnyPig 19 hours ago | |

Vercel seems to be pulling it off. So does PlanetScale, albeit for databases only. But everything’s a database.

ksajadi 18 hours ago |

When you sign up for Railway, they have an uncommon way of making sure you have read and understood their T&C regarding abuse of their systems, including crypto mining, etc.

My guess is that many are abusing their free tier, causing them trouble with their service providers.

I take no joy in seeing Railway take a hit like this, even as a competitor, but free compute attracts all sorts of strange users. We've been there and decided early on to avoid free compute even if it costs us our top of the funnel.

eoswald 23 hours ago |

Sorry, I have a hard time blaming Google for this, when Railway seems to be having increasing trouble keeping the platform stable. Something like this should NOT take down an ENTIRE service. There should be a backup when literally your business is about being the reliable backend. This just seems like poor planning to me.

ryanisnan 23 hours ago | |

I don't quite know what you mean. Do you really expect Railway to use a multi-cloud architecture to host all of their clients' projects? I suspect that would lead to a lower availability, all things considered.

eoswald 23 hours ago | | |

Well, by the same token, is it smart to base your ENTIRE architecture on a single cloud architecture? Isn't that why some of us build in fallbacks for AWS-hosted services? I mean, their entire platform, both public and private facing, is running on the same thing. One error, one problem, takes out the entire service.

impulser_ 23 hours ago | | |

They literally own their own data centers. That's what's surprising about this. They are lying to their customers when they say they operate their own data center because obviously they don't if everyone's apps are down with GCP blocking their account.

cactusplant7374 23 hours ago | |

Disaster recovery is pretty expensive, right? Especially for their size.

UrbanNorminal 22 hours ago |

Is Google allergic to humans or something? Can't they just send an email or call the company before taking a wrecking ball to the entire company's infra? Are they stupid?

BarryMilo 21 hours ago | |

Surely this is automated. They wouldn't waste precious dollars on employing humans just to keep other humans happy.

snypher 19 hours ago | | |

It surprises me there's not a manual review for $$$$ accounts. Speculation at this stage, but it's weird they would be put in the Recycle Bin like that.

lateral_cloud 19 hours ago | |

Keep the pitchforks at bay for now. No one knows what actually happened yet and we are only seeing one side of this outage.

faangguyindia 1 day ago |

Google Cloud also locked out a Korean government organization recently. The guy posted on the GCP subreddit.

Google really needs to improve their support team. It's strange such a big corp can't even afford to have proper support team.

danpalmer 23 hours ago | |

> It's strange such a big corp can't even afford to have proper support team

Railway say they are in touch with that support team.

shooker435 22 hours ago | | |

god help them

choilive 22 hours ago | |

Not strange, Google has never had a proper support team unless you are an "Enterprise" level customer.

aranelsurion 13 hours ago | |

> support team

They must’ve upgraded them to Gemini 3.5 by now.

benwoodward 22 hours ago | |

pretty sure their support team is a flaky ML model that is haplessly flagging random accounts

King-Aaron 23 hours ago | |

> It's strange such a big corp can't even afford to have proper support team

This seems to be by design.

ndneighbor 22 hours ago | | |

We have a CSM, Head of Customer Support contact, and further contacts with GCP. Despite that, we still had this issue.

add-sub-mul-div 22 hours ago | |

Automating support, automating everything is the key to their whole deal. Tech giants leapfrogged the rest of the economy by innovating a company that can scale its customers without having to scale itself proportionally.

bearjaws 22 hours ago |

I will never leverage GCP in an enterprise setting; it's honestly amazing how hard they fumble the bag. It will be interesting to see when GCP support started working with them. From the updates, there was an hour and change between when they identified the issue and when GCP support was confirmed.

In the cloud space it seems like AWS does nothing and wins.

brokenodo 21 hours ago |

Well, as a 2 week tenured and very happy Railway customer until now, I am now a Render customer. Somehow DNS cut over within 1 min(!) and live after about 30 minutes of work. Not bad!

DrewADesign 21 hours ago | |

In my experience, DNS changes are a lot faster than they used to be. There’s some website that has a map that tries to resolve your domain with a bunch of name servers around the world that was pretty neat to look at last time I migrated something.

nbarbettini 20 hours ago | | |

I became so conditioned to waiting hours(!) for DNS propagation that I'm always pleasantly surprised when it takes <5 min these days.

twostorytower 19 hours ago | |

I love pointing my name servers to Cloudflare so any DNS changes from that are practically instant.

swyx 17 hours ago | | |

as with many things, we say we like decentralization but quietly vote for centralization

Avicebron 23 hours ago |

Isn't Railway the "the API key to delete the backups is in the prod database, because that's where the backups live duh" guys?

trvz 17 hours ago | |

No, this is the company that failed those guys.

You should also read the story, as you're perpetuating a false version of it: https://x.com/lifeof_jer/status/2048103471019434248

codegeek 22 hours ago |

This is bad. Even their own website is down at railway.com. Looks like total dependency on google cloud. Surprising for a company of their scale with all this VC money.

choilive 22 hours ago | |

They run a decent amount of their own compute/bare metal server for customer workloads. But likely still had some critical dependencies on GCP.

rmeara 18 hours ago | |

Google has a total dependency on its own infra and does fine. Why do its customers need multicloud? Huge PITA unless you need an absurd number of 9s.

cube00 19 hours ago | |

> Surprising for a company of their scale with all this VC money.

Not sure too many VCs would be cool with deep redundancy when there's more features to build to bring in more customers instead.

whh 22 hours ago |

This could kill a startup. I really don't like Google's automated and silent account murder functionality.

MrDarcy 22 hours ago | |

There’s no way this was automated or silent.

The only reasonable explanation is Railway lost control of their estate and something was happening that warranted a group of humans to decide flipping the kill switch was the best of a set of bad alternatives.

macintux 22 hours ago | | |

You’re giving Google far more credit than they’ve earned.

faangguyindia 21 hours ago | | |

you can go on google cloud subreddit and watch horror stories

i actually built a good plan out of those horror stories for my companies.

throwaranay4933 1 day ago |

This screenshot from Discord suggests the outage was caused by an automated GCP account ban: https://x.com/acgfbr/status/2056866780866351323

Alive-in-2025 21 hours ago | |

Automated account bans are the bane of internet existence today. I was banned from reddit for "bad behavior", I appealed and both times it's oops, there was nothing there, some automated system thought your comment was rude even though it wasn't.

Then they send you very strongly worded messages that say trying to work around the ban will lead to something bad happening.

I've been worried my main email account provider would do this. The core issue is that even if you pay, even if you are a company as shown here, providers don't put careful enough limits on banning. I can only imagine they ban lots of scammy things every day so "they think it's working great".

enahs-sf 23 hours ago |

I respect what railway is doing but also would never run my business on such a platform.

eoswald 23 hours ago | |

Today changed my opinion on them completely. Was willing to give them the benefit of the doubt that they're growing fast, but now seeing that they've failed to scale properly, and are missing little things that become big things later. I can't take that risk.

dpark 23 hours ago | |

That kind of sounds like you don’t respect what they are doing.

enahs-sf 20 hours ago | | |

I think it’s good people are making IaaS platforms, but have dealt with enough firefighter hero bullshit to have seen this coming a mile away. Uptime and redundancy are strongly correlated.

usernametaken29 21 hours ago |

I didn’t know Railway, so with this misleading headline I thought a Google Cloud data centre was being built in the way of a railroad. That’d have been a funny story to read.

Polizeiposaune 19 hours ago | |

An elevated railroad once ran through one end of what is now a Google-owned building (Chelsea Market in Manhattan). It's now part of the High Line elevated pedestrian park.

astafrig 21 hours ago | |

How is the title misleading?

tauntz 16 hours ago | | |

"Railway Blocked by Google Cloud"

If you don't happen to know that "Railway" is referring to a company, then you might reasonably read that as "a GCP outage caused issues in the train network somewhere".

TheTaytay 22 hours ago |

I’ve seen a few smug “all your eggs in one basket” comments here.

I’m aware of some companies hosting their own metal and infra, but I’m not aware of large companies mitigating risk by hosting on separate cloud providers as a fallback mechanism. We might disagree with cloud provider choice, or think they should have been hosting their own metal, but that’s still an “all your eggs in one basket” choice, right?

Heck, they might even have multi-region fallback with GCP, but if GCP bans your account, that doesn’t matter.

Are there good examples of running a company of railway’s size so redundantly that their host could nuke one of their accounts and they’d just keep on trucking?

fontain 22 hours ago | |

They do run their own metal. That’s their entire ethos. Railway is their own cloud.

chradams 22 hours ago | |

Just google multi-cloud. Yes. It's a thing.

wmf 22 hours ago | | |

99% of multi-cloud is fake though. True multi-cloud is incredibly rare.

padolsey 22 hours ago |

Does anyone know how this even happens inside the walls of google? Is it an automated process? How is such a (presumably) high revenue account just magically blocked without human intervention? I'm quite perplexed.

jpollock 22 hours ago | |

There would have been efforts to contact them, but it would have been via their contact method, aka the email they set it up with.

Common ways this happens? They are using a credit card to run their business with no backup payment method. Then the company's contact person is on vacation.

mbreese 21 hours ago | | |

Yeah, I'm not sure what to think here. We know Google is not the best at customer service and has automated account suspensions. But, what I'm curious about here is why this happened.

Railway hosts applications for customers. An uneducated guess for some possible reasons: 1) one of those customers hosted something they shouldn't have 2) railway had something spawn that took up too many resources 3) Or their account balance was too high 4) Or something...

But all of this probably culminates in someone needing to read an email that was missed.

Scaling a customer infrastructure setup like Railway is hard. This is one of the non-technical hard parts - how to make sure your account with your primary vendor is safe. But, I'm willing to wait to pass judgement here until more information is available. I'm sure the post-mortem will have lessons. I'd like to know more.

thayne 20 hours ago | | |

> via their contact method, aka the email they set it up with

If it's anything like AWS, that may be just one of hundreds of emails they send every day, most of which are just noise.

scratchyone 22 hours ago | | |

Honestly still insane to nuke a high-volume client's business after a single payment issue. There would be no reason for Google to believe that a single hiccup like that is evidence that they won't get paid and have to cut account access immediately.

jasonkester 19 hours ago | |

Yeah, compared to the AWS experience:

I had a toy Free Tier account that managed to overstep a limit one month and rack up $0.0038 in charges.

AWS hounded me about it for an entire year before finally putting the account on hold. Then kept at it for months more before finally deleting it.

It’s like the paperboy from Better off Dead, if he were to continue delivering newspapers while hounding you for his two dollars.

zx8080 16 hours ago |

For those who opened this link to read news about the real railway (with trains), it's not about it. Thank you for wasting my time!

cube00 17 hours ago |

Railway "What we know so far: May 19th 2026": https://station.railway.com/community/what-we-know-so-far-ma...

mjy78 20 hours ago |

All in on cloud so we don’t need to worry about backups. Now your subscription is the single point of failure.

jefborges 22 hours ago |

Railway is back, but I’m not sure if I can trust keeping my projects there, so I’m going to migrate to another company.

oofbey 21 hours ago | |

After reading about how their delete database API also deletes all the backups, I concluded they are not to be trusted.

CodesInChaos 17 hours ago | | |

Don't all major clouds do that by default? But at least they have additional protections you can configure, if you know about them.

marknutter 20 hours ago | |

It's not back.

hnburnsy 21 hours ago |

From their founder on X...

"Absolutely. The Railway network is a mesh ring between AWS, GCP, and Metal

So:
- High availability interconnects
- High availability path routing between clouds
- Database itself is high availability

However, Google's VPC itself is not. So we will add a shard to Metal and AWS"

hnburnsy 21 hours ago | |

More here...

https://x.com/JustJake

thrownthatway 19 hours ago |

Huh.

Railway dot com

Has nothing to do with railways.

I wish software people would get their own words.

patrickmay 11 hours ago | |

I was also expecting a story about a physical railway being shut down.

sammy2255 21 hours ago |

The 3-2-1 backup rule is pretty outdated in the world of cloud. You could have 3 complete copies of your data in different S3 buckets, but if they're all under the same account you've lost your blast radius protection

zootboy 17 hours ago | |

It's not outdated, you just actually need to follow it. 3 copies of data in separate S3 buckets is ignoring the "2" in the 3-2-1 rule: 2 different mediums, and also the "1" rule: 1 copy offsite. In the cloud era, offsite means not on the same cloud provider. Different mediums ideally means a non-cloud provider (e.g. a NAS at your office under your control).
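A minimal sketch of that reading of the rule, for anyone who wants to sanity-check a backup plan. The `BackupCopy` fields, the provider labels, and the choice to treat "offsite" as "not on the primary provider" are assumptions made for illustration; they follow the interpretation in the comment above, not any standard tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    provider: str  # hypothetical label, e.g. "aws", "gcp", "office-nas"
    medium: str    # hypothetical label, e.g. "object-storage", "nas", "tape"


def check_3_2_1(copies, primary_provider):
    """Report which parts of the 3-2-1 rule a set of copies satisfies.

    "Offsite" is interpreted as "held by a provider other than the
    primary one", per the cloud-era reading described above.
    """
    mediums = {c.medium for c in copies}
    offsite = [c for c in copies if c.provider != primary_provider]
    return {
        "three_copies": len(copies) >= 3,
        "two_mediums": len(mediums) >= 2,
        "one_offsite": len(offsite) >= 1,
    }


def satisfies_3_2_1(copies, primary_provider):
    return all(check_3_2_1(copies, primary_provider).values())


# Three S3 buckets under one provider: three copies, but one medium, none offsite.
same_cloud = [
    BackupCopy("bucket-a", "aws", "object-storage"),
    BackupCopy("bucket-b", "aws", "object-storage"),
    BackupCopy("bucket-c", "aws", "object-storage"),
]

# One bucket, one copy on a second provider, one on a NAS under your control.
mixed = [
    BackupCopy("bucket-a", "aws", "object-storage"),
    BackupCopy("other-cloud", "gcp", "object-storage"),
    BackupCopy("office-nas", "office-nas", "nas"),
]
```

With these inputs, `check_3_2_1(same_cloud, "aws")` passes only the copy count, while `mixed` passes all three checks. An account suspension on the primary provider takes out every copy in the first plan and only one copy in the second.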

rsync 21 hours ago | |

If only there were a quick and easy way to replicate s3 buckets to an independent provider…

… on the Unix command line …

… to a cloud older than AWS…

… if only …

funtech 20 hours ago | | |

Wish I could upvote this comment account more. Too many people look for something new and shiny when trusty ol tools are sitting right there. :)

oefrha 20 hours ago | | |

Well, having backups helps, but I certainly can’t migrate my infra to rsync.net on a moment’s notice (or ever, since rsync.net does storage and nothing else) so that my customers aren’t affected.

lemagedurage 19 hours ago | | |

Inflated egress costs might make this prohibitively expensive, $80 per TB at GCP and AWS
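As a rough back-of-the-envelope sketch, assuming the $80 per TB figure quoted above (not verified against current price lists) and assuming every replication run copies the full dataset out, which is the worst case since incremental syncs transfer less:

```python
EGRESS_USD_PER_TB = 80.0  # figure from the comment above; treat as an assumption


def egress_cost_usd(data_tb, full_copies_per_month=1):
    """Estimated monthly egress cost for copying data out of the cloud.

    data_tb: size of the dataset in TB.
    full_copies_per_month: how many times the full dataset is transferred.
    """
    if data_tb < 0 or full_copies_per_month < 0:
        raise ValueError("sizes and counts must be non-negative")
    return data_tb * EGRESS_USD_PER_TB * full_copies_per_month


one_off_50tb = egress_cost_usd(50)       # a single full copy of 50 TB
daily_2tb = egress_cost_usd(2, 30)       # a 2 TB dataset fully re-copied daily
```

At that rate a one-off 50 TB copy comes to $4,000, and re-copying 2 TB in full every day for a month comes to $4,800, which is why incremental replication matters for this kind of setup.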

eclipticplane 20 hours ago | | |

I don't think that technology exists. Sorry.

whalesalad 19 hours ago | |

You replicate data to different clouds.

jaspanglia 19 hours ago |

Cloud platform dependencies are becoming a huge single point of failure

gnabgib 23 hours ago |

Dupe - join the discussion started an hour ago instead of query string work (12 points, 4 comments) https://news.ycombinator.com/item?id=48200827

aarondf 23 hours ago | |

I added the qs because it defaulted to a story from 3 months ago.

jkogara 15 hours ago |

Interestingly, upon logging in this morning I was presented with a new terms and conditions banner that required me to agree to not deploy a list of, to varying degrees, nefarious things (bots, torrents, "anything illegal", etc.). Is it likely that some of these workloads resulted in the auto restriction from GCP?

Mengkudulangsat 23 hours ago |

That explains why all my vibe-coded hobby projects are down.

Thank God I'm not dealing with any public-facing sites! Would have been an expensive lesson for a newbie coder if my job depended on this.

danpalmer 11 hours ago |

7 minutes from bug filing to account restoration. This shouldn't have happened in the first place, but that's an excellent response time from the support team.

orliesaurus 22 hours ago |

I wonder if someone has exploited a weird Google-safety automated process to report something on Railway which caused Google to block the whole thing.

whh 14 hours ago |

There's that "automated action" again. Regardless of the architectural decision, it makes me incredibly uneasy relying on GCP if these types of things can happen.

r_lee 22 hours ago |

seriously, is it possible to trust GCP with critical data/services at this point if you're not a billion dollar company?

I'm exaggerating but someone said they got "auto banned"

what if that happens to a small account which hosts some really important data/services there?

zelon88 21 hours ago |

Wild to me that any tech sector business would want to rent an operating environment to park their entire infrastructure into. This is the equivalent to traveling shoe salesmen setting up a tent in the parking lot of a strip mall.

AbstractH24 11 hours ago |

Did anyone discover any unexpected tools/websites using Railway during this outage?

brunooliv 15 hours ago |

Having tried many of these hosting services to host/play with toy apps, DigitalOcean and Fly.io are both unparalleled GOATs.

mattbee 17 hours ago |

The risk of an "upstream cloud provider" is not something you need to tolerate in your supplier of internet infrastructure!

dlcarrier 19 hours ago |

This is the kind of outage worthy of a Kevin Fang video.

tux 22 hours ago |

At this point you can’t trust Google anymore, it keeps breaking things. Imagine having Google AI do these things automatically. We’ll have an apocalypse in a day.

yomismoaqui 15 hours ago |

Remember, the cloud is someone else's computer.

If that person turns it off you're screwed.

dwa3592 22 hours ago |

Wait, I thought Railway was a cloud provider like AWS or GCP, but better and more agile. At least that's the impression I got from their website.

pavelevst 19 hours ago |

Avoid vendor lock-in, have backups, and keep a disaster recovery standby (or a plan for quick recovery elsewhere).

leventhan 20 hours ago |

What's a good alternative to Railway?

brokenodo 23 hours ago |

I’m a new customer and have been falling in love with Railway over the last 2 weeks, but this is quite the wake up call.

csw-001 23 hours ago | |

Literally in the same boat. I've been really happy with it, but this is a major eye opener.... It's been down for a looooong time by provider standards.

reelvideocap 23 hours ago | | |

same

choilive 22 hours ago | |

Been a customer with them for over a year now, small incidents here and there but never anything this major.

TheAtomic 22 hours ago | |

same same

steve1977 19 hours ago |

Lesson learned: don't rely on a single hyperscaler, even (or especially) as a startup.

burnerRhodov3 19 hours ago | |

I just... I don't really understand why startups even use AWS, GC, or any other cloud hosted software? Hetzner, etc. are all extremely cheap, and honestly scale so well... Code nowadays is cheaper for configs, and having full control over your compute is... liberating.

dannersy 19 hours ago | | |

Low cost to entry, easy to get scale from the beginning if you need it. The large cloud providers throw free credit at startups to lock them in all the time. I had a short lived stint trying to get my own startup off the ground and it was really easy to get free compute from Google with no strings attached. This was many years ago now, but I would be surprised if it is any different.

I am with you entirely and would not have taken that route today, but it is really easy to see why people go that route.

antran22 18 hours ago | | |

A few years ago, when I was kinda active in the startup scene in my area, you had people selling access to cloud credits at pennies on the dollar. The credits are given out liberally to big corps and organizations by AWS/GCP, through workshops, webinars, and events, all in the hope of roping departments into building MVPs and demos on AWS/GCP. But people also find ways to cheat that system and make some quick bucks.

I know a startup among my acquaintances that has been running on AWS for 5 years straight without paying a single dollar to AWS. When the credits had almost run out, they migrated their data over to another account with credit. That has happened twice already.

It helps to have a portable, replicable IaC config. But also this is sustainable because they are a pretty small struggling shop. You will probably not be able to do this if you are trying to maintain more than 3 nines for an enterprise client.

chi_features 17 hours ago | | |

Perhaps Railway does a bit more than what you think, they have some great functionality (I'm not affiliated with them). Check out [Features | Railway](https://railway.com/features) "PR Environments", they are incredible for the QA process

steve1977 19 hours ago | | |

Oh absolutely... and many use architectures that have evolved out of the needs of really big companies and are not really a good fit for a startup. But I guess they want to be "ready for growth".

bilalq 20 hours ago |

Building a startup on GCP (or even Google Workspace) is an existential risk.

redanddead 22 hours ago |

one of the many reasons companies are cloud agnostic and don't want to get locked in

fh67 21 hours ago | |

Yeah, but only until you find that the new cloud provider won't approve your compute quota, or doesn't have enough capacity in the region, or you hit fraud flags for a stagnant account spinning up lots of compute.

parineum 21 hours ago |

There's a lot of, what seems to me, unfounded blame being directed at Google for this. Isn't railway the company that just blamed Anthropic for deleting their prod database?

mmmore 21 hours ago | |

Nope, Railway was the company who was hosting PocketOS, which is the company that blamed Cursor for deleting their prod database. Railway is only involved insofar as their API allowed an instant delete of the prod database.

oofbey 21 hours ago | | |

Railway deserves a lot of blame here. Deleting backups along with the database is a lot like not having backups. Moronic design choice.

sidrag22 21 hours ago | |

fairly certain you are remembering the goofy article that was going around where a railway user allowed an agent to delete his db. iirc he questioned the agent after and the agent told him it should have read the file that told him not to do things, so just sounds like he deleted his db and blamed his tools.

jujube3 21 hours ago |

If you buy a cloud-on-a-cloud, you're a clown-on-a-clown.

koolhead17 20 hours ago |

Let's blame some rogue AI agent at GCP for causing this.

ryanisnan 23 hours ago |

Yikes. I was wondering why my TLS certs were coming up as invalid.

bshack0 23 hours ago |

so....what are we switching to y'all? cloud-run ? ;P

auxiliarymoose 23 hours ago | |

federated hardware (a bunch of raspberry pis networked into a high availability kubernetes cluster, hidden across various local coffee shops for free power and bandwidth)

throwatdem12311 23 hours ago | |

raspberry-pi cluster in my closet

frio 22 hours ago | | |

16GiB Raspberry Pi 5s in my country are now going for ~$450USD, so I've gotta say that's out of reach for me now :(.

eezing 20 hours ago |

“Deletion of private cloud subscription…”

Who deleted it?

isninkhamiss 22 hours ago |

github got way more noise for less

ChrisArchitect 22 hours ago |

Earlier: https://news.ycombinator.com/item?id=48200827

Drew-Aetherwave 22 hours ago |

It is killing me...

mcontrerazCL 23 hours ago |

all my fkn postgres db is in railway! what do i do now?

eoswald 23 hours ago | |

Hahah at least you're not getting called every five minutes because you can't shut off the alerts, because it's apparently deployed SOMEWHERE but good luck finding how to access it. Can't wait to see the bill from Twilio because of this lol

cactusplant7374 23 hours ago | |

Take a walk. Breathe in the fresh air. It feels good.

jamwise 19 hours ago |

There goes a 9

WhereIsTheTruth 17 hours ago |

When your cloud depends on another cloud

All these companies are frauds

Osborn_Ojure 22 hours ago |

compute recovered, get ready boys!

fnord77 20 hours ago |

wish I knew what "railway" is

iloveplants 1 day ago |

seems like it's every day

shevy-java 20 hours ago |

Do not become dependent on Google. Ever.

rvz 22 hours ago |

Let me guess… Googler running AI agent in production that blocked this startup’s account.

paganel 15 hours ago |

Apparently this has nothing to do with real-world trains or the real-world rail system. At first, reading the title alone, I had thought that some trains might have gotten stuck somewhere because of an IT (Google Cloud) failure. It's just another SaaS story.

rekabis 23 hours ago |

TL;DR: putting all your eggs into one basket is bad, man.

lfx 23 hours ago | |

That’s true, however having only a few eggs and shopping for several baskets does not make sense in the early days. Not sure how big Railway is, but usually you start small with one egg.

christophilus 23 hours ago | | |

You’d think they wouldn’t have started with GCP. There are plenty of datacenters where you can buy racks and racks of servers, and talk to a human when something goes wrong, and even walk in and access your servers. That’s what I’d be using if I were to build a Rackspace today.

rekabis 23 hours ago |

TL;DR: putting all your eggs into one basket is bad, man.

Aachen 7 hours ago | |

Note, you submitted a dupe: https://news.ycombinator.com/item?id=48201711 (the comment I'm replying to is 1 ID older so I guess this is the canonical one to reply to)

canpan 22 hours ago | |

How to handle domains? The rest is easy, but your domain registrar blocking you sounds like a pain. My current solution is to use a local small provider, just for the domain. Then if there is a problem with your play account it is out of any blast radius.

FlamingMoe 22 hours ago | | |

What do you mean by local small provider? A registrar on main street?

rekabis 18 hours ago | | |

What the deuce are you blathering on about? An account got blocked, this has nothing to do with a domain.

And I’m talking about having disparate failovers that don’t rely on a single hosting provider. At that point, who cares what Google does to your cloud account… work with the hot failover and spin up another hot failover somewhere else.

truekonrads 22 hours ago | | |

MarkMonitor

binarycleric 22 hours ago | |

Same applies to all the companies betting the farm on AWS.

rekabis 18 hours ago | | |

Precisely. If you’re going to have a hot failover, it behoves you to have an entirely separate entity billing you for that hosting.

Honestly, I don’t know where the downvotes are coming from. Do people have no clue about service resiliency? I can understand if it’s a personal project or you haven’t yet scaled to paying customers, but anything at scale with serious money involved needs to be completely independent of the underlying hosting. It should remain up even if an entire provider goes titsup.