Tell HN: Wells Fargo completely offline

292 points by p3nt3ll3r 7 years ago | 239 comments

Wells Fargo is completely offline. How does this happen? Been this way since 6am PST!

jedberg 7 years ago |

For everyone complaining about "how could they not have failover", let me ask you this: Would you take a job with Wells Fargo to fix their infrastructure?

And if you did take such a job, how long do you think it would take you to get the budget and approvals from all the auditors necessary to fix everything?

How much would they have to pay you to take that job, knowing how frustrating it would be to get anything done?

And now you see why banks have such terrible IT.

(One of my mentors actually works at BofA, and says he only does it because he gets to work 6 hours a day, gets a VP title and a ton of money, but nothing ever gets done)

WhitneyLand 7 years ago | |

This gives a good idea of a very real phenomenon that plays a huge but largely hidden role in the failure, success, and form that technology takes at companies.

It's not just infrastructure. Quality, innovation, speed, efficiency, etc. To whatever degree you believe the old 10x programmer meme, believe this: Change that constant 10 to any value for many employees at once at a company and the impact is staggering.

Good people don't think they are too good for these types of companies. They think the companies are too constraining and place a limit on their ability to realize dreams or their potential to perform.

They prefer not to work with less passionate peers, not because such people are lesser human beings to be avoided, but because passionate peers give them another edge. Constant, motivating cross-reinforcement during discussion doesn't just inspire; it creates action.

Finally, most of these companies don't have tech as their core business. That correlates very directly with the fact that the most senior and influential people in the company will care less about you, about being willing to partner on ideas, etc. It's not personal. Banks think bank stuff is most important and anything else is less important, even if it's a critical operational component of the business.

Gokenstein 7 years ago | | |

If you're the best engineer for a company whose bottom line is not direct sales of engineered products, you are a middle tier or bottom tier engineer on the open market.

You're the best engineer willing to work in a black box, a niche, in isolation from your peers with constant pushback.

tnolet 7 years ago | |

Well, I bet there are continents of people who would love a corporate IT job at WF. You make it seem like it's equivalent to mining coal in the 1920's.

tastroder 7 years ago | |

It's not like people blame the staff working on wiring the network hardware here. For any company in that position I'd assume they realize their dependence on that part of the infrastructure for their core business. I don't really see the relevance of how long compliance processes take, that's part of the industry and WF didn't discover the internet last week. Salaries in finance reflect that frustration, who cares.

If the answer to that challenge was one of "too hard", "it works for now" or "too expensive" that's still a strategic failure that impacts their core business now.

close04 7 years ago | | |

It's because management doesn't want to touch the topic. No CIO wants to risk breaking something so they all pray to God nothing breaks on their watch and they can get their retirement package or move on to the next job and it's someone else's problem.

This is why sometimes absurd amounts of money are paid to keep the old system going as it is. It's "safer" than to move to a new one with the associated risks and teething issues.

benoror 7 years ago | |

Nice mentor you got there

jedberg 7 years ago | | |

Heh, well he's also a brilliant technologist. He just has other priorities now.

dalbasal 7 years ago | |

You can sympathize, mock or rationalize through the corporate-political logic that inevitably leads here... the interesting points remain the same: most tech operations suck.

This is not really related to costs. It's related to culture. Costs and quality of software, including infrastructure, range by orders of magnitude per theoretical quanta of software.

We don't really know how to fix this.

msla 7 years ago | |

> And if you did take such a job, how long do you think it would take you to get the budget and approvals from all the auditors necessary to fix everything?

The only people who think auditors are bad are the ones trying to pull something an auditor would catch.

smolder 7 years ago | | |

I don't think that's a reasonable conclusion. I had to work on software in a heavily audited (finance) environment and there were a lot of bureaucratic obstacles related to that, but the code was fine and unaffected by the whole process. I would have been happier not having to think about audit requirements and just focusing on making working software.

nemothekid 7 years ago |

Apparently there was a fire in a data center - there's a thread on /r/sysadmin by an insider.

https://np.reddit.com/r/sysadmin/comments/ao4g2y/wells_fargo...

Talyen42 7 years ago |

3rd largest bank in the world offline for an entire day because a smoke alarm went off

great job guys

wolfpwner 7 years ago | |

More like 3rd largest bank in US

qaq 7 years ago | |

pretty interesting attack vector

drdeadringer 7 years ago | | |

I feel like I've seen this in 'Mr Robot' or similar.

clairity 7 years ago |

for those who haven't taken the plunge, now is a good time to move to a credit union, regional bank, or online bank:

- for instance, in cali: https://www.golden1.com/

- high interest accounts: https://www.bankrate.com/banking/checking/best-checking-acco...

- good rate online bank: https://empower.me/

get better service, lower/no fees, and good convenience without all the economy-damaging selfishness and hubris.

bridanp 7 years ago |

Regional bank checking in. We've been told to reroute certain workloads through other means until 6am central Friday. Regardless of that, the story coming out of this over the next few days is really going to help Business Continuity departments at other institutions plan for this scenario. My teams are running mock scenarios this afternoon based on the auto power shutdown causing a data center to be unable to fail over. Might end up being something else entirely, but it's still a good scenario to investigate.

a3n 7 years ago | |

I get paid by direct deposit into my WF acct between midnight tonight and 6am tomorrow. I suppose I can eventually rely on my employer keeping records and replaying everything.

bridanp 7 years ago | | |

If ACH areas are involved, hopefully they already had those queued a day early per normal routines and you'll (hopefully) be fine. There are exceptions allowing companies to provide ACH files late. That's the exception and not the rule.

habosa 7 years ago |

For those who have worked in bank IT, I have some questions.

So these days there's very little physical money. Most of my "wealth" is just entries in various digital ledgers. My bank says I have $XXX and my brokerage says I have $YYY and my retirement account says I have $ZZZ.

Let's go with the bank account case. Is it possible that a catastrophic accident or attack could wipe my balance down to $0 with no way to recover? What if a data center was nuked? What if two data centers were nuked?

How much redundancy is in the system? Are there third-party agencies that track private bank ledgers? How hard is it to take them out too?

Ever since I read "The End of Alchemy" (a great book, btw) this thought has haunted me.

rubbingalcohol 7 years ago |

You can still access Wells Fargo Online via the direct link: https://connect.secure.wellsfargo.com/auth/login/present?ori...

sxates 7 years ago | |

Confirmed - was able to log in. Some account info is still there, though my mortgage is unavailable. Maybe they'll just lose it?

howard941 7 years ago | | |

Your deposits, sure, your trust, of course. The mortgage loan? Fat chance.

chevas 7 years ago | |

2FA not working on main URL or this direct link.

webdestroya 7 years ago |

From their API team:

Please be advised that the Wells Fargo Gateway is currently unavailable.

We're experiencing system issues due to a power shutdown at one of our facilities, initiated after smoke was detected following routine maintenance. We're working to restore services as soon as possible.

We apologize for the inconvenience as we continue to work on a resolution.

We will update you no later than 3:00 pm, Eastern Time.

rococode 7 years ago | |

Amazing that one of the largest banks in the US is somehow still (apparently) reliant on a single location.

williamstein 7 years ago |

Wow, functionality is still significantly degraded over 24 hours later. I just tried to log in to check on an account balance, and the site was extremely slow. It showed my balance, but failed every time to load the transaction history with "Error in external system" or something like that. The sign in page still shows: "Alert: Some customers may be experiencing issues accessing online and mobile banking. We apologize for any inconvenience."

rolph 7 years ago |

OK, Hanlon's razor applies here:

https://en.wikipedia.org/wiki/Hanlon%27s_razor

Recall Stuxnet: it's possible this was an attack, some sort of malicious mod to firmware, but... Hanlon's razor.

from the reddit:

"throwawayfordays75 1399 points 8 hours ago*2

Throwaway since I have first hand knowledge. Fire suppression went off in one of their main Data Centres from some utility work this morning. No power to any of the network or compute equipment and some failovers did not work as expected. "

At this point I'm wondering what "utility work" was happening.

hinkley 7 years ago | |

That user posted an update:

"everything minus core network gear was manually being unplugged from any PDUs to help the control the initial power-on."

Can't we... Why don't we have rack hardware that can handle this situation? I thought some HDD RAID solutions had circuitry to keep them from browning themselves out while spinning up the disks. I guess I'm surprised this isn't a solved problem at the rack level now.

Or have we been so focused on never cold booting a rack of servers that we haven't spent any effort on foolproofing of cold booting a rack of servers?

[Edit: answering my own question] apparently these exist and are called Managed PDUs. Can we deduce WF doesn't have them?

mustardo 7 years ago | | |

Some servers "stage" power-on to disks, as spinning up disks is often the largest source of inrush current. For example, a 10-disk server might only apply power to bays 2 at a time, with a delay between each. This doesn't solve the inrush current problem of switching on a rack full of servers together (which is where managed PDUs come in).
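A rough sketch of what sequenced power-on amounts to, in case it helps. Everything here is made up for illustration: the `Outlet` type, the `plan_power_on` function, and the amp figures are assumptions, not any vendor's PDU API. The idea is only that outlets are switched on in batches, so that the steady draw of what is already running plus the inrush of the batch being switched on stays under the breaker limit.

```python
from dataclasses import dataclass


@dataclass
class Outlet:
    name: str
    inrush_amps: float  # brief peak draw at switch-on (assumed figure)
    steady_amps: float  # draw once the device has settled (assumed figure)


def plan_power_on(outlets, breaker_amps, settle_seconds=5.0):
    """Greedily group outlets into batches that can be switched on together.

    Returns a list of (start_time_seconds, [outlet names]) tuples. A batch is
    closed as soon as adding the next outlet's inrush would exceed the limit.
    """
    schedule = []
    running_amps = 0.0  # steady draw of everything already switched on
    batch, batch_inrush, batch_steady = [], 0.0, 0.0
    t = 0.0
    for outlet in outlets:
        if running_amps + batch_inrush + outlet.inrush_amps > breaker_amps:
            if batch:
                # Close the current batch and wait for it to settle.
                schedule.append((t, batch))
                running_amps += batch_steady
                t += settle_seconds
                batch, batch_inrush, batch_steady = [], 0.0, 0.0
            if running_amps + outlet.inrush_amps > breaker_amps:
                raise ValueError(
                    f"{outlet.name} cannot be powered on within {breaker_amps} A"
                )
        batch.append(outlet.name)
        batch_inrush += outlet.inrush_amps
        batch_steady += outlet.steady_amps
    if batch:
        schedule.append((t, batch))
    return schedule


if __name__ == "__main__":
    rack = [Outlet(f"server{i}", inrush_amps=10.0, steady_amps=3.0) for i in range(1, 5)]
    for start, names in plan_power_on(rack, breaker_amps=20.0):
        print(f"t={start:>4.1f}s  switch on: {', '.join(names)}")
```

A real managed PDU would do this with per-outlet delays configured on the device. The greedy batching above is just one simple way to show why staggering helps.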

rolph 7 years ago | | |

It would seem to be a possibility. I'm also thinking of a 10 finger power up sequence, as in: insert power cord 1 into power socket 1, wait till audible beep, insert power cord 2 into power socket 2, wait till audible beep, etc. :-D

api 7 years ago |

Too big to fail over?

rawrmaan 7 years ago | |

Underrated comment

0xfeeddeadbeef 7 years ago |

Looks like Mr. Robot is at it again.

nodesocket 7 years ago |

I thought at one time Stripe used Wells Fargo as their bank. Any reported problems from Stripe?

patio11 7 years ago | |

(I work at Stripe.)

This does not impact Stripe customers at this time. Specifically, the Stripe API and Instant Payouts are functioning normally.

(We're aware of the issues Wells Fargo is experiencing, and have an incident response spun up internally.)

nodesocket 7 years ago | | |

Interesting, I would have thought if the bank API that backs Stripe is unavailable, there would be impacting issues such as payouts.

driverdan 7 years ago | |

I signed up for Stripe in 2011 or 2012. About six months after I signed up an account rep from Wells Fargo called me asking about my merchant account. Stripe never made it quite clear what was going on but apparently in the early days they opened accounts at Wells Fargo on your behalf without telling you.

patio11 7 years ago | | |

I also signed up for Stripe in 2011. The Terms of Service mentioned Wells, beginning in the first paragraph. You might want to look in your files and/or the Wayback Machine. https://web.archive.org/web/20111017081436/https://stripe.co...

dawnerd 7 years ago | | |

Wasn’t Wells Fargo caught for making accounts for people without permission just to boost their numbers?

bdcravens 7 years ago | |

Still do: "The Payment Method Acquirer for Visa and Mastercard Transactions is Wells Fargo Bank, N.A, and you may not submit Visa and Mastercard Charges without first agreeing to the Wells Fargo Financial Services Terms."

https://stripe.com/us/ssa

Operyl 7 years ago | |

Zero problems with Stripe, still processing payments without any issue. Just took a look at their status page too, in case something got posted, but no such notice exists.

mbesto 7 years ago | |

"Wells Fargo as their bank" is a weird statement. Wells Fargo is a HUGE company with various banking services. This isn't as simple as "WF data center had an outage so WF as a company is completely down"...that's not how it works.

rixrax 7 years ago |

And I’m not surprised they don’t have a service status page either, like the ones many modern companies offer[0][1][2].

[0] https://api.twitterstat.us

[1] https://help.netflix.com/en/is-netflix-down

[2] https://developers.facebook.com/status/dashboard/

fidla 7 years ago |

4chan's pol forum is all over it. Apparently it's a "happening" worthy of hundreds of comments

krapp 7 years ago | |

Really? /pol/?

this is what the merry band of racist edgelords have fixated on?

Seems like it would be more /g/'s thing.

lenticular 7 years ago | | |

Anything that they can spin into some insane fascist conspiracy theory.

byron_fast 7 years ago |

I work at a place that just had a near 24 hour outage due to a frozen water pipe. Water!

ams6110 7 years ago | |

Cooling water? Essential in a data center.

samstave 7 years ago |

This will be an interesting root cause if it manages to get leaked....

bigbassroller 7 years ago |

I got an email today from WF titled “New year. New look. Continued commitment.” Maybe they are flipping the switch on their “New year. New look...”; which sounds like a new redesign and they are having downtime while switching over.

godelmachine 7 years ago |

Well, for a second I thought they had closed business operations completely, keeping in mind how embroiled they were in graft cases in the 2008 recession.

Pardon my immense ignorance!

sugarygrind 7 years ago |

#bitcoin -- just had to say this.

sys_64738 7 years ago |

Lots of new Wells Fargo job vacancies will be published tomorrow I fear.

mikestew 7 years ago | |

Yeah, but it won't be to fill the roles vacated by those that failed to fund data center redundancy despite repeated requests and proposals.

meesterdude 7 years ago |

they have a history of outages - though not as long.

MuffinFlavored 7 years ago | |

What is more likely going down in the middle of the day? Their mainframe/COBOL stack? Their Java stack?

berbec 7 years ago | | |

Their entire data center, it seems

eweise 7 years ago |

Try Varo

chad_strategic 7 years ago |

Wells Fargo is nothing more than a federally regulated crime organization, supported by the federal reserve and the American tax payer, but I digress.

This thread has primarily focused on redundancy and software architecture. That could be the case, but there is no better way to fight a proxy war than by hacking banks. It’s a domino effect... pull your money out of the bank, the bank calls loans, then there is a credit crunch, investors lose confidence in the stock, value is lost, etc... the enemy has secretly inflicted civilian problems across the economy. Disrupting the banks and the flow of money can lead to a revolution... or take your mind off of America’s enemy, or divert energy elsewhere.

deevolution 7 years ago |

Bullish on bitcoin b/c of its highly resilient characteristics. Might not process as many transactions per second and it might pollute more than the average bank, but you sure as hell don't need to worry about a natural disaster or freak incident single-handedly taking out the entire network.

jameskilton 7 years ago | |

Wells Fargo customers must be terrified that their money is going to be gone when the service finally comes back. This downtime is obviously to prevent a bank rush while the leadership leaves the country with millions in customer funds.

Oh wait...

shawnz 7 years ago | | |

Very funny, but you're describing a failure mode of bitcoin exchanges rather than the bitcoin network, which is not what the parent was talking about.

deevolution 7 years ago | | |

Cue infinite stream of downvotes. News flash, you don't need an exchange to store or transact bitcoin!!! And people using exchanges should understand the risks associated with using a centralized service.

mikeash 7 years ago | |

I’d wager that WF is still processing a lot more transactions than Bitcoin is even during this outage.

pjc50 7 years ago | |

Or you could have a Quadriga-like incident where a single person's death or fraud takes out an entire bank?

SketchySeaBeast 7 years ago | | |

I think the argument here is that you don't need to keep your Bitcoin in a bank, and instead just hoard it all under your digital bed.

amaccuish 7 years ago | |

There's always one...

pastor_elm 7 years ago | |

decentralization wins out again. How many times do we have to go through this...