Serverless: A lesson learned the hard way

Serverless: A lesson learned the hard way (sourcebox.be)

204 points by V3loxy 8 years ago | 133 comments

>This is probably the most stupid thing I ever did. One missing return; ended up costing me $206.

No, dear author. Setting up the AWS billing alarm was the smartest thing you ever did. It probably saved you tens of thousands of dollars (or at least the headache associated with fighting Amazon over the bill).

Developers make mistakes. It's part of the job. It's not unusual or bad in any way. A bad developer is one who denies that fact and fails to prepare for it. A great developer is one like the author.

lifeisstillgood 8 years ago | |

Just wanted to say that last paragraph is one of the simplest descriptions of professionalism in the job I have read - accept and prepare for your own human failings :-)

cddotdotslash 8 years ago |

This is not a "Serverless" problem; this is a mistake a developer made while using a pay-per-use system. If I write code that launches EC2 instances and I accidentally set it to launch an instance every second instead of every minute because I divided wrong, that's my fault.

eridius 8 years ago | |

It is a serverless issue because if you were using your own server, a mistake like this wouldn't have cost money, it would have just degraded your service (or possibly brought it offline).

So I guess the question is, with a mistake like this, is it better to be charged hundreds or thousands of dollars, or to have your service degrade or go offline until you can fix it?

tdb7893 8 years ago | | |

Could you just do serverless where it starts to rate limit once you reach a certain cost? It seems like this is an issue that could be fixed somehow.
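A minimal sketch of that idea, under stated assumptions: `CappedInvoker`, `BudgetExceeded`, and the per-call cost are hypothetical names and figures for illustration, not anything AWS provides. It only shows the bookkeeping a cost-based limiter would need: estimate the spend before each call and refuse once the cap would be crossed.

```python
class BudgetExceeded(Exception):
    """Raised when one more invocation would push spend past the cap."""


class CappedInvoker:
    """Wraps a function and refuses to call it once a spending cap is reached.

    cost_per_call_usd is an assumed flat price per invocation (hypothetical).
    """

    def __init__(self, func, cost_per_call_usd, cap_usd):
        self.func = func
        self.cost_per_call_usd = cost_per_call_usd
        self.cap_usd = cap_usd
        self.calls = 0

    @property
    def spent_usd(self):
        return self.calls * self.cost_per_call_usd

    def __call__(self, *args, **kwargs):
        # Check before running, so the cap is never exceeded.
        if (self.calls + 1) * self.cost_per_call_usd > self.cap_usd:
            raise BudgetExceeded(
                f"cap of {self.cap_usd} USD reached after {self.calls} calls"
            )
        self.calls += 1
        return self.func(*args, **kwargs)


if __name__ == "__main__":
    invoker = CappedInvoker(lambda x: x * 2, cost_per_call_usd=0.25, cap_usd=1.0)
    for i in range(10):
        try:
            invoker(i)
        except BudgetExceeded as exc:
            print("stopped:", exc)
            break
```

The trade-off is the one debated in this thread: once the cap is hit the service stops answering, which is a hard stop in exchange for a bounded bill.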

cddotdotslash 8 years ago | | |

This is splitting hairs though. It's a mistake in the code that caused it to do something unexpected that costs money. In the serverless world, that means invoking a function repeatedly, costing money. In the old-server world, maybe it means your script had a bug that downloaded an image repeatedly, causing you to rack up networking charges.

rovr138 8 years ago | | |

> It is a serverless issue because if you were using your own server, a mistake like this wouldn't have cost money.

We dynamically create and instantiate new servers based on load and if it's sustained for a while. Once it's up, it's added to the load balancer. Once the load on them goes down, it's spun down after it's spent some time idle (it costs to instantiate, so we might as well keep it outside of the queue for a bit before completely removing it).

This all runs automatically. If we don't limit it, it's on us.
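To make "if we don't limit it" concrete, here is a rough sketch of a scaling decision with an explicit ceiling. The function name, the thresholds, and the default limits are all assumptions for illustration; this is not the commenter's actual setup.

```python
def desired_instances(current, load_per_instance,
                      scale_up_at=0.75, scale_down_at=0.25,
                      min_instances=1, max_instances=10):
    """Return the next instance count for a simple threshold-based policy.

    load_per_instance is average utilization in the range 0.0 to 1.0.
    All thresholds and limits are hypothetical defaults.
    """
    if load_per_instance > scale_up_at:
        target = current + 1
    elif load_per_instance < scale_down_at:
        target = current - 1
    else:
        target = current
    # The clamp is the part that bounds the bill: without max_instances,
    # sustained load (or a bug that looks like load) scales forever.
    return max(min_instances, min(max_instances, target))


if __name__ == "__main__":
    print(desired_instances(3, 0.9))    # scales up by one
    print(desired_instances(10, 0.99))  # already at the ceiling, stays put
```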

How is this not a problem with how he managed it?

> This is probably the most stupid thing I ever did. One missing return; ended up costing me $206.

He clearly mentioned it's his error there.

antaviana 8 years ago | | |

There are chances that degradation or unavailability are not free as in beer.

If the degraded or offline system is used by people, and these people cannot work, the cost can be a lot higher. For example, 10 people not able to work could cost something in the range of $250-$750 per hour.

Moreover, if customers are lost due to this degradation of service and CAC is high, then clearly the cheapest thing is a high bill by AWS, which probably is also capped by Amazon (and handled as an alert by Amazon).

chrisco255 8 years ago | | |

The developer should be writing unit tests for their code so they can avoid small mistakes like this.

nickthemagicman 8 years ago | | |

I notice AWS doesn't have any ability to set limits....

OzzyB 8 years ago | |

It's "serverless" in the sense that if the developer had provisioned a "server" then the max incursion of cost would equal the cost of that server, no more no less.

So yeah, let's blame the developer, but let's not play like mistakes don't happen and they're not costly in the "serverless" world.

Terretta 8 years ago | | |

Use EC2, set an auto scaling threshold on CPU utilization, do something dumb, you’ll find you ran “a server” x n.

It’s easy to burn tens or hundreds of thousands ‘accidentally’ on “server”, easier than on serverless.

If you’re spending real money, you should have an account team. Talk to them if such a problem happens.

jzawodn 8 years ago | | |

LOL WAT? How is it "serverless" if you're provisioning a server?

AndrewCHM 8 years ago | |

If I set up a server to run a script each time a file gets updated in a folder, then the costs aren't going to increase.

It's not a "problem" with server/serverless of course, but no-scaling-by-default vs unlimited-scaling-by-default (which is imo the better way to split the server/serverless topic): one is going to cost more when things get thrown for a loop.

emeraldd 8 years ago | |

You could argue that it's a pay-per-use problem because of inherent unpredictability in that pricing model. When you have "big" resources behind you, it's less impactful but still an issue. The difference with server-based vs. serverless here is how the cost grows. You can predict what the cost of a server is going to be and strictly control it. Can you actually do that with the serverless option?

AYBABTME 8 years ago | |

It's a pricing model and limits issue from AWS. Same as you can't launch 1000 instances by mistake (there are account limits), you shouldn't be able to run hundreds of thousands of recursive serverless calls by mistake (isn't there an account limit?).

ajmurmann 8 years ago | |

What you say is of course true. However, with EC2 that risk is mostly limited to the code launching EC2 instances. Serverless expands that risk to your application code. There are just way more opportunities to screw up in a way that directly hits your wallet.

rockostrich 8 years ago | | |

The same would apply to auto-scaling with EC2 instances though. If you were scaling resources based on a queue and made the mistake of adding something to the queue every time you finished something on the queue then you would start to use too many resources.

Without autoscaling, you would just have a queue that grows until the machine runs out of disk space. Either way, this was a problem with code and not event based scaling.
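The two outcomes described above can be compared with a toy simulation. Everything here is an assumption made for illustration: the `simulate` function, the scaling policy of one worker per ten queued items, and "worker-ticks" as a stand-in for what you would be billed.

```python
def simulate(ticks, arrivals_per_tick, autoscale, backlog_per_worker=10):
    """Simulate a worker pool with a bug that re-enqueues every finished item.

    Returns (queue_length, workers, worker_ticks), where worker_ticks is a
    rough billing proxy: one unit per worker per tick.
    """
    queue_len = 0
    workers = 1
    worker_ticks = 0
    for _ in range(ticks):
        queue_len += arrivals_per_tick
        processed = min(queue_len, workers)
        # The bug: each finished item puts a new item on the queue,
        # so processing never shrinks the backlog.
        queue_len = queue_len - processed + processed
        if autoscale:
            # Hypothetical policy: one worker per backlog_per_worker items
            # (ceiling division), never fewer than one.
            workers = max(1, -(-queue_len // backlog_per_worker))
        worker_ticks += workers
    return queue_len, workers, worker_ticks


if __name__ == "__main__":
    print("fixed pool:", simulate(100, 5, autoscale=False))
    print("autoscaled:", simulate(100, 5, autoscale=True))
```

In both runs the backlog grows identically; the only difference is whether the symptom shows up as a growing queue on one machine or as a growing worker count on the invoice.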

CupOfJava 8 years ago | |

Maybe the budget notification was late.

numbsafari 8 years ago |

"The actual cost is now $206 and over $1000 forecasted, it makes me think twice about using pay-per-use services in the future."

Never use a pay-per-use service that does not include a reasonable "turn off after $X" feature and appropriate warnings. Also, never use such services without being sure to configure such settings.

I like to think of this as a self-inflicted "DDOC" attack: Distributed Denial of Capital.

Best not to leave yourself exposed.

sudhirj 8 years ago |

You did well to have billing alerts enabled. Exactly the same thing happened to me, but I didn't notice for three months - no emails because I'd created an account and domain for a side project. Didn't notice anything on my card because the charge had been declined, but my bank didn't contact me. Finally found out because I knew the local AWS rep (was the relationship manager for the accounts we use at work). Had to apologise and explain the situation in detail to AWS and they forgave the bill. That was tens of thousands of dollars.

sudhirj 8 years ago | |

For those talking about this being a 'serverless' problem or not, I think the point is that it's a lot easier to shoot yourself in the foot. Great power + responsibility, etc. On regular servers (outside of unbounded autoscaling) mistakes cost a flat rate.

21 8 years ago | | |

Not necessarily.

With a regular server you could go viral and your server dies, so your losses are also unbounded, in lost business/good will/whatever.

Also need to take into account the time/effort spent on making the regular server scale, albeit this is also a relatively flat rate.

matchagaucho 8 years ago |

TLDR: Deployed buggy code with infinite trigger. Hosting costs increased.

dvfjsdhgfv 8 years ago | |

The only difference here is that if he deployed it on his own server, he wouldn't have lost money.

Dolores12 8 years ago |

"a $180 actual cost. I was left with a light headed feeling, it's a lot of money for me"

people still play with fire. limit your losses, go with digital ocean or something for 5$/mo flat no matter what.

godot 8 years ago | |

His previous blog post actually said that he moved away from the exact $5/mo digital ocean plan you're talking about, to this.

I think the author meant this less in a "play with fire" way and more in an "experimenting with new tech" way. But yes, I agree that for personal sites, running with your own money, you probably want to stick with something safer like the $5/mo digital ocean box.

V3loxy 8 years ago | | |

This is indeed the case, playing around with new tech. I've been a happy customer with DO for years but my own website was the ideal case to try out the whole serverless thing, as it doesn't get much traffic so it'd cost me next to nothing. I do agree with most of the comments here though and the $5 DO box is the safest choice. I might have been a bit too excited with the new things and failed to think logically :)

Piskvorrr 8 years ago | | |

"Oh, look, new tech. It shines! It heats! Ouch, it also burns!" The comparison to fire is quite fitting.

tschellenbach 8 years ago | |

that depends; for 5$ it breaks after a certain level of traffic. for many applications it's always better to spend $$ than to have things shut down. (hosting is usually insignificant compared to people, revenue etc)

cbhl 8 years ago | | |

If $180 is significant, then this person is probably a student or on a budget, not a business. So they'd probably prefer the downtime.

qaq 8 years ago | | |

it's better to have an ability to cap $. With something like DO it's way easier to control costs compared to AWS.

rb808 8 years ago |

The infinitely scalable cloud services have this big problem with surprise bills. In the old days teams often made do with what was available; now it's too easy to spin up more resources.

I'm surprised people still talk about cloud services as being cheaper esp where developers are free to use what they want.

abalone 8 years ago |

The main issue here is the budget notification emails aren't an adequate mechanism to catch infinite loops. They are too slow and you've already racked up big overages by the time you see it.

Idea: use API Gateway to configure a quota to match your budget projections. That will force a hard stop. Would be nice if AWS made this easier.
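A sketch of the arithmetic behind that idea: turn a monthly budget into a request quota. The function name and the per-request cost in the example are hypothetical, and the returned dict only mirrors, as I understand it, the shape of the quota setting on an API Gateway usage plan; treat that mapping as an assumption to verify against the AWS docs.

```python
from decimal import Decimal


def quota_for_budget(monthly_budget_usd, cost_per_request_usd):
    """Largest whole number of requests per month that fits the budget.

    Both arguments are strings (or Decimals) to avoid float rounding.
    cost_per_request_usd is whatever all-in figure you estimate for one
    request; it is an assumption, not a published price.
    """
    budget = Decimal(monthly_budget_usd)
    cost = Decimal(cost_per_request_usd)
    if cost <= 0:
        raise ValueError("cost per request must be positive")
    # Floor division: never round up past the budget.
    return {"limit": int(budget // cost), "period": "MONTH"}


if __name__ == "__main__":
    # Hypothetical numbers: a 10 USD budget at 0.001 USD per request.
    print(quota_for_budget("10", "0.001"))
```

Note this caps requests arriving through the gateway only; a function that triggers itself through some other event source would not be stopped by it.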

thorum 8 years ago |

I wrote my (small) AWS app so it can run both on AWS and my local machine. Then you can write tests against the higher-level logic like "save this file to S3" and run those tests locally as well.
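One way to read that approach, sketched under assumptions: the higher-level logic depends on a small storage interface, and tests swap in an in-memory implementation. `InMemoryStorage` and `archive_report` are hypothetical names; a production class with the same `save`/`load` methods backed by S3 is implied but not shown.

```python
class InMemoryStorage:
    """Local stand-in for an object store; keeps everything in a dict."""

    def __init__(self):
        self.objects = {}

    def save(self, key, data):
        self.objects[key] = data

    def load(self, key):
        return self.objects[key]


def archive_report(storage, name, lines):
    """Higher-level logic: it only knows about save(), not about S3."""
    key = f"reports/{name}.txt"
    storage.save(key, "\n".join(lines).encode("utf-8"))
    return key


if __name__ == "__main__":
    storage = InMemoryStorage()
    key = archive_report(storage, "july", ["line one", "line two"])
    print(key, storage.load(key))
```

Because nothing in `archive_report` touches the network, a bug in it shows up in a local test run rather than on the bill.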

My main challenge with serverless is using Lambda with API Gateway. Lambda has no database connection pooling, so I end up with a ridiculous number of connections to RDS - one for each simultaneous user. I haven't found a solution to this yet, other than not using API Gateway.

voganmother42 8 years ago | |

One solution, use external pooler like pgbouncer or mysql-proxy running on a small instance(s)
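For anyone unfamiliar with what a pooler buys you, here is an in-process illustration of the idea only. pgbouncer and mysql-proxy are separate processes sitting between clients and the database; `BoundedPool` below is a hypothetical class that just shows the core mechanism: a fixed number of connections is shared, and extra callers wait instead of opening more.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Hands out at most max_size connections; extra callers must wait."""

    def __init__(self, connect, max_size):
        # connect is any zero-argument callable that returns a connection.
        self._idle = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while all connections are in use; raises queue.Empty
        # if a timeout is given and it expires.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)


if __name__ == "__main__":
    pool = BoundedPool(connect=object, max_size=2)
    with pool.connection() as first, pool.connection() as second:
        try:
            with pool.connection(timeout=0.1):
                pass
        except queue.Empty:
            print("third caller had to wait; the database still sees two connections")
```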

orthecreedence 8 years ago | | |

I'm actually kind of blown away RDS doesn't have pgbouncer installed on the database. That's how we operate...each db server has pgbouncer living right on it. We connected to bouncer, not directly to the DB.

benologist 8 years ago |

There's a lesson in there about how AWS makes a crisp $10m dollar bill for the richest man in the world every day.

dvfjsdhgfv 8 years ago | |

I know you're being sarcastic but I feel it's partly true. They announce so many different services, each month something new appears, but this very basic feature - bill capping - asked by users from the very beginning, has never been implemented. It's hard to believe they lack the skill or that it would be much more complicated than the current alert system.

sudhirj 8 years ago | | |

According to the folks I've spoken to, they don't do bill capping because they have no way to safely shut down your workloads in any way - they'd prefer to let you know and have you do it. And having that choice is way better than a capping operation destroying your production database or causing downtime for your users.

sah2ed 8 years ago |

Charged at a flat rate like the cost of a Digital Ocean $5/mo instance, would developers pay for such a service to provide automated notifications of service overages for all the major cloud providers?

All a developer needs to do immediately after adding a credit card to AWS/Azure/GCP would be to create an IAM role with permission to automatically add and track fine-grained billing alarms and notify via email/sms for any potential billing overages.

I think a $60/yr service like this would be useful to protect against future events of bill shock.

autotune 8 years ago | |

Looks like it's already been done, at least for AWS, based on a few Google results (not that I've used/tested any of these):

https://github.com/Teevity/ice https://billgist.com/ http://cloudcheckr.com/

21 8 years ago | |

Azure has a feature on their trial account that when you hit your free limit, you can either:

a) go into credit (so they will charge you at end of month)

b) disable services

Maybe AWS/Google also support a hard limit on spending.

coldcode 8 years ago | |

I looked at Lambda (we use AWS a lot at work) and decided to simply stay with a flat rate DO server. I know what it costs, no need to worry.

gleenn 8 years ago |

He obviously has a smaller account so Amazon might be less flexible, but it's worth contacting support and explaining the error. Sometimes they do give your money back in my experience.

qaq 8 years ago | |

In my experience flexible starts north of a few mil. per month spend so why all the startups are running on AWS is a mystery to me.

amcleod 8 years ago |

Not sure if someone has mentioned this already, but you should contact AWS support and ask if they will forgive the bill given it was an error which led to the high costs. I’ve had bills forgiven this way in the past (e.g. forgot to disable an instance that wasn’t really doing anything).

supertramp_sid 8 years ago |

I was about to make the same mistake yesterday, but I had written a validation function that would check inside a folder only, and fortunately I did not upload the file in that folder. And the next morning, I read this article... Man I gotta be careful LOL
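A sketch of that kind of guard, with assumptions stated up front: the setup imagined here is a handler that fires on every write to a store and itself writes a result back to the same store. The function names, the `incoming/` and `processed/` prefixes, and the dict used as a store are all hypothetical.

```python
def handle_upload(key, store, watched_prefix="incoming/", output_prefix="processed/"):
    """Process one 'file written' event; return the key written, or None."""
    # Guard: only act on files inside the watched folder. Output goes to a
    # different prefix, so the handler's own writes are ignored next time.
    if not key.startswith(watched_prefix):
        return None
    out_key = output_prefix + key[len(watched_prefix):]
    store[out_key] = store[key].upper()
    return out_key


def run_events(first_key, store, max_events=1000):
    """Deliver events the way a trigger would: every write fires the handler."""
    pending = [first_key]
    handled = 0
    # max_events is a safety cap for the simulation itself.
    while pending and handled < max_events:
        key = pending.pop()
        handled += 1
        written = handle_upload(key, store)
        if written is not None:
            pending.append(written)
    return handled


if __name__ == "__main__":
    store = {"incoming/a.txt": "hello"}
    print(run_events("incoming/a.txt", store), "events handled")
```

If the output prefix were the same as the watched prefix, `run_events` would only stop at `max_events`, which is the simulated version of the runaway bill.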

j45 8 years ago |

The relatively low barrier to learn just a tiny bit of following a Linode or DO vps hardening & stack setup guide to get an ubuntu server going can go a very long way for development and prototyping environments.

It's gotten much, much easier, and is just another form of command line management, similar to the CLI framework tools with your preferred stack.

Once that first setup is done, similar to setting up a serverless environment, you are generally restoring backups of your base image and beginning projects from there.

It also immensely helps to learn about how to build something to scale that isn't completely reliant on the PaaS layer.

odammit 8 years ago |

I've built a fair amount of serverless services over the past two years using the Serverless framework, apex and straight API Gateway/Lambda.

It's nice not to have to worry about a server, but I feel like there are just as many little things to futz with in serverless architectures especially before "environment" variables existed in Lambda.

mullen 8 years ago |

Setup spending alarms for your account. Personally, mine is $5, $10, $15, $20 and so on. At $30, my wife gets paged.
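A rough sketch of generating that ladder of alarms rather than clicking through each one. The function name is hypothetical. The dict keys follow, to the best of my knowledge, the parameters of CloudWatch's `put_metric_alarm` for the `EstimatedCharges` billing metric, but verify them against the AWS documentation; notification targets are deliberately left out and would need to be added.

```python
def billing_alarm_specs(step_usd, max_usd):
    """Build one alarm definition per threshold: step, 2*step, ... up to max."""
    specs = []
    for threshold in range(step_usd, max_usd + 1, step_usd):
        specs.append({
            "AlarmName": f"billing-over-{threshold}-usd",
            "Namespace": "AWS/Billing",
            "MetricName": "EstimatedCharges",
            "Dimensions": [{"Name": "Currency", "Value": "USD"}],
            "Statistic": "Maximum",
            "Period": 21600,  # six hours; an assumed evaluation window
            "EvaluationPeriods": 1,
            "Threshold": float(threshold),
            "ComparisonOperator": "GreaterThanThreshold",
        })
    return specs


if __name__ == "__main__":
    for spec in billing_alarm_specs(5, 30):
        print(spec["AlarmName"], spec["Threshold"])
```

As noted elsewhere in this thread, alarms like these notify after the fact; they do not stop anything on their own.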

tapirl 8 years ago |

The cost of using AWS is hard to foresee. For years, my s3 storage got charged nothing. But one month, it got charged several dollars.

I have migrated all my services to GCE. At least GCE provides free decent quotas for every resource.

solidsnack9000 8 years ago |

Serverless is going to make resource usage a focus in a way it hasn't been for years. The quick feedback, the absence of "all you can eat" pricing and the possibility for savings are all factors in this.

tomc1985 8 years ago |

Run+know your infra, none of this is a problem. Serverless is a scam.

lostcolony 8 years ago | |

So run my static content only blog on dedicated hardware that I have to administer rather than throw it in an S3 bucket with a Cloudfront on it? No thank you.

Qualify your statements.

dvfjsdhgfv 8 years ago | | |

You don't really need to administer it, there are plenty of hosting options, not just bare metal. And you can always add Cloudflare anyway.

emersonrsantos 8 years ago |

Serverless still is a marketing tool for cloud providers. It will be useful when it really offers advantages over managing your own servers, especially on the cost and debugging.

cryptos 8 years ago |

Hey, at least it was an enterprise cloud scale infinite loop.

brown9-2 8 years ago |

You can host a static site on S3 + Cloudfront, so what purpose does Lambda have in this picture?

archii 8 years ago |

>Keep an eye on your logs, test everything again and again.

This is the takeaway quote from this for me.

illuminati1911 8 years ago |

This has nothing to do with serverless itself but it's rather a problem of AWS.

le-mark 8 years ago |

Off topic, but the "serverless" moniker needs to die. I propose "adminless" as in "server I don't have to admin, configure, or patch" as being much more descriptive of what's really going on.

nathancahill 8 years ago | |

Eh, if I'm deploying a cloud function, the server truly doesn't exist for me. It's more like a Web Worker running in a privileged environment. I'm ok with the name.

sidlls 8 years ago | | |

No, it's really a server running your function. You just (are told you) don't have to worry about the server or what it's actually doing.

motoboi 8 years ago | |

Well, it's not adminless either, as AWS have lots of them keeping the hosts alive and running.

But if you can ignore that, you can probably also ignore the fact that your code runs on a server.

danschumann 8 years ago |

Fun programming story!

w8rbt 8 years ago |

Endless loops are now billing issues, just like a DOS.

alsadi 8 years ago |

If you are big enough and want serverless no-ops yet don't want to pay per burger when you already breed cows then consider kubeless.io

fibo 8 years ago |

I have a 5$ AWS billing alarm.