Improved VPC Networking for AWS Lambda (aws.amazon.com)

158 points by joaofs 6 years ago | 94 comments

gazzini 6 years ago

This is huge for Lambda. It allows devs to create “serverless” apps [1], with relational databases, without 10+ second cold-start times. In the article, they measure it as 988ms.

I have tried building an API using API Gateway <-> Lambda, but had to choose between using DynamoDB to store data (NoSQL, so challenging to query) or suffering unacceptably long response times whenever a request happens to cause a cold start. Theoretically, this problem is now going away!

[1] https://serverless-stack.com

paulddraper 6 years ago

*It allows devs to create those apps _within a VPC_.

You could always have fast startup with Lambda + database outside the VPC.

jabart 6 years ago

Which is how most breach announcements start:

"A database server was found with an open port exposed to the internet and no or poor authentication; all records were exposed."

This should also mean that Lambdas can get stable public IPs through a VPC for firewalls.

*Edit: changed "must" to "most".

gazzini 6 years ago

You raise a fair point: this was possible, although it seems safe to say it would be a compromise on security.

I think it’s best not to expose the DB to outside connections in general, although it is still possible [1] when using RDS instances.

I think this is different for things like DynamoDB because, instead of a standard SQL-like db “connection”, they use AWS role-based auth for each request.

Of course, one could always configure some type of proxy service between the lambda and the DB... but that seems antithetical to going “serverless” in the first place.

[1] https://stackoverflow.com/questions/45227397/publicly-access...

Edit: I thought it was not possible to expose an RDS instance outside of a VPC, but I was wrong (you can place it in a public subnet, linked in [1]).

k__ 6 years ago

Also, wasn't Aurora Serverless created because of that problem?

danb232 6 years ago

Is that a good tutorial? It looks really good on the surface!

mvanbaak 6 years ago

If you put an event bus in the middle (Kinesis), your API Lambda functions don't need direct access to your RDS. Subscribe Lambda functions to your Kinesis stream, and let them handle the link to your RDS. This way you won't notice the cold starts.
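A minimal sketch of the consumer side of that idea, assuming the API function has put JSON payloads onto the stream. Kinesis hands records to Lambda with the data base64-encoded under `Records[].kinesis.data`; the `persist` function is a hypothetical stand-in for whatever actually writes to RDS, and the payload shape is made up for illustration.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's data base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def persist(payloads):
    """Hypothetical stand-in for the RDS write; a real version would use a DB driver."""
    return len(payloads)


def handler(event, context):
    # Only this stream consumer needs VPC access to the database.
    payloads = decode_kinesis_records(event)
    return {"written": persist(payloads)}
```

Note that this only covers the write path; reads still need some route to the data.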

meekins 6 years ago

This comment doesn't seem to make sense, could you elaborate a bit? How would you replace the database working as a persistence layer to an API application by polling an event stream?

scarface74 6 years ago

And then when you need to read the database?

jmb12686 6 years ago

AWS announced this enhancement at re:Invent 2018. It was slated for "sometime in 2019". I was excited, and I'm impressed that they released the feature well ahead of the end of the year (and before the next conference, which would obviously raise a few questions).

scarface74 6 years ago

They did something similar with drift detection and CloudFormation. They announced it at re:Invent 2017 and released it one week before re:Invent 2018.

slovenlyrobot 6 years ago

This has been a /major/ sore point for Lambda use, amazing they fixed it, and always great to see they've documented the intense engineering requirements involved to make it happen.

AWS is a beautiful mix of business and technology; it's very rare to see such a large engineering-driven organization managing to balance customer friendliness. I'm an unashamed fanboy.

k__ 6 years ago

Major is a bit harsh.

As far as I know this was only an issue for legacy architectures.

scarface74 6 years ago

No. Using an RDBMS instead of DynamoDB is not a “legacy” architecture. You also shouldn’t expose your database publicly.

slovenlyrobot 6 years ago

There is an entire ecosystem of tooling that will shit itself and wake up half the company if you assign a public IP address in the wrong VPC.

Stuff like this is a pain in the ass; it was a major problem.

ajoy 6 years ago

This solves one part of the cold start problem. Starting the container and loading the image onto it is still going to cause some latency.

nostrebored 6 years ago

"Solves" might be strong, but it removes a big portion of the cold start latency that was difficult to optimize for and out of the control of developers. Creating minimal images isn't difficult for a number of environments (e.g. webpacking your Node.js Lambdas), and barring necessarily large images (think pandas on Lambda), this puts a lot of control over the cold start p99 back in the hands of customers.

Overall, definitely a big win!

k__ 6 years ago

I found it a bit strange that they sold Lambda as THE new way to do API development.

You can connect API-Gateway with other services via Velocity templates, which don't have cold starts.

AppSync also doesn't suffer from cold starts.

Both are also serverless services.

Lambda is good if the other solutions are missing something, so you can drop it in quickly, but I wouldn't use it as the go-to service for that...

Scarbutt 6 years ago

API-Gateway can return HTML?

StreamBright 6 years ago

Which can be mitigated by invoking your own Lambda functions once every minute or 5 minutes. Usually does not blow the budget.
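A minimal sketch of what the function side of that can look like, assuming a scheduled rule invokes the function with a payload such as `{"warmup": true}`. The `warmup` key and the `do_real_work` helper are hypothetical names chosen for illustration, not part of any AWS API.

```python
def do_real_work(event):
    """Hypothetical placeholder for the function's actual request handling."""
    return {"statusCode": 200, "body": "ok"}


def handler(event, context):
    # Assumption: the scheduler sends {"warmup": true} as its payload.
    # Return early so the ping keeps the environment alive without
    # touching the database or other downstream services.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True}
    return do_real_work(event)
```

A single scheduled ping like this keeps at most one execution environment warm.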

nostrebored 6 years ago

Warming functions in the previous VPC architecture was always a questionable practice. You had no guarantee that your environments would be warm across all subnets or which subnets would handle incoming requests. Beyond that, what happens to requests which you receive when the function is being warmed? You still incur cold starts.

There has never been a guarantee of environment reuse. Any architecture which isn't capable of incurring cold starts is not a good fit for serverless.

scarface74 6 years ago

Which is a horrible idea....

How many lambdas do you keep warm? 5, 10, 20? Every new connection is a new lambda instance. You're still just delaying the inevitable.
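As a rough way to put a number on that question, here is a back-of-the-envelope sketch using Little's law (concurrency ≈ arrival rate × time in system), assuming each instance handles one request at a time. The function name and the traffic figures are made up for illustration.

```python
import math


def warm_instances_needed(requests_per_second, avg_duration_seconds):
    """Estimate concurrent instances via Little's law: rate x duration."""
    return math.ceil(requests_per_second * avg_duration_seconds)


# Illustrative numbers only: 20 req/s at 0.5 s each keeps about 10 instances busy.
print(warm_instances_needed(20, 0.5))
```

Any burst above that estimate still lands on cold instances, which is the point being made here.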

Just use Fargate if you want to stay serverless and don't want the cold start times -- well at least before today.

jfbaro 6 years ago

Wow! That's great. Cold starts are no longer a show stopper! Rust-powered APIs running on AWS... It sounds really exciting.

reilly3000 6 years ago

This is great news, but I'm bummed they didn't bundle the NAT gateway with this service. In a typical function that calls out to get data from a service and reads/writes from a DB in a VPC, that requires the somewhat painful configuration of a NAT gateway and dedicated subnets, as well as a $36/month bill for the NAT gateway service.

There are some workarounds that use multiple Lambdas, but they have their own gotchas.

Still, hooray, this is good news. The Data API is great for Serverless Aurora, but I can't use that with BI tools.

abhorrence 6 years ago

You can run your own gateway instance(s) for a lot cheaper than the nat gateway service. There are definitely some tradeoffs, but if $36/mo is an issue, they can be worthwhile: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Ins...

scarface74 6 years ago

This is not meant to be a criticism of AWS, I’m an AWS true believer, but the main purpose of going to AWS is to make the “undifferentiated heavy lifting” someone else’s problem, not to save money.

Going to AWS to save money on resources is about like going to the Apple Store to buy a cheap laptop.

EwanToo 6 years ago

This is a great improvement for Lambda users, much reduced cold start times!

paulddraper 6 years ago

Iconoclast view ahead (change my mind please):

AWS does tons of stuff around VPCs... I feel like they really want me to use them (or their customers really want to use them), but I just don't see why.

I just run RDS on the internet. I don't have to muck with the complexity or cost of NATs or peering or Lambda slow start or any other weird networking issues.

I know it's "public", but that seems irrelevant in the era of cloud services. This isn't any different than, say, how Firebase or a million other services run. Should I be concerned that my Firebase apps are insecure because someone isn't overlaying a 10.* network on them?

EDIT: I should clarify that I understand the legitimacy of security groups, especially for technologies that weren't meant to operate outside a firewall. But that's mostly a different subject; AWS had security groups years before VPCs and subnets and NATs.