Fly.io makes infrastructure easy for developers (blog.chiselstrike.com)
They are also missing a proper managed database with point-in-time backups through the web UI like those offered by most proper PaaS services.
We did bake nixpacks into our CLI recently, they seem better for our particular environment than buildpacks. Railway.app did a great job with these: https://nixpacks.com/docs/getting-started
We're working on managed databases, but we're not doing them like Heroku did. We just launched a preview of managed Redis with Upstash: https://fly.io/docs/reference/redis/
This seems like the future of managed databases on a platform like ours. There are companies that build very good managed database services. We're getting to the size where these people will work with us. Getting well managed DBs onto the platform is basically what I'm spending all my time on these days.
Incidentally, we're a lot cheaper than Heroku because we run our own infrastructure.
If you happen to have a service beta testing anything I would be interested in joining it.
Is `fly postgres create --name restoredDb --snapshot-id backupId` so hard that it's a deal breaker?
> support for Heroku buildpacks
I haven't tried it but there's some buildpack support: https://fly.io/docs/reference/configuration/#builder
As for the managed DBs, one of their founders was from Compose, so yeah they know how these things work. But AFAIK Fly doesn't have much interest in DBs, their focus is really in VMs.
ps. are those people already at fly? shameless plug - would love to work on things like that.
We are not Heroku. It is ok for you to not like what we're doing. We're building something different. We've never even _said_ we were a Heroku alternative, we just liked their UX for deploying apps and decided to roll with something similar.
I would rather say 'It no longer works as a business for Salesforce.'
Our app has two components, a backend on Heroku and distributed frontend on Fly. The backend relies on a managed database, and I have not had to touch it in 6 years. Heroku does a great job providing confidence that the managed database will Just Work. The current Fly Postgres offering doesn't provide this confidence.
We shipped "automated" Postgres because we couldn't get any fully managed DB providers to pay attention to us when we were small. I expect we'll have an option running on Fly infrastructure in the next six months.
Our Redis is fully managed, so you can get an idea of how it might play out: https://fly.io/docs/reference/redis/
Kubernetes for the basics is actually pretty easy despite what people say, I got a fairly simple cluster running with little pain but...it doesn't take long before you want multiple clusters, or vlan peering, or customising the DNS or.... and that is when it becomes complex.
What will fly.io do? Probably what everyone else does: start simple, become popular, and then cave in to the deluge of feature requests until they end up with an Azure/AWS/GCP clone. If it stays simple, lots of people will say that you will outgrow it quickly and need something else; if you increase functionality, you lose your USP of making infrastructure easy.
I think perhaps the abstractions are the problem, if you are abstracting at the same level as everyone else e.g. docker images, orchestration etc. then I don't understand how it can ever work differently.
To make my point, the very first comment below (above?) is about container format, a really fundamental thing that noobs are not likely to know about, they will just immediately have some kind of error.
What we did, instead, was build low-level primitives, then build opinionated PaaS-like magic on top of those.
If you're running a Phoenix app, `fly launch` gets you going, then `fly deploy` gets you updated.
If you want to skip the PaaS layer and do something more intense, you can use our Machines API (or use Terraform to power it) and run basically anything you want: https://fly.io/docs/reference/machines/
We are very, very different than k8s. In some ways, we're lower level with more powerful primitives.
We probably won't build an AWS clone. I don't think devs want that. I also don't think devs want a constrained PaaS that only works for specific kinds of apps.
I think what devs want is an easy way to deploy boring apps right now, and underlying infrastructure they can use to solve bigger problems over time.
I also don't want to set up my own log aggregator, grafana, and prometheus/alert manager, but for a quick "show everyone your app", I don't need those. I can add that harder crap later when the app shows promise and I actually need to debug performance.
No, mrkurt will not cave, I can guarantee you that. Fly will be a platform that says no to feature requests that don't make sense for their customer base.
I have no affiliation with Fly, other than I've used it on and off since the beginning of the platform's existence. They're a veteran team that knows how to build platforms. I definitely trust them to go in the right direction with their roadmap, and all my new projects go on Fly.
I concur.
Fly is not your typical startup with dreams of becoming the next big corp monster.
They are just a bunch of talented people with a vision having fun making cool stuff.
Then, Fly is not for such an application. Just not yet. I mean, we wouldn't buy a snowboard and complain we couldn't go skiing. Different tools.
The point really is, for the kind of thing Fly is capable of (and, to an extent, other NewCloud services like railway.app, render.com, replit.com, convex.dev, workers.dev, deno.com, pages.dev, vercel.com, temporal.io etc), you're better off NOT using AWS/GCP/Azure. I certainly have found it to be true for whatever toy apps I build.
There's certainly a limit, but it makes me so sad that developers see the current state of orchestration and say “welp, it's a complex problem, guess this is as good as it gets” (not you specifically, but it's a common sentiment on HN.)
Sure, there will always be use cases that require getting down to a lower level, but there's definitely space for reducing complexity for common use cases.
The question is whether customers need or want that "functionality".
https://docs.podman.io/en/latest/markdown/podman-manifest-pu...
Wonder why they don't use OCI format.
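For context on the format question, here is a minimal sketch of how the two manifest formats relate. podman's `--format` option accepts `oci` and `v2s2`, and each corresponds to a manifest media type. The helper name `push_format_for` is hypothetical, and the idea that a registry accepts only the Docker v2s2 type is an assumption for illustration, not something confirmed about registry.fly.io. The function only decides whether an image would need converting on push.

```python
# Manifest media types for the two podman --format values discussed here.
MEDIA_TYPES = {
    "oci": "application/vnd.oci.image.manifest.v1+json",
    "v2s2": "application/vnd.docker.distribution.manifest.v2+json",
}


def push_format_for(manifest_type, accepted_types):
    """Return the --format value to convert to, or None if no conversion is needed.

    manifest_type: media type reported for the local image.
    accepted_types: media types the target registry accepts. This is an
    assumption supplied by the caller; nothing is queried from a real registry.
    """
    if manifest_type in accepted_types:
        return None
    for name, media_type in MEDIA_TYPES.items():
        if media_type in accepted_types:
            return name
    raise ValueError("registry accepts none of the known manifest types")


if __name__ == "__main__":
    # Hypothetical registry that only takes Docker v2 schema 2 manifests.
    docker_only = {MEDIA_TYPES["v2s2"]}
    print(push_format_for(MEDIA_TYPES["oci"], docker_only))  # prints "v2s2"
```

If the registry in question also accepted OCI manifests, the function would return None and no `--format` flag would be needed.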
# podman inspect quay.io/my/image:latest | jq ".[].ManifestType"
"application/vnd.oci.image.manifest.v1+json"
# podman push --format v2s2 quay.io/my/image:latest registry.fly.io/my-app:latest
Your server doesn't have a static IP for outgoing requests, so to use it with RDS, you can't just open up a port on the RDS side. (They want you to set up your own proxy) https://community.fly.io/t/set-get-outgoing-ip-address-for-w...
The NFS kernel module isn't installed, so you can't use EFS. (They suggest some 3rd party userland tool)
They expect you to set up VPN access with Wireguard for any connections to your containers. You can't just SCP your files to a volume. It's so much more hassle than connecting with kubectl cp or scp, especially if you're hoping to script things.
All that said, I'm happy to see competition in the "we'll run your docker image" space.
It's a rough edge, to be sure! I just wouldn't want to leave it as "we think the status quo is the right way to handle getting files to and from instances".
[processes]
web = "bundle exec puma -C config/puma.rb"
worker = "config/cron_entrypoint.sh"
And the shell script looks like this:
#!/bin/sh
printenv | grep -v "no_proxy" >> /etc/environment
cron -f
Although I would say that cron isn’t a great solution for containerized apps on most platforms, it seems like scheduled processes need a rethink for today's infra.
If you just want to run a container in multiple regions with anycast, Fly is really the best option out there IMHO. Nothing comes close.
There are some rough edges for certain use cases but they keep polishing the service and the DX keeps getting better.
Personally the only features I'm missing today are:
- PG backups/snapshots. AFAIK these are coming in the form of virtual disk snapshots.
- Scale apps from zero to say 100 VMs like Cloud Run does. There's some autoscaling right now and the machines API, but it still needs more polish, especially for certain use cases like concurrent CPU heavy tasks (video encoding, etc). AFAIK some form of this is also coming in the coming months.
I am very skeptical of this claim.
Just curious about other people's definitions, as I imagined Fly.io as a more self-service BaaS (mostly for web applications)
- startups, small teams, no devops.
- horizontally scalable monoliths.
- no complex infra needs.
Haven't used fly.io but strongly considering migrating over from Heroku and above is roughly what I would say is a good fit for Heroku so hopefully it's the same on fly.io
Choose a docker image and just docker-compose up your application.
If you outgrow that, you might as well switch to kubernetes and aws/gcp/azure
But if you do, no amount of Kubernetes on the old school cloud providers is going to get you there. You will encounter the hard problems fly solves for.
Unless data is geographically sharded as well, is there really a benefit? Collaborative apps perhaps?
If you're trying all the things, https://railway.app/ is also a good option.
It has been pretty great and painless. I have my own docker servers, and so on, but I don't have a local registry set up, so dealing with getting images moved around, and all that hassle was going to be annoying.
Made a free fly.io account, did the `flyctl deploy` after making the config file and it just worked moments later. Really a nice flow.
Not sure yet if I would use it for anything else, but this was nice and easy so it's definitely on my list of things to check for new projects.
There's one thing that's straight out frustrating: no support for easy environment variables management. Yes you can add secrets but it's hard to read them back. Not everything is a secret, e.g. log level.
https://fly.io/docs/reference/configuration/#the-env-variabl...
You can't read secrets back once you've set them. That's because we can't read secrets back once we've set them --- at least, our API can't. The API has write-only access to our secret storage.
Environment is not code. And Heroku had it figured out.
And mind you, Fly.io is trying to sell itself as a natural Heroku competitor.
We didn't know about Fly.io when we chose GCP for this setup.
The initial setup on GCP was exceedingly painful. We used CloudRun for our app server, with the value prop being that "it just works". It didn't. Our container failed to start with zero logs from our servers. Stackdriver was of no help. Eventually we found a Stackoverflow thread revealing that CloudRun didn't like Docker images built from Macs. As always, GCP's official docs and resources are incoherent. GCP docs address a hundred things you don't care about, and the signal-to-noise percentage is in the low teens, if we're being generous. We had to chase down half a dozen bureaucratic things to get our CloudRun app to see and talk to CloudSQL. Apparently with Fly.io, you just run a command to provision Postgres, and pass in an environment variable to your app.
We consoled ourselves that GCP was difficult to set up but now it's set-and-forget. This is also a lie. This week we saw elevated and unexplained 5xx errors. First was CloudRun randomly disconnecting from CloudSQL. While AWS measures reliability in terms of 9s, GCP DevRel's response to this bug was that this is a distributed system, and it is therefore acceptable that things just fail at a reliably human-reproducible 1%+ error rate. Yesterday we saw botnet traffic scanning for vulnerabilities on our app. This happens if you're on the web; it's not inherently GCP's fault. We have GCP's Cloud Load Balancer set up but it's not very smart. We were able to manually block specific IP addresses, but it's nowhere near as effective as Cloudflare. Not a fan of Cloudflare the company but their products address a need. The botnet somehow knocked over our "Serverless VPC connection" to CloudSQL. Basically, that is a proxy server that you are forced to set up because CloudRun can't actually talk to CloudSQL. All the auto-scaling claims of GCP's serverless are diminished if we are forced to introduce a single point of failure like this in the loop. That serverless VPC connection requires a minimum of 2 VMs, so the scale-to-zero of CloudRun is no more.
Our experience with GCP is constantly having to come up with workarounds and addressing their accidental complexity. This should not be the customer's problem. For example, CloudSQL doesn't have an interface to query your databases. If you use a private IP for security, you can't even use GCP's command line tools to access this. We found out that GCE VMs are automatically networked to talk to CloudSQL. We ended up creating a "bastion" GCE VM instance and set up Postgres CLI tools in order to do ad-hoc queries of our DB state. For this, we just needed the cheapest VM but GCP makes even this difficult. As for Stackdriver, it's still an annoyingly painful UI.
* support for abstract apps, not only HTTP/web apps like heroku, let's say I want to deploy a SIP app
* support for HTTP/2 and potentially HTTP/3
If they do support these two I would say it's enough to be considered Heroku killer.
We do support HTTP/2.
Yes, fly employees, I will file a bug somewhere - or email me.
The rest is just deploying and running containers. There are lots of ways to do that. I loved using Google Cloud Run a few years ago. Stupidly easy to get started with and flexible enough for many things. With some service discovery on top, it's perfect for a lot of stuff. Add some managed middleware & databases to the mix and you essentially have a close to zero ops CI/CD capable environment. No devops needed for this either. When I did this for the first time, I was up and running with our dockerized app in about 15 minutes. Most of that was just waiting for builds to finish.
I'm CTO of a company currently and I've gotten sidetracked with enough lengthy and super expensive devops type stuff in past projects that I'm purposely avoiding certain things, not because I can't do them but because I don't think these things are worth spending any time on for us right now. So, no terraform, no kubernetes, no microservices. I just don't have the time or patience for that stuff. We run a monolith. So, there's not a lot I actually need from my infrastructure. I need it to be fast, secure, and resilient and be able to run my monolith. But I don't need to have things like service discovery, complicated network setup (bog standard vpc is fine), and all the other stuff that devops people obsess about.
We use a load-balancer, I clicked one together in the Google UI. It's fine. Ten minute job. Doesn't need terraform scripting. We have two of them. And we have a couple buckets and our monolith behind that. I could grab the gcloud command that recreates this thing and put it somewhere. But I have more urgent things to do.
For deployment we use simple gcloud commands from github actions to update vms with new instance templates to tell them to run the latest container that our build produced. We started with cloud run but our monolith has a few worker threads that we don't want killed so we moved it to proper vms. Very easy to do in Google Cloud.
Our deploy command does a rolling restart. We have health checks, logging, monitoring, alerting, etc. Could be better but it works. Initial provisioning of the environment was manual and we scripted together all the commands that are part of our deploy process for automation. We added a managed redis, database, and elasticsearch to this. None of that was particularly hard or worth automating to me. Yes, it's bit of a snowflake. But not that complicated and I documented it. So, we can do it again in a few hours if we ever need to.
The dirty little secret of a lot of devops is that it's a lot of over-engineered YAGNI stuff that is super labor intensive to set up and maintain, and you end up using it a lot less often than people think.
This is why freelance devops engineers are so in demand: this stuff just requires a lot of manual work! Companies need these people full time and usually more than just one. The devops alone can add up to hundreds of thousands of dollars/euros per year.
It's a lot of manual work that probably should be automated. However, hiring a lot of people at great expense to automate things that are cheap and not that complicated is not always the best use of resources. I've seen companies that spend an order of magnitude more on devops salaries than on the actual hosting bills. If you think about it, it's kind of weird to be spending so much for so little gain. And most of these companies are not particularly big, nor do they experience enormous scaling issues.
(I think I read Fly is planning to add scale to zero to their normal service, they currently have it at the api level with “fly machines”)
The other thing I want is a completely hands off managed DB with point in time restore. None of those three have that yet. Crunchy Data looks perfect but is not “in cloud” with them, being available only on AWS/Azure/GCP. If one of those three added that capability in house I would probably just go for it.
koyeb was too hard to setup for me. railway easier, but the images were extremely unstable.
fly was easiest to setup, esp. DNS for custom domains and let's encrypt, and works fine with docker images. there's no GitHub app, but the docs for a deploy action were good enough.
I would also be remiss not to mention Coherence (withcoherence.com) [I'm a cofounder] where we're trying to deliver some of the same magic as the best in class PaaS's above, but we are running your workloads in your own AWS or GCP account. We're really excited about the potential future of a great developer experience that can be delivered as a service instead of rebuilt over and again by platform teams in-house.
This is especially true when you realize you want a QA env, US/EU env, staging, etc. If it's one server (or server and DB), it's much much easier to create more environments.
There are also companies that never figured it out because "developer focused" is not the right business model for them. Those are, I think, the companies that make us all feel burned.
Heroku is one of those companies where "developer focus" is not the right business model. Salesforce has a model, it's working very well, Heroku's doesn't fit.
It feels like the simplicity of Heroku deployments with the power and security of an AWS VPC.
I think the killer app on fly is actually a geo aware sql db such as cockroach. That as a managed offering puts fly above and beyond anything we’ve had before.
I suspect they know this.
Geo aware SQL DB sounds like a lot of added complexity. What is the latency trade off in practice? 100ms ping time is probably small compared to query execution time, especially if your backend returns everything the frontend needs in one response.
I understand someone at the scale of Amazon wanting to shave ms off page loads. But most web apps?
The last web app I worked on had a latency budget of 200ms. The one before that it was 400.
Cross pacific queries can take 200 of that off the top. Even transcontinental is close to 80. With query times you can at least work to improve them.
Now it’s not as simple as just locality but that’s a starting point.
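To put rough numbers on the latency budget argument above, here is a small arithmetic sketch. It uses only the figures quoted in the comment (200 ms and 400 ms budgets, about 200 ms cross-Pacific and about 80 ms transcontinental). The function name and the assumption of exactly one network round trip per request are illustrative, not measurements.

```python
def remaining_budget_ms(budget_ms, round_trip_ms, round_trips=1):
    """Milliseconds left for query execution and rendering after network time.

    Assumes each request pays `round_trips` full round trips to the database.
    """
    return budget_ms - round_trip_ms * round_trips


if __name__ == "__main__":
    budgets = {"200 ms budget": 200, "400 ms budget": 400}
    routes = {"cross-Pacific": 200, "transcontinental": 80}
    for budget_name, budget in budgets.items():
        for route_name, rtt in routes.items():
            left = remaining_budget_ms(budget, rtt)
            print(f"{budget_name}, {route_name}: {left} ms left")
```

Under these assumptions, a 200 ms budget has nothing left after a single cross-Pacific round trip and 120 ms left after a transcontinental one; a second round trip makes both cases worse.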
The same has been said of every company that has ever taken outside investment. "But no, Heroku/Figma/GitHub/X are different, they really do care about their users and would never sell/go public/Y", and then a couple of years later we end up in the same position.
It might not even be up to the "bunch of talented people" in the end, what they have to do to survive or to grow. But grow they have to, unless investors are fine with getting their ROI over 10-50 years rather than 1-10 years. And growing usually comes with some pain.
Which means, there will be one of three outcomes:
1. We are correct, and manage to build the right thing. We'll get to work on this forever.
2. We are correct, but not the right group to build it. We fail.
3. We are incorrect, and the world doesn't need a public cloud for devs. We fail, and I become a carpenter.
We have the same incentives as our investors. That doesn't mean it'll work. It does mean that we all believe that we're building a product for developers.
We're pretty good at surviving, so far. And there are early signs that we're good at growing. There's reason to be hopeful. :)
> We have the same incentives as our investors.
this is wrong, even if today it looks right because the different incentives result in the same concrete things.
You have a company; your goal is to make the company succeed. Investors have a portfolio; their goal is to make the portfolio succeed. Your company succeeding is only one aspect of their portfolio succeeding, and one whose importance and externalities can change drastically for reasons outside your control.
Maybe I'm too pessimistic, I apologize if that's the case. But I fail to see how the company could ever work on "the right thing" "forever" since there is outside investment in the company. Do these investors not want a return on their investment at some point? Whether it's in 1 year or 10, eventually they are gonna want you to either go public or get bought by another company, both of which make the mission goal change from "the right thing" to "the profitable thing" at that moment.
But again, maybe I'm just overly pessimistic based on bad experiences with VC funded companies.
1. We are correct, and manage to build the right thing [developer first cloud]. We'll get to work on this *until we exhaust that market and are forced to grow beyond it*
Unless "Upstash Redis Database" is one thing? In which case, that's "Redis, and a database we don't do anything with".
I'm still not seeing the lock-in.
Whatever EdgeDB is: we're happy if you run it here! If it has like, a company running it as a service, we're very happy to talk to them, too. But we're going out of our way to avoid weird custom services, like Amazon's 19 different messaging services, that have platform-specific APIs. Our APIs are Docker, IPv6, DNS, Redis, and Postgres. Our Postgres is literally just an app running on Fly.io; you can run it yourself if you like --- or, I guess, run it somewhere else.
You can scp the tarball of your docker and docker-compose down/up
If you want even easier, just put your app on docker and run nginx natively via systemd or whatever.
Because it mustn't "just work" and then when the project grows you setup something new. It must "just work" and then scale to continue working for the initial and maybe all phases of the software product.
So e.g. security matters, so you really don't want to maintain an OS. Sure, an OCI image has OS-like properties, but by using stateless approaches, carefully selecting a very thin base image, and rebuilding it automatically with the newest version of the image, you can sidestep most of that workload.
Then you need auto scaling.
Then you need to be able to add services on the fly without problems (e.g. another DB) and they need to have their own scaling.
Then there is networking: since you aren't creating a super complex application you don't need anything complex, but you still need something solid.
fly.io does provide all that
So not only can you get something running trivially, you can then incrementally improve on it and even ship it in many cases without needing to re-design your deployment.
Looks, tbh, pretty great for small and mid-sized companies (or autonomous projects of such size in larger companies).
But if I'm going to do all that, I might as well just install k8s and deploy with the cli. I've already gone through the learning curve. But I don't want to do that. I don't want other people that might join my project to have to learn k8s for a tiny side project.
But you don't "just" show that something works.
Once you have shown that, you are expected to make an MVP, then an MVP which works in production, then add features, etc.
With fly you can just use the same basic setup to go through all of these steps, reducing a ton of friction.
I have seen startups fail because of this friction. I also have seen internal proposals being abandoned because of this friction.
I think something happened with a generation of programmers that completely missed the basics and went straight to abstractions.
They know react but not JavaScript, they know k8s but not linux… and they barely know how a computer works. They read a lot of stuff online in blogs but no books. They think moving fast is what some company marketing material says it is.
Look, anything you build on top of the basics will be more complex than the basics. It’s worth it to abstract complexity when that arises; when you don’t have it, it’s not necessary and only causes you to move slowly.
Nothing will be faster to iterate than a file on disk if you’re working alone and want to validate an idea. You can literally rsync the folder you’re dev-ing from on a git tag and change a symlink in a single command. That will take you very far with no friction.
What I’ve seen failing is people who find the non existing problems they have more interesting than delivering value
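The symlink half of the rsync-and-symlink deploy described above can be sketched as follows. This is a minimal illustration, not anyone's actual deploy script: the `releases/` and `current` layout and the `activate_release` name are assumptions, the rsync copy step is left out, and the atomic swap relies on POSIX rename semantics.

```python
import os
import tempfile
from pathlib import Path


def activate_release(root, release):
    """Point root/current at root/releases/<release> with an atomic swap.

    A new symlink is created under a temporary name and then renamed over
    `current`, so readers see either the old target or the new one.
    """
    root = Path(root)
    target = root / "releases" / release
    if not target.is_dir():
        raise FileNotFoundError(target)
    current = root / "current"
    tmp_link = root / f".current.tmp.{os.getpid()}"
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()  # clean up a leftover from an interrupted run
    os.symlink(target, tmp_link)
    os.replace(tmp_link, current)  # rename(2) replaces the old link atomically on POSIX
    return current


if __name__ == "__main__":
    # Demo in a throwaway directory; release names stand in for git tags.
    root = Path(tempfile.mkdtemp())
    for tag in ("v1", "v2"):
        (root / "releases" / tag).mkdir(parents=True)
    activate_release(root, "v1")
    activate_release(root, "v2")
    print(os.readlink(root / "current"))
```

Rolling back is the same call with the previous release name, which is part of why this approach goes a long way for a solo project.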