How We Automate Our Infrastructure (segment.com)
However, what frustrates me the most about it is that every startup is left to figure out everything from scratch, and it seems impossible.
There are many tools you need to familiarize yourself with, too many to be comfortable with.
Companies that have already figured it out write blog posts like this, which provide insights, but they're super high level. As a startup engineer, this gives you absolutely no value other than "yes, they are using it too".
I wonder if there's a solution for this that's generic enough to open source and would be a good start for startups.
You check out the project, read some docs and in 2-3 hours you have a cluster running. Kind of a "batteries included" devops solution.
Both are open source technologies based on Docker that are gathering momentum, and you can hire consultants to help you deploy either one.
I personally don't use Docker. The startup I'm building has chosen to standardize on the JVM for all application code, so we leverage the JAR file as a kind of container. The Java ecosystem solved the problem of zero-downtime deployments a long time ago, so for us deploying can be as simple as shipping new JAR files across the network.
Instead of using Docker to drive development, we simply spin up development database/Redis/etc. instances in the cloud, which automatically join a development VPN. All of the non-VPN interfaces are automatically firewalled off. One nice advantage of this setup is that developers who have slow laptops are still able to work. I'm a big fan of this approach.
Check out Wildfly's "High Availability" features if you're interested in one way that the Java ecosystem can make headaches like zero-downtime deployment, HTTP health checks, monitoring, caching, and even load balancing disappear. It'll deploy non-Java code too, as long as it's on the JVM. If you're a Scala-only shop, there are some great Scala-only alternatives available to boot.
A recent example is trying to make Hibernate work with PostGIS and PostgreSQL as a datasource in Wildfly. We weren't able to solve it; we could only work around it.
Finally, if you need some behavior off the beaten path, you'll have to use lots of annotated Java. That's easy if you already know all this, but a Java file with 10 annotations on its classes and methods is hard to read, simply because you don't know what happens when.
To summarize, it's an OK solution if you have a Java guy with lots of experience in all this (luckily we had one). Otherwise you're going to have to learn a lot (as in by heart), because you can't really reason about XML and annotations (as you could, e.g., when composing services in Clojure).
I think the space between Heroku and AWS remains to be solved and lots of companies will jump on the train (if it's good and fast enough).
Basically, what I have come up with is this: a push or merge to master on GitHub triggers a build in the service, which pushes your new image up to Docker Hub and then pings an agent running on your Docker host, notifying it of the new image and passing any metadata needed to determine how it should proceed.
So for example: on a git push to master on app, the webhook fires on the service, and the service pulls the code, runs commands to run tests if you want, builds the Docker image, etc. It then pushes the new image to Docker Hub and pings the agent on the Docker host. The agent gets the data, pulls the new image, deploys a new container, does health checks, and then starts migrating traffic to the new container before taking the old container offline.
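A rough sketch of that push-to-deploy flow, with everything kept in memory so it runs without Docker. The names here (`Host`, `Container`, `on_push`, the `example/app` image name) are made up for illustration and are not the commenter's actual service; real code would call a registry and a Docker host where this one just records log lines.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Container:
    image: str
    healthy: bool = True  # stand-in for the result of a real health check


@dataclass
class Host:
    """In-memory stand-in for the agent on a Docker host plus its router."""
    live: Optional[Container] = None
    log: List[str] = field(default_factory=list)

    def deploy(self, image: str, start: Callable[[str], Container]) -> bool:
        """Start the new container, health check it, then switch traffic."""
        candidate = start(image)  # pull the new image, start a new container
        self.log.append(f"started {image}")
        if not candidate.healthy:
            # A failed health check leaves the old container serving traffic.
            self.log.append(f"health check failed for {image}")
            return False
        old, self.live = self.live, candidate  # migrate traffic to the new one
        self.log.append(f"traffic -> {image}")
        if old is not None:
            self.log.append(f"stopped {old.image}")  # only now retire the old one
        return True


def on_push(branch: str, commit: str, host: Host,
            tests_pass: bool = True) -> Optional[str]:
    """Webhook handler: build and deploy only for pushes to master."""
    if branch != "master" or not tests_pass:
        return None
    image = f"example/app:{commit}"  # hypothetical image name and tag scheme
    host.deploy(image, start=lambda img: Container(img))
    return image
```

The point of the ordering in `deploy` is that the old container is only stopped after the new one has passed its check and taken over traffic, which is what makes the rollout zero-downtime.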
We've got a mostly automated cloud-agnostic process for spinning up a multi-datacenter Mesos cluster which integrates nicely with a docker CI workflow.
I'm pretty sure it's quite valuable, though I'm also unclear what people would be willing to pay.
Your solution probably works great for your needs, but this stuff is expensive to productize. See https://www.openshift.org/
The problem is that this includes too many new tools that startups need to learn about, implement and maintain.
Most people, on just reading "Mesos", "Marathon", or other names in the space, just tune out.
What we ended up with uses Terraform for provisioning instances, Docker (and a private registry) for distributing our application code, Consul for coordinating everything, and HAProxy with consul-template for dynamic routing. There were only two pieces that we had to write. The first (which we may open source, if we're given the time to clean it up and generalize it) is a small Go agent that runs on provisioned hosts, figures out its role based on instance metadata, pulls its configuration from Consul, and handles deployment, both the initial one and subsequent ones when a new version is registered with Consul. The second piece is ensuring that CI generates Docker images as artifacts, pushes them to our private registry, and updates Consul to indicate that there's new code to deploy.
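To make the agent's job concrete, here is a minimal sketch of the "figure out role, read desired version, deploy if it changed" loop. It is written in Python rather than Go, uses a plain dict as a stand-in for Consul's key/value store, and the key layout `deploy/<role>/image` is an assumption for illustration, not the commenter's actual scheme.

```python
from typing import Callable, Dict, Optional


class Agent:
    """Toy version of a per-host deploy agent."""

    def __init__(self, metadata: Dict[str, str], kv: Dict[str, str],
                 deploy: Callable[[str, str], None]) -> None:
        self.role = metadata["role"]  # role derived from instance metadata
        self.kv = kv                  # dict standing in for Consul's KV store
        self.deploy = deploy          # callback that would pull and run the image
        self.running: Optional[str] = None

    def reconcile(self) -> bool:
        """Deploy when the registered image differs from the running one."""
        wanted = self.kv.get(f"deploy/{self.role}/image")  # hypothetical key
        if wanted is None or wanted == self.running:
            return False  # nothing registered for this role, or already current
        self.deploy(self.role, wanted)
        self.running = wanted
        return True
```

In this sketch CI's only responsibility is to write the new image reference into the store; each agent calling `reconcile` periodically (or on a watch) handles both the initial deploy and every later one with the same code path.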
It took us about a week to get this working and it's been mostly rock solid for almost a year now. Part of why it's been solid is that we understand exactly how every component of it works. The one problem we've had came from not understanding how HAProxy worked (never point HAProxy at an ELB...it will cache the DNS resolution and ELBs can change IPs over time). If we'd tried something off-the-shelf, we'd have a much shallower understanding and, since it wouldn't be optimized for our use case, we would have run into many more issues than we've had. On the whole, I highly recommend rolling your own. The code that you will have to write is glue code that's really just replacing what would be configuration in something pre-built. I get that it seems imposing to people without devops experience, but between the tools that are available these days and articles like the one we're commenting on, it doesn't take a guru to get everything working seamlessly. Also, the tools from HashiCorp are fabulous. Use them whenever possible. No disclaimer necessary since I have no affiliation with them beyond using their tools and watching their talks on the subject.
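The failure mode described above (a proxy holding on to an address after the name behind it has moved) comes down to caching a lookup with no expiry. A small sketch of the alternative, re-resolving after a TTL, is below. The class name, the 30 second default, and the injectable `resolve` and `clock` parameters are assumptions made for the example; this is not how HAProxy itself is configured.

```python
import socket
import time
from typing import Callable, List, Optional


class ReResolvingBackend:
    """Caches a hostname's addresses, but only for ttl_seconds at a time."""

    def __init__(self, hostname: str, resolve: Callable[[str], List[str]],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.hostname = hostname
        self.resolve = resolve
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._cached: Optional[List[str]] = None
        self._resolved_at = 0.0

    def addresses(self) -> List[str]:
        """Return cached addresses, looking them up again once the TTL passes."""
        now = self.clock()
        expired = now - self._resolved_at >= self.ttl_seconds
        if self._cached is None or expired:
            self._cached = self.resolve(self.hostname)
            self._resolved_at = now
        return self._cached


def system_resolve(hostname: str) -> List[str]:
    """Resolve through the OS resolver; performs a real lookup when called."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Passing `system_resolve` as `resolve` gives real lookups; passing a fake resolver and clock makes the expiry behavior easy to test without touching the network.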
My only complaint with Ansible really has been that it feels slow at times.
I'm interested in checking out Docker. What exactly does it buy me over my Ansible config/deployment scripts? Does it obsolete them?
Instead, I guess I'll be using Ansible to configure a container locally (in place of using Dockerfiles)? Then perhaps a different playbook to deploy this container to my hosts?
It's also not strictly necessary to ensure dev/prod parity.
Highly recommend Salt then. A bit more of a learning curve, but so much faster than Ansible.
There are projects out there such as TorqueBox for JRuby and Immutant for Clojure which attempt to wrap some of this configuration in a DSL, which I think is really convenient.
It is true though that if you want to extend Wildfly you need to create a Wildfly module, which can mean writing Java code. I look at this as being similar to how, if you want to extend NGINX, you have to be prepared to write your extension in Lua or C. Unfortunately JBoss isn't as well documented as NGINX is right now, so realistically there is some pain.
Since my application didn't need to have Wildfly manage database thread pools on its behalf, I didn't feel the specific pain point that you mention.
Over the long term, when thinking about scale, I enjoy knowing that there are companies like Red Hat out there who provide support for this technology, but I don't anticipate ever needing to engage them. With this tech, configuration is always the hard part, but once it's up and running its performance characteristics are predictable, and the Undertow web server is in the top 5 on the latest benchmarks: https://www.techempower.com/benchmarks/#section=data-r11&hw=...
And there are a lot of nuances that make different tooling the right choice for different situations.
And the problem with trying to simply say "Do this" is: what happens if your "do this" flow uses tool X (similar to tool Y), but the CTO likes tool Y, which isn't compatible with tool Z?
There are a nearly infinite number of ways to put together a decent development workflow.
Ansible is useful for automating tasks on an actual Unix machine (VM or physical). Think of it basically as a parallel ssh to your remote machines.
So typically, you'd use Docker containers to create reliable packages for your code and use Ansible to do things like provision machines, change configs, run one-time commands on groups of machines, etc. And yes, you can also use Ansible to deploy your Docker containers to your servers. But that part is more manageable with tools like Quay, which give you nice things like package versioning.
But my impression of OpenShift is that it's really a work in progress and that they haven't actually gotten it adequately productized yet.
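The "parallel ssh" mental model mentioned above can be sketched in a few lines. This is only an illustration of the idea, not how Ansible is implemented; `run_on_all` and `ssh_run` are made-up names, and `ssh_run` assumes the system `ssh` client is installed and key-based login to each host already works.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def ssh_run(host: str, command: str) -> str:
    """Run one command on one host through the system ssh client."""
    done = subprocess.run(["ssh", host, command],
                          capture_output=True, text=True, check=True)
    return done.stdout


def run_on_all(hosts: List[str], command: str,
               run: Callable[[str, str], str] = ssh_run,
               max_workers: int = 8) -> Dict[str, str]:
    """Run the same command on every host concurrently; map host -> output."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(run, host, command) for host in hosts}
        # result() re-raises any exception from the worker for that host.
        return {host: future.result() for host, future in futures.items()}
```

Something like `run_on_all(["web1", "web2"], "uptime")` would then collect the output per host, with the host names here being placeholders. What Ansible adds on top of this model is inventory, idempotent modules, and playbooks.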
Docker has gotten enough developer buy-in into containerization that I think it's fundamentally changed what it means to do infrastructure, be it PaaS or IaaS or whatever.
Then there's command and control. OpenShift seems to be more friendly to keeping things under someone's thumb. In an ideal world people would use Kubernetes the way Google uses Borg and devs would be trusted the way they are at Google. But between corporate fiefdoms and the aforementioned hiring practices many companies are still very far from that ideal.