Why is Kubernetes getting so popular? (stackoverflow.blog)
Why should it have?
Many people I talk with will complain about security, performance and complexity of k8s (and containers in general). Non-practicing engineers (read: directors/vps-eng) will complain about the associated cost with administering their k8s clusters both in terms of cloud cost and devops personnel cost.
Someone earlier mentioned it was the new wordpress - I don't think that's an unfair comparison, although I would challenge the complexity/cost of it.
Longer term, I think the contribution of Kubernetes will be getting us used to a resource/API-driven approach to infrastructure that abstracts away cloud providers, hardware, etc. But it will probably be superseded in the coming years by something that honors similar API "contracts." Probably written in Rust troll
That being said, I will also be the first one to recognize that PLENTY of workloads are not made to run on Kubernetes. Sometimes it is way more efficient to spawn an EC2/GCE instance and run a single docker container on it. It really depends on your use-case.
If I had to run a relatively simple app in prod I would never use Kubernetes to start with. Kubernetes starts to pay itself off once you have a critical mass of services on it.
There is some tech so simple that you just learn it and start using it, others that you know you can pick up when the time is right.
And software you would be happy to invest time in... as long as someone is paying you to do it, software you fear might keep you from getting a job if you don't invest in it.
There is software so simple it might be right (it isn't) and software so complicated that it must be important if people are using it/working on it.
So it's not that Kubernetes is good, it's just that it makes people neurotic enough to jump on the bandwagon. Been a few of those in my career. A few have stuck, most have not.
It also promotes immutable infrastructure and hence increases portability. While some things like load balancers and ingress are controlled by the cloud provider, almost everything else can be seamlessly migrated to another cloud provider or on prem.
It makes dev, test, staging and prod environments consistent and also solves a lot of the pain points of managing infrastructure at scale, with autoscaling, auto healing and more. Istio adds a lot more to Kubernetes and makes supporting microservices even easier.
It's going to be an important piece in a hybrid world, as it brings a lot of standardization and consistency to two disparate environments.
I only consider that late because I've been reading the hype around k8s for many years already.
Became a late adopter of containers just before k8s actually. Now I've migrated most of my setups, both privately and professionally, to containers. And set up my first k8s clusters both at work and in my homelab.
So my perspective is that containers are first and foremost an amazing way of deploying software because all that complexity I did in ansible to deploy the software has been moved to the container image.
The projects themselves now, be it Mastodon, Jitsi or Synapse to name a few, package most of their product for me in automatic build pipelines. All I need to do is run and configure it.
And therefore, moving on to k8s, it would stand to reason that some of those services are able to be clustered. Where better to do such clustering than k8s?
That's just an ops perspective. We also have devs where I work and with k8s they're able to deploy anything from routes down to their services using manifests in CD pipelines. What's not to like?
Only reason one might get disenchanted with k8s is if you expect it to be a one-stop solution for your aging .net application. Not saying you can't deploy that in k8s, I'm just using it as an example of something that might not be microservice ready.
It's basically running a big computer without even trying.
After spending 18 months working on bringing Kubernetes (EKS) to production, with dozens of services on it, the time was right to hand over migrating old services to the software engineers who maintain them. Due to product demands, but also some lack of advocacy, this didn't happen, with the DevOps folks ultimately doing the migration and retaining all the Kubernetes knowledge.
An unpopular opinion might be that Kubernetes is popular because it gives DevOps teams new tech to play with, with long lead times for delivery given its complexity. Kubernetes usually is a gateway to tracing, service meshes and CRDs, which you don't need at all to run Kubernetes but which will probably end up in your cluster.
"Developers love it!" Yeah, I'd love someone to drive my car for me, too. Doesn't mean it's a great idea to use technology so complex you have to hire a driver (really several drivers) to use it.
If you already have 3 people working for you that (for example) understand etcd's protocols or how to troubleshoot ingress issues or how to prevent (and later fix) crash loops, maybe they can volunteer to babysit your cluster for you, do all the custom integration into the custom APIs, keep it secure, etc. But eventually they may get tired of it and you'll have to hire SMEs.
If you're self-hosting a "small" k8s cluster and didn't budget at least $500k for it, you're making a mistake. There are far simpler solutions to just running a microservice that don't require lots of training and constant maintenance.
Complexity isn't always bad, but unnecessary complexity always is.
- Set up VMs and their dependencies and toolchain. If you use a package that has a native component, such as image processing, you even need to set up a compiler on the VM.
- Deployment process
- Load balancer
- Systemd unit to auto-restart it, set memory limits, etc.
All of that is done in K8S. As long as you ship a Dockerfile, you're done.
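For comparison, here is a rough sketch of how that checklist maps onto a single Deployment object. Everything specific in it is a made-up placeholder: the app name `web`, the image `registry.example.com/web:1.0`, port 8080 and the 256Mi limit. The manifest is built as a plain Python dict and printed as JSON, which `kubectl apply -f -` accepts as well as YAML.

```python
import json


def deployment(name: str, image: str, replicas: int, memory_limit: str, port: int) -> dict:
    """Build an apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Several identical pods stand in for "set up N VMs".
            "replicas": replicas,
            # apps/v1 requires a selector that matches the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # placeholder image built from your Dockerfile
                            "ports": [{"containerPort": port}],
                            # Plays the role of the systemd memory limit.
                            "resources": {"limits": {"memory": memory_limit}},
                        }
                    ]
                },
            },
        },
    }


manifest = deployment("web", "registry.example.com/web:1.0",
                      replicas=3, memory_limit="256Mi", port=8080)
print(json.dumps(manifest, indent=2))
```

Pods in a Deployment are restarted automatically when a container exits, which covers the systemd auto-restart item. Load balancing is not covered here; that would need a separate Service object in front of the pods.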
If you have 16CPU EC2 for your business logic, one for your DB, and you're smartly hosting your static content elsewhere or via Cloudflare ... I mean you need to have a 'big company' before going too far beyond that.
What gives? What are all these startups doing?
This is not a story about K8s. It is about something else entirely: psychology, complexity, our love of it, or rather our 'belief' that complexity = productivity, that solving 'the hard infra problem' must inherently be 'good for the company' because it 'feels difficult' and therefore must be doing something powerful, or at least gaining some kind of competitive advantage?
(Aside from the 'Docker is useful and K8s follows' point, which actually makes sense a little bit ...)
http://www.smashcompany.com/technology/my-final-post-regardi...
- Most code running on k8s hasn't hit full production load yet.
- Where it has worked well, it's been managed by devs that know what they are doing.
- It's something worth putting on a backend dev resume.
- Apparent cost saving ('we just need 1 vm instead of 5', 'we can auto scale to infinity', 'we don't have to pay for aws, we get it all on our own vms').
Wait a few months and we will see a slurry of posts that read 'why we moved away from kubernetes', 'top 5 reasons to not use kubernetes', 'How using kubernetes fucked us, in the ass', 'You don't need kubernetes', 'Why I will never work on a project that uses kubernetes', 'Hidden costs of kubernetes' and so on.
C'mon, you know how this works. Just take the time and read the docs. They are well written (they just don't mention where k8s does not work well).
I don't want to do memory management -> GC
I don't want to do packaging -> Docker
I don't want to do autoscaling -> Kubernetes
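To make the autoscaling part concrete, here is a small sketch of the scaling rule the Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas x current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default) of 1. The function name and the sample numbers are only illustrative. The real controller also handles min/max replica bounds, missing metrics and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the basic HPA scaling rule."""
    if current_replicas <= 0 or target_metric <= 0:
        raise ValueError("current_replicas and target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: leave it alone
    return math.ceil(current_replicas * ratio)


# Illustrative numbers: 4 pods with a 60% average CPU target.
print(desired_replicas(4, current_metric=90.0, target_metric=60.0))  # overloaded
print(desired_replicas(4, current_metric=62.0, target_metric=60.0))  # within tolerance
print(desired_replicas(4, current_metric=20.0, target_metric=60.0))  # mostly idle
```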
Something like an easy to use (and operate!) multi-tenant docker-compose on steroids with user management/RBAC and a built-in Docker image repository that gets out of your way would be amazing for small teams / startups that don't want to deal with the complexity of Kubernetes.
Jokes aside, when you've lots of teams, all working on small pieces of a large product and shipping on their own, iterating fast... you need a platform and ecosystem on top to meet their requirements. As you reach planet-scale, you need to NOT let your cost grow exponentially. Hence it is popular.
What if you're not planet-scale? Well, it will still help (attract talent, design for scale, better ecosystem etc.). Hence it is popular.
If you're building a business however, focus on business and time-to-market, definitely not the infra, i.e. kubernetes.
operations details are hidden from developers and development details (the details of the workload) are hidden from the operations engineers.
I can't just blow away the instance, make a new one with their API, and run a bash script to set it up because I need to persist some sqlite databases between deploys.
Nix looks promising, but also seems to be a lot to learn. I think I'd rather focus on my app than learn a whole new language and ecosystem and way of thinking about dependencies.
I don't think my needs are insane here, I'm surprised there seems to be no infrastructure as code project for tiny infrastructures.
User data is a bash script that can be automatically run when the machine first spins up.
You could pass that script via DigitalOcean's CLI or even a tool like Terraform.
Just use Ansible if you miss YAML, and you can actually deploy to real hardware.
Everyone was trying to make a system that was simple and widely adopted, but if you want it to be adopted, it's going to need a lot of features. Also, Google worked some real magic in getting Kubernetes supported by all the cloud providers.
It's a framework that will enable you to do what you want, while being the standard.
You could write your script to do that in a simpler way, but most people already know the standard and it's easier for everybody to understand Kubernetes rather than your clever solution.
I've never really thought it was that useful for (for example) nodejs, where you can just npm install your whole environment and deps, and off you go.
- Automatic scaling of pods and cluster VMs to meet demand.
- Flexible automated process monitoring via liveness/readiness probes.
- Simple log streaming across horizontally scaled pods running the same app/serving the same function using stern.
- Easy and low cost metrics aggregation with Prometheus and Grafana.
- Injecting secrets into services.
I'd imagine there are other things that can offer the same, but I find it convenient to have them all in the same place.
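As an illustration of the liveness/readiness item in that list, here is a minimal sketch of what the application side can look like. The `/healthz` and `/readyz` paths are a common convention, not something Kubernetes mandates, and the `probe` helper is a hypothetical stand-in for the kubelet's httpGet check, which treats any status from 200 to 399 as success.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import urlopen

ready = threading.Event()  # set once startup work (cache warmup, DB connect) is done


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            status = 200  # liveness: the process is up and answering
        elif self.path == "/readyz":
            status = 200 if ready.is_set() else 503  # readiness: safe to send traffic
        else:
            status = 404
        self.send_response(status)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def probe(port: int, path: str) -> int:
    """Return the HTTP status code, the signal an httpGet probe looks at."""
    try:
        with urlopen(f"http://127.0.0.1:{port}{path}", timeout=2) as resp:
            return resp.status
    except HTTPError as err:
        return err.code


server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0: let the OS pick one
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

before = probe(port, "/readyz")  # not ready yet
ready.set()
after = probe(port, "/readyz")   # ready now
alive = probe(port, "/healthz")
server.shutdown()
server.server_close()
print(before, after, alive)
```

A failing readiness probe only takes the pod out of the Service's endpoints, while a failing liveness probe gets the container restarted, so keeping the two endpoints separate matters.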
Our management of the cluster is just a simple "add more CPU or memory to this nodepool", and sometimes changing a nodepool name in the deployment for a certain service. All done in a simple cloud management UI. For those who call microservices fancy stuff: no, we are a startup with a fast delivery and deploy cycle. We have tons of subprojects and integrations, and our main languages are nodejs, golang and python. Some of these are not good at multi-threading, so there is no way to run it all as a monolith. The other one is used only when it's needed for high performance. So all together, Microservices + Kubernetes + Helm + good CI + proper pubsub gives our backend an extremely simple, fast cycle of development and delivery, and, importantly, flexibility in terms of language/framework/version.
What is also good is the installation of services. With helm I can install a high-availability redis setup for free in 5 minutes. The same level of setup would cost you several thousand dollars in devops work, plus further maintenance and updates. With k8s it's a simple helm install stable/redis-ha
So yeah, I can totally understand that some simple projects don't need k8s. I can understand you can build something in Scala or Java slowly but with high quality as a monolith. You don't need k8s for 3 services. I can understand some old DevOps don't want to learn new things and they complain about a tool that reduces the need for these guys. Otherwise, you really need k8s.
Because soon from one program on a dev server, there is a need to run databases, log gathering, multiply the previous to do parallel testing in clean environment, etc. etc.
Just running supporting tools for a small project where there was insistence on self-hosting open source tools instead of throwing money at slack and the like? K3s would have saved me weeks of work :|
https://static.googleusercontent.com/media/research.google.c...
YAML should not even be needed for Kubernetes. Configuration should be representable in a purely declarative way, instead of making the YAML mess, with all kinds of references and stuff. Perhaps the configuration specification needs to be re-worked. Many projects using YAML feel to me like a configuration trash can, where you just add more and more stuff, which you haven't thought about.
I once tried moving an already containerized system to Kubernetes for testing, how that would work. It was a nightmare. It was a few years ago, maybe 3 years ago. Documentation was plenty but really sucked. I could not find _any_ documentation of what can be put into that YAML configuration file, what the structure really is. I read tens of pages of documentation, none of it helped me to find, what I needed. Then even to set everything up, to get the Kubernetes running at all also took way too much time and 3 people to figure out and was badly documented. It took multiple hours on at least 2 days. Necessary steps, I still remember, not being listed on one single page in any kind of overview, but somewhere a required step was hidden on another documentation page, that was not even mentioned in the list of steps to take.
Finally having set things up, I had a web interface in front of me, where I was supposed to be able to configure pods or something. Only I could not configure everything I had in my already containerized system via that web interface. It seems that this web interface was only meant for the most basic use cases, where one does not need to provide containers with much configuration. My only remaining option was to upload a YAML file, which was undocumented, as far as I could see back then. That's where I stopped. A horrible experience and I wish not to have it again.
There were also naming issues. There was something called "Helm". To me that sounds like an Emacs package. But OK I guess we have these naming issues everywhere in software development. Still bugs me though, as it feels like Google pushes down its naming of things into many people's minds and sooner or later, most people will associate Google things with names, which have previously meant different things.
There were 1 or 2 layers of abstraction in Kubernetes, which I found completely useless for my use-case and wished they were not there, but of course I had to deal with them, as the system is not flexible to allow me to only have layers I need. I just wanted to run my containers on multiple machines, balancing the load and automatically restarting on crashes, you know, all the nice things Erlang offers already for ages.
I feel like Kubernetes is the Erlang ecosystem for the poor or uneducated, who've never heard of other ways, with features poorly copied.
If I really needed to bring a system to multiple servers and scale and load balance, I'd rather look into something like Nomad. Seems much simpler and also offers load balancing over multiple machines and can run docker containers and normal applications as well, plus I was able to set it up in less than an hour or so, having two servers in the system.
What I can tell you, is that the unbelievable bloat in the complexity of our systems is going to bite us in the ass. I'll never forget when I joined a hip fintech company, and the director of eng told us in orientation that we should think of their cloud of services as a thousand points of light, out in space. I knew my days were numbered at exactly that moment. This company had 200k unique users, and they were spending a million dollars a month on CRUD. Granted, banking is its own beast, but I had just come from a company of 10 people serving 3 million daily users 10k requests a second for images drawn on the fly by GPUs. Our hosting costs never exceeded 20k per month, and the vast majority of that was cloudflare.
Deploying meant compiling a static binary and copying it to the 4-6 hardware servers we ran in a couple racks, one rack on each side of the continent. We were drunk by 11am most of the time.
Today, it's apparently much more impressive if you need to have a team of earnest, bright-eyed Stanford grads constantly tweaking and fiddling with 100 knobs in order to keep systems running. Enter kubernetes.
I am a huge Kubernetes fan, and think that it is a good and necessary tool with little accidental complexity (most concepts are there because you will likely need them and/or that they are a valid concern), but my position is that the growth of Kubernetes has not been organic -- it's been heavily promoted and marketed and pushed to where it is today.
Let's compare with a project like Ansible: its first release was in 2012[0], and the first AnsibleFest was in 2016[0]. Ansible is a very useful abstraction/force multiplier for doing ops. If a dedicated conference is a measure of community/enthusiasm reaching a fever pitch, it took 4 years for Ansible to reach critical mass. Kubernetes had its first Kubecon in 2015[1], ONE year after its initial release in 2014[2]. Did it reach critical mass 4x quicker than Ansible? Maybe, but I think the simpler explanation is that the people who want Kubernetes to succeed know that creating buzz and the appearance of widespread adoption and community is more important than it actually being there, as it becomes a self-fulfilling prophecy. Once you have enough onlookers, people motivated to work on open source (i.e. give away labor, time and energy for free) will come improve your project with you, serve as an initial user base, your biggest promoters, all the while strengthening your ecosystem.
Another interesting side to this is how thoroughly Kubernetes seems to be crushing its competition -- DC/OS (Mesos), Nomad and other competition are not fighting a functionality war, they're fighting a marketing war. DC/OS and Nomad are not obviously worse in function, but certainly don't compare when you consider ecosystem size (perceived, if not actual) and brand. It's a winner-take-most scenario and tech companies are particularly good at seizing this kind of opportunity. Of course, if you compare the resources of the entities backing these projects, it's clear who was going to win the marketing war.
In a world of free tiers as a good way to get people locked in, developer evangelists who build essentially propaganda projects (no matter how cool they are), and shrinking attention spans, Kubernetes is a good tool which has marketed itself to greatness. In its wake there are efforts like the CNCF which I struggle to characterize because it's hard to differentiate their efforts to standardize from an effort to bureaucratize. I'm almost certainly blinded by my own cynicism but most of this just doesn't feel organic. Big, useful open source software gets world-renowned after years/decades of being convenient/useful/correct/etc but Kubernetes (and other projects given the CNCF gold star) seem to be trying to skip this process or at least bootstrap a reputation out of the gate.
DevOps traditionally moved much slower -- I can remember what seemed like an age of "salt vs ansible vs chef", with all three technologies having had lots of times to prove themselves useful. Even the switch to containers instead of VM/user based process isolation took more time than Kubernetes has taken to dominate the zeitgeist.
[0]: https://en.wikipedia.org/wiki/Ansible_(software)
[1]: http://www.voxuspr.com/2019/03/what-is-kubecon-its-past-pres...
1. It's portable
2. It's fast
3. It's declarative
4. It's fun / productive / easy
5. It's safe / automatic
6. It's an integrated framework
The opposites are also used to detract from competitors.
The idea of k8 is that it will be portable to all hosting providers and linux distributions as opposed to developing shell scripts for Red Hat, especially multiple versions. I don't think it's easy or fun or fast.
My favorite example of this right now is Vitess. Sure, it's a beautiful piece of technology. But, for a usecase my company is looking at, we'll be replacing one (exceptionally large) DB with in excess of 80 mysql pods, managed by another opaque-through-complexity system running on the top of kubernetes (which already bites us regularly even though it's "managed").
The complexity and failure scenarios makes my head ache, even though I should never have to interact with it myself.
Oh, and my current favorite PITA - having to change the API version of deployment objects from 'v1beta1' to 'v1' in over 160 microservice charts as part of a kubernetes version upgrade. Helm 2 doesn't recognize the deployments as being identical, so we also have to do a helm3 upgrade, just to avoid taking down our entire ecosystem to do the API version upgrade. Wheeee!
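For the mechanical half of that chore, here is a sketch of a bulk rewrite, under the assumption that the charts use one of the Deployment API groups removed in Kubernetes 1.16 (`extensions/v1beta1`, `apps/v1beta1`, `apps/v1beta2`) and should move to `apps/v1`. It works on raw text rather than parsed YAML, because chart templates with `{{ ... }}` placeholders are often not valid YAML on their own. The `migrate` function and the sample manifest are hypothetical, and nothing here addresses the Helm 2 release-tracking problem described above.

```python
import re

# API groups that served Deployments before apps/v1 (removed in Kubernetes 1.16).
OLD_API = re.compile(
    r"^apiVersion:[ \t]*(extensions/v1beta1|apps/v1beta1|apps/v1beta2)[ \t]*$", re.M)
IS_DEPLOYMENT = re.compile(r"^kind:[ \t]*Deployment[ \t]*$", re.M)
DOC_SEPARATOR = re.compile(r"^---[ \t]*$", re.M)


def migrate(manifest_text: str) -> str:
    """Rewrite the top-level apiVersion to apps/v1 in each Deployment document."""
    docs = DOC_SEPARATOR.split(manifest_text)
    migrated = []
    for doc in docs:
        # Only touch documents whose top-level kind is Deployment.
        if IS_DEPLOYMENT.search(doc):
            doc = OLD_API.sub("apiVersion: apps/v1", doc, count=1)
        migrated.append(doc)
    return "---".join(migrated)


sample = """apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web
---
apiVersion: v1
kind: Service
metadata:
  name: web
"""
print(migrate(sample))
```

One thing the rewrite does not do: `apps/v1` requires an explicit `spec.selector`, which the older versions defaulted from the pod template labels, so charts that omitted it still need a manual edit.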
How is this a problem unique to Kubernetes? Don't you have to make similar changes when upgrading a library or dependency that was in beta?
That said, a couple thoughts that came to mind:
1. having only 4 servers in 2 locations serving 3m customers a day seems crazy to me, at least in the context of current practices regarding highly available systems.
2. not sure your cost comparisons are fair, in the first case you're talking about cloud costs (so including hardware, 3rd party services/api fees, etc), but in the second you're just talking hosting fees.
If your first company had a relatively static, hardware-heavy (gpus doing most of the work) workload, easily handled by a few servers -- then it would be crazy to pay for a cloud provider. And it wouldn't make much sense to bother with k8s or containers either (imo).
On the other hand, if the more recent company has a dynamic/spikey, software-heavy workload, with a ton of different services, orders of magnitude more infrastructure, and (being fintech) much more demanding SLAs... then it might make a lot of sense to use a cloud provider and take advantage of k8s. Especially if you're a start up that doesn't have the time/expertise to deal with datacenter design.
I agree that there's a lot of unnecessary fixation on the latest and greatest these days, but there are definitely situations where kubernetes can be very valuable.
This was all for a weather radar app, and you are correct, there really weren't any SLA's, but we had to handle very high loads. We did make use of cloud services for some pieces of the system (there was a database and a small API for some minor bookkeeping, mostly around users). I included those costs in my estimate of monthly expenses. We had lots of caches, for all our JSON and for things like user authentication, which saved us from having to really figure out the database side. The caches were typically push-based, so we didn't let user requests get to the disc, if we could help it.
The vast majority of requests were for those images though, which required moving lots of clumsy geographic data into the GPUs to render map tiles (at high-def and high zooms as well), so the requests were still somewhat costly to serve, even if they didn't hit a database. We were able to get away with a small footprint in the datacenter by making heavy use of CDN caching. Cache lifetimes for the latest weather images were often measured in seconds, and getting those timings right was crucial. Screwing up cache lifetimes would rapidly swamp the system with requests, but the software was good at continuing to keep latency low under heavy load, and degrading gracefully. In fact, the vast majority of bandwidth usage in the datacenters was actually not requests, but streaming geographic data from various government sources. We regularly had 50-100MB/s coming in, and we stored all of it in memory. The GPU machines had 100-200GB of memory, and we used all of it. We had to cycle through that memory pretty rapidly as well, so making sure allocations were low and memory was freed up on time was important.
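To illustrate why those cache lifetimes are touchy, here is a hypothetical sketch of one way to pick them: let the newest tile expire when the next frame is expected, with a floor so the CDN keeps absorbing bursts even when a frame is late. The 120-second frame interval, the function name and the header policy are all assumptions made for the example; none of it describes the actual system.

```python
def cache_control(now: float, frame_time: float, frame_interval: float,
                  min_ttl: int = 1) -> str:
    """Cache-Control header for the newest tile.

    The tile stays valid until the next frame is expected, so the TTL shrinks
    as the frame ages and never drops below `min_ttl` seconds.
    """
    next_frame = frame_time + frame_interval
    ttl = max(min_ttl, int(next_frame - now))
    return f"public, max-age={ttl}"


# Assumed numbers: a frame rendered at t=1000 with a new frame every 120 seconds.
print(cache_control(now=1030.0, frame_time=1000.0, frame_interval=120.0))
print(cache_control(now=1119.5, frame_time=1000.0, frame_interval=120.0))
print(cache_control(now=1200.0, frame_time=1000.0, frame_interval=120.0))  # frame is late
```

Set the TTL too long and users see stale weather; drop the floor to zero and every request for a late frame goes straight to the GPU machines.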
It may not sound like we had much redundancy, but with all the caches, and each machine being quite powerful, it was better than it sounds in that regard. We often took machines in and out of nginx. The way the graceful degradation worked, we would prioritize the imagery from higher zooms (more zoomed out) so the worst that would happen on a typical day is that some very zoomed in images, in places few people were looking, might be slow or time out.
So, in the end, you are correct, the situations are different. The bank had to store things for a lot longer, and had to uphold more stringent SLA's and the like. That said, I still think they were flushing a lot of cash down the toilet, and making things over complicated :).
If such a tool does not exist do any of you feel that the creation of such a tool is within the realm of possibility?
I would imagine all these knobs could have default configurations that 99% of all users would be okay with and that the knob should only be exposed in a small amount of cases.
Don't get me wrong. I'd still probably build that as a monolith in Java instead of a thousand NodeJS services, but I can see how you end up with Kubernetes.
Let's be real, if you are old enough to get that reference without Googling, you probably would not have lasted that long at a hip fintech company anyways :-P
But it doesn’t seem like it generates quite the same buzz as Kubernetes. Not even within the azure/win/.net part of the world.
So has anyone here worked with both and could share some experience?
yaml is such a horrible format that I would even prefer JSON...
Can k8s success be explained partly due to the need for a more polyglot stack?
Once you know K8s, it's not very difficult to use. Plus, it provides solutions to a lot of different infrastructure-level problems.
it's not unique in what it does, but even with puppet and the like you always had this or that exception because of networking, provider images, varying selinux defaults etc.
kubernetes on its own already covered most ground, but configmaps and endpoints really tie it together in a super convenient package
it's not without pitfalls, like ms aks stealing 2gb from each node so you have to be aware of that and plan accordingly, but still.
This is what I hate a lot about things like k8, docker, etc: the memory profile… pretty much makes it a non starter if you want to run it on anything low cost.
What is the cheapest way to setup a production kubernetes on a cloud provider?
Kubernetes is popular because it's the new 'cool'.
1. I work for GCP 2. https://cloud.google.com/anthos/gke
1. https://cloud.google.com/blog/topics/anthos/multi-cloud-feat...
The live VMware migration to Anthos is also quite impressive.
To the first- yes, enormously so. If you know your history, it is the Linux to the Microsoft that is AWS- except backed by a business. (Google is maybe RedHat in that story, but the analogy is more inaccurate than accurate).
To the second, not really. GCP is mostly turning into an ML play.
And like PHP, it will be criticised with the power of hindsight but will continue to be used and power vast swaths of the internet.
But what is the universally regarded theory that k8s contradicts? I don't think there is one.
The storage of apiserver essentially works as distributed Blackboard in a "Blackboard System", with every controller being an agent in such a system. Meanwhile the agents themselves approach their tasks from control theory areas - oft used comparison is with PID controllers.
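A toy version of that picture, as a sketch only: the `Blackboard` class stands in for the apiserver's store, and `reconcile` is a single controller pass that acts on the difference between desired and observed state. All the names are invented for the example, and the "controller" is proportional-only, so the PID comparison is loose at best.

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    """Stand-in for the apiserver's store: shared desired and observed state."""
    desired: dict = field(default_factory=dict)   # name -> wanted replicas
    observed: dict = field(default_factory=dict)  # name -> running replicas


def reconcile(board: Blackboard) -> list:
    """One controller pass: compare desired with observed and act on the error."""
    actions = []
    for name in sorted(set(board.desired) | set(board.observed)):
        want = board.desired.get(name, 0)
        have = board.observed.get(name, 0)
        error = want - have  # the "error signal" in control-theory terms
        if error > 0:
            actions.append(f"start {error} x {name}")
        elif error < 0:
            actions.append(f"stop {-error} x {name}")
        # Pretend the actions succeeded instantly and record the new state.
        if want:
            board.observed[name] = want
        else:
            board.observed.pop(name, None)
    return actions


board = Blackboard(desired={"web": 3, "worker": 1}, observed={"web": 1, "cron": 2})
print(reconcile(board))  # first pass: work to do
print(reconcile(board))  # second pass: already converged, nothing to do
```

The loop never needs to know how the state drifted; it only looks at the blackboard and closes the gap, which is why running it again after convergence produces no actions.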
The managed hosts and/or their tools probably helped negate damage/resolve issues quicker. However I think that the idea that "all you need is a couple of dozen lines of yaml and a managed provider" is exactly why it's headed down a similar path.
For real-world examples, just look at every improperly configured S3 bucket leaking data. Every private key accidentally posted to github from a careless 'git add -A'. Every API that doesn't properly check auth. None of these are within the purview of a managed host's responsibility.
I'm not even against K8 in any of this. Just making the observation that - like PHP - it is empowering entire groups of people to do things they otherwise wouldn't be able to do.
You always have those concerns, it's just implemented differently. A customer hurling abuse at you over the phone (or worse - in person) is a form of healthcheck and monitoring, albeit a worse one than the all too common "have someone log in to the server every day and check if it's alive".
So is frantically logging into server to manually truncate log files that filled your one and only disk volume and caused the above abuse.
So is "we're losing customers because of how slow it is" yet not having a single idea why it is slow, because it runs fast when dev checks on their laptop.
All of the above are based on actual real world events, sometimes involving large corporations. In fact, the large corps seem to have most issues with manual work, because they can afford throwing cannon fodder ^W^W "experienced engineers" at the problems.
At some point it becomes a question of what is good use of your time. I disagree heavily with people claiming that running kubernetes is somehow orders of magnitude more complex than anything else (especially with k3s and using non-etcd backing stores). The complexity is necessary complexity, which you can tackle in various ways including YOLO.
Sometimes the YOLO approach however bites in the worst moment, and spending time on bespoke scripts, or figuring out configuration drift, are all costs that show up as you tackle said complexity.
Personally, the reason I went with kubernetes in the first production deployment I did with it, after being a vocal anti-docker person at work, was because of... cost efficiency. Both in terms of my time (even though we had to spend a significant amount of time migrating, as it was a lift&shift of existing software), and in terms of compute costs - thanks to heavily loaded nodes our worst compute bill never reached above 20% of previous "condition normal". I don't think we ever really had more than 10 servers on purpose. Using k8s paid for itself.
We've had to work very hard to allow developers/sre/ops folks to provision vms and bare-metal machines in our datacenters the same way they would in the cloud provider that we use. Obviously it's not as fast, seamless or feature-rich as it is with aws/gcp/azure et al, but I'm proud of the progress we've made.
What really kills me though, is that a huge chunk of our engineers seem to think our work is a complete waste of time in the first place. We have several physical dcs, and tens of thousands of machines... but since most engineers don't have to think about costs, or about workloads other than their own, they think of us as out of touch and clinging to the past.
Nothing worse than getting snark about our platform from an SRE who spends their days in a web app glueing together the ready made services of google and amazon while acting as if they're building the world of tomorrow :)
But that's a moot point anyways, since Vitess doesn't use persistent volumes - it reloads the individual DBs from backups and binlogs when a pod is moved or restarted.
NFS is an option, but it’s not the only option. If you need locally attached storage you can use local PV’s which went GA in Kubernetes 1.14, or any of the plethora of volume plugins that exist for various network storage solutions.
I had forgotten about local storage; it’s not something we can use in our environment.
It’s a moot point, in either case. Vitess doesn’t rely on persistent storage, it relies on replicas and backups.
Forest, trees.
• Lets companies brag about having # many production services at any given time
• Company saves money by not having to hire Linux sysadmins
• Company saves money by not having to pay for managed cloud products if they don't want to
• Declarative, version controlled, git-blameable deployments
• Treating cloud providers like cattle not pets
It's going to eat the world (already has?).
I was skeptical about Kubernetes but I now understand why it's popular. The alternatives are all based on kludgy shell/Python scripts or proprietary cloud products.
It's easy to get frustrated with it because it's ridiculously complex and introduces a whole glossary of jargon and a whole new mental model. This isn't Linux anymore. This is, for all intents and purposes, a new operating system. But the interface to this OS is a bunch of <strike>punchcards</strike> YAML files that you send off to a black box and hope it works.
You're using a text editor but it's not programming. It's only YAML because it's not cool to use GUIs for system administration anymore (e.g. Windows Server, cPanel). It feels like configuring a build system or filling out taxes--absolute drudgery that hopefully gets automated one day.
The alternative to K8s isn't your personal collection of fragile shell scripts. The real alternative is not doing the whole microservices thing and just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster--but we're not ready to have that discussion.
As a spectator, not a tech worker who uses these popular solutions, I would say there seems to be a great affinity in the tech industry for anything that is (relatively) complex. Either that, or the only solutions people today can come up with are complex ones. The more features and complexity, the more something is constantly changing, the more a new solution gains "traction". If anyone reading has examples that counter this idea, please feel free to share them.
I think if a hobbyist were to "[deploy] a single statically linked, optimized [C++] server that can serve 10k requests per second from a toaster" it would be like a tree falling in the forest. First, because it is too simple and lacks the complexity that attracts the tech worker crowd, and second, because it is not being used by a well-known tech company and not being worked on by large numbers of people, it would not be newsworthy.
Developer time for fixing these bugs is in most cases more expensive than throwing more hardware at your software written in a garbage-collected language.
And at this point the hobbyist might wonder, "why isn't my toaster software being used by well-known companies? where are the pull requests to add compatibility for newer toaster models?"
> As a spectator, not a tech worker who uses these popular solutions, I would say there seems to be a great affinity in the tech industry for anything that is (relatively) complex.
I think you have it backwards. General/abstract solutions (like running arbitrary software with a high tolerance for failure) have broad appeal because they address a broad problem. Finding general solutions to broad problems yields complexity, but also great value.
That, and a mixture of the sunk cost fallacy and a lack of ability to step back and review whether the chosen solution is really better/simpler rather than a hell of accidental complexity. If you've spent countless months to grok k8s and sell it to your customer/boss, it just has to be good, doesn't it?
Plus, there's a great desire to go for a utopian future cleaning up all that's wrong with current tech. This was the case with Java in the 2000s, and is the case with Rust (and to a lesser degree with WASM) today, and k8s. Starting over is easier and more fun than fixing your shit.
And another factor are the deep pockets of cloud providers who bombard us with k8s stories, plus devs with an investment into k8s and/or Stockholm syndrome. Same story with webdevs longing for nuclear weapons a la React for relatively simple sites to make them attractive on the job market, until the bubble collapses.
But like with all generational phenomena, the next wave of devs will tear down daddy-o's shit and rediscover simple tools and deploys without gobs of yaml.
K8s is developed by a multitude of very large companies, each with their own agenda/needs. All of them have to be addressed. Thus the complexity. If you think about it, they probably manage to keep the complexity at relatively low levels. Maybe because it is pushed to the rest of the ecosystem (see service meshes for example).
Being pushed by the behemoths also explains the popularity. Smaller companies and workers feel that this is a safe investment in terms of money and time familiarizing with the tech stack so they jump on. And the loop goes on.
The main business reason for all that, though, I think is the need of Google et al. to compete with AWS by creating a cloud platform that becomes a standard and belongs to no one really. In this sense it is a much better, more versatile and open-ended OpenStack attempt.
And yes, there are less fancy companies like the one where I work, where we don't use Kubernetes because it's kind of overkill if all of your production workload fits onto 2 beefy bare metal servers.
I can see a point in using Docker to unify development and production environments into one immutable image. But I have yet to see a normal-sized company that gets a benefit from spreading out to hundreds of micro instances on a cloud and then coordinating that mess with Kubernetes. Of course, it'll be great if you're a unicorn, but most people using it are planning for way more scaling than what they'll realistically need and are, thus, overselling themselves on cloud costs.
hah, i've noticed that too - specifically around k8s/deployments/system architecture too. i've taken to calling it complexity fetishisation.
i think it stems from the belief/hope that, whilst they don't have "google sized" data today, they need to allow for it.
i'll take two toasters, please.
I was pretty skeptical too but then handed over a project which was a pretty typical mixed bag: Ansible, Terraform, Docker, Python and shell scripts, etc... Then I realized relying on Kubernetes for most projects has the huge benefit of bringing homogeneity to the provisioning/orchestration which improves things a lot both for me and the customer or company I work for.
Let's be honest here, in many cases it does not make a difference whether Kubernetes is huge, inefficient, complicated, bloated, etc... or not. It certainly is. But just the added benefit of pointing at a folder and stating : "this is how it is configured and how it runs" is huge.
I was also pretty skeptical of Kustomize but it turned out to be just enough.
So, like many here. I kind of hate it but it serves me well.
Citation? In my experience companies hire more sysadmins when adopting k8s. It's trivial to point at the job reqs for it.
> Company saves money by not having to pay for managed cloud products if they don't want to
Save money?! Again citation. What are you replacing in the cloud with k8s? In my experience most companies using k8s (as you already admitted) don't have a ton of ops experience and thus use more cloud resources.
> Treating cloud providers like cattle not pets
Again. Citation? Companies go multi-cloud not because they want to but because they have different teams (sometimes from acquisition) that have pre-existing products that are hard to move. No one is using k8s to get multi-cloud as a strategy.
> It's going to eat the world (already has?).
No it won't. It's actually on the downtrend now. Do you work for the CNCF? Can you put in a disclaimer if so?
> just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster
completely un-necessary; most of the HN audience is not creating a c++ webserver from scratch; most of the HN audience can trivially serve way more than 10k reqs/sec from a single vm (node, rust, go, etc. are all easily capable of doing this from 1 vcpu)
Your C++ example is orthogonal to the deployment aspect because it discusses the application. Kubernetes and the fragile shell scripts are about the deployment of said application.
How are you going to deploy your C++ application? Both options are available, and I would wager that in most cases, Kubernetes makes more sense, unless you have strict requirements.
A "C++ monolith" allows me to potentially bypass a lot of this deployment stuff because it could serve lots (millions) of users from a single box.
I think for the average application there's still something to be said for manual cross-layer optimization between infrastructure, application, and how both are deployed.
What I mean is we can't yet draw too clear a line between the application and how it's deployed because there are real tradeoffs between keeping future options open and getting the product out the door. A strength of kubernetes is that if you get good at it it works for a variety of projects, but a lot of effort is needed to get to that point and that effort could have gone into something else.
Neither of those features are inherently impossible to do with GUIs. Alternatively, you can have a GUI editing your text based configuration.
Although, doesn't look as cool so, here we are.
In some cases, the cost of a managed cloud product may be cheaper than the cost of training your engineers to work with K8s. It just depends on what your needs are, and the level of organizational commitment you have to making K8s part of your stack. Engineers like to mess around with new tech (I'm certainly guilty of this), but their time investment is often a hidden cost.
> The alternatives are all based on kludgy shell/Python scripts or proprietary cloud products.
The fact that PaaS products are proprietary is often listed as a detriment. But, how detrimental is it really? There are plenty of companies whose PaaS costs are insignificant compared to their ARR, and they can run the business for years without ever thinking about migrating to a new provider.
The managed approach offered by PaaS can be a sensible alternative to K8s, again it just depends on what your organizational needs are.
You are writing this, and just yesterday I was thinking about how to extend my current home k8s setup even further.
I would even manage that little c++ tool through k8s.
K8s brings plenty of other things out of the box:
• Rolling update
• HA
• Storage provisioning (which makes backup simpler)
• Infrastructure as code (whatever your shellscript is doing)
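To make the rolling update item a bit more concrete, here is a toy simulation of the idea: replace replicas a batch at a time so that most of them keep serving. This is only a sketch of the concept, not how Kubernetes implements it; the function name and the `max_unavailable` parameter are borrowed loosely for illustration.

```python
def rolling_update(instances, new_version, max_unavailable=1):
    """Replace versions a batch at a time, yielding the state after each batch.

    `instances` is a list of version strings, one per running replica.
    At most `max_unavailable` replicas are being replaced at any moment.
    """
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    current = list(instances)
    for start in range(0, len(current), max_unavailable):
        end = min(start + max_unavailable, len(current))
        for i in range(start, end):
            # Stand-in for "stop the old replica, start the new one, wait until ready".
            current[i] = new_version
        yield list(current)


steps = list(rolling_update(["v1", "v1", "v1"], "v2"))
for step in steps:
    print(step)
```

With three replicas and one allowed to be unavailable, there are three steps, and at every step at least two replicas are untouched and serving.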
I think that the overhead k8s requires right now will become smaller over the years; it will become simpler to use, and more and more stable.
It is already a really simple and nice control plane.
I like to use a few docker containers with compose. But if I already use docker compose for 2 projects, why not just use k8s instead?
How do you manage deployments for that C++ monolith? How is the logging? Logrotate, log gathering and analysis? Metrics, their analysis and display? What happens when you have software developed by others that you might also want to deploy? (If you can run a company with only one program ever deployed, I envy you).
All of that is simplified by Kubernetes by making everything follow a single way of doing things - "classical" approaches tend to make Perl blush with the amount of "There is more than one way to do it" that goes on.
.. but hire others to manage k8s? Or existing software engineers have to spend time doing so?
Many of the other points don't seem unique to k8s either.
I do like the alternative you've suggested though.
Most of it can be managed by text boxes on the front-end with selections and then it can just generate or edit the required files at the end of a wizard?
But then again it's actually the first public/popular attempt at a cloud OS. There might be a next one with better ergonomics than YAML files.
Is that an incorrect understanding? I know C++ is supposed to be great for performance, but in truth I've never needed anything to be that fast. And if I can get the job done just as well with something I already know, I won't bother learning something like C++ which has a reputation for not being approachable.
But maybe I don't have full context?
An alternative is to have your program expose its pid somewhere, and your make file could send a signal to that pid when a new version is ready. The advantage is that if your program crashes on startup, you don't have to do something different to restart it.
If your application has to include an "auto-update" feature (like browsers), use that instead - it's certainly better to eat your own dog food. Maybe just hack it a little bit so that you can force a check programmatically (e.g. by sending it a signal) and so that it connects to a local updates server.
It is true that C++ is an overly complicated language; if you don't need maximal performance, you have a lot of AoT languages that are a bit slower ( something around 0.5 C ) but more "user-friendly". In particular, if you are into servers and want fast edit-compile-run loops, Go might be a good choice.
In a world where you are billed by resources used, wouldn’t it be a good idea to have light and fast services that don’t consume those resources?
I’ve been wondering this for a time now.
C++ is more for the extreme control over memory. An optimized C++ server can max out the NIC even on a single core and even with some text generation/parsing along the way.
If you do need higher availability you can go the route StackOverflow was famous for for quite some time of having a single ReadWrite master server, a backup master to take over and a few of ReadOnly slave servers, IIRC. With such setups you can ignore all the extra complexities cloud deployments bring with them. And just because such simple setups make it possible to treat servers like pets, doesn't mean they have to be irreproducible undocumented messes.
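As a toy illustration of that kind of topology (one read-write primary, one standby, a few read-only replicas), here is a sketch of the routing logic. Server names are plain strings invented for the example; this is not a description of how Stack Overflow actually wires things up.

```python
import itertools


class ReplicatedDatabase:
    """Route writes to one primary and spread reads across read-only replicas.

    Server names are plain strings here; a real setup would hold connections.
    """

    def __init__(self, primary, standby, replicas):
        self.primary = primary
        self.standby = standby
        self._replica_cycle = itertools.cycle(replicas)

    def server_for_write(self):
        return self.primary

    def server_for_read(self):
        # Round-robin over the read-only replicas.
        return next(self._replica_cycle)

    def fail_over(self):
        """Promote the standby when the primary is lost."""
        if self.standby is None:
            raise RuntimeError("no standby left to promote")
        self.primary, self.standby = self.standby, None


db = ReplicatedDatabase("db-primary", "db-standby", ["db-ro-1", "db-ro-2"])
```

The whole "cluster" fits in a screenful of code and a short runbook, which is the point being made: it can still be documented and reproducible without an orchestrator.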
system("go build")
exec("./main")
Not a literal hot code reload that some advanced stuff get to enjoy, but nice enough to shorten the feedback loop.
Even just working in TypeScript with TSed and a few basic strongly typed concepts (Rust's Result equivalent in TS, Option or Maybe, and typed http request/response, and Promise and JSON.parse) makes a big difference.
A lot less "okay, just echo/print/log this object (or look up the documentation, eew), see what it looks like and figure out how to transform it into what I need." Instead you do that in the IDE.
• Company saves money by not having to hire Linux sysadmins
• Company saves money by not having to pay for managed cloud products if they don't want to
As a developer I want to write code, not manage a Kubernetes installation. If my employer wants the most value from my expertise they will either pay for a hosted environment to minimize my time managing it or hire dedicated staff to maintain an environment.
A lot of people are just really interested in having something complex instead of understanding their actual needs.
With hypervisors and managed environments taking over distributed computing, whether there is a kernel derived from Linux or something completely different is a detail that only the cloud provider cares about.
It's still Linux inside the container. Even if it's some abstract non-Linux service thing running the container, what happens in the container is still the concern of the developer.
The alternative is to have an old and boring cluster of X identical Java nodes which host the entire backend in a single process... The deployment is done by a pedestrian bash script from a Jenkins. It used to work fine for too long I guess and folks couldn't resist "inventing" microservices to "disrupt" it.
k8s is popular because Docker solved a real problem and Compose didn’t move fast enough to solve the orchestration problem. It’s a second order effect; the important thing is Docker’s popularity.
Before Docker there were a lot of different solutions for software developers to package up their web applications to run on a server. Docker kind of solved that problem: ops teams could theoretically take anything and run it on a server if it was packaged up inside of a Docker image.
When you give a mouse a cookie, it asks for a glass of milk.
Fast forward a bit and the people using Docker wanted a way to orchestrate several containers across a bunch of different machines. The big appeal of Docker is that everything could be described in a simple text file. k8s tried to continue that trend with a yml file, but it turns out managing dependencies, software defined networking, and how a cluster should behave at various states isn’t the greatest fit for that format.
Fast forward even more into a world where everybody thinks they need k8s and simply cargo cult it for a simple Wordpress blog and you’ve got the perfect storm for resenting the complexity of k8s.
I do miss the days of ‘cap deploy’ for Rails apps.
Kubernetes is very complex and took a long time to learn properly. And there have been fires along the way. I plan to write extensively on my blog about it.
But at the end of the day: having my entire application stack as YAML files, fully reproducible [1] is invaluable. Even cron jobs.
Note: I don't use micro services, service meshes, or any fancy stuff. Just a plain ol' Django monolith.
Maybe there's room for a simpler IAC solution out there. Swarm looked promising then fizzled. But right now the leader is k8s[2] and for that alone it's worth it.
[1] Combined with Terraform
[2] There are other proprietary solutions. But k8s is vendor agnostic. I can and have repointed my entire infrastructure with minimal fuss.
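To give a feel for the "even cron jobs" part, here is a sketch of a CronJob manifest for a Django housekeeping task, built as a Python dict and dumped as JSON. The job name, image and schedule are invented for the example; `batch/v1` is the API version on current clusters, while older clusters used `batch/v1beta1`.

```python
import json


def cron_job_manifest(name, schedule, image, command):
    """Build a Kubernetes CronJob manifest; all values passed in are placeholders."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # standard five-field cron syntax
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [
                                {"name": name, "image": image, "command": command}
                            ],
                        }
                    }
                }
            },
        },
    }


nightly = cron_job_manifest(
    name="clear-sessions",
    schedule="0 3 * * *",
    image="registry.example.com/myapp:latest",  # hypothetical image
    command=["python", "manage.py", "clearsessions"],
)
print(json.dumps(nightly, indent=2))
```

Because the schedule lives next to the rest of the manifests, it gets the same review, history and rollback as the application itself instead of hiding in a crontab on one box.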
Most companies who were late on the Cloud hype cycle (which is quite a lot of F100s) got to see second-hand how using all the nice SaaS/PaaS offerings from major cloud providers puts you over a barrel and don't have any interest in being the next victim, and it's coming at the same time that these very same companies are looking to eliminate expensive commercially licensed proprietary software and revamp their ancient monolithic applications into modern microservices. The culmination of these factors is a major facet of the growth of Kubernetes in the Enterprise.
It's not just hype, it has a very specific purpose which it serves in these organizations with easily demonstrated ROI, and it works. There /are/ a lot of organizations jumping on the bandwagon and cargo-culting because they don't know any better, but there are definitely use cases where Kubernetes shines.
I've yet to meet anyone who can easily explain how the CNI, services, ingresses and pod network spaces all work together.
Everything is so interlinked and complicated that you need to understand vast swathes of kubernetes before you can attach any sort of complexity to the networking side.
I contrast that to its scheduling and resourcing components which are relatively easy to explain and obvious.
Even storage is starting to move to overcomplication with CSI.
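The scheduling side really can be explained in a few lines. Below is a toy filter-then-score scheduler, assuming only CPU and memory requests matter. It is a simplification for illustration: the real kube-scheduler applies many more filters and scoring rules, and the class and field names here are made up.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: float  # cores
    mem_free: int    # MiB


@dataclass
class Pod:
    name: str
    cpu_request: float
    mem_request: int


def schedule(pod, nodes):
    """Pick a node for the pod, or return None if nothing fits.

    Filter: keep nodes with enough free CPU and memory for the requests.
    Score: prefer the node left with the least free CPU (packs nodes tightly).
    """
    candidates = [
        n for n in nodes
        if n.cpu_free >= pod.cpu_request and n.mem_free >= pod.mem_request
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: n.cpu_free - pod.cpu_request)
    # Reserve the resources so the next pod sees the updated free capacity.
    best.cpu_free -= pod.cpu_request
    best.mem_free -= pod.mem_request
    return best.name


nodes = [
    Node("node-a", cpu_free=4.0, mem_free=8192),
    Node("node-b", cpu_free=2.0, mem_free=4096),
]
```

Try writing the equivalent one-screen sketch for CNI plus services plus ingress and the contrast with networking becomes obvious.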
I half jokingly think K8s adoption is driven by consultants and cloud providers hoping to ensure a lock-in with the mechanics of actually deploying workloads on K8s.
That said, I think Kubernetes may be at the productivity stage of its journey along the tech hype cycle. Networking in Kubernetes is complicated. This complication and abstraction has a point if you are a company at Google scale. Most shops are not Google scale and do not need that level of scalability. The network abstraction has its price in complexity when doing diagnostics.
You could solve networking differently than Kubernetes does by using IPv6. There is no need for complicated IPv4 NAT schemes. You could use native IPv6 addresses that are reachable directly from the internet. Since you have so many IPv6 addresses, you do not need routers/NATs.
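To put numbers on "so many addresses", here is a small sketch using Python's standard `ipaddress` module. It assumes a cluster is handed a /48 and gives every node its own /64; the prefix comes from the range reserved for documentation, and the function names are invented for the example.

```python
import ipaddress
import itertools

# 2001:db8::/32 is the prefix reserved for documentation; a real cluster
# would use a prefix delegated by its provider.
CLUSTER_PREFIX = ipaddress.ip_network("2001:db8:42::/48")


def node_prefixes(cluster, count):
    """Give each of the first `count` nodes its own /64 out of the cluster prefix."""
    return list(itertools.islice(cluster.subnets(new_prefix=64), count))


def pod_address(node_prefix, index):
    """Pick the index-th address inside a node's /64 for a pod."""
    return node_prefix[index]


prefixes = node_prefixes(CLUSTER_PREFIX, 3)
first_pod = pod_address(prefixes[1], 1)
print(prefixes[1], first_pod, prefixes[1].num_addresses)
```

A single /48 holds 65,536 such /64 prefixes, and each /64 holds 2^64 addresses, so every pod can get a globally unique address without any address translation.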
Anyhow, in a few years' time some might be using something simpler, like an open source Heroku-like platform. If you could bin pack the services that communicate with each other onto the same nodes, there would be speed gains from not having to do network hops and going straight to local memory instead. Or something like a standardized open source serverless function runner.
https://en.wikipedia.org/wiki/KISS_principle https://en.wikipedia.org/wiki/Hype_cycle
There are many arguments that IPv6 didn't solve too many IPv4 pain points, but if it solved anything, it is definitely this.
1) It solves many different universal, infrastructure-level problems.
2) More people are using containers. K8s helps you to manage containers.
3) It's vendor agnostic. It's easy to relocate a k8s application to a different cluster.
5) People see that it's growing in popularity.
6) It's Open source.
7) It helps famous companies run large-scale systems.
8) People think that it looks good on a resume and they want to work at a well known company.
9) Once you've mastered K8s, it's easy to use on problems big and small. (Note, I'm not talking about installing and administrating the cluster. I'm talking about being a cluster user.)
10) It's controversial which means that people keep talking about it. This gives K8s mind share.
I'm not saying K8s doesn't have issues or downsides.
1) It's a pain to install and manage on your own.
2) It's a lot to learn--especially if you don't think you're gonna use most of its features.
3) While the documentation has improved a lot, it's still weak and directionless in places.
I think K8s is growing more popular because its pros strongly outweigh its cons.
(Note I tried to be unbiased on the subject, but I am a K8s fan--so much so that I wrote a video course on the subject: https://www.true-kubernetes.com/. So, take my opinions with a grain of salt.)
I still believe 90% of users would be better served by Nomad. And if someone says "developers want to use the most widely used tech", then I'm here to call bullshit, because the concepts between workload schedulers and orchestrators like k8s and nomad are easy enough to carry over from one side to the other. Learning either even if you end up using the other one is not a waste of time. Heck, I started out using CoreOS with fleetctl and even that taught me many valuable lessons.
The second reason is also about standards, but using them more assertively. Docker had way more attention and activity until 2016 when Kubernetes published the Container Runtime Interface. By limiting the Docker features they would use, they leveled the playing field between Docker and other runtimes, making Docker much less exciting. Now, new isolation features are implemented down at the runc level and new management features tend to target Kubernetes because it works just as well with any CRI-compliant runtime. Developing for Docker feels like being locked in.
Likewise, Linux is also a confusing mess of different parts and nonsensical abstractions when you first approach it. It does take some time to understand how to use it, and in particular how to do effective troubleshooting when things aren't working the way you expect.
But I 100% agree--I think it's the new Linux. In 5-10 years, it'll be the "go to", if not sooner.
Then a lot of people drink the koolaid and apply it everywhere / feel they're behind if they aren't in Kubernetes.
We are not in Kubernetes and have multiple datacenters with thousands of VMs/containers. We are doing just fine with the boring consul/systemd/ansible setup we have. We also have some things running in containers, but not much.
Funnily enough at the OSS summit I had a couple of chats with people in the big companies (AWS, Netflix, etc.) and they themselves have the majority of their workflows in boring VMs. Just like us.
IMO containers are great for stateless apps that don't require many resources, where having a dedicated machine for each of them would be a waste.
The smart people at Google knew that by quickly packaging their own internal tech and releasing it on open source they’d help people move from the incumbent AWS.
Helping customers switch IaaS hurts them both, since lock-in is better for providers, but it hurts AWS way more. Proof? They made it free to run the necessary compute behind the K8s control plane, until recently that was.
Are there benefits to running your biz’ web app using constructs made for a “cloud”? Sure there are, that’s why people are moving to K8s. There are real business benefits, given a certain amount of necessary moving parts. LinkedIn had such a headache with this they created Kafka.
I suspect most organisations’ Architects and IT peeps push for K8s as a moat for their skills and to beef up their resumé. They know full well that the value is not there for the biz’ but there’s something in it for them.
1. It's simple to get started with, but complex enough to tweak to your needs in respect to simplicity of deployment, scaling and resource definition.
2. It's appealingly cloud-agnostic just at the time where multiple cloud providers are all becoming viable and competitive.
I think it's more #2 than #1; as always, timing is everything.
The system becomes so complex that most people screw up simple things like redundancy, perimeter security and zero downtime updates.
I've seen all of the above from very bright and capable people.
Let's use Istio's "istioctl manifest apply" to deploy a service mesh to my cluster that allows me to pull auth logic / service discovery / load balancing / tracing out of my code and let Istio handle this.
Let's configure my app's infrastructure (Kafka (Strimzi), Yugabyte/Cockroach, etc) as yaml files. Being able to describe my kafka config (foo topic has 3 partitions, etc) in yaml is priceless.
Let's move my entire application and its infrastructure to another cloud provider by running a single bazel command.
k8s is the common denominator that makes all this possible.
can't... terraform make all of that possible?
You can write your own providers, you can use the provisioned support, but TF doesn't like that and it shows.
Again the cross cloud portability is a non starter, unless you're really at scale.
We moved to it from docker swarm because docker swarm still has a lot of glitches with its overlay network. Rolling upgrades would leave stale network entries and it’s impossible to reproduce. Sometimes it happens, sometimes it doesn’t.
With a managed solution, Kubeadm, or RKE it’s not hard to deploy anymore. All our infrastructure is in code, is immutable, and if you’re careful can be deployed into any kubernetes cluster.
Just like Docker has been great for easily deploying open source products, kubernetes is great for doing the same thing when you need to deploy horizontally. It’s easy for OSS to provide a docker image, a docker compose file for single node deploy, and Kubernetes yaml for a horizontal deploy.
The environments advertise themselves via that same modified ingress's default backend. We stick a tiny bit of deploy yaml in our projects, the deployments kube tagging gives us all the details we need to provide diffs, last build time, links to git repos, web sites etc for the particular environment. The yaml demonstrates conclusively how an app could or should be run, regardless of os or software choice, so when we hand it to ops folks there is a basis for them to run from.
However, because enterprise ops prior to Kubernetes are both costly and brittle, Kubernetes just works for enterprises.
We had a huge PowerShell codebase and it was a nightmare to maintain. At the same time, it was nowhere near as robust as Kubernetes.
It's just as simple as that: sure, Kubernetes seems to be complex, but most enterprise stuff is even worse. At the same time, despite being costly, the quality is usually pretty crappy because those scripts are written under delivery pressure.
I've noticed that there are a lot of replies such as "it is overhyped" and "I can just run a VM".
Kubernetes is not for you as your use case may not match what it does and solves. Kubernetes provides a standard way of running your applications. It is complex but logical. Yaml sucks but it is simple and logical. I prefer to use terraform for kubernetes but it is the same thing, simple and logical. You cannot say the same with puppet, chef, ansible etc. All of those configuration tools are a big mess of different setups and scripts. I can go to any company and understand how their system works quite quickly. It makes searching for answers easy too because it is standard.
When you are running several services and there is an outage, it is a godsend. You can instantly view the status of things, how they are configured and when they changed. That is POWERFUL.
It takes a while to understand how all of the resources fit together but that is the same case with any type of deployment system and/or operating system.
p.s. I am not running that huge of a system, maybe about 5k containers total between dev, staging and prod. Maybe 500k requests a day. Running a couple kubernetes clusters is significantly nicer than running things in ECS.
The kubernetes ecosystem is really amazing and full of invaluable resources. It's vast, complex, but well thought out. Getting to know all the ins and outs of the project is time consuming. So many things to learn and so little time to practice...
I introduced K8s to our company back in 2016 for this exact reason. All I cared about was managing the applications in our data engineering servers, and Docker solved a real pain point. I chose K8s after looking at Docker Compose and Mesos because it was the best option at the time for what we needed.
K8s has grown more complex since then, and unfortunately, the overhead in managing it has gone up.
K8s can still be used in a limited way to provide simple container hosting, but it's easy to get lost and shoot yourself in the foot.
There are basically two relevant package managers. And say what you will about systemd, service units are easy to write.
It's weird to me that the tooling for building .deb packages and hosting them in a private Apt repository is so crusty and esoteric. Structurally these things "should" be trivial compared to docker registries, k8s, etc. but they aren't.
It's great for distributions, but not so great for custom developments where dependencies can either be out of date or bleeding edge or a mix of the two. For these, a bundling approach is often preferable, and docker provides a simple to understand and universal way to achieve that.
That's for the packaging part.
Then you have the 2 other parts: publishing and deployment.
For publishing, Docker was created from the get go with a registry, which makes things relatively easy to use and well integrated. By contrast, for rpm and deb, even if something analogous exists (aptly, pulp, artifactory...), it is more a set of tools created over time that work on top of one another, giving a less smooth experience.
And then, you have the deployment part, and here, with traditional package managers, it is difficult to delegate some installs (typically, the custom app developed in-house) to the developers without opening control over the rest of the system. With Kubernetes, developers gained this autonomy of deployment for the pieces of software under their responsibility whilst still maintaining separation of concerns.
Docker and Kubernetes enabled cleaner boundaries, more in line with the realities of how things are operated for most mid to large scale services.
dpkg, rpm, nix, snap, dnf, and I'm sure someone is going to respond with package managers I forgot.
> everybody thinks they need k8s and simply cargo cult it for a simple Wordpress blog
docker _also_ has this problem though. there are probably 6 people in the world that need to run one program built with gcc 4.7.1 linked against libc 2.18 and another built with clang 7 and libstdc++ at the same time on the same machine.
and yes, docker "provides benefits" other than package/binary/library isolation, but it's _really_ not doing anything other than wrapping cgroups and namespacing from the kernel - something for which you don't need docker to do (see https://github.com/p8952/bocker).
docker solved the wrong problem, and poorly, imo: the packaging of dependencies required to run an app.
and now we live in a world where there are a trillion instances of musl libc (of varying versions) deployed :)
sorry, this doesn't have much to do with k8s, i just really dislike docker, it seems.
The dependency thing is just the fallout of the (bad) default provided by distributions.
In production this model is quite a good way to guarantee your internal components aren't directly exposed too.
You are supposed to keep only a single process inside one docker container. If you want two processes to be tightly coupled then use multi-container pods.
At my company we have had better success with micro-services on AWS Lambda. It has vastly less overhead than Kubernetes and it has made the tasks of the developers and non-developers easier. "Lock-in" is unavoidable in software. In our risk calculation, being locked into AWS is preferable to being locked into Kubernetes. YMMV.
Oh boy I do not miss them. Actually I'm still living them and I hope we can finally migrate away from Capistrano ASAP. Dynamic provisioning with autoscaling is a royal PITA with cap as it was never meant to be used on moving targets like dynamic instances.
Add operators, complicated deployment orchestration and more sophisticated infrastructure... It is hard to know if things are failing from a change I made or just because there are so many things changing all the time.
But I'm not sure one can find something of "the right power" that has the same support from cloud providers, the open source community, the critical mass, etc. [1]
Eventually, a standard "simplified" abstraction over k8s will emerge. Many already exist, but they're all over the place. And some are vendor specific (Google Cloud Run is basically just running k8s for you). Then if you need the power, you can eject. Something like Create React App, but by Kubernetes. Create Kubernetes App.
[1] Though Nomad looks promising.
The test is going from zero to production traffic in a new cloud region.
Effectively, "every infrastructure as code project will reimplement Kubernetes in Bash"
Kubernetes is fine, but setting it up kind of feels like I'm trying to earn a PhD. Swarm is dog-simple to get working and I've really had no issues in the three years that I've been running it.
The configs aren't as elaborate or as modular as Kubernetes, and that's a blessing as well as a curse; it's easy to set up and administer, but you have less control. Still, for small-to-mid-sized systems, I would still recommend Swarm.
The kind of person who has to both set the cluster up and keep it up, and who also has to develop the application, deploy it and keep it up, is not the target audience.
K8s shines when the roles of managing the cluster and running workloads on it are separated. It defines a good contract between infrastructure and workload. It lets different people focus on different aspects.
Yes it still has rough edges, things that are either not there yet, or vestigial complexity of wrong turns that happened through its history. But if you look at it through the lens of this corporate scenario it starts making more sense than when you just think of what a full-stack dev in a two person startup would rather use and fully own/understand.
Are you following "k8s the hard way"? I've never had this problem; either:
`gcloud container clusters create`
Or
`install docker-for-mac`
And you have a k8s cluster up and running. Maybe it's more work on AWS?
Once everything is "infrastructure as code", the app team becomes less dependent on other teams in the org.
People like to own their own destiny. Of course, that also removes a lot of potential scapegoats, so you now mostly own all outages, tech debt, etc.
I worked in networking for the longest time. When I started there were network guys and server guys (at least where I was). They were different people who did different things who kinda worked together.
Then there were storage area networks and similar, networks really FOR the server and storage guys.... that kind of extended the server world over some of the network.
Then comes VMware and such things and now there was a network in a box somewhere that was entirely the server guy's deal (well except when we had to help them... always).
Then we also had load balancers, which in their own way were a sort of code for networks ... depending on how you looked at it (open ticket #11111 of 'please stop hard coding ip addresses').
You also had a lot of software defined networking type things and so forth brewing up in dozens of different ways.
Granted these descriptions are not exact, there were ebbs and flows and some tech that sort of did this (or tried) all along. It all starts to evolve slowly into one entity.
We can build higher level abstractions easily having a schema to target and we can build them in whatever we want. That's a big boon for me :)
We have plenty of yaml 'code' which is simple and does exactly what it needs to do.
For all other usecases, there are plenty of alternatives, including libs for your preferred language.
Most people use the yaml way because it's easy and does exactly what it needs to do.
Everyone else has plenty of well supported and well working alternatives.
When you get to that blog post please consider going in depth on this. Would love to see actual battletested information vs. the usual handwavy "it works everywhere".
Even ingress is trivial if you use a cloud balancer per ingress. But I wanted to save money so use a single cloud balancer for multiple ingresses. So you need something like ingress-nginx, which has a few vendor-specific subtleties.
In retrospect though, maybe it's exactly what I needed. Great suggestion.
- no RBAC
- no quotas
- no preemption
- no namespacing
This means: everyone is root on the cluster, including any CI/CD system that wants to test/update code. And there's no way to contain runaway processes with quotas/preemption.
Cause those are the parts that I miss probably the most when dealing with non-k8s deployment, and I haven't had the occasion to use Nomad.
The way I learned it in Bret Fisher's Udemy course, Swarm is very much relevant, and will be supported indefinitely. It seems to be a much simpler version of Kubernetes. It has both composition in YAML files (i.e. all your containers together) and the distribution over nodes. What else do you need before you hit corporation-scale requirements?
1. Swarm is dead in the water. No big releases/development afaik recently
2. Swarm for me has been a disaster because after a couple of days some of my nodes slowly start failing (although they’re perfectly normal) and I have to manually remove each node from the swarm, join them, and start everything up again. I think this might be because of some WireGuard incompatibility, but the strange thing is that it works for a week sometimes and other times just a few hours
3. Lack of GPU support
Cronjobs, configmaps, and dynamically allocated persistent volumes have been big ones for our small corporation. Access control also, but I'm less aware of the details here, other than that our ops is happier to hand out credentials with limited access, which was somehow much more difficult with swarm
Swarm has frankly also been buggy. "Dead" but still running containers - sometimes visible to swarm, sometimes only the local Docker daemon - happen every 1-2 months, and it takes forever to figure out what's going on each time.
I see ruby kind of uses YAML often but are people comfortable editing YAML files? I always have to look up how to do arrays and such when I edit them once in a while.
Open cron.yaml and see. With schedule. Self documented.
Amazing. Every time. Even when some of my k8s battle wounds are still healing (or permanently scarred). See other replies for more info.
That’d be a great first step if the purpose is to learn Kubernetes. If, however, you want to set up a cluster for real use then you will need much more than bare bones Kubernetes (something that solves networking, monitoring, logging, security, backups and more) so consider using a distribution or a managed cloud service instead.
Note: my experience was all with cloud-provided Kubernetes, never running my own. So it was already an order of magnitude easier. Can't even imagine rolling my own. [2]
[1] My personal favorite. Truly egregious, despite how amazing k8s is. https://github.com/kubernetes/kubernetes/issues/63371#issuec...
[2] https://github.com/kelseyhightower/kubernetes-the-hard-way
I’ve deployed swarm in a home lab and found it really simple to work with, and enjoyable to use. I haven’t tried k8, but I often see view points like yours stating that k8 is vastly superior.
Edit: not sure why the down votes, I was just trying to point out what seems like a big distinction that the article is trying to make.
How about "infrastructure-as-some-sort-of-text-file-versioned-in-my-repository". It's a mouthful, but maybe it'll catch on.
The difference between code and data is pretty big.
One implies an expectation that the user is going to write some kind of algorithm whereas the other is basically a config file.
We used K8S on a large project and I felt like it really, really wasn't necessary.
So terraform creates the cluster, DNS and VPC. Then k8s runs pretty much everything.
Spin up a K8S in any major cloud provider and you get all this with a consistent API, which is where the value lies.
What is wrong with actually using Google’s hosted Kubernetes that would make you want to run compose yourself in their VMs and setup auto scaling, physical machine upgrades, etc.
But that's not the point.
k8s is the first infrastructure control plane in existence which is widely supported and standardized.
Can you get a managed docker swarm on aws, gcp, azure, do etc.? no.
How many kubernetes offerings do you know? I know at least 6.
Revolution doesn't need to be flashy and noisy.
The meaning of "app" on top of these two operating system abstractions is entirely different and the comparison probably doesn't extend beyond this. From a computing stack standpoint though, it makes sense.
Remember the first time you saw the AWS console? And the last time?
Besides, personally I find AWS console much easier to understand. I don't get why people hate it.
Most situations I have a direct comparison, k8s takes less ops. Often thanks to helm.
The AWS console is designed for lock-in. I could use configuration management for AWS too, but the time required to go through their way of doing x is just not worth it. Unless I want to become an AWS solutions architect consultant.
There was a time in between for me - that was Rightscale.
For me, the real thing that k8s bring is not hardware-infra - but reliable ops automation.
Rightscale was the first place where I encountered scripted ops steps and my current view on k8s is that it is a massively superior operational automation framework.
The SRE teams which used Rightscale at my last job used to have "buttons to press for things", which roughly translated to "If the primary node fails, first promote the secondary, then get a new EC2 box, format it, install software, setup certificates, assign an elastic IP, configure it to be exactly like the previous secondary, then tie together replication and notify the consistent hashing."
The value was in the automation of the steps in about 4 domains - monitoring, node allocation, package installation and configuration realignment.
The Nagios, Puppet and Zookeeper combos for this were a complete pain & the complexity of k8s is that it is a "second system" from that problem space. The complexity was always there, but now the complexity is in the reactive ops code, which is the final resting place for it (unless you make your arch simpler).
I'm sorry, but I can't stand this kind of bullshit. You cannot possibly take two random things, put 'modern' in front of one word and 'ancient' in front of another, to justify changing things.
The problem of Kubernetes is probably that people started drinking the microservices koolaid and now need complex solution to deploy their software that became more complex when they adopted a microservices architecture.
Today Kubernetes is the antithesis of the cloud - Instead of consuming resources on demand you're launching VMs that need to run 24/7 and have specific roles and names like "master-1". Might as well rent bare-metal servers. It will cost you less.
I knew literally nothing about k8s in September and now I have multiple clusters humming along, treating the worker cluster nodes as a generic pool of compute, autoscaling the cluster as well as the pods inside it. Upgrading is a breeze, I have great observability, I can deploy experiments and new applications with a single CI step or click, in fact I have nodes that are killed and get replaced for cost savings by SpotInst in the middle of the business day and I don't even need to know about it. My load balancers and even DNS are all provisioned for me and I can use the same Helm charts to create an identical staging and production environment.
Kubernetes IS the spirit of the Cloud and 12 factor apps. It's not that scary, and with tools like Rancher and k3s you can make it even simpler.
https://kubernetes.io/blog/2016/07/autoscaling-in-kubernetes...
Long term, but up front costs are what make cloud services appealing.
FWIW, it's possible to minimize your idle VM costs to an extent. For example, you could use one or more autoscale groups for your cluster and keep them scaled to one vm each. Then use tools like cluster auto scaler to resize on demand as your workload grows. You are correct that idle vm costs can't be completely avoided. At least not as far as I am aware.
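To make the "keep it scaled to one VM, resize on demand" idea concrete, here is a rough back-of-the-envelope sketch. It is not how the real cluster autoscaler decides anything (that one simulates scheduling and bin-packing); the function name, the millicore units and the min/max bounds are all assumptions made up for illustration.

```python
import math


def nodes_needed(pod_cpu_millicores, node_cpu_millicores, min_nodes=1, max_nodes=10):
    """Estimate node count from summed CPU requests (hypothetical helper).

    Ignores memory, bin-packing and per-node system overhead on purpose.
    """
    total = sum(pod_cpu_millicores)
    wanted = math.ceil(total / node_cpu_millicores) if total > 0 else 0
    # Never drop below the idle floor, never exceed the configured ceiling.
    return max(min_nodes, min(max_nodes, wanted))


# Idle cluster: still pays for the one-node floor.
print(nodes_needed([], 4000))
# Twenty pods requesting 500m each on 4-core nodes.
print(nodes_needed([500] * 20, 4000))
```

The floor of one node per group is exactly the idle cost the comment above says you can't fully avoid.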
But as of where we are now, it is a good abstraction to get there. It provides a lot of stuff like service discovery, auto-scaling and redundancy. Yes you do need to have instances to run K8s, but that is as of date the only abstraction that we have on all cloud providers, local virtualization, and bare metal. So yes it isn't true on demand "cloud" but in order to work like that you need to fit into your service provider's framework and accept limitations on container size, runtime, deal with warm up times occasionally.
In light of this statement, what do you make of the fact that billions of dollars are spent on EC2? And of the people who spend that money?
Oh darn, I still don’t understand. Maybe I should learn what Docker is first?
The buzzword mumbo-jumbo on the first paragraph alone (which isn't really even your fault or anything, just the bogus pomp inherent to k8s as a whole) is already a scarecrow to anyone that "wasn't born with the knowledge", really.
It is pretty hard to get used to it. Brushing it away won't make it approachable.
A single Service with Type=LoadBalancer and one Deployment may be all you need on Kubernetes if you just want all connections from the load balancer immediately forwarded directly to the service.
But if you have multiple different services/deployments that you want as accessible under different URLs on a single IP/domain, then you'll want to use Ingresses. Ingresses let you do things like map specific URL paths to different services. Then you have an IngressController which runs a webserver in your cluster and it automatically uses your Ingresses to figure out where connections for different paths should be forwarded to. An IngressController also lets you configure the webserver to do certain pre-processing on incoming connections, like applying HTTPS, before proxying to your service. (The IngressController itself will usually use a Type=LoadBalancer service so that a load balancer connects to it, and then all of the Ingresses will point to regular Services.)
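If it helps, the path-mapping part can be pictured as a longest-prefix lookup. This is only a toy model of what an ingress controller's webserver ends up doing, not actual Kubernetes or nginx code; the `Rule` type and the service names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    path_prefix: str  # e.g. "/api"
    service: str      # name of the backing Service (made-up names below)


def _matches(path: str, prefix: str) -> bool:
    """Match whole path segments: "/api" matches "/api/v1" but not "/apiary"."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(path: str, rules: list[Rule]) -> Optional[str]:
    """Return the service owning the longest matching prefix, or None."""
    candidates = [r for r in rules if _matches(path, r.path_prefix)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: len(r.path_prefix)).service


rules = [
    Rule("/", "frontend"),
    Rule("/api", "api-backend"),
    Rule("/api/admin", "admin-backend"),
]
print(route("/api/admin/users", rules))
print(route("/apiary", rules))
```

Real controllers also match on hostname and support exact and regex paths, so treat this as the shape of the idea only.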
Kubernetes should have been IPv6 only, with optional IPv4 ingress controllers.
I really hope someone takes the mantle of Leslie Lamport (creator of the language TLA - "the quixotic attempt to overcome engineers' antipathy towards mathematics") and replaces Kubernetes with some software with a first principles approach.
An ingress object creates an nginx/nginx.conf. That nginx server has an IP address which has a round robin IPVS rule. When it gets the request it proxies to a service ip which then round robins to the 10.0.0.0/8 container IP.
Ingress -> service -> pod
It is all very confusing but once you look behind the curtain it's straight forward if you know Linux networking and web servers. The cloud providers remove the requirement of needing Linux knowledge.
Looking at the docs, ingress-nginx configures an upstream using endpoints, which are essentially Pod IPs, which skips kubernetes service based round-robin networking altogether.
Assuming you use an ingress that does configure services instead, and assuming you're using a service proxy that uses ipvs (i.e. kube-proxy in default settings) then your explanation would have been correct.
For the most part, kubernetes networking is as hard as networking with loads of automation. Depth in both of those skills rarely comes together, but if you're using a popular and/or supported CNI and not doing things like changing it in-flight, your average dev just needs to learn basic k8s debugging, such as kubectl get endpoints to check whether his service selectors are set up correctly, and curl against those endpoints to check whether the pods are actually listening on those ports.
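Why an empty endpoints list usually means a selector problem is easier to see in a toy model. This is a simplified sketch of equality-based label matching, not the real endpoints controller; the pod dicts, IPs and label values are invented for the example.

```python
from __future__ import annotations


def selector_matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """A pod matches when every selector key/value pair appears in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: dict[str, str], pods: list[dict]) -> list[str]:
    """Return IPs of ready pods whose labels satisfy the selector."""
    if not selector:
        # Simplification: a Service without a selector gets no endpoints
        # populated automatically, so model that as an empty list.
        return []
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"] and selector_matches(selector, pod["labels"])
    ]


pods = [
    {"ip": "10.0.1.5", "ready": True, "labels": {"app": "web", "tier": "frontend"}},
    {"ip": "10.0.1.6", "ready": False, "labels": {"app": "web", "tier": "frontend"}},
    {"ip": "10.0.2.9", "ready": True, "labels": {"app": "db"}},
]

print(endpoints_for({"app": "web"}, pods))   # one ready pod matches
print(endpoints_for({"app": "wbe"}, pods))   # typo in the selector: nothing matches
```

A typo in one label value is enough to end up with a Service that exists but routes nowhere, which is the case the endpoints check catches.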
Is there an easier + simpler alternative?
K8S is extremely complicated for the huge swarm of webdevs and java developers that really, really don't understand how the stuff they use/code really works.
K8S was supposed to decrease the need for real sysadmins, but in my view it actually increased the demand because of all the obscure issues one can face in production if they don't really understand what they are doing with K8S and how it works under the hood.
Which I find hilarious.
Badly! That'll be $500, thanks for your business.
On a serious note, the whole stack is keeping ok-ish coherence considering the number of very different parties putting a ton of work into it.
In a few years' time it'll be the source of many war stories nobody cares about.
Also, after OpenStack, the bar for "consulting-driven software" is far from reached :)
Where is the momentum?
Hosted GKE costs the same per month as an hour of DevOps time, what's wrong with paid management for k8s?
The issue that I have with managed k8s is that these products will decrease the pressure to improve k8s documentation, tooling and setup itself. And then there's folks (like me) who want or need to run something like k8s on bare metal hardware outside of a cloud where the cloud-managed solution isn't available.
As a relatively noob sysadmin, I liked it a lot. Easy to deploy and easy to maintain. We've got a lot of mixed rented hardware + cloud VPS, and having one layer to unify them all seemed great.
Unfortunately I had a hard time convincing the org to give it a serious shot. At the crux of it, it wasn't clear what 'production ready' Nomad should look like. It seemed like Nomad is useless without Consul, and you really should use Vault to do the PKI for all of it.
It's a bit frustrating how so many of the HashiCorp products are 'in for a penny, in for a pound' type deals. I know there's _technically_ ways for you to use Nomad without Consul, but it didn't seem like the happy path, and the community support was non-existent.
Please tell me why I'm wrong lol, I really wanted to love Nomad. We are running a mix of everything and it's a nightmare
Consul by itself is the game-changer. Even in k8s it's a game-changer. It solves so many questions in an elegant way.
"How do I find and reach the things running in (orchestrator) with (unknown ip/random port) from (legacy)?" being the most important. You run 5 servers, and a relatively lightweight client on everything (which isn't even outright required, but it sure is useful!), and you get a _lot_ with that.
Consul provides multiple interfaces and ingress points to find everything. It also is super easy to operate, and has a pretty big community.
If you absolutely cannot have Consul, Nomad is still a really good batch job engine, and makes a very great "distributed cron," which is more extensible, scalable, and easy to use than something like Jenkins for the same task.
My team is pretty small (was 4 people, now 6) and we manage one of the world's largest nomad and consul clusters (there are some truly staggeringly large users of Vault so I won't make that claim). Even when shit really hits the fan, everything is designed in a way that stuff mostly works; and there's enough operator friendly entry points that we can always figure out the problem.
Edit: just noticed an actual Nomad user replied as well, and I like their answer better. Consider mine an addendum. :)
Batch workloads rarely require Consul, but for deploying your standard network services on Nomad: Consul is basically required. You could likely use any number of service mesh systems instead (either as sidecars, Docker network plugins, or soon CNI), but you'll be doing a lot of research and development on your own I'm afraid.
The Nomad team is by no means opposed to becoming more flexible in the future (and indeed better CNI support is landing soon as a first step), but we wanted to focus on getting one platform right and a pleasure to use before trying to genericize and modularize it.
I think a Distributed OS is the only sane solution. Build the features we need into the kernel and stop futzing around with 15 abstractions to just run an isolated process on multiple hosts.
Linux (and the BSDs) are remarkably stable, featureful, and resilient operating systems. I would hate to give up such a strong foundation. Nomad can crash without affecting your running services. Nomad can be upgraded or reconfigured without affecting your running services. Nomad can be observed, developed, and debugged as a unit often without having to consider the abstractions that sit above or below it. The right number of abstractions is a beautiful thing. Just no more and no less. :)
Well sure, but if the story just ended with "everyone use the least exciting tool", then there'd be few articles for tech journals to write.
But Kubernetes promises so much, and deep down everyone subtly thinks "what if I have to scale my project?" Why settle for good enough when you could settle for "awesome"? It's just human nature to choose the most exciting thing. And given that I do agree that there's some manufactured hype around Kubernetes, it isn't surprising to me why few are talking about Nomad.
Isn't the most popular k8s case to deploy Docker images still though?
A lot of the Kubernetes "cool kids" just run containerd instead of Docker. Docker itself also runs containerd, so when you're using Kubernetes with Docker, Kubernetes has to basically instruct Docker to set up the containers the same way it would if it were just talking to containerd directly. From a technical perspective, you're adding moving parts for no benefit.
If you use containerd in your cluster, you can then use Docker to build and push your images (from your own or a build machine), but pull and run them on your Kubernetes clusters without Docker.
But deploy to k8s? There's no docker outside of a few bits involving "how to get to the image", and the actual docker features that are used are also minimized. The result is that many warts of docker are completely bypassed and you don't have to deal with the impact of legacy decisions, or try to wrangle a system designed for easy use by a developer on a local machine into a complex server deployment. And, IMHO, interfaces used by k8s for the advanced features are much, much better than interfaces used or exported by docker.
What k8s brings to the table is a level of standardization. It's the difference between bringing some level of robotics to manual loading and unloading of classic cargo ships, vs. the fully automated containerized ports.
With k8s, you get structure where you can wrap an individual program's idiosyncrasies into a container that exposes a standard interface. This standard interface allows you to then easily drop it into a server, with various topologies, resources, networking etc. handled through common interfaces.
I said that for a long time before, but recently I got to understand just how much work k8s can "take away" when I foolishly said "eh, it's only one server, I will run this the classic way." Then I spent 5 days on something that could be handled within an hour on k8s, because k8s virtualized away HTTP reverse proxies, persistent storage, and load balancing in general.
Now I'm thinking of deploying k8s at home, not to learn, but because I know it's easier for me to deploy nextcloud, or an ebook catalog, or whatever, using k8s than by setting up more classical configuration management system and deal with inevitable drift over time.
I think the build, deploy, start and run-time split is an important aspect that gets overlooked quite a bit, and is critical to evaluating tools at this point. That is why we aren't still doing everything with Chef or Puppet. Whether we continue doing it with Kubernetes or Pulumi or something else matters a bit less.
Repeatability is not the goal, as others in this thread have implied. The goal is trusting that the button will work when you push it. That if it doesn't work, you can fix it, or find someone who can. Doing that without repeatability is pretty damned hard, certainly, but there are ways to chase repeatability without ever arriving at the actual goal.
can't you do that just with containers?
Are we talking about k8 base on your own server rack at your house?
Kubernetes is one way to deploy containers. Configuration systems like Ansible/Salt/Puppet/Chef/etc are another way to deploy containers.
Kubernetes also makes it possible to dynamically scale your workload. But so do Auto Scaling Groups (AWS terminology) and GCP/Azure equivalents.
The reality is that 99% of users don't actually need Kubernetes. It introduces a huge amount of complexity, overhead, and instability for no benefit in most cases. The tech industry is highly trend driven. There is a lot of cargo culting. People want to build their resumes. They like novelty. Many people incorrectly believe that Kubernetes is the way to deploy containers.
And they (and their employers) suffer for it. Most users would be far better off using boring statically deployed containers from a configuration management system. Auto-scaled when required. This can also be entirely infrastructure-as-code compliant.
Containers are the real magic. But somehow people confused Kubernetes as a replacement for Docker containers, when it was actually a replacement for Docker's orchestration framework: Docker Swarm.
In fact, Kubernetes is a very dangerous chainsaw that most people are using to whittle in their laps.
So many people miss this. k8s is a very complex system and the talent it takes to manage it well, rare.
Extremely rare.
I've never used it but might find myself using it in the future.. can you elaborate on these a bit? I'm curious what the pitfalls might be
As far as I can tell: those are imperative. At least in some areas.
Kubernetes is declarative. You mention the end state and it just "figures it out". Mind you, with issues sometimes.
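The "figures it out" part is basically a reconcile loop: diff the desired state against what is actually running and act on the difference. A minimal sketch, assuming state is just a map of workload name to replica count (the real controllers are far more involved, and the names here are made up):

```python
from __future__ import annotations


def plan(desired: dict[str, int], actual: dict[str, int]) -> list[tuple[str, str, int]]:
    """Compute the actions needed to move actual replica counts to desired."""
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if want > have:
            actions.append(("start", name, want - have))
        elif want < have:
            actions.append(("stop", name, have - want))
    return actions


def reconcile(desired: dict[str, int], actual: dict[str, int]) -> dict[str, int]:
    """Apply the plan to a copy of the actual state and return the result."""
    state = dict(actual)
    for verb, name, count in plan(desired, state):
        state[name] = state.get(name, 0) + (count if verb == "start" else -count)
    # Drop workloads that were scaled down to nothing.
    return {name: n for name, n in state.items() if n > 0}


desired = {"web": 3}
actual = {"web": 1, "old-job": 2}
print(plan(desired, actual))
print(reconcile(desired, actual))
```

Running `plan` again on the reconciled state returns an empty list, which is the property that makes it safe to run over and over.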
All abstractions leak. Note that k8s's adamance about declarative configuration can make you bend over backwards. Example: running a migration script post deploys. Or waiting for other services to start before starting your own. Etc.
I think in many ways, those compete with Terraform which is "declarative"-ish. There's very much a state file.
I would be somewhat surprised to find out Puppet and Chef weren't declarative either. Because setting up a system in an imperative fashion is ripe for trouble. You may as well use bash scripts at that point.
I've used Ansible for close to 10 years for hobby projects. And setting up my development environment. Give me a freshly installed Ubuntu laptop, and I can have my development environment 100% setup with a single command.
But basic versions of these things are provided by Kubernetes natively and can be declared in a way that is divorced from configuring the underlying software. So you just learn how to configure these broader concepts as services or ingresses or network policies, etc, and don't worry about the underlying implementations. It's pretty nice actually.
Kubernetes isn't a silver bullet of course, there will be applications where running them in containers adds unnecessary complexity, and those are best run in a VM managed by a CM tool. I'd argue using k8s is a safe default for deploying new applications going forward.
Unlike Ansible (and I suspect the others) where it's really only more of a 'run once' type of thing... And sometimes if you try running it a second time it won't even succeed.
What k8s really scales is the developer/operator power. Yes, it is complex, but pretty much all of it is necessary complexity. At small enough scale with enough time, you can dig a hole with your fingers - but a proper tool will do wonders to how much digging you can do. And a lot of that complexity is present even when you do everything the "old" way, it's just invisible toil.
And a lot of the calculus changes when 'managed services' stop being cost effective or aren't an option at all, or you just want to be able to migrate elsewhere (that can be at low scale too, because of being price conscious).
Then the question becomes "Is K8 more of a shovel or an excavator". I think it's fair to say that it's more the latter.
Sure, managed service costs are certainly a thing, but to my point that only really starts to become an issue at significant scale, assuming you're well configured.
That's not actually simple at all, and you would need to build a lot of the other stuff that Kubernetes gives you for free.
Kubernetes gives you an industry standard platform with first-class cloud vendor support. If you roll your own solution with ECS, what you are really doing is making a crappy in-house Kubernetes.
Previously we had to deploy a lot of monitoring on each VM to ensure that containers are running, so that we get alerted when one of the applications crashed and didn't restart because the Docker daemon didn't handle it, etc.
Now, we only run stateless services, in a private VPC subnet, Load balancing is delegated to ALB, we don't need service discovery, meshes etc. Configuration is declarative, but written in much friendlier HCL (I'm ok with YAML, but to a degree). ECS just works for us.
Just like K8S might work for a bigger team, but I wouldn't adopt it at our shop, simply because of all of the complexity and huge surface area.
And we only have to learn one complex system and avoid learning each cloud, one of which decided that product names with little relation to what they do were a good idea
Not this one. :)
I run Docker on Windows Containers, no Linux required.
There are also the ugly named serverless deployments, where the kernel is meaningless.
You need some system mediating between people doing deployments and actual root access in both cases. The "docker" command is just as privileged as "apt-get install." I have always been behind some kind of API or web UI even in docker environments.
Currently mainstream k8s is text based, because it's still too fast moving and new. Creating a great GUI would be a serious overhead and there's not enough interest/demand for it. It'll come eventually.
Last year, my bare metal website had 99.995% uptime. Heroku only managed 99.98%.
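For scale, those percentages translate into downtime like this. Plain arithmetic, assuming a 365-day year; the function name is just for illustration.

```python
def downtime_minutes_per_year(uptime_percent, days=365):
    """Convert an uptime percentage into minutes of downtime per year."""
    minutes_in_year = days * 24 * 60
    return minutes_in_year * (100 - uptime_percent) / 100


for label, uptime in [("bare metal", 99.995), ("Heroku", 99.98)]:
    print(label, round(downtime_minutes_per_year(uptime), 2), "minutes/year")
```

So roughly 26 minutes a year versus roughly 105 minutes a year.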
Of course, I could further reduce risk by having a hot standby server. But I'm not sure the costs for that are warranted, given the extremely low risk of that happening.
Anything new needs a certain amount of sustained practice before you get the hang of it. I think I had to learn regex like four times before it stuck. I haven’t hit that point with TOML yet so I avoid it.
I’d suggest using the deeper indentation style where hyphens for arrays are also indented two spaces under the parent element. Like anything use a linter that enforces unambiguous indentation.
I prefer YAML for human-writeable config because JSON is just more typing and more finicky. The auto-typing of numbers and booleans in YAML is a pretty damn sharp edge though and I wish they’d solved that some other way.
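For anyone who hasn't been bitten by it yet, here is a rough sketch of what that auto-typing does. This is not a YAML parser, just a simplified stand-in for the YAML 1.1 rules for plain (unquoted) scalars. `resolve_plain_scalar` is a made-up name, real parsers cover more cases, and YAML 1.2 parsers treat fewer words as booleans.

```python
import re


def _cases(words):
    # YAML 1.1 accepts lower, Capitalized and UPPER spellings of these words.
    return {v for w in words for v in (w, w.capitalize(), w.upper())}


# Subset of the words YAML 1.1 resolves to booleans when left unquoted.
_TRUE = _cases(["yes", "true", "on"])
_FALSE = _cases(["no", "false", "off"])
_INT = re.compile(r"[-+]?(0|[1-9][0-9]*)$")
_FLOAT = re.compile(r"[-+]?[0-9]+\.[0-9]*$")


def resolve_plain_scalar(text: str):
    """Simplified stand-in for YAML 1.1 implicit typing of unquoted scalars."""
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    if _INT.match(text):
        return int(text)
    if _FLOAT.match(text):
        return float(text)
    return text  # everything else stays a string


# Values a human probably meant as strings (country code, version numbers):
for raw in ["NO", "on", "1.10", "v1.10"]:
    print(f"{raw!r} -> {resolve_plain_scalar(raw)!r}")
```

The country code turns into a boolean and the version `1.10` quietly becomes the float `1.1`, which is why quoting such values (or linting for them) is the usual workaround.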
Maintaining a cluster set up like that is a ton of work. And if you don’t perform an upgrade perfectly, you’ll have downtime. Tools like kops help a lot but you’ll still spend far more time than the $70/month it costs for a managed cluster.
Honestly, as an ops guy, I would prefer to get up at 3AM to deal with a failed VM or load balancer, compared to dealing with any kind of failure in a Kubernetes cluster at 10AM.
I can understand wanting to be able to deploy to Kubernetes, it’s extremely flexible and relatively easy. But managing and debugging Kubernetes is still a nightmare, even just monitoring it correctly isn’t exactly easy.
Yes, higher-level tools like Kustomize or Jsonnet or whatever else you use for templating the files are Turing-complete - but that's at the level of you on your machine generating input to Kubernetes, not at the level of Kubernetes itself. That's a valuable distinction - it means you can't have a Kubernetes manifest get halfway through and fail the way that you can have an Ansible playbook get halfway through and fail; there's no "halfway." If something fails halfway through your Jsonnet, it fails in template expansion without actually doing anything to your infrastructure.
(You can, of course, have it run out of resources or hit quota issues partway through deploying some manifest, but there's no ordering constraint - it won't refuse to run the "rest" of the "steps" because an "earlier step" failed, there's no such thing. You can address the issue, and Kubernetes will resume trying to shape reality to match your manifest just as if some hardware failed at runtime and you were recovering, or whatever.)
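To make the "no halfway" point concrete, here is a toy reconciliation loop. It is only a sketch of the idea, not how Kubernetes controllers are actually implemented; `reconcile`, the resource names and the quota rule are all invented for illustration.

```python
def reconcile(desired: dict, actual: dict, can_create) -> list:
    """One pass: move `actual` toward `desired`, with no ordering between items.

    A failure on one resource does not stop the others from being handled.
    Returns the names that are still pending after this pass.
    """
    for name in list(actual):
        if name not in desired:
            del actual[name]  # remove what is no longer declared
    pending = []
    for name, spec in desired.items():
        if actual.get(name) == spec:
            continue  # already matches, nothing to do
        if can_create(name, spec):
            actual[name] = spec
        else:
            pending.append(name)  # simply retried on the next pass
    return pending


desired = {"web": {"replicas": 3}, "worker": {"replicas": 2}, "cache": {"replicas": 1}}
actual = {"old-job": {"replicas": 1}}
quota = {"limit": 4}  # hypothetical quota: at most 4 replicas in total


def within_quota(name, spec):
    # Counts replicas already present in the module-level `actual` dict.
    used = sum(s["replicas"] for n, s in actual.items() if n != name)
    return used + spec["replicas"] <= quota["limit"]


first = reconcile(desired, actual, within_quota)
print(first)            # "worker" hit the quota; "cache" was still created
quota["limit"] = 10     # the operator addresses the quota issue
second = reconcile(desired, actual, within_quota)
print(second, actual == desired)
```

Note that "cache" is created even though "worker" failed before it in the loop, and the second pass just picks up where reality and the manifest still differ.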
A B2B app that has at most 10 concurrent requests can run on the smallest EC2 instance whether it's written in PHP or written in C++.
Bandwidth doesn't change with language choice and neither does the storage requirements of your app so those billable items don't come into the equation either.
So the CPU cost can effectively be dropped off of 95% of the apps out there today. At that point your main variable cost between C++ and something like PHP/Javascript is going to be the cost of development. All I can say to that is that it's a lot harder to find developers who can write C++ web apps at the same pace as developers slinging PHP for web apps. There is a reason Facebook uses a PHP derivative for huge portions of its web backend.
So maybe the question we should ask ourselves is: why isn’t there a smaller, cheaper EC2 instance (or any other provider than AWS) ?
This industry is tailoring the levels. Of course it’s understandable because, well, they live on it. And they count on small instances to share hardware resources to overbook said hardware.
And I don’t blame them for that, I’m doing the same on my own bare metal servers, hosting multiple websites for clients and making money on it.
But I have the feeling that there is a lot of resource loss somewhere in it, just for the sake of losing it because it’s easier. Maybe I’m wrong.
AWS LightSail.
... also the t1.micro is small and cheap. Could you give some concrete numbers?
The problem as you go cheaper is the cost of ipv4 addresses.
The domain I'm working in might be non-representative, but for me fixing my shit systematically means switching from C++ to Rust. The problems the borrow checker addresses come up all time either in the form of security bugs (because humans are not good enough for manual memory management without a lot of help) or in the form of bad performance (because of reference counting or superfluous copies to avoid manual memory management).
But otherwise I agree with you that if we never put in the effort to polish our current tools, we'll only ever get the next 80%-ready solution out of the hype train.
However, it doesn't matter how much modern we make C++, if you don't fully control your codebase, there is always going to exist that code snippet written in C style.
"Frankly, it’s been tough to convince the largest enterprises that a public grid represents an attractive future. Just as I’m sure George Westinghouse was confounded by the Chief Electricity Officers of the time that resisted buying power from a grid, rather than building their own internal utilities."
https://jonathanischwartz.wordpress.com/2006/03/20/the-netwo...
> Long term, but up front costs are what make cloud services appealing.
There are no up front costs. GP said rent dedicated, not buy your own metal. If there's anything in cloud it's the many pre-written services (queue, database etc.) but GP is right: if you go k8s you aren't going to use many of them, if any at all, so why not just go and rent cheap servers that get deployed in two minutes instead of renting expensive virtual servers which get deployed in a few seconds?
By the way, what does ansible do to help with scaling applications?
My org looked at Nomad at a time when there was a lot of pressure from above to deliver something as soon as possible. Two weeks just weren't enough to get the full lay of the land ¯\_(ツ)_/¯
Funny thing is even if I could plug in my own service discovery into Nomad, I would probably chuck it away and replace it with Consul after a few weeks anyway haha
1st Question : Define k8s network, in detail, with all of the services and a set of services
IF you make it out of that one and the follow ups we can move on to the rest
So we end up with a plethora of full stack developers who can barely keep up with their current development stacks willfully deploying their software on systems that they're just barely competent with.
I know this because I almost deployed a side project with Kubernetes because it was expected of me despite the fact that being mediocre at it was the best that I could hope to become and that's an easy way to chop off a leg or three.
The cost metrics that make "it's cheaper to use managed service than pay the cost of extra engineer to specialize in infrastructure" aren't universal. In fact, I usually have to work from the opposite direction, where hiring a senior Ops specialist who can wrangle everything from shelving the physical hw to network booting k8s cluster on-premises can be cheaper than Heroku/AWS/etc.
I would argue that relying on docker hiding public visibility of your internal components is akin to using a mobile phone as a door-stop - it'll probably work but there are more appropriate (and auditable) tools for the job.
But, yaml is now everywhere in the ops space. Config management systems use it, metrics systems use it, it's the de facto configuration format right now and that is unfortunate because it's bad.
I've had clusters running for years without issue. I've even used it for packaging B2B software, where customers use it both in cloud and on-prem - no issues whatsoever.
I've looked at k8s a few times, but it's vastly more complex than Swarm (which is basically Docker Compose with cluster support), and would add nothing for my use case.
I'm sure a lot of people need the functionality that k8s brings, but I'm also sure that many would be better suited to Swarm.
If K8s supported compose scripts out of the box (not Kompose) that'd basically make Swarm unnecessary (at least for me)
Switched to k8s in late 2017 and it’s been much more solid. And that’s where the world has moved, so I’m not sure why you’d choose swarm anymore.
I would argue that Kubernetes has _nothing whatsoever_ to do with the spirit of the Cloud, and that in fact "serverless" embodies the spirit of pay-as-you-use consumption models.
Although, managed kubernetes clusters let you auto-scale the cluster itself, so i think the GP is wrong.
This tool allows you to autoscale the cluster itself with various cloud providers. There is a list of cloud providers it supports at the end of the readme.
I'm assuming your team is using vault for PKI, but is there a similarly happy path for issuing certs without Vault?
I started off just using `openssl` but it all felt very janky, and I didn't really have any idea how CRLs should be set up
For now, we have CRLs disabled on all short-lived backends, enabled on long-lived backends and we're actually looking at disabling storing short-lived certs in the storage system at all, and just cranking the TTL down to really truly short. We've tested it as low as 30m, but a more real-world max-ttl is 1 week, with individual apps setting it as low as they can handle. For reference we run more than 10 PKI backends, and adding one (or a bunch) more is just a little terraform snippet for us.
The way it works via hashicorp template land, is that you just plop
{{ with secret "name-of-pki/issue/name-of-role" "common_name=my.allowed.fqdn" "ttl=24h" }} {{ .Data.certificate }} {{ end }}
into your Nomad template stanza, or use consul-template directly as a binary, or use vault agent with its template capability. You can get the CA chain if required the same way, just hitting a different PKI endpoint. Also, as of Vault 1.4, Vault's internal raft backend is now production ready, making it a snap to run.
Try running through a few of the Vault quick-start guides, and replicating them in Terraform as much as possible. There's a few things TF does not handle gracefully last I checked (initial bootstrap), but you can get around that by using a null_resource or just handling that outside Terraform.
systemd-nspawn / machined makes the other systems look like very complicated solutions in search of a problem
Name may not be pretty but it's an official feature of systemd which is used to debug the systemd development and it is far easier to take backups incrementally because the container files are just plain files in /var/lib/machines/ and apparently you already have it if systemd is on your system. (May need an additional package to be installed from OS package repo.)
I run nspawn instances as development environments for developers and I can also run docker inside it.
I get a lot of mileage serving a lot of content for several domains on a free Google Cloud Platform f1 micro instance. I also prefer GCP when I need a lot of compute for a short time.
Hetzner has always been my choice when I need more compute for a month or two. For saving money for VPSs OVH and DO have also been useful but I don’t use them very often.
The only problem with these Linux based packaging for deployments are Mac users and their dev environment. Linux users are usually fine, but there always had to be some Docker like setup for Mac users.
If we could say that our servers run on Linux and all users run on some Linux (WSL for Windows users) then deployments could have been simple and reproducible rpm based deployments for code and rpm packages containing systemd configuration.
Complete breeze and no need for Docker or K8s.
dnf is a frontend to rpm, snap is not common for server use-cases, nix is interesting but not common, dpkg is a tool for installing .deb.
The older I get, the more I realise how little I want in my stacks.
This article listed as a benefit, frequent, multiple major updates each year, new features and no sign of it slowing down. I just cringed and wondered who the fuck is asking for this headache?
I've been working a lot with WordPress lately and the stability of the framework is spoiling me rotten.
And then one day I decided to set up kubernetes as a learning experiment. There is definitely some learning curve about making sure I understood what a deployment, replicaset, service, pod or ingress was, and how to properly set them up for my environment. But now that I have that, adding a new app to my cluster and making it accessible is super low effort. I have previous yaml files to base my new app's config on.
It feels like the only reason not to use it would be learning curve and initial setup... but after I overcame the curve, it's been a much better experience than trying to orchestrate containers by hand.
Perhaps this is all doable without kubernetes, and there is a learning curve, but it's far from the complicated nightmare beast everyone makes it out to be (from the user side, maybe from the implementation details side)
It would mean I removed ~20% of the things that were annoying me and left 80% still to solve, while kubernetes goes 80% for me with the remaining 20% being mostly "assemble these blocks".
Plus, a huge plus of k8s for me was that it abstracted away horrible interfaces and behaviours of docker daemon and docker cli.
So there's this problem and a number of experiments are going on. One camp has the idea of wrapping data / config in more code. These are your Pulumi and Darklang like systems. Then there is another camp that says you should wrap code in data and move away from programming, recursion, and Turing completeness. This seems like the right way to me for a lot of reasons, both technical and human-centric.
I've pivoted my company (https://github.com/hofstadter-io/Hof) to be around and powered by Cue. Of the logical camp, it is by far going to be the best and comes from a very successful lineage. I'm blown away by it like when I found Go and k8s.
If AWS goes belly-up then we use terraform to make the equivalent on whatever new cloud provider pops up and keep going about our day.
No vendor lock-in
It's confusing because a lot of people being exposed to K8s don't necessarily know how Linux networking and web servers work. So there is a mix of terminology (services, ingress, ipvs, iptables, etc) and context that may not be understood if you didn't come from running/deploying Linux servers.
The issue was mainly with the admins not provisioning enough capacity, but for us devs it was fucking magical.
I currently use exoframe with docker-compose files, and it's fantastic.
Yeah, another tool, I know.
It's hard to make tests maintainable. Doubly so if you aren't already versed in techniques to make code maintainable.
I wonder sometimes if we aren't repeating the same experiment with ops right now.
Mainly just kustomize piped into kube apply.
But, but, but. Having to create a one-off database migration script imperatively.
No it's not. You can use it to run a bunch of monoliths too. K8s provides a common API layer that all of your organisation can adhere to. Just like containers are a generic encapsulation of any runnable code.
I can leave my current job, jump into a new one and start providing value within less than a couple of days. Compare that to spending weeks if not months trying to understand their special snowflake of an infrastructure solving the same problems already solved a million times before.
No it doesn't. Let's assume you write that application as a C++ monolith. Congratulations, you now have source code that could potentially serve 10k users on a toaster... If only you could get it onto that toaster. How are you going to start the databases it needs? How are you going to restart it when it crashes, or worse: When it still runs but is unresponsive. How are you going to upgrade it to a new version without downtime? How are you going to do canary releases to catch bugs early in production without affecting all users? How do you roll back your infrastructure when there is an issue in production? How do you notice when your toaster server diverges from its desired state? How do you handle authorization to be compliant with privacy regulations? I'd love to see that simple and safe shell script of yours which handles all those use cases. I'm sure you could sell it for quite a bit of money.
What you fail to understand is that k8s never was about efficiency. Your monolith may work at 10k users with a higher efficiency but it can never scale to a million. At some point you can't buy any bigger toasters and have no choice but to make a distributed system.
Besides, microservice vs monolith is orthogonal to using k8s.
I use Convox [1] which makes everything extremely simple and easy to set up on any Cloud provider. They have some paid options, but their convox/rack [2] project is completely free and open source. I manage everything from the command-line and don't use their web UI. It's just as easy as Heroku:
convox rack install aws production
convox apps create my_app
convox env set FOO=bar
convox deploy
You can also run a single command to set up a new RDS database, Redis instance, S3 bucket, etc. Convox manages absolutely everything: secure VPC, application load balancer, SSL certificates, logs sent to CloudWatch, etc. You can also set up a private rack where none of your instances have a public IP address, and all traffic is sent through a NAT gateway: convox rack params set Private=true
This single command sets up HIPAA and PCI compliant server infrastructure out of the box. Convox automatically creates all the required infrastructure and migrates your containers onto new EC2 instances. All with zero downtime. Now, all you need to do is sign a BAA with AWS and make sure your application and company comply with regulations (access control, encryption, audit logs, company policies, etc.)
I run a simple monolithic application where I build a single Docker image, and I run this in multiple Docker containers across 3+ EC2 instances. This has made it incredibly easy to maintain 100% uptime for over 2 years. There were a few times where I've had to fix some things in CloudFormation or roll back a failed deploy, but I've never had any downtime.
My Docker images would be much smaller and faster if I built my backend server with C++ or Rust instead of Ruby on Rails. But I would absolutely still package a C++ application in a Docker image and use ECS / Kubernetes to manage my infrastructure. I think the main benefit of Docker is that you can build and re-use consistent images across CI, development, staging, and production. So all of my Debian packages are exactly the same version, and now I spend almost zero time trying to debug strange issues that only happen on CI, etc.
So now I already know I want to use Docker because of all these benefits, and the next question is just "How can I run my Docker containers in production?". Kubernetes just happens to be the best option. The next question is "What's the easiest way to set up Docker and Kubernetes?" Convox is the holy grail.
The application language or framework isn't really relevant to the discussion.
[2] https://github.com/convox/rack
P.S. Things move really fast in this ecosystem, so I wouldn't be surprised if there are some other really good options. But Convox has worked really well for me over the last few years.
* Effortlessly achieve 100% uptime with rolling deploys
* Running a single command to spin up a new staging environment that is completely identical to production
* Easily spinning up identical infrastructure in a different AWS region (Europe, Asia, etc.)
* Easily spinning up infrastructure inside a customer's own AWS or Google Cloud account for on-premise installations
* Automatic SSL certificates for all services. Just define a domain name in your Convox configuration, and it will automatically create a new SSL certificate in ACM and attach it to your load balancer.
* Automatic log management for all services
* Very easily being able to set up scheduled tasks with a few lines of configuration
* Being able to run some or all of my service on AWS Fargate instead of EC2 with a single command
* Ease of deploying almost any open source application in a few minutes (GitLab, Sentry, Zulip Chat, etc.)
If we only got to keep two tools it would be kubernetes and terraform.
This isn’t a competition, they are tools. Ansible is widely used and will continue to be so for a long long time. Its foundations - ssh, python and yaml are also in for the long run to manage infrastructure...
If I understand this correctly, all of the things could have been automated in AWS fairly easily.
"If the primary node fails" Health check from EC2 or ELB.
"get a new EC2 box" ASG will replace host if it fails health check.
"format it" The AMI should do it.
"install software, setup certificates" Userdata, or Cloud-init.
"assign an elastic IP, configure it to be exactly like the previous secondary, then tie together replication and notify the consistent hashing" This could be orchestrated by some kind of SWF workflow if it takes a long time or just some lambda function if it's within a few mins.
What do we want to deploy, okay, stop monitoring/alerts, okay, flip the load balancer, install/copy/replace the image/binary, restart it, flip LB, do the other node(s), keep flipping monitoring/alerts, okay, do we need to do something else? Run DB schema change scripts? Oh fuck we forgot to do the backup before that!
Also now we haven't started that dependent service, and so we have to rollback, fast, okay, screw the alerts, and the LB, just rollback all at once.
And sure, all this can be scripted, run from a laptop. But k8s is basically that.
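For what it's worth, the scripted version of that sequence tends to look something like the sketch below. It is a simulation only: nodes are a plain dict, `healthy` is a caller-supplied check, and the load balancer and alerting steps are left as comments because they depend entirely on your setup.

```python
def rolling_deploy(nodes: dict, new_version: str, healthy) -> bool:
    """Upgrade nodes one at a time; roll everything back on the first failure.

    `nodes` maps node name -> running version. `healthy` is a hypothetical
    check run after each node is upgraded.
    Returns True if the rollout completed, False if it was rolled back.
    """
    previous = dict(nodes)  # remember what to roll back to
    for name in list(nodes):
        # In a real script: silence alerts and take the node out of the LB here.
        nodes[name] = new_version
        if not healthy(name, new_version):
            nodes.update(previous)  # roll back all nodes at once
            return False
        # ...then put the node back behind the LB and re-enable its alerts.
    return True


fleet = {"node-a": "1.0", "node-b": "1.0", "node-c": "1.0"}
ok_first = rolling_deploy(fleet, "1.1", lambda name, version: True)
print(ok_first, fleet)   # rollout succeeds, every node on 1.1
ok_second = rolling_deploy(fleet, "1.2", lambda name, version: name != "node-b")
print(ok_second, fleet)  # node-b fails its check, every node back on 1.1
```

Everything this leaves out (backups before schema changes, dependent services, partial failures during the rollback itself) is where the hand-rolled version usually starts to hurt.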
...
And we get distributedness very fast: as soon as you have 2+ components that manage state, you need to think about consistency. Even a simple cache is always problematic (as we all know from the cache invalidation joke).
Sure, going all in on microservices just because is a bad idea. Similarly k8s is not for everyone, and running DBs on k8s isn't either.
But, the state of the art is getting there. (eg the crunchydata postgresql operator for k8s.)
This.
I was in a company where devops was just a fancy marketing term, developers would shit out a new release and then it was our problem (we the system engineers / operations people) to make it work on customers' installations.
I now work as a devops engineer in a company that does devops very well. I provide all the automation that developers need to run their services.
They built it, they run it.
I am of course available for consultation and support with that automation and kubernetes and very willing to help in general, but the people running the software are now the people most right for the job: those who built it.
As I said in my other comment: it's really about fixing the abstractions and establishing a common lingo between developers and operations.
If you want to be a successful indie company, avoid cloud and distributed like the plague.
If you want to advance in the big corp career ladder, use Kubernetes with as many tiny instances and micro-services as you can.
"Oversaw deployment of 200 services on 1000 virtual servers" sounds way better than "started 1 monolithic high-performance server". But the resulting SaaS product might very well be the same.
Php under apache.
All this docker and k8s stuff just feels like reinventing application servers, just 10x more complex as means to sell consulting services.
However, their learning curve is pretty steep (particularly Rust) and most developers don’t enjoy having to worry about low-level issues, which makes recruitment and retention a problem. Whereas one can be reasonably proficient with Python/Ruby in a week, Java/C# is taught in school, and everyone has to know JS anyway (thanks for nothing, tweenager Eich), so it’s easy to pick up manpower for those.
Disclaimer: Neither do I claim to be a very good developer, nor do I think you, the reader, are only average. Just the fact that you are reading Hackernews is a strong indicator of your interest in reflection and self improvement, regardless of your favorite language.
So big companies might be forced to settle for less skilled developers, simply because the top tier is doing their own thing. I assume that's also why acqui-hiring is a thing.
For every time I have to deal with k8s I deeply miss application servers.
Within our teams, we’ve found we can do with an (even) higher level of abstraction by running apps directly on PaaS setups. We found this sufficient for most of our use-cases in data products.
What Nomad doesn’t do is setup a cloud provider load balancer for you.
For persistent storage, Nomad uses CSI which is the same technology K8s does: https://learn.hashicorp.com/nomad/stateful-workloads/csi-vol...
Logging should be very similar to K8S. Both Nomad and K8S log to a file and a logging agent tails and ships the logs.
Disclosure, I am a HashiCorp employee.
Kinda feels bad that I don't have anything to use it on right now.
Thinking about completing my Hashicorp Bingo card.
Nomad, or rather, a Nomad/Consul/Vault stack doesn't have these things included. You need to go and pick a consul-aware loadbalancer like traefik, figure out a CSI volume provider or a consul-aware database clustering like postgres with patroni, think about logging sidecars or logging instances on container hosts. Lots of fiddly, fiddly things to figure out from an operative perspective until you have a platform your development can just use. Certainly less of an out-of-the-box experience than K8.
However, I would like to mention that K8 can be an evil half-truth. "Just self-hosting a K8 cluster" basically means doing all of the shit above, except its "just self-hosting k8". Nomad allows you to delay certain choices and implementations, or glue together existing infrastructure.
K8 requires you to redo everything, pretty much.
- Yes, just added CSI plugin support. Previously had ephemeral_disk and host_volume configuration options, as well as the ability to use docker storage plugins (portworx)
- I haven’t personally played with it, but apparently nomad does export some metrics, and they’re working on making it better
I run a monolithic ensemble that abstracts away the concept of multiple processes to deliver a unified API.
In short, it's multithreaded.
I have actually done a "lift and shift" where we moved code that had no support for k8s, or was directly antagonistic to it, because various problems reached the point where the CEO said "replace the old vendor completely" - we ended up using k8s to wrestle with the amount of code to redeploy.
Honestly the last time I looked at k8s was like 5 years ago, but back then it looked like a pretty big pita to admin.
It is a completely different world that stretches far beyond Kubernetes, though I attribute much of the change to what has happened from / around k8s -> cncf
It's so easy, I can launch production level clusters in 15 minutes with four keystrokes and make backups and restore to new ephemeral clusters with a few more simple commands
https://github.com/hofstadter-io/jumpfiles
(I'll be pushing these updates this weekend, haven't slept in 24 hours as reworked everything to be powered by https://cuelang.org )
- well it's also a pita to update services without a downtime.
- and it sucks to update operating systems without a downtime.
- sometimes you reinvent the wheel, when you add another service or even a new website
however with k8s everything above is kinda the same, define a yaml file, apply it, it works.
and also k8s itself can be managed via ansible/k3s/kops/gke/kubeadmin/etc... it's way easier to create a cluster and manage it.
Containers are a standard abstraction over the operating system, not over the hardware (or the VM, even). This has its use cases, but making it “the standard” for deployment of all apps and workloads is just bananas, in my view.
Again, Kubernetes is far more than just deploying, running, and scaling an application. It allows so many problems to be solved at the system level, outside of an application and the developers' awareness.
Take for example restricting base images at your organization. With Kubernetes, SecOps can install an application which scans all incoming jobs and either rejects them, or in more sophisticated setups, hot swaps the base image
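As a rough illustration of the "reject" case, here is the core check such a component might run on an incoming pod. Only the `spec.containers[].image` layout follows the real Pod manifest; the registry allowlist, the function name and the result shape are made up, and a real admission webhook would wrap this in the AdmissionReview request/response format. It also only looks at the image reference, not at the base image layers, which would need an actual scanner.

```python
# Hypothetical allowlist of registry prefixes approved by SecOps.
ALLOWED_PREFIXES = ("registry.example.com/base/", "registry.example.com/apps/")


def review_pod(pod: dict) -> dict:
    """Allow the pod only if every container image comes from an allowed prefix."""
    spec = pod.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    rejected = [
        c["image"] for c in containers
        if not c["image"].startswith(ALLOWED_PREFIXES)
    ]
    if rejected:
        return {"allowed": False, "message": "disallowed images: " + ", ".join(rejected)}
    return {"allowed": True, "message": ""}


good_pod = {"spec": {"containers": [{"name": "api", "image": "registry.example.com/apps/api:1.4"}]}}
bad_pod = {"spec": {"containers": [{"name": "api", "image": "docker.io/someone/api:latest"}]}}

print(review_pod(good_pod))
print(review_pod(bad_pod))
```

The point is that this policy lives in one place at the cluster level, so individual application teams don't have to know it exists until they violate it.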
However I always have static analysers enabled on my builds, so it is almost as if they were part of the language. Regardless if we are talking about Java, C# or C++.
Just like most people that are serious about Rust have clippy always enabled, yet it does stuff that isn't part of the Rust language spec.
https://docsv2.convox.com/reference/hipaa-compliance
Note that dedicated instances are no longer required for HIPAA compliance [1]. Also note that the private Convox console is completely optional. You can achieve all of this with the free and open source convox/rack project: https://github.com/convox/rack
As I mentioned in my original comment, you still need to do a lot of work to set up company policies and make sure your application complies with all regulations.
You should also be aware that I'm comparing Convox with some other popular options for HIPAA-compliant hosting:
* Aptible: https://www.aptible.com (Starts at $999 per month)
* Datica: https://datica.com (I think it starts around $2,000 per month, but not 100% sure)
These companies do provide some additional security and auditing features, but I think there's no reason to spend thousands of dollars per month when Convox can get you 95% of the way in your own AWS account. PLUS: If you have any free AWS credits from a startup program, you might not need to pay any hosting bills for years.
[1] https://aws.amazon.com/blogs/security/aws-hipaa-program-upda...
Yaml will still be used in 100 years, k8s is yaml based...
Which is step 2 of my Enterprise adoption strategy. Step 1 is starting with validation, step 3 and beyond is where the real fun starts!
Configuration should be data, not code. Cue has just the right amount of expressivity - anything more complex shouldn't be done at the configuration layer, but in the application or a separate operator.
Darklang is solidly in the Pulumi camp, that's where outsiders put it. (I have seen the insides without beta / your demo, someone with a beta account showed me around a bit)
The real problem with Darklang is they have their own custom language and IDE. What exactly are you trying to solve?
I come from the opposite approach. I have 4 servers: two DigitalOcean $5 and two Vultr $2.50 instances. One holds the db. One serves as the frontend/code. One does the heavy work, and another serves a heavy site and holds backups. For $15 I'm hosting hundreds of sites and running many background processes. I couldn't imagine hitting the point where k8s would make sense just for myself, unless for fun.
If you do, the recipe is to reduce the number of components, get the most reliable components you can find, and make the single points of failure redundant.
Saying you can use Kubernetes to turn whatever stupid crap people tend to deploy with it highly available, is like saying you can make an airliner reliable by installing some sort of super fancy electronic box inside. You don't get more reliability by adding more components.
You could run init container with Kaniko that pushes image to repo and then main container that pulls that back but for that you need to do kubectl rollout restart deploy <name>
If you are looking for pure CI/CD gitlab has awesome support or you could do Tekton or Argo. They can run on the same cluster.
Flux - https://github.com/fluxcd/flux
ArgoCD - https://argoproj.github.io/argo-cd/
How much of K8s is just an ad hoc, informally-specified, bug-ridden, slow implementation of half of Erlang.
I totally agree. I would dearly like something simpler than Kubernetes. But there isn't a managed Nomad service, and apparently nothing in between Dokku and managed Kubernetes either.
Sure, the steps are things like "if X hasn't been done yet, do it." That means it's idempotent imperative code. It doesn't mean it's declarative.
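A tiny sketch of that distinction, with made-up names: every step below checks before it acts, so re-running is safe, but the steps still run strictly in the order they were written.

```python
def ensure_line(lines: list, line: str) -> bool:
    """Append `line` unless it is already present. Returns True if it changed anything."""
    if line in lines:
        return False
    lines.append(line)
    return True


def run_playbook(config: list, steps: list) -> list:
    """Run steps in order. Each step is idempotent; the sequence is still imperative."""
    return [ensure_line(config, step) for step in steps]


config = ["listen 80"]
steps = ["listen 80", "worker_processes 4", "gzip on"]

first_run = run_playbook(config, steps)
second_run = run_playbook(config, steps)
print(first_run)   # only the missing lines were added
print(second_run)  # nothing left to change
```

Idempotence makes the second run a no-op, but nothing here describes the end state as data; the order of `steps` is still part of the program.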
CFEngine is slightly less imperative, but when I was doing heavy CFEngine work I had a printout on my cubicle wall of the "normal ordering" because it was extremely relevant that CFEngine ran each step in a specific order and looped over that order until it converged, and I cared about things like whether a files promise or packages promise executed first so I could depend on one in the other.
Kubernetes - largely because it insists you use containers - doesn't have any concept of "steps". You tell it what you want your deployment to look like and it makes it happen. You simply do not have the ability to say, install this package, then edit this config file, then start this service, then start these five clients. It does make it harder to lift an existing design onto Kubernetes, but it means the result is much more robust. (For some of these things, you can use Dockerfiles, which are in fact imperative steps - but once a build has happened you use the image as an artifact. For other things, you're expected to write your systems so that the order between steps doesn't matter, which is quite a big thing to ask, but it is the only manageable way to automate large-scale deployments. On the flip side, it's overkill for scale-of-one tasks like setting up your development environment on a new laptop.)
The most simplistic task - execute some code in response to an event in a bucket - makes kubernetes with all its sophisticated convergence capabilities completely useless. And even if somebody figures this out and puts an open-source project on github to do this on kubernetes - it's just going to break at the slightest load.
Not to mention all the work to run kubernetes at any acceptable level of security, or keep the cost down, do all the patching, scaling, logging, upgrades... Oh, the configuration management itself for kubernetes? Ah sorry, I forgot, there are 17 great open-source projects for that :)
That's because you're not thinking web^Wcloud scale. To execute some code in response to event you need:
- several workers that will poll the source bucket for changes (of course you could've used an existing notification mechanism like AWS EventBridge, but that will couple your k8s to vendor-specific infra, so it kinda diminishes the point of k8s)
- distributed message bus with a persistence layer. Kafka will work nicely because they say so on Medium, even though it's not designed for this use case
- a bunch of stateless consumers for the events
- don't forget that you'll need to write processing code with concurrency in mind because you're actually executing it in a truly distributed system at this point and you've made a poor choice for your messaging system
For example, I can declare a Pod that mounts a Secret. If the Secret does not exist, the Pod won't start -- but once I create the Secret the pod will start without requiring further manual intervention.
What Kubernetes really is, under the hood, is a bunch of controllers that are constantly comparing the desired state of the world with the actual state, and taking action if the actual state does not match.
The configuration model exposed to users is declarative. The eventual consistency model means you don't need to tell it what order things need to be done.
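A toy version of that control loop, as a rough Python sketch. This is not Kubernetes code; the `reconcile` function and its data shapes are invented for illustration. It models the Pod/Secret example: the pod stays pending until the Secret shows up, and nobody has to tell the loop in which order things were created.

```python
def reconcile(desired_pods, secrets, running):
    """One pass of a toy control loop.

    desired_pods: dict of pod name -> name of the Secret it mounts
    secrets:      set of Secret names that currently exist
    running:      set of pod names currently running (mutated in place)
    Returns the list of actions taken in this pass.
    """
    actions = []
    for pod, secret in sorted(desired_pods.items()):
        if pod in running:
            continue  # actual state already matches desired state
        if secret in secrets:
            running.add(pod)
            actions.append(f"start {pod}")
        else:
            actions.append(f"wait {pod}: secret {secret} missing")
    # Anything running that is no longer desired gets stopped.
    for pod in sorted(running - set(desired_pods)):
        running.discard(pod)
        actions.append(f"stop {pod}")
    return actions


if __name__ == "__main__":
    desired = {"web": "db-password"}
    secrets, running = set(), set()
    print(reconcile(desired, secrets, running))  # pod waits for the Secret
    secrets.add("db-password")                   # Secret created later
    print(reconcile(desired, secrets, running))  # pod starts on the next pass
    print(reconcile(desired, secrets, running))  # converged, nothing to do
```

The real controllers are far more involved, but the shape is the same: compare, act on what can be acted on now, and try again on the next pass.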
This is basically the benefit of "containerization" - it's not the containers themselves, it's the constraints they place on the problem space.
Kubernetes gives you limited tools for doing things to container images beyond running a single command - you can run initContainers and health checks, but the model is generally that you start a container from an image, run a command, and exit the container when the command exits. If you want the service to respawn, the whole container respawns. If you want to upgrade it, you delete the container and make a new one, you don't upgrade it in place.
If you want to, say, run a three-node database cluster, an Ansible playbook is likely to go to each machine, configure some apt sources, install a package, copy some auth keys around, create some firewall rules, start up the first database in initialization mode if it's a new deployment, connect the rest of the databases, etc. You can't take this approach in Kubernetes. Your software comes in via a Docker image, which is generated from an imperative Dockerfile (or whatever tool you like), but that happens ahead of time, outside of your running infrastructure. You can't (or shouldn't, at least) download and install software when the container starts up.
You also can't control the order when the containers start up - each DB process must be capable of syncing up with whichever DB instances happen to be running when it starts up. You can have a "controller" (https://kubernetes.io/docs/concepts/architecture/controller/) if you want loops, but a controller isn't really set up to be fully imperative, either. It gets to say, I want to go from here to point B, but it doesn't get much control of the steps to get there. And it has to be able to account for things like one database server disappearing at a random time. It can tell Kubernetes how point B looks different from point A, but that's it.
And since Kubernetes only runs containers, and containers abstract over machines (physical or virtual), it gets to insist that every time it runs some command, it runs in a fresh container. You don't have to have any logic for, how do I handle running the database if a previous version of the database was installed. It's not - you build a new fresh Docker image, and you run the database command in a container from that image. If the command exits, the container goes away, and Kubernetes starts a new container with another attempt to run that command. It can do that because it's not managing systems you provide it, it's managing containers that it creates. If you need to incrementally migrate your data from DB version 1 to 1.1, you can start up some fresh containers running version 1.1, wait for the data to sync, and then shut down version 1 - no in-place upgrades like you'd be tempted to do on full machines.
And yeah, for databases, you need to keep track of persistent storage, but that's explicitly specified in your config. You don't have any problems with configuration drift (a serious problem with large-scale Ansible/CFEngine/etc.) because there's nothing that's unexpectedly stateful. Everything is fully determined by what's specified in the latest version of your manifest because there's no other input to the system beyond that.
Again, the tradeoff is this makes quite a few constraints on your system design. They're all constraints that are long-term better if you're running at a large enough scale, but it's not clear the benefits are worth it for very small projects. I prefer running three-node database clusters on stateful machines, for instance - but the stateless web applications on top can certainly live in Kubernetes, there's no sense caring about "oh we used to run a2enmod but our current playbook doesn't run a2dismod so half our machines have this module by mistake" or whatever.
I can buy a new laptop and be back to 100% in a few minutes. Though the amount of time I spent learning how to get there far exceeds any time savings. Ever.
Ansible uses YAML, but the few times I used it, it felt like you still use it in an imperative way.
SaltStack (which also uses YAML) was the closest from that group (I never used CFEngine, but the author wrote a research paper and showed that declarative is the way to go, so I would imagine he would also implement it that way).
If you truly want a declarative approach designed from the ground up, then you should try Nix or NixOS.
But it has full power of ruby at your disposal (both at load/compile time and run time). So it usually turns imperative quickly.
A few years ago I even bothered to have two EFI loaders: one for AMD and one for Intel, in case I want to change architecture as well.
Instead, every project I work on has a shell.nix in the root (and if it's not a project I control, I have a shell.nix mapping elsewhere).
Check it out, run nix-shell. Profit.
Once you're really ready for the big leagues, run it with --pure.
Back to your second question: you can configure the system through `/etc/nixos/configuration.nix`, which is enough to configure a system that runs a service. It covers pretty much everything you could do through Chef/Puppet/Saltstack/Ansible/CFEngine etc.
home-manager takes it a step further and does this kind of configuration per user. It is written so that it can be added to NixOS (or nix-darwin for OS X users) and integrated with the main configuration, so when you declare users you can also provide a configuration for each of them.
So it all depends on what you want to do. The main configuration.nix is good enough if your machine runs a specific service; that's pretty much all you need. You don't care about per-user configuration in that scenario; you just create users and start services using them.
If you have a workstation, home-manager while not essential can be used to take care of setting up your individual user settings, stuff like dot-files (although it goes beyond that). The benefit of using home-manager is that most of what you configure in it should be reusable on OS X as well.
If you care about local development, you can use Nix to declare what is needed, for example[1]. This is especially awesome if you have direnv + lorri installed
you can add these to home-manager configuration:
programs.direnv.enable = true;
services.lorri.enable = true;
When you do that, you magically get your CDE (that includes all needed tools, in this case the proper python version; you also enter the equivalent of a virtualenv with all dependencies installed, plus extra tools) by just entering the directory. If you don't have them installed, all you have to do is call `nix-shell`. I also can't wait for Flakes[2] to get merged. This will standardize setups like this and enable other possibilities.
There are many experiments into alternatives happening right now, so I do believe yaml's days are numbered. I'm actively replacing it wherever I encounter it with a far superior alternative. Cue is far more than a configuration language, however, and worth the time to learn and adopt at this point.
Because it is hard to manage the configuration. It's why tools like terraform exist.
Anecdote. I worked for a small company that was later acquired. It turned out one of the long time employees had set up the company's AWS account using his own Amazon account. Bad on its own. We built out the infra in AWS. A lot of it was "click-ops". There was no configuration management. Not even CloudFormation (which is not all that great in my opinion). Acquiring company realizes mistake after the fact. Asks employee to turn over account. Employee declines. Acquiring company bites the bullet and shells out a five figure sum to employee to "buy" his account. Could have been avoided with some form of config management.
That is completely the wrong lesson from this anecdote.
1) The acquiring company didn't do proper due diligence. Sorry, this is diligence 101--where are the accounts and who has the keys?
2) Click-Ops is FINE. In a startup, you do what you need now and the future can go to hell because the company may be bankrupt tomorrow. You fix your infra when you need to in a startup.
3) Long-time employee seemed to have exactly the right amount of paranoia regarding his bosses. The fact that the buyout appears to have killed his job and paid so little that he was willing to torch his reputation and risk legal action for merely five figures says something.
+1 for "click-ops", perfectly put.
Pretty much. The lesson learned for me was to always have version control for the complete stack including the infra for the stack. I like terraform for this. Terragrunt at least solves the issue of modules for terraform reducing the verbosity. Assume things could go wrong and you will need to redeploy EVERYTHING. I've been there.
Sounds like the tiniest acquisition mistake I’ve ever heard of.
Spin up new instances, load data from snapshots, get back to work.
The console is fine as a learning tool for deployment/management, and for occasional experimentation, monitoring, and troubleshooting, but any IaC tool is vastly more manageable for non-toy deployments where you need repeatability and consistency and/or the ability to manage more than a very small number of resources.
Do you need to manage keys when ssh'n into a VM?
Do you know what the purpose of all the products are? If you don't know one, are you able to at least have an idea what it's for without going to documentation?
They have also directly opposed many efforts for Kubernetes, even to their own customers, until they realized they couldn't win. Only then did they cave, and they are really doing the bare minimum. The most significant contribution to OSS they have made was a big middle finger to Elasticsearch...
> How do you view all the VMs in a project across the globe at the same time?
I'm not sure what it's got to do with k8s? I can't see jobs that belong to different k8s clusters at the same time, either.
> Do you need to manage keys when ssh'n into a VM?
Well, in k8s everybody who has access to the cluster can "ssh" into each pod as root and do whatever they want, or at least that's how I've seen it, but I'm not sure it's an improvement.
> Do you know what the purpose of all the products are? If you don't know one, are you able to at least have an idea what it's for without going to documentation?
Man, if I got a dime every time someone asked "Does anyone know who owns this kubernetes job?", I'd have... hmm maybe a dollar or two...
Of course k8s can be properly managed, but IMHO, whether it is properly managed is orthogonal to whether it's k8s or vanilla AWS.
So once again, what do developers need kubernetes for, if the simplest problem becomes an unholy mess? :)
The service just runs a script that uses netcat to listen on a special port that I also configured GitHub to send webhooks to, and processes the hook/deploys if appropriate.
Then when it's done, systemd restarts the script (it is set to always restart) and we're locked and loaded again. It's about 15 lines of shell script in total.
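For anyone who wants the same "handle one hook, exit, let the supervisor restart it" shape without netcat, here is a rough Python sketch. It is not the script described above; the port, the `./deploy.sh` command and the branch name are placeholders, and it skips verifying the webhook signature, which you would want for anything exposed to the internet.

```python
import json
import socket
import subprocess


def should_deploy(body, branch="master"):
    """Return True if the webhook body describes a push to `branch`."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False  # not JSON, ignore it
    return isinstance(payload, dict) and payload.get("ref") == f"refs/heads/{branch}"


def read_request(conn):
    """Read one HTTP request from the socket and return its body as bytes."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            return b""
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    while len(body) < length:
        chunk = conn.recv(4096)
        if not chunk:
            break
        body += chunk
    return body


def serve_once(port, deploy_cmd):
    """Handle a single webhook, then return so the supervisor restarts us."""
    with socket.create_server(("", port)) as server:
        conn, _ = server.accept()
        with conn:
            body = read_request(conn)
            conn.sendall(b"HTTP/1.1 204 No Content\r\n\r\n")
    if should_deploy(body):
        subprocess.run(deploy_cmd, check=True)


# Example entry point (placeholder port and script):
#     serve_once(9000, ["./deploy.sh"])
```

With the service set to always restart, each process handles exactly one request, so there is no long-lived state to go stale.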
Deploying as distribution package tends to not work well when you want to deploy it more than once on a specific server (which quickly leads us to classic end-result of that approach, which is VM per deployment minimum - been there, done that, still have scars).
Management of cron jobs was a shitshow, is a shitshow, and probably will be a shitshow except for those that run their crons using non-cron tools (which includes k8s).
It's not complex to set up a load balancer in a given specific environment. But it's another kind of ask to say "set up a load balancer, but also make it so that the load balancer also exists in future dev environments that can be auto-set-up and auto-torn-down. And also make it so that load balancer will work on dev laptops, AWS, Azure, Google, our private integration test cluster on site, and on our locally-hosted training environment, with the same configuration script." All of these things can be done in k8s, and basically are by default when you add your load balancer in k8s. They can be done other ways, too, or just ignored and not done, also. But k8s offers a standardized way to approach these kinds of things.
I've been having this thought very often lately.
The only way for humans to do something faster is to use a machine. Any machine is built on some assumption that something is repeatedly true, that some things can be repeatedly interacted with in the same way.
Finding true invariants is very hard, but our world is increasingly malleable. Over time it is getting easier to invent new invariants and pad things out so that the invariant holds.
Ansible and Vagrant are not perfect, but I think they are far simpler than a single node k8s instance, and more representative of an actual production environment.
This is not my strength in any way, but hearing from those teams, Kubernetes will be a godsend
The time I spent managing HAproxy for 5 services was bigger than the time I spent managing load-balancing and routing using k8s for >70 applications that together required >1000 load balanced entrypoints.
It's a lever for the sysadmin to spend less time on unnecessary work.
(Of course, if you're running untrusted user code, then you'll need every protection you can muster, but I'm talking about running an internally developed application. If you can't trust that, you already have a bigger problem.)
Containers share the kernel with the host, and are only as isolated as the uid the process in the container runs as and the privileges you grant that container.
The point about AWS was not a Kubernetes comparison. It was a GCP one, because you asked what was wrong with the god-awful AWS.
This is a bit funny, considering Airbus jets use triple-redundancy and a voting system for some of their critical components. [1]
[1] https://criticaluncertainties.com/2009/06/20/airbus-voting-l...
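As a generic illustration of redundancy plus voting (not a description of how Airbus actually does it; the `vote` function is made up for this comment), a majority voter over redundant channels can be sketched in a few lines of Python:

```python
from collections import Counter


def vote(readings):
    """Return the value a strict majority of channels agree on.

    Raises ValueError when there are no readings or no majority.
    """
    if not readings:
        raise ValueError("no readings")
    value, count = Counter(readings).most_common(1)[0]
    if count * 2 <= len(readings):
        raise ValueError("no majority among redundant channels")
    return value


if __name__ == "__main__":
    print(vote([101, 101, 250]))  # one faulty channel is outvoted: 101
```

Here adding components does buy reliability, because the extra parts are redundant copies behind a very small voter rather than new moving parts in series.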
Are you ok with your application going down for each upgrade? With Kubernetes, it's very simple to configure a deployment so that downtime doesn't happen.
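For reference, a sketch of what that configuration can look like, written as a small Python helper that emits a Deployment manifest as JSON (kubectl accepts JSON as well as YAML). The image name, port and /healthz path are placeholders, and the helper itself is made up. The relevant bits are the RollingUpdate strategy with maxUnavailable set to 0 and a readiness probe, so old pods are only removed once replacements report ready.

```python
import json


def rolling_deployment(name, image, replicas=3, port=8080):
    """Build a Deployment manifest that replaces pods one at a time."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Never drop below the desired count; add one new pod at a time.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # Placeholder health endpoint; traffic is only routed
                        # to a pod once this probe succeeds.
                        "readinessProbe": {
                            "httpGet": {"path": "/healthz", "port": port},
                        },
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = rolling_deployment("web", "example.com/web:1.2.3")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into `kubectl apply -f -`. Whether the rollout is truly seamless still depends on the application handling shutdown and in-flight requests gracefully.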
On the other hand, atomic upgrades by stopping the old service and then starting the new service on a Linux command line (/Gitlab runner) can be done in 10 seconds (depending on the service of course – dynamic languages/frameworks sometimes are disadvantaged here). I doubt many customers will notice 10 second downtimes.
K8s only makes sense at near Google-scale, where you have a team dedicated to managing that infrastructure layer (on top of the folks managing the rest of the infrastructure). For almost everyone else, it's damaging to use it and introduces so much risk. Either your team learns k8s inside out (so a big chunk of their work becomes about managing k8s) or they cross their fingers and trust the black box (and when it fails, panic).
The most effective teams I've worked on have been the ones where the software engineers understand each layer of the stack (even if they have specialist areas of focus). That's not possible at FAANG scale, which is why the k8s abstraction makes sense there.
I don’t think AWS has talked about live migration, but given the stability of their VMs and the rareness of “we need to restart it” notices, it seems like they have something.
On AWS, I was getting pagerduty'd because the solo gateway box was down. We couldn't ssh in, so after an hour of no progress with debugging or support, we just hit the reset button and hoped for the best. Fortunately this worked, and later we were told there had been a disk failure.
On GCP, I didn't even know and only discovered it in the logs when I was looking for other audit reasons. Turns out your long-running Google VMs are being migrated all of the time and you had no idea. They actually have a policy / SLA around it, basically saying they refresh their entire fleet of servers every 6 weeks iirc. Honestly, if AWS is not doing something like this, I'd have increased concerns about leaky security neighbors (i.e. someone who has a VM running for multiple years without software updates). Hopefully you should be protected on shared servers, but it is software after all.
Engineers have a plethora of quality control standards and centuries of built up knowledge to make this chaos manageable and the problems tractable.
1 rack in a datacenter is plenty for a database backed web app with a million users.
In my case, I work for a CDN, so we need to have data centers all around the world.
Besides, it's additional hassle and a chance for things to go wrong, the way I have it set up now is that production gets a new deployment whenever something gets pushed to master and I don't have to do anything else.
A text file with some setup notes is enough for simple needs, or something like Ansible if it's more complex. A lot of web apps aren't much more than some files, a database, and maybe a config file or three (all of which should be versioned and backed up).
I don't hate it, but if you need to log in to a server regularly to do an apt upgrade, you should enable automatic security updates instead of logging in every few days.
If your server fills up because of some logfiles or the like, you should fix the underlying issue rather than needing to log in to the server.
You should be able to trust your machines, regardless of whether it is only one machine, 2, 3 or 100. You want to be able to go on holiday and know your systems are stable, secure and doing their job.
And logging in also implies a snowflake. That doesn't matter as long as the machine runs and you don't have many changes, but k8s actually makes it very simple to finally have an abstraction layer for infrastructure.
I think a better approach would be to have a specification for more robust negotiation protocols. When I see "standardisation," I already know that this means the same thing as "standardization" and furthermore that I should expect to see "colour"/"honour," organizations referred to in plural, "from today" rather than "beginning today" or "starting today," and even "jumpers" over "sweaters," "lorries" over "trucks," "biscuits" over "cookies," and more interrogative sentences in conversation. A British English speaker likely does the same process in reverse.
Within say, the volunteer-maintained documentation of MDN, the tradeoffs are quite different. There, ease of reading for reference by busy coders is much more valuable relative to ease of typing up a new contribution. Frequent switches between "color" and "colour" become a time-wasting distraction.
MDN should pick a standard and insist on it. And if Ubuntu chooses to require British spelling throughout, I'd say that's good.