Containers Don't Solve Everything (blog.deref.io)
The solution has to either come in the form of static compilation, or, even less feasible, getting devs to actually care if their software runs on platforms more than a year old. Containers just make everything worse in all cases beyond the contrived "it just worked and I never need to change anything".
Packaging is hard, and both Debian-based and RPM-based systems (and really most others I've seen) are pretty awful. (Except the BSDs, which I've had a lovely time with.)
They're slow, they're stateful, writing them involves eldritch magic and a lot of boilerplate, and they're just frequently broken. Unless you're installing an entire OS from scratch you're probably going to have a hard time getting your system into the same state as somebody else's. And while running that from-scratch OS install is definitely possible in an as-code way, it can take an hour.
Containers came along and provided a host of things traditional packaging systems didn't, and they took over by storm. With them came a whole lot of probably unnecessary complexity from people wanting to add things. Adding things without ending up with a huge mass of complexity is hard and takes a lot of context knowledge.
So we ended up solving a host of problems with containers and creating a whole new set along the way.
A few random examples (not the best you could find, just something I've used recently):
- re-packaging pre-built binaries:
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=visua...
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=nomad...
- building C from source
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=tinc-...
- building Go from source
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=yay
- patching and building a kernel
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=linux...
A big reason for that is that in the past far fewer developers were confronted with this problem domain.
In larger companies packaging and deployment was often the responsibility of ops, with some input from and interaction with development. That of course also meant much longer lead times, arguments about upgrading versions of libraries or other executable dependencies, divergence of production and development/test environments, and the associated unfamiliarity with the production environment for developers and hence often more difficult debugging.
Ever since Docker (+ Kubernetes and various cloud specific container solutions) became so popular, a lot of devs now at least partially deal with this on a regular basis.
Which is mostly a good thing, due to the negatives above.
Most corporate use of Docker I've encountered is a mess of stupid patterns like "RUN command A && command B && command C && ..." to reduce layers or some such nonsense which makes debugging build failures tedious.
Yes, absolutely, and I hope you mean that in the capital-F "Future Shock", Alvin Toffler sense, because there is a lot he wrote that hasn't even been carried over and digested. Software is an endlessly disorienting sea of change, getting faster and thus worse as time progresses, and it's frankly madness at this point.
It seems absolutely no one is committed to providing a stable platform for any purpose whatsoever. Even Java, where I spent many years being ingrained with the absolute necessity of backwards compatibility with old (perhaps even dumb) classfile versions, has been making breaking changes as part of its ramp up to semi-annual major version releases. Node Long Term Support "typically guarantees that critical bugs will be fixed for a total of 30 months."[1] Pfft. It's a joke. You can't get your damn API design straight by version 12? I'll do my damnedest to avoid you forever, then. It's so unserious and frankly irresponsible to break so much stuff so often.
But change only begets more change. We're all on an endless treadmill, constantly adapting to the change for no reason. And people have to adapt to our changes, and so it goes.
Containers side-stepped the deficiencies of Linux distributions, which had become so based on 'singleton' concepts: one init system, one version of that library, etc.
A shame, because there's an inherent hierarchy (everything from the filesystem to UIDs to $LD_LIBRARY_PATH) that could really allow things to co-exist without the kludge of overlay filesystems. It just was never practical to, e.g., install an RPM in a subdirectory and use it.
Containers aren't a very _good_ solution, they're just the best we've got; and still propped up by an ever-growing Linux kernel API as the only stable API in the stack...
This is why we don't play games with siloing responsibilities on the tech stack. Every single developer on the team is responsible for making the entire product work on whatever machine it is intended to work on. No one gets to play "not my job", so they are encouraged to select robust solutions lest they be paged to resolve their own mess in the future.
Maybe those solutions are containers in some cases, but not for our shop right now. Our product ships as a single .NET binary that can run on any x86 machine supported by the runtime.
This is really not a new problem :) I remember dealing with shared library versioning issues from not long after I started in IT in the '90s, and it's been a problem since.
Solving that problem seems like a win to me.
Considering how complex the options from Kubernetes, Helm, Istio, etc. can get, the developer can focus on the boundary requirements: expected environment variables and peer systems/services.
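To make "boundary requirements" a bit more concrete, here is a minimal Python sketch of an app checking its expected environment variables at startup. The function name `check_boundary` and the variable names in the usage note are hypothetical, not taken from any particular project.

```python
import os


def check_boundary(required, environ=None):
    """Return the required settings, or fail fast listing what is missing.

    `required` is a list of environment variable names the app expects.
    `environ` defaults to the real process environment; passing a dict
    makes the check easy to exercise in tests.
    """
    environ = os.environ if environ is None else environ
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in required}
```

At startup the app would call something like `check_boundary(["DATABASE_URL", "QUEUE_HOST"])` (both names invented here), so a misconfigured container fails immediately with a readable message instead of halfway through a request.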
I would also not downplay the importance of Docker's support for software-defined networks and its ability to arbitrarily configure networking at the container level.
I firmly believe that networking doesn't pop up so often while discussing Docker because Docker solves that problem so fantastically well that a complex problem simply ceases to exist and drops out of everyone's mental model entirely.
I relatively rarely work with Java and am probably mistaken.
The problem doesn’t start with virtualization, that is indeed a side-track.
Also:
> Consider also that Docker relies on Linux kernel-specific features to implement containers, so users of macOS, Windows, FreeBSD, and other operating systems still need a virtualization layer.
First, FreeBSD has its own native form of containers and Windows has its own native implementation. Docker != containers.
I really don't see how Docker (or containers as we mostly know them) relying on kernel features from an open source operating system in order to run Linux OS images is something to even complain about, and there is nothing preventing Apple from implementing their own form of containers on the Mac.
Is vanilla Kubernetes easy for new developers? No, but there is an entire ecosystem offering tools and platforms to make development using containers as seamless as possible. Microsoft saw this, so they really had no choice but to adopt the container terminology and partner with Docker to try to stay relevant.
My guess is without containers, Microsoft would have never even built WSL. If you want a smooth developer experience with containers, then that is what solutions like GitLab offer. Even Microsoft's GitHub is essentially built around running various actions inside containers.
I personally welcome the change. I can spin up a local Kubernetes cluster and test an entire cluster of applications locally if I want, or integrate it into Skaffold or whatever else and test live in the cloud. It really is a lot better than what we had before. I think the solutions though really come down to documentation and resources to help train new employees and acclimate them.
In the end, there are only a few missing pieces to offer a more robust solution. I do think that making it all WebAssembly will be the way to go, assuming the WASI model(s) get more fleshed out (Sockets, Fetch, etc). The multi-user web Doom on Cloudflare[1] is absolutely impressive to say the least.
I kind of wonder if Cloudflare could take what FaunaDB, CockroachDB or similar offers and push this more broadly... At least a step beyond k/v which could be database queries/indexes against multiple fields.
Been thinking on how I could use the existing Cloudflare system for something like a forum or for live chat targeting/queries... I think that the Durable Objects might be able to handle this, but could get very ugly.
But that's why anytime you integrate with one of these tools you should be aware that there is a cost for maintaining that integration.
My efforts => https://micro.mu
Oh and prior efforts https://github.com/asim/go-micro
I wish someone would rewrite docker-compose as a single Go or Rust binary so that I don't have to deal with the python3 crypto package being out of date or something when simply configuring docker/docker-compose for another user (usually me on a different machine or new account).
^ There's an rc of a compose command built into the standard docker CLI.
providers in turn responded by shilling their 'in house' containerization products and things like Lambda for lock-in.
Containers were the next logical step, as each virtual machine vendor tried to lock in their users. Containers allowed routing around it.
Both of these steps could be eliminated if a well behaved operating system similar to those in mainframes could be deployed, so that each application sat in its own runtime, had its own resources, and no other default access.
There's a market opportunity here, it just needs to be found.
Containers and VMs let you divide and solve problems in isolation in a convenient manner. You still have the same problems inside each container.
Firstly, Docker & k8s made using containers easy. Minimal distros like Alpine simplify containers to a set of one or more executables. You could implement the same thing with a system of systemd services & namespaces.
But now that everything is a container, you need a way to manage what & where containers are running and how they communicate with each other.
It looks like 90% of the stuff different container tools and gadgets try to solve is the issues they created. You can no longer install a LAMP stack via 'apt install mysql apache php7.4', so instead you need a tool that sets up 3 containers with the necessary network & filesystem connections. It is certainly better because it is all declaratively defined, but it is still the same problem.
This is why I mostly stayed out of containers until recently. The complexity of containers really only helps if you need to replicate a certain server/application. You will still need to template all of your configuration files even if you use Docker, etc.
What is changing everything IMO is NixOS because it solves the same issues without jumping all the way to Docker or k8s. Dependencies are isolated like containers, but the system itself, whether it is a host/standalone or a container, can be defined in the same manner. This means that going from n=1 to n>1 is super easy and migrating from a multi-application server (i.e. a pet server) to a containerized environment (i.e. to a 'cattle' server/container) is straightforward. It's still more complex and a bit rough compared to Docker & k8s but using the same configuration system everywhere makes it worthwhile.
Nothing says love like realizing that you are segfaulting due to a library version you didn't test against subtly changing its behavior.
This amounts to using Perl, bash, and POSIX.
On the client side, of course, it is HTML and JS, which I use a very limited subset of to improve compatibility.
Each application runs in its own container, unless it is granted granular permissions to do otherwise.
The code and assets for a program belong in their own quarantined section, not spread out over the filesystem or littered around /etc/, /var/.
Built in networking for these containers.
Even if it is a joke, people want to have silver bullets. Those kill the hairy problems, which we can call werewolves.
The downside is that hairy problems, just like werewolves, come from people. So in the end it is a people problem, not a problem with some container tech or other stack. There are no werewolves without people :)
Next, I started working with Docker and languages with better package management. Dependencies were fetched in CI and were either statically linked or packaged in a container with the application I was working on. Still, these were mostly monoliths or applications with simple API boundaries and a known set of clients.
In the past few years, almost everything I have written has involved cloud services, and when I deploy an application, I do not know what other services may soon depend on the one I just wrote. This type of workflow that depends on runtime systems that I do not own - and where my app may become a runtime dependency of someone else - is what I am referring to as a "modern development workflow".
I know Docker has made it part of the way there over the years with Compose and so on, but it's all felt pretty ad-hoc, whereas k8s feels like a cohesive system designed against a clear vision (which makes sense, since it was designed as Borg 2.0). No one else working in this space had the benefit of having already built a giant system for it and used it at scale for years beforehand.
That said, we do have an iOS client which is intended to run on such classes of devices. I loathe the fucker so much (dev experience is garbage) but our customers like it a lot so... here we are. 99% of the complexity lives on the server, so the app is not a daily struggle. We also have a UWP client, but it has its own set of "difficulties" that I won't get into at the moment.
At some point I want to try to build a pure HTML5/canvas solution that can be served from a cheap-ass linux box and consumed by any device with a reasonable web browser implementation.
1. https://github.com/dotnet/runtime/issues/43313
2. https://docs.microsoft.com/en-us/dotnet/maui/get-started/ins...
Also, set up all the containers to include unit test results in the runtime container... this gets extracted/merged in CI/CD. Beyond this, I can stand up the entire application and run through full integration and UI test suites in the CI/CD pipeline. Same commands locally... it all is much smoother than prior experiences.
I will NEVER run a database install on my developer desktop again. The database deploys for the main application I work with, and the unit tests, all finish in about 5 seconds or less (not including initial download). I'm also able to run db admin apps right with the DB.
Persist volumes, run/test upgrades and from-scratch. It all goes really smoothly overall. Wouldn't ever want to go back to mile-long dependency instructions step by step to getting a development environment running ever again. WSL2 + Docker Desktop are pretty damned great.
But that’s in line with the whole premise of DevOps, right? That the strict separation between dev and ops is a bad thing, and it’s good that devs get involved with ops and vice versa.
I don’t think this has to do with containers per se, but they do help a lot with that goal.
Thanks. Now I wish the company I work for would drop their plan to bring me back in office next week and just settle instead for a day or two of mandatory presence in the office per month (crossing fingers while you do your magic).
Don't get me wrong, from what I hear Nix actually does deliver on the promise for the most part, it's just that you have to learn a new language to use it effectively and of course it has its own quirks.
I'm not comparing whether Dockerfiles or buildpacks or Nix packages are more ergonomic than one another, but I do think your comment is... misguided. From what I have heard, Nix is pretty wonderful to use and simplifies the problem. It just requires you to learn about Nix a bit, which I think is a fair trade-off for the benefits it supposedly provides.
Nix ... I have so far spent about 10 hours learning it to manage my machine. I have forgotten about 98% of it and abandoned the project. You feel like you're sitting in the middle of a spider web, and you can sense the whole system at once. Literally none of your prior knowledge of how to use a computer will help you. None of your existing build tool CLI can be used. Every package manager needs a nix-ifier, like node2nix. Everything you see in a nix file will have to be googled, searched in the documentation, searched in GitHub repos for some kind of example. Nix has rebuilt the world from scratch.
If you're trying to make the next big thing, try to make it leverage people's existing knowledge. One truly excellent example is `compile_commands.json`. It does a very similar thing to Docker, where it extracts information from your existing build process, without actually changing the build process. The problem statement was that people wanted LSP (and predecessors) implementations to have access to a list of input files to a C/C++ compiler, but they didn't want to abandon Make and CMake etc. So they basically made a structured log of all the CC invocations, and a wrapper around CC that would parse the arguments and write to the log in JSON format. These days you get it for free with CMake[0]. You can use it with nearly every C/C++ build system on earth with a single CC=... argument to make.
[0]: https://cmake.org/cmake/help/latest/variable/CMAKE_EXPORT_CO...
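To make the mechanism concrete, here is a minimal Python sketch of the "wrapper around CC" idea described above. It is not how CMake or any real tool implements it; the function names (`make_entries`, `append_to_database`, `run_wrapped`) and the suffix list are made up for illustration. The entry fields `directory`, `file`, and `arguments` follow the JSON compilation database format.

```python
import json
import os
import subprocess
from pathlib import Path

# Assumption for this sketch: a source file is anything with one of these suffixes.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".cxx"}


def make_entries(argv, directory):
    """Build one compilation-database entry per source file found in argv."""
    entries = []
    for arg in argv[1:]:
        if Path(arg).suffix in SOURCE_SUFFIXES:
            entries.append({
                "directory": directory,   # where the compiler was invoked
                "file": arg,              # the translation unit
                "arguments": list(argv),  # the full, unmodified command line
            })
    return entries


def append_to_database(db_path, entries):
    """Append entries to the JSON log, creating it if needed."""
    db_path = Path(db_path)
    existing = json.loads(db_path.read_text()) if db_path.exists() else []
    existing.extend(entries)
    db_path.write_text(json.dumps(existing, indent=2))
    return existing


def run_wrapped(real_cc, args, db_path="compile_commands.json"):
    """Log the invocation, then run the real compiler with the same arguments."""
    argv = [real_cc, *args]
    append_to_database(db_path, make_entries(argv, os.getcwd()))
    return subprocess.call(argv)
```

The point of the design is that the build itself is untouched: the wrapper only observes the command line and then hands it to the real compiler. A real wrapper would also need to cope with parallel builds writing the log at the same time, which this sketch ignores.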
To use an example from another community, no amount of performance improvements to NPM will ever make it a good idea to depend on hundreds of one-liner "is number odd" or "left pad" packages. Papering over the problem with yet more technology only ossifies it, making it harder to solve for real.
As you suggest, these are probably 'pieces' of the puzzle, by no means 100% identical to how containers are used today. But I think we'd have ended up in a different place.
There are definitely advantages that way, but there are also drawbacks.
The real killer app for Nix is in testing now, and that's the "flakes" feature. Lots of this stuff will get way easier to use when you can throw "github:owner/repo" in an `inputs` set and get a working Nix builder for your project without needing to read through nixpkgs. I hope you give it a try again sometime, as it has changed my perspective on how software should be built, distributed, trusted, and deployed.
Docker does not have this problem at all. Every build tool on earth works with it, with zero configuration.
Compile_commands.json may have similar output to node2nix etc, but it infects nothing, replaces nothing. It works with all the different makefile alternatives with no additional effort. The closest similarity is with Docker: Docker builders intercept at the file system layer, covering every build system ever; compile_commands at the standard GCC-compatible shell arguments layer, covering ~all C build systems and compilers. Nix does not intercept anywhere, it asks you to use a new tool to do everything from scratch in a Nix-compatible way, covering no build systems.
That’s not to say it isn’t great when you have already built a Nixified package manager replacement for your specific language ecosystem. But it’s not going to take over the world like Docker and compile_commands did. Imploring people to give it a shot is the only way, unfortunately. I will remain open to it, especially if someone can figure out a force-multiplier for these 2nix implementations.
Not true! Nix wraps other build tools, and provides hermetic and reproducible environments to those tools. If the tools exposed a way to get the URL and SHA256 hash of every dependency it downloads from the Internet, then the "infection" doesn't need to happen, as you would simply supply those hashes to Nix, which in turn will happily allow them to be downloaded in the sandbox by the tool. That tools like node2nix exist speaks to the walled garden created by these tools and ecosystems, because they do not (easily) expose dependencies to their environment, and/or they do not (easily) accept dependencies from their environment.
This would absolutely be a problem with Docker as well, if you added the same requirements that Nix enforces in its sandbox, because otherwise you are allowing Docker to fetch dependencies by URL without specifying their contents.
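A rough Python sketch of the "URL plus SHA256" contract described above, to show why the hashes matter. This is only an illustration of content-pinned downloads, not Nix's actual implementation; `fetch_pinned`, `HashMismatch`, the `pins` mapping, and the injected `fetch` callable are all hypothetical names.

```python
import hashlib


class HashMismatch(Exception):
    """Raised when downloaded content does not match its pinned hash."""


def fetch_pinned(url, pins, fetch):
    """Fetch `url` and accept the bytes only if they match the pinned SHA-256.

    `pins` maps URL -> expected hex digest, declared up front.
    `fetch` is any callable taking a URL and returning bytes; it is passed
    in so the sketch stays independent of a particular HTTP library.
    """
    if url not in pins:
        # No declared hash means the download is not allowed at all.
        raise KeyError(f"no pinned hash declared for {url}")
    content = fetch(url)
    actual = hashlib.sha256(content).hexdigest()
    if actual != pins[url]:
        raise HashMismatch(f"{url}: expected {pins[url]}, got {actual}")
    return content
```

If a build tool could emit the `pins` table for everything it intends to download, the sandbox would only need a check like this one. Without it, a URL can silently start serving different bytes and the build stops being reproducible.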
Yes. Good start. If you can make it so that exposing this information to Nix is easy enough that e.g. the NPM team does not need a PhD in the Nix language to write it to a file, then Nix will be a much more solid proposition. That data alone isn't enough, but that + a DAG of what NPM will do to the downloaded tgzs is much closer. It's also enough for cargo. And many other languages. The Nix language is cool to write by hand but, back to my original example, compile_commands.json could be written by a monkey. It needs to be that easy. It needs to be as easy as printing GraphViz DOT to stderr. Then and probably only then will Nix support start getting upstreamed.
The Nix language is probably Nix's biggest liability at the moment; they sought to make a single language, with a rapidly changing API, both for configuring your computer (by hand) and for making compilers reproducible. Compiler output! In an essentially esoteric configuration/programming language, which takes a lot of effort to port to a new ecosystem! No. Use JSON. Ideally you will never have to actually write Nix, the same way humans have never had to write compile_commands.json by hand, and the way nobody has ever had to construct a Docker image by hand out of individual tar files.