Docker adds very interesting filesystem ideas and software for the management of container images. Personally, I think we are on the cusp of a transition from VPS (xen/hvm) to VPS (containers). I also hope that Google throws some of their concepts at the Docker project. Interesting times for this space.
[1] http://sysadmincasts.com/episodes/24-introduction-to-contain...
[2] https://github.com/google/lmctfy
[3] http://www.wired.com/2013/03/google-borg-twitter-mesos/all/
A transition back to, I think ... the very first VPS provider (JohnCompanies, 2001)[1] was based entirely on FreeBSD jail and was entirely container based, even though those containers were presented as standalone FreeBSD systems.
[1] Yes, Verio did have that odd VPS-like enterprise service earlier than that, but JC did the first VPS as we came to know it.
I'm not so sure of that. I think a lot of the use-cases for VMs are based on isolation between users and making sure everybody gets a fair slice. Something like docker would work well with a single tenant but for multi-tenant usage docker would give you all the headaches of a shared host and very few of the benefits of a VM. For those use cases you're probably going to see multiple docker instances for a single tenant riding on top of a VM.
The likes of Heroku, AWS, Google etc will likely use docker or something very much like it as a basic unit to talk to their customers, but underneath it they'll (hopefully) be walling off tenants with VMs first. VMs don't have to play friendly with each other, docker containers likely will have to behave nicely if they're not to monopolize the underlying machine.
Then we can start doing some interesting stuff past finding new ways to chop computers up.
Containers do this.
Also check out Joe Beda's deck from GlueCon: http://slides.eightypercent.net/GlueCon%202014%20-%20Contain...
Docker is a natural fit for GCE.
If anyone here specializes in similar things, I would be curious to know if this Pegasus system runs on top of or underneath Borg/Omega (or perhaps replaced it?), or is a separate system altogether.
[1] http://www.theregister.co.uk/2014/06/05/google_pegasus_syste...
Edit: [2] http://gigaom.com/2014/05/28/google-is-harnessing-machine-le...
There may be some of that, but I think more common will be continuing to have traditional IaaS bring-your-OS-image services, with a new tier in between IaaS and PaaS (call it CaaS -- Container host as a service), plus a common use of IaaS being to deploy an OS that works largely as a container host (something like CoreOS).
My problem is that I am sort of stuck in the past. Whether I am using VPSs, AWS, or rented physical servers, I have only a partially automated way to set up servers. This scales to small numbers of servers just fine, and that is mostly the world I live in, but I need to improve my workflow. This really hit home yesterday when I had to upgrade a Haskell GHC/platform because I tweaked a Haskell app, making it incompatible with an old GHC 7.4.* setup on an older server, and ended up wasting some time before fixing things.
Working as a contractor at Google last year was an eye opener. I really loved their infrastructure.
Docker seems like my best path forward.
1. When I need to move an application between different systems running Ubuntu, Debian etc., I currently use VirtualBox. Can I use Docker from now on?
2. A quick read about Docker tells me that instead of running a guest OS as VirtualBox does, Docker only holds application-related things. Then how could it handle deployment between Debian Squeeze and Ubuntu 14.04, i.e. between old and new Linux versions?
3. Compared to VirtualBox, how easy is it to install and use Docker?
4. Can you please name some places where you use Docker?
5. How many of you have migrated to Docker from VirtualBox and related things?
Disclaimer: Noob detected :)
It doesn't seem all that complex; sure, it's (in the typical cloud use case) another level of organization, but done right it should actually simplify organization and deployments.
> It just seems like this adds so many more attack points by removing the virtual machine which was a good way to organize services.
Containers are different than VMs, but using them doesn't mean "removing the virtual machine". Particularly in the use cases that Google is embracing (e.g., on a cloud platform where the containers are for use on VMs.) How, specifically, does it add "attack points"?
In the real world, everyone wants infrastructure to have the same sexy qualities: automated deployment (CD/CI), automated scaling, automated failover/high availability, automated service discovery (read: functional service topology dependency resolution), security, resource and capacity planning support, real time least-cost-overhead provider selection for third party infrastructure providers meeting some disparate set of performance requirements, etc. Unfortunately, it's not an easy problem area to deliver a one size fits all solution to.
Docker doesn't really have most of that stuff in scope yet, even vaguely. Actually, it seems to have a really weird scope: it wants to wrap different LXC implementations and other container-style unix environments (potentially supporting non-Linux platforms) but doesn't want to deal with managing the host systems themselves, having - kind of, for practical reasons (though not entirely!) - outsourced this to CoreOS (ie. some particularly specific configuration of a Linux host system).
Whether all of this recent Redhat/Google docker bandwagon jumping will amount to any real solution remains to be seen... Google AFAIK effectively runs its services on fat clusters made of commodity hardware, organized into segments ('cells'), running highly customised Linux distributions, and so does Redhat where HA is required. I'm pretty familiar with these configurations as I do this myself. So will we ever see meaningful support for other OSs? Other distros? Physical systems via PXE to support these clusters? Hypervisor guests managed with the same developer and operations workflow?
My wager is not soon, at least in a manner that everyone agrees on... Google will keep doing its thing (using its unlimited supply of internal, world-class nerds to deliver and manage services on their custom OS in a custom way because saving 1/2c a month per machine pays ten world class nerd salaries at their scale), Redhat will keep doing its thing (selling prebuilt systems at expensive prices that still comfortably undercut the likes of IBM, pretending they are manageable, but actually rejigging the whole infrastructure every system release leaving little in the way of realistic upgrade paths without expensive consulting) and you and I will be left wondering where that magical docker solution went that everyone was talking about in early 2014.
Here are a couple of notes..
Deployment manager - https://developers.google.com/deployment-manager/
Saltstack integration - https://www.youtube.com/watch?v=0dOXbhenFl0
Makes me feel old... I used to "deploy" with cPanel or DirectAdmin.
Like, I have a generic Flask container repository that I clone as a starting base...change the domain name(s) on the config file, app name, etc. Then just copy things into the src/ directory and spin it up.
For me, this makes it really easy to spin up a new Flask project. I do the same with golang [sort of, it really just downloads a binary+config file into a container that has all of my logging, etc. set up].
The problem with doing LXC is where you have inheritance chains like this:
[Base Container of Ubuntu + Logging + Monitoring] -> [Base Flask Container] -> [Flask Container for Website X]
[Base Container of Ubuntu + Logging + Monitoring] -> [Base Flask Container] -> [Flask Container for API for Website X]
[Base Container of Ubuntu + Logging + Monitoring] -> [Container for Mail Service]
[Base Container of Ubuntu + Logging + Monitoring] -> [Container for Kibana]
[Base Container of Ubuntu + Logging + Monitoring] -> [Container for Redis-Cache]
etc.
Tbh, I think that is what docker really fixes. The ability to easily inherit containers so you only have to make changes in one spot to change all of them.
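To make the "change it in one spot" point concrete, here is a minimal sketch in Python. It is a toy model, not Docker's API: the image names are made up, loosely following the chains above, and the only thing it shows is which images pick up a change when one of their ancestors is edited.

```python
# Toy model of image inheritance: each image records its parent (None for a root).
# All names are hypothetical, loosely based on the chains above; this is not Docker's API.
PARENTS = {
    "base": None,                     # Ubuntu + logging + monitoring
    "flask-base": "base",
    "flask-website-x": "flask-base",
    "flask-api-x": "flask-base",
    "mail": "base",
    "kibana": "base",
    "redis-cache": "base",
}


def ancestors(image, parents=PARENTS):
    """Return the chain of parents of `image`, nearest first."""
    chain = []
    current = parents[image]
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


def affected_by(changed, parents=PARENTS):
    """Return, sorted by name, every image that inherits from `changed`."""
    return sorted(img for img in parents if changed in ancestors(img, parents))


if __name__ == "__main__":
    # Editing the Flask base touches only the two Flask apps;
    # editing the root base touches everything built on it.
    print(affected_by("flask-base"))
    print(affected_by("base"))
```

In this model a fix to logging in "base" reaches all six descendants with one edit, which is the property the hand-copied LXC setup lacks.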
Builds, as an example. This was my first introduction to Docker - scaling Jenkins worker nodes was allocating yet another VM. As you scale to tens or hundreds of build slaves, you realize utilization across the cluster is down. Scaling builds with Docker was simple and efficient to implement and it allowed me to drive utilization up dramatically.
Now there are 15 different companies, integrations, and tools surrounding this space to make that process repeatable for the masses. Amazing.
We're seeing the same thing in orchestration, PaaS, SDN, IaaS, YOUR ACRONYM OF CHOICE, and one common theme is that the ecosystem is standardizing around Docker. That's pretty powerful.
Although it still supports lxc, docker now defaults to libcontainer (https://github.com/dotcloud/docker/tree/master/pkg/libcontai...), its own container implementation.
Possibly it was just a little ahead of its time and was also overshadowed by the rise of HW virtualisation in the late 2000s. Having to install a custom kernel (certainly when I used it) was also a bit of a hassle, mind you. Anyway - maybe someone will re-invent the toolchain using Swift or Node and it'll become cool again ;-)
So all other things being equal, if you slice up your machine into 5 equally apportioned segments and you run a user process in one of those 5 slices that tries to hog the whole machine it will only manage to create 1/5th of the load that it would be able to create if it were running directly on the guest OS.
So yes, linux does 'fair slicing' if you can live with the fact that a single process will determine what is fair and what is not. That that process gets pre-empted and that other processes get to run as well does not mean the machine is not now 100% loaded.
Using quota for disk space, 'nice', per-process limits for memory, chroot jails for isolation and so on you can achieve much the same effect, but a VM is so much easier to allocate. It does have significant overhead and of course it has its own set of issues (what doesn't?), but resource allocation is actually one of the stronger points of VMs.
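The 1/5 argument above can be put into numbers with a small Python sketch. It is a deliberately simplified model: CPU demand is a fraction of one machine, "hard capped" stands in for equally sized VM slices, and "work conserving" stands in for a fair-share scheduler with no limits configured. The function names are made up for illustration, and the model ignores any hard limits the kernel can be configured to enforce.

```python
def hard_capped_load(demands, slices):
    """Each tenant is confined to 1/slices of the machine, whatever the others do."""
    cap = 1.0 / slices
    return [min(d, cap) for d in demands]


def work_conserving_load(demands):
    """Fair sharing without caps: capacity nobody else wants goes to whoever asks for it."""
    alloc = [0.0] * len(demands)
    remaining = 1.0
    active = [i for i, d in enumerate(demands) if d > 0]
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # Tenants whose outstanding demand fits inside an equal share are fully served.
        satisfied = [i for i in active if demands[i] - alloc[i] <= share]
        if not satisfied:
            # Everyone left wants more than an equal share: split what remains evenly.
            for i in active:
                alloc[i] += share
            remaining = 0.0
            break
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = demands[i]
        active = [i for i in active if i not in satisfied]
    return alloc


if __name__ == "__main__":
    # One tenant tries to hog the whole machine, the other four are idle.
    demands = [1.0, 0.0, 0.0, 0.0, 0.0]
    print(sum(hard_capped_load(demands, slices=5)))
    print(sum(work_conserving_load(demands)))
```

With one hog and four idle tenants, the capped model leaves the machine one fifth loaded, while the work-conserving model hands the hog everything. Once other tenants become busy the scheduler does share fairly, but the machine stays fully loaded either way.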
The deployment manager sounded interesting but I'm not seeing any support for arbitrary platforms (in the OS sense), or infrastructure providers (in the 'run it on my own hardware, or someone else's' sense), nor the opsier side (like business concerns separate to technology) of the ops part.
Some thoughts roughly summarised at http://stani.sh/walter/pfcts/
Following up a bit.. Google just announced Kubernetes[1], an open source container manager. Also, Eric Brewer is now on Docker's Governance Committee[2] to help push for open container standards.
Seems like a good step forward.
[1] https://github.com/GoogleCloudPlatform/kubernetes [2] http://googlecloudplatform.blogspot.com/2014/06/an-update-on...
Of course you could try to escalate from a VM to the host (see cloudburst) but that's a rarity.
Docker seems to be less well protected against that sort of thing, but I'm nowhere near qualified to make that evaluation so I'll stick to 'seems' for now. It looks like the jump is a smaller one than from a VM.
Shared hosting of random antagonistic processes is something that many developers are not quite ready to embrace. If you are willing to run your service with poor isolation and questionable security then containers are just the thing. You'll definitely spend less money if you can serve in such an environment.
So they're orthogonal only as long as the security assumptions hold.
Since you asked, the drawback of just logging into an image, editing files, and installing software is that you can only reproduce that image by grabbing an entire file system.
When I use Docker, I create a git repository containing a 'Dockerfile', which is basically a series of shell commands to configure a machine. I also add copies of any configuration files I'll need, and use the Dockerfile to copy them onto the machine during setup.
This can be extremely fast in practice: Docker has a caching system which "runs" unchanged lines in the Dockerfile by looking up a cached image layer, so I can often edit the Dockerfile and rebuild the image in a second or so.
This approach is really nice when I have to look at an image a year later and figure out how I created it, perhaps with the goal of upgrading to a new OS release or whatever. I just glance at the Dockerfile, change the base OS version, and re-run it.
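For anyone wondering why rebuilds are fast and why changing the base OS re-runs everything, here is a rough Python sketch of the idea. It is a simplified model rather than Docker's actual implementation: each step gets a key derived from its own text plus every step before it, and a step only runs if its key is not already cached. The Dockerfile contents are hypothetical and held as plain strings, and the model ignores the fact that the real cache also looks at the contents of copied files.

```python
import hashlib


def layer_keys(instructions):
    """Chain a hash through the steps: each key depends on the step and everything before it."""
    keys, parent = [], ""
    for line in instructions:
        parent = hashlib.sha256((parent + "\n" + line).encode()).hexdigest()
        keys.append(parent)
    return keys


def build(instructions, cache):
    """Return the steps that actually had to run; steps found in `cache` are skipped."""
    executed = []
    for line, key in zip(instructions, layer_keys(instructions)):
        if key not in cache:
            executed.append(line)  # a real build would execute the step here
            cache.add(key)
    return executed


# Hypothetical Dockerfile contents, for illustration only.
DOCKERFILE_V1 = [
    "FROM ubuntu:12.04",
    "RUN apt-get update && apt-get install -y nginx",
    "COPY nginx.conf /etc/nginx/nginx.conf",
]
# Same file with only the last line edited.
DOCKERFILE_V2 = DOCKERFILE_V1[:2] + ["COPY nginx.conf /etc/nginx/conf.d/site.conf"]
# Same file with the base OS version changed on the first line.
DOCKERFILE_V3 = ["FROM ubuntu:14.04"] + DOCKERFILE_V1[1:]

if __name__ == "__main__":
    cache = set()
    for name, dockerfile in [("v1", DOCKERFILE_V1), ("v1 again", DOCKERFILE_V1),
                             ("v2", DOCKERFILE_V2), ("v3", DOCKERFILE_V3)]:
        print(name, "steps run:", len(build(dockerfile, cache)))
```

Editing the last line re-runs only that line, while changing the base OS on the first line invalidates every step after it, which matches the "change the base OS version and re-run it" workflow described above.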
sudo docker run -i -t ubuntu /bin/bash
sudo docker commit [container id] [tag]
Is that what you are looking for?
But for my part I tend to just layer changes piece by piece. So e.g. I have a docker image that's a "base development" image, and create separate docker containers to run each web app I'm experimenting with, with a shared volume with a separate docker container that I ssh into and run a screen session in with the code/git repositories. In effect that means that bringing up a new docker image for a new project is at most a couple of lines of change unless the project has particular/different dependencies.
If you ever heard some of Heroku's propaganda surrounding "eternal applications", it also applies here: the stack you're deploying on will never update out from under you, because it's pinned by your Dockerfile. You can confidently move your app to any old server and it'll run exactly the same. One good analogy I've heard is that a Docker image is the reductio ad absurdum of a static binary: it doesn't depend on your system, just on itself.