An update on container support on Google Cloud Platform (googlecloudplatform.blogspot.com)
My slides from the talk: https://speakerdeck.com/jbeda/containers-at-scale PDF: http://slides.eightypercent.net/GlueCon%202014%20-%20Contain...
Did Google switch to containers in a year? Maybe that answer is in your slides? If so, crazy...
Essentially it comes across as "I'm a fan of your early work, but nothing you've done since matters."
This also doesn't speak to the number of long running containers. There are plenty that don't stop/start during the week I grabbed that number.
They certainly seem to work well for that. Heroku, for example, uses containers not just for persistent processes (application servers, workers) but also for short-lived processes. Tasks that run on a schedule (hourly, daily, etc.) are run by, you guessed it, starting up a container running a process that exits when it's finished. One-off commands like maintenance scripts or REPLs work the same way.
[1] http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y2...
Containers are much more flexible than statically linked binaries. You could have multiple binaries in a container sharing a common set of dynamically linked files.
Fat binaries inside containers sounds a bit like the worst of both worlds...
When the size of the libraries (megabytes) is compared with the typical heap size of running jobs (gigabytes; a small number of large instances per job is typically a lot more efficient than a large number of small instances), the space savings of shared libraries become pretty negligible.
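To put rough numbers on that, here is a back-of-the-envelope sketch. The sizes and instance count are made-up assumptions for illustration, not figures from Google; only the orders of magnitude (megabytes of libraries, gigabytes of heap) come from the comment above.

```python
# Illustrative numbers only; none of these values come from the thread.
LIB_SIZE_MB = 50            # assumed size of the libraries linked into each binary
HEAP_SIZE_MB = 4 * 1024     # assumed heap of one running instance (4 GB)
INSTANCES_PER_MACHINE = 8   # assumed number of instances on one machine


def sharing_savings_fraction(lib_mb, heap_mb, instances):
    """Fraction of total space saved if all instances shared one library copy."""
    # Static linking: every instance carries its own copy of the libraries.
    static_total = instances * (lib_mb + heap_mb)
    # Shared libraries: one copy per machine, plus each instance's heap.
    shared_total = lib_mb + instances * heap_mb
    return (static_total - shared_total) / static_total


savings = sharing_savings_fraction(LIB_SIZE_MB, HEAP_SIZE_MB, INSTANCES_PER_MACHINE)
print(f"Sharing libraries would save about {savings:.1%} of the space")
```

With these assumed values the saving is on the order of one percent, which is the sense in which it is "pretty negligible".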
Back then, one of the main bottlenecks in the system was the central scheduler, in particular the amount of work it had to do tracking which binary packages were installed on each machine and which were needed for the candidate jobs it might plan to run on those machines. Having many packages per job just makes the scheduler bottleneck worse.
There were two places where it did actually make sense to share packages between jobs:
- Java Virtual Machine and supporting libraries
- libc
These were external (so changed much less frequently), quite large, and were needed by very large numbers of jobs, so the space savings of having one copy of each needed version per machine outweighed the extra scheduling load required for them.
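A toy sketch of the bookkeeping described above, to show why more packages per job means more scheduler work. This is not how Google's scheduler was implemented; the function names, machine names, and package names are all hypothetical.

```python
def missing_packages(installed, needed):
    """Packages a machine would still have to fetch before it can run the job."""
    return needed - installed


def pick_machine(machines, needed):
    """Return the name of the machine that needs the fewest new packages.

    `machines` maps a machine name to the set of packages installed on it.
    Ties are broken by machine name. Each candidate costs one set difference,
    so the work grows with both the number of machines and packages per job.
    """
    return min(
        sorted(machines),
        key=lambda name: len(missing_packages(machines[name], needed)),
    )


# Hypothetical cluster state and job requirements.
machines = {
    "machine-a": {"libc", "jvm"},
    "machine-b": {"libc"},
}
job_needs = {"libc", "jvm", "myjob-binary"}

best = pick_machine(machines, job_needs)
print(best, sorted(missing_packages(machines[best], job_needs)))
```

In this sketch a job shipped as one fat binary plus the shared JVM and libc packages only ever adds a single new entry to track per machine, which is the trade-off the comment describes.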
Also Ubuntu 14.04 LTS has much worse performance than Ubuntu 12.04 LTS, so it's better to stick to 12.04 LTS for now.
In my opinion it's better to port to Debian, or to run an Ubuntu image inside a container hosted on Debian, until they have native Ubuntu support.
[1] http://gigaom.com/2014/04/02/google-launches-andromeda-a-sof...
[2] http://googlecloudplatform.blogspot.co.il/2014/04/enter-andr...
The simpler solution is to use their ready-made Debian images, which can be easily configured via CLI or API.