GPUs as a service with Kubernetes Engine are now generally available (cloudplatform.googleblog.com)
What I want to use Kubernetes + an instant GPU fleet for is deep learning hyperparameter grid search (i.e. spin up a lot of preemptible GPUs and, for each parameter config, train the model on a single GPU in parallel, so the scan speeds up linearly with the number of GPUs).
Kubeflow (https://github.com/kubeflow/kubeflow) is close to this functionality, but not quite there yet in user-friendliness (you have to package everything in a huge Docker container and launch jobs from the CLI; ideally what I want to do is spawn containers and start training directly from the JupyterHub notebook on the master node).
> (assuming that Google allows enough GPU quota for a fleet of GPUs for non-enterprise users anyways)
This is actually why we have separate preemptible quota [1], which we grant more freely. You can't stock out our full-price customers, so we're happy to let you spin up tons of V100s (and as of this morning TPUs!).
[1] https://cloud.google.com/compute/quotas#quotas_for_preemptib...
def train(param1, param2, ..):
    # tf code here (import any libraries here)
args_dict = {'lr': [0.001, 0.0001, 0.00001], 'dropout': [0.4, 0.6, 0.8]}
experiment.launch(spark, train, args_dict)
The reason this works on clusters is that there is a persistent conda environment installed on all hosts in the cluster for every 'project' (think GitHub project). The above function 'train' is a pyspark mapper, and the Python libraries it needs are found in the local conda environment. You install conda libraries by search/click in a UI, not by writing a Dockerfile. You create your cluster by selecting how many GPUs and how much memory you need in a UI, not by writing a YAML file. Disclosure: I work on this project.
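For readers without that platform, the set of configurations behind a grid search can be enumerated in plain Python. This is only a minimal sketch: expand_grid is a hypothetical helper, not part of the project above, and the thread does not say whether experiment.launch itself takes the full cross product, so the full grid here is an assumption.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict per combination of the listed hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


# Same values as the example above.
grid = {'lr': [0.001, 0.0001, 0.00001], 'dropout': [0.4, 0.6, 0.8]}
configs = expand_grid(grid)

# Each config is independent of the others, which is what makes
# one-GPU-per-config parallelism possible.
for config in configs:
    print(config)
```

Each resulting dict could then be handed to a separate training job.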
Cloud ML Engine is a bit opaque in terms of price efficiency (how powerful is a "training unit"?) but I suppose that's another option.
Everything generally works well, except perhaps the initial phase, when some containers won't port well from nvidia-docker-compose due to problems with CUDA libraries. Ideally, you need to match the version of CUDA everywhere.
My dev setup for quick experimentation with GPU docker container on GKE: https://tensorflight.blog/2018/02/23/dev-environment-for-gke... .
First off, this was for a _small_ personal project. Something that I originally intended to run on an f1-micro. I decided to check out Kubernetes mostly to learn, but also to see if it could offer a more maintainable setup (typically I just write a mess of provisioning shell scripts and cloud-init scripts to bring up servers. A bit of a mess to maintain long-term). So basically, I was using Kubernetes "wrong"; its target audience is workloads that intend to use a fleet of machines. But I trudged forward anyway.
This resulted in the first problem. You can't spin up a Kubernetes cluster with just one f1-micro. Google won't let you. I could either do a 3x f1-micro cluster, which would be ~$12/month, or 1x g1-small, which would be about the same price. Contrast with my original plan of a single f1-micro, which is ~$4/mo. Hmm...
Well after playing around I discovered a "bug" in gcloud's tooling. You can spin up a 3x f1-micro cluster. Then add a node pool with just one or two f1-micros in it. Then kill the original node pool. This is all allowed, and results in a final cluster with only one or two nodes in it. Nice. "I know what I'm doing, Google is just being a dick!" I thought. I could still spin up Pods on the cluster, no problem.
Then the second discovery. The Kubernetes console was reporting unschedulable system pods. Turns out, Google has a reason for these minimums.
All the system pods (the pods that help orchestrate the show, provide logging, metrics, etc.) take up a whopping 700 MB of RAM and a good chunk of CPU as well. I was a bit shocked.
I'm sure most developers are just shrugging right now. 700 MB is nothing these days. But remember, my original plan was a single f1-micro which only has 700 MB. This is a personal project, so every bit counts to keep long-term costs down. And, in deep contrast to Kubernetes' gluttony, the app I intend to run on this system only uses ~32 MB under usual loads. That's right; 32 MB. It's a webapp running on a Rust web server.
So hopefully you can imagine my shock at Kube's RAM usage. As I dug in I discovered almost all of the services are built using Go. No wonder. I love Go, but it's a memory hog. My mind started imagining what the RAM usage would be like if all these services had been written in Rust...
Point being, 700 MB exceeds what one f1-micro can handle. And it exceeds what two f1-micros can handle, because a lot of those services are per-node services. On top of that comes the base RAM usage of the (surprisingly) bloated Container-Optimized OS that Google runs on the nodes in the cluster. (Spinning up a Container-Optimized OS image on a GCE instance, I measured something like 500 MB or more of RAM usage on a bare install...) Hence why Google won't let you spin up a cluster of fewer than three f1-micros. You can, however, use a single g1-small, since it has 1.7 GB of RAM in a single instance.
At this point I resigned myself to just having a cluster of three nodes. Shrug. The expense of learning, I suppose. And perhaps I could re-use the cluster to host other small projects.
It was at this point I hit another road block. To expose services running on your cluster you, more or less, have to use the LoadBalancer feature of Kube. It's convenient; a single-line configuration option and BAM, your service is live on the internet with a static IP. Except for one small detail that Google barely mentions. Their load balancers cost, at minimum, $18/mo. That's more than my whole cluster! And my original budget was $4/mo...
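To make the "single line" concrete, here is a minimal sketch of a Service manifest built as a Python dict and printed as JSON. The name "webapp", the selector label, and port 8080 are placeholders assumed for illustration, not details from this comment.

```python
import json

# Hypothetical names: "webapp" and port 8080 are placeholders.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "webapp"},
    "spec": {
        "type": "LoadBalancer",  # the one line that provisions a cloud load balancer
        "selector": {"app": "webapp"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

manifest = json.dumps(service, indent=2)
print(manifest)
```

kubectl accepts JSON manifests as well as YAML, so the printed output could be saved to a file and applied with kubectl apply -f.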
There are workarounds, but they are ugly. NodePort doesn't work because you can't expose port 80 or 443 with it. You can use host networking or a host port; something like that. Basically build your own load balancer Pod, assign it to a specific node, and manually assign a static IP to that node. (hand waving and roughly recalling the awkward solution I conjured). But it requires manual intervention every time you want to perform maintenance on your cluster. The opposite of what I was trying to achieve.
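A rough sketch of what such a workaround might look like, as a Pod manifest built in Python. This is an assumption about the general shape, not the exact setup described here: the pod name, the nginx image, and the node name "my-static-ip-node" are all hypothetical placeholders.

```python
import json

# Hypothetical names: pod name, image, and node name are placeholders.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "edge-proxy"},
    "spec": {
        # Pin the pod to the one node that was manually given the static IP.
        "nodeSelector": {"kubernetes.io/hostname": "my-static-ip-node"},
        "containers": [
            {
                "name": "proxy",
                "image": "nginx",
                "ports": [
                    # hostPort binds the container port on the node itself,
                    # so 80 and 443 are reachable without a NodePort.
                    {"containerPort": 80, "hostPort": 80},
                    {"containerPort": 443, "hostPort": 443},
                ],
            }
        ],
    },
}

print(json.dumps(pod, indent=2))
```

The pinning is the part that creates the maintenance burden: if that node is replaced, the selector and the static IP assignment both have to be redone by hand.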
To sum it all up: you need to be willing to spend _at least_ $30/mo on any given Kubernetes-based project.
So I gave up on that idea. For now I've fallen back on provisioning shell scripts again. Though I've shoved my application into containers and am using Docker Compose to at least make deployment a little nicer.
I also took a few hours to run through the Kubernetes The Hard Way tutorial, thirsty for a deeper understanding of how Kube works under the hood. It's a fascinating system. But after working through the tutorial it became _very_ clear that Kube isn't something you'd want to run yourself. Not unless you have a dedicated sysadmin/devops person to manage it.
Also interesting is that Kube falls over when you need to run a relational database. The impedance mismatch is too great. Kube is designed for services that can be spread across many disposable nodes. That's not something Postgres etc. are designed for. So the current recommendation, if you're using a relational database, is to just use traditional provisioning or a managed service like Cloud SQL.
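The $30 figure follows directly from the approximate prices quoted above; a quick check of the arithmetic:

```python
# Approximate monthly prices (USD) as quoted in this comment.
F1_MICRO_PER_MONTH = 4
MIN_NODES = 3
LOAD_BALANCER_PER_MONTH = 18

cluster_cost = F1_MICRO_PER_MONTH * MIN_NODES
total_cost = cluster_cost + LOAD_BALANCER_PER_MONTH

print(f"cluster: ~${cluster_cost}/mo, with load balancer: ~${total_cost}/mo")
print(f"that is {total_cost / F1_MICRO_PER_MONTH:.1f}x the original single f1-micro plan")
```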
P.S. For as long as I've used Google Cloud, I have been, and continue to be, eternally frustrated by the service. It's a complete mess. Last week while doing this exploration I ran into a problem where half my nodes were zombies, never starting and taking an hour to finally die. I had to switch regions to "fix" the problem. Gcloud provides _no_ support by default, even though I'm a paying customer. Rather, you have to pay _more_ for the _privilege_ of talking to someone about problems with the services you're already paying for. Incredibly frustrating, but that's Google's typical M.O.
Not to mention: 1) poor, outdated documentation; 2) the gcloud CLI is abysmally slow to tab-complete even simple stuff; 3) the web console is made of molasses and eats an ungodly amount of CPU resources just sitting and doing nothing; 4) little to no way to restrict billing; the best you can do for most services is just set up an alert and pray that you're awake if shit hits the fan; 5) I'm not sure I can recall a single gcloud command I've run lately that hasn't spewed at least one warning or deprecation notice at me.
You'd think that Canonical's tools would work on their LTS, right? Wrong. Absolutely wrong. It couldn't even do an OpenStack install either. Absolutely abysmal.
This is a known issue that is not easy to solve in the general case. I think Tim Hockin ran a conversation about how to autoscale on the very low end at last year's Kubecon, with people like you in mind. The other use case he brought up is how to set up services in a Minikube cluster that might be running in a 2GB VM.
700MB was the sum of all the requested minimum RAM for all those service pods. So yeah, you're probably right that they're ceilings of sorts. Still, it's a bit crazy to see a logging service, whose job is merely to haul logs off to a different server, requesting 200MB.
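For anyone wanting to reproduce that kind of sum, here is a minimal sketch of adding up memory requests written as Kubernetes quantity strings. It handles only the common suffixes, not the full quantity grammar, and the pod names and values in the example are hypothetical, made up for illustration; only the roughly 200 MB logging figure comes from this thread.

```python
# Multipliers for the common Kubernetes memory suffixes (not the full grammar).
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "k": 10**3, "M": 10**6, "G": 10**9}


def parse_memory(quantity):
    """Convert a quantity string such as '200Mi' to a number of bytes."""
    # Check two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * UNITS[suffix])
    return int(quantity)  # a bare number is already in bytes


def total_requests_mib(requests):
    """Sum a {pod name: memory request} mapping and return the total in MiB."""
    return sum(parse_memory(q) for q in requests.values()) / 2**20


# Hypothetical pods and values, for illustration only.
requests = {"logging-agent": "200Mi", "example-a": "100Mi", "example-b": "1Gi"}
print(f"{total_requests_mib(requests):.0f} MiB requested in total")
```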
I'm also bewildered by Container Optimized OS's memory consumption. IIRC it was 500MB+ bare; doing nothing. As reported by top. I forget which, but I stood up either Debian Stretch or Ubuntu 18.04 and it was only ~200MB with Docker installed.
W.r.t. the GCP console and tools: I guess it's a preference thing, but I vastly prefer them over the AWS tools. They work fine for me. I like the feature in the GUI where it shows you the equivalent gcloud command line.
To be clear on #1, App Engine has been nothing but reliable for me. Yet it receives few updates; for example, only supporting Python 2.7...
#2: It works great with Datastore, but for SQL you have to use a separate instance or Cloud SQL; either will cost additional money and maintenance. And, last I checked, Postgres was a no-go for App Engine.
#3: It can be hard to secure App Engine apps properly. User data leaking in the logs, for example. And I've encountered a few bugs that lead me to distrust the runtime they use. (I reported the bugs, but still). Where security is of the utmost importance, I have to opt for my own stack.
[This is all Standard Environment. Flexible is brutally expensive.]
Many of us really really really love GKE and get a lot of use out of it. And yes, I run it myself on about 1h of work a week (no dedicated devops) for my entire startup.
The great Kelsey Hightower's tutorials take advantage of the credits. https://github.com/kelseyhightower/kubernetes-the-hard-way/b...
2) True, but you would need this no matter where you are running, right? Also, I think the new sandbox runtimes (Java8, Node.js, Python3) should support Postgres.
3) Leaking data in the logs can happen with any app though? What in particular are you seeing? WRT the sandbox, the new one should be a lot more robust as well.
I'd definitely give App Engine another shot, maybe in the near future. We are definitely investing a lot in it; I'm sorry it hasn't felt that way for a while.
(I work for GCP)
Was COS a standalone GCE instance or GKE? In the latter case, memory will be used by the usual suspects: fluentd, kubelet, kube-proxy, docker, node-problem-detector. For both, there are also a few Google daemons in Python (ugh).