GPUs as a service with Kubernetes Engine are now generally available (cloudplatform.googleblog.com)

81 points by rey12rey 8 years ago | 23 comments

minimaxir 8 years ago |

With the new discounts on preemptible GPUs (https://cloudplatform.googleblog.com/2018/06/Introducing-imp...), the economics of quickly spinning up a fleet of GPUs with Kubernetes for a quick parallelizable ML task become very interesting. (assuming that Google allows enough GPU quota for a fleet of GPUs for nonenterprise users anyways)

What I want to use Kubernetes + instant-GPU-fleet for is deep learning hyperparameter grid searching. (i.e. spin up a lot of preemptible GPUs; for each parameter config, train the model on a single GPU in parallel, so the scan speeds up linearly with the number of GPUs).
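A rough sketch of what that could look like: expand the grid into one Kubernetes Job manifest per config, each asking for a single GPU on a preemptible node. The image name, job names and flag format here are made up for illustration; `nvidia.com/gpu` and the `cloud.google.com/gke-preemptible` node label are the standard GKE ones. This only builds the manifests (as JSON, which kubectl accepts), it doesn't submit anything.

```python
import itertools
import json


def grid_to_jobs(grid, image):
    """Build one Kubernetes Job manifest per hyperparameter combination."""
    names = sorted(grid)
    jobs = []
    combos = itertools.product(*(grid[n] for n in names))
    for i, values in enumerate(combos):
        config = dict(zip(names, values))
        # Hypothetical convention: pass each hyperparameter as a --key=value flag.
        args = [f"--{k}={v}" for k, v in config.items()]
        jobs.append({
            "apiVersion": "batch/v1",
            "kind": "Job",
            "metadata": {"name": f"grid-search-{i}"},
            "spec": {
                "template": {
                    "spec": {
                        # Retry in place if the preemptible node goes away.
                        "restartPolicy": "OnFailure",
                        "nodeSelector": {
                            "cloud.google.com/gke-preemptible": "true",
                        },
                        "containers": [{
                            "name": "train",
                            "image": image,  # hypothetical training image
                            "args": args,
                            # GPUs are requested as whole units under limits.
                            "resources": {"limits": {"nvidia.com/gpu": 1}},
                        }],
                    }
                }
            },
        })
    return jobs


grid = {"lr": [0.001, 0.0001], "dropout": [0.4, 0.6, 0.8]}
jobs = grid_to_jobs(grid, image="gcr.io/my-project/trainer:latest")
print(len(jobs), "jobs")
print(json.dumps(jobs[0], indent=2))
```

Each manifest could then be written to a file and submitted with `kubectl apply -f`.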

Kubeflow (https://github.com/kubeflow/kubeflow) is close to this functionality, but not quite there yet in user-friendliness. (you have to package everything in a huge Docker container and launch jobs from the CLI; ideally what I want to do is to spawn containers and start training directly from the JupyterHub notebook on the master node)

boulos 8 years ago | |

Disclosure: I work on Google Cloud (and helped launch Preemptible VMs).

> (assuming that Google allows enough GPU quota for a fleet of GPUs for nonenterprise users anyways

This is actually why we have separate preemptible quota [1], which we grant more freely. You can't stock out our full-price customers, so we're happy to let you spin up tons of V100s (and as of this morning TPUs!).

[1] https://cloud.google.com/compute/quotas#quotas_for_preemptib...

minimaxir 8 years ago | | |

I was wondering why there was a separate quota. That makes much more sense!

jamesblonde 8 years ago | |

If you are adventurous enough to try a more user-friendly platform for Distributed TensorFlow (Horovod, grid-search using Spark/TensorFlow), have a look at http://www.hops.io (watch video on front page). In Hops, you don't program infrastructure (no YAML file, no Dockerfiles). You can run with 100s of GPUs for grid search using this python code:

  from hops import experiment

  def train(lr, dropout):
      # tf code here (import any libraries here)
      # argument names are assumed to match the dictionary keys below
      pass

  args_dict = {'lr': [0.001, 0.0001, 0.00001], 'dropout': [0.4, 0.6, 0.8]}
  experiment.launch(spark, train, args_dict)

The reason this works on clusters is that there is a persistent conda environment installed on all hosts in the cluster for every 'project' (think Github project). The above function 'train' is a pyspark mapper and the python libraries needed in it are found in the local conda environment. You install conda libraries by search/click in a UI, not by writing a Dockerfile. You create your cluster by selecting how many GPUs/memory you need in a UI, not by writing a YML file.
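To make the idea concrete without a cluster, here is a local stand-in that maps a train function over every combination in the dictionary, using a thread pool in place of Spark executors. This is an illustration of grid search in general under that assumption, not a description of how Hops implements it; `run_grid` and the placeholder `train` body are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train(lr, dropout):
    # Placeholder for real training code; just reports what it was given.
    return f"trained lr={lr} dropout={dropout}"


def run_grid(train_fn, grid, max_workers=4):
    """Call train_fn once per combination of values in grid, in parallel."""
    names = list(grid)
    combos = [dict(zip(names, values)) for values in product(*grid.values())]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of combos.
        outputs = list(pool.map(lambda cfg: train_fn(**cfg), combos))
    return list(zip(combos, outputs))


grid = {"lr": [0.001, 0.0001, 0.00001], "dropout": [0.4, 0.6, 0.8]}
results = run_grid(train, grid)
print(len(results), "combinations")
```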

Disclosure: I work on this project.

w1nk 8 years ago | |

Out of curiosity, why are you grid searching if you have access to Google's infrastructure (Vizier, etc.)?

minimaxir 8 years ago | | |

You mean Cloud ML Engine? (https://cloud.google.com/ml-engine/)

Cloud ML Engine is a bit opaque in terms of price efficiency (how powerful is a "training unit"?) but I suppose that's another option.

kozikow 8 years ago |

We have been using GPUs with GKE for a while. At some point, we used 20+ GPUs in a production workflow without any problems.

Everything generally works well, except maybe the initial phase, when some containers won't port well from nvidia-docker-compose due to problems with CUDA libraries. Ideally, you need to match the version of CUDA everywhere.

My dev setup for quick experimentation with GPU docker container on GKE: https://tensorflight.blog/2018/02/23/dev-environment-for-gke... .

rjain15 8 years ago |

Are these GPUs on bare metal or virtualized GPUs on VMs?

wmf 8 years ago | |

Everything in GCE/GKE is running inside VMs, but when you attach a GPU you get the whole GPU (via PCI passthrough).

fpgaminer 8 years ago |

On a related note, last week I took a dive into Kubernetes on gcloud for a personal project and came out with some interesting knowledge.

First off, this was for a _small_ personal project. Something that I originally intended to run on an f1-micro. I decided to check out Kubernetes mostly to learn, but also to see if it could offer a more maintainable setup (typically I just write a mess of provisioning shell scripts and cloud-init scripts to bring up servers. A bit of a mess to maintain long-term). So basically, I was using Kubernetes "wrong"; its target audience is workloads that intend to use a fleet of machines. But I trudged forward anyway.

This resulted in the first problem. You can't spin up a Kubernetes cluster with just one f1-micro. Google won't let you. I could either do a 3x f1-micro cluster, which would be ~$12/month, or 1x g1-small, which would be about the same price. Contrast with my original plan of a single f1-micro, which is ~$4/mo. Hmm...

Well after playing around I discovered a "bug" in gcloud's tooling. You can spin up a 3x f1-micro cluster. Then add a node pool with just one or two f1-micros in it. Then kill the original node pool. This is all allowed, and results in a final cluster with only one or two nodes in it. Nice. "I know what I'm doing, Google is just being a dick!" I thought. I could still spin up Pods on the cluster, no problem.

Then the second discovery. The Kubernetes console was reporting unschedulable system pods. Turns out, Google has a reason for these minimums.

All the system pods, the pods that help orchestrate the show, provide logging, metrics, etc; they take up a whopping 700 MB of RAM and a good chunk of CPU as well. I was a bit shocked.

I'm sure most developers are just shrugging right now. 700 MB is nothing these days. But remember, my original plan was a single f1-micro which only has 700 MB. This is a personal project, so every bit counts to keep long-term costs down. And, in deep contrast to Kubernetes' gluttony, the app I intend to run on this system only uses ~32 MB under usual loads. That's right; 32 MB. It's a webapp running on a Rust web server.

So hopefully you can imagine my shock at Kube's RAM usage. As I dug in I discovered most all of the services are built using Go. No wonder. I love Go, but it's a memory hog. My mind started imagining what the RAM usage would be like if all these services had been written in Rust...

Point being, 700 MB exceeds what one f1-micro can handle. And it exceeds what two f1-micros can handle, because a lot of those services are per-node services, and that's on top of the base RAM usage of the (surprisingly) bloated Container-Optimized OS that Google runs on the nodes in the cluster. (Spinning up a Container-Optimized OS image on a GCE instance, I measured something like 500 MB or more of RAM usage on a bare install...) Hence Google won't let you spin up a cluster of fewer than three f1-micros. You can, however, use a single g1-small, since it has 1.7 GB of RAM in a single instance.

At this point I resigned myself to just having a cluster of three nodes. Shrug. The expense of learning, I suppose. And perhaps I could re-use the cluster to host other small projects.

It was at this point I hit another roadblock. To expose services running on your cluster you, more or less, have to use the LoadBalancer Service type in Kube. It's convenient; a single-line configuration option and BAM, your service is live on the internet with a static IP. Except for one small detail that Google barely mentions. Their load balancers cost, at minimum, $18/mo. That's more than my whole cluster! And my original budget was $4/mo...
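For reference, the single-line option is the `type` field on a Service. A minimal sketch, written as a Python dict and dumped to JSON (which kubectl accepts); the `webapp` name, the `app: webapp` selector and port 8080 are placeholders, not anything from my actual setup.

```python
import json

# Minimal Service manifest; setting type to LoadBalancer is what makes the
# cloud provider provision an external load balancer for it.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "webapp"},          # hypothetical name
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "webapp"},       # hypothetical pod label
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

print(json.dumps(service, indent=2))
```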

There are workarounds, but they are ugly. NodePort doesn't work because you can't expose port 80 or 443 with it. You can use host networking or a host port; something like that. Basically build your own load balancer Pod, assign it to a specific node, and manually assign a static IP to that node. (hand waving and roughly recalling the awkward solution I conjured). But it requires manual intervention every time you want to perform maintenance on your cluster. The opposite of what I was trying to achieve.
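Roughly, the host port variant looks like the following: a Pod pinned to one node via the hostname label, binding ports 80 and 443 directly on that node. This is a sketch from memory rather than the exact config I ended up with; the pod name, node name and nginx image are placeholders.

```python
import json

# Pod that binds ports 80/443 on the node itself and is pinned to a single
# node, so a static IP assigned to that node reaches it.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "edge-proxy"},      # hypothetical name
    "spec": {
        # Pin to one specific node (placeholder node name).
        "nodeSelector": {"kubernetes.io/hostname": "my-node-1"},
        "containers": [{
            "name": "proxy",
            "image": "nginx:stable",
            "ports": [
                {"containerPort": 80, "hostPort": 80},
                {"containerPort": 443, "hostPort": 443},
            ],
        }],
    },
}

print(json.dumps(pod, indent=2))
```

The pinning is exactly what makes it fragile: if that node is replaced, the static IP has to be reassigned by hand.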

To sum it all up; you need to be willing to spend _at least_ $30/mo on any given Kubernetes based project.
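The arithmetic behind that number, using the approximate prices quoted above (they are my rough figures, not an official price list):

```python
# Approximate monthly prices as quoted in this comment.
F1_MICRO_PER_MONTH = 4.0        # one f1-micro node
MIN_NODES = 3                   # smallest f1-micro cluster Google allows
LOAD_BALANCER_PER_MONTH = 18.0  # minimum load balancer cost

cluster = MIN_NODES * F1_MICRO_PER_MONTH
total = cluster + LOAD_BALANCER_PER_MONTH

print(f"cluster: ${cluster:.0f}/mo, total: ${total:.0f}/mo")
```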

So I gave up on that idea. For now I've fallen on provisioning shell scripts again. Though I've shoved my application into containers and am using Docker-Compose to at least make it a little nicer deployment.

I also took a few hours to run through the Kubernetes The Hard Way tutorial; thirsty for a deeper understanding of how Kube works under the hood. It's a fascinating system. But after working through the tutorial it became _very_ clear that Kube isn't something you'd want to run yourself. Not unless you have a dedicated sysadmin/devops person to manage it.

Also interesting is that Kube falls over when you need to run a relational database. The impedance mismatch is too great. Kube is designed for services that can be spread across many disposable nodes, which is not something Postgres/etc. are designed for. So the current recommendation, if you're using a relational database, is to just use traditional provisioning or a managed service like Cloud SQL.

P.S. For as long as I've used Google Cloud, I have been, and continue to be, eternally frustrated by the service. It's a complete mess. Last week while doing this exploration I ran into the problem where half my nodes were zombies; never starting and taking an hour to finally die. I had to switch regions to "fix" the problem. Gcloud provides _no_ support by default, even though I'm a paying customer. Rather, you have to pay _more_ for the _privilege_ of talking to someone about problems with the services you're already paying for. Incredibly frustrating, but that's Google's typical M.O.

Not to mention: 1) poor, outdated documentation; 2) the gcloud CLI is abysmally slow to tab-complete even simple stuff; 3) the web console is made of molasses and eats an ungodly amount of CPU resources just sitting and doing nothing; 4) little to no way to restrict billing; the best you can do for most services is just set up an alert and pray that you're awake if shit hits the fan; 5) I'm not sure I can recall a single gcloud command I've run lately that hasn't spewed at least one warning or deprecation notice at me.

erikb 8 years ago |

How about making vanilla k8s usable on-premise first...

ethanwillis 8 years ago | |

I have a lot of bare metal servers in a rack at my home. Tried getting kube running well last night, actually. Well, if you're using Ubuntu Server 18.04, GOOD LUCK. There are plenty of issues in the various repositories around kube. I eventually just tried using Canonical's conjure-up tool to install kube.

You'd think that Canonical's tools would work on their LTS, right? Wrong. Absolutely wrong. It couldn't even do an OpenStack install either. Absolutely abysmal.

erikb 8 years ago | | |

Yep, and there are altogether different problems that come up after you get it running. E.g. you probably want to make use of PVCs, but there is no stable dynamic-storage provider yet. Google isn't even working on it, as far as GitHub is showing.