Instrumentation checklist for running large GPU clusters | Dark Hacker News