ParallelKernelBench: Can LLMs write fast multi-GPU kernels? | Dark Hacker News