AutoMegaKernel: Compile an LLM into one provably-correct CUDA megakernel | Dark Hacker News