AutoMegaKernel: Compile an LLM into one provably-correct CUDA megakernel (github.com)
4 points by OsamaJaber 25 days ago | 0 comments