AutoMegaKernel: Compiling a LLM into a single CUDA kernel (arxiv.org)
3 points by OsamaJaber 23 days ago | 0 comments