Modular beat Nvidia's cuBLAS kernels on B200s in 170 LOC | Dark Hacker News