Optimizing Parallel Reduction in Metal for Apple M1 | Dark Hacker News