Scaling Transformers at Cohere: What I Learned | Dark Hacker News