Writing an LLM from scratch, part 14 – the complexity of self-attention at scale