Every FLOP Counts: Scaling 300B MoE LLMs Without Premium GPUs [pdf]