M6 – 10T Parameters at 1% GPT-3’s Energy Cost | Dark Hacker News