Speculative sampling: LLMs writing a lot faster using smaller LLMs | Dark Hacker News