Accelerating LLM Inference with Parallel Draft Models (PARD) | Dark Hacker News