Accelerate CPU Based LLM Inference with a Vector Index on the Output Embeddings | Dark Hacker News