What is SOTA for retrieval in RAG systems now?

2 points by stormfather 364 days ago | 0 comments

Have there been significant improvements this year?

The simple flow we landed on in 2024 was:

1. Chunk and embed docs with embedding model
2. Embed query (maybe using an LLM to reformulate first)
3. Retrieve N1 docs using cosine similarity
4. Narrow to N2 using a reranking model
5. Inject these docs into context to generate answer
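
For anyone who wants something concrete to compare against, here is a rough sketch of that flow in plain Python. Everything in it is a stand-in I made up for illustration: embed() is a bag-of-words counter rather than a real embedding model, rerank_score() is simple term overlap rather than a real reranking model, and the names chunk, retrieve, build_prompt, n1 and n2 are hypothetical. Query reformulation and the final LLM call are left out; the last step only builds the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(doc, size=12):
    """Step 1 (chunking): split a document into fixed-size word windows."""
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def rerank_score(query, passage):
    """Toy stand-in for a reranking model: share of query terms in the passage."""
    q_terms = set(embed(query))
    p_terms = set(embed(passage))
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def retrieve(query, docs, n1=4, n2=2):
    """Steps 1-4: chunk + embed docs, embed query, take top N1, rerank to N2."""
    index = [(c, embed(c)) for d in docs for c in chunk(d)]
    q_vec = embed(query)
    # Step 3: first-stage retrieval by cosine similarity.
    candidates = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)[:n1]
    # Step 4: narrow the shortlist with the (toy) reranker.
    reranked = sorted((c for c, _ in candidates),
                      key=lambda c: rerank_score(query, c), reverse=True)
    return reranked[:n2]


def build_prompt(query, passages):
    """Step 5: inject the surviving passages into the context for generation."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling retrieve(query, docs) and passing the result to build_prompt(query, passages) gives the string you would hand to the generator. In a real system the two toy functions are where the embedding model and the reranker would plug in, and a graph-based retriever would replace or augment the cosine step.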

Have there been significant advancements since then? Has anyone seen improvements from using graph structures (e.g., Neo4j) for more sophisticated retrieval?
