Stop Blaming Embeddings: Most RAG Failures Come from Bad Chunking

Everyone keeps arguing about embeddings, vector DBs, and model choice, but in real systems those aren’t the things breaking retrieval.
Chunking drift is. And almost nobody monitors it.
A tiny formatting change in a PDF or HTML file silently shifts boundaries. Overlaps become inconsistent. Semantic units get split mid-thought. Headings flatten. Cross-format differences explode. By the time retrieval quality drops, people start tweaking the model… while the actual problem happened upstream.
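To make that concrete, here is a minimal sketch, assuming a naive fixed-width character chunker. The `chunk_fixed` helper and the sample text are made up for illustration and don't come from any particular library. One extra header line at the top of the document shifts every downstream boundary, so none of the old chunks survive intact.

```python
def chunk_fixed(text: str, size: int = 40) -> list[str]:
    """Hypothetical naive chunker: cut the text every `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Version 1 of a document, and version 2 after a re-export added a page header.
v1 = (
    "Refunds are processed within 5 days. "
    "Contact support to start a return. "
    "Shipping fees are not refundable."
)
v2 = "Page 1\n" + v1

chunks_v1 = chunk_fixed(v1)
chunks_v2 = chunk_fixed(v2)

# Chunks whose text is byte-for-byte identical in both versions.
unchanged = set(chunks_v1) & set(chunks_v2)
print(f"{len(unchanged)} of {len(chunks_v1)} chunks survived the format change")
```

The content of the document didn't change at all, yet every chunk (and so every embedding) is now different. Structure-aware splitters are less fragile than this, but the same effect shows up whenever the extracted structure shifts.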
If you diff chunk boundaries across versions or track chunk-size variance, the drift is obvious. But most teams don’t even version their chunking logic, let alone validate segmentation or check adjacency similarity.
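The checks above can be sketched in a few lines of standard-library Python. Everything here is an assumption for illustration: the function names are hypothetical, chunks are compared by a hash of their whitespace-normalized text, and Jaccard word overlap stands in for the embedding cosine similarity you would more likely use in a real pipeline.

```python
import hashlib
import statistics


def chunk_fingerprints(chunks: list[str]) -> set[str]:
    """Hash each chunk's whitespace-normalized text so versions compare by content."""
    return {
        hashlib.sha256(" ".join(c.split()).encode("utf-8")).hexdigest()
        for c in chunks
    }


def boundary_drift(old: list[str], new: list[str]) -> float:
    """Fraction of old chunks with no identical counterpart in the new version."""
    old_fp, new_fp = chunk_fingerprints(old), chunk_fingerprints(new)
    if not old_fp:
        return 0.0
    return len(old_fp - new_fp) / len(old_fp)


def size_stats(chunks: list[str]) -> tuple[float, float]:
    """Mean and population standard deviation of chunk lengths, in characters."""
    sizes = [len(c) for c in chunks]
    if not sizes:
        return 0.0, 0.0
    return statistics.mean(sizes), statistics.pstdev(sizes)


def adjacency_similarity(chunks: list[str]) -> list[float]:
    """Jaccard word overlap between neighbouring chunks (stand-in for cosine)."""
    sims = []
    for a, b in zip(chunks, chunks[1:]):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        union = wa | wb
        sims.append(len(wa & wb) / len(union) if union else 0.0)
    return sims


# Same document chunked by two versions of the pipeline (made-up sample data).
old = [
    "Refunds take 5 days.",
    "Contact support to start a return.",
    "Shipping fees are not refundable.",
]
new = [
    "Refunds take 5 days. Contact support",
    "to start a return.",
    "Shipping fees are not refundable.",
]

drift = boundary_drift(old, new)
print(f"boundary drift: {drift:.2f}")
print("size mean/stdev, old:", size_stats(old))
print("size mean/stdev, new:", size_stats(new))
print("adjacency similarity, new:", adjacency_similarity(new))
```

Run something like this on a fixed sample of documents every time the chunker or an upstream parser changes, and alert when drift or size variance jumps. The right thresholds depend on your corpus, so treat any cutoff as something to calibrate rather than a given.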
The industry treats chunking like a trivial preprocessing step. It’s not.
It’s the single biggest source of retrieval collapse, and it’s usually invisible.
Before playing with new embeddings, fix your segmentation pipeline. Chunking is repetitive, undifferentiated engineering, but if you don’t stabilize it, the rest of your RAG stack is built on sand.