It is completely incoherent. Apparently we just need markdown and git, but also a knowledge graph and pgvector, which accounts for most of the performance.
We don’t need semantic search, because we use… hybrid search (semantic search plus BM25)???
Really bad look for an AI consulting company this.
But good news, this company also has a free ebook. I am sure it is fantastic.
Seeing the confusion this article caused, maybe someone will find it useful.
I personally turned off the indexing feature in Cursor and I use it that way. I haven't noticed any drop in accuracy, though my codebase is not an enterprise-sized one.
git.exe can't tell you things like how many references a specific type has. It can get close with grep and friends, but it's not very precise. Preprocessing the codebase into various SQL tables using compiler tools can provide these insights in a much more stable way.
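To make the grep-versus-compiler point concrete, here is a rough sketch in Python. Everything in it is my own illustration: the `refs` table, the `index_references` and `count_references` helpers, and the sample source are hypothetical, and Python's `ast` module stands in for "compiler tools". It only matches by name and does not resolve scopes or imports, so a real indexer would need proper symbol resolution.

```python
import ast
import sqlite3

# Hypothetical sample file. "Point" appears 8 times as text,
# but only 5 of those are actual uses of the type.
SOURCE = '''
class Point:
    pass

def make() -> Point:
    return Point()

def shift(p: Point) -> Point:
    # Point in a comment is not a reference
    label = "Point"
    return Point()
'''

def index_references(conn, path, source):
    """Parse one file and store class definitions and name uses as rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refs "
        "(path TEXT, name TEXT, line INTEGER, kind TEXT)"
    )
    rows = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            rows.append((path, node.name, node.lineno, "definition"))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            # A name being read: calls, annotations, plain expressions.
            rows.append((path, node.id, node.lineno, "reference"))
    conn.executemany("INSERT INTO refs VALUES (?, ?, ?, ?)", rows)

def count_references(conn, name):
    """Count name-based references, ignoring comments and string literals."""
    row = conn.execute(
        "SELECT COUNT(*) FROM refs WHERE name = ? AND kind = 'reference'",
        (name,),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
index_references(conn, "example.py", SOURCE)

text_hits = SOURCE.count("Point")           # what a plain text search sees
ast_hits = count_references(conn, "Point")  # what the parsed tree sees
print(text_hits, ast_hits)
```

A plain text count picks up the definition, the comment, and the string literal as well, while the query over the parsed tree only counts places where the name is actually used.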
A single SQLite database implements columns/metadata handling, and comes baked-in with FTS and BM25 ranking too.
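For anyone who hasn't tried it, a minimal sketch of what that looks like from Python's built-in `sqlite3` module. The `docs` table, its columns, the sample rows, and the `search` helper are made up for illustration, and it assumes your SQLite build has the FTS5 extension compiled in (most do, but not all).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# FTS5 virtual table: `path` is stored as metadata but not indexed,
# `body` is tokenized and searchable.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, body)")
conn.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("notes/a.md", "rebuild the search index after every commit"),
        ("notes/b.md", "how the index is stored and when the index is rebuilt"),
        ("notes/c.md", "unrelated text about markdown and git"),
        ("notes/d.md", "meeting notes from last week"),
        ("notes/e.md", "a list of open questions"),
    ],
)

def search(conn, query, limit=5):
    """Return (path, score) pairs ranked by FTS5's built-in bm25().

    bm25() returns lower values for better matches, so an ascending
    ORDER BY puts the best hit first.
    """
    return conn.execute(
        "SELECT path, bm25(docs) FROM docs "
        "WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT ?",
        (query, limit),
    ).fetchall()

results = search(conn, "index")
for path, score in results:
    print(path, score)
```

No extra service and no separate index to keep in sync: the metadata column and the ranked full-text search live in the same file.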