Ask HN: Open-source alternatives to enterprise-grade code indexing/RAG systems?

I've been exploring various AI coding assistants (Cursor, GitHub Copilot, Devin, etc.) and noticed they all share a common foundation: sophisticated retrieval-augmented generation (RAG) systems that enable deep understanding of codebases. These systems excel at:

- Rapidly indexing entire codebases
- Semantic search across code snippets
- Contextual ranking of relevant code sections
- Integration with LLMs for enhanced code understanding

While proprietary solutions are abundant, I'm looking for open-source alternatives that could provide similar functionality. Specifically:

- Tools for building and maintaining code indexes
- Systems that can integrate with existing LLMs
- Solutions for semantic code search and retrieval
- Frameworks for contextual code understanding

Has anyone built or worked with open-source tools that could serve as building blocks for such a system? I'm particularly interested in hearing about:

- Real-world implementations
- Performance comparisons with commercial solutions
- Scalability considerations
- Integration challenges

The goal is to understand what's possible with current open-source technology in this space, and potentially contribute to building more accessible alternatives to proprietary systems.
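
For concreteness, here is a toy sketch of the index, search, and rank steps mentioned above. It is not how any of the commercial tools work internally (I don't know that); it uses only the Python standard library and swaps in plain TF-IDF with cosine similarity as a stand-in for learned embeddings. The names `CodeIndex`, `tokenize`, and `cosine`, and the sample file contents, are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; snake_case and camelCase identifiers are split apart."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(p.lower() for p in parts)
    return tokens


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CodeIndex:
    """Hypothetical in-memory index: one entry per code chunk."""

    def __init__(self):
        self.paths = []            # chunk identifiers, e.g. file paths
        self.term_counts = []      # term frequencies per chunk
        self.doc_freq = Counter()  # number of chunks containing each term

    def add(self, path, text):
        counts = Counter(tokenize(text))
        self.paths.append(path)
        self.term_counts.append(counts)
        self.doc_freq.update(counts.keys())

    def _weights(self, counts):
        # Smoothed TF-IDF; terms never seen in the index get df = 0.
        n = len(self.paths)
        return {
            t: c * (math.log((1 + n) / (1 + self.doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }

    def search(self, query, k=3):
        """Return up to k (path, score) pairs, best match first."""
        q = self._weights(Counter(tokenize(query)))
        scored = []
        for path, counts in zip(self.paths, self.term_counts):
            score = cosine(q, self._weights(counts))
            if score > 0:
                scored.append((path, score))
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = CodeIndex()
    index.add("auth.py", "def verify_password(user, password): return check_hash(user.hash, password)")
    index.add("db.py", "def connect_database(url): return create_engine(url)")
    index.add("http.py", "def parseRequestHeaders(raw): return dict(line.split(':') for line in raw)")
    for path, score in index.search("password hash check"):
        print(path, round(score, 3))
```

In a real system the TF-IDF vectors would presumably be replaced by embeddings from a code-aware model, the in-memory lists by a vector store, and the top-k chunks would be passed to an LLM as context. Those are exactly the pieces I'm hoping to find open-source options for.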