This post explains how our vector index is implemented inside the same storage engine as the graph (so no separate vector store is needed), how we avoid storing vectors twice, and how the choice of scalar type (f32, f16, etc.) affects memory usage. It also covers some implementation details: the USearch-backed index, concurrency, and recovery behavior.
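To give a feel for why the scalar type matters, here is a rough back-of-the-envelope sketch in Python. It is not how the engine accounts for memory: the helper name `raw_vector_bytes` and the `SCALAR_BYTES` table are made up for illustration, and the figure covers only the raw vector payload, ignoring index graph links, metadata, and allocator overhead. The sizes plugged in (1M vectors, 1024 dimensions) match the benchmark setup.

```python
# Illustrative only: bytes per component for a few common scalar types.
SCALAR_BYTES = {"f64": 8, "f32": 4, "f16": 2, "i8": 1}


def raw_vector_bytes(num_vectors: int, dim: int, scalar: str) -> int:
    """Raw vector payload in bytes (hypothetical helper).

    Excludes index graph links, metadata, and allocator overhead.
    """
    return num_vectors * dim * SCALAR_BYTES[scalar]


if __name__ == "__main__":
    for scalar in ("f32", "f16", "i8"):
        gib = raw_vector_bytes(1_000_000, 1024, scalar) / 2**30
        print(f"{scalar}: {gib:.2f} GiB")
    # f32: 3.81 GiB
    # f16: 1.91 GiB
    # i8: 0.95 GiB
```

Halving the scalar width halves the raw payload, and keeping a second copy of every vector would double it, which is why both choices show up directly in RAM usage. These numbers are plain arithmetic, not the measured benchmark results.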
We included a benchmark on 1M nodes with 1024-dimensional embeddings comparing versions 3.7.2 and 3.8.0. The newer version used substantially less RAM, while load and response times stayed similar. Happy to answer technical questions.