When building semantic search and LLM-powered apps, you often need to try out several vector databases, because they differ widely in feature set, cost, and other characteristics. The Vector-io library introduces a universal open format for storing vector datasets (vectors along with their metadata), plus import and export scripts for a wide range of vector databases (more to come!). This should make it easier to back up, snapshot, and share vector datasets, and to move data between vector DBs. I'm also curating a list of publicly available datasets in this format, which can be loaded directly from HuggingFace into your favorite vector DB: https://huggingface.co/collections/aintech/vector-datasets-v.... If you have data in a vector DB, please try it out and let me know if you have any feedback. Thanks!