This work builds on our previous efforts, which gave a 10x performance improvement by generating LLM embeddings[2] from input text and tuning vector recall[3] in a single process, avoiding excessive network transit.
We'd love your feedback on our roadmap[4] for this extension, especially if you have other use cases for an ML application database. So far, we've implemented our best practices for scalable vector storage as a reference implementation for interacting with an ML application database based on Postgres.
[1]: https://github.com/postgresml/postgresml/tree/master/pgml-sd...
[2]: https://postgresml.org/blog/generating-llm-embeddings-with-o...
[3]: https://postgresml.org/blog/tuning-vector-recall-while-gener...
[4]: https://github.com/postgresml/postgresml/tree/master/pgml-sd...