Show HN: Efficient Data Formats for GPT (nikas.praninskas.com)
I wonder if we might also see LLM-specific data serialisation formats in the future, designed to tokenize as efficiently as possible and to improve the generative capability of the models.
As for using DBs, that's certainly an option (e.g. via LangChain and similar tools), but at some point you still need to bring the data into the context, so I'd say it's still interesting to consider what would be an efficient way to represent that data as text.
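To make the "efficient text representation" point a bit more concrete, here is a rough sketch that serialises the same rows three ways and compares their sizes. The sample records and the helper names (`to_json`, `to_compact_json`, `to_csv`, `rough_size`) are made up for illustration, and character count is only a crude stand-in for token count; a real comparison would run each string through the target model's tokenizer, and the ranking could shift.

```python
import csv
import io
import json

# Hypothetical sample records, invented for illustration only.
records = [
    {"id": 1, "name": "Alice", "city": "Vilnius"},
    {"id": 2, "name": "Bob", "city": "Kaunas"},
    {"id": 3, "name": "Carol", "city": "Klaipeda"},
]


def to_json(rows):
    """Pretty-printed JSON: every key is repeated for every row."""
    return json.dumps(rows, indent=2)


def to_compact_json(rows):
    """JSON with the optional whitespace removed."""
    return json.dumps(rows, separators=(",", ":"))


def to_csv(rows):
    """CSV: keys appear once in the header, then one line per row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def rough_size(text):
    """Character count, used here only as a crude proxy for token count."""
    return len(text)


if __name__ == "__main__":
    for label, fn in [
        ("pretty JSON", to_json),
        ("compact JSON", to_compact_json),
        ("CSV", to_csv),
    ]:
        print(f"{label}: {rough_size(fn(records))} characters")
```

For flat, uniform rows like these, the CSV form comes out shortest mainly because the keys are stated once rather than per row. Whether the model reads it as reliably as JSON is a separate question that this sketch does not measure.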