Ask HN: Why aren't there Open source embedding models with context length > 512?
Models trained on single sentences (like Mini-V2, the SBERT default) perform worse as inputs get longer, so splitting a text into sentences and pooling the per-sentence representations is typically more useful.
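To make the pooling idea concrete, here is a minimal sketch, assuming mean pooling followed by L2 normalization (one common choice; the comment above does not say which pooling is meant). The names `split_sentences`, `mean_pool`, `embed_document`, and `toy_embed` are hypothetical, and `toy_embed` is only a stand-in so the example runs without a model. In practice you would pass in whatever function your sentence-embedding model exposes for encoding a single sentence.

```python
import math
import re
from typing import Callable, List, Sequence

Vector = List[float]


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    # A real pipeline would likely use a proper sentence tokenizer.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def mean_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    # Element-wise average of equal-length vectors.
    if not vectors:
        raise ValueError("need at least one vector to pool")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def l2_normalize(vec: Sequence[float]) -> Vector:
    # Scale to unit length so cosine similarity reduces to a dot product.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed_document(text: str, embed_sentence: Callable[[str], Vector]) -> Vector:
    # Embed each sentence separately, then pool into one document vector.
    sentences = split_sentences(text)
    return l2_normalize(mean_pool([embed_sentence(s) for s in sentences]))


def toy_embed(sentence: str) -> Vector:
    # Hypothetical stand-in for a real model: [character count, word count].
    return [float(len(sentence)), float(len(sentence.split()))]


if __name__ == "__main__":
    print(embed_document("One. Two three! Four?", toy_embed))
```

Each sentence stays within the short context the model was trained on, and only the pooled vector represents the full text. Mean pooling discards sentence order, which is part of why it is a workaround rather than a substitute for a genuinely long-context model.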
When you deliberately want a single representation of a longer text, embeddings taken from a generative model or dedicated document embeddings are sometimes the right answer.