11 points by skelts 1 year ago

skelts 1 year ago

Netflix and Cornell University researchers have exposed significant flaws in cosine similarity. Their study reveals that regularization in linear matrix factorization models introduces arbitrary scaling, leading to unreliable or meaningless cosine similarity results. These issues stem from the flexibility of embedding rescaling, affecting downstream tasks like recommendation systems.

throwaway314155 1 year ago

> Netflix and Cornell University researchers

Strange times.

refulgentis 1 year ago

Flagged. Significant parts are AI-generated, and it misrepresents the writeup. It is titled "Cosine Similarity Isn't the Silver Bullet We Thought It Was", and the article's point is (editorializing) "it's a bad idea to use LLM embeddings, trained for dot product, for cosine similarity".

From the paper:

- "when a model is trained w.r.t. the dot-product, its effect on cosine-similarity can be opaque and sometimes not even unique. One solution obviously is to train the model w.r.t. to cosine similarity"

- "We find that the underlying reason is not cosine similarity itself, but the fact that the learned embeddings have a degree of freedom that can render arbitrary cosine-similarities even though their (unnormalized) dot-products are well-defined and unique"
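A minimal sketch of that degree of freedom, not taken from the paper: the 2-d "user" and "item" embeddings and the scaling vector `d` below are made-up illustrative values. Scaling one factor by `d` and the other by `1/d` per dimension leaves every user-item dot product unchanged, yet the item-item cosine similarity moves from roughly 0.71 to nearly 1.0.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def rescale(rows, scales):
    # Multiply each embedding dimension by its own scale factor.
    return [[x * s for x, s in zip(row, scales)] for row in rows]

# Hypothetical embeddings (assumed values, for illustration only).
users = [[1.0, 2.0], [3.0, 1.0]]
items = [[2.0, 1.0], [1.0, 3.0]]

# Arbitrary per-dimension rescaling: users get d, items get 1/d.
d = [10.0, 0.1]
users_scaled = rescale(users, d)
items_scaled = rescale(items, [1.0 / s for s in d])

# User-item dot products (what the model was trained on) do not change.
scores = [[dot(u, i) for i in items] for u in users]
scores_scaled = [[dot(u, i) for i in items_scaled] for u in users_scaled]

# Item-item cosine similarity does change under the same rescaling.
cos_before = cosine(items[0], items[1])
cos_after = cosine(items_scaled[0], items_scaled[1])

print("dot products equal:", all(
    math.isclose(a, b)
    for row_a, row_b in zip(scores, scores_scaled)
    for a, b in zip(row_a, row_b)
))
print("cosine before:", cos_before)
print("cosine after: ", cos_after)
```

Both factorizations fit the training objective equally well, so nothing in a dot-product-trained model pins down which cosine value is "the" similarity.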

emschwartz 1 year ago

Very interesting. Thanks for sharing!