Show HN: Fingerprinting Text Embedding Models via Floating-Point Artifacts (colab.research.google.com)

I implemented a sliding-window-based mean n-gram histogram vector approach for fingerprinting embedding models after coming across the post [1] below by Han Xiao of Jina AI, and it worked far better than I expected! Links to the Colab notebook [2] and a quick visualization [3] are below.

I had this idea a couple of years ago but couldn't get myself to work on it. Seeing the post got me thinking about it again, and I was pleasantly surprised by the results.

[1] https://jina.ai/news/identifying-embedding-models-from-raw-n...
[2] https://colab.research.google.com/drive/1CTFltQrHRTViYSs3JLr...
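
For anyone who wants the gist without opening the notebook, here is a minimal sketch of what a sliding-window mean n-gram histogram could look like. It is not the notebook's code. The post does not say what the n-grams are taken over, so this sketch assumes byte n-grams of the little-endian float32 representation of an embedding. The names `ngram_histogram`, `fingerprint`, and `cosine`, and the defaults `window=64`, `stride=32`, `n=2`, are all hypothetical choices made for illustration.

```python
import struct
from collections import Counter


def ngram_histogram(data: bytes, n: int = 2) -> dict:
    """Relative frequency of each byte n-gram in `data` (sparse histogram)."""
    total = len(data) - n + 1
    if total <= 0:
        return {}
    counts = Counter(data[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


def fingerprint(embedding, window: int = 64, stride: int = 32, n: int = 2) -> dict:
    """Mean n-gram histogram over sliding windows of the raw float32 bytes.

    Assumption: the embedding is serialized as little-endian float32 and the
    n-grams are taken over those bytes. The original notebook may differ.
    """
    raw = struct.pack(f"<{len(embedding)}f", *embedding)
    # Window start offsets; a single window if the data is shorter than `window`.
    starts = range(0, max(len(raw) - window, 0) + 1, stride)
    mean = {}
    for start in starts:
        hist = ngram_histogram(raw[start:start + window], n)
        for gram, freq in hist.items():
            mean[gram] = mean.get(gram, 0.0) + freq / len(starts)
    return mean


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse histograms."""
    dot = sum(value * b.get(gram, 0.0) for gram, value in a.items())
    norm_a = sum(value * value for value in a.values()) ** 0.5
    norm_b = sum(value * value for value in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

With something like this, one could average the fingerprints of a few embeddings known to come from a given model and then match an unknown embedding to the nearest average by `cosine`. Whether byte-level n-grams separate models as well as the notebook's approach does is not something this sketch demonstrates.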