Similarity joins on large datasets – Minhash | Dark Hacker News