Rendezvous Hashing Explained

Rendezvous Hashing Explained(randorithms.com)

63 points by yinso 4 years ago | 11 comments

fredrb 4 years ago |

There are also other approaches to the problem, like Jump Hash, Multi-Probe and Maglev hashing. Each of them offer a different combination of load balancing, scalability and performance trade-offs. [1] provides a good analysis of the available methods and their tradeoffs.

[1]: https://dgryski.medium.com/consistent-hashing-algorithmic-tr...

46Bit 4 years ago |

I've implemented this recently (https://github.com/46bit/distributed-systems/tree/main/rende....) Does anyone know a similarly accessible guide to adding ordering (e.g. Vector Clocks)?

eru 4 years ago | |

It's a shame that Go essentially forces you hard-code the key and value types. (Or alternatively makes you do unspeakable things to avoid that hard coding.)

Btw, you might want to add weights, they are very satisfying, once you understand them.

eru 4 years ago |

It's a shame they don't explain the weighted version, and neither does the linked talk.

The way weighted rendezvous hashing works is really beautiful, and easy enough to derive semi-formally to explain in a short blog post.

cryptica 4 years ago |

Rendezvous hashing is simple to understand. I'm surprised that the article didn't mention the skeleton-based variant which provides O(log N) lookup (same as consistent hashing).

Based on my experience implementing the skeleton-based variant, it does add a fair bit of additional complexity compared to plain rendez-vous hashing but IMO, it's still much simpler to implement than ring-based consistent hashing (with optimizations).

The main downside of skeleton-based rendez-vous hashing compared to ring-based consistent hashing is that when a server (aka 'site') leaves the cluster, you need to iterate over all the keys in order to do the re-mapping. Skeleton-based rendez-vous hashing minimizes the number of keys which will need to be moved when a server leaves the cluster (at least as well as consistent hashing does), but you still need to go through every key in order to check whether or not it has moved; there are no shortcuts to check only a subset of keys. With consistent hashing, if a server leaves the cluster, you only need to check keys which fall within the affected 'sector' of the ring so you can completely ignore all the other keys.

That said, I find that skeleton-based rendez-vous hashing is simple to implement and maintain, it's fast and it spreads keys pretty evenly (especially in the long run). Consistent hashing can become uneven over time if multiple nearby severs/sites leave the ring; that's why some solutions make use of 'virtual sites/servers' which never leave the ring (but this adds complexity and overhead).

LgWoodenBadger 4 years ago |

"To ensure that each key gets a unique permutation, we also have to make the hash function depend on the key. But this is not difficult - the solution is to concatenate the key with each server or to use the server ID as a hash seed."

Did the author mean to say "use the key as a hash seed?" Otherwise, I don't see how using server ID as a seed makes the hash dependent on the key.

ComputerGuru 4 years ago | |

“Concatenate the key with each server” -> use (key + unique id) as the key, giving a deterministically unique result.

“Use the server id as the hash seed” -> apart from the encryption, seed the result by first hashing the unique id then updating with the actual content before finalizing the hash, depending on the hash algorithm in question akin to keyed_hash(unique_id, actual_payload)

zvrba 4 years ago |

> If our first choice for a server goes offline, we simply move the key to the second server in the list (which becomes our new first choice).

It doesn't make sense: how do you move the keys from an offline server? The pictures in the post suggest that keys are not redundantly stored on other servers.

pkhuong 4 years ago | |

Many load-balancing use cases (e.g., caches or stateless services that simply work better with affinity) don't need the old data. If you need redundancy, the ordering also tells you where to duplicate the data: durably send updates to the first k + 1 servers in the list, and you're safe against up to k failures.

henns 4 years ago | |

You can couple Rendezvous Hashing with a redundancy method like replication or erasure coding.

The default for Tahoe-LAFS (a distributed filesystem that uses a form of Rendezvous Hashing) uses erasure coding to split each file into (by default) 10 segments of which any 3 are necessary to reconstruct the file. Those 10 segments are then stored across the first 10 servers in the list. [https://tahoe-lafs.readthedocs.io/en/latest/architecture.htm...]

That way, even when servers go away with your data (whether due to crash or even network partitions(!)), you still have a decent chance to locate your data.

bencoleman 4 years ago |

Author here - thanks for the feedback everybody! Agree that the weighted version and the skeleton version would be cool to talk about. Perhaps in future :)