Consistent Hashing

Distribute keys across nodes so that only K/N keys are remapped when a node is added or removed (vs. all keys with simple modulo hashing).

The Problem with Simple Hashing

serverIndex = hash(key) % N — when N changes, almost every key maps to a different server, causing massive cache invalidation or data reshuffling.
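A quick sketch makes the failure mode concrete. The helper below (hypothetical, not part of any library) counts how many keys change servers when a modulo-hashed cluster grows from 4 to 5 nodes; the exact percentage will vary slightly with the key set.

```python
import hashlib

def server_index(key, n):
    # Stable hash; Python's built-in hash() is salted per process
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if server_index(k, 4) != server_index(k, 5))
print(f"{moved / len(keys):.0%} of keys remapped")  # typically ~80%
```

With a uniform hash, a key keeps its server only when hash % 4 == hash % 5, which happens for just 4 of every 20 residues — so roughly 80% of keys move for a single node change.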

How It Works

  1. Map both servers and keys onto a circular hash ring (positions 0 to 2³² − 1)
  2. Each key is assigned to the next server clockwise
  3. Add/remove a node → only its adjacent keys are affected

Virtual Nodes

One physical server maps to multiple points on the ring → better load distribution.
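A minimal self-contained sketch of this effect (the ring helpers here are illustrative, not the class from the implementation below): assign many keys with 1 virtual node per server versus 100, and compare the per-server load.

```python
import bisect
import hashlib
from collections import Counter

def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes):
    # One (hash, node) point per virtual node, sorted around the ring
    return sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

def lookup(ring, key):
    points = [h for h, _ in ring]
    i = bisect.bisect_left(points, _hash(key))
    return ring[i % len(ring)][1]  # wrap around at the end of the ring

nodes = ["a", "b", "c"]
keys = [f"k{i}" for i in range(30_000)]
for vnodes in (1, 100):
    ring = build_ring(nodes, vnodes)
    load = Counter(lookup(ring, k) for k in keys)
    print(vnodes, "vnodes:", dict(load))
```

With a single point per server, one server typically owns a far larger arc of the ring than the others; with 100 points each, the arcs average out and the loads converge toward an even split.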

Key Properties

Keys remapped on node change: ~K/N (minimal)
Without virtual nodes: Uneven load distribution
With virtual nodes: Balanced distribution
Lookup complexity: O(log N) with a sorted ring
Space complexity: O(N × V), where V = virtual nodes per server

When to Use / Avoid

✅ Distributed caches (horizontal scaling)
✅ Sharding databases across many nodes
✅ Load balancing where nodes join/leave frequently
❌ Small fixed clusters (simple modulo is fine)
❌ When you need strict ordering guarantees

Implementation

import bisect
import hashlib

class ConsistentHash:
    def __init__(self, nodes=None, virtual_nodes=3):
        """
        Initialize consistent hash ring.

        Args:
            nodes: list of node identifiers (e.g., server names)
            virtual_nodes: replicas per physical node for load balancing
        """
        self.virtual_nodes = virtual_nodes
        self.ring = {}  # Maps hash value -> node
        self.sorted_keys = []  # Sorted hash values for binary search

        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        """Hash a key to an integer position on the ring."""
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16)

    def add_node(self, node):
        """Add a node (physical server) to the ring."""
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}#{i}"
            hash_value = self._hash(virtual_key)
            self.ring[hash_value] = node

            # Keep sorted_keys updated for O(log N) lookup
            bisect.insort(self.sorted_keys, hash_value)

    def remove_node(self, node):
        """Remove a node from the ring."""
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}#{i}"
            hash_value = self._hash(virtual_key)

            del self.ring[hash_value]
            self.sorted_keys.remove(hash_value)

    def get_node(self, key):
        """
        Find the node responsible for a key.

        Returns the first node clockwise from the key's hash position.
        """
        if not self.ring:
            return None

        hash_value = self._hash(key)

        # Find the first node >= hash_value (walk clockwise)
        index = bisect.bisect_left(self.sorted_keys, hash_value)

        # Wrap around if we reach the end
        if index == len(self.sorted_keys):
            index = 0

        return self.ring[self.sorted_keys[index]]

# Example usage
if __name__ == "__main__":
    nodes = ["server_a", "server_b", "server_c"]
    ch = ConsistentHash(nodes, virtual_nodes=3)

    # Route some keys
    keys = ["user:123", "user:456", "cache:abc"]
    for key in keys:
        print(f"{key} -> {ch.get_node(key)}")

    # Add a new node (only ~1/3 of keys are remapped)
    ch.add_node("server_d")
    print("\nAfter adding server_d:")
    for key in keys:
        print(f"{key} -> {ch.get_node(key)}")
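The ~K/N claim can be checked empirically. This self-contained sketch mirrors the ring logic above (with more virtual nodes to smooth the arcs) and measures what fraction of keys move when a fourth node joins a three-node ring; the exact figure depends on the hash positions.

```python
import bisect
import hashlib

def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def make_ring(nodes, vnodes=100):
    return sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

def owner(ring, key):
    i = bisect.bisect_left([h for h, _ in ring], _hash(key))
    return ring[i % len(ring)][1]

keys = [f"user:{i}" for i in range(10_000)]
before = make_ring(["a", "b", "c"])
after = make_ring(["a", "b", "c", "d"])
moved = sum(1 for k in keys if owner(before, k) != owner(after, k))
print(f"{moved / len(keys):.0%} moved")  # roughly K/N = 1/4 of the keys
```

Note that only keys now owned by the new node move: the existing nodes' ring points are unchanged, so no key shuffles between a, b, and c.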

How Real Systems Use This

Cassandra

Cassandra uses consistent hashing with virtual nodes (vnodes) on a token ring. Each physical server gets ~256 virtual nodes by default, distributing data and repairs evenly. The Murmur3 partitioner hashes partition keys to 64-bit token values on a ring from -2^63 to 2^63-1. This minimizes rebalancing when nodes join or leave: only the keys in the token ranges the new node claims are reassigned, moving from the nodes that previously owned those ranges. Configuration: num_tokens: 256 in cassandra.yaml.

DynamoDB

DynamoDB applies consistent hashing to partition keys across partitions, with an internal sharding layer. When you provision throughput, AWS automatically rebalances partitions using consistent hashing principles. If a partition exceeds its capacity, DynamoDB splits it (similar to adding a virtual node); affected items are rehashed to the new partition. This design allows seamless scaling without downtime — only affected partition keys are remapped.

Memcached

Memcached clients implement consistent hashing on the client side (not the server), using the libketama algorithm. The client maintains a local hash ring of all cache servers and routes requests accordingly. This avoids the “thundering herd” problem of simple modulo hashing: when a server fails, only ~1/N of cache keys rehash to other servers instead of all keys remapping. Many high-traffic sites run Memcached across 100+ servers with client-side consistent hashing.
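A hedged sketch of ketama-style point generation: libketama derives ring points for a server by taking a series of MD5 digests and splitting each 16-byte digest into four 4-byte little-endian integers, yielding 160 points per unit of weight. The string format and byte order below follow that scheme but should be treated as illustrative.

```python
import hashlib

def ketama_points(server: str, weight: int = 1):
    points = []
    for i in range(40 * weight):
        digest = hashlib.md5(f"{server}-{i}".encode()).digest()
        for j in range(4):
            # Four 4-byte little-endian ints per 16-byte digest
            point = int.from_bytes(digest[4 * j:4 * j + 4], "little")
            points.append(point)
    return points

pts = ketama_points("10.0.0.1:11211")
print(len(pts))  # 160 ring points for one server of weight 1
```

Because every client derives the same points from the same server list, all clients agree on key placement without any coordination.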

Akamai CDN

Akamai pioneered consistent hashing (Karger et al. 1997 paper) for content routing across edge servers. When a user requests content, Akamai hashes the URL to determine the closest edge server. Adding a new edge location is seamless: only a small fraction of content is reassigned, avoiding the need to re-provision or flush large portions of the network. Akamai’s variant uses hierarchical locality (geographic regions) combined with the base consistent hash algorithm.

Redis Cluster

Redis Cluster uses a hash slot-based variant (not pure consistent hashing) with 16,384 fixed slots instead of a continuous ring. Each key is hashed modulo 16,384 to assign it to a slot; slots are then distributed across cluster nodes. This design trades some of consistent hashing’s benefits (minimal rebalancing) for simpler configuration and faster rehashing during node failures. When a node fails, affected slots are quickly reassigned; when a node joins, slots are gradually migrated to it.
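The slot computation itself is simple enough to sketch: HASH_SLOT = CRC16(key) mod 16384, where the CRC16 is the XMODEM variant, and keys containing a non-empty {hash tag} hash only the tag so that related keys land in the same slot. This is an illustrative reimplementation, not Redis's own code.

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC-16/XMODEM: polynomial 0x1021, initial value 0, MSB-first
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # If the key contains {...} with a non-empty tag, hash only the tag
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
```

The hash-tag rule is what makes multi-key operations possible in a cluster: clients can force related keys into one slot, and hence onto one node.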

This post is licensed under CC BY 4.0 by the author.