
Consistent Hashing

Understand consistent hashing — hash ring mechanics, virtual nodes, minimal rebalancing on node changes, and hotspot mitigation. The algorithm behind every distributed cache and database.

28 min read · 10 sections
01

The Big Picture — Why Traditional Hashing Fails

The simplest way to distribute data across N servers is hash(key) % N. Key "user:42" hashes to 7, and 7 % 3 = 1, so it lands on server 1. Fast, simple, deterministic. But it has a fatal flaw: when N changes, almost everything moves.

Why hash(key) % N Breaks

3 servers: hash(key) % 3
  key A → hash=7  → 7 % 3 = 1  → Server 1
  key B → hash=12 → 12 % 3 = 0 → Server 0
  key C → hash=19 → 19 % 3 = 1 → Server 1
  key D → hash=25 → 25 % 3 = 1 → Server 1
  key E → hash=31 → 31 % 3 = 1 → Server 1

Add 1 server → 4 servers: hash(key) % 4
  key A → hash=7  → 7 % 4 = 3  → Server 3 → MOVED
  key B → hash=12 → 12 % 4 = 0 → Server 0 → same
  key C → hash=19 → 19 % 4 = 3 → Server 3 → MOVED
  key D → hash=25 → 25 % 4 = 1 → Server 1 → same
  key E → hash=31 → 31 % 4 = 3 → Server 3 → MOVED

Result: 3 out of 5 keys moved (60%).
At scale: adding 1 server to a 100-server cluster moves ~99% of keys.
Every moved key = cache miss = database hit = potential overload.
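The 60% figure above generalizes, and it is easy to check empirically. A minimal sketch (assuming SHA-256 as the hash function, purely for illustration) measures how many keys change servers when a modulo-hashed cluster grows from 3 to 4:

```python
# Sketch: measure how many keys move under hash(key) % N when N goes
# from 3 to 4. SHA-256 is an illustrative choice of hash function.
import hashlib

def bucket(key: str, n: int) -> int:
    """Map a key to one of n servers with hash(key) % n."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 3) != bucket(k, 4))
print(f"{moved / len(keys):.0%} of keys moved")
```

A key stays put only when its hash gives the same remainder mod 3 and mod 4 (i.e. hash % 12 < 3), so roughly three quarters of keys move.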
🪑

The Circular Table Analogy

Imagine people sitting around a circular table. Each person handles the tasks (keys) closest to them going clockwise. If someone leaves, only their tasks are redistributed — to the next person clockwise. Everyone else keeps their tasks. If a new person sits down, they only take over the tasks between them and the previous person. Minimal disruption. This is consistent hashing — a circular arrangement where adding or removing a node only affects its immediate neighbors.

🔥 The Core Problem

With hash(key) % N, changing N reshuffles almost everything. With consistent hashing, changing N only moves ~K/N keys (where K is the total number of keys). Adding 1 server to a 100-server cluster moves ~1% of keys instead of ~99%.

02

What Is Consistent Hashing?

Consistent hashing maps both keys and nodes onto the same circular hash space (a "ring"). Each key is assigned to the nearest node clockwise on the ring. When a node is added or removed, only the keys between it and its neighbors are affected — everything else stays put.

🎯 Goals

  • Distribute data evenly across nodes
  • Minimize data movement when nodes change
  • Deterministic — same key always maps to same node
  • No central coordinator needed

🏗️ Used In

  • Amazon DynamoDB (partition routing)
  • Apache Cassandra (token ring)
  • Memcached / Redis cluster (key distribution)
  • CDNs (request routing to edge servers)
  • Load balancers (sticky session routing)
03

Hash Ring Mechanics

The hash ring is a circular number line from 0 to 2³² - 1 (or any large range). Both nodes and keys are hashed onto this ring. A key is assigned to the first node encountered going clockwise from the key's position.

🕐

The Clock Face

Think of a clock. Servers are placed at specific hours (e.g., Node A at 2 o'clock, Node B at 6 o'clock, Node C at 10 o'clock). When a key hashes to 4 o'clock, you walk clockwise until you hit the next server — that's Node B at 6 o'clock. Node B owns all keys between 2 o'clock and 6 o'clock.

Hash Ring — Step by Step
Hash space: 0 ────────────────────────── 2³²
Wrapped into a ring (0 connects back to 2³²)

Step 1: Place nodes on the ring
  hash("NodeA") = 100 → position 100
  hash("NodeB") = 300 → position 300
  hash("NodeC") = 600 → position 600

Ring:
        0

  NodeA ● (100)

  NodeB ● (300)


  NodeC ● (600)

        └──── wraps back to 0

Step 2: Place keys on the ring
  hash("user:42")  = 150 → walk clockwise → NodeB (300)
  hash("user:7")   = 50  → walk clockwise → NodeA (100)
  hash("user:99")  = 450 → walk clockwise → NodeC (600)
  hash("user:123") = 700 → walk clockwise → NodeA (100) ← wraps around

Key ownership:
  NodeA owns: keys in range (600, 100]  ← wraps around 0
  NodeB owns: keys in range (100, 300]
  NodeC owns: keys in range (300, 600]

Lookup Algorithm

1

Hash the key

Compute hash('user:42') = 150. This gives the key's position on the ring.

2

Walk clockwise

Starting from position 150, find the first node clockwise. Nodes are at 100, 300, 600. The first node after 150 is NodeB at 300.

3

Route to that node

Send the request for 'user:42' to NodeB. Every client independently computes the same result — no coordinator needed.

🎯 Interview Insight

The ring-based approach is powerful because it's decentralized. Every client can independently compute which node owns a key — just hash the key and walk clockwise. No central routing table, no single point of failure, no coordination.
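The three lookup steps above fit in a few lines of Python. This is a minimal sketch, not any particular library's implementation; SHA-256 for ring positions and `bisect` for the clockwise walk are assumptions:

```python
# Minimal hash ring: nodes and keys share one hash space, and a key's
# owner is the first node clockwise from the key's position.
import bisect
import hashlib

def ring_hash(s: str) -> int:
    """Position on the ring, derived from SHA-256 (illustrative choice)."""
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # A sorted list of (position, node) pairs represents the ring.
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Walk clockwise: first node at or after the key's position."""
        pos = ring_hash(key)
        i = bisect.bisect_right(self._positions, pos)
        if i == len(self._ring):  # past the top of the ring: wrap to 0
            i = 0
        return self._ring[i][1]

ring = HashRing(["NodeA", "NodeB", "NodeC"])
owner = ring.lookup("user:42")  # deterministic: same key, same node
```

Because the lookup depends only on the node list and the hash function, every client computes the same answer independently, which is exactly the "no coordinator" property.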

04

Virtual Nodes (VNodes)

With only 3 physical nodes on the ring, the distribution is often uneven — one node might own 60% of the ring while another owns 10%. Virtual nodes solve this by giving each physical node multiple positions on the ring.

🎪

Multiple Booths Instead of One

Imagine a circular food court. With 3 vendors, each gets one booth — but the booths might be unevenly spaced, so one vendor gets most of the foot traffic. Virtual nodes: each vendor sets up 5 smaller booths spread evenly around the circle. Now traffic is distributed much more evenly, even though there are still only 3 vendors. If a vendor leaves, their 5 booths are removed — and the keys from each booth go to 5 different neighbors, not all to one.

Virtual Nodes — How They Work
Without vnodes (3 physical nodes, 3 positions):
  NodeA at position 100 → owns range (600, 100] = 50% of ring
  NodeB at position 300 → owns range (100, 300] = 20% of ring
  NodeC at position 600 → owns range (300, 600] = 30% of ring
Uneven: NodeA has 2.5x more data than NodeB

With vnodes (3 physical nodes, 5 vnodes each = 15 positions):
  NodeA: positions 50, 200, 400, 550, 800
  NodeB: positions 100, 280, 450, 650, 900
  NodeC: positions 150, 350, 500, 750, 950

  Ring has 15 evenly-ish distributed points.
  Each physical node owns ~33% of the ring.
Much more balanced distribution.

When NodeB is removed:
  Its 5 vnodes are removed from the ring.
  Keys from each vnode go to the NEXT node clockwise.
  Load is spread across NodeA and NodeC (5 small transfers).
No single node gets overwhelmed.

Without vnodes, removing NodeB dumps ALL its keys onto one neighbor.
With vnodes, the load is distributed across multiple neighbors.

Feature                  | Without VNodes                 | With VNodes
Positions per node       | 1                              | 50-200 (configurable)
Load balance             | Often uneven                   | Near-uniform
Node removal impact      | All keys go to 1 neighbor      | Keys spread across many neighbors
Node addition impact     | Takes keys from 1 neighbor     | Takes small slices from many neighbors
Heterogeneous hardware   | Can't handle (all nodes equal) | More vnodes for stronger machines
Complexity               | Simple                         | Slightly more complex (more ring entries)

VNode Benefits

  • Even data distribution regardless of node count
  • Graceful scaling — load spreads across many nodes
  • Heterogeneous clusters — give powerful nodes more vnodes
  • Faster rebalancing — many small transfers vs one large one
  • Used by Cassandra (256 vnodes default), DynamoDB, Riak

VNode Trade-offs

  • More metadata to maintain (ring has N × vnodes entries)
  • Slightly more complex lookup (more positions to search)
  • Rebalancing touches more nodes (coordination overhead)
  • Too many vnodes = excessive metadata, diminishing returns
  • Typical sweet spot: 100-256 vnodes per physical node

🎯 Interview Insight

Virtual nodes are the answer to "how do you handle uneven distribution in consistent hashing?" Always mention them. Explain: "Each physical node gets 100-200 virtual positions on the ring. This ensures even distribution and means adding or removing a node spreads the impact across many neighbors instead of one."
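Virtual nodes are a small extension of the basic ring: hash each physical node at many positions instead of one. The sketch below uses a suffix scheme ("NodeA#0", "NodeA#1", ...) to derive vnode positions; the naming is illustrative, not a standard:

```python
# Sketch: a hash ring with virtual nodes. Each physical node appears at
# `vnodes` positions, so key ranges are sliced finely and evenly.
import bisect
import hashlib
from collections import Counter

def ring_hash(s: str) -> int:
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

class VNodeRing:
    def __init__(self, nodes, vnodes=200):
        # One ring entry per vnode; each maps back to its physical node.
        self._ring = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        i = bisect.bisect_right(self._positions, ring_hash(key))
        return self._ring[i % len(self._ring)][1]

ring = VNodeRing(["NodeA", "NodeB", "NodeC"], vnodes=200)
load = Counter(ring.lookup(f"user:{i}") for i in range(30_000))
# With 200 vnodes each, every node owns roughly a third of the keys.
```

Counting ownership over a sample of keys is also a practical way to sanity-check your vnode count before deploying.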

05

Rebalancing on Node Changes

The defining feature of consistent hashing: when a node is added or removed, only a small fraction of keys need to move. This is what makes elastic scaling possible without massive data reshuffling.

Adding a Node

Adding NodeD to the Ring
Before (3 nodes):
  Ring: ... NodeA(100) ... NodeB(300) ... NodeC(600) ...
  NodeB owns: keys in range (100, 300]

Add NodeD at position 200:
  Ring: ... NodeA(100) ... NodeD(200) ... NodeB(300) ... NodeC(600) ...

What moves:
  Keys in range (100, 200] → move FROM NodeB TO NodeD
  Keys in range (200, 300] → stay on NodeB
  Keys on NodeA → unchanged
  Keys on NodeC → unchanged

Total movement: only keys in (100, 200], between NodeA and NodeD
  ≈ 1/N of all keys (with vnodes, drawn in small slices from many nodes)

Contrast with hash(key) % N:
  Adding 1 node to 3 → hash(key) % 4
  → ~75% of ALL keys across ALL nodes move. Catastrophic.

Removing a Node

Removing NodeB from the Ring
Before (3 nodes):
  Ring: ... NodeA(100) ... NodeB(300) ... NodeC(600) ...
  NodeB owns: keys in range (100, 300]

Remove NodeB:
  Ring: ... NodeA(100) ... NodeC(600) ...

What moves:
  Keys in range (100, 300] → move FROM NodeB TO NodeC
  (NodeC is the next node clockwise after the gap)
  Keys on NodeA → unchanged
  Keys on NodeC (original) → unchanged

With vnodes:
  NodeB had 100 vnodes spread across the ring.
  Each vnode's keys go to the next clockwise node.
  Load is distributed across NodeA AND NodeC (and others).
  No single node absorbs all of NodeB's data.

Operation              | hash(key) % N  | Consistent Hashing        | Consistent + VNodes
Add 1 node to 10       | ~90% keys move | ~10% keys move            | ~10%, spread across all nodes
Remove 1 node from 10  | ~90% keys move | ~10% move to 1 neighbor   | ~10%, spread across all nodes
Add 1 node to 100      | ~99% keys move | ~1% keys move             | ~1%, spread across all nodes
Scaling impact         | Catastrophic   | Minimal                   | Minimal + balanced

🎯 Interview Insight

This is the #1 reason consistent hashing exists. When asked "why not just use modulo hashing?" — explain: "Adding a server to a 100-node cluster with modulo moves 99% of keys. With consistent hashing, it moves 1%. That's the difference between a 5-second rebalance and a 5-hour one."
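The "only ~1/N of keys move" claim can be verified directly. The sketch below (illustrative only: SHA-256 positions, 100 vnodes per node) counts how many keys change owners when a fourth node joins a consistent-hash ring, and compares that with modulo hashing:

```python
# Sketch: measure key movement when NodeD joins a consistent-hash ring
# (with vnodes), versus the same change under modulo hashing.
import bisect
import hashlib

def h(s: str) -> int:
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes=100):
    """Sorted vnode positions plus a parallel list of owning nodes."""
    ring = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
    return [p for p, _ in ring], [n for _, n in ring]

def owner(key, positions, names):
    i = bisect.bisect_right(positions, h(key))
    return names[i % len(names)]

keys = [f"user:{i}" for i in range(10_000)]
pos3, names3 = build_ring(["NodeA", "NodeB", "NodeC"])
pos4, names4 = build_ring(["NodeA", "NodeB", "NodeC", "NodeD"])

# Consistent hashing: only keys captured by NodeD's vnodes move (~1/4).
moved = sum(1 for k in keys if owner(k, pos3, names3) != owner(k, pos4, names4))
# Modulo hashing: same cluster change moves ~3/4 of keys.
mod_moved = sum(1 for k in keys if h(k) % 3 != h(k) % 4)
```

On a run of this sketch, `moved` lands near 25% of keys while `mod_moved` lands near 75%, matching the table above.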

06

Hotspot Mitigation

Even with consistent hashing and virtual nodes, hotspots can occur when certain keys receive disproportionately more traffic than others. A celebrity's profile, a viral product, or a trending hashtag can overwhelm the node that owns that key.

🔥

One Counter Getting All the Customers

Even if you distribute booths evenly around the food court, if one vendor sells the viral TikTok drink, their booth gets 100x more traffic than everyone else. The even distribution of booths doesn't help — the problem is the uneven distribution of demand. You need to either split that vendor's booth into multiple locations or redirect overflow to other vendors.

Causes of Hotspots

📊

Skewed Key Distribution

If most keys hash to a narrow range, one node handles disproportionate load. Poor hash functions or correlated key patterns cause this.

Popular Keys

A celebrity's profile (user:beyonce) or a viral product gets millions of reads. The node owning that key is overwhelmed while others are idle.

Time-Based Patterns

All users in one timezone hit the same shard at the same time. Or a batch job writes millions of sequential keys that all hash to one node.

Solutions

1

Virtual Nodes (baseline)

More vnodes = more even distribution. If one physical node has 200 positions on the ring, the chance of a large contiguous range landing on one node is very low. This handles skewed key distribution.

2

Key Salting / Splitting

For known hot keys, append a random suffix: 'user:beyonce' becomes 'user:beyonce:0', 'user:beyonce:1', ..., 'user:beyonce:9'. These 10 keys hash to different nodes. Reads fan out to all 10 and merge. Writes go to a random suffix.
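A minimal sketch of the salting scheme described above; the suffix count of 10 and the key names are illustrative:

```python
# Sketch: key salting for a known hot key. Writes pick a random shard;
# reads fan out across all shards and merge.
import random

SALT_COUNT = 10  # illustrative; tune to how hot the key is

def salted_keys(hot_key: str) -> list[str]:
    """All shard keys for a hot key, used by fan-out reads."""
    return [f"{hot_key}:{i}" for i in range(SALT_COUNT)]

def write_key(hot_key: str) -> str:
    """A random shard key, so writes spread across SALT_COUNT nodes."""
    return f"{hot_key}:{random.randrange(SALT_COUNT)}"

shards = salted_keys("user:beyonce")  # user:beyonce:0 ... user:beyonce:9
target = write_key("user:beyonce")    # one of the 10 shards
```

Since each shard key hashes independently, the 10 shards almost certainly land on different nodes, splitting the hot key's traffic.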

3

Read Replicas

Replicate hot data to multiple nodes. Reads are distributed across replicas. The primary node handles writes, replicas handle reads. This is how Cassandra handles hot partitions.

4

Caching Layer

Put a cache (Redis) in front of the hot key. The cache absorbs 99% of reads. Only cache misses reach the node. This is the simplest and most common solution for read-heavy hotspots.
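The cache-aside read path can be sketched in a few lines. A plain dict stands in for Redis here, and `fetch_from_db` is a placeholder for the real backend call:

```python
# Sketch: cache-aside reads with a TTL. Only misses reach the backend,
# so a hot key's node sees one read per TTL window instead of every read.
import time

CACHE: dict = {}   # key -> (value, expires_at); stands in for Redis
TTL_SECONDS = 30

def fetch_from_db(key: str) -> str:
    """Placeholder for the real database / origin fetch."""
    return f"db-value-for-{key}"

def get(key: str) -> str:
    entry = CACHE.get(key)
    now = time.monotonic()
    if entry and entry[1] > now:      # cache hit: backend never sees it
        return entry[0]
    value = fetch_from_db(key)        # cache miss: one backend trip
    CACHE[key] = (value, now + TTL_SECONDS)
    return value

get("product:viral-item")  # first call misses and fills the cache
get("product:viral-item")  # second call is served from the cache
```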

🎯 Interview Insight

When an interviewer asks about hotspots, mention the layered approach: "Virtual nodes handle general distribution. For known hot keys, I'd use key salting to spread them across multiple nodes. For read-heavy hotspots, a cache layer absorbs most traffic. For write-heavy hotspots, I'd consider splitting the partition."

07

End-to-End Scenario

Let's design a distributed cache cluster using consistent hashing and walk through real operational scenarios.

System: Redis Cache Cluster (5 nodes)

Distributed Cache — Setup
Cluster: 5 Redis nodes, 200 vnodes each = 1,000 ring positions
Hash function: SHA-256 (uniform distribution)
Total keys: 10 million cached objects

Initial distribution:
  Node 1: ~2.0M keys (20.1%)
  Node 2: ~2.0M keys (19.8%)
  Node 3: ~2.0M keys (20.3%)
  Node 4: ~2.0M keys (19.7%)
  Node 5: ~2.0M keys (20.1%)
Near-perfect balance with 200 vnodes
1

Traffic doubles — you need to add Node 6

What happens to the data?

Answer: Node 6 gets 200 vnodes placed on the ring. Each vnode takes over a small slice from its clockwise neighbor. Total keys moved: ~10M / 6 = ~1.67M keys (16.7%). These keys are distributed across all 5 existing nodes — no single node loses more than ~333K keys. The cluster continues serving traffic during rebalancing. Clients that request a moved key get a cache miss (one-time), fetch from the database, and cache on the new node.

2

Node 3 crashes unexpectedly

What happens and how does the system recover?

Answer: Node 3's 200 vnodes are removed from the ring. Its ~2M keys are now owned by the next clockwise nodes — distributed across Nodes 1, 2, 4, and 5 (thanks to vnodes). These keys are cache misses until re-fetched from the database. With 4 remaining nodes, each absorbs ~500K extra keys temporarily. The database sees a spike of ~2M cache misses spread over a few minutes. If Node 3 recovers, its vnodes rejoin the ring and keys migrate back.

3

A celebrity product goes viral — 100K requests/sec for one key

How do you handle the hotspot?

Answer: The key 'product:viral-item' hashes to one node, which is overwhelmed. Solutions: (1) Add a local in-process cache (L1) on each app server — 30-second TTL absorbs 99% of reads. (2) If still hot, use key salting: split into 'product:viral-item:0' through ':9', fan out reads across 10 nodes. (3) For the cache layer specifically, set a longer TTL on this key and use background refresh to prevent miss storms.

💡 This Is How Production Systems Work

DynamoDB, Cassandra, and every major distributed database use exactly this approach. Consistent hashing with vnodes for distribution, replication for durability, and caching for hotspots. The ring is the foundation — everything else builds on top of it.

08

Trade-offs & Decision Making

Approach                        | Pros                                    | Cons                                                      | Best For
Modulo hashing: hash(key) % N   | Simplest possible, zero overhead        | ~99% keys move on resize, catastrophic at scale           | Fixed-size clusters that never change
Consistent hashing (no vnodes)  | Minimal key movement on resize (~1/N)   | Uneven distribution, one neighbor absorbs all on removal  | Small clusters with uniform keys
Consistent hashing + vnodes     | Even distribution, balanced rebalancing | More metadata, slightly complex lookup                    | Production distributed systems (DynamoDB, Cassandra)
Directory-based routing         | Full control, any custom placement      | Central coordinator = single point of failure, bottleneck | Systems needing custom placement (geo-aware sharding)

VNode Count Trade-offs

VNodes per Node       | Distribution | Metadata               | Rebalancing
1 (no vnodes)         | Very uneven  | Minimal                | All keys to 1 neighbor
10-50                 | Moderate     | Low                    | Spread across a few neighbors
100-256 (sweet spot)  | Near-uniform | Moderate               | Well-distributed
1000+                 | Very uniform | High (memory overhead) | Diminishing returns

🎯 Interview Framework

When discussing data distribution, start with: "I'd use consistent hashing with virtual nodes. Each physical node gets 150-200 vnodes for even distribution. When we scale from 10 to 11 nodes, only ~9% of keys move, spread across all existing nodes." Then mention hotspot mitigation if relevant.

09

Interview Questions

Q: Why not use hash(key) % N for distributing data?

A: Because when N changes (adding or removing a server), almost all keys get reassigned. Adding 1 server to a 100-server cluster moves ~99% of keys. In a cache cluster, this means 99% cache misses — the database is flooded with requests. Consistent hashing solves this: adding 1 server to 100 moves only ~1% of keys. The rest stay on their current nodes. This is the difference between a smooth scale-up and a cascading failure.

Q: What are virtual nodes and why are they needed?

A: Virtual nodes give each physical node multiple positions on the hash ring (typically 100-256). Without vnodes, 3 nodes might own 60%, 25%, and 15% of the ring — very uneven. With 200 vnodes each, the 600 ring positions create near-uniform distribution (~33% each). VNodes also improve rebalancing: when a node is removed, its keys are spread across many neighbors (one per vnode) instead of all going to one neighbor. This prevents any single node from being overwhelmed during scaling.

Q: How does consistent hashing handle scaling?

A: When a new node joins: it gets vnodes placed on the ring. Each vnode takes over a small key range from its clockwise neighbor. Total keys moved: ~K/N (K=total keys, N=new node count). When a node leaves: its vnodes are removed. Each vnode's keys go to the next clockwise node. With vnodes, this load is distributed across many nodes. The key property: only keys between the new/removed node and its neighbors move. All other keys stay put. This enables elastic scaling without downtime.

1

You're designing a distributed cache for 1 billion keys across 50 servers

How would you distribute the data?

Answer: Consistent hashing with 200 vnodes per server = 10,000 ring positions. Hash function: SHA-256 for uniform distribution. Each server owns ~20M keys (~2% of total). Adding a 51st server moves ~20M keys (~2%), spread across all 50 existing servers. For hotspot protection: L1 in-process cache on app servers for the top 1,000 keys, key salting for known viral keys. For durability: replicate each key to the next 2 clockwise nodes (replication factor 3).

2

After adding a new node, your cache hit rate drops from 99% to 85%

What happened and how do you fix it?

Answer: When the new node joined, ~1/N of keys were reassigned to it. Those keys are now cache misses on the new node (it has empty cache). The 85% hit rate is temporary — as requests come in, the new node's cache warms up. To fix: (1) Pre-warm the new node by copying data from its neighbors before it joins the ring. (2) Use a 'shadow' period where the new node receives traffic but the old node still serves as fallback. (3) Accept the temporary hit rate drop — it recovers within minutes as the cache warms.

10

Common Mistakes

🎯

Not using virtual nodes

Using consistent hashing with one position per node. With 5 nodes, one might own 40% of the ring while another owns 10%. When the overloaded node is removed, all its keys go to one neighbor — which then becomes overloaded. The cascading failure that consistent hashing was supposed to prevent still happens.

Always use virtual nodes. 100-256 vnodes per physical node is the standard. This ensures even distribution and spreads rebalancing load across all nodes. Every production system (Cassandra, DynamoDB, Riak) uses vnodes.

🔥

Ignoring hotspots

Assuming consistent hashing solves all distribution problems. It distributes keys evenly, but it can't distribute traffic evenly if some keys are 1000x more popular than others. One viral key can overwhelm a node even with perfect key distribution.

Monitor per-node traffic, not just key count. For read hotspots: add a cache layer or read replicas. For write hotspots: use key salting to split the hot key across multiple nodes. For predictable hotspots (celebrity accounts): pre-split them.

🔧

Poor hash function choice

Using a hash function that doesn't distribute uniformly (e.g., simple modulo, CRC32 for small key spaces). Keys cluster in certain ring regions, creating uneven load. Or using a hash function that's too slow (SHA-512) for a hot path.

Use a well-distributed hash function: MD5 (fast, good distribution, not for security), MurmurHash3 (very fast, excellent distribution), or SHA-256 (slower but cryptographically uniform). Test distribution with your actual key patterns.
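Testing distribution against your own key patterns can be as simple as bucketing hash positions into ring segments and checking the counts. A sketch, using MD5 via hashlib (one of the fast, well-distributed options named above):

```python
# Sketch: check a hash function's uniformity by splitting the 128-bit
# output space into equal segments and counting keys per segment.
import hashlib
from collections import Counter

def segment(key: str, segments: int = 8) -> int:
    """Which of `segments` equal slices of the hash space the key hits."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)  # 128-bit value
    return h * segments >> 128  # top bits select the segment

# Swap in your real key patterns here; "user:{i}" is a stand-in.
keys = [f"user:{i}" for i in range(80_000)]
counts = Counter(segment(k) for k in keys)
# A well-distributed hash puts roughly 10,000 keys in each of 8 segments.
```

If one segment is consistently heavy with your real keys, that region of the ring (and the nodes in it) will be heavy too.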

📐

Misunderstanding rebalancing scope

Thinking that adding a node requires rehashing all keys. Or thinking that consistent hashing means zero keys move. Some keys always move — the point is that it's O(K/N) keys instead of O(K). Teams either over-engineer (trying to move zero keys) or under-prepare (not handling the keys that do move).

Expect ~1/N of keys to move when adding a node. Plan for temporary cache misses during rebalancing. Pre-warm new nodes when possible. Use replication so moved keys have a backup on another node.