
Eviction & Invalidation

Master cache eviction policies, invalidation strategies, cache stampede prevention, and cache warming — the fundamentals of keeping caches fast and correct.

01

The Big Picture — Why Caches Can't Store Everything

A cache is fast because it lives in memory (RAM). But memory is limited and expensive. A Redis instance with 64 GB of RAM can't hold a 2 TB database. So the cache must decide: what stays, what goes, and when does stored data become too old to trust?

🧊

The Fridge Analogy

Your fridge has limited shelf space (memory). You can't store every item from the grocery store. Eviction is deciding what to throw out when the fridge is full — you toss the item you haven't touched in weeks (LRU), or the one you rarely use (LFU), or the one past its expiry date (TTL). Invalidation is checking if something is still good — the milk says 'best before Dec 10' but it smells fine (TTL-based), or you know the store recalled that batch so you throw it out immediately (event-based invalidation). Keeping expired food is dangerous (stale data). Throwing out everything daily is wasteful (over-invalidation). The skill is finding the balance.

🔥 Key Insight

Eviction answers: "The cache is full — what do I remove to make room?" Invalidation answers: "This data might be wrong — should I remove or refresh it?" Both are essential. A cache without eviction runs out of memory. A cache without invalidation serves stale data.

02

Cache Lifecycle Overview

Cache Entry Lifecycle

1. ADD    — Data is written to cache (on miss or on write)
2. USE    — Data is read from cache (cache hit)
3. UPDATE — Data is refreshed (invalidation + re-fetch)
4. EVICT  — Data is removed (memory full, TTL expired, or invalidated)

                ┌──────────────────────────────┐
                │         CACHE MEMORY         │
   ADD ──────→  │  key:value  (TTL: 300s)      │  ──────→ EVICT
                │  key:value  (TTL: 60s)       │    (LRU / LFU / TTL)
   USE ←──────  │  key:value  (TTL: 600s)      │
                └──────────────────────────────┘
                              ↑
                         INVALIDATE
                      (data changed in DB)
🗑️

Eviction

Cache is full → remove something. The eviction policy decides WHAT to remove. LRU removes the oldest-accessed. LFU removes the least-used. TTL removes expired entries.

🔄

Invalidation

Data changed in the database → the cached copy is now wrong. Invalidation removes or refreshes the stale entry. Can be triggered by writes, events, or time.

⏱️

Expiration (TTL)

Each entry has a time-to-live. After TTL seconds, the entry is considered stale and eligible for removal. A safety net that limits how long stale data can persist.

💡 The Freshness vs Performance Trade-off

Aggressive invalidation = fresh data, more cache misses, higher DB load. Lazy invalidation (long TTL) = stale data, fewer cache misses, lower DB load. Every caching decision is a point on this spectrum.

03

Eviction Policies

When the cache is full and a new entry needs space, the eviction policy decides which existing entry to remove. The right policy keeps hot data in cache and removes cold data.

LRU — Least Recently Used

📚

The Desk Stack

You have a stack of books on your desk. Every time you use a book, you put it on top. When the desk is full and you need space, you remove the book at the bottom — the one you haven't touched in the longest time. That's LRU: the item that was accessed least recently gets evicted first.

LRU — How It Works
Cache capacity: 4 items

Access sequence: A, B, C, D, E, A, F

State after each access:
  A → [A]
  B → [B, A]
  C → [C, B, A]
  D → [D, C, B, A]        ← cache full
  E → [E, D, C, B]        ← A evicted (least recently used)
  A → [A, E, D, C]        ← B evicted, A moves to front
  F → [F, A, E, D]        ← C evicted

Key insight: recently accessed items stay, old items go.
Redis uses an approximated LRU (samples random keys, evicts the oldest).
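The walkthrough above can be reproduced in a few lines of JavaScript. A minimal sketch (not production code) that relies on `Map` preserving insertion order, so the first key is always the least recently used:

```javascript
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map(); // oldest entry first, newest last
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    // Re-insert to move the key to the "most recently used" end
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Evict the least recently used key (first in insertion order)
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }

  keys() {
    return [...this.map.keys()];
  }
}

// Replaying the sequence A, B, C, D, E, A, F with capacity 4:
const lru = new LRUCache(4);
for (const k of ["A", "B", "C", "D", "E", "A", "F"]) lru.set(k, k);
// final contents, oldest → newest: D, E, A, F
```

Production caches avoid the per-access reordering cost with approximations (Redis samples random keys), but the behavior is the same in spirit.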

LFU — Least Frequently Used

📊

The Popularity Contest

Instead of tracking WHEN you last used something, LFU tracks HOW MANY TIMES you've used it. The item with the fewest total accesses gets evicted. A book you read once last week gets evicted before a book you've read 50 times — even if the 50-time book hasn't been touched today.

LFU — How It Works
Cache capacity: 3 items

Access sequence: A, A, A, B, B, C, D

State (item: access count):
  A(1) → A(2) → A(3) → A(3),B(1) → A(3),B(2) → A(3),B(2),C(1)  ← full
  D arrives → evict C (count=1, lowest) → A(3),B(2),D(1)

Key insight: frequently accessed items stay, rarely used items go.
Problem: a once-popular item that's no longer relevant stays forever
  (high count but no recent accesses).
Fix: decay counters over time, or use a hybrid LRU-LFU approach.
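The counting logic above can be sketched the same way. A minimal LFU without decay, so it exhibits exactly the stale-popularity problem just described:

```javascript
class LFUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.values = new Map(); // key → value
    this.counts = new Map(); // key → access count
  }

  set(key, value) {
    if (!this.values.has(key) && this.values.size >= this.capacity) {
      // Evict the key with the lowest access count
      let victim, lowest = Infinity;
      for (const [k, c] of this.counts) {
        if (c < lowest) { lowest = c; victim = k; }
      }
      this.values.delete(victim);
      this.counts.delete(victim);
    }
    this.values.set(key, value);
    this.counts.set(key, (this.counts.get(key) || 0) + 1);
  }
}

// Replaying the access sequence A, A, A, B, B, C, D with capacity 3:
const lfu = new LFUCache(3);
for (const k of ["A", "A", "A", "B", "B", "C", "D"]) lfu.set(k, k);
// D's arrival evicts C (count=1, lowest) → cache holds A(3), B(2), D(1)
```

The decay fix mentioned above would periodically halve every count, so a once-popular key eventually becomes evictable.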

TTL — Time-To-Live

TTL isn't strictly an eviction policy — it's an expiration mechanism. Each entry has a timestamp. After TTL seconds, the entry is considered expired and removed (either lazily on next access or actively by a background sweep).

TTL — How It Works
SET user:42 = {name: "Alice"} TTL 300  (expires in 5 minutes)

T=0s:    GET user:42 → {name: "Alice"}  ✅ (fresh)
T=200s:  GET user:42 → {name: "Alice"}  ✅ (still valid)
T=301s:  GET user:42 → null              ❌ (expired, cache miss)
         → Fetch from DB, re-cache with new TTL

TTL is a safety net:
  Even if invalidation fails, data won't be stale for more than TTL seconds.
  Short TTL (10s) = fresher data, more DB hits
  Long TTL (1hr) = staler data, fewer DB hits
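The timeline above maps directly onto a lazy-expiration cache: nothing sweeps in the background; an expired entry is simply treated as a miss on the next read. A sketch, with the `now` parameter injected so the clock can be controlled:

```javascript
class TTLCache {
  constructor() {
    this.entries = new Map(); // key → { value, expiresAt }
  }

  set(key, value, ttlMs, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }

  // Lazy expiration: the entry is only removed when a read finds it expired
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // expired → treat as a cache miss
      return null;
    }
    return entry.value;
  }
}

// Mirroring the timeline: SET with TTL 300s at T=0
const ttl = new TTLCache();
ttl.set("user:42", { name: "Alice" }, 300_000, 0);
```

Real systems (Redis included) combine this lazy check with a periodic active sweep so expired keys don't linger unread.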
| Policy | Evicts Based On | Best For | Weakness |
|---|---|---|---|
| LRU | Least recently accessed | General purpose, most workloads | Scan pollution (one-time access pushes out hot data) |
| LFU | Least frequently accessed | Stable hot sets (popular products) | Stale popular items never evicted; cold start problem |
| TTL | Time since creation/last refresh | Data with known freshness requirements | Doesn't consider access patterns; all entries treated equally |
| Random | Random selection | When access patterns are uniform | Might evict hot data |

🎯 Interview Insight — Why LRU Is Most Common

LRU is the default in Redis, Memcached, and most caching systems. It works well for the majority of workloads because recent access is a strong predictor of future access. LFU is better when you have a stable set of popular items (top 100 products), but it's more complex and has the stale-popularity problem. In interviews, default to LRU unless there's a specific reason for LFU.

04

Cache Invalidation Strategies

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton. Cache invalidation is hard because you must keep two data stores (cache and database) in sync, and any gap means users see wrong data.

Strategy 1: TTL-Based (Time Expiration)

How it works

  • Every cache entry has a TTL (e.g., 5 minutes)
  • After TTL expires, entry is removed
  • Next read triggers a fresh fetch from DB
  • Simplest approach — no coordination needed

Trade-offs

  • Data is stale for up to TTL duration
  • Short TTL = more DB load, fresher data
  • Long TTL = less DB load, staler data
  • Doesn't react to actual data changes

Strategy 2: Write-Through Invalidation

Write-Through — Cache Updated on Every Write
User updates profile name:

1. App writes to DB:  UPDATE users SET name='Alicia' WHERE id=42
2. App writes to cache: SET user:42 = {name: "Alicia"}
   (or: DELETE user:42 to force re-fetch on next read)

Result: cache is always in sync with DB
Trade-off: every write is slower (must update both)
Risk: if step 2 fails, cache has stale data
  → Use "delete" instead of "update" — safer (forces re-fetch)

Strategy 3: Cache-Aside Invalidation (Delete on Write)

Cache-Aside — Delete Cache Key on Write

async function updateUser(userId, data) {
  // 1. Write to the database first
  await db.query("UPDATE users SET name = ? WHERE id = ?", [data.name, userId]);

  // 2. Delete from cache (NOT update)
  await cache.delete("user:" + userId);

  // Next read will miss → fetch fresh data from DB → re-cache
}

Why DELETE instead of UPDATE?
  • If the DB write succeeds but the cache update fails, the cache holds stale data
  • If you DELETE and the delete fails, worst case: the cache keeps the old data until TTL expires it (safety net)
  • DELETE is idempotent and safer than UPDATE
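The read path that pairs with delete-on-write completes the cache-aside loop: check the cache, and on a miss fetch from the DB and re-cache with a TTL. A sketch — `cache` and `db` here are hypothetical stand-ins for your client libraries:

```javascript
async function getUser(cache, db, userId) {
  const key = "user:" + userId;

  // 1. Try the cache first
  const cached = await cache.get(key);
  if (cached !== null && cached !== undefined) return cached; // cache hit

  // 2. Cache miss → fetch from the database
  const user = await db.getUser(userId);

  // 3. Re-cache with a TTL safety net (e.g. 300s), then return
  await cache.set(key, user, 300);
  return user;
}
```

After `updateUser` deletes the key, the very next `getUser` call takes the miss path and repopulates the cache with fresh data.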

Strategy 4: Event-Based Invalidation

Event-Based — Pub/Sub Invalidation

Architecture:
  App Server A writes to DB
  App Server A publishes event: "user:42:updated"
  Redis Pub/Sub (or Kafka) broadcasts to all app servers
  All app servers delete "user:42" from their local cache

Flow:
  DB write → publish event → all caches invalidated

Benefits:
  ✅ Works across multiple app servers / cache instances
  ✅ Near real-time invalidation
  ✅ Decoupled — writers don't need to know about caches

Trade-offs:
  ⚠️ More infrastructure (message broker)
  ⚠️ Events can be delayed or lost
  ⚠️ TTL still needed as a safety net
| Strategy | Freshness | Complexity | DB Load | Best For |
|---|---|---|---|---|
| TTL only | Stale up to TTL | Very low | Moderate | Data where brief staleness is OK |
| Delete on write | Fresh after next read | Low | Low (only on miss) | Most CRUD applications |
| Write-through | Always fresh | Medium | Low | Consistency-critical data |
| Event-based | Near real-time | High | Low | Multi-server, distributed caches |

🎯 Interview Insight

The safest default: delete-on-write + TTL as safety net. When data changes, delete the cache key. If the delete fails, TTL ensures the stale entry expires eventually. This covers 90% of use cases with minimal complexity.

05

Cache Stampede

A cache stampede (also called thundering herd) happens when a popular cache entry expires and hundreds or thousands of requests simultaneously hit the database to regenerate it. The database gets overwhelmed, response times spike, and the system can crash.

🏪

The Shop Opening Analogy

A popular store opens at 9 AM. At 8:59, 500 people are waiting outside. The moment the doors open, everyone rushes in at once — the entrance is jammed, shelves are overwhelmed, staff can't cope. That's a cache stampede. The 'store' is your database, the 'doors opening' is the cache entry expiring, and the '500 people' are concurrent requests that all miss the cache at the same instant.

Cache Stampede — What Happens
Popular cache key: "homepage:trending" (TTL: 60 seconds)
Traffic: 10,000 requests/second for this key

Normal operation:
  9,999 requests → cache HIT (1ms)
  1 request → cache MISS → fetch from DB → re-cache

At TTL expiration (T=60s):
  Cache key expires
  Next 100 requests (within 50ms) ALL miss the cache
  All 100 hit the database simultaneously
  DB query takes 200ms (expensive aggregation)
  DB is now handling 100 identical queries at once
  DB CPU spikes to 100%, other queries slow down
  Cascading failure: more timeouts → more retries → more load

  ┌──────────┐
  │  Cache   │  key expired!
  └────┬─────┘
       │ MISS × 100
       ↓
  ┌──────────┐
  │ Database │  💥 100 identical queries at once
  └──────────┘

Solutions

1

Request Coalescing (Single Flight)

When multiple requests miss the same key simultaneously, only ONE request fetches from the database. All other requests wait for that single fetch to complete and share the result. 100 concurrent misses → 1 DB query instead of 100. Libraries: Go's singleflight, Node's async-sema.
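A minimal JavaScript version of the single-flight idea: concurrent misses for the same key share one in-flight promise, so N simultaneous misses produce exactly one DB query. A sketch, not a replacement for a hardened library:

```javascript
// In-flight fetches, keyed by cache key
const inflight = new Map();

async function singleFlight(key, fetchFn) {
  // If a fetch for this key is already running, join it
  if (inflight.has(key)) return inflight.get(key);

  const promise = (async () => {
    try {
      return await fetchFn();
    } finally {
      inflight.delete(key); // allow a fresh fetch once this one settles
    }
  })();

  inflight.set(key, promise);
  return promise;
}
```

Note this coordinates only within one process; across multiple servers you still get one DB query per server, which is usually an acceptable reduction (100 servers instead of 100,000 requests).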

2

Locking (Mutex on Cache Key)

When a cache miss occurs, the first request acquires a lock on the key, fetches from DB, and populates the cache. Other requests for the same key wait for the lock to release, then read from cache. Prevents duplicate DB queries. Risk: if the lock holder crashes, others wait forever — use a lock with a timeout.

3

Staggered Expiration (Jitter)

Instead of all keys expiring at exactly TTL seconds, add random jitter: TTL = 300 + random(0, 60). Keys expire at different times, spreading the DB load. Simple and effective for preventing synchronized expiration of many keys.
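As a formula: `effectiveTTL = baseTTL + random(0, jitter)`. A one-line sketch:

```javascript
// Base TTL plus uniform random jitter in seconds,
// e.g. ttlWithJitter(300, 60) returns somewhere in [300, 360)
function ttlWithJitter(baseSeconds, maxJitterSeconds) {
  return baseSeconds + Math.floor(Math.random() * maxJitterSeconds);
}
```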

4

Early Refresh (Stale-While-Revalidate)

Before the TTL expires, a background process refreshes the cache. The key never actually expires — it's always fresh. Requests always hit the cache. The refresh happens asynchronously. Most sophisticated but eliminates stampedes entirely.
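A sketch of the stale-while-revalidate idea for a single key: once the entry enters its refresh margin, readers still get the cached value immediately while one background fetch renews it. The names and margins here are illustrative, and the `now` parameter is injected for testability:

```javascript
function makeEarlyRefreshCache(fetchFn, ttlMs, refreshMarginMs) {
  let entry = null;          // { value, refreshAt }
  let refreshInFlight = false;

  return async function get(now = Date.now()) {
    if (entry === null) {
      // Cold start: the only time a reader waits on the fetch
      const value = await fetchFn();
      entry = { value, refreshAt: now + ttlMs - refreshMarginMs };
      return value;
    }
    if (now >= entry.refreshAt && !refreshInFlight) {
      // Inside the refresh margin: serve the cached value immediately,
      // renew it in the background so the key never actually expires
      refreshInFlight = true;
      fetchFn().then((value) => {
        entry = { value, refreshAt: Date.now() + ttlMs - refreshMarginMs };
        refreshInFlight = false;
      });
    }
    return entry.value; // always a cache hit after the first call
  };
}
```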

| Solution | Complexity | Effectiveness | Trade-off |
|---|---|---|---|
| Request coalescing | Low | High | Requires in-process coordination |
| Locking | Medium | High | Lock contention, timeout handling |
| Staggered TTL | Very low | Medium | Doesn't prevent single-key stampede |
| Early refresh | High | Very high | Background process, extra DB reads |

⚠️ This Is a Real Production Problem

Cache stampedes have caused outages at major companies. The fix is usually request coalescing (simplest, most effective). Add staggered TTL as a baseline for all keys. Use early refresh for your hottest keys (homepage, trending content).

06

Cache Warming

Cache warming is the process of preloading frequently accessed data into the cache before users request it. Instead of waiting for the first request to trigger a cache miss (cold start), you fill the cache proactively.

When Cache Warming Matters

🚀

After Deployment

A new server starts with an empty cache. The first 1000 requests all miss → DB is hammered. Warming preloads hot data before the server takes traffic.

💥

After Cache Failure

Redis crashes and restarts with empty memory. Suddenly 100% of requests hit the DB. Warming restores the hot dataset quickly.

📊

Predictable Traffic

You know the homepage, top products, and trending content will be requested. Preload them at startup or on a schedule instead of waiting for misses.

Cache Warming — Approaches

Approach 1: Startup Warming
  On server start, before accepting traffic:
    → Fetch top 1000 products from DB → cache them
    → Fetch homepage data → cache it
    → Fetch user session data for active users → cache it
  Then: start accepting traffic (cache is warm)

Approach 2: Scheduled Warming
  Every 5 minutes, a background job:
    → Queries DB for most-accessed keys (from access logs)
    → Preloads them into cache
    → Refreshes keys that are about to expire

Approach 3: Peer Warming
  New server joins the cluster:
    → Copies hot keys from an existing server's cache
    → No DB queries needed
    → Fastest approach but requires cache-to-cache transfer
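Approach 1 can be sketched as a startup hook that runs before the server binds its port. `db.getHotEntries` is a hypothetical stand-in for "query the top-N hot rows" (in practice derived from access logs or analytics):

```javascript
// Startup warming: preload the hot set before accepting traffic.
async function warmCache(cache, db, topN) {
  const entries = await db.getHotEntries(topN); // e.g. top products by access count
  for (const { key, value, ttlSeconds } of entries) {
    await cache.set(key, value, ttlSeconds);
  }
  return entries.length; // number of keys preloaded
}
```

Typical usage: `await warmCache(cache, db, 1000); server.listen(port);` — traffic only arrives once the cache is warm.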

Benefits

  • Eliminates cold-start latency for users
  • Prevents DB overload after deployment or cache failure
  • Predictable performance from the first request
  • Reduces cache stampede risk during startup

Trade-offs

  • Extra computation and DB load during warming
  • Might preload data that's never actually requested
  • Delays server startup (if warming is synchronous)
  • Stale data if warming happens too far in advance

💡 Practical Tip

Warm the top 1-5% of keys that generate 80% of traffic. Don't try to warm the entire dataset — that defeats the purpose of caching. Use access logs or analytics to identify the hot set.

07

End-to-End Scenario

Let's design the eviction, invalidation, and stampede prevention for a social media feed — one of the most cache-intensive systems.

📱 Social Media Feed — 50M DAU

Hot data: trending posts, popular user profiles, feed cache per user.

Cache size: 128 GB Redis cluster. Total data: 5 TB in PostgreSQL.

Read QPS: 200,000. Write QPS: 5,000 (new posts, likes).

1

Eviction Policy: LRU

128 GB can't hold all 50M users' feeds. LRU ensures active users' feeds stay cached. A user who hasn't opened the app in 3 days gets evicted. When they return, their feed is rebuilt from DB (cache miss). LRU is the right choice because recent access strongly predicts future access in social media.

2

Invalidation: Delete-on-Write + TTL Safety Net

When a user posts, delete their followers' feed cache keys. Each follower's next feed request rebuilds from DB with the new post included. TTL of 5 minutes as a safety net — even if invalidation fails, feeds are never more than 5 minutes stale.

3

Stampede Prevention: Request Coalescing + Early Refresh

Trending posts are requested 10,000 times/sec. Use request coalescing: if the key expires, only 1 request fetches from DB while others wait. For the top 100 trending posts, use early refresh: a background job refreshes them 30 seconds before TTL expires. These keys never actually expire.

4

Cache Warming: Startup + Scheduled

On Redis restart: preload the top 10,000 trending posts and the feeds of the 100,000 most active users. Background job every minute: refresh trending content and soon-to-expire hot keys. This prevents cold-start latency and keeps the hottest data always fresh.

Architecture Summary

Redis Cluster (128 GB, LRU eviction):
  ├── feed:{userId}           → user's feed (TTL 5min, invalidate on new post)
  ├── trending:posts          → top posts (early refresh, never expires)
  ├── user:{userId}:profile   → user profile (TTL 10min, delete on update)
  └── post:{postId}           → post content (TTL 30min)

Invalidation:
  New post → delete feed:{followerId} for all followers
  Profile update → delete user:{userId}:profile
  Safety net: TTL on every key

Stampede prevention:
  All keys: request coalescing (singleflight)
  Top 100 trending: early refresh (background job)
  All TTLs: jitter ±10% (staggered expiration)

Warming:
  On startup: preload trending + top 100K user feeds
  Every minute: refresh trending, refresh expiring hot keys
08

Trade-offs & Decision Making

Freshness vs Performance

| Approach | Freshness | Performance | Complexity |
|---|---|---|---|
| No cache | Always fresh | Slowest (every read hits DB) | None |
| Long TTL (1hr) | Stale up to 1 hour | Fastest (99% hit rate) | Low |
| Short TTL (10s) | Stale up to 10 seconds | Fast (70-80% hit rate) | Low |
| Delete on write | Fresh after next read | Fast (high hit rate) | Medium |
| Event-based invalidation | Near real-time | Fast | High |
| Write-through | Always fresh | Fast reads, slow writes | Medium |

Memory Usage vs Hit Rate

| Cache Size | Hit Rate | DB Load | Cost |
|---|---|---|---|
| 10% of dataset | ~70-80% | 20-30% of original | Low |
| 20% of dataset | ~90-95% | 5-10% of original | Medium |
| 50% of dataset | ~98-99% | 1-2% of original | High |
| 100% of dataset | ~100% | Near zero | Very high (defeats purpose) |

🎯 The Sweet Spot

Cache 10-20% of your dataset for 90%+ hit rate. Beyond that, you're paying exponentially more memory for diminishing returns. Use LRU to automatically keep the right 10-20% — the data that's actually being accessed.

09

Interview Questions

Q: Why is cache invalidation hard?

A: Because you have two sources of truth (cache and database) that must stay in sync. Any write to the database makes the cache stale. If you invalidate too aggressively, you lose the performance benefit (too many misses). If you invalidate too lazily, users see wrong data. Race conditions make it worse: what if a read re-caches old data between a DB write and cache delete? There's no perfect solution — only trade-offs between freshness, performance, and complexity.

Q: What is a cache stampede and how do you prevent it?

A: A cache stampede occurs when a popular cache key expires and many concurrent requests all miss the cache simultaneously, overwhelming the database with identical queries. Prevention: (1) Request coalescing — only one request fetches from DB, others wait and share the result. (2) Locking — first request acquires a lock, others wait. (3) Staggered TTL — add random jitter so keys don't expire at the same time. (4) Early refresh — refresh keys before they expire so they never actually go stale.

Q: LRU vs LFU — when would you use each?

A: LRU (Least Recently Used): default choice for most workloads. Works well when recent access predicts future access — social feeds, user sessions, web pages. Simple, low overhead. LFU (Least Frequently Used): better when you have a stable set of popular items — top 100 products, trending content. Items that are accessed thousands of times stay cached even if not accessed in the last minute. Trade-off: LFU has a 'stale popularity' problem — once-popular items with high counts never get evicted. Use LFU with decay or a hybrid approach.

1

After deploying a new Redis instance, your API latency spikes for 10 minutes

What's happening and how do you fix it?

Answer: Cold cache. The new Redis instance has no data — every request is a cache miss hitting the database. Fix: cache warming. Before routing traffic to the new instance, preload the hot dataset (top products, active user sessions, trending content). Use access logs to identify the top 1-5% of keys. Warm them synchronously before the instance takes traffic. Also: use staggered rollouts — don't replace all cache instances at once.

2

Your cache hit rate dropped from 95% to 60% overnight

What could cause this and how do you investigate?

Answer: Possible causes: (1) Cache size too small — data grew, LRU is evicting hot keys. Check eviction metrics. Fix: increase cache size. (2) TTL too short — keys expire before being re-accessed. Check TTL distribution. (3) Access pattern changed — a new feature or traffic spike introduced new keys that displaced hot data. (4) Cache warming stopped — a background job failed. Check job logs. Investigate: check Redis INFO stats (evicted_keys, hit_rate, memory usage), compare with the previous day.

10

Common Mistakes

🔄

Ignoring invalidation entirely

Setting a TTL and assuming the cache will 'figure it out.' A user changes their email, but the old email shows for 30 minutes because the cache wasn't invalidated. For some data (prices, inventory), this causes real business problems.

Use delete-on-write as the primary invalidation mechanism. TTL is a safety net, not the strategy. When data changes in the DB, explicitly delete the cache key. The next read will fetch fresh data.

⏰

Using only TTL blindly

Setting the same TTL (e.g., 5 minutes) for all cache keys regardless of the data's characteristics. Product descriptions (change monthly) and inventory counts (change every second) should not have the same TTL.

Match TTL to the data's change frequency and staleness tolerance. Static content: hours. Product details: 10-30 minutes. Inventory: 5-10 seconds. User sessions: match session timeout. Trending content: 1-2 minutes.

💥

Not handling cache stampede

A popular key expires and 1,000 requests hit the database simultaneously. The DB query takes 500ms, so for 500ms you have 1,000 concurrent identical queries. DB CPU spikes, other queries slow down, cascading failure begins.

Implement request coalescing as a baseline for all cache reads. Add staggered TTL (jitter) to prevent synchronized expiration. For your hottest keys (homepage, trending), use early refresh so they never expire.

📦

Overloading cache with unnecessary data

Caching every DB query result, including data accessed once per day. The cache fills with cold data, evicting hot data. Hit rate drops. You're paying for 128 GB of Redis that's 80% full of data nobody reads.

Cache only hot data. Monitor hit rates per key pattern. If a key pattern has <50% hit rate, stop caching it. Use LRU eviction to automatically prioritize hot data. The goal is 90%+ hit rate on 10-20% of your dataset.