Locking Strategies
Master concurrency control — optimistic vs pessimistic locking, distributed locks (Redlock), and row-level vs table-level locks. Trade-offs between performance and correctness.
The Big Picture — Why Contention Happens
When two users try to book the last concert ticket at the same millisecond, or two servers try to process the same payment simultaneously, you have contention — multiple actors competing for the same resource. Without coordination, both succeed, and you've sold one ticket twice or charged a customer double.
The Shared Document Analogy
Two people editing the same Google Doc paragraph at the same time. Without coordination, Alice types 'Hello' and Bob types 'World' in the same spot — one overwrites the other. That's a lost update. Pessimistic locking: Alice locks the paragraph. Bob sees 'locked by Alice' and waits. No conflict, but Bob is blocked. Optimistic locking: both edit freely. When Bob tries to save, the system detects Alice already changed it — Bob must merge or retry. No blocking, but conflicts need handling.
🔥 Key Insight
Locking is about choosing between blocking (wait for access) and retrying (try and handle conflicts). Blocking is safe but slow. Retrying is fast but complex. The right choice depends on how often conflicts actually happen.
Locking Overview
Two users buy the last ticket simultaneously.

Without locking:

```
User A: SELECT stock FROM tickets WHERE id=1;     → stock = 1
User B: SELECT stock FROM tickets WHERE id=1;     → stock = 1
User A: UPDATE tickets SET stock = 0 WHERE id=1;  ✅
User B: UPDATE tickets SET stock = 0 WHERE id=1;  ✅ (but stock was already 0!)
→ Two tickets sold, only one existed. Lost update.
```

With locking:

```
User A: SELECT stock FROM tickets WHERE id=1 FOR UPDATE;  → stock = 1 (row locked)
User B: SELECT stock FROM tickets WHERE id=1 FOR UPDATE;  → BLOCKED (waiting for A)
User A: UPDATE tickets SET stock = 0; COMMIT;             → lock released
User B: SELECT stock → stock = 0 → "Sold out!" → no sale
→ Correct behavior. One ticket, one sale.
```
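An alternative to the explicit lock shown above is to make the check and the decrement a single atomic statement, so the database itself rejects the second sale. A minimal sketch using Python's built-in sqlite3 (the table mirrors the example above; `buy_ticket` is an illustrative name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute("INSERT INTO tickets VALUES (1, 1)")  # one ticket left

def buy_ticket(conn, ticket_id):
    # Atomic check-and-decrement: the WHERE clause guarantees we never
    # sell a ticket that is already gone, even if two buyers race.
    cur = conn.execute(
        "UPDATE tickets SET stock = stock - 1 WHERE id = ? AND stock > 0",
        (ticket_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # True = sale succeeded, False = sold out

print(buy_ticket(conn, 1))  # True  — first buyer gets the ticket
print(buy_ticket(conn, 1))  # False — second buyer sees "sold out"
```

The conditional UPDATE pushes the race into the database, where the statement executes atomically — the same idea the optimistic-locking section below generalizes with a version column.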
Lost Updates
Two transactions read the same value, both modify it, last write wins — the first update is silently lost. Classic race condition.
Race Conditions
The outcome depends on the timing of operations. Same code, same inputs, different results depending on which thread runs first.
Double Processing
Two workers pick up the same job, both process it. Customer charged twice, email sent twice, inventory decremented twice.
Optimistic vs Pessimistic Locking
Optimistic Locking — Assume No Conflict
Optimistic locking doesn't lock anything upfront. It lets everyone read and modify freely, then checks for conflicts at write time using a version number or timestamp. If someone else changed the data since you read it, your write is rejected and you retry.
```sql
-- Table has a version column
-- products: id=42, name="Laptop", price=999, version=3

-- User A reads:
SELECT id, price, version FROM products WHERE id = 42;
-- → price=999, version=3

-- User B reads (same time):
SELECT id, price, version FROM products WHERE id = 42;
-- → price=999, version=3

-- User A updates (includes version check):
UPDATE products SET price = 899, version = 4
WHERE id = 42 AND version = 3;
-- → 1 row affected ✅ (version matched, update succeeded)

-- User B tries to update (same version check):
UPDATE products SET price = 799, version = 4
WHERE id = 42 AND version = 3;
-- → 0 rows affected ❌ (version is now 4, not 3!)
-- → User B detects conflict, retries with fresh data
```
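In application code, the version check pairs with a bounded retry loop: on conflict, re-read and try again. A minimal sketch using Python's built-in sqlite3 (the `products` schema mirrors the example above; the retry limit of 3 is an arbitrary choice):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO products VALUES (42, 999, 3)")

def set_price(conn, product_id, new_price, max_retries=3):
    for _ in range(max_retries):
        # Read the current data together with its version.
        row = conn.execute(
            "SELECT price, version FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        if row is None:
            raise KeyError(product_id)
        _, version = row
        # Conditional write: succeeds only if nobody bumped the version since the read.
        cur = conn.execute(
            "UPDATE products SET price = ?, version = ? "
            "WHERE id = ? AND version = ?",
            (new_price, version + 1, product_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True   # write accepted
        # Conflict: another writer won — loop re-reads fresh data and retries.
    return False          # gave up after max_retries conflicts

print(set_price(conn, 42, 899))  # True — version bumped from 3 to 4
```

If `set_price` returns False, the caller can fall back to pessimistic locking or surface a meaningful error, as discussed in the Pitfalls section.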
Strengths
- ✅No blocking — readers and writers don't wait for each other
- ✅High concurrency — many transactions proceed in parallel
- ✅No deadlocks (no locks held)
- ✅Simple to implement (version column + WHERE clause)
- ✅Best for: low-contention systems (conflicts are rare)
Costs
- ❌Retry overhead when conflicts occur
- ❌Under high contention: many retries → worse than pessimistic
- ❌Application must handle retry logic
- ❌Starvation possible (one transaction keeps losing to others)
- ❌Not suitable for: high-contention hot rows
Pessimistic Locking — Assume Conflict
Pessimistic locking acquires a lock before reading or writing. Other transactions that need the same resource must wait until the lock is released. No conflicts possible — but concurrency is reduced.
```sql
-- User A locks the row:
BEGIN;
SELECT * FROM products WHERE id = 42 FOR UPDATE;
-- → Row is now LOCKED. No one else can modify it.

-- User B tries to read the same row:
SELECT * FROM products WHERE id = 42 FOR UPDATE;
-- → BLOCKED. Waiting for User A to release the lock.

-- User A updates and commits:
UPDATE products SET price = 899 WHERE id = 42;
COMMIT;
-- → Lock released.

-- User B's query now proceeds:
-- → Reads the updated price (899)
-- → Can now modify and commit

-- No conflict possible. But User B waited.
```
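The same block-until-released semantics can be sketched outside the database with one mutex per row. This is a toy in-process analogue of SELECT ... FOR UPDATE (names like `RowLockManager` are illustrative; a real system would rely on the database's own row locks):

```python
import threading
from collections import defaultdict

class RowLockManager:
    """One lock per row id — an in-process analogue of SELECT ... FOR UPDATE."""
    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, row_id):
        with self._guard:
            return self._locks[row_id]

stock = {"A7": 1}
manager = RowLockManager()
results = []

def book_seat(seat_id):
    with manager.lock_for(seat_id):               # second booker blocks here
        if stock[seat_id] > 0:
            stock[seat_id] -= 1
            results.append("booked")
        else:
            results.append("sold out")

threads = [threading.Thread(target=book_seat, args=("A7",)) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(results))  # ['booked', 'sold out'] — exactly one sale
```

Because each seat id gets its own lock, two users booking different seats never block each other — the row-level vs table-level trade-off discussed later in this guide.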
Strengths
- ✅Prevents all conflicts (lock before access)
- ✅No retry logic needed (conflicts can't happen)
- ✅Predictable behavior (wait, then proceed)
- ✅Best for: high-contention resources (last ticket, account balance)
Costs
- ❌Blocking — transactions wait for locks (reduced concurrency)
- ❌Deadlocks possible (A locks X, waits for Y; B locks Y, waits for X)
- ❌Lock duration matters — long transactions hold locks longer
- ❌Throughput drops under high concurrency
- ❌Not suitable for: low-contention, high-throughput systems
| Dimension | Optimistic | Pessimistic |
|---|---|---|
| Assumption | Conflicts are rare | Conflicts are likely |
| Mechanism | Version check at write time | Lock before read/write |
| Blocking | No (non-blocking) | Yes (waits for lock) |
| Conflict handling | Detect and retry | Prevent entirely |
| Deadlocks | Impossible | Possible |
| Throughput | High (when conflicts rare) | Lower (blocking) |
| Best for | Product catalog, user profiles | Ticket booking, payments, inventory |
🎯 Interview Insight
The choice depends on conflict frequency. Low contention (most web apps): optimistic locking — fast, no blocking, retry on the rare conflict. High contention (last seat, flash sale): pessimistic locking — block to prevent conflicts entirely. Say: "I'd use optimistic locking by default and switch to pessimistic for hot resources where conflicts are frequent."
Distributed Locks (Redlock)
Database locks work within a single database. But what if you need to coordinate across multiple services, each with its own database? You need a distributed lock — a lock that works across machines.
The Hotel Room Key
A hotel room key ensures only one guest can access the room at a time. The front desk (lock service) issues the key. If the guest doesn't return the key within checkout time (TTL), the desk issues a new one. If the desk has multiple locations (distributed), they must agree on who has the key — that's the hard part.
```
Acquire lock:
  SET lock:payment:order-123 "worker-A" NX EX 30
  → NX: only set if key doesn't exist (atomic)
  → EX 30: expires in 30 seconds (TTL safety net)
  → Returns OK if acquired, nil if already locked

Do work:
  Process payment for order-123...

Release lock:
  Check if we still own it (value = "worker-A")
  DEL lock:payment:order-123
```

Why TTL? If worker-A crashes without releasing the lock, the lock expires after 30 seconds and other workers can proceed. Without a TTL, the lock would be held forever — a deadlock.
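The acquire/work/release flow can be simulated in-process to make the ownership check concrete. The `ToyLockStore` below is a stand-in for Redis SET NX EX / DEL semantics, not real client code — with real Redis, the check-and-delete on release must run as a Lua script to stay atomic:

```python
import time

class ToyLockStore:
    """In-memory stand-in for Redis SET NX EX / DEL semantics."""
    def __init__(self):
        self._data = {}  # key -> (owner_value, expiry_timestamp)

    def set_nx_ex(self, key, value, ttl_seconds):
        entry = self._data.get(key)
        if entry and entry[1] > time.monotonic():
            return False                          # already locked, not expired
        self._data[key] = (value, time.monotonic() + ttl_seconds)
        return True

    def release(self, key, value):
        entry = self._data.get(key)
        if entry and entry[0] == value:           # only the owner may release
            del self._data[key]
            return True
        return False                              # someone else's lock — refuse

store = ToyLockStore()
assert store.set_nx_ex("lock:payment:order-123", "worker-A", 30)      # acquired
assert not store.set_nx_ex("lock:payment:order-123", "worker-B", 30)  # blocked
assert not store.release("lock:payment:order-123", "worker-B")        # not owner
assert store.release("lock:payment:order-123", "worker-A")            # owner releases
assert store.set_nx_ex("lock:payment:order-123", "worker-B", 30)      # now free
```

Storing a per-worker value and checking it on release is what prevents worker-B from deleting a lock that worker-A still holds.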
Redlock — Multi-Node Distributed Lock
Problem: a single Redis node is a single point of failure. If Redis crashes, the lock is lost and two workers proceed. Redlock's solution: acquire the lock on a MAJORITY of N independent Redis nodes.

```
5 independent Redis nodes: [R1, R2, R3, R4, R5]

Acquire:
  1. Try SET lock NX EX 30 on ALL 5 nodes
  2. If acquired on ≥ 3 nodes (majority) within time limit → LOCKED ✅
  3. If acquired on < 3 nodes → FAILED, release all acquired locks

Worker A acquires on R1, R2, R3 → majority (3/5) → lock granted ✅
Worker B tries → R1, R2, R3 already locked → fails → waits

Release:
  DEL lock on ALL 5 nodes (even ones where acquire failed)
```

Failure scenario: R3 crashes and restarts without the lock → Worker A now effectively holds only R1 and R2 (2/5, not a majority), and another worker could acquire R3, R4, R5 — the lock is effectively lost. This is a known Redlock limitation. For critical systems, use a consensus-based lock (Zookeeper, etcd).
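The majority rule can be sketched over N simulated nodes (each "node" here is just a dict of held keys; real Redlock additionally subtracts elapsed time and a clock-drift allowance from the lock's validity window):

```python
def acquire_redlock(nodes, key, owner):
    """Try to take the lock on every node; succeed only with a majority."""
    acquired = []
    for node in nodes:                     # node: dict mapping key -> owner
        if key not in node:
            node[key] = owner
            acquired.append(node)
    if len(acquired) >= len(nodes) // 2 + 1:
        return True                        # majority reached -> lock granted
    for node in acquired:                  # minority -> roll back what we took
        del node[key]
    return False

def release_redlock(nodes, key, owner):
    # Release on every node where we are recorded as the owner.
    for node in nodes:
        if node.get(key) == owner:
            del node[key]

nodes = [{} for _ in range(5)]
assert acquire_redlock(nodes, "lock:x", "A")      # A acquires on all 5
assert not acquire_redlock(nodes, "lock:x", "B")  # B gets 0/5 -> fails
release_redlock(nodes, "lock:x", "A")
assert acquire_redlock(nodes, "lock:x", "B")      # now B succeeds
```

Rolling back a failed minority acquisition is essential: without it, two workers could each hold a non-majority slice and block everyone forever.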
When to use distributed locks
- ✅Preventing duplicate job processing across workers
- ✅Coordinating access to external APIs (rate-limited)
- ✅Ensuring only one instance runs a scheduled task
- ✅Cross-service resource coordination
Why distributed locks are hard
- ❌Network partitions can cause split-brain (two holders)
- ❌Clock skew can cause premature TTL expiration
- ❌GC pauses can cause a holder to lose the lock unknowingly
- ❌Redlock has known theoretical issues (Kleppmann critique)
- ❌For strong guarantees: use Zookeeper or etcd (consensus-based)
🎯 Interview Insight
Distributed locks are a last resort, not a first choice. Say: "I'd first try to avoid distributed locks by using idempotent operations or partitioning work so no coordination is needed. If a lock is unavoidable, I'd use Redis with TTL for non-critical coordination, or Zookeeper/etcd for critical operations that need strong guarantees."
Row-Level vs Table-Level Locks
🔹 Row-Level Locks
- Lock only the specific row being modified
- Other rows in the same table remain accessible
- High concurrency — many transactions on different rows
- More overhead per lock (track each row individually)
- Default in PostgreSQL and MySQL/InnoDB
🔹 Table-Level Locks
- Lock the entire table
- No other transaction can read or write any row
- Low concurrency — one transaction at a time
- Less overhead (one lock for the whole table)
- Used for: schema changes, bulk operations, MyISAM
Table: orders (10M rows)

Row-level lock (PostgreSQL default):

```
Tx A: UPDATE orders SET status='shipped' WHERE id = 42;  → locks row 42 only
Tx B: UPDATE orders SET status='shipped' WHERE id = 99;  → proceeds ✅ (different row)
Tx C: SELECT * FROM orders WHERE id = 500;               → proceeds ✅ (read, no conflict)
→ High concurrency: 3 transactions run in parallel
```

Table-level lock:

```
Tx A: LOCK TABLE orders; UPDATE ... WHERE id = 42;       → locks ENTIRE table
Tx B: UPDATE orders SET status='shipped' WHERE id = 99;  → BLOCKED ❌ (table locked)
Tx C: SELECT * FROM orders WHERE id = 500;               → BLOCKED ❌ (table locked)
→ Low concurrency: only Tx A runs, B and C wait
```

For 10K concurrent transactions: with row-level locks, most proceed in parallel (different rows); with a table-level lock, all wait in a queue, one at a time.
| Dimension | Row-Level Lock | Table-Level Lock |
|---|---|---|
| Granularity | Single row | Entire table |
| Concurrency | High (parallel on different rows) | Low (one transaction at a time) |
| Overhead | Higher (per-row tracking) | Lower (one lock) |
| Deadlock risk | Higher (complex lock ordering) | Lower (coarse granularity) |
| Use case | OLTP, web apps, concurrent updates | Schema migrations, bulk imports, maintenance |
| Default in | PostgreSQL, MySQL/InnoDB | MySQL/MyISAM, explicit LOCK TABLE |
🎯 Interview Insight
Modern databases use row-level locking by default. Table-level locks are for maintenance operations (schema changes, bulk imports), not for application logic. If someone mentions table-level locking for a web app, it's a red flag — it would serialize all transactions and destroy throughput.
End-to-End Scenario
Let's design the locking strategy for a ticket booking system — the classic high-contention scenario.
🎫 Concert Ticket Booking — 50K Users, 1,000 Seats
50,000 users trying to book 1,000 seats when sales open.
Requirements: no double-booking, no overselling, fast response.
Seat selection → Pessimistic locking (SELECT FOR UPDATE)
When a user selects seat A7, the system locks that specific row: SELECT * FROM seats WHERE id = 'A7' AND status = 'available' FOR UPDATE. If another user tries to select A7 simultaneously, they wait. The first user completes the booking, the second sees 'seat taken.' Row-level lock — other seats remain bookable in parallel.
Payment processing → Distributed lock
Payment involves an external API (Stripe). Use a Redis distributed lock keyed on order_id to prevent duplicate payment calls. Lock: SET lock:payment:order-456 NX EX 60. Process payment. Release lock. If the worker crashes, the TTL releases the lock after 60 seconds.
Inventory count → Optimistic locking
The 'remaining seats' counter uses optimistic locking. Read: remaining = 847, version = 153. After booking: UPDATE events SET remaining = 846, version = 154 WHERE id = 1 AND version = 153. If another booking changed the version, retry with fresh data. The counter is a single hot row, so conflicts do happen — but each update is a fast single-statement write, so retries are cheap.
Seat map display → No locking
The seat map showing available/taken seats is read-only. No locking needed. Serve from a cache (Redis) updated on every booking. Slight staleness is acceptable — a seat might show as available for 1 second after being booked. The actual booking step (Step 1) prevents double-booking.
User selects seat A7:

```
1. Seat map (read)      → No lock (cached, read-only)
2. Reserve seat (write) → Pessimistic lock (SELECT FOR UPDATE on seat row)
3. Process payment      → Distributed lock (Redis, keyed on order_id)
4. Update counter       → Optimistic lock (version column)
5. Confirm booking      → Release all locks
```

Concurrency:

- 50K users selecting different seats → parallel (row-level locks)
- 2 users selecting the same seat → one waits (pessimistic lock)
- Payment processing → one at a time per order (distributed lock)
- Counter update → retry on conflict (optimistic lock)
Trade-offs & Decision Making
| Strategy | Blocking | Conflict Handling | Deadlock Risk | Throughput | Best For |
|---|---|---|---|---|---|
| Optimistic | No | Detect + retry | None | High (low contention) | Catalogs, profiles, settings |
| Pessimistic | Yes | Prevent entirely | Possible | Lower | Bookings, payments, inventory |
| Distributed (Redis) | Yes (with TTL) | Prevent across services | TTL prevents | Medium | Cross-service coordination |
| Distributed (Zookeeper) | Yes | Prevent with consensus | Handled | Lower | Critical coordination, leader election |
Local vs Distributed Locks
| Dimension | Local (DB Lock) | Distributed (Redis/ZK) |
|---|---|---|
| Scope | Single database | Across services and databases |
| Reliability | High (DB guarantees) | Lower (network failures, clock skew) |
| Complexity | Low (built into DB) | High (TTL, retry, split-brain) |
| Performance | Fast (local) | Slower (network round trip) |
| Use when | Contention within one DB | Contention across services |
🎯 Decision Framework
(1) Can you avoid locking entirely? Use idempotent operations or partition work. (2) Contention within one database? Use DB locks (optimistic for low contention, pessimistic for high). (3) Contention across services? Use distributed locks (Redis for best-effort, Zookeeper for strong guarantees). Always prefer the simplest option that meets your correctness requirements.
Interview Questions
Q: Optimistic vs pessimistic locking — when to use each?
A: Optimistic when conflicts are rare (most web apps): read freely, check version at write time, retry on conflict. No blocking, high throughput. Pessimistic when conflicts are frequent or consequences are severe (ticket booking, payments): lock before reading, others wait. No conflicts possible, but lower throughput. Rule of thumb: if <5% of transactions conflict → optimistic. If >20% conflict → pessimistic. In between → measure and decide.
Q: What is a distributed lock and why is it hard?
A: A distributed lock coordinates access across multiple machines. It's hard because: (1) the lock service itself can fail — if Redis crashes, the lock is lost. (2) Network partitions can cause two processes to both think they hold the lock. (3) Clock skew can cause TTL to expire prematurely. (4) GC pauses can cause a process to hold the lock past its TTL without knowing. Redlock (multi-node Redis) mitigates some issues but has theoretical limitations. For strong guarantees, use consensus-based systems (Zookeeper, etcd).
Q: Why can't a TTL alone make a distributed lock safe?
A: The fundamental issue: in a distributed system, you can't distinguish between 'the lock holder is slow' and 'the lock holder has crashed.' If you wait too long, you block progress. If you release too early (via TTL), the original holder might still be working — now two processes believe they hold the 'lock.' This is the problem fencing tokens solve: each lock acquisition gets a monotonically increasing token, and the resource rejects operations carrying old tokens. The catch is that the resource itself must participate in the locking protocol.
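A minimal model of fencing tokens: the lock service hands out a strictly increasing token with each grant, and the protected resource rejects any request carrying an older token than the newest it has seen (class names here are illustrative):

```python
class FencedLockService:
    """Issues a strictly increasing token with every lock grant."""
    def __init__(self):
        self._token = 0

    def acquire(self):
        self._token += 1
        return self._token

class FencedResource:
    """The resource participates in the protocol: it tracks the highest token."""
    def __init__(self):
        self._highest_seen = 0

    def write(self, token, payload):
        if token < self._highest_seen:
            return False              # stale holder (e.g. woke up after a GC pause)
        self._highest_seen = token
        return True                   # write accepted

lock = FencedLockService()
storage = FencedResource()

old = lock.acquire()                  # holder A gets token 1, then pauses...
new = lock.acquire()                  # TTL expired; holder B gets token 2
assert storage.write(new, "B's write")        # accepted, highest_seen = 2
assert not storage.write(old, "A's write")    # rejected: token 1 < 2
```

Even though holder A still believes it owns the lock, its stale token makes its late write harmless — correctness no longer depends on TTL timing.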
Two users try to book the last seat on a flight simultaneously
How do you prevent double-booking?
Answer: Pessimistic locking: SELECT * FROM seats WHERE flight_id = 123 AND seat = '14A' AND status = 'available' FOR UPDATE. The first transaction locks the row. The second waits. First completes the booking, second sees 'seat taken.' Alternative: optimistic locking with a version column — but for a hot resource (last seat, flash sale), pessimistic is better because the retry rate would be very high with optimistic.
A background job runs on 10 servers and must process each task exactly once
How do you prevent duplicate processing?
Answer: Option 1: Distributed lock per task. Before processing task-123, acquire lock: SET lock:task:123 NX EX 300. Only one server gets the lock. Others skip. TTL ensures the lock is released if the server crashes. Option 2 (better): Use a message queue (SQS, Kafka) with visibility timeout. Each message is delivered to one consumer. If the consumer doesn't ACK within the timeout, the message is redelivered to another consumer. No explicit locking needed — the queue handles coordination.
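Option 1 can also be expressed without a separate lock service: an atomic conditional UPDATE in the task store itself, which only one worker can win. A sketch with Python's built-in sqlite3 (the `tasks` schema and `claim_task` name are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, claimed_by TEXT)")
conn.execute("INSERT INTO tasks VALUES (123, NULL)")

def claim_task(conn, task_id, worker_id):
    # Atomic claim: succeeds only if the task is still unclaimed,
    # so exactly one of N racing workers wins.
    cur = conn.execute(
        "UPDATE tasks SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
        (worker_id, task_id),
    )
    conn.commit()
    return cur.rowcount == 1

assert claim_task(conn, 123, "server-1")      # first worker wins the claim
assert not claim_task(conn, 123, "server-2")  # every other worker skips
```

This is the same optimistic pattern from earlier applied to work distribution; unlike a TTL lock, a claim needs a separate recovery path (e.g. re-opening claims whose worker never finished).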
Pitfalls
Overusing locks
Locking every database operation 'for safety.' Locks reduce concurrency. A system that locks every read and write serializes all transactions — throughput drops to a fraction of capacity. Most reads don't need locks. Most writes on different rows don't conflict.
✅Lock only when contention is real. Use optimistic locking as the default (no blocking). Use pessimistic locking only for hot resources where conflicts are frequent. Never lock reads unless you need repeatable-read isolation for that specific transaction.
Not handling retries in optimistic locking
Using optimistic locking (version column) but not implementing retry logic. When the version check fails, the application returns an error to the user: 'Something went wrong, try again.' The user is confused and frustrated.
✅Implement automatic retries with a limit. On version conflict: re-read the data, re-apply the change, retry the update. Limit to 3-5 retries. If still failing (extreme contention), fall back to pessimistic locking or return a meaningful error: 'This item was just updated. Please review and try again.'
Deadlocks in pessimistic locking
Transaction A locks row 1, then tries to lock row 2. Transaction B locks row 2, then tries to lock row 1. Both wait forever. The database detects the deadlock and kills one transaction — but the application must handle the error and retry.
✅Always acquire locks in a consistent order (e.g., by ID ascending). If you need rows 1 and 2, always lock 1 first, then 2. Set lock timeouts (SET lock_timeout = '5s') so transactions don't wait forever. Handle deadlock errors in application code (retry the entire transaction).
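The consistent-ordering rule can be sketched in a few lines: whatever rows a transaction needs, it acquires their locks in ascending id order, so two transactions can never each hold the lock the other needs next (a toy in-process model; `transfer` and the lock table are illustrative):

```python
import threading

row_locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(from_id, to_id, action):
    # Deadlock avoidance: always acquire in ascending id order,
    # regardless of which row is the source and which the target.
    first, second = sorted((from_id, to_id))
    with row_locks[first]:
        with row_locks[second]:
            action()

done = []
# Without the sorted() ordering, these two could deadlock:
# one holds lock 1 waiting for 2, the other holds 2 waiting for 1.
t1 = threading.Thread(target=transfer, args=(1, 2, lambda: done.append("A")))
t2 = threading.Thread(target=transfer, args=(2, 1, lambda: done.append("B")))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(done))  # ['A', 'B'] — both complete, no deadlock
```

The same discipline applies to SQL: if a transaction updates rows 1 and 2, lock row 1 first (e.g. ORDER BY id in the SELECT ... FOR UPDATE), every time.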
Incorrect distributed lock implementation
Using a single Redis SET NX for a distributed lock without TTL. If the lock holder crashes, the lock is held forever — all other workers are permanently blocked. Or: releasing someone else's lock (not checking the lock value before DEL).
✅Always use TTL: SET lock:key value NX EX 30. Always check ownership before releasing: only DEL if the value matches your worker ID (use a Lua script for atomicity). For critical operations, use Redlock (multi-node) or Zookeeper. Monitor lock acquisition failures and TTL expirations.