Write Offloading
Handle high write throughput with WAL, event sourcing, async writes via message queues, and buffered batch writes. Decouple write ingestion from persistence.
The Big Picture: Why Writes Bottleneck
A database write is expensive. It must validate constraints, update indexes, write to the WAL, flush to disk, and replicate to followers, all before acknowledging the client. A single PostgreSQL instance tops out at ~10,000-20,000 writes per second. When your system needs 500,000 writes per second, direct synchronous writes to the database simply don't work.
The Restaurant Order Queue
A busy restaurant doesn't have the chef cook each dish the instant it's ordered. That would overwhelm the kitchen during rush hour. Instead: the waiter writes the order on a ticket (WAL), puts it on the order rail (queue), and tells the customer 'your order is confirmed.' The kitchen processes tickets at its own pace, batching similar orders together (batch writes). If the kitchen falls behind, orders queue up but nobody is turned away. The customer doesn't wait for the food to be cooked before getting confirmation; they wait only for the ticket to be written. That's write offloading.
Key Insight
Write offloading decouples write ingestion (accepting the data) from write persistence (storing it durably). The client gets a fast acknowledgment. The actual database write happens asynchronously, in batches, or through a pipeline. This turns a synchronous bottleneck into an asynchronous, scalable flow.
Write Offloading Overview
```
SYNCHRONOUS (direct DB write):
  Client → App Server → Database → Acknowledge
  Latency: 20-100ms (wait for the DB)
  Throughput: limited by DB write speed
  During a spike: DB overwhelmed → timeouts → failures

OFFLOADED (buffered, async):
  Client → App Server → Buffer/Queue → Acknowledge (fast!)
                            ↓
                         Worker → Database (at its own pace)
  Latency: 1-5ms (write to the queue only)
  Throughput: limited by queue capacity (much higher)
  During a spike: queue absorbs the burst → DB processes steadily
```
Smooth Traffic Spikes
A flash sale generates 100K writes/sec for 5 minutes. The DB handles 10K/sec. Without a queue: 90K writes/sec fail. With a queue: every write is accepted, and the backlog is processed over ~50 minutes.
Improve Throughput
Batching 1,000 individual INSERTs into one bulk INSERT is 10-50x faster. The queue collects writes; the worker batches them. Same data, fraction of the DB load.
Write-Ahead Log (WAL)
The Write-Ahead Log is the foundation of database durability. Before any change is applied to the actual data files, it's first written to a sequential log on disk. If the system crashes mid-operation, the WAL can be replayed to recover the exact state.
```
Write Request: UPDATE users SET name = 'Alicia' WHERE id = 42

Step 1: Append to WAL (sequential disk write)
  WAL entry: {txn: 1001, table: users, id: 42, name: "Alicia"}
  Sequential write to disk: ~0.1ms (fast!)
  Write is now DURABLE (survives a crash)

Step 2: Apply to data files (random disk write)
  Update the actual row in the users table on disk
  Update indexes
  Random I/O: ~5-20ms (slow, but can be deferred)

Step 3: Acknowledge to client
  "Write confirmed" (after the WAL write, not after the data file update)

Crash Recovery:
  System crashes after Step 1 but before Step 2
  On restart: replay WAL entries → apply missing changes
  No data loss, database is consistent
```
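To make the mechanism concrete, here is a minimal application-level sketch of the WAL idea, assuming a JSON-lines log file and an in-memory stand-in for the data files. It is not how PostgreSQL or any real engine implements its WAL; the `SimpleWAL` class and its file format are illustrative.

```python
import json
import os

class SimpleWAL:
    """Minimal write-ahead log sketch: append + fsync before applying a change."""

    def __init__(self, path: str):
        self.path = path
        self.state: dict[int, dict] = {}   # stand-in for the "data files"
        self._recover()                     # replay the log on startup

    def write(self, row_id: int, change: dict) -> None:
        # Step 1: append the change to the log and force it to disk (durable).
        with open(self.path, "a") as log:
            log.write(json.dumps({"id": row_id, "change": change}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: apply to the "data files"; safe to defer or batch because
        # the WAL entry already survives a crash.
        self._apply(row_id, change)

    def _apply(self, row_id: int, change: dict) -> None:
        self.state.setdefault(row_id, {}).update(change)

    def _recover(self) -> None:
        # Crash recovery: replay every logged entry to rebuild consistent state.
        if not os.path.exists(self.path):
            return
        with open(self.path) as log:
            for line in log:
                entry = json.loads(line)
                self._apply(entry["id"], entry["change"])

wal = SimpleWAL("users.wal")
wal.write(42, {"name": "Alicia"})   # acknowledged once the log append is durable
```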
Why WAL is fast
- ✓ Sequential writes (append-only): 10-100x faster than random I/O
- ✓ Client acknowledged after the WAL write, not after full persistence
- ✓ Actual data file updates can be batched and deferred
- ✓ Used by every major database: PostgreSQL, MySQL, SQLite, MongoDB
Trade-offs
- ✗ Extra disk write (WAL + data files = write amplification)
- ✗ WAL files accumulate and must be managed (rotation, archival)
- ✗ Recovery time depends on WAL size (more entries = longer replay)
- ✗ WAL is the bottleneck for write-heavy workloads on a single disk
Interview Insight
WAL is not a scaling pattern you implement; it's built into every database. But understanding it explains why databases can acknowledge writes quickly (sequential WAL write) while actual persistence is slower (random I/O). It also explains replication: replicas receive the WAL stream and replay it to stay in sync.
Event Sourcing
Traditional databases store current state: "Alice's balance is $500." Event sourcing stores every event that led to that state: "Alice deposited $1000, withdrew $300, deposited $100, withdrew $300." The current state is derived by replaying all events.
```
TRADITIONAL (store current state):
  users table:
    { id: 42, name: "Alicia", balance: 500 }
  ✓ You know the current balance
  ✗ You DON'T know how it got there

EVENT SOURCING (store events):
  events table:
    { id: 1, user: 42, type: "AccountCreated", data: {name: "Alice"} }
    { id: 2, user: 42, type: "Deposited",      data: {amount: 1000} }
    { id: 3, user: 42, type: "Withdrawn",      data: {amount: 300} }
    { id: 4, user: 42, type: "NameChanged",    data: {name: "Alicia"} }
    { id: 5, user: 42, type: "Deposited",      data: {amount: 100} }
    { id: 6, user: 42, type: "Withdrawn",      data: {amount: 300} }

  Current state = replay all events:
    Created → balance: 0
    +1000   → balance: 1000
    -300    → balance: 700
    +100    → balance: 800
    -300    → balance: 500
    Name: "Alicia"
  ✓ You know the current balance AND the full history
```
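A minimal sketch of deriving current state by replaying the event stream above; the event shapes follow the example, and `apply_event` is an illustrative name, not a standard API.

```python
# Rebuild current state by folding over the event stream (event sourcing).
events = [
    {"type": "AccountCreated", "data": {"name": "Alice"}},
    {"type": "Deposited",      "data": {"amount": 1000}},
    {"type": "Withdrawn",      "data": {"amount": 300}},
    {"type": "NameChanged",    "data": {"name": "Alicia"}},
    {"type": "Deposited",      "data": {"amount": 100}},
    {"type": "Withdrawn",      "data": {"amount": 300}},
]

def apply_event(state: dict, event: dict) -> dict:
    kind, data = event["type"], event["data"]
    if kind == "AccountCreated":
        return {"name": data["name"], "balance": 0}
    if kind == "Deposited":
        return {**state, "balance": state["balance"] + data["amount"]}
    if kind == "Withdrawn":
        return {**state, "balance": state["balance"] - data["amount"]}
    if kind == "NameChanged":
        return {**state, "name": data["name"]}
    return state  # unknown event types are ignored

state = {}
for event in events:
    state = apply_event(state, event)

print(state)  # {'name': 'Alicia', 'balance': 500}
```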
Strengths
- ✓ Complete audit trail: every change is recorded
- ✓ Time travel: reconstruct state at any point in time
- ✓ Debugging: replay events to reproduce bugs
- ✓ Event-driven architecture: events feed other systems
- ✓ Append-only writes: fast (no updates, no deletes)
Costs
- ✗ Rebuilding state is expensive (replay all events)
- ✗ Event schema evolution is complex (old events, new format)
- ✗ Storage grows forever (events are never deleted)
- ✗ Querying current state requires snapshots or projections
- ✗ Significant complexity vs. traditional CRUD
```
Problem: a user has 10,000 events. Rebuilding state = replay 10,000 events.

Solution: periodic snapshots
  Every 1,000 events, save a snapshot of current state:
    Snapshot at event 9000: { balance: 4200, name: "Alicia" }

  To get current state:
    Load snapshot (event 9000) → replay events 9001-10000 only
    → 1,000 replays instead of 10,000
```
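A sketch of the snapshot optimization under the same assumptions, reusing `apply_event` from the previous sketch; the list-based stores, snapshot layout, and interval are illustrative.

```python
SNAPSHOT_EVERY = 1000

def rebuild_state(events: list[dict], snapshots: list[dict], apply_event) -> dict:
    """Rebuild state from the latest snapshot plus only the events after it."""
    if snapshots:
        latest = snapshots[-1]                      # e.g. snapshot taken at event 9,000
        state = dict(latest["state"])
        start = latest["last_event_index"]          # replay only events 9,001..10,000
    else:
        state, start = {}, 0
    for event in events[start:]:
        state = apply_event(state, event)
    return state

def maybe_snapshot(snapshots: list[dict], state: dict, event_index: int) -> None:
    # Persist a snapshot every N events so future rebuilds stay cheap.
    if event_index % SNAPSHOT_EVERY == 0:
        snapshots.append({"last_event_index": event_index, "state": dict(state)})
```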
Interview Insight
Event sourcing is powerful but complex. Use it when: (1) you need a complete audit trail (financial systems, compliance), (2) you need to reconstruct past states (debugging, analytics), (3) you're building an event-driven architecture. Don't use it for simple CRUD; the complexity isn't justified.
Async Writes via Message Queues
The most common write scaling pattern: instead of writing directly to the database, the application publishes a message to a queue. A worker consumes messages and writes to the database at a sustainable pace. The client gets an immediate acknowledgment without waiting for the DB.
```
SYNCHRONOUS (without queue):
  Client → App → INSERT INTO orders (...) → wait 50ms → "Order confirmed"
  Problem: DB handles 10K writes/sec. Flash sale: 100K writes/sec → DB dies.

ASYNC (with queue):
  Client → App → Publish to Kafka/SQS → "Order received" (2ms)
                        ↓
                     Worker → INSERT INTO orders (...) at the DB's pace

  Client latency: 2ms (queue write only)
  DB load: steady 10K/sec (worker processes at a sustainable rate)
  Flash sale: queue absorbs the 100K/sec burst, worker drains it over time

Queue depth during the spike:
  T=0:    0 messages
  T=1m:   (100K - 10K) × 60 = 5.4M messages queued
  T=5m:   spike ends with ~27M messages queued; the queue starts draining
  T≈50m:  queue fully drained, all orders processed
```
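A minimal in-process sketch of the pattern, using Python's standard-library `queue` and a worker thread as stand-ins for Kafka/SQS and a separate consumer service; the handler, order shape, and DB call are illustrative assumptions.

```python
import queue
import threading

write_queue: queue.Queue = queue.Queue()   # stands in for Kafka/SQS

def handle_order(order: dict) -> str:
    # Fast path: enqueue and acknowledge immediately, no waiting on the DB.
    write_queue.put(order)
    return "Order received"

def insert_order_into_db(order: dict) -> None:
    ...  # e.g. INSERT INTO orders (...) VALUES (...)

def worker() -> None:
    # Slow path: drain the queue at the database's sustainable pace.
    while True:
        order = write_queue.get()
        try:
            insert_order_into_db(order)
        except Exception:
            write_queue.put(order)   # naive retry; real systems add backoff and a DLQ
        finally:
            write_queue.task_done()

threading.Thread(target=worker, daemon=True).start()
handle_order({"user_id": 42, "item": "widget", "qty": 3})
```

In production the retry path would use backoff and a dead letter queue rather than re-queuing forever, as discussed in the pitfalls below.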
Order Processing
User places order → message to queue → worker validates, charges payment, creates the order in the DB. User sees 'Order received' instantly. Confirmation email sent after processing.
Notifications
User action triggers notification → message to queue → worker sends push notification, email, SMS. The action completes instantly; notifications are delivered async.
Analytics Events
Page view, click, purchase → message to Kafka → consumer writes to the analytics DB in batches. The user experience is never slowed by analytics tracking.
Strengths
- ✓ Decouples producer from consumer (independent scaling)
- ✓ Absorbs traffic spikes (the queue buffers the burst)
- ✓ Retry built in (failed messages are re-queued automatically)
- ✓ Multiple consumers can process in parallel
- ✓ The most common write scaling pattern in production
Trade-offs
- ✗ Eventual consistency (data not in the DB immediately)
- ✗ Message ordering challenges (parallel consumers)
- ✗ Dead letter queue needed for poison messages
- ✗ Extra infrastructure (Kafka/SQS/RabbitMQ)
- ✗ Debugging is harder (async flow; distributed tracing needed)
Interview Insight
When asked "how do you handle write spikes?", message queues are the default answer. Say: "I'd put a Kafka topic between the API and the database. The API publishes messages at any rate. Workers consume at the DB's sustainable rate. During spikes, the queue absorbs the burst. We trade immediate consistency for throughput and resilience."
Buffered Batch Writes
Instead of writing each record individually, buffer multiple writes in memory and flush them as a single batch. One bulk INSERT of 1,000 rows is 10-50x faster than 1,000 individual INSERTs, because you pay the network round trip and transaction overhead only once.
```
INDIVIDUAL WRITES (1,000 separate INSERTs):
  INSERT INTO events (user_id, type, data) VALUES (1, 'click', '...');
  INSERT INTO events (user_id, type, data) VALUES (2, 'view', '...');
  ... (998 more)
  → 1,000 network round trips
  → 1,000 transaction commits
  → ~5 seconds total

BATCH WRITE (1 bulk INSERT):
  INSERT INTO events (user_id, type, data) VALUES
    (1, 'click', '...'),
    (2, 'view', '...'),
    ... (998 more);
  → 1 network round trip
  → 1 transaction commit
  → ~100ms total (50x faster)

Buffering flow:
  Incoming writes → in-memory buffer (array)
  When the buffer reaches 1,000 items OR 100ms has passed:
    → flush: bulk INSERT all buffered items
    → clear the buffer, start collecting again
```
Time-Based Flushing
- Flush every N milliseconds (e.g., every 100ms)
- Guarantees maximum latency before data is persisted
- Batch size varies (small during low traffic)
- Good for: systems with latency requirements
Size-Based Flushing
- Flush when buffer reaches N items (e.g., 1,000)
- Optimal batch size for DB performance
- Latency varies (slow during low traffic)
- Good for: high-throughput systems where latency is flexible
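A sketch combining both triggers: flush when the buffer reaches 1,000 rows or when 100ms has elapsed, whichever comes first. The sqlite3 table, thresholds, and `BufferedWriter` name are illustrative; the same shape works with any driver that supports bulk inserts.

```python
import sqlite3
import time

class BufferedWriter:
    """Buffer rows in memory and flush them as one bulk INSERT (size OR time trigger)."""

    def __init__(self, conn, max_rows: int = 1000, max_wait_s: float = 0.1):
        self.conn = conn
        self.max_rows = max_rows
        self.max_wait_s = max_wait_s
        self.buffer: list[tuple] = []
        self.last_flush = time.monotonic()

    def write(self, row: tuple) -> None:
        self.buffer.append(row)
        if (len(self.buffer) >= self.max_rows
                or time.monotonic() - self.last_flush >= self.max_wait_s):
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        # One round trip, one transaction, many rows.
        self.conn.executemany(
            "INSERT INTO events (user_id, type, data) VALUES (?, ?, ?)", self.buffer)
        self.conn.commit()
        self.buffer.clear()
        self.last_flush = time.monotonic()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, type TEXT, data TEXT)")
writer = BufferedWriter(conn)
for i in range(2500):
    writer.write((i, "click", "{}"))
writer.flush()  # flush the remainder on shutdown so buffered rows aren't lost
```

Note that the time trigger here only fires when a new write arrives; a production version would also run a background timer and flush on shutdown so rows never sit in the buffer indefinitely.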
Strengths
- ✓ 10-50x throughput improvement over individual writes
- ✓ Reduces DB connection usage (fewer transactions)
- ✓ Simple to implement (buffer + flush timer)
- ✓ Works with any database (SQL, NoSQL, object storage)
Risks
- ✗ Data loss if the process crashes before the buffer is flushed
- ✗ Increased latency (data sits in the buffer before persisting)
- ✗ Buffer memory usage during high traffic
- ✗ Must handle flush failures (retry the entire batch)
Combining Queue + Batch
The most powerful pattern: Kafka queue + batch consumer. The queue absorbs spikes. The consumer reads messages in batches (e.g., 1,000 at a time) and does a single bulk INSERT. You get spike absorption AND batch efficiency. This is how most analytics and logging pipelines work.
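A sketch of the combined pattern, again using the standard-library `queue.Queue` as a stand-in for Kafka: the consumer pulls up to 1,000 messages per iteration and persists them with one bulk INSERT. Names and sizes are illustrative.

```python
import queue

def consume_batches(q: queue.Queue, conn, batch_size: int = 1000,
                    poll_timeout_s: float = 1.0) -> None:
    """Drain the queue in batches: spike absorption from the queue,
    throughput from the bulk INSERT."""
    while True:
        batch = []
        try:
            batch.append(q.get(timeout=poll_timeout_s))   # block for the first message
            while len(batch) < batch_size:
                batch.append(q.get_nowait())               # then take whatever is waiting
        except queue.Empty:
            pass
        if batch:
            # One bulk INSERT per batch; rows are (user_id, type, data) tuples.
            conn.executemany(
                "INSERT INTO events (user_id, type, data) VALUES (?, ?, ?)", batch)
            conn.commit()
```

With Kafka, the same loop would poll a batch of records and commit offsets only after the bulk INSERT succeeds, giving at-least-once delivery; pair it with idempotent writes (see the pitfalls below).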
End-to-End Scenario
Let's design the write pipeline for a logging platform ingesting 500,000 events per second from 10,000 servers.
Logging Platform: 500K Events/sec
Each event: ~500 bytes (timestamp, service, level, message).
Storage: ClickHouse (column-store, optimized for analytics).
ClickHouse bulk insert capacity: ~50K rows/sec per node.
Ingestion: Kafka (message queue)
10,000 servers publish log events to Kafka topics. Kafka handles 500K writes/sec easily (append-only, sequential I/O). Events are partitioned by service name for ordering. Client SDKs get acknowledgment in ~2ms. No event is lost: Kafka persists to disk.
Buffering: Consumer batches
10 consumer workers read from Kafka. Each consumer buffers events in memory. Flush trigger: 10,000 events OR 1 second (whichever comes first). Each flush = one bulk INSERT of 10,000 rows into ClickHouse.
Persistence: ClickHouse (batch writes)
10 consumers × 10,000 rows/flush × 1 flush/sec = 100K rows/sec into ClickHouse at the baseline settings. Since ingestion runs at 500K/sec, consumers flush larger batches (and more often) to keep pace, and Kafka buffers any excess during spikes. ClickHouse handles bulk inserts efficiently (column-store, compressed).
Event sourcing: Full history
Kafka retains all events for 7 days. If ClickHouse needs to be rebuilt (schema change, data corruption), replay events from Kafka. The event log IS the source of truth. ClickHouse is a derived view.
```
10,000 Servers
      │  500K events/sec
      ▼
┌──────────────┐
│    Kafka     │  absorbs the full ingestion rate
│  (3 brokers) │  partitioned by service name
│              │  7-day retention (event sourcing)
└──────┬───────┘
       │  Consumer group (10 workers)
       │  each buffers 10K events → bulk INSERT
       ▼
┌──────────────┐
│  ClickHouse  │  receives batched bulk inserts
│  (3 nodes)   │  column-store, compressed
│              │  handles 100K+ rows/sec per node
└──────────────┘

Spike handling:
  Normal (500K/sec): Kafka → consumers keep up
  Spike (2M/sec):    Kafka queues the 1.5M/sec excess
  After the spike:   consumers drain the backlog
  No data loss, no DB overload
```
Trade-offs & Decision Making
| Technique | Throughput Gain | Latency Impact | Durability | Complexity | Best For |
|---|---|---|---|---|---|
| WAL | Moderate (sequential I/O) | Low (fast append) | High (crash recovery) | Built-in | Every database (foundation) |
| Event Sourcing | High (append-only) | Low (fast append) | High (full history) | Very high | Audit trails, financial systems |
| Message Queue | Very high (decoupled) | Higher (async processing) | High (queue persists) | Medium | Spike absorption, decoupling |
| Batch Writes | 10-50x | Higher (buffer delay) | Risk (buffer loss) | Low | Logging, analytics, bulk ingestion |
Sync vs Async Writes
| Dimension | Synchronous | Asynchronous (Offloaded) |
|---|---|---|
| Client latency | 20-100ms (wait for DB) | 1-5ms (write to queue/buffer) |
| Consistency | Immediate (data in DB on response) | Eventual (data in DB later) |
| Throughput | Limited by DB write speed | Limited by queue capacity (much higher) |
| Spike handling | DB overwhelmed ā failures | Queue absorbs burst ā smooth processing |
| Complexity | Simple (direct write) | Higher (queue, workers, retry logic) |
| Use when | Strong consistency required | Throughput > consistency |
Event Sourcing vs Traditional DB
| Dimension | Traditional (Store State) | Event Sourcing (Store Events) |
|---|---|---|
| What's stored | Current state only | Every event that changed state |
| History | Lost (overwritten on update) | Complete (replay any point in time) |
| Write speed | Moderate (update in place) | Fast (append-only) |
| Read speed | Fast (read current state) | Slow without snapshots (replay events) |
| Storage | Compact (current state only) | Grows forever (all events) |
| Complexity | Low (standard CRUD) | High (event schema, snapshots, projections) |
Decision Framework
Need immediate consistency? → Synchronous writes. Need to handle spikes? → Message queue. Need an audit trail? → Event sourcing. Need throughput? → Batch writes. Most systems combine: a queue for spike absorption + a batch consumer for throughput + WAL in the DB for durability.
Interview Questions
Q: Why use a Write-Ahead Log?
A: WAL ensures durability without waiting for slow random I/O. The database appends changes to a sequential log file (fast) before updating the actual data files (slow). If the system crashes after the WAL write but before the data file update, the WAL is replayed on recovery, so no data is lost. WAL also enables replication: replicas receive the WAL stream and replay it to stay in sync. Every major database (PostgreSQL, MySQL, MongoDB) uses WAL internally.
Q: What is event sourcing?
A: Instead of storing current state (balance = $500), you store every event that led to that state (deposited $1000, withdrew $300, deposited $100, withdrew $300). Current state is derived by replaying events. Benefits: complete audit trail, time-travel debugging, natural fit for event-driven architectures. Costs: rebuilding state is expensive (use snapshots), event schema evolution is complex, storage grows forever. Use for: financial systems, compliance-heavy domains, event-driven architectures.
Q: Why use message queues for writes?
A: Message queues decouple write ingestion from persistence. The API publishes to the queue (2ms) and responds immediately. Workers consume from the queue and write to the DB at a sustainable pace. Benefits: (1) absorbs traffic spikes without DB overload, (2) automatic retry on failure, (3) independent scaling of producers and consumers, (4) multiple consumers can process in parallel. This is the most common write scaling pattern, used by virtually every large-scale system.
Q: When would you batch writes?
A: When you have high-volume, individually small writes: logging events, analytics clicks, IoT sensor data. One bulk INSERT of 1,000 rows is 10-50x faster than 1,000 individual INSERTs (one network round trip, one transaction commit). Combine with a message queue: Kafka absorbs the stream, a consumer reads batches of 1,000 messages and does one bulk INSERT. Trade-off: increased latency (data sits in the buffer) and a risk of data loss if the process crashes before the buffer is flushed.
Pitfalls
Ignoring failure handling in async writes
Publishing to a queue and assuming the write will succeed. The worker might crash, the DB might reject the data, the message might be malformed. Without retry logic and dead letter queues, writes silently disappear.
Implement: (1) automatic retries with exponential backoff, (2) a dead letter queue for messages that fail after N retries, (3) monitoring on queue depth and DLQ size, (4) idempotent consumers (processing the same message twice produces the same result).
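A sketch of points (1) and (2): retry with exponential backoff, then route the message to a dead letter queue. `write_to_database` and `send_to_dead_letter_queue` are hypothetical stand-ins for the real persistence and DLQ calls.

```python
import time

MAX_RETRIES = 5

def write_to_database(message: dict) -> None:
    ...  # e.g. INSERT into the target table

def send_to_dead_letter_queue(message: dict) -> None:
    ...  # e.g. publish to a DLQ topic for later inspection; alert on DLQ growth

def process_with_retries(message: dict) -> None:
    """Retry a failed write with exponential backoff, then dead-letter it."""
    for attempt in range(MAX_RETRIES):
        try:
            write_to_database(message)
            return
        except Exception:
            time.sleep(min(2 ** attempt, 30))  # 1s, 2s, 4s, ... capped at 30s
    send_to_dead_letter_queue(message)
```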
Not handling retries correctly
Retrying a failed write without idempotency. The first attempt succeeded but the acknowledgment was lost. The retry creates a duplicate order, charges the customer twice, or inserts duplicate data.
Make all write operations idempotent. Use idempotency keys: the client sends a unique request ID, and the server checks whether that ID was already processed before executing. If yes, return the previous result; if no, process the request and store the ID.
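A sketch of the idempotency-key check, using an in-memory dict where a real system would use a database table or Redis; the function names and result shape are illustrative.

```python
# idempotency_key -> previous result (use a DB table or Redis in production)
processed: dict[str, dict] = {}

def charge_and_insert_order(order: dict) -> dict:
    ...  # charge payment + INSERT order; committed together with the idempotency key
    return {"status": "created"}

def create_order(idempotency_key: str, order: dict) -> dict:
    # Duplicate request (e.g. a retry after a lost ack): return the original result.
    if idempotency_key in processed:
        return processed[idempotency_key]
    result = charge_and_insert_order(order)
    processed[idempotency_key] = result   # record the key alongside the write
    return result
```

In a real implementation the key lookup and the write should share one transaction (or the key column should carry a unique constraint), so two concurrent retries cannot both execute.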
Data loss in buffers
Buffering 10,000 events in application memory for batch writes. The process crashes and all 10,000 events are lost. For analytics, this might be acceptable. For orders or payments, it's catastrophic.
Use a durable buffer (Kafka, not in-memory). Kafka persists messages to disk before acknowledging. If the consumer crashes, messages are re-delivered from Kafka. The buffer is the queue, not application memory. For critical data, never buffer in memory without a durable backing store.
Overcomplicating with event sourcing
Using event sourcing for a simple CRUD application. The team now manages event schemas, snapshot logic, projection rebuilds, and eventual consistency, all for a system that handles 100 writes/sec and doesn't need an audit trail.
Event sourcing is for specific use cases: financial systems (audit trail required by law), event-driven architectures (events are the primary communication mechanism), or systems that need time-travel debugging. For standard web apps, a traditional database with soft deletes and an audit log table is simpler and sufficient.