Transactions & Concurrency
PostgreSQL's core guarantee. A deep understanding of transactions, isolation levels, and locking is what separates senior engineers from everyone else.
ACID Properties
ACID is not a buzzword — it's the set of guarantees that makes PostgreSQL trustworthy for critical data. Every property has a concrete mechanism behind it.
| Property | Guarantee | Mechanism |
|---|---|---|
| Atomicity | All or nothing — partial transactions never visible | MVCC — aborting just marks the transaction aborted; its row versions are ignored (PostgreSQL has no undo log) |
| Consistency | Constraints enforced, invariants maintained after every transaction | CHECK, FK, UNIQUE, EXCLUDE constraints |
| Isolation | Concurrent transactions don't interfere (with caveats per level) | MVCC + snapshot isolation |
| Durability | Committed data survives crashes | WAL written to disk before commit acknowledged |
The Bank Transfer
Transfer $100 from Account A to Account B. Atomicity: both the debit and credit happen, or neither does — you never lose $100 into the void. Consistency: total money in the system is unchanged after the transfer. Isolation: a concurrent balance check sees either the state before or after the transfer, never the intermediate state where A is debited but B isn't credited. Durability: once the transfer is confirmed, it survives a power outage.
Transaction Basics
```sql
-- Explicit transaction
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Rollback on error
BEGIN;
INSERT INTO orders (user_id, total) VALUES (42, 99.99);
-- Something goes wrong...
ROLLBACK;  -- nothing was committed

-- Savepoints — partial rollback within a transaction
BEGIN;
INSERT INTO orders (user_id, total) VALUES (42, 99.99);
SAVEPOINT before_items;
INSERT INTO order_items (order_id, product_id) VALUES (1, 999);
-- Oops, product 999 doesn't exist
ROLLBACK TO SAVEPOINT before_items;
-- Order still exists, just no items
INSERT INTO order_items (order_id, product_id) VALUES (1, 1);
COMMIT;
```
Long Transactions Are Expensive
A long-running transaction holds its snapshot open, preventing VACUUM from cleaning dead tuples created after the transaction started. A transaction open for hours can cause table bloat across the entire database. Keep transactions short — do your computation outside the transaction, then BEGIN/COMMIT quickly.
In autocommit mode (the default for most drivers), every statement is its own transaction. UPDATE accounts SET balance = 0 without an explicit BEGIN is automatically wrapped in BEGIN/COMMIT.
Isolation Levels
| Level | Snapshot Taken | Anomalies Prevented | Use Case |
|---|---|---|---|
| Read Committed (default) | At each statement start | Dirty reads | General OLTP — good enough for most apps |
| Repeatable Read | At transaction start | Dirty reads, non-repeatable reads, phantoms | Reports that need consistent snapshot |
| Serializable | At transaction start + conflict detection | All anomalies (including write skew) | Financial systems, inventory — full correctness |
PostgreSQL Has No True 'Read Uncommitted'
Setting READ UNCOMMITTED in PostgreSQL behaves identically to READ COMMITTED. PostgreSQL never exposes uncommitted data — dirty reads are impossible at any isolation level. This is stricter than the SQL standard requires.
```sql
-- Per-transaction
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM accounts WHERE id = 1;
-- ... logic ...
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
-- If a serialization conflict is detected, PostgreSQL aborts with:
-- ERROR: could not serialize access due to concurrent update
-- Application must retry the transaction

-- Per-session default
SET default_transaction_isolation = 'repeatable read';

-- Check current level
SHOW transaction_isolation;
```
Serializable Snapshot Isolation (SSI)
PostgreSQL implements Serializable using SSI — it doesn't use traditional locking (which would block). Instead, it tracks read/write dependencies between concurrent transactions and aborts one if a cycle is detected. This means Serializable in PostgreSQL is optimistic: it doesn't block, but it may abort and require retry.
Concurrency Anomalies
| Anomaly | What Happens | Prevented At |
|---|---|---|
| Dirty Read | Reading uncommitted data from another transaction | All levels (PG never allows this) |
| Non-repeatable Read | Same row returns different value within one transaction | Repeatable Read and above |
| Phantom Read | Same query returns different rows within one transaction | Repeatable Read and above (PG is stricter than standard) |
| Lost Update | Two transactions read-modify-write same row, one overwrites the other | Repeatable Read and above (at Read Committed, use SELECT FOR UPDATE) |
| Write Skew | Two transactions read overlapping data, both write based on stale read | Serializable only |
Write Skew Example
```sql
-- Rule: at least one doctor must be on-call at all times
-- Two doctors are on-call. Both try to go off-call simultaneously.

-- Transaction 1 (Dr. Alice):
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT count(*) FROM doctors WHERE on_call = true;  -- returns 2
-- "There are 2 on-call, I can safely go off-call"
UPDATE doctors SET on_call = false WHERE name = 'Alice';
COMMIT;  -- succeeds

-- Transaction 2 (Dr. Bob) — runs concurrently:
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT count(*) FROM doctors WHERE on_call = true;  -- returns 2 (snapshot!)
-- "There are 2 on-call, I can safely go off-call"
UPDATE doctors SET on_call = false WHERE name = 'Bob';
COMMIT;  -- succeeds

-- Result: ZERO doctors on-call! Constraint violated.
-- Fix: use SERIALIZABLE isolation — one transaction would be aborted.
```
When to Use Serializable
Use Serializable when your correctness depends on reading data that other transactions might concurrently modify — and your write decision depends on what you read. Classic cases: inventory (don't oversell), scheduling (don't double-book), financial (don't overdraw). The cost: you must handle serialization failures with retry logic.
Row-Level Locking
| Lock Mode | Purpose | Blocks |
|---|---|---|
| FOR UPDATE | Exclusive lock — I will modify this row | All four row-lock modes |
| FOR NO KEY UPDATE | Like FOR UPDATE but allows concurrent FOR KEY SHARE | FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE |
| FOR SHARE | Shared lock — prevent modification while I read | FOR UPDATE, FOR NO KEY UPDATE |
| FOR KEY SHARE | Weakest — only prevents key column changes | FOR UPDATE only |
```sql
-- Pessimistic locking: lock the row before modifying
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Row is now locked — other transactions wait
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- NOWAIT: fail immediately if row is locked (don't wait)
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
-- ERROR: could not obtain lock on row — handle in application

-- SKIP LOCKED: skip locked rows (queue worker pattern)
SELECT * FROM tasks
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- Multiple workers can process different tasks concurrently
-- Each worker skips rows already being processed by others
```
SKIP LOCKED — The Queue Pattern
SELECT FOR UPDATE SKIP LOCKED is how you build a reliable job queue in PostgreSQL without Redis or RabbitMQ. Multiple workers SELECT the next pending task with SKIP LOCKED — each gets a different row, no contention, no duplicate processing. Simple, reliable, and transactional.
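The claiming behavior can be sketched without a database: the simulation below mimics SKIP LOCKED semantics with an in-memory task table and a set of "locked" rows (the names `tasks`, `claim_next`, and `worker` are illustrative, not a real API). Each worker atomically takes the first pending, unlocked task and skips rows another worker holds.

```python
import threading

# In-memory stand-in for the tasks table: id -> status
tasks = {i: "pending" for i in range(1, 7)}
locked = set()              # rows currently locked by an open "transaction"
guard = threading.Lock()    # models PostgreSQL's internal lock manager
claims = {}                 # worker name -> list of task ids it processed


def claim_next():
    """SELECT ... FOR UPDATE SKIP LOCKED: return the first pending,
    unlocked row id, or None if every candidate is locked or done."""
    with guard:
        for tid, status in sorted(tasks.items()):
            if status == "pending" and tid not in locked:
                locked.add(tid)
                return tid
    return None


def worker(name):
    while True:
        tid = claim_next()
        if tid is None:
            return
        tasks[tid] = "completed"       # do the work inside the "transaction"
        with guard:
            locked.discard(tid)        # COMMIT releases the row lock
        claims.setdefault(name, []).append(tid)


threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the pending check and the lock acquisition happen atomically, no two workers ever claim the same task — the same property FOR UPDATE SKIP LOCKED gives you for free.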
Deadlocks
Deadlocks occur when two transactions each hold a lock the other needs. PostgreSQL automatically detects deadlocks (within ~1 second) and aborts one transaction as the victim. The application must retry.
```sql
-- Transaction 1:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- locks row 1
UPDATE accounts SET balance = balance + 10 WHERE id = 2;  -- waits for row 2

-- Transaction 2 (concurrent):
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;  -- locks row 2
UPDATE accounts SET balance = balance + 10 WHERE id = 1;  -- waits for row 1

-- DEADLOCK! PostgreSQL aborts one transaction.
-- Prevention: always lock rows in consistent order (e.g., by ID ascending)
```
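The consistent-ordering prevention rule is easy to demonstrate in application code. This sketch (in-memory, with `threading.Lock` standing in for row locks) runs two opposing transfer streams that would deadlock under naive source-then-destination locking; sorting the ids first makes deadlock impossible:

```python
import threading

balances = {1: 100, 2: 100}
row_locks = {1: threading.Lock(), 2: threading.Lock()}


def transfer(src, dst, amount):
    """Always acquire row locks in ascending id order, regardless of
    transfer direction, so two opposing transfers cannot deadlock."""
    first, second = sorted((src, dst))
    with row_locks[first], row_locks[second]:
        balances[src] -= amount
        balances[dst] += amount


# Opposing transfers: 1->2 and 2->1, interleaved heavily
t1 = threading.Thread(target=lambda: [transfer(1, 2, 10) for _ in range(1000)])
t2 = threading.Thread(target=lambda: [transfer(2, 1, 10) for _ in range(1000)])
t1.start(); t2.start()
t1.join(); t2.join()
```

The same idea in SQL: when updating multiple rows in one transaction, ORDER BY id in the locking SELECT so every transaction takes locks in the same sequence.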
Advisory Locks
Advisory locks are application-level cooperative locks — PostgreSQL doesn't enforce them on any table or row. Your application decides what the lock ID means. They're useful for distributed locking without deploying Redis.
```sql
-- Session-level advisory lock (held until session ends or explicit unlock)
SELECT pg_advisory_lock(12345);  -- blocks if another session holds it
-- ... do exclusive work ...
SELECT pg_advisory_unlock(12345);

-- Transaction-level advisory lock (released at COMMIT/ROLLBACK)
BEGIN;
SELECT pg_advisory_xact_lock(hashtext('process-payments'));
-- ... exclusive payment processing ...
COMMIT;  -- lock automatically released

-- Try lock (non-blocking — returns true/false)
SELECT pg_try_advisory_lock(12345);  -- returns false if already held

-- Use case: prevent duplicate cron job execution
SELECT pg_try_advisory_lock(hashtext('daily-report-job'));
-- If false, another instance is already running — exit gracefully
```
Advisory Lock Use Cases
- ✅ Distributed locking without Redis — use advisory locks when all workers connect to the same PostgreSQL
- ✅ Preventing duplicate cron job execution across multiple app instances
- ✅ Serializing access to external resources (API rate limits, file processing)
- ✅ Application-level mutex for operations that don't map to a single row
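Advisory lock functions take an integer key, so applications that name their locks need a stable string-to-integer mapping. PostgreSQL's hashtext() works but returns only 32 bits; a common alternative (sketched below — the sha256-based scheme is a convention, not anything PostgreSQL requires) derives a signed 64-bit key for the bigint variants of pg_advisory_lock:

```python
import hashlib


def advisory_key(name: str) -> int:
    """Derive a stable signed 64-bit key for pg_advisory_lock(bigint)
    from a lock name. Same name always yields the same key, across
    processes and machines (unlike Python's built-in hash())."""
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


key = advisory_key("daily-report-job")
```

The application then runs `SELECT pg_try_advisory_lock(%s)` with this key; every instance computes the same key for the same job name, so only one acquires the lock.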
Optimistic Concurrency Control
Optimistic concurrency avoids holding locks during "think time." Read the row, remember its version, do your computation, then UPDATE with a WHERE clause that checks the version hasn't changed. If it has, retry.
```sql
-- Table with version column
CREATE TABLE products (
    id bigint PRIMARY KEY,
    name text NOT NULL,
    price numeric(10,2) NOT NULL,
    version integer NOT NULL DEFAULT 1
);

-- Application reads product
SELECT id, name, price, version FROM products WHERE id = 42;
-- Returns: version = 5

-- Application computes new price, then updates with version check
UPDATE products
SET price = 29.99, version = version + 1
WHERE id = 42 AND version = 5;
-- If another transaction modified the row, version != 5
-- UPDATE affects 0 rows — application detects this and retries

-- Alternative: use xmin (system column) as version
SELECT id, name, price, xmin FROM products WHERE id = 42;
-- xmin changes on every update — no need for explicit version column
UPDATE products SET price = 29.99 WHERE id = 42 AND xmin = '12345';
```
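The application-side half of this pattern is a read-compute-conditionally-write loop. A minimal in-memory sketch (the dict stands in for the products table; `update_price` plays the role of the versioned UPDATE and returns a rowcount):

```python
# In-memory stand-in for the products table: id -> row
products = {42: {"price": 19.99, "version": 5}}


def update_price(pid, new_price, expected_version):
    """UPDATE ... WHERE id = %s AND version = %s — returns rowcount.
    Matches 0 rows if another writer bumped the version first."""
    row = products.get(pid)
    if row is None or row["version"] != expected_version:
        return 0
    row["price"] = new_price
    row["version"] += 1
    return 1


def set_price_optimistically(pid, new_price, max_attempts=3):
    for _ in range(max_attempts):
        snapshot = dict(products[pid])            # SELECT price, version
        # ... "think time": compute, validate, let the user edit ...
        if update_price(pid, new_price, snapshot["version"]):
            return True                           # UPDATE matched 1 row
    return False                                  # give up after retries


ok = set_price_optimistically(42, 29.99)
```

No lock is held during think time; a concurrent writer simply causes a rowcount of 0 and a fresh read-retry cycle.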
| Approach | Pessimistic (FOR UPDATE) | Optimistic (version check) |
|---|---|---|
| Lock held during | Entire transaction | No lock held |
| Contention handling | Blocks waiting transactions | Retry on conflict |
| Best for | Short transactions, high contention | Long think time, low contention |
| Deadlock risk | Yes | No (no locks held) |
| Starvation risk | Low | High under heavy contention (repeated retries) |
Interview Questions
Q: What's the difference between Read Committed and Repeatable Read in PostgreSQL?
A: Read Committed takes a new snapshot at each statement — you might see different data if another transaction commits between your statements. Repeatable Read takes one snapshot at transaction start — you see a frozen view of the database for the entire transaction. Use Repeatable Read for reports or operations that need a consistent view across multiple queries.
Q: How does PostgreSQL implement Serializable isolation without blocking?
A: PostgreSQL uses Serializable Snapshot Isolation (SSI). Transactions run concurrently using MVCC snapshots (no blocking). PostgreSQL tracks read/write dependencies between transactions. If it detects a dependency cycle that could produce a non-serializable result, it aborts one transaction with a serialization failure. The application must retry. This is optimistic — no blocking, but possible aborts.
Q: How would you implement a job queue in PostgreSQL?
A: Use SELECT FOR UPDATE SKIP LOCKED. Workers query: SELECT * FROM jobs WHERE status = 'pending' ORDER BY created_at LIMIT 1 FOR UPDATE SKIP LOCKED. Each worker gets a different unlocked row. Update status to 'processing', do the work, then update to 'completed' and COMMIT. If the worker crashes, the transaction rolls back and the job becomes available again. No external queue needed.
Q: What is write skew and how do you prevent it?
A: Write skew: two transactions read overlapping data, make decisions based on what they read, and write non-conflicting rows — but the combined result violates a constraint. Example: two doctors both go off-call because each sees the other is still on-call. Prevention: use SERIALIZABLE isolation (PostgreSQL detects the conflict) or use explicit locking (SELECT FOR UPDATE on the rows you're basing your decision on).
Q: When would you use advisory locks over row-level locks?
A: Advisory locks are for application-level coordination that doesn't map to a specific row: preventing duplicate cron job execution, serializing access to external APIs, coordinating file processing across workers. They're cooperative — PostgreSQL doesn't enforce them on data. Use them when all workers connect to the same PostgreSQL and you'd otherwise need Redis just for distributed locking.
Common Mistakes
Not handling serialization failures
Using SERIALIZABLE isolation without retry logic — transactions fail with 'could not serialize access' and the application crashes.
✅ Serializable requires retry logic. Wrap your transaction in a retry loop that catches serialization_failure errors and re-executes the entire transaction (not just the failed statement). Typically 3-5 retries with exponential backoff.
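The retry loop looks roughly like this sketch. `SerializationFailure` and `flaky_transfer` are stand-ins — in real code you would catch your driver's exception for SQLSTATE 40001 (e.g., psycopg's SerializationFailure) and re-run the whole transaction function:

```python
import random
import time


class SerializationFailure(Exception):
    """Stand-in for SQLSTATE 40001 (serialization_failure)."""


def with_retry(txn_fn, max_attempts=5, base_delay=0.01):
    """Run txn_fn; on serialization failure, re-execute the ENTIRE
    transaction with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts - 1:
                raise                      # retries exhausted — surface it
            time.sleep(base_delay * (2 ** attempt) * random.random())


# Demo: a "transaction" that conflicts twice before succeeding
attempts = {"n": 0}

def flaky_transfer():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

result = with_retry(flaky_transfer)
```

The jitter matters: if every aborted transaction retries on the same schedule, they collide again; randomized backoff spreads them out.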
Long-running transactions blocking VACUUM
Keeping a transaction open for minutes/hours (e.g., during a long report generation) while other transactions create dead tuples.
✅ VACUUM cannot clean dead tuples created after the oldest open transaction's snapshot. Use READ COMMITTED for long reports (snapshot per statement), or run reports on a read replica. Keep transactions on the primary short.
Using SELECT FOR UPDATE when optimistic locking suffices
Locking rows pessimistically for operations with long think time (user editing a form for minutes).
✅ Pessimistic locks block other transactions for the entire duration. For long think times, use optimistic concurrency: read the row with its version, let the user edit, then UPDATE WHERE version = original_version. Retry on conflict. No locks held during think time.
Inconsistent lock ordering causing deadlocks
Transaction A locks row 1 then row 2. Transaction B locks row 2 then row 1. Deadlock.
✅ Always acquire locks in a consistent order — typically by primary key ascending. If you need to update multiple rows, sort them by ID first. PostgreSQL detects deadlocks automatically, but prevention is better than detection.