Indexes Deep Dive
The #1 performance lever in PostgreSQL. Getting indexes wrong is the most common cause of slow queries. Master every index type and when to use each.
Table of Contents
How Indexes Work
An index is a separate data structure that maintains pointers to rows in the heap table. Without an index, PostgreSQL must perform a sequential scan — reading every page of the table to find matching rows. With an index, it can jump directly to the relevant rows.
The Book Index
A table without indexes is like a book without an index — to find 'MVCC', you'd read every page. A B-tree index is like the alphabetical index at the back: look up 'MVCC', get page numbers, jump directly there. But the index itself takes space and must be updated every time the book changes. That's the fundamental trade-off: faster reads at the cost of slower writes and more storage.
| Concept | Description |
|---|---|
| Index Scan | Follow index to heap — random I/O, best for selective queries (< 5-10% of rows) |
| Sequential Scan | Read entire table — sequential I/O, optimal when reading large % of rows |
| Index-Only Scan | Answer query from index alone — no heap access (requires visibility map) |
| Bitmap Index Scan | Collect all matching TIDs from index, sort by physical location, then fetch heap |
| Write Amplification | Every INSERT/UPDATE/DELETE must update ALL indexes on the table |
When the Planner Ignores Your Index
The query planner uses cost-based optimization. If it estimates that a query will return more than ~5-10% of the table, it chooses a sequential scan — because sequential I/O is faster than random I/O for large result sets. An index on a boolean column with 50/50 distribution will never be used. Indexes help selective queries.
-- See index usage statistics SELECT relname AS table, indexrelname AS index, idx_scan AS times_used, idx_tup_read AS rows_read, idx_tup_fetch AS rows_fetched FROM pg_stat_user_indexes ORDER BY idx_scan DESC; -- Find unused indexes (candidates for removal) SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0 AND indexrelname NOT LIKE '%_pkey' ORDER BY pg_relation_size(indexrelid) DESC;
B-Tree Index
B-tree is the default index type in PostgreSQL. It maintains a balanced tree structure with O(log N) lookup for equality and range queries. It supports: =, <, <=,>, >=, BETWEEN,IN, and LIKE 'prefix%'.
| Supports | Does NOT Support |
|---|---|
| WHERE email = 'user@example.com' | WHERE email LIKE '%@gmail.com' (leading wildcard) |
| WHERE created_at > '2024-01-01' | WHERE lower(email) = '...' (without expression index) |
| WHERE id IN (1, 2, 3) | Full-text search (use GIN + tsvector) |
| WHERE name LIKE 'John%' | Array containment (use GIN) |
| ORDER BY created_at DESC | Geometric/spatial queries (use GiST) |
Multi-Column B-tree: Column Order Matters
In a multi-column index (a, b, c), the index is sorted bya first, then b within each a value, then c. The leading column rule: the index can only be used if the query filters on the leading column(s).
-- Index on (user_id, created_at, status) CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC, status); -- ✅ Uses index (filters on leading column) SELECT * FROM orders WHERE user_id = 42; SELECT * FROM orders WHERE user_id = 42 AND created_at > '2024-01-01'; SELECT * FROM orders WHERE user_id = 42 AND created_at > '2024-01-01' AND status = 'pending'; -- ❌ Cannot use this index (skips leading column) SELECT * FROM orders WHERE created_at > '2024-01-01'; SELECT * FROM orders WHERE status = 'pending'; -- ✅ Index-only scan if all needed columns are in the index SELECT user_id, created_at, status FROM orders WHERE user_id = 42;
Index-Only Scans
If all columns needed by the query exist in the index, PostgreSQL can answer the query without touching the heap table at all. This requires the visibility map to confirm all tuples on the page are visible (recently VACUUMed). Index-only scans are the fastest possible access path.
Hash Index
Hash indexes only support equality (=) lookups. They're faster than B-tree for pure equality on large values (like UUIDs or long strings) because they hash the value to a fixed-size bucket rather than comparing the full value.
| Aspect | B-tree | Hash |
|---|---|---|
| Equality (=) | ✅ O(log N) | ✅ O(1) amortized |
| Range (<, >, BETWEEN) | ✅ Supported | ❌ Not supported |
| ORDER BY | ✅ Sorted output | ❌ No ordering |
| Multi-column | ✅ Supported | ❌ Single column only |
| Size | Larger (stores full values) | Smaller (stores hashes) |
| WAL-logged | ✅ Always | ✅ Since PG 10 |
| Best for | General purpose | Equality on large values (UUID, long text) |
-- Hash index for pure equality lookups on UUID CREATE INDEX idx_sessions_token ON sessions USING HASH (session_token); -- Only useful for: SELECT * FROM sessions WHERE session_token = 'abc123...'; -- NOT useful for: -- SELECT * FROM sessions WHERE session_token > 'abc'; -- won't use hash
When to Use Hash Over B-tree
In practice, B-tree is almost always sufficient. Consider hash only when: (1) you exclusively do equality lookups, (2) the indexed column has large values (reducing index size matters), and (3) you've benchmarked and confirmed the hash index is faster for your workload. Most teams never need hash indexes.
GIN Index (Generalized Inverted Index)
GIN indexes are designed for types that contain multiple values — arrays, jsonb, and tsvector (full-text search). They build an inverted index: for each element/key, they store a list of rows containing that element.
GIN Index Use Cases
- ✅Full-text search: GIN on tsvector columns for @@ queries
- ✅jsonb containment: GIN on jsonb for @>, ?, ?|, ?& operators
- ✅Array operations: GIN on array columns for @>, &&, = ANY()
- ✅Trigram search: GIN with pg_trgm for LIKE '%substring%' queries
-- Full-text search ALTER TABLE articles ADD COLUMN search_vector tsvector GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body)) STORED; CREATE INDEX idx_articles_search ON articles USING GIN (search_vector); SELECT * FROM articles WHERE search_vector @@ to_tsquery('postgresql & mvcc'); -- jsonb containment CREATE INDEX idx_events_payload ON events USING GIN (payload); SELECT * FROM events WHERE payload @> '{"type": "purchase", "amount_gt": 100}'; -- Array containment CREATE INDEX idx_articles_tags ON articles USING GIN (tags); SELECT * FROM articles WHERE tags @> ARRAY['postgresql', 'performance']; -- Trigram (pg_trgm) — enables LIKE '%substring%' with index CREATE EXTENSION IF NOT EXISTS pg_trgm; CREATE INDEX idx_users_name_trgm ON users USING GIN (name gin_trgm_ops); SELECT * FROM users WHERE name LIKE '%john%'; -- uses the GIN index!
GIN Write Overhead
GIN indexes are slower to build and update than B-tree. Every INSERT into an array or jsonb column may add multiple entries to the GIN index. Use gin_pending_list_limit to batch updates (fast-update mode). GIN is optimized for read-heavy workloads with infrequent writes.
GiST Index (Generalized Search Tree)
GiST is a framework for building balanced tree indexes on complex data types. It supports operations that B-tree cannot: containment, overlap, nearest-neighbor search. It's the index behind PostGIS, range types, and full-text search (alternative to GIN).
| Use Case | GiST Enables |
|---|---|
| PostGIS spatial queries | ST_Within, ST_DWithin, ST_Intersects, <-> (nearest neighbor) |
| Range type exclusion | EXCLUDE USING GIST (range WITH &&) — prevent overlaps |
| Full-text search | Alternative to GIN — smaller, faster updates, slower reads |
| Geometric types | Contains, overlaps, distance for point/box/polygon |
| inet/cidr | Subnet containment queries (<<, >>) |
-- PostGIS: find restaurants within 5km CREATE INDEX idx_restaurants_location ON restaurants USING GIST (location); SELECT name, ST_Distance(location, ST_MakePoint(-73.99, 40.73)::geography) AS dist FROM restaurants WHERE ST_DWithin(location, ST_MakePoint(-73.99, 40.73)::geography, 5000) ORDER BY location <-> ST_MakePoint(-73.99, 40.73)::geography; -- Range exclusion constraint (prevent overlapping bookings) CREATE TABLE bookings ( id serial PRIMARY KEY, room_id int NOT NULL, during tstzrange NOT NULL, EXCLUDE USING GIST (room_id WITH =, during WITH &&) ); -- inet containment CREATE INDEX idx_logs_ip ON access_logs USING GIST (client_ip inet_ops); SELECT * FROM access_logs WHERE client_ip << '192.168.1.0/24';
GiST vs GIN for Full-Text Search
GIN is like a complete phone book — every word maps to every document containing it. Fast to look up, expensive to maintain. GiST is like a hierarchical map — regions contain sub-regions. Faster to update, but lookups may need to check multiple branches (lossy). Use GIN for read-heavy FTS, GiST when you need fast writes or combine FTS with other GiST-able operations (like spatial).
BRIN Index (Block Range Index)
BRIN stores min/max values per block range (default 128 pages). It's extremely small — an entire BRIN index might be a few KB for a table with billions of rows. But it only works when data has natural correlation with physical storage order.
The Calendar Analogy
Imagine a filing cabinet where documents are stored in chronological order. A BRIN index is like a label on each drawer: 'January 1-15', 'January 16-31'. If you want documents from January 10, you only open the first drawer. But if documents were filed randomly, the labels would say 'January 1 - December 31' on every drawer — useless. BRIN only works when data is physically ordered by the indexed column.
| Aspect | B-tree | BRIN |
|---|---|---|
| Size | Large (entry per row) | Tiny (entry per block range) |
| Works on random data | ✅ Yes | ❌ No — needs physical correlation |
| Precision | Exact — points to specific rows | Lossy — points to block ranges, rechecks needed |
| Best for | General queries | Append-only tables with time-ordered data |
| Maintenance | Updated on every write | Minimal — only when new ranges created |
-- Perfect use case: append-only events table -- Data is inserted chronologically, so created_at correlates with physical order CREATE INDEX idx_events_created_brin ON events USING BRIN (created_at) WITH (pages_per_range = 128); -- Index size comparison on 100M row table: -- B-tree on created_at: ~2 GB -- BRIN on created_at: ~100 KB -- Query uses BRIN to skip irrelevant block ranges SELECT * FROM events WHERE created_at BETWEEN '2024-03-01' AND '2024-03-02'; -- Check correlation (1.0 = perfect physical ordering) SELECT attname, correlation FROM pg_stats WHERE tablename = 'events' AND attname = 'created_at'; -- correlation close to 1.0 or -1.0 → BRIN will work well
BRIN Requires Physical Correlation
Check pg_stats.correlation before creating a BRIN index. If correlation is close to 0, the data is randomly distributed and BRIN will scan most of the table anyway. BRIN is ideal for append-only tables (logs, events, time-series) where new rows always have the latest timestamp.
Partial & Expression Indexes
Partial Indexes
A partial index only includes rows matching a WHERE clause. It's smaller, faster to maintain, and faster to scan — because it only indexes the rows you actually query.
-- Only index active orders (90% of queries filter on status = 'pending') CREATE INDEX idx_orders_pending ON orders(user_id, created_at) WHERE status = 'pending'; -- Only index non-deleted records (soft delete pattern) CREATE INDEX idx_users_active ON users(email) WHERE deleted_at IS NULL; -- Partial unique: email must be unique among active users only CREATE UNIQUE INDEX idx_users_email_active ON users(email) WHERE deleted_at IS NULL; -- Allows: same email in deleted records (re-registration) -- Query MUST include the partial index condition to use it: SELECT * FROM orders WHERE user_id = 42 AND status = 'pending'; -- ✅ uses index SELECT * FROM orders WHERE user_id = 42 AND status = 'shipped'; -- ❌ cannot use index
Expression Indexes (Functional)
Index on the result of an expression or function. The query must use the exact same expression to benefit from the index.
-- Case-insensitive email lookup CREATE INDEX idx_users_email_lower ON users(lower(email)); -- Query must use lower(): SELECT * FROM users WHERE lower(email) = 'user@example.com'; -- ✅ uses index SELECT * FROM users WHERE email = 'user@example.com'; -- ❌ different expression -- Index on extracted JSON field CREATE INDEX idx_events_type ON events((payload->>'type')); SELECT * FROM events WHERE payload->>'type' = 'purchase'; -- Index on date part of timestamp CREATE INDEX idx_orders_date ON orders(date(created_at)); SELECT * FROM orders WHERE date(created_at) = '2024-03-15';
Covering Indexes & Unique Indexes
Covering Indexes (INCLUDE)
A covering index stores additional columns in the leaf pages without making them part of the search key. This enables index-only scans for queries that need those columns without the overhead of indexing them.
-- Without INCLUDE: must visit heap to get 'status' and 'total' CREATE INDEX idx_orders_user ON orders(user_id); SELECT user_id, status, total FROM orders WHERE user_id = 42; -- Index scan + heap fetch for status and total -- With INCLUDE: index-only scan — no heap access needed CREATE INDEX idx_orders_user_covering ON orders(user_id) INCLUDE (status, total); SELECT user_id, status, total FROM orders WHERE user_id = 42; -- Index-only scan — all data in the index -- INCLUDE columns are NOT usable for search predicates: SELECT * FROM orders WHERE status = 'pending'; -- ❌ cannot use INCLUDE column for filtering
Unique Indexes
-- Basic unique index CREATE UNIQUE INDEX idx_users_email ON users(email); -- Multi-column unique (combination must be unique) CREATE UNIQUE INDEX idx_subscriptions_user_plan ON subscriptions(user_id, plan_id); -- Partial unique: unique only among non-deleted records CREATE UNIQUE INDEX idx_users_email_active ON users(email) WHERE deleted_at IS NULL; -- Allows multiple deleted users with same email, but only one active
Index Maintenance
-- Build index without blocking writes (takes longer but no downtime) CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status); -- Rebuild a bloated index without lock REINDEX INDEX CONCURRENTLY idx_orders_status; -- Check index bloat SELECT schemaname, tablename, indexname, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size FROM pg_stat_user_indexes ORDER BY pg_relation_size(indexrelid) DESC LIMIT 10;
Interview Questions
Q:When would the planner choose a sequential scan over an index scan?
A: When the query returns a large percentage of the table (typically > 5-10%). Sequential I/O is faster than random I/O — reading the entire table sequentially is cheaper than bouncing around the disk following index pointers. Also when the table is very small (fits in a few pages), or when there's no suitable index for the query predicate.
Q:What's the difference between GIN and GiST for full-text search?
A: GIN builds a complete inverted index — every word maps to every document. Lookups are fast (exact match), but writes are expensive (must update many entries). GiST uses a lossy tree structure — faster to update, but lookups may produce false positives that require rechecking. Use GIN for read-heavy FTS workloads, GiST when write performance matters more.
Q:How would you enable case-insensitive search with an index?
A: Create an expression index on lower(column): CREATE INDEX idx_email_lower ON users(lower(email)). Then query with the same expression: WHERE lower(email) = lower('User@Example.com'). The query MUST use the exact same expression as the index — WHERE email ILIKE '...' won't use a lower() index. Alternatively, use citext extension for a case-insensitive text type.
Q:What is a partial index and when would you use one?
A: A partial index only includes rows matching a WHERE clause. Use cases: (1) Index only 'pending' orders when 95% of queries filter on that status — smaller index, faster lookups. (2) Unique constraint only on non-deleted records: CREATE UNIQUE INDEX ON users(email) WHERE deleted_at IS NULL. (3) Index on a boolean column where you only query one value: WHERE is_active = true.
Q:Why is BRIN so small and when does it fail?
A: BRIN stores only min/max per block range (128 pages by default), not per row. A 100M row table might have a 100 KB BRIN index vs a 2 GB B-tree. It fails when data lacks physical correlation with the indexed column — if timestamps are randomly distributed across pages, every block range spans the full time range, and BRIN can't eliminate any blocks. Check pg_stats.correlation before using BRIN.
Common Mistakes
Indexing every column
Creating an index on every column 'just in case' — thinking more indexes = faster queries.
✅Every index slows down writes (INSERT/UPDATE/DELETE must maintain all indexes). Only index columns that appear in WHERE, JOIN, and ORDER BY clauses of actual queries. Monitor pg_stat_user_indexes to find and remove unused indexes.
Wrong column order in multi-column index
CREATE INDEX ON orders(status, user_id) when most queries filter on user_id alone.
✅Put the most selective and most frequently filtered column first. An index on (user_id, status) supports queries on user_id alone, but an index on (status, user_id) cannot help queries that only filter on user_id (leading column rule).
Using LIKE '%term%' without pg_trgm
Expecting a B-tree index to help with LIKE '%substring%' queries.
✅B-tree only supports prefix matching (LIKE 'prefix%'). For substring search, install pg_trgm and create a GIN index: CREATE INDEX idx_name_trgm ON users USING GIN (name gin_trgm_ops). This enables indexed LIKE '%substring%' queries.
Not using CREATE INDEX CONCURRENTLY
Running CREATE INDEX on a production table — blocks all writes until the index is built.
✅Always use CREATE INDEX CONCURRENTLY in production. It takes longer (two table scans) but doesn't block writes. The only downside: it can't run inside a transaction and may fail if there are concurrent schema changes.
Ignoring index bloat
Never checking index sizes or rebuilding indexes after major data changes.
✅After bulk DELETEs or heavy UPDATE workloads, indexes accumulate dead entries. Monitor index sizes relative to table sizes. Use REINDEX CONCURRENTLY to rebuild bloated indexes without downtime.