Single-TableKey OverloadingAdjacency ListHierarchicalComposite KeysAccess Patterns

Primary Key & Single-Table Design

The most important skill in DynamoDB. Every other decision — indexes, capacity, performance — flows from your primary key design.

50 min read9 sections

Access Pattern Analysis

In relational databases, you model entities first and figure out queries later. In DynamoDB, you do the opposite: list every query your application needs first, then design your table to serve them.

🔑 The Required First Step

Before creating a DynamoDB table, write down every access pattern. For each: what are the inputs? What is returned? How frequently is it called? If a pattern cannot be served efficiently by your key design, you must either add a GSI or reconsider the model.

Example: E-Commerce Application

access-patterns.txttext

Access Pattern Analysis:
─────────────────────────────────────────────────────────────
Pattern                          | Input        | Frequency
─────────────────────────────────────────────────────────────
Get user profile                 | userId       | High
Get all orders for a user        | userId       | High
Get specific order               | orderId      | Medium
Get orders by date range         | userId+dates | Medium
Get all items in an order        | orderId      | Medium
Get user's recent 10 orders      | userId       | High
Find orders by status (pending)  | status       | Low (admin)
─────────────────────────────────────────────────────────────

Key Design Decision:
  PK = USER#<userId>
  SK = USER#<userId>         → user profile
  SK = ORDER#<timestamp>#<orderId> → orders (sorted by time)
  
  GSI1: PK = ORDER#<orderId>, SK = ITEM#<itemId> → order items

📐

The Architect's Blueprint

You wouldn't build a house and then decide where the plumbing goes. In DynamoDB, your access patterns are the blueprint. The table design is the house built to serve those patterns. Changing the blueprint after construction means demolishing and rebuilding — that's why access pattern analysis comes first, always.

Choosing a Partition Key

The partition key determines how data is distributed across physical partitions. A good partition key has three properties: high cardinality, even access distribution, and natural alignment with your access patterns.

Partition Key	Cardinality	Distribution	Verdict
userId	High (millions)	Even (each user ~equal traffic)	✅ Excellent
orderId	High (millions)	Even (each order accessed once)	✅ Excellent
deviceId	High (thousands+)	Even (IoT sensors)	✅ Good
status	Low (3-5 values)	Skewed (most items are 'active')	❌ Terrible
date	Medium (365/year)	Skewed (today is hot)	❌ Bad
country	Low (200)	Skewed (US gets 80% traffic)	❌ Bad
boolean (isActive)	2 values	Extremely skewed	❌ Never use

Good Partition Key Properties

✅High cardinality — many distinct values distributes load evenly
✅Even access distribution — not all requests going to same value
✅Natural to your access patterns — not forced or artificial
✅Stable — doesn't change over time for the same entity
✅Available at query time — you must know the PK to Query

Choosing a Sort Key

The sort key enables range queries within a partition. Items sharing a partition key are stored together, sorted by sort key value. This is where DynamoDB's query power lives.

Sort Key Conditions Available in Query

Operator	Use Case	Example
=	Exact match	SK = 'PROFILE'
<, <=, >, >=	Range queries	SK > '2024-01-01'
BETWEEN	Date/time ranges	SK BETWEEN '2024-01' AND '2024-06'
begins_with	Prefix matching (hierarchical)	SK begins_with 'ORDER#'

Composite Sort Keys for Hierarchical Queries

sort-key-hierarchy.txttext

Sort Key: COUNTRY#US#STATE#CA#CITY#SF

Query all US:     SK begins_with "COUNTRY#US"
Query California: SK begins_with "COUNTRY#US#STATE#CA"
Query SF:         SK begins_with "COUNTRY#US#STATE#CA#CITY#SF"

One sort key design serves multiple levels of hierarchy.

Sort Key as Version

Use sort key for versioning: SK = v0000001, v0000002, etc. Query all versions with begins_with, get latest with ScanIndexForward=false and Limit=1. Or maintain a separate item with SK=LATEST that duplicates the current version.

Single-Table Design Philosophy

Single-table design stores multiple entity types in one table. Users, orders, products, and reviews all live together, differentiated by key prefixes and an entity type attribute.

Why Single-Table?

DynamoDB has no joins. If you need data from two tables, that means two network round trips. Single-table design co-locates related items in the same partition, enabling a single Query to fetch a user AND their orders in one request.

single-table.txttext

Single-Table Design Example:
═══════════════════════════════════════════════════════════════
PK              | SK                    | Type    | Data
═══════════════════════════════════════════════════════════════
USER#123        | USER#123              | User    | {name, email, ...}
USER#123        | ORDER#2024-01-15#abc  | Order   | {total, status, ...}
USER#123        | ORDER#2024-01-20#def  | Order   | {total, status, ...}
USER#123        | FOLLOWER#456          | Follow  | {followedAt, ...}
USER#456        | USER#456              | User    | {name, email, ...}
USER#456        | FOLLOWED_BY#123       | Follow  | {followedAt, ...}
ORDER#abc       | ITEM#001              | Item    | {product, qty, ...}
ORDER#abc       | ITEM#002              | Item    | {product, qty, ...}
═══════════════════════════════════════════════════════════════

Queries enabled:
  • Get user profile:     PK = USER#123, SK = USER#123
  • Get user's orders:    PK = USER#123, SK begins_with "ORDER#"
  • Get recent orders:    PK = USER#123, SK begins_with "ORDER#", 
                          ScanIndexForward=false, Limit=5
  • Get order items:      PK = ORDER#abc, SK begins_with "ITEM#"
  • Get user's followers: PK = USER#123, SK begins_with "FOLLOWER#"

Item Collections

All items sharing a partition key form an item collection. They are physically co-located on the same partition, which means querying them is fast — a single sequential read from disk. This is the performance advantage of single-table design.

Key Overloading & Naming

In single-table design, the same key attributes hold different types of values depending on the item type. This is called key overloading.

Approach	Key Names	Pros	Cons
Generic	PK, SK	Flexible for single-table	Less readable in console
Semantic	userId, createdAt	Clear intent	Less flexible for multi-entity
Prefixed values	USER#123, ORDER#abc	Namespace items, enable begins_with	Slightly more storage

key-overloading.txttext

Key Overloading Example:
Same PK attribute holds different entity references:

PK = USER#123,  SK = USER#123        → User profile
PK = USER#123,  SK = ORDER#timestamp → User's order
PK = ORDER#abc, SK = ORDER#abc       → Order details
PK = ORDER#abc, SK = ITEM#001        → Order line item

The PK and SK columns are generic — their meaning depends on
the item type. An "entityType" attribute disambiguates.

Why Prefixes?

Prefixes like USER#, ORDER#, ITEM# serve two purposes: (1) they namespace items so different entity types don't collide, and (2) they enable begins_with queries to fetch only items of a specific type within a partition.

Advanced Key Patterns

Adjacency List Pattern (Many-to-Many)

adjacency-list.txttext

Modeling many-to-many relationships:

PK = USER#123, SK = USER#123         → user item
PK = USER#123, SK = FOLLOWS#456      → user 123 follows user 456
PK = USER#456, SK = FOLLOWED_BY#123  → inverted relationship

Query who user follows:    PK = USER#123, SK begins_with "FOLLOWS#"
Query followers of user:   PK = USER#456, SK begins_with "FOLLOWED_BY#"

Both directions queryable without a join.

Hierarchical Data Pattern

hierarchical-data.txttext

Parent-child relationships in one partition:

PK = ORG#acme, SK = ORG#acme                        → org item
PK = ORG#acme, SK = DEPT#engineering                 → department
PK = ORG#acme, SK = DEPT#engineering#USER#123        → user in dept

Query entire org:           PK = ORG#acme
Query specific department:  PK = ORG#acme, SK begins_with "DEPT#engineering"
Query users in department:  PK = ORG#acme, SK begins_with "DEPT#engineering#USER#"

Time-Series Pattern

time-series.txttext

Append-only writes, queries by time range:

PK = DEVICE#sensor1, SK = 2024-01-15T10:30:00Z
PK = DEVICE#sensor1, SK = 2024-01-15T10:31:00Z

Query last hour: PK = DEVICE#sensor1 
                 AND SK BETWEEN '2024-01-15T09:30' AND '2024-01-15T10:30'

⚠️ Hot partition risk: all writes go to current time.
Solution: shard by time bucket or add random suffix.

Optimistic Locking Pattern

optimistic-locking.tstypescript

// Add version attribute to items
// UpdateItem condition: version must match expected
const params = {
  TableName: "Products",
  Key: { PK: "PRODUCT#123", SK: "PRODUCT#123" },
  UpdateExpression: "SET #name = :name, #version = #version + :one",
  ConditionExpression: "#version = :expectedVersion",
  ExpressionAttributeNames: { "#name": "name", "#version": "version" },
  ExpressionAttributeValues: { 
    ":name": "Updated Name",
    ":one": 1,
    ":expectedVersion": 5  // current known version
  }
};
// ConditionalCheckFailedException → retry with fresh read

When Single-Table is Wrong

Single-table design is powerful but not always appropriate. It adds complexity that may not be justified for every use case.

Factor	Single-Table	Multi-Table
Team experience	Requires DynamoDB expertise	Easier to understand
Access patterns	Must be known upfront	More flexible to evolve
Query complexity	One request for related data	Multiple requests needed
Analytics/reporting	Very difficult	Easier with separate tables
Migration	All-or-nothing schema changes	Independent table evolution
Debugging	Complex — items look different	Clear — each table is one entity

The Honest Assessment

Single-table design is the "expert mode" of DynamoDB. If your team is new to DynamoDB, start with one table per entity type. You can always consolidate later once access patterns stabilize. A poorly designed single table is worse than multiple well-designed tables.

Interview Questions

Q:What is single-table design and why would you use it?

A: Storing multiple entity types (users, orders, products) in one DynamoDB table using generic key names (PK, SK) with prefixed values. The benefit: related items share a partition key, enabling a single Query to fetch multiple entity types without joins (which DynamoDB doesn't support). Trade-off: increased complexity and harder ad-hoc querying.

Q:How do you model a many-to-many relationship in DynamoDB?

A: Use the adjacency list pattern. Store both directions of the relationship as separate items. PK=USER#123, SK=FOLLOWS#456 and PK=USER#456, SK=FOLLOWED_BY#123. Each direction is independently queryable with begins_with. For the reverse lookup without duplicating data, use a GSI with inverted PK/SK.

Q:What makes a good partition key?

A: Three properties: (1) High cardinality — many distinct values. (2) Even access distribution — no single value gets disproportionate traffic. (3) Natural to access patterns — you must know the PK to query. Examples: userId, orderId, deviceId. Anti-examples: status, date, country.

Q:How would you handle a hot partition key?

A: Write sharding: append a random suffix (1-N) to the partition key to spread writes across N partitions. Reads require scatter-gather across all N shards. Calculated sharding: suffix = hash(userId) % N for deterministic routing. Also consider: caching hot reads with DAX, or switching to on-demand capacity mode.

Q:What is the difference between begins_with and contains in DynamoDB?

A: begins_with is a key condition — it can be used in KeyConditionExpression and leverages the sort key index for efficient queries. contains is a filter expression — it's applied AFTER data is read and does NOT reduce RCU consumption. Always prefer begins_with on sort keys over contains in filter expressions.

Common Mistakes

📋

Designing keys without listing access patterns first

Creating a table with 'obvious' keys (id as PK) then discovering you can't serve your queries. Always list every access pattern before designing keys.

✅Write down every query with inputs and outputs before creating the table. The table structure IS the query plan.

🧩

Using single-table design without understanding it

Copying single-table patterns from blog posts without understanding why. If you can't explain why items are co-located and what queries benefit, you're adding complexity without value.

✅Start simple with one table per entity. Consolidate into single-table only when access patterns are clear and co-location provides measurable benefit.

🔤

Forgetting that sort key comparisons are lexicographic

Storing numbers as sort keys without zero-padding. '9' sorts after '10' lexicographically.

✅Use zero-padded strings (v0000001) or ISO 8601 timestamps (2024-01-15T10:30:00Z) which sort correctly as strings.

📈

Unbounded item collections

Designing a partition key where items accumulate without limit (all events for a device, all messages in a chat). Eventually hits the 10 GB partition limit.

✅Use time-bucketed partition keys: DEVICE#sensor1#2024-01. Rotate partitions by time period.

💸

Not considering write amplification from GSIs

Every write to the base table also writes to each GSI that includes the item. 3 GSIs means 4× write cost.

✅Factor GSI write amplification into capacity planning. Minimize GSI count and use KEYS_ONLY projection where possible.