Best Practices
Master production-ready API design — versioning strategies, idempotency, pagination patterns, error handling conventions, and API contracts with OpenAPI.
The Big Picture — Why Best Practices Matter
Best practices aren't academic rules — they're battle scars from production systems. Every practice in this guide exists because someone shipped an API without it, and something broke in a painful, expensive way.
The Banking System Analogy
Imagine you're building a public banking service:
- Versioning is like policy updates: you can't change the rules overnight and break every customer's workflow. You announce changes, support old policies for a transition period, and phase them out.
- Idempotency is like duplicate payment protection: if a customer accidentally submits a transfer twice, the system must process it only once.
- Pagination is like the queue system: you don't dump 10,000 customers into the lobby at once. You serve them in manageable batches.
- Error handling is like clear communication: when something goes wrong, you don't say 'Error.' You say 'Your account number is invalid. Expected format: XXXX-XXXX.'
- API contracts are like legal agreements: both the bank and the customer agree on the terms before any transaction happens.
🔥 Key Insight
These practices aren't optional for production systems. Skip versioning and you'll break mobile apps. Skip idempotency and you'll charge customers twice. Skip pagination and you'll crash under load. Skip error handling and your team will spend hours debugging. Skip contracts and frontend and backend will constantly disagree.
Overview — What Can Go Wrong
Every best practice prevents a specific category of failure. Here's the map of what goes wrong without them:
- Versioning: prevents breaking clients
- Idempotency: prevents duplicate actions
- Pagination: prevents performance collapse
- Errors: prevents debugging nightmares
- Contracts: prevents team misalignment
💥 Without Best Practices
- API change breaks 50,000 mobile users on old app version
- Customer charged $500 twice because of a retry
- GET /products returns 2M rows, server OOMs
- Error response: `{"error": true}`, which is useless
- Frontend builds against wrong response shape for 2 weeks
✅ With Best Practices
- Old clients keep working; new clients use v2
- Idempotency key deduplicates the retry automatically
- Cursor pagination returns 50 items at a time, efficiently
- Error: code, message, field, suggestion — instantly debuggable
- OpenAPI spec is the single source of truth for both teams
Versioning Strategies
APIs evolve. Fields get renamed, endpoints change, response shapes shift. Without versioning, every change risks breaking every client that depends on your API — mobile apps that can't force-update, third-party integrations you don't control, and internal services on different release cycles.
The Mobile App Problem
You ship a mobile app. 500,000 users download v1. You change the API response shape. Now every user on v1 sees a broken app — and they can't update until they visit the app store. You need both the old and new API to work simultaneously. That's versioning.
Versioning Approaches
| Approach | Example | Pros | Cons |
|---|---|---|---|
| URL Path | /api/v1/users /api/v2/users | Explicit, easy to understand, easy to route | URL pollution, hard to share code between versions |
| Header | Accept: application/vnd.api+json;version=2 | Clean URLs, version is metadata not resource identity | Hidden — not visible in browser, harder to test |
| Query Param | /api/users?version=2 | Easy to add, visible in URL | Pollutes query string, caching complications |
    v1 (current, used by 500K mobile users):
      GET /api/v1/users/42
      Response: { "id": 42, "name": "Alice Smith", "email": "alice@example.com" }

    v2 (new, splits name into first/last):
      GET /api/v2/users/42
      Response: { "id": 42, "firstName": "Alice", "lastName": "Smith", "email": "alice@example.com" }

    Both versions run simultaneously. v1 clients keep working. New clients use v2.
    After 6 months, deprecate v1 with a Sunset header.
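One way to keep both versions alive is to store the data once and give each version its own serializer. The sketch below assumes a hypothetical in-memory record and a hypothetical `get_user` dispatcher keyed on the URL prefix; it also assumes the v1 `name` field is the first and last name joined with a space. It is an illustration of the idea, not a full router.

```python
# Hypothetical stored record; a real service would load this from a database.
USER_RECORD = {
    "id": 42,
    "first_name": "Alice",
    "last_name": "Smith",
    "email": "alice@example.com",
}


def serialize_user_v1(user: dict) -> dict:
    """v1 shape: one combined name field (assumed to be 'first last')."""
    return {
        "id": user["id"],
        "name": f'{user["first_name"]} {user["last_name"]}',
        "email": user["email"],
    }


def serialize_user_v2(user: dict) -> dict:
    """v2 shape: name split into firstName / lastName."""
    return {
        "id": user["id"],
        "firstName": user["first_name"],
        "lastName": user["last_name"],
        "email": user["email"],
    }


# One serializer per supported version; removing a version removes one entry.
SERIALIZERS = {"v1": serialize_user_v1, "v2": serialize_user_v2}


def get_user(path: str) -> tuple[int, dict]:
    """Dispatch /api/<version>/users/<id> to the matching serializer."""
    parts = path.strip("/").split("/")
    not_found = (404, {"error": {"code": "NOT_FOUND", "message": "Unknown route or user"}})
    if len(parts) != 4 or parts[0] != "api" or parts[2] != "users":
        return not_found
    serializer = SERIALIZERS.get(parts[1])
    if serializer is None or parts[3] != str(USER_RECORD["id"]):
        return not_found
    return 200, serializer(USER_RECORD)


print(get_user("/api/v1/users/42"))
print(get_user("/api/v2/users/42"))
```

Because both serializers read the same record, the business logic is shared and only the response shape differs per version.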
Deprecation Strategy
Announce Deprecation
Add a Sunset header to v1 responses: 'Sunset: Sat, 01 Mar 2025 00:00:00 GMT'. Add a Deprecation header. Update documentation. Notify consumers via email/changelog.
Monitor Usage
Track how many requests still hit v1. If 30% of traffic is still on v1, you can't kill it yet. Set a threshold (e.g., < 1% of traffic) before removal.
Remove Old Version
Once usage drops below threshold, return 410 Gone for v1 endpoints with a message pointing to v2. Don't just 404 — tell them what happened and where to go.
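The three steps above can be sketched in a few helper functions. The function names, the announcement date, and the `VERSION_REMOVED` error code are assumptions made for illustration. The `Deprecation` value uses the `@<unix-seconds>` form as I understand RFC 9745; check the spec before relying on it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Sunset date taken from the example above; the announcement date is hypothetical.
SUNSET_AT = datetime(2025, 3, 1, tzinfo=timezone.utc)
DEPRECATED_AT = datetime(2024, 9, 1, tzinfo=timezone.utc)


def v1_deprecation_headers() -> dict:
    """Step 1: headers to attach to every v1 response while it is still served."""
    return {
        # Sunset is an HTTP-date (RFC 8594).
        "Sunset": format_datetime(SUNSET_AT, usegmt=True),
        # Deprecation as a structured-field date: "@" + Unix seconds (assumed format).
        "Deprecation": f"@{int(DEPRECATED_AT.timestamp())}",
    }


def safe_to_remove(v1_requests: int, total_requests: int, threshold: float = 0.01) -> bool:
    """Step 2: only remove v1 once its share of traffic is below the threshold."""
    if total_requests == 0:
        return False  # no traffic data, so make no decision
    return v1_requests / total_requests < threshold


def v1_gone_response() -> tuple[int, dict]:
    """Step 3: after removal, answer 410 Gone and point callers at v2."""
    return 410, {
        "error": {
            "code": "VERSION_REMOVED",
            "message": "API v1 has been removed. Use /api/v2 instead.",
        }
    }


print(v1_deprecation_headers()["Sunset"])
print(safe_to_remove(v1_requests=30, total_requests=100))
```

Keeping the threshold as a parameter makes the removal decision explicit and reviewable instead of a judgment call made during a deploy.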
🎯 Interview Insight
URL versioning is the safest default. It's explicit, easy to route at the load balancer level, and every developer understands it. Header versioning is cleaner but harder to discover and test. In an interview, pick URL versioning and explain the deprecation lifecycle.
Idempotency
An operation is idempotent if performing it multiple times produces the same result as performing it once. This is one of the most critical concepts in distributed systems — because networks are unreliable, and retries are inevitable.
The Double-Charge Problem
A customer clicks 'Pay $200'. The request reaches the server, the payment is processed, but the response is lost due to a network timeout. The client retries. Without idempotency, the customer is charged $400. With idempotency, the server recognizes the retry (via an idempotency key), returns the original result, and the customer is charged $200 once. This isn't a theoretical problem — it happens thousands of times per day at scale.
HTTP Methods & Idempotency
| Method | Idempotent? | Safe? | Explanation |
|---|---|---|---|
| GET | ✅ Yes | ✅ Yes | Reading data never changes state |
| PUT | ✅ Yes | ❌ No | Replacing a resource with the same data = same result |
| DELETE | ✅ Yes | ❌ No | Deleting an already-deleted resource = still deleted |
| PATCH | ⚠️ Depends | ❌ No | 'Set status=active' is idempotent. 'Increment count' is not |
| POST | ❌ No | ❌ No | Creating a resource twice = two resources |
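The PATCH row is the subtle one, so here is a tiny illustration of the difference. Both functions are hypothetical stand-ins for a PATCH handler: one sets a field to a fixed value, the other changes it relative to the current value.

```python
def set_status(resource: dict, status: str) -> dict:
    """'Set status=active': applying it again changes nothing."""
    return {**resource, "status": status}


def increment_count(resource: dict) -> dict:
    """'Increment count': every application changes the result."""
    return {**resource, "count": resource["count"] + 1}


start = {"status": "pending", "count": 0}

# Idempotent: once and twice give the same state.
once = set_status(start, "active")
twice = set_status(once, "active")
print(once == twice)

# Not idempotent: a retried request leaves a different state.
inc_once = increment_count(start)
inc_twice = increment_count(inc_once)
print(inc_once == inc_twice)
```

A retry of the first kind of PATCH is harmless; a retry of the second kind needs the same protection as POST.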
Idempotency Keys — The Solution for POST
Since POST is not naturally idempotent, we make it idempotent using an idempotency key — a unique identifier the client generates and sends with the request. The server uses this key to detect and deduplicate retries.
    First request:
      POST /api/payments
      Idempotency-Key: pay_abc123xyz
      Body: { "amount": 200, "currency": "USD", "to": "merchant_42" }

      Server:
        1. Check: has "pay_abc123xyz" been processed? → No
        2. Process payment → success
        3. Store: { key: "pay_abc123xyz", result: { id: "txn_789", status: "success" } }
        4. Return: 201 Created { "id": "txn_789", "status": "success" }

    Retry (same key):
      POST /api/payments
      Idempotency-Key: pay_abc123xyz
      Body: { "amount": 200, "currency": "USD", "to": "merchant_42" }

      Server:
        1. Check: has "pay_abc123xyz" been processed? → Yes
        2. Return stored result: 200 OK { "id": "txn_789", "status": "success" }
        3. Payment NOT processed again
Implementation Rules
- ✅ Client generates the key (a UUID v4 is a common choice)
- ✅ Server stores key → result mapping (Redis with TTL works well)
- ✅ Same key + same body = return cached result
- ✅ Same key + different body = return 422 Unprocessable Content (the key was reused for a different request)
- ✅ Keys expire after 24-48 hours (don't store forever)
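The rules above can be sketched with an in-memory store standing in for Redis. Everything here is an assumption for illustration: the `IdempotencyStore` class, the `create_payment` handler, the `IDEMPOTENCY_KEY_REUSED` error code, and the choice of a SHA-256 hash of the canonical JSON body to detect a changed payload.

```python
import hashlib
import json
import time
import uuid


class IdempotencyStore:
    """In-memory stand-in for Redis: key -> (expires_at, body_hash, result)."""

    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        if entry[0] <= self._clock():  # expired: behave as if never seen
            del self._entries[key]
            return None
        return entry

    def put(self, key: str, body_hash: str, result: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, body_hash, result)


def body_fingerprint(body: dict) -> str:
    """Hash a canonical JSON form so key order does not matter."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def create_payment(store: IdempotencyStore, key: str, body: dict, process):
    """Return (status, payload); `process` is the real side effect."""
    fingerprint = body_fingerprint(body)
    entry = store.get(key)
    if entry is not None:
        _, stored_fingerprint, result = entry
        if stored_fingerprint != fingerprint:
            return 422, {"error": {"code": "IDEMPOTENCY_KEY_REUSED",
                                   "message": "Key was already used with a different body"}}
        return 200, result  # retry: stored result, no reprocessing
    result = process(body)
    store.put(key, fingerprint, result)
    return 201, result


# Usage: the client generates the key once and reuses it on retry.
store = IdempotencyStore()
key = str(uuid.uuid4())
charge = lambda body: {"id": "txn_789", "status": "success"}
payment = {"amount": 200, "currency": "USD", "to": "merchant_42"}
print(create_payment(store, key, payment, charge))
print(create_payment(store, key, payment, charge))
```

This check-then-store sequence is not atomic, so two concurrent requests with the same key could both be processed. A production version would need an atomic "set if absent" step, which is one reason Redis is a common choice for this store.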
Where Idempotency Is Critical
- ✅ Payment processing (Stripe uses idempotency keys)
- ✅ Order creation (prevent duplicate orders)
- ✅ Email sending (prevent duplicate emails)
- ✅ Any operation with side effects that can't be undone
- ✅ Any operation behind an unreliable network
🎯 Interview Insight
Idempotency is a top-tier interview topic. When designing any system that handles money, orders, or irreversible actions, always mention idempotency keys. Explain the flow: client generates key → server checks if key exists → process or return cached result. Mention Stripe as a real-world example.
Pagination Patterns
Without pagination, a query like "get all products" returns every row in the database. With 2 million products, that's a response that crashes the server, saturates the network, and freezes the client. Pagination breaks large datasets into manageable pages.
Offset-Based Pagination
    Page 1: GET /api/products?limit=20&offset=0
    Page 2: GET /api/products?limit=20&offset=20
    Page 3: GET /api/products?limit=20&offset=40

    SQL: SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 40

    Problem at scale:
      OFFSET 1000000 → database scans and skips 1M rows before returning 20
      This gets slower and slower as offset increases

    Problem with mutations:
      User is on page 2 (offset=20). A new product is inserted at position 5.
      Page 3 (offset=40) now includes a product that was already on page 2.
      → Duplicate items in the feed
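The duplicate problem is easy to reproduce with an in-memory SQLite database. This sketch assumes a newest-first listing (`ORDER BY id DESC`), so that a newly inserted row lands ahead of the rows the user has already seen; the `products` table and `fetch_page` helper are made up for the demo.

```python
import sqlite3


def fetch_page(conn: sqlite3.Connection, limit: int, offset: int) -> list:
    """Offset pagination over a newest-first listing."""
    rows = conn.execute(
        "SELECT id FROM products ORDER BY id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product-{i}") for i in range(1, 61)],
)

# The user loads page 2 (offset=20) and sees ids 40 down to 21.
page2 = fetch_page(conn, limit=20, offset=20)

# Meanwhile a new product arrives at the top of the listing.
conn.execute("INSERT INTO products (id, name) VALUES (61, 'product-61')")

# Every row shifts down by one, so page 3 starts with the last row of page 2.
page3 = fetch_page(conn, limit=20, offset=40)

duplicates = set(page2) & set(page3)
print(duplicates)  # → {21}
```

The same shift in the other direction (a deletion) makes the user skip a row instead of seeing one twice.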
Cursor-Based Pagination
    Page 1: GET /api/products?limit=20
    Response:
      {
        "data": [...20 products...],
        "pagination": {
          "next_cursor": "eyJpZCI6NDJ9",   // base64-encoded {"id":42}
          "has_more": true
        }
      }

    Page 2: GET /api/products?limit=20&cursor=eyJpZCI6NDJ9
    SQL: SELECT * FROM products WHERE id > 42 ORDER BY id LIMIT 20

    Why this is better:
      → No OFFSET: database seeks directly to id > 42 (index scan)
      → Performance stays roughly constant regardless of page number
      → Insertions don't cause duplicates (cursor is a stable pointer)
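A minimal sketch of the cursor flow, using SQLite so it runs anywhere. The helper names (`encode_cursor`, `decode_cursor`, `list_products`) are hypothetical, and the sketch assumes positive integer ids sorted ascending. Fetching `limit + 1` rows is one common way to compute `has_more` without a second query.

```python
import base64
import json
import sqlite3
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Opaque cursor: base64 of compact JSON, e.g. {"id":42}."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    return int(json.loads(base64.urlsafe_b64decode(cursor))["id"])


def list_products(conn: sqlite3.Connection, limit: int = 20,
                  cursor: Optional[str] = None) -> dict:
    after_id = decode_cursor(cursor) if cursor else 0  # assumes ids start at 1
    # Ask for one extra row to learn whether another page exists.
    rows = conn.execute(
        "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit + 1),
    ).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor(rows[-1][0]) if has_more else None
    return {
        "data": [{"id": r[0], "name": r[1]} for r in rows],
        "pagination": {"next_cursor": next_cursor, "has_more": has_more},
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(i, f"product-{i}") for i in range(1, 46)])

print(encode_cursor(42))  # → eyJpZCI6NDJ9
page = list_products(conn)
while page["pagination"]["has_more"]:
    page = list_products(conn, cursor=page["pagination"]["next_cursor"])
print(page["data"][-1]["id"])  # → 45
```

Keeping the cursor opaque lets the server change what it encodes later (for example, adding a timestamp) without breaking clients.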
Keyset Pagination
Keyset pagination is cursor-based pagination using the actual column values as the cursor instead of an opaque token. It's the same concept — "give me rows after this point" — but the cursor is transparent.
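When the sort column is not unique (timestamps often collide), the keyset needs a tie-breaker. The sketch below assumes a hypothetical `events` table sorted by `(created_at, id)` and passes the last row's values back as the `after` argument. The expanded `OR` form is used instead of a row-value comparison so it does not depend on database-specific syntax.

```python
import sqlite3
from typing import Optional, Tuple


def events_after(conn: sqlite3.Connection, limit: int,
                 after: Optional[Tuple[str, int]] = None) -> list:
    """Return up to `limit` (created_at, id) rows that sort after `after`."""
    if after is None:
        sql = "SELECT created_at, id FROM events ORDER BY created_at, id LIMIT ?"
        params = (limit,)
    else:
        created_at, last_id = after
        # Composite WHERE: later timestamp, or same timestamp with a larger id.
        sql = (
            "SELECT created_at, id FROM events "
            "WHERE created_at > ? OR (created_at = ? AND id > ?) "
            "ORDER BY created_at, id LIMIT ?"
        )
        params = (created_at, created_at, last_id, limit)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [
    (1, "2024-01-01T00:00:00"),
    (2, "2024-01-01T00:00:00"),
    (3, "2024-01-01T00:00:00"),  # three events share one timestamp
    (4, "2024-01-01T00:00:01"),
    (5, "2024-01-01T00:00:01"),
])

page1 = events_after(conn, limit=2)
page2 = events_after(conn, limit=2, after=page1[-1])
page3 = events_after(conn, limit=2, after=page2[-1])
print(page1, page2, page3)
```

Without the `id` tie-breaker, `WHERE created_at > ?` alone would skip event 3, because it shares its timestamp with the last row of page 1.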
| Feature | Offset | Cursor | Keyset |
|---|---|---|---|
| Performance at page 1000 | Very slow (scans 20K rows) | Fast (index seek) | Fast (index seek) |
| Jump to page N | ✅ Yes (offset = (N - 1) * limit) | ❌ No (must traverse) | ❌ No (must traverse) |
| Consistency on insert | ❌ Duplicates possible | ✅ Stable | ✅ Stable |
| Implementation | Simple | Medium (encode/decode cursor) | Medium (composite WHERE) |
| Best for | Admin panels, small datasets | Infinite scroll, feeds | Time-series, logs |
Use Offset When
- ✅ Dataset is small (< 10K rows)
- ✅ Users need to jump to specific pages
- ✅ Admin dashboards with page numbers
- ✅ Simplicity matters more than performance
Use Cursor When
- ✅ Dataset is large (100K+ rows)
- ✅ Infinite scroll UI (Instagram, Twitter)
- ✅ Data changes frequently (new items inserted)
- ✅ Performance at scale is critical
🎯 Interview Insight
Always mention cursor-based pagination in system design interviews. Explain why offset breaks at scale (OFFSET 1M scans 1M rows), and how cursor pagination uses an indexed WHERE clause, so each page costs one index seek no matter how deep the client has scrolled. Mention that Twitter, Instagram, and Slack all use cursor pagination.
Error Handling Conventions
Error handling is the difference between a debuggable system and a nightmare. When something goes wrong, the error response should tell the developer exactly what happened, why, and how to fix it — without exposing internal implementation details.
HTTP Status Codes — Use Them Correctly
✅ 2xx — Success
- 200 OK — general success
- 201 Created — resource created
- 204 No Content — success, no body (DELETE)
⚠️ 4xx — Client Error
- 400 Bad Request — invalid input
- 401 Unauthorized — not authenticated
- 403 Forbidden — not authorized
- 404 Not Found — resource doesn't exist
- 409 Conflict — duplicate / state conflict
- 422 Unprocessable — validation failed
- 429 Too Many Requests — rate limited
💥 5xx — Server Error
- 500 Internal Server Error — unhandled exception
- 502 Bad Gateway — upstream service failed
- 503 Service Unavailable — overloaded / maintenance
- 504 Gateway Timeout — upstream timed out
Standard Error Response Structure
    ❌ BAD (useless error):
      HTTP 400
      { "error": true }

    ❌ BAD (exposes internals):
      HTTP 500
      { "error": "NullPointerException at UserService.java:142" }

    ✅ GOOD (structured, actionable):
      HTTP 422
      {
        "error": {
          "code": "VALIDATION_FAILED",
          "message": "Request validation failed",
          "details": [
            { "field": "email", "message": "Must be a valid email address", "received": "not-an-email" },
            { "field": "age", "message": "Must be between 18 and 120", "received": -5 }
          ],
          "request_id": "req_abc123",
          "documentation_url": "https://api.example.com/docs/errors#VALIDATION_FAILED"
        }
      }
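A small helper makes it hard to return anything other than the structured shape. This is a sketch under assumptions: `error_response` and `validate_signup` are hypothetical names, the request id format is invented, and the email check is deliberately naive.

```python
import uuid
from typing import Any, List, Optional, Tuple

DOCS_BASE = "https://api.example.com/docs/errors"  # URL from the example above


def error_response(status: int, code: str, message: str,
                   details: Optional[List[dict]] = None,
                   request_id: Optional[str] = None) -> Tuple[int, dict]:
    """Build the one error shape every endpoint returns."""
    return status, {
        "error": {
            "code": code,                      # machine-readable
            "message": message,                # human-readable
            "details": details or [],          # field-level problems
            "request_id": request_id or f"req_{uuid.uuid4().hex[:12]}",
            "documentation_url": f"{DOCS_BASE}#{code}",
        }
    }


def validate_signup(payload: dict) -> List[dict]:
    """Collect every field problem instead of stopping at the first one."""
    details: List[dict] = []
    email: Any = payload.get("email")
    if not isinstance(email, str) or "@" not in email:  # naive check, for illustration
        details.append({"field": "email",
                        "message": "Must be a valid email address",
                        "received": email})
    age: Any = payload.get("age")
    if not isinstance(age, int) or not 18 <= age <= 120:
        details.append({"field": "age",
                        "message": "Must be between 18 and 120",
                        "received": age})
    return details


problems = validate_signup({"email": "not-an-email", "age": -5})
if problems:
    status, body = error_response(422, "VALIDATION_FAILED",
                                  "Request validation failed", problems,
                                  request_id="req_abc123")
    print(status, body["error"]["code"], len(body["error"]["details"]))
```

Routing every failure through one builder is what keeps the error shape identical across endpoints.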
Error Handling Rules
Do
- ✅ Use correct HTTP status codes (not 200 for errors)
- ✅ Include a machine-readable error code (VALIDATION_FAILED)
- ✅ Include a human-readable message
- ✅ Include field-level details for validation errors
- ✅ Include a request_id for tracing in logs
- ✅ Link to documentation for the error
Don't
- ❌ Return 200 OK with an error body (anti-pattern)
- ❌ Expose stack traces, SQL queries, or internal paths
- ❌ Return generic 'Something went wrong' without details
- ❌ Use 500 for client errors (that's a 4xx)
- ❌ Return different error shapes from different endpoints
- ❌ Include sensitive data (passwords, tokens) in error responses
🎯 Interview Insight
In system design interviews, mention that every error response should include a request_id that maps to a trace in your logging system. When a user reports "it's broken," you ask for the request_id and find the exact log entry in seconds. This is how production debugging works at scale.
OpenAPI / API Contracts
An API contract is a formal agreement between the API producer and consumer about what the API accepts and returns. OpenAPI (formerly Swagger) is the most widely used standard for defining these contracts.
The Blueprint Analogy
Imagine building a house without blueprints. The electrician wires for 110V, the appliance team buys 220V equipment. Disaster. An API contract is the blueprint — it defines every endpoint, every field, every type, every possible error BEFORE anyone writes code. Frontend and backend teams build against the contract simultaneously, and when they integrate, everything fits.
What OpenAPI Defines
    openapi: 3.0.0
    info:
      title: E-Commerce API
      version: 1.0.0
    paths:
      /api/v1/products:
        get:
          summary: List products
          parameters:
            - name: cursor
              in: query
              schema: { type: string }
            - name: limit
              in: query
              schema: { type: integer, default: 20, maximum: 100 }
          responses:
            '200':
              description: Product list
              content:
                application/json:
                  schema:
                    type: object
                    properties:
                      data:
                        type: array
                        items: { $ref: '#/components/schemas/Product' }
                      pagination:
                        $ref: '#/components/schemas/CursorPagination'
    components:
      schemas:
        Product:
          type: object
          required: [id, name, price]
          properties:
            id: { type: integer }
            name: { type: string }
            price: { type: number, format: double }
        CursorPagination:
          type: object
          properties:
            next_cursor: { type: string, nullable: true }
            has_more: { type: boolean }
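To make the idea of checking a payload against the contract concrete, here is a hand-written check that mirrors the `Product` schema above. It is only a stand-in: in practice this logic would be generated from the spec or handled by a validator library, and the `PRODUCT_SCHEMA` dictionary and `check_product` function are names invented for this sketch.

```python
# Hand-copied from the Product schema: required fields and expected Python types.
PRODUCT_SCHEMA = {
    "required": ["id", "name", "price"],
    "properties": {
        "id": int,               # type: integer
        "name": str,             # type: string
        "price": (int, float),   # type: number
    },
}


def check_product(obj: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the object conforms."""
    problems = []
    for field in PRODUCT_SCHEMA["required"]:
        if field not in obj:
            problems.append(f"missing required field: {field}")
    for field, expected in PRODUCT_SCHEMA["properties"].items():
        if field not in obj:
            continue
        value = obj[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for field: {field}")
    return problems


print(check_product({"id": 1, "name": "Mug", "price": 9.5}))
print(check_product({"id": "1", "name": "Mug"}))
```

A contract test in CI does the same thing at scale: it calls the real endpoint and fails the build when the response no longer matches the spec.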
Contract-First vs Code-First
| Approach | How It Works | Pros | Cons |
|---|---|---|---|
| Contract-First | Write the OpenAPI spec first, then generate server stubs and client SDKs | Teams agree upfront, parallel development, fewer integration bugs | Requires discipline, spec can drift from implementation |
| Code-First | Write the server code first, generate the OpenAPI spec from annotations | Spec always matches code, faster for small teams | Contract is an afterthought, harder to review API design before building |
What Contracts Give You
- ✅ Auto-generated documentation (Swagger UI, Redoc)
- ✅ Client SDK generation (TypeScript, Python, Java, Go)
- ✅ Request/response validation (reject invalid payloads)
- ✅ Mock servers for frontend development
- ✅ Contract testing (verify implementation matches spec)
When Contracts Are Critical
- ✅ Multiple teams consuming the same API
- ✅ Public APIs (third-party developers need docs)
- ✅ Microservices (service-to-service contracts)
- ✅ Mobile apps (can't force-update, need stable contracts)
- ✅ Any team larger than 3-4 developers
🎯 Interview Insight
Mention contract-first design in interviews when discussing large teams or microservices. "I'd define the OpenAPI spec first so frontend and backend can develop in parallel. We'd use the spec to generate TypeScript types for the frontend and validation middleware for the backend."
End-to-End Scenario
Let's design a production-ready API for an e-commerce system, applying every best practice from this guide.
The System: ShopAPI
    1. VERSIONING
       Base URL: https://api.shop.com/v1
       All endpoints prefixed with /v1
       Sunset header on deprecated endpoints
       v2 runs in parallel when breaking changes are needed

    2. PAGINATION (Cursor-based)
       GET /v1/products?limit=20&cursor=eyJpZCI6NDJ9
       Response includes: { data: [...], pagination: { next_cursor, has_more } }
       Default limit: 20, max limit: 100

    3. IDEMPOTENCY (for mutations)
       POST /v1/orders
       Header: Idempotency-Key: order_uuid_abc123
       Server stores key → result in Redis (TTL: 24h)
       Retry with same key → returns cached result

    4. ERROR HANDLING (consistent structure)
       All errors follow:
       {
         "error": {
           "code": "INSUFFICIENT_STOCK",
           "message": "Product #42 has only 3 items in stock",
           "details": [{ "field": "quantity", "message": "Requested 5, available 3" }],
           "request_id": "req_xyz789"
         }
       }

    5. API CONTRACT (OpenAPI 3.0)
       Spec defined first → frontend generates TypeScript types
       Server validates requests against spec (middleware)
       Swagger UI at /docs for interactive documentation
       Contract tests run in CI to prevent drift
Payment Flow — All Practices Combined
Client Creates Order
POST /v1/orders with Idempotency-Key header. Server validates request against OpenAPI spec. If validation fails → 422 with field-level errors. If valid → create order, store idempotency key → result mapping.
Client Initiates Payment
POST /v1/payments with Idempotency-Key. Server checks: has this key been processed? No → process payment. Yes → return cached result. Network timeout? Client retries with same key — safe.
Client Lists Orders
GET /v1/orders?limit=20&cursor=... Cursor pagination ensures consistent results even as new orders are created. Response includes next_cursor for the next page.
Error Occurs
Payment fails due to insufficient funds. Server returns 402 Payment Required with error code PAYMENT_DECLINED, a human-readable message, and a request_id. Client logs the request_id for support tickets.
💡 This Is What Production Looks Like
Major commerce APIs such as Stripe and Shopify use variations of these patterns: versioned APIs, idempotency keys on payments, cursor pagination on listings, structured errors with request IDs, and published API specifications for documentation. The details differ from provider to provider, but this isn't theoretical: it's how the industry works.
Trade-offs & Decision Making
Every best practice has trade-offs. The skill is knowing when the added complexity is justified.
Versioning Trade-offs
| Decision | URL Versioning | Header Versioning |
|---|---|---|
| Discoverability | High — visible in URL | Low — hidden in headers |
| Caching | Easy — different URLs = different cache entries | Harder — same URL, need Vary header |
| Routing | Simple — route at load balancer | Complex — need header inspection |
| Cleanliness | URL pollution (/v1, /v2, /v3) | Clean URLs |
| Best for | Public APIs, most use cases | Internal APIs, API gateways |
Pagination Trade-offs
| Decision | Offset | Cursor |
|---|---|---|
| Random page access | ✅ Jump to page 50 | ❌ Must traverse sequentially |
| Performance at depth | ❌ O(offset), degrades with depth | ✅ Index seek (O(log n)), effectively constant |
| Consistency | ❌ Duplicates on insert | ✅ Stable pointer |
| Implementation | Simple (LIMIT/OFFSET) | Medium (encode/decode cursor) |
| Best for | Small datasets, admin UIs | Large datasets, feeds, infinite scroll |
Contract Strictness Trade-offs
| Decision | Strict Contracts | Flexible Contracts |
|---|---|---|
| Safety | High — rejects unexpected fields | Low — accepts anything |
| Agility | Lower — spec must be updated first | Higher — just ship code |
| Integration bugs | Fewer — caught at validation | More — discovered at runtime |
| Best for | Large teams, public APIs | Prototypes, solo developers |
🎯 Interview Framework
When asked about any of these decisions, frame it as: "It depends on the scale and team size. For a startup with 3 developers, offset pagination and code-first contracts are fine. For a platform with 50 engineers and millions of users, cursor pagination and contract-first design prevent costly bugs."
Interview Questions
Conceptual, scenario-based, and edge-case questions you're likely to encounter.
Q:How do you prevent duplicate payments in a distributed system?
A: Use idempotency keys. The client generates a unique key (UUID) and sends it with the payment request. The server stores the key → result mapping (in Redis with a 24h TTL). On retry, the server checks if the key exists: if yes, return the cached result without reprocessing. If the key exists but the body is different, return 422 Unprocessable Content, because the key was reused for a different request. Stripe's API follows the same idempotency-key pattern.
Q:Why is cursor pagination better than offset at scale?
A: Offset pagination uses SQL OFFSET, which forces the database to scan and skip N rows before returning results. At OFFSET 1,000,000, the DB scans 1M rows just to skip them. Cursor pagination uses a WHERE clause (WHERE id > cursor) which leverages an index: the DB seeks directly to the right position in O(log n), so the cost is effectively the same for every page. Additionally, offset pagination breaks when data is inserted (items shift, causing duplicates), while cursor pagination is stable.
Q:How do you version APIs without breaking existing clients?
A: Use URL versioning (/v1/users, /v2/users). Run both versions simultaneously. Add Sunset and Deprecation headers to old versions. Monitor traffic on old versions. Communicate deprecation timelines via changelogs and emails. Only remove old versions when usage drops below a threshold (e.g., < 1%). Return 410 Gone (not 404) when a version is removed, with a message pointing to the new version.
You're designing an API for a mobile banking app
What best practices would you apply?
Answer: Idempotency keys on all financial operations (transfers, payments) — mobile networks are unreliable and retries are common. URL versioning — mobile apps can't force-update, so old versions must keep working. Cursor pagination for transaction history — users have thousands of transactions. Structured error responses with error codes — the app needs to show specific messages ('Insufficient funds' vs 'Account locked'). OpenAPI contract — the mobile team and backend team need to agree on the API shape before building.
Your API returns 200 OK for all responses, including errors
What's wrong with this approach?
Answer: This breaks HTTP semantics. Intermediaries (CDNs, proxies, load balancers) use status codes to make decisions — a CDN might cache a 200 error response. Monitoring tools count 5xx rates for alerting — if errors are 200, you get no alerts. Client libraries check status codes for error handling — returning 200 forces every client to parse the body to detect errors. The fix: use proper status codes (400, 401, 404, 500) and reserve 200 for actual success.
A developer says 'We don't need API versioning, we'll just be careful'
Why is this dangerous?
Answer: Any change to a response shape (renaming a field, changing a type, removing a field) is a breaking change for clients that depend on the old shape. 'Being careful' doesn't scale — with 10 developers shipping weekly, someone will make a breaking change. Mobile apps can't force-update — users on old versions will break. Third-party integrations you don't control will break. The cost of adding versioning later (when things are already broken) is 10x higher than adding it from day one.
Common Mistakes
These mistakes are common in interviews and in production systems. Each one has caused real outages and real money lost.
Not versioning APIs from day one
Teams skip versioning because 'we only have one client.' Then a mobile app launches, a partner integration goes live, and suddenly you can't change anything without breaking someone. Adding versioning retroactively means migrating all existing clients — a painful, risky process.
✅Add /v1 to your base URL from the very first endpoint. It costs nothing upfront and saves enormous pain later. Even if you never need v2, the /v1 prefix is harmless.
Ignoring idempotency on financial operations
A payment endpoint without idempotency will double-charge customers on network retries. This isn't rare — it happens thousands of times per day at scale. Mobile networks drop connections constantly, and HTTP clients retry automatically.
✅Every POST endpoint that creates a resource or triggers a side effect should accept an Idempotency-Key header. Store key → result in Redis with a 24-48h TTL. Return the cached result on retry.
Using offset pagination at scale
Offset pagination works fine for the first few pages. But at page 5000 (OFFSET 100,000 with 20 items per page), the database scans and discards 100K rows before returning 20. Response times can climb from milliseconds to seconds. Users on infinite-scroll feeds hit this wall.
✅Use cursor-based pagination for any endpoint that could return more than a few thousand results. Encode the last item's ID (or timestamp) as the cursor. Use WHERE id > cursor instead of OFFSET.
Returning vague error messages
'Something went wrong' tells the developer nothing. 'Error: true' is even worse. Without structured errors, debugging requires reading server logs for every single issue. Support tickets pile up because users can't self-diagnose.
✅Every error response should include: HTTP status code, machine-readable error code, human-readable message, field-level details (for validation), and a request_id for log correlation. Follow a consistent structure across all endpoints.
No API contract between teams
Frontend builds against what they think the API returns. Backend builds what they think frontend needs. They integrate after 2 weeks and nothing matches — field names are different, types are wrong, optional fields are missing. Two weeks of rework.
✅Define the OpenAPI spec before writing code. Both teams review and agree on the contract. Frontend generates TypeScript types from the spec. Backend validates requests against the spec. Run contract tests in CI to catch drift.