Infrastructure Primitives
Learn the infrastructure building blocks — DNS resolution, load balancers, CDNs, NAT, and firewalls that form the backbone of every scalable system.
The Big Picture — What Are Infrastructure Primitives?
Infrastructure primitives are the invisible building blocks that make the internet work. Every time you load a website, stream a video, or call an API, a chain of infrastructure components routes, balances, caches, secures, and delivers your request. You never see them, but without them, nothing works at scale.
The City Analogy
Think of the internet as a city. DNS is the phonebook — it translates names ('Amazon HQ') into addresses ('410 Terry Ave'). The load balancer is a traffic police officer — it directs cars to the least congested road. The CDN is a chain of local warehouses — instead of shipping every package from a central factory, you stock popular items in warehouses near each neighborhood. NAT is the office receptionist — the outside world sees one phone number, but internally there are hundreds of extensions. The firewall is the security gate — it checks every person entering the building and blocks anyone unauthorized.
These aren't optional add-ons. They're the foundation. Every system you design in an interview or at work will use some combination of these primitives. Understanding them deeply is what separates someone who can draw boxes on a whiteboard from someone who can actually build scalable systems.
🔥 Key Insight
Infrastructure primitives are not about any single technology. They're about the roles that must be filled in any distributed system: name resolution, traffic distribution, content delivery, address translation, and security enforcement.
How They Work Together
No infrastructure component works in isolation. They form a pipeline that every request passes through. Here's the typical flow when a user visits a website:
User (types a URL) → DNS (resolves name → IP) → CDN (serves cached content) → Load Balancer (routes to server) → Server (processes request)
DNS — The Phonebook
Translates human-readable domain names into IP addresses. Without DNS, you'd need to memorize 52.94.236.248 instead of typing amazon.com.
CDN — The Local Warehouse
Caches content at edge locations close to users. A user in Tokyo gets images from a Tokyo server, not from Virginia.
Load Balancer — The Traffic Police
Distributes incoming requests across multiple servers. Prevents any single server from being overwhelmed.
NAT & Firewall — The Security Gate
NAT translates private IPs to public IPs. Firewalls filter traffic, blocking malicious requests before they reach your servers.
User types: https://shop.example.com/products

1. DNS Resolution
   → Browser asks: "What's the IP of shop.example.com?"
   → DNS returns: 104.18.25.43 (CDN edge IP)

2. CDN Check
   → Request hits CDN edge server in user's region
   → Cache HIT?  → Return cached response (fast!)
   → Cache MISS? → Forward to origin (load balancer)

3. Firewall
   → Request passes through WAF (Web Application Firewall)
   → Checks for SQL injection, XSS, rate limiting
   → Blocked? → Return 403. Allowed? → Continue.

4. Load Balancer
   → Routes request to healthiest backend server
   → Algorithm: least connections / round robin / IP hash

5. Server Processing
   → App server handles business logic
   → Queries database, builds response

6. Response flows back:
   Server → Load Balancer → Firewall → CDN (cache it) → User
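The flow above can be modeled as a handful of stages that a request passes through in order. This is a toy sketch — the `Pipeline` class and its behavior are illustrative, not any real CDN or load balancer API:

```python
class Pipeline:
    """Toy request pipeline (DNS omitted): CDN edge -> WAF -> LB -> server."""

    def __init__(self, servers):
        self.cache = {}          # CDN edge cache: path -> body
        self.servers = servers   # backend pool
        self.next_server = 0     # round-robin cursor

    def handle(self, path, body=""):
        # 1. CDN edge: serve straight from cache on a hit
        if path in self.cache:
            return (200, self.cache[path], "cdn-cache")
        # 2. WAF: block obvious attack payloads (toy substring check)
        if "<script>" in body or "SELECT * FROM" in body:
            return (403, "blocked", "waf")
        # 3. Load balancer: round-robin over the backend pool
        server = self.servers[self.next_server % len(self.servers)]
        self.next_server += 1
        # 4. Origin processes the request; the edge caches the response
        status, response = server(path)
        if status == 200:
            self.cache[path] = response
        return (status, response, "origin")

def app_server(path):
    return 200, f"rendered {path}"

p = Pipeline([app_server])
print(p.handle("/products"))   # first request: served by the origin
print(p.handle("/products"))   # repeat request: served from the CDN cache
```

Note the design choice that mirrors the diagram: the cache check comes before the WAF, so a cached response never touches the origin or its security layers.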
💡 Why This Matters
In system design interviews, you're expected to place these components correctly in your architecture. Knowing where DNS, CDN, load balancers, and firewalls sit in the request flow is foundational.
DNS Resolution — Deep Dive
DNS (Domain Name System) is the internet's phonebook. It translates human-friendly domain names like google.com into machine-friendly IP addresses like 142.250.80.46. Without DNS, you'd need to memorize IP addresses for every website you visit.
The Contact List Analogy
You don't memorize your friend's phone number — you save it as 'Alice' in your contacts. When you tap 'Alice', your phone looks up the number. DNS works the same way: you type 'google.com', and DNS looks up the IP address. And just like your phone caches contacts locally, DNS caches results at multiple levels to avoid looking up the same name repeatedly.
DNS Resolution — Step by Step
Browser Cache
Browser checks its own DNS cache first. If you visited google.com 2 minutes ago, the IP is already cached locally. Cache hit → done, no network request needed.
OS Cache
If the browser cache misses, the OS checks its DNS cache. On macOS/Linux, this is managed by the system resolver. On Windows, it's the DNS Client service. Still local — no network hop.
Recursive Resolver (ISP / Public DNS)
If the OS cache misses, the query goes to a recursive resolver — usually your ISP's DNS server or a public one like 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare). This resolver does the heavy lifting of walking the DNS hierarchy.
Root Name Server
The recursive resolver asks a root server: 'Where can I find .com domains?' There are 13 logical root servers (labeled A through M), each replicated worldwide via anycast. The root server responds: 'Ask the .com TLD server at 192.5.6.30.'
TLD (Top-Level Domain) Server
The resolver asks the .com TLD server: 'Where can I find google.com?' The TLD server responds: 'The authoritative server for google.com is ns1.google.com at 216.239.32.10.'
Authoritative Name Server
The resolver asks Google's authoritative server: 'What's the IP for google.com?' The authoritative server responds: '142.250.80.46' with a TTL (time-to-live) of 300 seconds. The resolver caches this and returns it to the browser.
Browser: "What's the IP of google.com?"

Step 1: Browser cache → MISS
Step 2: OS cache → MISS
Step 3: Recursive resolver (8.8.8.8)
  ├→ Root server (.): "Ask .com TLD at 192.5.6.30"
  ├→ TLD server (.com): "Ask ns1.google.com at 216.239.32.10"
  └→ Authoritative (ns1): "google.com = 142.250.80.46 (TTL: 300s)"

Result cached at:
  → Recursive resolver (for other users too)
  → OS cache
  → Browser cache

Next request within 300s → instant (cache hit)
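The hierarchy walk above can be sketched as a toy recursive resolver. The server tables below are made up for illustration (real resolvers speak the DNS wire protocol and honor TTLs):

```python
# Toy DNS hierarchy: root -> TLD -> authoritative, with resolver caching.
ROOT = {"com": "192.5.6.30"}                             # root: where is each TLD server?
TLD = {"google.com": "216.239.32.10"}                    # .com: who is authoritative?
AUTHORITATIVE = {"google.com": ("142.250.80.46", 300)}   # name -> (IP, TTL seconds)

cache = {}  # the recursive resolver's cache: name -> IP

def resolve(name):
    if name in cache:                  # cache hit: no hierarchy walk at all
        return cache[name], "cache"
    tld = name.rsplit(".", 1)[-1]
    assert tld in ROOT                 # step 1: ask root where the TLD server is
    assert name in TLD                 # step 2: ask TLD who is authoritative
    ip, ttl = AUTHORITATIVE[name]      # step 3: ask authoritative for the record
    cache[name] = ip                   # cache it (a real resolver expires it after `ttl`)
    return ip, "authority"

print(resolve("google.com"))   # ('142.250.80.46', 'authority')
print(resolve("google.com"))   # ('142.250.80.46', 'cache')
```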
TTL — Why Caching Matters
⏱️ Short TTL (30-60 seconds)
- Changes propagate quickly
- Useful during migrations or failovers
- More DNS queries → higher latency
- More load on DNS servers
⏱️ Long TTL (3600+ seconds)
- Fewer DNS queries → lower latency
- Less load on DNS infrastructure
- Changes take longer to propagate
- Stale records during IP changes
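The short-vs-long TTL trade-off comes down to one mechanism: an entry stops being served once its TTL elapses. A minimal sketch, with the clock injected so the behavior is deterministic (the class and names are illustrative):

```python
class TTLCache:
    """Minimal TTL cache: entries expire `ttl` seconds after being set."""

    def __init__(self, clock):
        self.clock = clock       # injected time source, for testability
        self.entries = {}        # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self.entries[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:   # TTL elapsed: entry is stale
            del self.entries[key]
            return None                  # caller must re-resolve
        return value

now = [0]
cache = TTLCache(clock=lambda: now[0])
cache.set("example.com", "1.2.3.4", ttl=60)     # short TTL
assert cache.get("example.com") == "1.2.3.4"    # fresh: served from cache
now[0] = 61                                     # 61 seconds later...
assert cache.get("example.com") is None         # expired: next lookup re-resolves
```

With `ttl=60` a changed IP is picked up within a minute; with `ttl=3600` the stale address can be served for up to an hour — exactly the failover-vs-query-volume trade-off above.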
What Happens When DNS Fails?
DNS is slow
- ❌ Every page load adds 50-200ms of DNS lookup time
- ❌ Users perceive the site as slow (even if the server is fast)
- ❌ Mobile users on cellular networks are hit hardest
- ❌ Cascading effect: every resource (images, scripts) needs DNS too
DNS is down
- ❌ Your domain becomes unreachable — even if servers are healthy
- ❌ Users see 'DNS_PROBE_FINISHED_NXDOMAIN' errors
- ❌ No failover possible unless you have redundant DNS providers
- ❌ The 2016 Dyn DNS attack took down Twitter, Netflix, Reddit
🎯 Interview Insight
DNS is a single point of failure for your entire system. In interviews, mention using multiple DNS providers (e.g., Route 53 + Cloudflare) for redundancy, and pre-warming DNS caches before traffic migrations.
Load Balancers
A load balancer distributes incoming traffic across multiple servers so no single server gets overwhelmed. It's the reason a website with millions of users doesn't run on a single machine.
Traffic Police vs Smart Routing
An L4 load balancer is like a traffic police officer at a highway junction — they see the license plate (IP) and destination (port) and direct cars to the least busy lane. They don't know what's inside the car. An L7 load balancer is like a smart routing system at an airport — it reads your boarding pass (HTTP headers, URL path, cookies) and sends you to the correct terminal, gate, and even priority lane based on your ticket class.
L4 vs L7 — The Two Types
🔌 L4 — Transport Layer
- Routes based on IP address + port number
- Cannot inspect HTTP headers, URLs, or cookies
- Very fast — operates at the TCP/UDP level
- Lower CPU usage, higher throughput
- Use case: TCP proxying, database connections, gaming
🧠 L7 — Application Layer
- Routes based on HTTP headers, URL path, cookies, body
- Can make intelligent routing decisions
- Terminates TLS (decrypts to inspect content)
- Higher CPU usage, more features
- Use case: web apps, API gateways, microservices
| Feature | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Operates at | TCP/UDP (transport layer) | HTTP/HTTPS (application layer) |
| Routing based on | IP + Port | URL path, headers, cookies, body |
| TLS termination | No (passes encrypted traffic through) | Yes (decrypts, inspects, re-encrypts) |
| Speed | Faster (less processing) | Slower (content inspection) |
| Intelligence | Dumb routing | Smart routing (path-based, header-based) |
| Use cases | TCP proxying, databases, gaming | Web apps, APIs, microservices |
| Examples | AWS NLB, HAProxy (TCP mode) | AWS ALB, Nginx, HAProxy (HTTP mode) |
Load Balancing Algorithms
Round Robin
Requests are distributed to servers in order: Server 1 → Server 2 → Server 3 → Server 1 → ... Simple and fair, but doesn't account for server load. A server handling a heavy request gets the same share as an idle one.
Least Connections
Routes to the server with the fewest active connections. Better than round robin when requests have varying processing times. A server finishing fast gets more requests; a server stuck on a heavy query gets fewer.
IP Hash
Hashes the client's IP address to determine which server handles the request. The same client always goes to the same server. Useful for session affinity (sticky sessions) without cookies.
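The three algorithms above fit in a few lines each. This is a sketch with made-up server names and connection counts, not a production balancer:

```python
import hashlib
from itertools import cycle

servers = ["s1", "s2", "s3"]

# Round robin: walk the list in order, wrapping around.
rr = cycle(servers)
def round_robin():
    return next(rr)

# Least connections: pick the server with the fewest active connections.
active = {"s1": 47, "s2": 30, "s3": 12}   # illustrative counts
def least_connections():
    return min(active, key=active.get)

# IP hash: the same client IP always maps to the same server (sticky).
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

print([round_robin() for _ in range(4)])                  # ['s1', 's2', 's3', 's1']
print(least_connections())                                # 's3' (fewest connections)
print(ip_hash("203.0.113.7") == ip_hash("203.0.113.7"))   # True: stable mapping
```

One caveat worth knowing for interviews: plain modulo IP hashing reshuffles most clients when the server list changes; consistent hashing is the usual fix.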
Incoming requests to api.example.com:

/api/users/*    → User Service cluster (3 servers)
/api/payments/* → Payment Service cluster (5 servers)
/api/search/*   → Search Service cluster (8 servers)
/static/*       → CDN origin (2 servers)
/*              → Default backend (2 servers)

This is only possible with L7 — it reads the URL path.
An L4 balancer would send ALL requests to the same pool.
Critical Concepts
Health Checks
Load balancers periodically ping each server (e.g., GET /health). If a server fails to respond, it's removed from the pool. When it recovers, it's added back. This is how failover works.
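The eject-and-readmit behavior can be modeled with a failure counter per server. The threshold of three consecutive failures is an illustrative default (real balancers make both the threshold and the recovery policy configurable):

```python
class Pool:
    """Toy health-checked pool: eject after consecutive failures, readmit on success."""

    UNHEALTHY_AFTER = 3   # consecutive failed checks before ejection (illustrative)

    def __init__(self, servers):
        self.failures = {s: 0 for s in servers}

    def record_check(self, server, ok):
        # a successful check resets the streak; a failure extends it
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy(self):
        return [s for s, f in self.failures.items() if f < self.UNHEALTHY_AFTER]

pool = Pool(["s1", "s2"])
for _ in range(3):
    pool.record_check("s2", ok=False)   # s2 fails 3 checks in a row
print(pool.healthy())                   # ['s1'] — s2 ejected, traffic fails over
pool.record_check("s2", ok=True)        # s2 recovers on the next check
print(pool.healthy())                   # ['s1', 's2'] — s2 readmitted
```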
Sticky Sessions
Some apps store session state on the server. Sticky sessions ensure the same user always hits the same server (via cookies or IP hash). Trade-off: uneven load distribution.
Failover
When a server goes down, the load balancer automatically routes traffic to healthy servers. Users don't notice. This is the primary reliability benefit of load balancing.
🎯 Interview Insight — When to Use L4 vs L7
Use L4 when you need raw speed and don't need to inspect content — TCP proxying, database connection pooling, gaming servers. Use L7 when you need intelligent routing — path-based routing for microservices, A/B testing via headers, rate limiting per API endpoint. Most web applications use L7.
CDN & Edge Caching
A CDN (Content Delivery Network) is a globally distributed network of servers that caches and serves content from locations close to the user. Instead of every request traveling to your origin server in Virginia, a user in Tokyo gets the response from a server in Tokyo.
The Amazon Warehouse Analogy
Amazon doesn't ship every order from one central warehouse. They stock popular items in warehouses near major cities. When you order, the package comes from the nearest warehouse — not from across the country. A CDN does the same thing with web content. Popular images, scripts, and pages are cached at 'edge locations' worldwide. The user gets content from the nearest edge, not from your origin server thousands of miles away.
How CDN Works
First Request (Cache MISS)
User in Tokyo requests an image. The CDN edge in Tokyo doesn't have it yet. The edge forwards the request to the origin server, gets the response, caches it locally, and returns it to the user. Slower this first time.
Subsequent Requests (Cache HIT)
Another user in Tokyo requests the same image. The CDN edge already has it cached. Response is served directly from Tokyo — no round trip to the origin. Latency drops from 200ms to 20ms.
Cache Expiration (TTL)
After the TTL expires, the edge considers the cached content stale. The next request triggers a revalidation with the origin. If the content hasn't changed, the origin responds with '304 Not Modified' and the edge refreshes the TTL.
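The miss → hit → revalidate cycle above can be modeled in a short sketch. The `Edge` class, the injected clock, and the toy `origin` function are all illustrative — real CDNs implement this via HTTP `Cache-Control`, `ETag`, and `If-None-Match` headers:

```python
class Edge:
    """Toy CDN edge cache with TTL and conditional revalidation."""

    def __init__(self, origin, clock, ttl=60):
        self.origin, self.clock, self.ttl = origin, clock, ttl
        self.cache = {}   # path -> (body, etag, fresh_until)

    def get(self, path):
        entry = self.cache.get(path)
        if entry and self.clock() < entry[2]:
            return entry[0], "HIT"                 # fresh: no origin round trip
        if entry:  # stale: revalidate with a conditional request
            status, body, etag = self.origin(path, if_none_match=entry[1])
            if status == 304:  # unchanged: keep the body, just refresh the TTL
                self.cache[path] = (entry[0], entry[1], self.clock() + self.ttl)
                return entry[0], "REVALIDATED"
        else:      # nothing cached: full fetch from origin
            status, body, etag = self.origin(path, if_none_match=None)
        self.cache[path] = (body, etag, self.clock() + self.ttl)
        return body, "MISS"

def origin(path, if_none_match):
    etag = "v1"
    if if_none_match == etag:
        return 304, None, etag      # content unchanged: tiny response, no body
    return 200, f"content of {path}", etag

now = [0]
edge = Edge(origin, clock=lambda: now[0])
print(edge.get("/logo.png")[1])   # MISS  (first fetch goes to the origin)
print(edge.get("/logo.png")[1])   # HIT   (within the 60s TTL)
now[0] = 61
print(edge.get("/logo.png")[1])   # REVALIDATED (origin answered 304)
```

The 304 path is the payoff: after expiry the edge pays one small round trip instead of re-downloading the whole object.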
✅ What CDNs Cache Well (Static Content)
- Images, videos, fonts, CSS, JavaScript bundles
- HTML pages that don't change per user
- API responses that are the same for all users
- Downloads (PDFs, installers, packages)
⚠️ What CDNs Struggle With (Dynamic Content)
- Personalized pages (user dashboard, cart)
- Real-time data (stock prices, live scores)
- Authenticated API responses (per-user data)
- Content that changes every second
Cache Invalidation — The Hard Problem
⚠️ The Hardest Problem in Computer Science
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton. When you update content on your origin, how do you ensure all CDN edges worldwide serve the new version? This is cache invalidation.
| Strategy | How It Works | Trade-off |
|---|---|---|
| TTL-based expiration | Content expires after N seconds. Edge fetches fresh copy. | Simple but users see stale content until TTL expires |
| Cache purge / invalidation | You explicitly tell the CDN to drop cached content. | Immediate but requires API calls; can be slow at global scale |
| Cache busting (versioned URLs) | Change the URL: style.css → style.v2.css or style.css?v=abc123 | Most reliable — new URL = guaranteed fresh fetch. Requires build tooling. |
| Stale-while-revalidate | Serve stale content immediately, revalidate in background. | Best UX (instant response) but briefly serves outdated content |
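Cache busting, the most reliable strategy in the table, is usually automated by build tooling. A minimal sketch of the idea — embed a hash of the file's content in its URL, so any change produces a new URL and a guaranteed fresh fetch:

```python
import hashlib

def busted_url(filename, content: bytes):
    """Return a content-addressed filename, e.g. app.js -> app.1a2b3c4d.js."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, _dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}"

v1 = busted_url("app.js", b"console.log('v1')")
v2 = busted_url("app.js", b"console.log('v2')")
print(v1, v2)   # different content -> different URLs
```

Because each URL's content never changes, versioned assets can be cached with a very long TTL (even `immutable`), and a deployment simply starts referencing the new URLs.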
Geo-Based Routing
User in Tokyo  → DNS resolves to CDN edge in Tokyo  (35.1ms)
User in London → DNS resolves to CDN edge in London (12.4ms)
User in Mumbai → DNS resolves to CDN edge in Mumbai (18.7ms)

How? The CDN provider (Cloudflare, CloudFront, Akamai) uses:
1. Anycast routing — same IP, multiple locations
2. GeoDNS — returns different IPs based on user's location
3. Latency-based routing — routes to the fastest edge

Without CDN: All users → Origin in us-east-1 (Virginia)
  Tokyo user: 180ms | London user: 90ms | Mumbai user: 150ms
With CDN:
  Tokyo user: 35ms | London user: 12ms | Mumbai user: 19ms
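Latency-based routing reduces to a simple selection once the measurements exist: given one user's measured latency to each candidate edge, route to the fastest. A sketch with illustrative numbers (the hard part in practice is collecting and refreshing those measurements, not the selection):

```python
# Measured latencies from one user to each candidate edge, in ms (illustrative).
edges = {"tokyo": 35.1, "london": 12.4, "mumbai": 18.7}

def nearest_edge(latencies):
    """Pick the edge with the lowest measured latency."""
    return min(latencies, key=latencies.get)

print(nearest_edge(edges))   # this user gets routed to the London edge
```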
Netflix Streaming
Netflix uses its own CDN (Open Connect) with servers inside ISP networks. When you stream a show, the video comes from a server literally inside your ISP's data center — not from Netflix's cloud.
Image Delivery
E-commerce sites serve product images from CDN edges. A site with 10M products and users worldwide would collapse without CDN — the origin can't handle that many image requests.
NAT & Firewalls
NAT — Network Address Translation
NAT solves a fundamental problem: there aren't enough public IPv4 addresses for every device on the internet. There are only ~4.3 billion IPv4 addresses, but there are 15+ billion connected devices. NAT lets thousands of devices share a single public IP.
The Office Phone System
An office building has one main phone number (public IP). Inside, there are 500 employees with internal extensions (private IPs: 192.168.1.x). When an employee calls outside, the receptionist (NAT) replaces the internal extension with the main number. When a call comes back, the receptionist routes it to the correct extension. The outside world only ever sees the one main number.
Outbound Request
Your laptop (192.168.1.42) sends a request to google.com. The NAT router replaces the source IP with its public IP (203.0.113.5) and records the mapping: 192.168.1.42:54321 ↔ 203.0.113.5:54321 (in practice the router may also rewrite the port to avoid collisions between internal hosts). The request goes out with the public IP.
Inbound Response
Google's response comes back to 203.0.113.5:54321. The NAT router looks up the mapping and forwards the response to 192.168.1.42:54321. Your laptop receives the response as if it had a direct connection.
Inbound Requests (The Catch)
By default, NAT blocks unsolicited inbound traffic. If someone tries to connect to 203.0.113.5 from outside, NAT doesn't know which internal device to forward to. This is why you need port forwarding to host a server behind NAT.
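The three steps above boil down to a translation table. A toy sketch with illustrative addresses — real NATs also track the remote endpoint, protocol, and timeouts, and often pick a fresh public port:

```python
PUBLIC_IP = "203.0.113.5"
table = {}   # public port -> (private ip, private port)

def outbound(src_ip, src_port):
    """Rewrite the source of an outgoing packet and record the mapping."""
    table[src_port] = (src_ip, src_port)   # real NATs may assign a new public port
    return PUBLIC_IP, src_port             # what the outside world sees

def inbound(dst_port):
    """Forward a response to the internal host — or drop unsolicited traffic."""
    if dst_port not in table:
        return None                        # no mapping: this is the 'catch' above
    return table[dst_port]

print(outbound("192.168.1.42", 54321))     # ('203.0.113.5', 54321)
print(inbound(54321))                      # ('192.168.1.42', 54321) — mapped reply
print(inbound(9999))                       # None — dropped, no port forwarding
```

Port forwarding is simply a manually pre-populated entry in this table, which is why it's required to host a server behind NAT.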
Private IP Ranges (RFC 1918)
- ✅ 10.0.0.0 – 10.255.255.255 (10.x.x.x)
- ✅ 172.16.0.0 – 172.31.255.255
- ✅ 192.168.0.0 – 192.168.255.255 (home networks)
- ✅ These IPs are NOT routable on the public internet
- ✅ Every home/office uses them internally
Why NAT Matters for System Design
- ✅ Cloud VPCs use private IPs internally
- ✅ NAT gateways let private instances access the internet
- ✅ Microservices communicate via private IPs within a VPC
- ✅ Public-facing load balancers have public IPs; backends don't
- ✅ Understanding NAT explains why port mapping exists
Firewalls
A firewall inspects network traffic and decides what to allow and what to block based on predefined rules. It's the security guard at the entrance of your system.
The Security Guard Analogy
A network firewall is like a security guard at a building entrance. They check your ID (IP address, port) and either let you in or turn you away. A WAF (Web Application Firewall) is like a more thorough security checkpoint — they open your bag (HTTP request body), check for weapons (SQL injection, XSS payloads), and verify your invitation (authentication tokens).
🔌 Network-Level Firewall
- Operates at L3/L4 (IP addresses, ports, protocols)
- Rules: "Allow TCP port 443 from any IP"
- Rules: "Block all traffic from 185.x.x.x range"
- Fast — doesn't inspect packet contents
- Examples: AWS Security Groups, iptables
🧠 Application-Level Firewall (WAF)
- Operates at L7 (HTTP requests, headers, body)
- Detects SQL injection, XSS, CSRF attacks
- Rate limiting per IP or per API key
- Geo-blocking (block traffic from specific countries)
- Examples: AWS WAF, Cloudflare WAF, ModSecurity
Network Firewall (Security Group):
  ALLOW TCP 443 from 0.0.0.0/0   ← HTTPS from anywhere
  ALLOW TCP 80  from 0.0.0.0/0   ← HTTP from anywhere (redirect to HTTPS)
  ALLOW TCP 22  from 10.0.0.0/8  ← SSH only from internal network
  DENY  ALL ALL from 0.0.0.0/0   ← Block everything else

WAF Rules:
  BLOCK if request body contains "SELECT * FROM"  ← SQL injection
  BLOCK if request body contains "<script>"       ← XSS attempt
  RATE LIMIT 100 requests/minute per IP           ← DDoS protection
  BLOCK if User-Agent is empty                    ← Bot filtering
  ALLOW if IP is in whitelist                     ← Trusted partners
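The rule sets above can be expressed as a layered check: network-level first (fast, port/IP only), then WAF-level (content inspection). This is a deliberately naive sketch — real WAFs use far more robust detection than substring matching, and the patterns and limits here are illustrative:

```python
ALLOWED_PORTS = {80, 443}
RATE_LIMIT = 100                                 # requests per window per IP
ATTACK_PATTERNS = ["SELECT * FROM", "<script>"]  # toy signatures
request_counts = {}                              # ip -> requests seen this window

def check(ip, port, body, user_agent):
    # Layer 1: network firewall — only ports, IPs, protocols
    if port not in ALLOWED_PORTS:
        return "DENY: port"
    # Layer 2: WAF — inspects the HTTP request itself
    if any(p in body for p in ATTACK_PATTERNS):
        return "BLOCK: attack pattern"
    if not user_agent:
        return "BLOCK: empty User-Agent"
    request_counts[ip] = request_counts.get(ip, 0) + 1
    if request_counts[ip] > RATE_LIMIT:
        return "BLOCK: rate limit"
    return "ALLOW"

print(check("198.51.100.9", 443, "name=alice", "Mozilla/5.0"))                  # ALLOW
print(check("198.51.100.9", 443, "id=1; SELECT * FROM users", "Mozilla/5.0"))   # BLOCK
print(check("198.51.100.9", 22, "", "Mozilla/5.0"))                             # DENY
```

Note the ordering: the cheap L3/L4 check runs before any content inspection, which is exactly why network firewalls sit in front of WAFs.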
🔐 Defense in Depth
Never rely on a single firewall. Use network firewalls to block unwanted ports/IPs, WAF to block application-level attacks, and application-level validation as the last line of defense. Each layer catches what the previous one missed.
End-to-End Flow
Let's trace exactly what happens when you type www.example.com and press Enter — through every infrastructure primitive.
DNS Resolution
Browser checks its cache → OS cache → recursive resolver → root server → .com TLD → authoritative server. Result: 'www.example.com = 104.18.25.43' (this is the CDN edge IP, not the origin). Cached with TTL of 300 seconds.
CDN Edge Check
The IP 104.18.25.43 belongs to a CDN edge server in the user's region. The edge checks its cache: does it have the response for GET /? Cache HIT → return immediately (done in ~15ms). Cache MISS → forward to origin.
Firewall / WAF
The request passes through the WAF. It checks: Is this IP blocked? Does the request contain attack patterns? Is this IP exceeding rate limits? If clean → forward. If suspicious → block with 403.
Load Balancer
The L7 load balancer receives the request. It reads the URL path, checks server health, and routes to the healthiest backend using least-connections algorithm. The request goes to Server #3 (which has 12 active connections vs Server #1's 47).
NAT (Internal Routing)
The backend server sits in a private subnet (10.0.2.15). The load balancer translates the public request to the private IP. The server processes the request — queries the database, builds the HTML response.
Response Flows Back
Server → Load Balancer → WAF → CDN edge (caches the response for future requests) → User's browser. Total time: ~120ms for cache miss, ~15ms for cache hit on subsequent requests.
User types: www.example.com

1. DNS     → "www.example.com = 104.18.25.43" (CDN edge IP)
2. CDN     → Cache MISS → forward to origin
3. WAF     → Request is clean → allow
4. LB (L7) → Route to Server #3 (least connections)
5. NAT     → Public IP → Private IP (10.0.2.15)
6. Server  → Process request, query DB, build HTML
7. Return  → Server → LB → WAF → CDN (cache it) → User

Subsequent request (same content):
1. DNS → Cache HIT (browser cache, TTL still valid)
2. CDN → Cache HIT → return immediately
   (Steps 3-6 skipped entirely)

Latency: 120ms (first) → 15ms (cached)
💡 Interview Tip
When explaining this flow in an interview, walk through it layer by layer. Mention what happens on cache hits vs misses. Bonus points: explain how each component contributes to both performance (CDN, caching) and reliability (LB failover, WAF protection).
Trade-offs & Decision Making
Every infrastructure decision involves trade-offs. Here are the key decisions you'll face in interviews and real systems.
CDN — When to Use vs Not
| Scenario | Use CDN? | Why |
|---|---|---|
| Static marketing site | Yes | All content is cacheable, global audience benefits from edge delivery |
| User dashboard with personalized data | Partial | Cache static assets (CSS, JS, images) on CDN; serve dynamic data from origin |
| Real-time trading platform | No (for data) | Prices change every millisecond — caching would serve stale data |
| Video streaming platform | Yes | Video files are large and static — CDN reduces bandwidth costs and latency |
| Internal admin tool (10 users) | No | Not worth the complexity for a small, internal audience |
L4 vs L7 Load Balancer
| Scenario | Choose | Why |
|---|---|---|
| Microservices with path-based routing | L7 | Need to route /api/users to User Service, /api/orders to Order Service |
| TCP database connection pooling | L4 | No HTTP to inspect — just forward TCP connections |
| A/B testing via request headers | L7 | Need to read custom headers to route to test vs control |
| High-throughput gaming server | L4 | Raw speed matters; no need to inspect packet contents |
| API gateway with rate limiting | L7 | Need to inspect API keys, paths, and apply per-endpoint limits |
DNS Caching Trade-offs
| Decision | Short TTL (60s) | Long TTL (3600s) |
|---|---|---|
| Failover speed | Fast — clients pick up new IP in ~60s | Slow — clients use stale IP for up to 1 hour |
| DNS query volume | High — more lookups, more latency | Low — fewer lookups, faster page loads |
| Migration flexibility | High — can switch IPs quickly | Low — old IP receives traffic for a long time |
| Best for | Active development, blue-green deploys | Stable production systems |
🎯 Interview Framework
When asked about infrastructure decisions, always frame it as: "It depends on [specific requirement]. If we need X, I'd choose A because... If we need Y instead, I'd choose B because..." Never give a one-size-fits-all answer.
Interview Questions
Conceptual, scenario-based, and edge-case questions you're likely to encounter.
Q: Why is DNS caching important?
A: Without caching, every single page load would require a full DNS resolution — walking the hierarchy from root to TLD to authoritative server. That adds 50-200ms per request. With caching at the browser, OS, and resolver levels, most DNS lookups are instant (0ms). Caching also reduces load on DNS infrastructure. The trade-off is staleness: if you change your server's IP, cached entries won't update until the TTL expires.
Q: What's the difference between a CDN and a cache?
A: A cache is a general concept — storing data closer to where it's needed to avoid recomputation or re-fetching. A CDN is a specific implementation of caching: a globally distributed network of servers that cache content at edge locations close to users. A CDN is a cache, but not all caches are CDNs. Redis is a cache (in-memory, server-side). Your browser has a cache (local). A CDN is a geographically distributed cache.
Q: When would you NOT use a CDN?
A: When content is highly personalized (user-specific dashboards), changes every second (real-time data), is only accessed by a small internal team (admin tools), or when the audience is in a single geographic region and the origin is already close. CDN adds complexity (cache invalidation, configuration) that isn't justified for these cases.
Your global e-commerce site is slow for users in Asia
How would you diagnose and fix this?
Answer: First, check if a CDN is in place — if not, add one with edge locations in Asia. If CDN exists, check cache hit rates for Asian edges (low hit rate = content not being cached). Check DNS resolution time — consider using a DNS provider with servers in Asia. Check if the origin server is in US-only — consider deploying a regional origin or read replica in Asia. Finally, check if the load balancer is routing Asian traffic to the nearest server.
After a deployment, users are seeing the old version of your site
What's happening and how do you fix it?
Answer: The CDN is serving stale cached content. Fix: 1) Purge the CDN cache (immediate but slow at global scale). 2) Better long-term: use cache busting — version your static assets (app.v2.js instead of app.js) so the new URL forces a fresh fetch. 3) Set appropriate Cache-Control headers: short TTL for HTML (changes often), long TTL for versioned assets (URL changes on update).
Your load balancer health checks are passing, but users report errors
What could be wrong?
Answer: The health check endpoint (/health) might return 200 even when the app is partially broken. For example: the health check doesn't verify database connectivity, so the server appears healthy but fails on actual requests. Fix: make health checks meaningful — verify all critical dependencies (DB, cache, external APIs). Also check: is the load balancer routing to a server that's healthy but overloaded? Consider using least-connections instead of round-robin.
Common Mistakes
These misconceptions trip up engineers in interviews and in production systems.
Thinking CDN = only for static files
CDNs can cache API responses, HTML pages, and even dynamic content with the right cache headers. Modern CDNs support edge computing (Cloudflare Workers, Lambda@Edge) that can run logic at the edge — personalization, A/B testing, auth checks — without hitting the origin.
✅ Think of CDN as 'edge caching + edge compute'. Cache static assets aggressively, cache API responses where possible, and use edge functions for lightweight dynamic logic.
Confusing L4 and L7 load balancers
A common interview mistake is saying 'load balancer' without specifying which type. L4 and L7 have fundamentally different capabilities. Saying 'the load balancer routes based on URL path' when you drew an L4 balancer shows a gap in understanding.
✅ Always specify: 'I'd use an L7 load balancer here because we need path-based routing' or 'L4 is sufficient since we just need TCP distribution.' Show you understand the distinction.
Ignoring cache invalidation
Setting up a CDN without a cache invalidation strategy leads to users seeing stale content after deployments. Teams then panic-purge the entire cache, which causes a thundering herd of requests to the origin.
✅ Plan cache invalidation from day one. Use versioned URLs for static assets (app.abc123.js). Set short TTLs for HTML. Use stale-while-revalidate for the best UX. Never rely on manual cache purges as your primary strategy.
Not understanding NAT's role in cloud architecture
Many developers don't realize that their cloud servers use private IPs internally and NAT gateways for outbound internet access. This leads to confusion when debugging connectivity issues — 'why can't my server reach the internet?' — because there's no NAT gateway configured.
✅ Understand that in cloud VPCs: public subnets have internet gateways (direct public IP), private subnets need NAT gateways for outbound access. Backend servers should be in private subnets with NAT, not exposed directly to the internet.
Relying on a single layer of security
Using only a network firewall and assuming you're safe. Network firewalls can't detect SQL injection or XSS — those are valid HTTP requests on allowed ports. Without a WAF, application-level attacks pass right through.
✅ Defense in depth: network firewall (block ports/IPs) + WAF (block attack patterns) + application validation (sanitize inputs) + rate limiting. Each layer catches what the previous one can't.