Infrastructure Primitives
Learn the infrastructure building blocks — DNS resolution, load balancers, CDNs, NAT, and firewalls that form the backbone of every scalable system.
The Big Picture — What Are Infrastructure Primitives?
Infrastructure primitives are the invisible building blocks that make the internet work. Every time you load a website, stream a video, or call an API, a chain of infrastructure components routes, balances, caches, secures, and delivers your request. You never see them, but without them, nothing works at scale.
The City Analogy
Think of the internet as a city. DNS is the phonebook — it translates names ('Amazon HQ') into addresses ('410 Terry Ave'). The load balancer is a traffic police officer — it directs cars to the least congested road. The CDN is a chain of local warehouses — instead of shipping every package from a central factory, you stock popular items in warehouses near each neighborhood. NAT is the office receptionist — the outside world sees one phone number, but internally there are hundreds of extensions. The firewall is the security gate — it checks every person entering the building and blocks anyone unauthorized.
These aren't optional add-ons. They're the foundation. Every system you design in an interview or at work will use some combination of these primitives. Understanding them deeply is what separates someone who can draw boxes on a whiteboard from someone who can actually build scalable systems.
🔥 Key Insight
Infrastructure primitives are not about any single technology. They're about the roles that must be filled in any distributed system: name resolution, traffic distribution, content delivery, address translation, and security enforcement.
How They Work Together
No infrastructure component works in isolation. They form a pipeline that every request passes through. Here's the typical flow when a user visits a website:
User (types a URL) → DNS (resolves name → IP) → CDN (serves cached content) → Load Balancer (routes to server) → Server (processes request)
DNS — The Phonebook
Translates human-readable domain names into IP addresses. Without DNS, you'd need to memorize 52.94.236.248 instead of typing amazon.com.
CDN — The Local Warehouse
Caches content at edge locations close to users. A user in Tokyo gets images from a Tokyo server, not from Virginia.
Load Balancer — The Traffic Police
Distributes incoming requests across multiple servers. Prevents any single server from being overwhelmed.
NAT & Firewall — The Security Gate
NAT translates private IPs to public IPs. Firewalls filter traffic, blocking malicious requests before they reach your servers.
User types: https://shop.example.com/products

1. DNS Resolution
   → Browser asks: "What's the IP of shop.example.com?"
   → DNS returns: 104.18.25.43 (CDN edge IP)

2. CDN Check
   → Request hits CDN edge server in user's region
   → Cache HIT?  → Return cached response (fast!)
   → Cache MISS? → Forward to origin (load balancer)

3. Firewall
   → Request passes through WAF (Web Application Firewall)
   → Checks for SQL injection, XSS, rate limiting
   → Blocked? → Return 403. Allowed? → Continue.

4. Load Balancer
   → Routes request to healthiest backend server
   → Algorithm: least connections / round robin / IP hash

5. Server Processing
   → App server handles business logic
   → Queries database, builds response

6. Response flows back:
   Server → Load Balancer → Firewall → CDN (cache it) → User
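The flow above can be modeled as a handful of stages that a request passes through in order. This is a toy sketch — the `Pipeline` class and its behavior are illustrative, not any real CDN or load balancer API:

```python
class Pipeline:
    """Toy request pipeline (DNS omitted): CDN edge -> WAF -> LB -> server."""

    def __init__(self, servers):
        self.cache = {}          # CDN edge cache: path -> body
        self.servers = servers   # backend pool
        self.next_server = 0     # round-robin cursor

    def handle(self, path, body=""):
        # 1. CDN edge: serve straight from cache on a hit
        if path in self.cache:
            return (200, self.cache[path], "cdn-cache")
        # 2. WAF: block obvious attack payloads (toy substring check)
        if "<script>" in body or "SELECT * FROM" in body:
            return (403, "blocked", "waf")
        # 3. Load balancer: round-robin over the backend pool
        server = self.servers[self.next_server % len(self.servers)]
        self.next_server += 1
        # 4. Origin processes the request; the edge caches the response
        status, response = server(path)
        if status == 200:
            self.cache[path] = response
        return (status, response, "origin")

def app_server(path):
    return 200, f"rendered {path}"

p = Pipeline([app_server])
print(p.handle("/products"))   # first request: served by the origin
print(p.handle("/products"))   # repeat request: served from the CDN cache
```

Note the design choice that mirrors the diagram: the cache check comes before the WAF, so a cached response never touches the origin or its security layers.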
💡 Why This Matters
In system design interviews, you're expected to place these components correctly in your architecture. Knowing where DNS, CDN, load balancers, and firewalls sit in the request flow is foundational.
DNS Resolution — Deep Dive
DNS (Domain Name System) is the internet's phonebook. It translates human-friendly domain names like google.com into machine-friendly IP addresses like 142.250.80.46. Without DNS, you'd need to memorize IP addresses for every website you visit.
The Contact List Analogy
You don't memorize your friend's phone number — you save it as 'Alice' in your contacts. When you tap 'Alice', your phone looks up the number. DNS works the same way: you type 'google.com', and DNS looks up the IP address. And just like your phone caches contacts locally, DNS caches results at multiple levels to avoid looking up the same name repeatedly.
DNS Resolution — Step by Step
Browser Cache
Browser checks its own DNS cache first. If you visited google.com 2 minutes ago, the IP is already cached locally. Cache hit → done, no network request needed.
OS Cache
If the browser cache misses, the OS checks its DNS cache. On macOS/Linux, this is managed by the system resolver. On Windows, it's the DNS Client service. Still local — no network hop.
Recursive Resolver (ISP / Public DNS)
If the OS cache misses, the query goes to a recursive resolver — usually your ISP's DNS server or a public one like 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare). This resolver does the heavy lifting of walking the DNS hierarchy.
Root Name Server
The recursive resolver asks a root server: 'Where can I find .com domains?' There are 13 logical root servers (labeled A through M), each replicated worldwide via anycast. The root server responds: 'Ask the .com TLD server at 192.5.6.30.'
TLD (Top-Level Domain) Server
The resolver asks the .com TLD server: 'Where can I find google.com?' The TLD server responds: 'The authoritative server for google.com is ns1.google.com at 216.239.32.10.'
Authoritative Name Server
The resolver asks Google's authoritative server: 'What's the IP for google.com?' The authoritative server responds: '142.250.80.46' with a TTL (time-to-live) of 300 seconds. The resolver caches this and returns it to the browser.
Browser: "What's the IP of google.com?"

Step 1: Browser cache → MISS
Step 2: OS cache → MISS
Step 3: Recursive resolver (8.8.8.8)
  ├→ Root server (.): "Ask .com TLD at 192.5.6.30"
  ├→ TLD server (.com): "Ask ns1.google.com at 216.239.32.10"
  └→ Authoritative (ns1): "google.com = 142.250.80.46 (TTL: 300s)"

Result cached at:
  → Recursive resolver (for other users too)
  → OS cache
  → Browser cache

Next request within 300s → instant (cache hit)
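The hierarchy walk above can be sketched as a toy recursive resolver. The server tables below are made up for illustration (real resolvers speak the DNS wire protocol and honor TTLs):

```python
# Toy DNS hierarchy: root -> TLD -> authoritative, with resolver caching.
ROOT = {"com": "192.5.6.30"}                             # root: where is each TLD server?
TLD = {"google.com": "216.239.32.10"}                    # .com: who is authoritative?
AUTHORITATIVE = {"google.com": ("142.250.80.46", 300)}   # name -> (IP, TTL seconds)

cache = {}  # the recursive resolver's cache: name -> IP

def resolve(name):
    if name in cache:                  # cache hit: no hierarchy walk at all
        return cache[name], "cache"
    tld = name.rsplit(".", 1)[-1]
    assert tld in ROOT                 # step 1: ask root where the TLD server is
    assert name in TLD                 # step 2: ask TLD who is authoritative
    ip, ttl = AUTHORITATIVE[name]      # step 3: ask authoritative for the record
    cache[name] = ip                   # cache it (a real resolver expires it after `ttl`)
    return ip, "authority"

print(resolve("google.com"))   # ('142.250.80.46', 'authority')
print(resolve("google.com"))   # ('142.250.80.46', 'cache')
```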
TTL — Why Caching Matters
⏱️ Short TTL (30-60 seconds)
- Changes propagate quickly
- Useful during migrations or failovers
- More DNS queries → higher latency
- More load on DNS servers
⏱️ Long TTL (3600+ seconds)
- Fewer DNS queries → lower latency
- Less load on DNS infrastructure
- Changes take longer to propagate
- Stale records during IP changes
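The short-vs-long TTL trade-off comes down to one mechanism: an entry stops being served once its TTL elapses. A minimal sketch, with the clock injected so the behavior is deterministic (the class and names are illustrative):

```python
class TTLCache:
    """Minimal TTL cache: entries expire `ttl` seconds after being set."""

    def __init__(self, clock):
        self.clock = clock       # injected time source, for testability
        self.entries = {}        # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self.entries[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:   # TTL elapsed: entry is stale
            del self.entries[key]
            return None                  # caller must re-resolve
        return value

now = [0]
cache = TTLCache(clock=lambda: now[0])
cache.set("example.com", "1.2.3.4", ttl=60)     # short TTL
assert cache.get("example.com") == "1.2.3.4"    # fresh: served from cache
now[0] = 61                                     # 61 seconds later...
assert cache.get("example.com") is None         # expired: next lookup re-resolves
```

With `ttl=60` a changed IP is picked up within a minute; with `ttl=3600` the stale address can be served for up to an hour — exactly the failover-vs-query-volume trade-off above.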
What Happens When DNS Fails?
DNS is slow
- ❌ Every page load adds 50-200ms of DNS lookup time
- ❌ Users perceive the site as slow (even if the server is fast)
- ❌ Mobile users on cellular networks are hit hardest
- ❌ Cascading effect: every resource (images, scripts) needs DNS too
DNS is down
- ❌ Your domain becomes unreachable — even if servers are healthy
- ❌ Users see 'DNS_PROBE_FINISHED_NXDOMAIN' errors
- ❌ No failover possible unless you have redundant DNS providers
- ❌ The 2016 Dyn DNS attack took down Twitter, Netflix, Reddit
🎯 Interview Insight
DNS is a single point of failure for your entire system. In interviews, mention using multiple DNS providers (e.g., Route 53 + Cloudflare) for redundancy, and pre-warming DNS caches before traffic migrations.
Load Balancers
A load balancer distributes incoming traffic across multiple servers so no single server gets overwhelmed. It's the reason a website with millions of users doesn't run on a single machine.
Traffic Police vs Smart Routing
An L4 load balancer is like a traffic police officer at a highway junction — they see the license plate (IP) and destination (port) and direct cars to the least busy lane. They don't know what's inside the car. An L7 load balancer is like a smart routing system at an airport — it reads your boarding pass (HTTP headers, URL path, cookies) and sends you to the correct terminal, gate, and even priority lane based on your ticket class.
L4 vs L7 — The Two Types
🔌 L4 — Transport Layer
- Routes based on IP address + port number
- Cannot inspect HTTP headers, URLs, or cookies
- Very fast — operates at the TCP/UDP level
- Lower CPU usage, higher throughput
- Use case: TCP proxying, database connections, gaming
🧠 L7 — Application Layer
- Routes based on HTTP headers, URL path, cookies, body
- Can make intelligent routing decisions
- Terminates TLS (decrypts to inspect content)
- Higher CPU usage, more features
- Use case: web apps, API gateways, microservices
| Feature | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Operates at | TCP/UDP (transport layer) | HTTP/HTTPS (application layer) |
| Routing based on | IP + Port | URL path, headers, cookies, body |
| TLS termination | No (passes encrypted traffic through) | Yes (decrypts, inspects, re-encrypts) |
| Speed | Faster (less processing) | Slower (content inspection) |
| Intelligence | Dumb routing | Smart routing (path-based, header-based) |
| Use cases | TCP proxying, databases, gaming | Web apps, APIs, microservices |
| Examples | AWS NLB, HAProxy (TCP mode) | AWS ALB, Nginx, HAProxy (HTTP mode) |
Load Balancing Algorithms
Round Robin
Requests are distributed to servers in order: Server 1 → Server 2 → Server 3 → Server 1 → ... Simple and fair, but doesn't account for server load. A server handling a heavy request gets the same share as an idle one.
Least Connections
Routes to the server with the fewest active connections. Better than round robin when requests have varying processing times. A server finishing fast gets more requests; a server stuck on a heavy query gets fewer.
IP Hash
Hashes the client's IP address to determine which server handles the request. The same client always goes to the same server. Useful for session affinity (sticky sessions) without cookies.
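The three algorithms above fit in a few lines each. This is a sketch with made-up server names and connection counts, not a production balancer:

```python
import hashlib
from itertools import cycle

servers = ["s1", "s2", "s3"]

# Round robin: walk the list in order, wrapping around.
rr = cycle(servers)
def round_robin():
    return next(rr)

# Least connections: pick the server with the fewest active connections.
active = {"s1": 47, "s2": 30, "s3": 12}   # illustrative counts
def least_connections():
    return min(active, key=active.get)

# IP hash: the same client IP always maps to the same server (sticky).
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

print([round_robin() for _ in range(4)])                  # ['s1', 's2', 's3', 's1']
print(least_connections())                                # 's3' (fewest connections)
print(ip_hash("203.0.113.7") == ip_hash("203.0.113.7"))   # True: stable mapping
```

One caveat worth knowing for interviews: plain modulo IP hashing reshuffles most clients when the server list changes; consistent hashing is the usual fix.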
Incoming requests to api.example.com:

/api/users/*    → User Service cluster (3 servers)
/api/payments/* → Payment Service cluster (5 servers)
/api/search/*   → Search Service cluster (8 servers)
/static/*       → CDN origin (2 servers)
/*              → Default backend (2 servers)

This is only possible with L7 — it reads the URL path.
An L4 balancer would send ALL requests to the same pool.
Critical Concepts
Health Checks
Load balancers periodically ping each server (e.g., GET /health). If a server fails to respond, it's removed from the pool. When it recovers, it's added back. This is how failover works.
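The eject-and-readmit behavior can be modeled with a failure counter per server. The threshold of three consecutive failures is an illustrative default (real balancers make both the threshold and the recovery policy configurable):

```python
class Pool:
    """Toy health-checked pool: eject after consecutive failures, readmit on success."""

    UNHEALTHY_AFTER = 3   # consecutive failed checks before ejection (illustrative)

    def __init__(self, servers):
        self.failures = {s: 0 for s in servers}

    def record_check(self, server, ok):
        # a successful check resets the streak; a failure extends it
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy(self):
        return [s for s, f in self.failures.items() if f < self.UNHEALTHY_AFTER]

pool = Pool(["s1", "s2"])
for _ in range(3):
    pool.record_check("s2", ok=False)   # s2 fails 3 checks in a row
print(pool.healthy())                   # ['s1'] — s2 ejected, traffic fails over
pool.record_check("s2", ok=True)        # s2 recovers on the next check
print(pool.healthy())                   # ['s1', 's2'] — s2 readmitted
```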
Sticky Sessions
Some apps store session state on the server. Sticky sessions ensure the same user always hits the same server (via cookies or IP hash). Trade-off: uneven load distribution.
Failover
When a server goes down, the load balancer automatically routes traffic to healthy servers. Users don't notice. This is the primary reliability benefit of load balancing.
🎯 Interview Insight — When to Use L4 vs L7
Use L4 when you need raw speed and don't need to inspect content — TCP proxying, database connection pooling, gaming servers. Use L7 when you need intelligent routing — path-based routing for microservices, A/B testing via headers, rate limiting per API endpoint. Most web applications use L7.
CDN & Edge Caching
A CDN (Content Delivery Network) is a globally distributed network of servers that caches and serves content from locations close to the user. Instead of every request traveling to your origin server in Virginia, a user in Tokyo gets the response from a server in Tokyo.
The Amazon Warehouse Analogy
Amazon doesn't ship every order from one central warehouse. They stock popular items in warehouses near major cities. When you order, the package comes from the nearest warehouse — not from across the country. A CDN does the same thing with web content. Popular images, scripts, and pages are cached at 'edge locations' worldwide. The user gets content from the nearest edge, not from your origin server thousands of miles away.
How CDN Works
First Request (Cache MISS)
User in Tokyo requests an image. The CDN edge in Tokyo doesn't have it yet. The edge forwards the request to the origin server, gets the response, caches it locally, and returns it to the user. Slower this first time.
Subsequent Requests (Cache HIT)
Another user in Tokyo requests the same image. The CDN edge already has it cached. Response is served directly from Tokyo — no round trip to the origin. Latency drops from 200ms to 20ms.
Cache Expiration (TTL)
After the TTL expires, the edge considers the cached content stale. The next request triggers a revalidation with the origin. If the content hasn't changed, the origin responds with '304 Not Modified' and the edge refreshes the TTL.
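The miss → hit → revalidate cycle above can be modeled in a short sketch. The `Edge` class, the injected clock, and the toy `origin` function are all illustrative — real CDNs implement this via HTTP `Cache-Control`, `ETag`, and `If-None-Match` headers:

```python
class Edge:
    """Toy CDN edge cache with TTL and conditional revalidation."""

    def __init__(self, origin, clock, ttl=60):
        self.origin, self.clock, self.ttl = origin, clock, ttl
        self.cache = {}   # path -> (body, etag, fresh_until)

    def get(self, path):
        entry = self.cache.get(path)
        if entry and self.clock() < entry[2]:
            return entry[0], "HIT"                 # fresh: no origin round trip
        if entry:  # stale: revalidate with a conditional request
            status, body, etag = self.origin(path, if_none_match=entry[1])
            if status == 304:  # unchanged: keep the body, just refresh the TTL
                self.cache[path] = (entry[0], entry[1], self.clock() + self.ttl)
                return entry[0], "REVALIDATED"
        else:      # nothing cached: full fetch from origin
            status, body, etag = self.origin(path, if_none_match=None)
        self.cache[path] = (body, etag, self.clock() + self.ttl)
        return body, "MISS"

def origin(path, if_none_match):
    etag = "v1"
    if if_none_match == etag:
        return 304, None, etag      # content unchanged: tiny response, no body
    return 200, f"content of {path}", etag

now = [0]
edge = Edge(origin, clock=lambda: now[0])
print(edge.get("/logo.png")[1])   # MISS  (first fetch goes to the origin)
print(edge.get("/logo.png")[1])   # HIT   (within the 60s TTL)
now[0] = 61
print(edge.get("/logo.png")[1])   # REVALIDATED (origin answered 304)
```

The 304 path is the payoff: after expiry the edge pays one small round trip instead of re-downloading the whole object.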
✅ What CDNs Cache Well (Static Content)
- Images, videos, fonts, CSS, JavaScript bundles
- HTML pages that don't change per user
- API responses that are the same for all users
- Downloads (PDFs, installers, packages)
⚠️ What CDNs Struggle With (Dynamic Content)
- Personalized pages (user dashboard, cart)
- Real-time data (stock prices, live scores)
- Authenticated API responses (per-user data)
- Content that changes every second
Cache Invalidation — The Hard Problem
⚠️ The Hardest Problem in Computer Science
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton. When you update content on your origin, how do you ensure all CDN edges worldwide serve the new version? This is cache invalidation.
| Strategy | How It Works | Trade-off |
|---|---|---|
| TTL-based expiration | Content expires after N seconds. Edge fetches fresh copy. | Simple but users see stale content until TTL expires |
| Cache purge / invalidation | You explicitly tell the CDN to drop cached content. | Immediate but requires API calls; can be slow at global scale |
| Cache busting (versioned URLs) | Change the URL: style.css → style.v2.css or style.css?v=abc123 | Most reliable — new URL = guaranteed fresh fetch. Requires build tooling. |
| Stale-while-revalidate | Serve stale content immediately, revalidate in background. | Best UX (instant response) but briefly serves outdated content |
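Cache busting, the most reliable strategy in the table, is usually automated by build tooling. A minimal sketch of the idea — embed a hash of the file's content in its URL, so any change produces a new URL and a guaranteed fresh fetch:

```python
import hashlib

def busted_url(filename, content: bytes):
    """Return a content-addressed filename, e.g. app.js -> app.1a2b3c4d.js."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, _dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}"

v1 = busted_url("app.js", b"console.log('v1')")
v2 = busted_url("app.js", b"console.log('v2')")
print(v1, v2)   # different content -> different URLs
```

Because each URL's content never changes, versioned assets can be cached with a very long TTL (even `immutable`), and a deployment simply starts referencing the new URLs.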
Geo-Based Routing
User in Tokyo  → DNS resolves to CDN edge in Tokyo  (35.1ms)
User in London → DNS resolves to CDN edge in London (12.4ms)
User in Mumbai → DNS resolves to CDN edge in Mumbai (18.7ms)

How? The CDN provider (Cloudflare, CloudFront, Akamai) uses:
1. Anycast routing — same IP, multiple locations
2. GeoDNS — returns different IPs based on user's location
3. Latency-based routing — routes to the fastest edge

Without CDN: All users → Origin in us-east-1 (Virginia)
  Tokyo user: 180ms | London user: 90ms | Mumbai user: 150ms
With CDN:
  Tokyo user: 35ms | London user: 12ms | Mumbai user: 19ms
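Latency-based routing reduces to a simple selection once the measurements exist: given one user's measured latency to each candidate edge, route to the fastest. A sketch with illustrative numbers (the hard part in practice is collecting and refreshing those measurements, not the selection):

```python
# Measured latencies from one user to each candidate edge, in ms (illustrative).
edges = {"tokyo": 35.1, "london": 12.4, "mumbai": 18.7}

def nearest_edge(latencies):
    """Pick the edge with the lowest measured latency."""
    return min(latencies, key=latencies.get)

print(nearest_edge(edges))   # this user gets routed to the London edge
```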
Netflix Streaming
Netflix uses its own CDN (Open Connect) with servers inside ISP networks. When you stream a show, the video comes from a server literally inside your ISP's data center — not from Netflix's cloud.
Image Delivery
E-commerce sites serve product images from CDN edges. A site with 10M products and users worldwide would collapse without CDN — the origin can't handle that many image requests.
NAT & Firewalls
NAT — Network Address Translation
NAT solves a fundamental problem: there aren't enough public IPv4 addresses for every device on the internet. There are only ~4.3 billion IPv4 addresses, but there are 15+ billion connected devices. NAT lets thousands of devices share a single public IP.
The Office Phone System
An office building has one main phone number (public IP). Inside, there are 500 employees with internal extensions (private IPs: 192.168.1.x). When an employee calls outside, the receptionist (NAT) replaces the internal extension with the main number. When a call comes back, the receptionist routes it to the correct extension. The outside world only ever sees the one main number.
Outbound Request
Your laptop (192.168.1.42) sends a request to google.com. The NAT router replaces the source IP with its public IP (203.0.113.5) and records the mapping: 192.168.1.42:54321 ↔ 203.0.113.5:54321 (in practice the router may also rewrite the port to avoid collisions between internal hosts). The request goes out with the public IP.
Inbound Response
Google's response comes back to 203.0.113.5:54321. The NAT router looks up the mapping and forwards the response to 192.168.1.42:54321. Your laptop receives the response as if it had a direct connection.
Inbound Requests (The Catch)
By default, NAT blocks unsolicited inbound traffic. If someone tries to connect to 203.0.113.5 from outside, NAT doesn't know which internal device to forward to. This is why you need port forwarding to host a server behind NAT.
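The three steps above boil down to a translation table. A toy sketch with illustrative addresses — real NATs also track the remote endpoint, protocol, and timeouts, and often pick a fresh public port:

```python
PUBLIC_IP = "203.0.113.5"
table = {}   # public port -> (private ip, private port)

def outbound(src_ip, src_port):
    """Rewrite the source of an outgoing packet and record the mapping."""
    table[src_port] = (src_ip, src_port)   # real NATs may assign a new public port
    return PUBLIC_IP, src_port             # what the outside world sees

def inbound(dst_port):
    """Forward a response to the internal host — or drop unsolicited traffic."""
    if dst_port not in table:
        return None                        # no mapping: this is the 'catch' above
    return table[dst_port]

print(outbound("192.168.1.42", 54321))     # ('203.0.113.5', 54321)
print(inbound(54321))                      # ('192.168.1.42', 54321) — mapped reply
print(inbound(9999))                       # None — dropped, no port forwarding
```

Port forwarding is simply a manually pre-populated entry in this table, which is why it's required to host a server behind NAT.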
Private IP Ranges (RFC 1918)
- ✅ 10.0.0.0 – 10.255.255.255 (10.x.x.x)
- ✅ 172.16.0.0 – 172.31.255.255
- ✅ 192.168.0.0 – 192.168.255.255 (home networks)
- ✅ These IPs are NOT routable on the public internet
- ✅ Every home/office uses them internally
Why NAT Matters for System Design
- ✅ Cloud VPCs use private IPs internally
- ✅ NAT gateways let private instances access the internet
- ✅ Microservices communicate via private IPs within a VPC
- ✅ Public-facing load balancers have public IPs; backends don't
- ✅ Understanding NAT explains why port mapping exists
Firewalls
A firewall inspects network traffic and decides what to allow and what to block based on predefined rules. It's the security guard at the entrance of your system.
The Security Guard Analogy
A network firewall is like a security guard at a building entrance. They check your ID (IP address, port) and either let you in or turn you away. A WAF (Web Application Firewall) is like a more thorough security checkpoint — they open your bag (HTTP request body), check for weapons (SQL injection, XSS payloads), and verify your invitation (authentication tokens).
🔌 Network-Level Firewall
- Operates at L3/L4 (IP addresses, ports, protocols)
- Rules: "Allow TCP port 443 from any IP"
- Rules: "Block all traffic from 185.x.x.x range"
- Fast — doesn't inspect packet contents
- Examples: AWS Security Groups, iptables
🧠 Application-Level Firewall (WAF)
- Operates at L7 (HTTP requests, headers, body)
- Detects SQL injection, XSS, CSRF attacks
- Rate limiting per IP or per API key
- Geo-blocking (block traffic from specific countries)
- Examples: AWS WAF, Cloudflare WAF, ModSecurity
Network Firewall (Security Group):
  ALLOW TCP 443 from 0.0.0.0/0   ← HTTPS from anywhere
  ALLOW TCP 80  from 0.0.0.0/0   ← HTTP from anywhere (redirect to HTTPS)
  ALLOW TCP 22  from 10.0.0.0/8  ← SSH only from internal network
  DENY  ALL ALL from 0.0.0.0/0   ← Block everything else

WAF Rules:
  BLOCK if request body contains "SELECT * FROM"  ← SQL injection
  BLOCK if request body contains "<script>"       ← XSS attempt
  RATE LIMIT 100 requests/minute per IP           ← DDoS protection
  BLOCK if User-Agent is empty                    ← Bot filtering
  ALLOW if IP is in whitelist                     ← Trusted partners
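The rule sets above can be expressed as a layered check: network-level first (fast, port/IP only), then WAF-level (content inspection). This is a deliberately naive sketch — real WAFs use far more robust detection than substring matching, and the patterns and limits here are illustrative:

```python
ALLOWED_PORTS = {80, 443}
RATE_LIMIT = 100                                 # requests per window per IP
ATTACK_PATTERNS = ["SELECT * FROM", "<script>"]  # toy signatures
request_counts = {}                              # ip -> requests seen this window

def check(ip, port, body, user_agent):
    # Layer 1: network firewall — only ports, IPs, protocols
    if port not in ALLOWED_PORTS:
        return "DENY: port"
    # Layer 2: WAF — inspects the HTTP request itself
    if any(p in body for p in ATTACK_PATTERNS):
        return "BLOCK: attack pattern"
    if not user_agent:
        return "BLOCK: empty User-Agent"
    request_counts[ip] = request_counts.get(ip, 0) + 1
    if request_counts[ip] > RATE_LIMIT:
        return "BLOCK: rate limit"
    return "ALLOW"

print(check("198.51.100.9", 443, "name=alice", "Mozilla/5.0"))                  # ALLOW
print(check("198.51.100.9", 443, "id=1; SELECT * FROM users", "Mozilla/5.0"))   # BLOCK
print(check("198.51.100.9", 22, "", "Mozilla/5.0"))                             # DENY
```

Note the ordering: the cheap L3/L4 check runs before any content inspection, which is exactly why network firewalls sit in front of WAFs.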
🔐 Defense in Depth
Never rely on a single firewall. Use network firewalls to block unwanted ports/IPs, WAF to block application-level attacks, and application-level validation as the last line of defense. Each layer catches what the previous one missed.
End-to-End Flow
Let's trace exactly what happens when you type www.example.com and press Enter — through every infrastructure primitive.
DNS Resolution
Browser checks its cache → OS cache → recursive resolver → root server → .com TLD → authoritative server. Result: 'www.example.com = 104.18.25.43' (this is the CDN edge IP, not the origin). Cached with TTL of 300 seconds.
CDN Edge Check
The IP 104.18.25.43 belongs to a CDN edge server in the user's region. The edge checks its cache: does it have the response for GET /? Cache HIT → return immediately (done in ~15ms). Cache MISS → forward to origin.
Firewall / WAF
The request passes through the WAF. It checks: Is this IP blocked? Does the request contain attack patterns? Is this IP exceeding rate limits? If clean → forward. If suspicious → block with 403.
Load Balancer
The L7 load balancer receives the request. It reads the URL path, checks server health, and routes to the healthiest backend using least-connections algorithm. The request goes to Server #3 (which has 12 active connections vs Server #1's 47).
NAT (Internal Routing)
The backend server sits in a private subnet (10.0.2.15). The load balancer translates the public request to the private IP. The server processes the request — queries the database, builds the HTML response.
Response Flows Back
Server → Load Balancer → WAF → CDN edge (caches the response for future requests) → User's browser. Total time: ~120ms for cache miss, ~15ms for cache hit on subsequent requests.
User types: www.example.com

1. DNS     → "www.example.com = 104.18.25.43" (CDN edge IP)
2. CDN     → Cache MISS → forward to origin
3. WAF     → Request is clean → allow
4. LB (L7) → Route to Server #3 (least connections)
5. NAT     → Public IP → Private IP (10.0.2.15)
6. Server  → Process request, query DB, build HTML
7. Return  → Server → LB → WAF → CDN (cache it) → User

Subsequent request (same content):
1. DNS → Cache HIT (browser cache, TTL still valid)
2. CDN → Cache HIT → return immediately
   (Steps 3-6 skipped entirely)

Latency: 120ms (first) → 15ms (cached)
💡 Interview Tip
When explaining this flow in an interview, walk through it layer by layer. Mention what happens on cache hits vs misses. Bonus points: explain how each component contributes to both performance (CDN, caching) and reliability (LB failover, WAF protection).
Trade-offs & Decision Making
Every infrastructure decision involves trade-offs. Here are the key decisions you'll face in interviews and real systems.
CDN — When to Use vs Not
| Scenario | Use CDN? | Why |
|---|---|---|
| Static marketing site | Yes | All content is cacheable, global audience benefits from edge delivery |
| User dashboard with personalized data | Partial | Cache static assets (CSS, JS, images) on CDN; serve dynamic data from origin |
| Real-time trading platform | No (for data) | Prices change every millisecond — caching would serve stale data |
| Video streaming platform | Yes | Video files are large and static — CDN reduces bandwidth costs and latency |
| Internal admin tool (10 users) | No | Not worth the complexity for a small, internal audience |
L4 vs L7 Load Balancer
| Scenario | Choose | Why |
|---|---|---|
| Microservices with path-based routing | L7 | Need to route /api/users to User Service, /api/orders to Order Service |
| TCP database connection pooling | L4 | No HTTP to inspect — just forward TCP connections |
| A/B testing via request headers | L7 | Need to read custom headers to route to test vs control |
| High-throughput gaming server | L4 | Raw speed matters; no need to inspect packet contents |
| API gateway with rate limiting | L7 | Need to inspect API keys, paths, and apply per-endpoint limits |
DNS Caching Trade-offs
| Decision | Short TTL (60s) | Long TTL (3600s) |
|---|---|---|
| Failover speed | Fast — clients pick up new IP in ~60s | Slow — clients use stale IP for up to 1 hour |
| DNS query volume | High — more lookups, more latency | Low — fewer lookups, faster page loads |
| Migration flexibility | High — can switch IPs quickly | Low — old IP receives traffic for a long time |
| Best for | Active development, blue-green deploys | Stable production systems |
🎯 Interview Framework
When asked about infrastructure decisions, always frame it as: "It depends on [specific requirement]. If we need X, I'd choose A because... If we need Y instead, I'd choose B because..." Never give a one-size-fits-all answer.
Interview Questions
Conceptual, scenario-based, and edge-case questions you're likely to encounter.
Q: Why is DNS caching important?
A: Without caching, every single page load would require a full DNS resolution — walking the hierarchy from root to TLD to authoritative server. That adds 50-200ms per request. With caching at the browser, OS, and resolver levels, most DNS lookups are instant (0ms). Caching also reduces load on DNS infrastructure. The trade-off is staleness: if you change your server's IP, cached entries won't update until the TTL expires.
Q: What's the difference between a CDN and a cache?
A: A cache is a general concept — storing data closer to where it's needed to avoid recomputation or re-fetching. A CDN is a specific implementation of caching: a globally distributed network of servers that cache content at edge locations close to users. A CDN is a cache, but not all caches are CDNs. Redis is a cache (in-memory, server-side). Your browser has a cache (local). A CDN is a geographically distributed cache.
Q: When would you NOT use a CDN?
A: When content is highly personalized (user-specific dashboards), changes every second (real-time data), is only accessed by a small internal team (admin tools), or when the audience is in a single geographic region and the origin is already close. CDN adds complexity (cache invalidation, configuration) that isn't justified for these cases.
Your global e-commerce site is slow for users in Asia
How would you diagnose and fix this?
Answer: First, check if a CDN is in place — if not, add one with edge locations in Asia. If CDN exists, check cache hit rates for Asian edges (low hit rate = content not being cached). Check DNS resolution time — consider using a DNS provider with servers in Asia. Check if the origin server is in US-only — consider deploying a regional origin or read replica in Asia. Finally, check if the load balancer is routing Asian traffic to the nearest server.
After a deployment, users are seeing the old version of your site
What's happening and how do you fix it?
Answer: The CDN is serving stale cached content. Fix: 1) Purge the CDN cache (immediate but slow at global scale). 2) Better long-term: use cache busting — version your static assets (app.v2.js instead of app.js) so the new URL forces a fresh fetch. 3) Set appropriate Cache-Control headers: short TTL for HTML (changes often), long TTL for versioned assets (URL changes on update).
Your load balancer health checks are passing, but users report errors
What could be wrong?
Answer: The health check endpoint (/health) might return 200 even when the app is partially broken. For example: the health check doesn't verify database connectivity, so the server appears healthy but fails on actual requests. Fix: make health checks meaningful — verify all critical dependencies (DB, cache, external APIs). Also check: is the load balancer routing to a server that's healthy but overloaded? Consider using least-connections instead of round-robin.
Common Mistakes
These misconceptions trip up engineers in interviews and in production systems.
Thinking CDN = only for static files
CDNs can cache API responses, HTML pages, and even dynamic content with the right cache headers. Modern CDNs support edge computing (Cloudflare Workers, Lambda@Edge) that can run logic at the edge — personalization, A/B testing, auth checks — without hitting the origin.
✅ Think of CDN as 'edge caching + edge compute'. Cache static assets aggressively, cache API responses where possible, and use edge functions for lightweight dynamic logic.
Confusing L4 and L7 load balancers
A common interview mistake is saying 'load balancer' without specifying which type. L4 and L7 have fundamentally different capabilities. Saying 'the load balancer routes based on URL path' when you drew an L4 balancer shows a gap in understanding.
✅ Always specify: 'I'd use an L7 load balancer here because we need path-based routing' or 'L4 is sufficient since we just need TCP distribution.' Show you understand the distinction.
Ignoring cache invalidation
Setting up a CDN without a cache invalidation strategy leads to users seeing stale content after deployments. Teams then panic-purge the entire cache, which causes a thundering herd of requests to the origin.
✅ Plan cache invalidation from day one. Use versioned URLs for static assets (app.abc123.js). Set short TTLs for HTML. Use stale-while-revalidate for the best UX. Never rely on manual cache purges as your primary strategy.
Not understanding NAT's role in cloud architecture
Many developers don't realize that their cloud servers use private IPs internally and NAT gateways for outbound internet access. This leads to confusion when debugging connectivity issues — 'why can't my server reach the internet?' — because there's no NAT gateway configured.
✅ Understand that in cloud VPCs: public subnets have internet gateways (direct public IP), private subnets need NAT gateways for outbound access. Backend servers should be in private subnets with NAT, not exposed directly to the internet.
Relying on a single layer of security
Using only a network firewall and assuming you're safe. Network firewalls can't detect SQL injection or XSS — those are valid HTTP requests on allowed ports. Without a WAF, application-level attacks pass right through.
✅ Defense in depth: network firewall (block ports/IPs) + WAF (block attack patterns) + application validation (sanitize inputs) + rate limiting. Each layer catches what the previous one can't.