Protocol Stack
Deep dive into the protocol stack — TCP vs UDP, HTTP versions, TLS/HTTPS, and WebSockets. Understand when and why to use each protocol in distributed systems.
The Big Picture — What is a Protocol Stack?
A protocol stack is a set of layered rules that govern how data travels from your browser to a server and back. Each layer has one job, and they work together like an assembly line.
The Postal System Analogy
Imagine sending a gift to a friend. You write a letter (your data), put it in a box (HTTP), lock the box (TLS), hand it to a delivery truck (TCP), and the truck drives on roads (the physical network). Each step adds a layer of handling. The recipient reverses the process — truck arrives, box is unlocked, letter is read. That's exactly how a protocol stack works. Each layer wraps the data with its own instructions, and the receiving side unwraps them in reverse order.
Why not just one protocol that does everything? For the same reason you don't have one person who writes the letter, packs the box, drives the truck, AND builds the road. Separation of concerns makes each layer replaceable, testable, and independently optimizable.
🔥 Key Insight
A protocol stack is not a single thing — it's a layered system where each layer solves one problem and trusts the layer below it to handle the rest.
Layered Breakdown
Every network request you make passes through these layers. Think of them as nested envelopes — each layer wraps the previous one.
Application Layer
Where your app logic lives. HTTP defines how browsers request pages. WebSockets enable real-time communication. This is the layer developers interact with most.
Security Layer
TLS encrypts data before it leaves your machine. It sits between the application and transport layers, ensuring nobody can read your data in transit.
Transport Layer
TCP guarantees delivery and ordering. UDP trades reliability for speed. This layer decides HOW data gets delivered — reliably or fast.
Network / Physical Layer
IP addresses, routing, and the actual wires/radio waves. This layer moves raw packets across the internet. You rarely touch this directly.
How a Request Flows Through the Stack
```
Browser    (your app code)
   ↓
HTTP       (application layer)
   ↓
TLS        (encryption layer)
   ↓
TCP        (transport layer)
   ↓
Internet   (network layer)
```
```
Your browser wants to load https://api.example.com/users

Step 1 — Application Layer (HTTP)
  → Browser creates an HTTP GET request
  → Headers: Host, Accept, Authorization
  → This is your "letter"

Step 2 — Security Layer (TLS)
  → The HTTP request is encrypted
  → Only the server can decrypt it
  → This is "locking the box"

Step 3 — Transport Layer (TCP)
  → Encrypted data is split into segments
  → Each segment gets a sequence number
  → TCP ensures all segments arrive in order
  → This is "the delivery truck"

Step 4 — Network Layer (IP)
  → Segments are wrapped in IP packets
  → Each packet gets source & destination IP
  → Routers forward packets across the internet
  → This is "the road system"

On the server side, the process reverses:
IP → TCP (reassemble) → TLS (decrypt) → HTTP (parse request)
```
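To make the layering concrete, here is a minimal Python sketch that walks the same stack by hand: open a TCP connection, wrap it in TLS, then send a raw HTTP/1.1 request. It assumes example.com as a stand-in server; any HTTPS host works.

```python
import socket
import ssl

HOST = "example.com"  # stand-in server; any HTTPS host works

# Transport layer: the TCP three-way handshake happens inside create_connection()
tcp_sock = socket.create_connection((HOST, 443))

# Security layer: the TLS handshake runs here; everything after is encrypted
ctx = ssl.create_default_context()
tls_sock = ctx.wrap_socket(tcp_sock, server_hostname=HOST)

# Application layer: a raw HTTP/1.1 GET request (the "letter")
request = f"GET / HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n"
tls_sock.sendall(request.encode("ascii"))

# The response climbs back up the stack: IP → TCP → TLS → HTTP
response = b""
while chunk := tls_sock.recv(4096):  # reads until the server closes
    response += chunk
print(response.split(b"\r\n")[0])    # e.g. b'HTTP/1.1 200 OK'
tls_sock.close()
```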
💡 Why This Matters
In system design interviews, understanding which layer handles what helps you reason about where problems occur. Slow response? Could be TCP congestion. Data breach? TLS misconfiguration. Stale data? HTTP caching headers.
TCP vs UDP — Deep Dive
TCP and UDP are the two transport protocols that power the internet. Every piece of data you send uses one of them. The choice between them is one of the most fundamental trade-offs in system design.
🤝 TCP — Transmission Control Protocol
- Connection-oriented (3-way handshake)
- Guarantees delivery — retransmits lost packets
- Guarantees ordering — sequence numbers
- Flow control — prevents overwhelming the receiver
- Congestion control — adapts to network conditions
⚡ UDP — User Datagram Protocol
- Connectionless — fire and forget
- No delivery guarantee — packets can be lost
- No ordering guarantee — packets can arrive out of order
- No flow/congestion control — raw speed
- Minimal overhead — just source port, destination port, length, checksum
TCP — How It Works
3-Way Handshake (Connection Setup)
Client sends SYN → Server replies SYN-ACK → Client sends ACK. This establishes a reliable connection before any data flows. Like calling someone and waiting for them to pick up before talking.
Data Transfer
Data is split into segments, each with a sequence number. The receiver sends ACK for each segment received. If an ACK doesn't come back in time, the sender retransmits. This guarantees every byte arrives.
Connection Teardown (4-Way Handshake)
Either side sends FIN → other side ACKs → sends its own FIN → first side ACKs. Clean shutdown ensures both sides know the conversation is over.
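The whole lifecycle is visible in a few lines of Python's standard socket module; a minimal sketch, where the handshake runs inside connect()/accept(), teardown is triggered by close(), and port 9000 is an arbitrary choice:

```python
import socket
import threading

# Server side: bind and listen first so the client can't race the setup
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 9000))   # arbitrary local port for the demo
srv.listen(1)

def handle_one_client():
    conn, _ = srv.accept()       # returns once the 3-way handshake completes
    data = conn.recv(1024)       # kernel has already ordered/reassembled the bytes
    conn.sendall(data.upper())
    conn.close()                 # initiates the FIN/ACK teardown

threading.Thread(target=handle_one_client, daemon=True).start()

# Client side
cli = socket.create_connection(("127.0.0.1", 9000))  # SYN → SYN-ACK → ACK
cli.sendall(b"hello over tcp")
print(cli.recv(1024))            # b'HELLO OVER TCP'
cli.close()
srv.close()
```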
UDP — How It Works
No Handshake
UDP doesn't establish a connection. It just sends the data immediately. Like shouting a message across a room — you don't wait for acknowledgment.
Datagram Delivery
Each UDP packet (datagram) is independent. It contains the data, source port, destination port, length, and a checksum. That's it. No sequence numbers, no retransmission.
Application Handles the Rest
If you need reliability over UDP, your application must implement it. This is exactly what QUIC (HTTP/3) does — reliability on top of UDP, but smarter than TCP.
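For contrast, the UDP equivalent is a sketch with no handshake at all: each sendto() fires an independent datagram, delivery is not guaranteed, and port 9001 is again an arbitrary choice.

```python
import socket

# "Server": just bind a port and wait for datagrams; no listen(), no accept()
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 9001))

# "Client": fire-and-forget; no connection is ever established
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"position update #42", ("127.0.0.1", 9001))

data, addr = rx.recvfrom(1024)   # each datagram arrives whole, or not at all
print(data, addr)
tx.close()
rx.close()
```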
| Feature | TCP | UDP |
|---|---|---|
| Connection | Connection-oriented (handshake) | Connectionless |
| Reliability | Guaranteed delivery | Best-effort, packets can be lost |
| Ordering | Guaranteed (sequence numbers) | No ordering guarantee |
| Speed | Slower (overhead from guarantees) | Faster (minimal overhead) |
| Header Size | 20–60 bytes | 8 bytes |
| Flow Control | Yes (sliding window) | No |
| Use Cases | APIs, file transfer, email, web | Streaming, gaming, DNS, VoIP |
Phone Call vs Walkie-Talkie
TCP is like a phone call — you dial, wait for the other person to pick up, have a two-way conversation, and hang up properly. UDP is like a walkie-talkie — you press the button and talk. Maybe they hear you, maybe they don't. But it's instant and there's no setup delay.
When to use TCP
- ✅ REST APIs — every request must arrive completely
- ✅ File downloads — missing bytes = corrupted file
- ✅ Email (SMTP) — messages must be delivered reliably
- ✅ Database connections — queries must not be lost
- ✅ Any scenario where data integrity > speed
When to use UDP
- ✅ Video streaming — a dropped frame is better than a delayed one
- ✅ Online gaming — stale position data is useless
- ✅ DNS lookups — small, single request-response
- ✅ VoIP calls — real-time audio can't wait for retransmits
- ✅ Any scenario where speed > completeness
⚠️ Common Misconception
UDP is NOT always faster than TCP. TCP with keep-alive connections and modern optimizations can be very fast. UDP is faster only when you genuinely don't need reliability guarantees and the overhead of TCP's handshake/retransmission matters.
HTTP Evolution
HTTP is the application-layer protocol that powers the web. It has evolved dramatically — each version solving the previous one's biggest bottleneck.
The Highway Analogy
HTTP/1.1 is a single-lane road — one car at a time, and if one car breaks down, everything stops. HTTP/2 is a multi-lane highway — many cars travel simultaneously on the same road. HTTP/3 is a highway where each car has its own independent lane — if one car has a flat tire, the others keep moving without slowing down.
HTTP/1.1 (1997)
What it solved
HTTP/1.0 opened a new TCP connection for every single request. HTTP/1.1 introduced persistent connections (keep-alive) so multiple requests could reuse the same connection.
Key Features
- ✅ Persistent connections (Connection: keep-alive)
- ✅ Chunked transfer encoding
- ✅ Host header (virtual hosting)
- ✅ Caching headers (ETag, If-Modified-Since)
- ✅ Pipelining (in theory — rarely used)
Limitations
- ❌ Head-of-line blocking — requests are sequential per connection
- ❌ Text-based headers — verbose and uncompressed
- ❌ Workarounds needed: domain sharding, sprite sheets, bundling
- ❌ 6 TCP connections per domain (browser limit)
- ❌ No server push capability
🚧 Head-of-Line Blocking (HTTP/1.1)
If you send 5 requests on one connection, request #2 can't start until request #1 finishes. One slow response blocks everything behind it — like a slow car in a single-lane tunnel.
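Persistent connections are easy to see with Python's standard http.client: both requests below reuse one TCP connection (assuming the server honors keep-alive), and the loop also illustrates the sequential request-response cycle described above. example.com is a stand-in host.

```python
import http.client

# One TCP connection, reused across requests via HTTP/1.1 keep-alive
conn = http.client.HTTPSConnection("example.com")

for path in ("/", "/index.html"):   # sequential: request #2 waits for #1
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()              # must drain the body before reusing the socket
    print(path, resp.status, len(body))

conn.close()
```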
HTTP/2 (2015)
What it solved
HTTP/1.1's head-of-line blocking and verbose headers. HTTP/2 introduced multiplexing — multiple requests and responses can fly over a single TCP connection simultaneously.
Key Features
- ✅ Multiplexing — multiple streams on one connection
- ✅ Binary framing — more efficient than text
- ✅ Header compression (HPACK)
- ✅ Server push — server can send resources proactively
- ✅ Stream prioritization
Limitations
- ❌ Still runs on TCP — TCP-level head-of-line blocking remains
- ❌ One lost packet blocks ALL streams (TCP retransmission)
- ❌ TLS is practically required (browsers enforce it)
- ❌ Server push is rarely used in practice
- ❌ Complex implementation compared to HTTP/1.1
```
HTTP/1.1 (Sequential):

  Connection 1: GET /style.css → wait → response → GET /app.js    → wait → response
  Connection 2: GET /image.png → wait → response → GET /font.woff → wait → response
  (Browser opens up to 6 connections to work around the bottleneck)

HTTP/2 (Multiplexed):

  Connection 1:
    Stream 1: GET /style.css  ──────→ response
    Stream 2: GET /app.js     ──────→ response
    Stream 3: GET /image.png  ──────→ response
    Stream 4: GET /font.woff  ──────→ response
  (All on ONE connection, interleaved as binary frames)
```
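As a sketch of what multiplexing looks like from application code, here is the same fan-out using the third-party httpx library with HTTP/2 enabled (pip install httpx[http2]); the resource URLs are hypothetical.

```python
import asyncio
import httpx  # third-party: pip install httpx[http2]

async def fetch_all() -> None:
    # One client, one connection; with http2=True the concurrent requests
    # below are interleaved as streams rather than queued sequentially.
    async with httpx.AsyncClient(http2=True) as client:
        urls = [f"https://example.com/{name}"  # hypothetical resources
                for name in ("style.css", "app.js", "image.png", "font.woff")]
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for resp in responses:
            print(resp.url, resp.status_code, resp.http_version)

asyncio.run(fetch_all())
```

If the server only speaks HTTP/1.1, httpx falls back automatically, so the printed http_version shows what was actually negotiated.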
HTTP/3 (2022)
What it solved
HTTP/2's TCP-level head-of-line blocking. HTTP/3 replaces TCP with QUIC (built on UDP), so a lost packet in one stream doesn't block other streams.
Key Features
- ✅ Built on QUIC (UDP-based transport)
- ✅ No TCP head-of-line blocking — streams are independent
- ✅ 0-RTT connection establishment (for repeat visits)
- ✅ Built-in encryption (TLS 1.3 integrated into QUIC)
- ✅ Connection migration (survives network changes)
Trade-offs
- ❌ UDP is sometimes blocked by firewalls/middleboxes
- ❌ Higher CPU usage (encryption in userspace)
- ❌ Less mature tooling and debugging support
- ❌ Fallback to HTTP/2 needed for compatibility
- ❌ More complex to implement and deploy
🔥 Why HTTP/3 Uses UDP — The Critical Insight
TCP's reliability is baked into the kernel — you can't change how it handles packet loss. When one packet is lost, TCP blocks ALL data until that packet is retransmitted. QUIC builds its own reliability on top of UDP, but per-stream. Lost packet in stream A? Only stream A waits. Streams B, C, D keep flowing. This is why HTTP/3 chose UDP — not because it's faster, but because it's more flexible.
| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|
| Transport | TCP | TCP | QUIC (UDP) |
| Multiplexing | No (1 request at a time) | Yes (streams) | Yes (independent streams) |
| Head-of-Line Blocking | HTTP-level | TCP-level | None |
| Header Format | Text | Binary (HPACK) | Binary (QPACK) |
| Connection Setup | TCP + TLS (2-3 RTT) | TCP + TLS (2-3 RTT) | 0-1 RTT (TLS built-in) |
| Encryption | Optional | Practically required | Always (built into QUIC) |
| Connection Migration | No | No | Yes (connection ID based) |
TLS & HTTPS
TLS (Transport Layer Security) is the protocol that encrypts data between your browser and a server. HTTPS is simply HTTP + TLS. Without it, anyone on the network can read your passwords, tokens, and data in plain text.
The Locked Box Analogy
Imagine you want to send a secret message to a friend, but the postal worker can read anything. Solution: your friend sends you an open padlock (public key). You put your message in a box, lock it with their padlock, and send it. Only your friend has the key (private key) to open it. Even the postal worker can't read it. After this initial exchange, you both agree on a shared secret key (symmetric key) for faster communication — like having a shared combination lock.
TLS Handshake — Step by Step
Client Hello
Browser sends: 'Hey, I want a secure connection. Here are the TLS versions and cipher suites I support.' This includes a random number used later for key generation.
Server Hello + Certificate
Server responds: 'Let's use TLS 1.3 with this cipher suite. Here's my certificate (containing my public key), signed by a trusted Certificate Authority.' The browser verifies the certificate is legitimate.
Key Exchange
Both sides run a key exchange algorithm (typically Diffie-Hellman) to agree on a shared secret; the server's certificate and public key authenticate the exchange so an attacker can't substitute their own keys. The shared secret is used to derive symmetric encryption keys. The public/private keys are only used during this initial exchange.
Secure Communication Begins
From now on, all data is encrypted with the symmetric key. Symmetric encryption is much faster than asymmetric — that's why we use the public key only for the initial handshake.
🔑 Asymmetric Encryption (Public/Private Key)
- Two keys: public (shared) + private (secret)
- Encrypt with public key → only private key can decrypt
- Slow — used only for the initial handshake
- Solves the "how do we share a secret over an insecure channel?" problem
🔐 Symmetric Encryption (Shared Key)
- One shared key for both encryption and decryption
- Fast — used for all data after the handshake
- AES-256 is the most common algorithm
- The key is derived during the TLS handshake
```
Client (Browser)                              Server
   |                                             |
   |──── Client Hello ──────────────────────────→|
   |     (TLS version, cipher suites,            |
   |      client random)                         |
   |                                             |
   |←──── Server Hello + Certificate ────────────|
   |     (chosen cipher, server random,          |
   |      public key certificate)                |
   |                                             |
   |──── Key Exchange ──────────────────────────→|
   |     (pre-master secret encrypted            |
   |      with server's public key)              |
   |                                             |
   |←──── Finished ──────────────────────────────|
   |                                             |
   |══════ Encrypted Data (AES) ═════════════════|
   |     (all traffic now uses the               |
   |      derived symmetric key)                 |
```

(Classic RSA key exchange shown for clarity; TLS 1.3 instead sends Diffie-Hellman key shares inside the Hello messages, cutting the handshake to one round trip.)
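You can inspect the outcome of this handshake with Python's standard ssl module; a small sketch, with example.com as a stand-in server:

```python
import socket
import ssl

HOST = "example.com"  # stand-in server

ctx = ssl.create_default_context()  # verifies the certificate chain + hostname
with socket.create_connection((HOST, 443)) as tcp:
    with ctx.wrap_socket(tcp, server_hostname=HOST) as tls:  # handshake runs here
        print(tls.version())      # e.g. 'TLSv1.3'
        print(tls.cipher())       # e.g. ('TLS_AES_256_GCM_SHA384', 'TLSv1.3', 256)
        cert = tls.getpeercert()  # the parsed certificate (authentication)
        print(cert["subject"])
```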
⚠️ TLS sits between HTTP and TCP
TLS is not a replacement for TCP. It's a layer on top of TCP and below HTTP. The stack is: HTTP → TLS → TCP → IP. TLS encrypts the HTTP data, then TCP handles reliable delivery of the encrypted bytes.
WebSockets
HTTP follows a request-response model — the client asks, the server answers. But what if the server needs to push data to the client without being asked? That's the problem WebSockets solve.
Walkie-Talkie vs Letter Exchange
HTTP is like exchanging letters — you send a letter, wait for a reply, send another letter. WebSockets are like a walkie-talkie — once the channel is open, both sides can talk whenever they want, instantly, without waiting for the other to finish.
How WebSockets Work
HTTP Upgrade Handshake
The connection starts as a normal HTTP request with an 'Upgrade: websocket' header. The server responds with '101 Switching Protocols'. This is the only HTTP exchange — after this, the connection switches to the WebSocket protocol.
Persistent Full-Duplex Connection
Both client and server can send messages at any time, independently. No request-response cycle. No headers on every message. Just raw, framed data flowing both ways over a single TCP connection.
Close Handshake
Either side can initiate a close by sending a close frame. The other side acknowledges, and the TCP connection is torn down cleanly.
```
Client → Server (HTTP Request):

  GET /chat HTTP/1.1
  Host: server.example.com
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
  Sec-WebSocket-Version: 13

Server → Client (HTTP Response):

  HTTP/1.1 101 Switching Protocols
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

After this → pure WebSocket frames (no more HTTP)
```
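The Accept value above is not random: per RFC 6455, the server appends a fixed GUID to the client's key and returns the base64-encoded SHA-1 hash. A few lines of Python reproduce the exchange shown:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header for a given client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=  (matches the response above)
```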
Real-World Use Cases
Chat Applications
Messages need to appear instantly for all participants. Polling would create unacceptable latency and server load.
Live Trading Dashboards
Stock prices change multiple times per second. The server pushes price updates the moment they happen.
Real-Time Notifications
Push notifications, live scores, collaborative editing — any scenario where the server needs to initiate communication.
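As a usage sketch, a minimal chat client built on the third-party websockets library (pip install websockets) might look like the following; the URL and message format are hypothetical.

```python
import asyncio
import websockets  # third-party: pip install websockets

async def chat():
    # The HTTP Upgrade handshake happens inside connect()
    async with websockets.connect("wss://chat.example.com/room/42") as ws:
        await ws.send("hello, room!")   # client pushes whenever it wants
        async for message in ws:        # server pushes whenever it wants
            print("received:", message)

asyncio.run(chat())
```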
WebSockets vs HTTP Polling vs Long Polling
| Approach | How It Works | Latency | Server Load | Best For |
|---|---|---|---|---|
| HTTP Polling | Client sends requests every N seconds | High (up to N seconds delay) | High (many empty responses) | Simple dashboards, low-frequency updates |
| Long Polling | Client sends request, server holds it until data is available | Medium (near real-time) | Medium (connections held open) | Notifications, moderate real-time needs |
| WebSockets | Persistent connection, both sides push anytime | Low (instant) | Low (one connection, no repeated headers) | Chat, gaming, live data, collaboration |
💡 When NOT to Use WebSockets
If your data updates every 30 seconds, simple HTTP polling is fine. WebSockets add complexity — connection management, reconnection logic, load balancer configuration. Use them when you genuinely need sub-second, bidirectional communication.
End-to-End Flow
Let's trace exactly what happens when you type https://amazon.com in your browser and press Enter.
DNS Resolution
Browser asks: 'What's the IP address of amazon.com?' → DNS server responds: '52.94.236.248'. This usually uses UDP (small, single request-response). The result is cached locally for future requests.
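This step is easy to reproduce with the standard library; a quick sketch (the addresses returned will vary):

```python
import socket

# Resolve a hostname to candidate IP addresses; the OS resolver typically
# sends the DNS query over UDP and caches the answer
for family, _, _, _, sockaddr in socket.getaddrinfo("amazon.com", 443,
                                                    proto=socket.IPPROTO_TCP):
    print(family.name, sockaddr[0])   # e.g. AF_INET 52.94.236.248
```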
TCP Handshake
Browser opens a TCP connection to 52.94.236.248:443. Three-way handshake: SYN → SYN-ACK → ACK. This takes 1 round-trip time (RTT). Now we have a reliable connection.
TLS Handshake
Browser and server negotiate encryption. Client Hello → Server Hello + Certificate → Key Exchange → Finished. This takes 1-2 additional RTTs. Now all data is encrypted.
HTTP Request
Browser sends: GET / HTTP/2 with headers (Host, Accept, User-Agent, cookies). This is encrypted by TLS, split into TCP segments, wrapped in IP packets, and sent across the internet.
Server Processing
Amazon's load balancer routes the request to a web server. The server processes the request — fetches data from caches/databases, renders HTML, and prepares the response.
HTTP Response
Server sends back: HTTP/2 200 OK with HTML content, headers (Content-Type, Cache-Control, Set-Cookie). The response travels back through the same layers in reverse.
Browser Renders
Browser parses HTML, discovers CSS/JS/image resources, and makes additional HTTP requests for each (multiplexed over the same HTTP/2 connection). The page renders progressively.
```
User types: https://amazon.com

1. DNS     [UDP]  → "What IP is amazon.com?" → 52.94.236.248
2. TCP     [SYN]  → 3-way handshake with 52.94.236.248:443
3. TLS     [1.3]  → Negotiate encryption, verify certificate
4. HTTP/2  [GET]  → GET / (encrypted, multiplexed)
5. Server         → Process request, build response
6. HTTP/2  [200]  → HTML response (encrypted, multiplexed)
7. Browser        → Parse HTML, request CSS/JS/images (same connection)

Total connection setup:    ~3 RTTs (DNS + TCP + TLS)
With HTTP/3 (QUIC):        ~1 RTT  (transport + TLS handshakes combined)
With 0-RTT (repeat visit): ~0 RTT  (cached keys)
```
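You can measure the first three phases yourself from Python's standard library; a rough sketch (results depend entirely on your network, and amazon.com is just the running example):

```python
import socket
import ssl
import time

HOST = "amazon.com"

t0 = time.perf_counter()
ip = socket.getaddrinfo(HOST, 443, proto=socket.IPPROTO_TCP)[0][4][0]  # DNS
t1 = time.perf_counter()
tcp = socket.create_connection((ip, 443))                              # TCP handshake
t2 = time.perf_counter()
ctx = ssl.create_default_context()
tls = ctx.wrap_socket(tcp, server_hostname=HOST)                       # TLS handshake
t3 = time.perf_counter()

print(f"DNS: {(t1 - t0) * 1000:6.1f} ms")
print(f"TCP: {(t2 - t1) * 1000:6.1f} ms   (~1 RTT)")
print(f"TLS: {(t3 - t2) * 1000:6.1f} ms   (~1-2 RTTs)")
tls.close()
```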
💡 Interview Tip
This end-to-end flow is one of the most commonly asked system design questions. Practice explaining it layer by layer. Bonus points if you mention how HTTP/3 reduces the RTT count by combining TCP and TLS into QUIC's handshake.
Trade-offs & Decision Making
In system design interviews, you're not just expected to know what each protocol does — you need to justify why you'd choose one over another for a specific scenario.
TCP vs UDP — Decision Framework
| Scenario | Choose | Why |
|---|---|---|
| REST API backend | TCP | Every request/response must arrive completely and in order |
| Video streaming (YouTube) | UDP (via QUIC) | Dropped frames are acceptable; low latency matters more |
| Online multiplayer game | UDP | Player positions must be real-time; stale data is useless |
| File upload service | TCP | Missing bytes = corrupted file; reliability is non-negotiable |
| DNS queries | UDP | Small payload, single round-trip; TCP overhead isn't worth it |
| Database replication | TCP | Data consistency is critical; every byte must arrive |
HTTP vs WebSockets — Decision Framework
| Scenario | Choose | Why |
|---|---|---|
| E-commerce product page | HTTP | Static content, request-response is perfect |
| Real-time chat | WebSockets | Messages must appear instantly for all users |
| Dashboard (updates every 30s) | HTTP Polling | Low frequency; WebSocket complexity isn't justified |
| Live stock ticker | WebSockets | Prices change multiple times per second |
| Notifications (occasional) | Long Polling or SSE | Server-initiated but not high frequency |
| Collaborative document editing | WebSockets | Every keystroke must sync in real-time |
HTTP/2 vs HTTP/3 — Decision Framework
| Scenario | Choose | Why |
|---|---|---|
| Standard web application | HTTP/2 | Widely supported, good multiplexing, proven |
| Mobile users on flaky networks | HTTP/3 | Connection migration survives network switches |
| High-latency connections | HTTP/3 | 0-RTT reduces connection setup time |
| Enterprise/internal services | HTTP/2 | Firewall compatibility, mature tooling |
| Global CDN edge delivery | HTTP/3 | Reduced latency for users worldwide |
🎯 Interview Framework
When asked "which protocol would you use?" — always state the trade-off explicitly: "I'd choose X because in this scenario, [property A] matters more than [property B]." Never just name a protocol without justifying the trade-off.
Interview Questions
These cover conceptual, scenario-based, and trick questions you might encounter.
Q: Why does HTTP/3 use UDP instead of TCP?
A: Not because UDP is faster — but because TCP's reliability is implemented in the OS kernel and can't be customized. When a packet is lost in TCP, ALL streams are blocked until retransmission (head-of-line blocking). QUIC builds its own reliability on top of UDP, but per-stream. A lost packet in stream A only blocks stream A — streams B, C, D continue unaffected. QUIC also integrates TLS 1.3 directly, reducing connection setup from 3 RTTs to 1.
Q: Why is TCP slow for real-time applications?
A: TCP guarantees delivery and ordering. If packet #5 is lost, TCP holds packets #6, #7, #8 in a buffer until #5 is retransmitted and received. For real-time apps (gaming, video calls), this creates unacceptable latency — by the time the retransmitted packet arrives, the data is stale. UDP lets the application decide what to do with missing data (usually: skip it and move on).
Q: Can you have reliability over UDP?
A: Yes. QUIC (used by HTTP/3) is the best example. It implements reliable delivery, ordering, congestion control, and encryption on top of UDP — but in userspace, not the kernel. This means it can be updated and optimized without waiting for OS updates. Other examples: WebRTC's SCTP over UDP, and custom game networking protocols.
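A toy sketch of the idea using only the standard library: a stop-and-wait sender that numbers each datagram and retransmits until it hears an ACK. Real protocols like QUIC layer windowing, congestion control, and encryption on top of the same principle.

```python
import socket

def send_reliably(sock: socket.socket, dest: tuple, payload: bytes, seq: int,
                  timeout: float = 0.2, max_retries: int = 5) -> None:
    """Stop-and-wait reliability on top of UDP: retransmit until ACKed."""
    sock.settimeout(timeout)
    packet = seq.to_bytes(4, "big") + payload      # prepend a sequence number
    for _ in range(max_retries):
        sock.sendto(packet, dest)
        try:
            ack, _ = sock.recvfrom(4)
            if int.from_bytes(ack, "big") == seq:  # correct ACK → delivered
                return
        except socket.timeout:
            continue                               # lost data or lost ACK: retry
    raise ConnectionError(f"segment {seq} was never acknowledged")
```

A matching receiver would echo back the 4-byte sequence number of each datagram it accepts and discard duplicates, since a retransmit can arrive after its ACK was lost.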
You're designing a live auction platform
Which protocol would you use for bid updates?
Answer: WebSockets for real-time bid updates — bids must appear instantly for all participants. The server pushes every new bid to all connected clients. HTTP polling would create unacceptable latency (someone could bid on a stale price). For the REST API (placing bids, fetching auction details), standard HTTP/2 over TCP is fine.
Your API responses are slow on mobile networks in India
How would you improve the connection performance?
Answer: Migrate to HTTP/3 (QUIC). Mobile networks have high latency and frequent network switches (WiFi → 4G). HTTP/3's 0-RTT connection setup reduces initial load time. Connection migration means the connection survives when the user switches networks. Also consider: CDN edge servers closer to users, response compression, and reducing payload sizes.
A developer says 'Let's use WebSockets for everything'
What's wrong with this approach?
Answer: WebSockets add significant complexity: connection state management, reconnection logic, load balancer configuration (sticky sessions or connection-aware routing), and no built-in caching. For standard request-response patterns (fetching a user profile, submitting a form), HTTP is simpler, cacheable, and well-supported by every tool in the ecosystem. Use WebSockets only when you need persistent, bidirectional, real-time communication.
Common Mistakes
These misconceptions trip up engineers in interviews and in production.
Thinking UDP = always faster
UDP has less overhead, but 'faster' depends on context. TCP with keep-alive connections, HTTP/2 multiplexing, and modern congestion control is extremely fast for most use cases. UDP is only meaningfully faster when you genuinely don't need reliability and the handshake overhead matters (real-time streaming, gaming).
✅ Ask: 'Do I need guaranteed delivery?' If yes → TCP. If stale/missing data is acceptable → consider UDP.
Confusing HTTPS with 'just encryption'
HTTPS (TLS) provides three things: encryption (nobody can read the data), authentication (you're talking to the real server, not an impersonator), and integrity (data wasn't tampered with in transit). Many people only think about encryption and forget the other two.
✅ Remember: TLS = Encryption + Authentication + Integrity. The certificate verification step is what prevents man-in-the-middle attacks.
Misunderstanding WebSockets vs HTTP
WebSockets are NOT a replacement for HTTP. They solve a specific problem: persistent, bidirectional communication. Using WebSockets for a REST API would lose caching, standard error handling, middleware support, and the entire HTTP ecosystem.
✅ Use HTTP for request-response patterns. Use WebSockets only when the server needs to push data to the client in real-time without being asked.
Ignoring HTTP/2's TCP head-of-line blocking
Many developers think HTTP/2 solved all head-of-line blocking. It solved HTTP-level blocking (multiple streams on one connection), but TCP-level blocking remains. One lost TCP packet blocks ALL HTTP/2 streams until retransmission.
✅ Understand the difference: HTTP/2 fixes application-layer HOL blocking. HTTP/3 (QUIC) fixes transport-layer HOL blocking by running independent streams over UDP.
Assuming HTTP/3 is always better than HTTP/2
HTTP/3 has real trade-offs: some firewalls block UDP, CPU usage is higher (userspace encryption), debugging tools are less mature, and you need HTTP/2 fallback anyway.
✅ Default to HTTP/2 for most applications. Consider HTTP/3 when you have mobile users on flaky networks, high-latency connections, or need connection migration.