How TLS (HTTPS) Actually Works
A complete, beginner-friendly guide to TLS — from certificates and key exchange to why MITM attacks fail. No hand-waving, no vague 'it's encrypted' explanations.
Why TLS Exists
When you visit a website over plain HTTP, every single byte you send and receive travels across the internet in plain text. Your passwords, credit card numbers, personal messages — all readable by anyone sitting between you and the server.
You (Browser)                 The Internet                    Server
     |                                                           |
     |── GET /login ────────────────────────────────────────────→|
     |   username=alice&password=secret123                       |
     |                                                           |
     |   👁️ Anyone on the network can read this                  |
     |   (WiFi operator, ISP, router admin, attacker)            |
     |                                                           |
     |←── 200 OK ────────────────────────────────────────────────|
     |   { "token": "eyJhbGciOi..." }                            |
     |                                                           |
     |   👁️ Token is also visible, so an attacker can steal it   |
This isn't a theoretical problem. On public WiFi, anyone running a packet sniffer (like Wireshark) can see exactly what you're sending. Your ISP can read every HTTP request. Government surveillance programs can tap into backbone routers.
🔥 The Core Problem
HTTP has no confidentiality (anyone can read), no integrity (anyone can modify), and no authentication (you can't verify who you're talking to). TLS solves all three.
HTTPS is simply HTTP wrapped in TLS (Transport Layer Security). TLS provides:
Confidentiality
Data is encrypted. Only the intended recipient can read it. Even if someone intercepts the packets, they see gibberish.
Integrity
Data cannot be tampered with in transit. If even one bit is changed, the recipient detects it and rejects the message.
Authentication
You can verify you're actually talking to the real server (e.g., google.com) and not an impersonator.
High-Level TLS Handshake Overview
Before any encrypted data flows, the browser and server need to agree on how to communicate securely. This negotiation is called the TLS handshake. Think of it as two strangers meeting and establishing a secret language before sharing sensitive information.
Client Hello
Browser introduces itself
Server Hello
Server responds with cert
Verify Cert
Browser checks identity
Key Exchange
Generate shared secret
Encrypted!
Secure communication
Client Hello
The browser says: 'Hi! I want a secure connection. Here are the TLS versions and encryption algorithms I support, plus some random data.'
Server Hello + Certificate
The server responds: 'Great, let's use TLS 1.3 with this cipher suite. Here's my certificate — it contains my public key and proves I'm really who I say I am.'
Certificate Verification
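The browser checks the certificate: Is it signed by a trusted Certificate Authority? Is it for the right domain? Has it expired? These checks run locally against the browser's trust store, so no call to the CA is needed to verify the signature (optional revocation checks such as OCSP are a separate step).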
Key Exchange
Both sides use a clever mathematical trick (Diffie-Hellman) to generate a shared secret key — without ever sending that key over the network. Even someone watching every packet can't figure out the key.
Secure Communication
From now on, all data is encrypted with the shared secret using fast symmetric encryption (like AES). The handshake is done — everything is private.
💡 Key Insight
The handshake uses slow asymmetric cryptography (public/private keys) just once to establish a shared secret. Then all actual data uses fast symmetric encryption. This is why TLS is fast despite using complex cryptography.
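To see the result of a handshake from code, here is a minimal sketch using Python's standard `ssl` module. The helper names `make_client_context` and `describe_tls` are illustrative, not part of any library, and `describe_tls` needs network access, so treat it as something to try rather than a guaranteed result.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context with certificate and hostname checks enabled."""
    # create_default_context() loads the system trust store and enables
    # certificate verification and hostname checking for client use.
    return ssl.create_default_context()


def describe_tls(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Run a TLS handshake with `host` and report what was negotiated."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # wrap_socket performs the handshake. server_hostname sets SNI and
        # is also the name the certificate is checked against.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            name, _protocol, bits = tls.cipher()
            return {"version": tls.version(), "cipher": name, "key_bits": bits}
```

Calling `describe_tls("example.com")` on a machine with internet access should return the negotiated protocol version and cipher suite. If certificate verification fails, `wrap_socket` raises an `ssl.SSLError` subclass instead of returning.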
Deep Dive: Certificate & Trust Model
This is where most people get confused. Let's break it down completely.
What's Inside a Certificate?
A TLS certificate is essentially a digital ID card. It contains:
┌─────────────────────────────────────────────┐
│               TLS Certificate               │
├─────────────────────────────────────────────┤
│ Subject:     google.com                     │
│ Public Key:  [server's public key]          │
│ Issuer:      Let's Encrypt Authority X3     │
│ Valid From:  2025-01-01                     │
│ Valid Until: 2025-04-01                     │
│ Signature:   [CA's digital signature]       │
│ Serial No:   04:A3:7B:...                   │
└─────────────────────────────────────────────┘

The signature is the critical part. It's created by the CA using their PRIVATE key. Anyone can verify it using the CA's PUBLIC key.
The Chain of Trust
Certificates don't exist in isolation. They form a chain:
Root CA (pre-installed in your browser/OS)
 │
 │ signs ──→ Intermediate CA Certificate
 │            │
 │            │ signs ──→ Server Certificate (google.com)
 │            │            │
 │            │            └── Contains google.com's public key
 │            │
 │            └── Why? Root CAs are too valuable to use directly.
 │                If an intermediate is compromised, only it is revoked.
 │
 └── Your browser ships with ~150 trusted root CAs.
     (Apple, Microsoft, Mozilla each maintain their own list)
When your browser receives a certificate from google.com, it walks up the chain:
Check the server certificate
Is it for the right domain (google.com)? Is it within its validity period? Is the signature valid?
Check who signed it
The server cert was signed by an intermediate CA. The browser checks: do I have this intermediate's certificate? Is its signature valid?
Walk up to the root
The intermediate was signed by a root CA. The browser checks: is this root CA in my pre-installed trust store? If yes — the entire chain is trusted.
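The three checks above can be sketched as a loop that walks from the server certificate up to a trusted root. This is a simplified model, not how any real browser is implemented: the `Cert` class, `verify_chain`, and especially the `toy_sign` / `toy_sig_ok` helpers are hypothetical stand-ins, and the "signature" here is a plain string rather than real cryptography.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_before: date
    not_after: date
    public_key: str  # placeholder for real key material
    signature: str   # placeholder for a real digital signature


def toy_sign(issuer_key: str) -> str:
    """Stand-in for signing: NOT cryptography, just a marker string."""
    return "signed-by:" + issuer_key


def toy_sig_ok(cert: Cert, issuer_key: str) -> bool:
    """Stand-in for verifying cert.signature with the issuer's public key."""
    return cert.signature == toy_sign(issuer_key)


def verify_chain(leaf: Cert, intermediates: Dict[str, Cert],
                 trusted_roots: Dict[str, Cert], hostname: str, today: date,
                 sig_ok: Callable[[Cert, str], bool] = toy_sig_ok) -> bool:
    """Return True only if the chain leads to a root in the trust store."""
    # Step 1: the server certificate must be for the domain we asked for.
    if leaf.subject != hostname:
        return False
    current = leaf
    # Bounded walk: at most one hop per intermediate, plus the hop to the root.
    for _ in range(len(intermediates) + 1):
        if not (current.not_before <= today <= current.not_after):
            return False
        # Step 3: if the issuer is a pre-installed root, the chain is anchored.
        root = trusted_roots.get(current.issuer)
        if root is not None:
            root_valid = root.not_before <= today <= root.not_after
            return root_valid and sig_ok(current, root.public_key)
        # Step 2: otherwise the issuer must be a known intermediate.
        parent = intermediates.get(current.issuer)
        if parent is None or not sig_ok(current, parent.public_key):
            return False
        current = parent
    return False
```

Everything is looked up in local dictionaries, which mirrors the point that chain verification needs no call to the CA. A real implementation would also handle wildcard names, key usage constraints, and revocation.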
“Does the browser contact the CA to verify the certificate?”
No. Verification is done entirely locally using digital signatures. The browser already has the root CA's public key pre-installed. It uses that public key to verify the signature on the intermediate cert, then uses the intermediate's public key to verify the server cert. No network call to any CA is needed. This is pure math — the browser checks that the signature was created by someone who holds the CA's private key.
How Digital Signatures Work (Simplified)
The Wax Seal Analogy
In medieval times, a king would seal letters with a unique wax stamp. Anyone could look at the seal and verify it came from the king (they know what his seal looks like), but nobody could forge it (they don't have his stamp). Digital signatures work the same way — the CA 'stamps' the certificate with their private key, and anyone can verify it with the CA's public key.
SIGNING (done by the CA):
  1. Take the certificate data (domain, public key, validity, etc.)
  2. Hash it → produces a fixed-size fingerprint
  3. Transform the hash with the CA's PRIVATE key → this is the signature
     (for RSA this is often described as "encrypting" the hash)

VERIFYING (done by your browser):
  1. Take the certificate data
  2. Hash it → produces the same fingerprint
  3. Apply the CA's PUBLIC key to the signature → recover the original hash
  4. Compare: do they match?
     ✅ Match → certificate is authentic, not tampered with
     ❌ No match → certificate is forged or modified, REJECT
🔥 Why This Is Secure
An attacker cannot forge a signature because they don't have the CA's private key. They can't modify the certificate because the hash would change. They can't create a fake certificate for google.com because no trusted CA would sign it for them.
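The sign-then-verify steps above can be tried with a toy RSA key. This is a teaching sketch under loud assumptions: the key is built from two publicly known Mersenne primes, there is no padding, and the names `sign` and `verify` are illustrative. Real CA signatures use much larger random keys (or ECDSA) and a padding scheme.

```python
import hashlib

# Toy RSA key. Both primes are well-known Mersenne primes, so this key is
# NOT secret and must never be used for anything real.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """CA side: hash the data, then apply the PRIVATE exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Browser side: apply the PUBLIC exponent and compare with a fresh hash."""
    return pow(signature, E, N) == digest_as_int(data)


cert_data = b"subject=google.com;public_key=...;valid_until=2025-04-01"
signature = sign(cert_data)
print(verify(cert_data, signature))                         # genuine data
print(verify(cert_data.replace(b"google", b"goggle"), signature))  # tampered data
```

Changing even one byte of the certificate data changes the hash, so the tampered copy no longer matches the signature, and producing a new matching signature would require `D`.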
Deep Dive: Diffie-Hellman Key Exchange (TLS 1.3)
This is the most elegant part of TLS. The problem: how do two parties create a shared secret over a channel where everyone is listening?
The answer is the Diffie-Hellman key exchange. Let's build intuition with an analogy first, then map it to the real math.
The Color Mixing Analogy
Shared Secret via Color Mixing
Imagine Alice and Bob want to agree on a secret color, but Eve is watching everything. They start with a common color everyone knows (say, yellow). Alice picks a secret color (red) and Bob picks a secret color (blue). Alice mixes yellow + red and sends the result. Bob mixes yellow + blue and sends the result. Eve can see both mixed colors, but she can't unmix them to get the secret colors. Now Alice takes Bob's mix and adds her secret red. Bob takes Alice's mix and adds his secret blue. They both end up with the same final color — but Eve can't compute it because she never knew the secret colors.
Public color:   🟡 Yellow (everyone knows this)
Alice's secret: 🔴 Red
Bob's secret:   🔵 Blue

Alice computes:               Bob computes:
🟡 + 🔴 = 🟠 Orange           🟡 + 🔵 = 🟢 Green

──── Exchange over public channel ────
Alice sends 🟠                Bob sends 🟢
(Eve sees both 🟠 and 🟢)

Alice computes:               Bob computes:
🟢 + 🔴 = 🟤 Brown            🟠 + 🔵 = 🟤 Brown
      ↑                             ↑
SAME SECRET COLOR!            SAME SECRET COLOR!

Eve has:   🟡, 🟠, 🟢
Eve needs: 🔴 or 🔵 (the secret colors)
But she can't "unmix" 🟠 to get 🔴. That's the key insight.
Now the Real Math
The color analogy maps directly to modular arithmetic. Instead of colors, we use numbers and exponentiation:
Public values (everyone knows these):
  g = generator (a base number, like 2 or 5)
  p = a large prime number

Alice's secret: a (a random large number)
Bob's secret:   b (a random large number)

Alice computes: A = g^a mod p   → sends A to Bob
Bob computes:   B = g^b mod p   → sends B to Alice

Alice computes shared secret: s = B^a mod p = (g^b)^a mod p = g^(ab) mod p
Bob computes shared secret:   s = A^b mod p = (g^a)^b mod p = g^(ab) mod p

Both get: g^(ab) mod p   ← THE SAME SHARED SECRET

Attacker has:   g, p, A (= g^a mod p), B (= g^b mod p)
Attacker needs: a or b
But computing a from g^a mod p is the "discrete logarithm problem"
→ computationally infeasible for large numbers (2048+ bits)
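The exchange above can be run with tiny numbers to watch both sides arrive at the same value. The specific values (p = 23, g = 5, a = 6, b = 15) are a textbook-sized assumption chosen for readability; real deployments use primes of 2048 bits or more, or elliptic-curve groups such as X25519.

```python
# Public parameters (toy sizes, for illustration only).
p = 23   # prime modulus
g = 5    # generator

# Private values, never transmitted.
a = 6    # Alice's secret
b = 15   # Bob's secret

# Public values, sent over the network.
A = pow(g, a, p)   # Alice -> Bob
B = pow(g, b, p)   # Bob -> Alice

# Each side combines its own secret with the other side's public value.
alice_shared = pow(B, a, p)
bob_shared = pow(A, b, p)

print(A, B)                      # 8 19
print(alice_shared, bob_shared)  # 2 2
```

Python's three-argument `pow` performs modular exponentiation efficiently, which is the "easy forward direction" of the scheme.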
🧠 The Discrete Logarithm Problem
Going forward is easy: given g, a, and p, computing g^a mod p is fast. Going backward is practically impossible: given g^a mod p, g, and p, finding a has no known efficient algorithm for large numbers. This one-way property is what makes Diffie-Hellman secure.
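To make the asymmetry concrete, here is a small sketch that contrasts the forward computation with a naive backward search. The function name `brute_force_dlog` is illustrative. Smarter algorithms than this loop exist, but for properly chosen 2048-bit groups they are still far out of reach.

```python
from typing import Optional


def brute_force_dlog(g: int, target: int, p: int) -> Optional[int]:
    """Find x with g^x mod p == target by trying every exponent in turn."""
    value = 1  # g^0
    for exponent in range(p - 1):
        if value == target:
            return exponent
        value = (value * g) % p
    return None  # target is not a power of g modulo p


# Forward direction: one fast modular exponentiation.
public_value = pow(5, 6, 23)

# Backward direction: a search whose cost grows with the size of p.
# It only finishes instantly here because p is tiny.
recovered = brute_force_dlog(5, public_value, 23)
print(public_value, recovered)  # 8 6
```

With p = 23 the loop needs at most 22 steps. With a 2048-bit prime the same loop would need on the order of 2^2048 steps, which is why the attacker cannot simply search for the secret exponent.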
Why the Attacker Can't Compute the Shared Secret
Let's be very explicit about what the attacker sees and what they can't do:
| What Attacker Has | What Attacker Needs | Why They Can't Get It |
|---|---|---|
| g (generator) | a (Alice's secret) | Never transmitted |
| p (prime) | b (Bob's secret) | Never transmitted |
| A = g^a mod p | g^(ab) mod p (shared secret) | Can't compute without a or b |
| B = g^b mod p | — | Discrete log problem is infeasible |
💡 TLS 1.3 Uses Ephemeral Keys
In TLS 1.3, both sides generate fresh Diffie-Hellman keys for every single connection. This provides "forward secrecy" — even if the server's long-term private key is compromised later, past sessions can't be decrypted because the ephemeral keys were discarded.
Full Step-by-Step TLS 1.3 Flow
Now let's put it all together. Here's exactly what happens when your browser connects to https://example.com, with what the attacker can see at each step.
Client (Browser)                                      Server (example.com)
  |                                                        |
  |──── Client Hello ─────────────────────────────────────→|
  |     • Supported TLS versions (1.3)                     |
  |     • Supported cipher suites                          |
  |     • Client random (32 bytes)                         |
  |     • Client's ephemeral DH public key (g^a mod p)     |
  |                                                        |
  |     👁️ Attacker sees: ALL of this (it's plaintext)     |
  |                                                        |
  |←─── Server Hello ──────────────────────────────────────|
  |     • Chosen TLS version (1.3)                         |
  |     • Chosen cipher suite                              |
  |     • Server random (32 bytes)                         |
  |     • Server's ephemeral DH public key (g^b mod p)     |
  |                                                        |
  |     👁️ Attacker sees: ALL of this too                  |
  |                                                        |
  |     ═══ Both sides now compute shared secret ═══       |
  |     Client:   (g^b)^a mod p = g^(ab) mod p             |
  |     Server:   (g^a)^b mod p = g^(ab) mod p             |
  |     Attacker: has g^a and g^b, but CAN'T get g^(ab)    |
  |                                                        |
  |     ═══ Everything below is ENCRYPTED ═══              |
  |                                                        |
  |←─── {Encrypted} Certificate ───────────────────────────|
  |     • Server's certificate (with public key)           |
  |     • Certificate chain                                |
  |                                                        |
  |←─── {Encrypted} CertificateVerify ─────────────────────|
  |     • Signature over handshake using server's          |
  |       private key (proves server owns the cert)        |
  |                                                        |
  |←─── {Encrypted} Finished ──────────────────────────────|
  |     • MAC over entire handshake transcript             |
  |                                                        |
  |     Browser verifies:                                  |
  |     ✅ Certificate chain → trusted CA?                 |
  |     ✅ Domain matches certificate?                     |
  |     ✅ Signature valid? (server has private key)       |
  |     ✅ Handshake not tampered with?                    |
  |                                                        |
  |──── {Encrypted} Finished ─────────────────────────────→|
  |     • MAC over entire handshake transcript             |
  |                                                        |
  |══════ Secure channel established ══════════════════════|
  |                                                        |
  |──── {Encrypted} GET /api/data ────────────────────────→|
  |←─── {Encrypted} 200 OK { ... } ────────────────────────|
🔥 Notice Something Important
In TLS 1.3, the certificate is sent encrypted. The attacker can't even see which website you're connecting to from the certificate (though SNI in the Client Hello still leaks the domain name — Encrypted Client Hello (ECH) is being developed to fix this).
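The Finished messages in the flow above are what catch tampering with the plaintext hello messages. Here is a simplified sketch of the idea: both sides compute a MAC over a hash of every handshake message they saw, and the values only match if the two views are identical. The function names, the length-prefix framing, and the sample messages are assumptions for illustration; real TLS 1.3 uses its own message encoding and key schedule.

```python
import hashlib
import hmac
from typing import Sequence


def transcript_hash(messages: Sequence[bytes]) -> bytes:
    """Hash all handshake messages in order."""
    h = hashlib.sha256()
    for message in messages:
        # Length prefix (an assumption of this sketch) keeps message
        # boundaries unambiguous when messages are concatenated.
        h.update(len(message).to_bytes(4, "big"))
        h.update(message)
    return h.digest()


def finished_mac(finished_key: bytes, messages: Sequence[bytes]) -> bytes:
    """MAC over the transcript hash, keyed with a secret from the handshake."""
    return hmac.new(finished_key, transcript_hash(messages), hashlib.sha256).digest()


key = b"k" * 32  # stand-in for a key derived from the DH shared secret

client_view = [b"ClientHello: suites=AES-256-GCM,AES-128-GCM", b"ServerHello: suite=AES-256-GCM"]
# An attacker rewrote the ClientHello in transit, so the server saw this instead:
server_view = [b"ClientHello: suites=AES-128-GCM", b"ServerHello: suite=AES-128-GCM"]

client_mac = finished_mac(key, client_view)
server_mac = finished_mac(key, server_view)
print(hmac.compare_digest(client_mac, server_mac))  # the views differ, so the MACs differ
```

Because the attacker does not know the key, they cannot compute a replacement MAC that would make the mismatch go away.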
What the Attacker Sees at Each Step
| Step | Visible to Attacker? | What They See |
|---|---|---|
| Client Hello | ✅ Yes | TLS version, cipher suites, client DH public key, SNI (domain name) |
| Server Hello | ✅ Yes | Chosen cipher, server DH public key |
| Shared secret computation | ❌ No | They have g^a and g^b but can't compute g^(ab) |
| Certificate | ❌ No (encrypted) | Gibberish, encrypted with keys derived from the shared secret |
| CertificateVerify | ❌ No (encrypted) | Gibberish |
| Application data | ❌ No (encrypted) | Gibberish — all HTTP data is encrypted |
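The table says the encrypted rows are protected "with the shared secret". In practice the raw Diffie-Hellman output is first run through a key derivation function to produce separate keys for each direction. The sketch below implements HKDF (RFC 5869) with the standard library to show the idea. The labels `client traffic` and `server traffic` and the helper `derive_traffic_keys` are simplified assumptions, not the exact TLS 1.3 key schedule.

```python
import hashlib
import hmac
from typing import Tuple

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """HKDF-Extract: condense the input secret into a pseudorandom key."""
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch the pseudorandom key into `length` output bytes."""
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_traffic_keys(shared_secret: bytes) -> Tuple[bytes, bytes]:
    """Derive independent client and server keys from one shared secret."""
    prk = hkdf_extract(bytes(HASH_LEN), shared_secret)
    client_key = hkdf_expand(prk, b"client traffic", 32)  # hypothetical label
    server_key = hkdf_expand(prk, b"server traffic", 32)  # hypothetical label
    return client_key, server_key


client_key, server_key = derive_traffic_keys(b"toy DH shared secret")
print(len(client_key), len(server_key), client_key != server_key)
```

Both endpoints run the same derivation on the same shared secret, so they end up with identical keys without ever sending them.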
Why TLS 1.3 Is Faster Than TLS 1.2
TLS 1.2 required 2 round trips before sending encrypted data. TLS 1.3 completes the handshake in just 1 round trip (1-RTT), and can even do 0-RTT for repeat connections, although that early data gives up replay protection.
TLS 1.2 (2-RTT):
  Client → Server: ClientHello
  Server → Client: ServerHello, Certificate, KeyExchange
  Client → Server: KeyExchange, ChangeCipherSpec, Finished
  Server → Client: ChangeCipherSpec, Finished
  Client → Server: [encrypted data]   ← 2 round trips before data

TLS 1.3 (1-RTT):
  Client → Server: ClientHello + KeyShare (DH public key)
  Server → Client: ServerHello + KeyShare + {Certificate} + {CertificateVerify} + {Finished}
  Client → Server: {Finished} + [encrypted data]   ← 1 round trip!

The trick: Client sends its DH key in the FIRST message, so the server can compute the shared secret immediately.
Man-in-the-Middle (MITM) Attack — Step by Step
This is the section that clears up the biggest confusion. Let's simulate exactly what happens when an attacker tries to intercept a TLS connection.
The Attack Scenario
Normal connection:
  Client ←──────────────────────────→ Server

MITM attack:
  Client ←────→ Attacker ←────→ Server

The attacker sits between client and server.
Goal: Create two separate encrypted channels and relay data.
  Client thinks they're talking to Server
  Server thinks they're talking to Client
  Attacker decrypts, reads, re-encrypts everything
Step-by-Step: How the Attacker Tries
Client sends Client Hello
The attacker intercepts this and forwards it to the real server (or creates their own). So far, nothing unusual.
Attacker generates their own key pair
The attacker creates their own DH key pair (c, C where C = g^c mod p). They send C to the client instead of the server's real DH public key.
Attacker also connects to the real server
The attacker completes a separate TLS handshake with the real server, using their own DH keys. Now there are two encrypted channels: Client↔Attacker and Attacker↔Server.
🚨 The critical moment: Certificate
The attacker needs to send a certificate to the client. But they can't use the real server's certificate — they don't have the server's private key to create a valid CertificateVerify signature. They must send their OWN certificate.
💥 This is where it fails
The attacker's certificate is either: (a) self-signed — browser rejects it immediately, (b) signed by an untrusted CA — browser rejects it, or (c) for a different domain — browser rejects it. The attacker CANNOT get a legitimate CA to sign a certificate for google.com.
“If the attacker creates their own key pair (c, C), can't they decrypt and re-encrypt everything?”
Yes, they CAN create two separate encrypted channels. The encryption itself would work fine. The attack fails at the certificate verification step, not the encryption step.
Without certificates, this attack would work perfectly. The attacker would have a secure channel to the client and a secure channel to the server, decrypting and re-encrypting in between.
Certificates are the ONLY thing that prevents MITM. They bind a public key to an identity (domain name), verified by a trusted third party (CA).
Client                     Attacker                     Server
  |                           |                           |
  |←── Attacker's DH key ─────|                           |
  |←── Attacker's cert ───────|                           |
  |                           |                           |
  |  Browser checks cert:                                 |
  |  ❌ Is it signed by a trusted CA?                     |
  |     → NO. Attacker can't get a CA to sign             |
  |       a cert for "google.com" for them.               |
  |                                                       |
  |  ❌ Does the signature verify?                        |
  |     → NO. Even if attacker copies google.com's        |
  |       cert, they can't create a valid                 |
  |       CertificateVerify (need private key).           |
  |                                                       |
  |  🛑 CONNECTION REFUSED                                |
  |  Browser shows: "Your connection is not private"      |
Without Certificates vs. With Certificates
| Scenario | MITM Possible? | Why |
|---|---|---|
| Plain Diffie-Hellman (no certificates) | ✅ Yes | No way to verify who you're exchanging keys with |
| DH + Self-signed certificate | ✅ Yes (if user ignores warning) | No trusted CA vouching for identity |
| DH + CA-signed certificate (TLS) | ❌ No | Browser verifies certificate chain to trusted root CA |
🔥 The Key Takeaway
Diffie-Hellman alone does NOT prevent MITM. It only ensures that nobody outside the exchange can eavesdrop on the channel you set up, whoever is on the other end. Certificates are what ensure you're talking to the right party in the first place. DH + Certificates together = secure.
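The claim that unauthenticated Diffie-Hellman is open to MITM can be checked with the same toy numbers used earlier. In this sketch the attacker swaps in their own public value in both directions. The function `mitm_on_bare_dh` and the parameter sizes are illustrative assumptions.

```python
from typing import Dict

P = 23  # toy prime modulus
G = 5   # toy generator


def mitm_on_bare_dh(a: int, b: int, c: int) -> Dict[str, int]:
    """Simulate DH with no certificates while an attacker replaces public values.

    a = client secret, b = server secret, c = attacker secret.
    """
    client_public = pow(G, a, P)    # intercepted, never reaches the server
    server_public = pow(G, b, P)    # intercepted, never reaches the client
    attacker_public = pow(G, c, P)  # delivered to BOTH sides instead

    return {
        # What each honest side computes from the value it actually received.
        "client_key": pow(attacker_public, a, P),
        "server_key": pow(attacker_public, b, P),
        # What the attacker computes for each of the two channels.
        "attacker_client_key": pow(client_public, c, P),
        "attacker_server_key": pow(server_public, c, P),
    }


keys = mitm_on_bare_dh(a=6, b=15, c=13)
print(keys["client_key"] == keys["attacker_client_key"])  # attacker shares the client's key
print(keys["server_key"] == keys["attacker_server_key"])  # attacker shares the server's key
```

The math works perfectly for the attacker. Nothing in the exchange itself tells the client that the public value came from the wrong party, which is the gap that the certificate and CertificateVerify signature close.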
Why Copying the Public Key Doesn't Break Security
“If the attacker can see and copy the server's public key, why is TLS still secure?”
Because the public key is meant to be public. That's literally why it's called a "public" key. Security doesn't depend on keeping it secret.
The public key can only encrypt data or verify signatures. It cannot decrypt data or create signatures. Only the private key can do that, and the private key never leaves the server.
Let's be very specific about what having the public key lets you do and what it doesn't:
With the public key, you CAN
- ✅ Encrypt a message that only the private key holder can read
- ✅ Verify a signature created by the private key
- ✅ See the DH public value (g^a mod p), which is useless without the private exponent
With the public key, you CANNOT
- ❌ Decrypt messages encrypted with the public key
- ❌ Create valid signatures (need private key)
- ❌ Derive the private key from the public key
- ❌ Compute the shared DH secret (need the private exponent)
The Mailbox Analogy
A public key is like a mailbox with a slot. Anyone can drop a letter in (encrypt), but only the person with the key to the mailbox (private key) can open it and read the letters. Knowing where the mailbox is and what it looks like doesn't help you read the mail inside.
Even in the Diffie-Hellman exchange, the attacker sees both public values (g^a mod p and g^b mod p) but cannot compute the shared secret (g^(ab) mod p). The public values are designed to be visible — the math ensures that seeing them doesn't help.
What an Attacker CAN and CANNOT Do
Let's be crystal clear about the attacker's capabilities and limitations when TLS is properly implemented.
👁️ Attacker CAN
- ⚠️ Intercept all network packets
- ⚠️ Read the Client Hello and Server Hello (plaintext)
- ⚠️ See which domain you're connecting to (SNI)
- ⚠️ Copy both DH public keys
- ⚠️ See the size and timing of encrypted packets
- ⚠️ Replay captured messages (but TLS detects this)
- ⚠️ Block or drop packets (denial of service)
🛡️ Attacker CANNOT
- 🔒 Forge a certificate signed by a trusted CA
- 🔒 Compute the shared secret from public DH values
- 🔒 Decrypt any encrypted traffic
- 🔒 Modify encrypted data without detection
- 🔒 Derive the server's private key
- 🔒 Create a valid CertificateVerify signature
- 🔒 Decrypt past sessions (forward secrecy)
💡 The Security Boundary
TLS protects data in transit. It does NOT protect against a compromised server, a compromised client, malware on your machine, or social engineering. If the server itself is hacked, TLS can't help — the attacker already has access to the decrypted data.
When MITM Actually Works (Edge Cases)
TLS is not magic. There are real-world scenarios where MITM attacks succeed. Understanding these helps you appreciate what TLS actually protects against and where the boundaries are.
User Ignores Browser Warning
When the browser shows 'Your connection is not private,' some users click 'Advanced → Proceed anyway.' This bypasses certificate verification entirely — the user is voluntarily accepting a potentially fake certificate.
✅ Never ignore certificate warnings. If you see one on a site you trust, something is wrong, so don't proceed.
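The same mistake exists in code. Below is a hedged sketch of what "Proceed anyway" looks like in Python's standard `ssl` module, next to the safe default. The function names are illustrative, and `insecure_context` is shown only so it can be recognized in code review, not as something to use.

```python
import ssl


def strict_context() -> ssl.SSLContext:
    """Safe default: verify the certificate chain and the hostname."""
    return ssl.create_default_context()


def insecure_context() -> ssl.SSLContext:
    """Programmatic equivalent of clicking through the browser warning.

    Traffic is still encrypted, but to whoever answers the connection,
    which may be an attacker.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before CERT_NONE is allowed
    ctx.verify_mode = ssl.CERT_NONE  # accept any certificate, including forged ones
    return ctx


print(strict_context().verify_mode == ssl.CERT_REQUIRED)
print(insecure_context().verify_mode == ssl.CERT_NONE)
```

A connection made with the second context gets the Diffie-Hellman and encryption steps but skips the identity step, which puts it back in the "plain Diffie-Hellman" row of the MITM table.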
Corporate TLS Inspection (Legitimate MITM)
Many companies install a custom root CA on employee devices. This lets the corporate proxy decrypt, inspect, and re-encrypt all HTTPS traffic. Your browser trusts the proxy's certificates because the company's root CA is in your trust store.
✅ This is by design in corporate environments. Be aware that your employer can see your HTTPS traffic on company devices.
Malware Installs a Root CA
Malware can install a rogue root CA into your system's trust store. Once installed, the malware (or its operator) can issue certificates for any domain, and your browser will trust them.
✅ Keep your system updated, use antivirus software, and periodically audit your trusted root certificates.
Compromised Certificate Authority
If a CA's private key is stolen or its issuing systems are breached, the attacker can sign certificates for any domain. This has happened: DigiNotar (2011) was compromised and used to issue fake certificates for google.com, targeting Iranian users.
✅ Certificate Transparency logs let domain owners detect unauthorized certificates. Browsers also maintain revocation lists and can remove a compromised CA from their trust stores.
Summary: When Does MITM Succeed?
| Scenario | How It Works | How Common |
|---|---|---|
| User ignores warning | User clicks through browser's certificate error | Common (user error) |
| Corporate proxy | Company installs root CA on managed devices | Very common in enterprises |
| Malware root CA | Malicious software adds trusted CA to system | Moderate (targeted attacks) |
| Compromised CA | CA's private key is stolen | Rare but catastrophic |
| Government coercion | Government forces CA to issue fake certs | Rare, documented cases exist |
🔥 The Pattern
Every successful MITM attack against TLS involves compromising the trust model, not breaking the cryptography. The math is solid — the attacks target the human and organizational layers around it.
Final Mental Model
Let's distill everything into a clear mental model you can carry with you.
Certificate = Identity
Certificates answer: 'Am I really talking to google.com?' They bind a public key to a domain name, verified by a trusted CA. Without this, you could be talking to anyone.
Diffie-Hellman = Shared Secret
DH answers: 'How do we create a shared secret over a public channel?' Both sides contribute randomness, and math ensures only they can compute the result.
Symmetric Encryption = Speed
AES answers: 'How do we encrypt data fast?' Once the shared secret is established, all data is encrypted with fast symmetric algorithms. This is the workhorse.
┌─────────────────────────────────────────────────────────┐
│                     TLS Connection                      │
│                                                         │
│  Step 1: IDENTITY (Certificate)                         │
│  ┌─────────────────────────────────────────────┐        │
│  │ "Prove you're really google.com"            │        │
│  │ → Server shows CA-signed certificate        │        │
│  │ → Browser verifies chain to trusted root    │        │
│  └─────────────────────────────────────────────┘        │
│                        ↓                                │
│  Step 2: KEY EXCHANGE (Diffie-Hellman)                  │
│  ┌─────────────────────────────────────────────┐        │
│  │ "Let's create a shared secret"              │        │
│  │ → Both sides send DH public values          │        │
│  │ → Both compute same secret independently    │        │
│  │ → Attacker can't compute it (discrete log)  │        │
│  └─────────────────────────────────────────────┘        │
│                        ↓                                │
│  Step 3: ENCRYPTED COMMUNICATION (AES)                  │
│  ┌─────────────────────────────────────────────┐        │
│  │ "Now we talk privately"                     │        │
│  │ → All data encrypted with shared secret     │        │
│  │ → Fast symmetric encryption (AES-256-GCM)   │        │
│  │ → Integrity protected (HMAC/AEAD)           │        │
│  └─────────────────────────────────────────────┘        │
└─────────────────────────────────────────────────────────┘
Quick Revision Cheat Sheet
TLS: The layer under HTTPS that adds encryption + authentication + integrity to HTTP
Certificate: Digital ID card signed by a trusted CA — proves server identity
CA (Certificate Authority): Trusted third party that signs certificates — like a passport office
Chain of Trust: Root CA → Intermediate CA → Server cert. Browser trusts root, verifies chain.
Diffie-Hellman: Math trick to create shared secret over public channel. Attacker sees public values but can't compute secret.
Forward Secrecy: Ephemeral DH keys per session. Compromised server key can't decrypt past traffic.
Symmetric Encryption: AES-256-GCM — fast encryption using the shared secret for all application data
MITM Prevention: Certificates prevent MITM, not encryption alone. DH without certs is vulnerable.
Public Key: Meant to be public. Can encrypt and verify, but can't decrypt or sign.
TLS 1.3: 1-RTT handshake, mandatory forward secrecy, encrypted certificates, no legacy cruft
Q: If I had to explain TLS in one sentence, what would I say?
A: TLS uses certificates to verify identity, Diffie-Hellman to create a shared secret that eavesdroppers can't compute, and symmetric encryption to protect all data, so that nobody between you and the server can feasibly read or modify your communication as long as the certificate trust model holds.