
Design a Messaging App (WhatsApp / Telegram)

An end-to-end interview-ready walkthrough — from back-of-envelope math through deep dives on WebSocket management, message ordering, group fan-out, E2E encryption, and multi-device sync. Structured to mirror the arc of a 45-minute system design interview.

55 min read · 16 sections
01

Requirements

A messaging app is deceptively complex. On the surface it's "send text from A to B", but at WhatsApp/Telegram scale you're solving real-time delivery across 2 billion devices, message ordering without a global clock, end-to-end encryption that even your own servers can't break, and group fan-out to hundreds of thousands of members. The requirements you anchor here determine whether you build a weekend project or a planetary-scale communication system.

Functional Requirements

Core business logic & features

  • 01. 1:1 Messaging: Users can send text messages to any other user. Messages are persisted and available on reconnect.
  • 02. Group Messaging: Users can create groups (up to 200K members at Telegram scale). Messages fan out to all participants.
  • 03. Delivery Status: Three-state delivery tracking: sent (server received), delivered (recipient device received), read (recipient opened).
  • 04. Media Sharing: Support images, videos, audio messages, and documents up to 2 GB. Thumbnails generated server-side.
  • 05. Online Presence: Show online/offline status and 'last seen' timestamp. Typing indicators for active conversations.
  • 06. Multi-Device Sync: Users can be logged in on phone + desktop + tablet simultaneously. All devices stay in sync.
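
The three delivery states combined with multi-device sync can be modeled as a small monotonic state machine. The following is a minimal sketch, not a reference implementation: the names `Status`, `advance`, and `aggregate` are hypothetical, and the aggregation rule (the sender sees the furthest state reached by any of the recipient's devices) is an assumption chosen for illustration.

```python
from enum import IntEnum


class Status(IntEnum):
    SENT = 1       # server received the message
    DELIVERED = 2  # a recipient device received it
    READ = 3       # the recipient opened it


def advance(current: Status, incoming: Status) -> Status:
    """Apply a receipt. Receipts may be duplicated or arrive out of
    order, so the status is only ever allowed to move forward."""
    return max(current, incoming)


def aggregate(per_device: dict[str, Status]) -> Status:
    """Status shown to the sender, given one status per recipient device.
    Assumption: the furthest state on any device wins."""
    return max(per_device.values(), default=Status.SENT)


# A late DELIVERED receipt must not downgrade a message already READ.
status = advance(Status.READ, Status.DELIVERED)

# Phone has the message, desktop is still offline.
shown = aggregate({"phone": Status.DELIVERED, "desktop": Status.SENT})
```

Keeping the transition monotonic means receipts can be retried safely without any coordination between devices.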

Non-Functional

System constraints

Latency

Message delivery in <500ms end-to-end for online recipients. Typing indicators in <200ms.

Scale

2B registered users, 500M DAU, 100B messages/day. Peak: 50M concurrent WebSocket connections.

Availability

99.99% uptime. Messaging is critical infrastructure — downtime means people can't communicate.

Security

End-to-end encryption for all 1:1 messages. Even server operators cannot read message content.
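
The scale and availability targets above translate into a few numbers worth having ready. This is a rough back-of-envelope sketch using only the figures stated in this section; it gives averages, and real peak traffic will be some multiple of the average that is not specified here.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60

messages_per_day = 100_000_000_000  # 100B messages/day
dau = 500_000_000                   # 500M daily active users
peak_connections = 50_000_000       # 50M concurrent WebSocket connections
availability = 0.9999               # 99.99% uptime

# Average write throughput: roughly 1.16 million messages per second.
avg_msgs_per_sec = messages_per_day / SECONDS_PER_DAY

# Average messages sent per active user per day.
msgs_per_user_per_day = messages_per_day / dau

# Share of daily active users connected at peak.
peak_connected_fraction = peak_connections / dau

# Downtime budget implied by 99.99%: a little under an hour per year.
downtime_min_per_year = (1 - availability) * MINUTES_PER_YEAR

print(f"{avg_msgs_per_sec:,.0f} msgs/sec average")
print(f"{msgs_per_user_per_day:.0f} msgs/user/day")
print(f"{peak_connected_fraction:.0%} of DAU connected at peak")
print(f"{downtime_min_per_year:.1f} min/year downtime budget")
```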

🎯 Clarifying questions that change the design

Each of these steers you toward a fundamentally different architecture:

  • What's the max group size? 256 members (WhatsApp) vs 200K (Telegram) changes fan-out strategy entirely. Small groups can fan out on write; large groups generally need to fan out on read, because copying each message into hundreds of thousands of inboxes is too expensive.
  • Is message history stored server-side or client-side? WhatsApp stores minimally on server (E2E encrypted, client is source of truth). Telegram stores everything server-side (cloud-first). This changes your storage model.
  • Multi-device or single-device? Single-device (original WhatsApp) is simpler — one inbox queue. Multi-device requires per-device delivery tracking and sync protocols.
  • Do we need message search? Searching E2E encrypted messages requires client-side indexing. Server-side search only works for unencrypted messages.
  • Voice/video calls? Real-time media is a separate system (WebRTC, TURN/STUN servers). Scope it out unless asked.
  • Message retention policy? Keep forever vs auto-delete after N days changes storage sizing dramatically.
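
The group-size question can be made concrete with a sketch of the two fan-out strategies. Everything here is hypothetical and in-memory: the class `GroupStore`, its methods, and the 256-member threshold are illustrative assumptions, and a real system would back both structures with durable storage.

```python
from collections import defaultdict

# Hypothetical cutoff; the right value depends on measured write cost.
FANOUT_ON_WRITE_MAX_MEMBERS = 256


class GroupStore:
    def __init__(self):
        # Fan-out on write: one inbox per user, one copy per recipient.
        self.inboxes = defaultdict(list)
        # Fan-out on read: one shared log per group, read by members.
        self.group_logs = defaultdict(list)

    def send(self, group_id, members, message):
        """Store a group message and return the strategy used."""
        if len(members) <= FANOUT_ON_WRITE_MAX_MEMBERS:
            for user_id in members:
                self.inboxes[user_id].append(message)
            return "write"
        self.group_logs[group_id].append(message)
        return "read"

    def read_group(self, group_id, after_index=0):
        """Members of large groups pull from the shared log,
        resuming from the last index they have seen."""
        return self.group_logs[group_id][after_index:]


store = GroupStore()
small = store.send("family", ["alice", "bob"], "dinner at 7")
large = store.send("announcements", [f"u{i}" for i in range(1000)], "v2 is live")
```

The trade-off is write amplification against read cost: the small group pays one write per member up front, while the large group pays a single write and shifts the work to each member's read.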

In scope vs out of scope

In Scope | Out of Scope | Why
1:1 and group text messaging | Voice/video calls (WebRTC) | Real-time media is a separate system with different latency models
Media sharing (images, video, docs) | Stories / status updates | Ephemeral content is a feed problem, not a messaging one
Delivery receipts (sent/delivered/read) | Payment integration (WhatsApp Pay) | Fintech is its own 45-minute interview
End-to-end encryption (1:1) | Full E2E for large groups (>256) | Group E2E at scale requires complex key rotation; mention but don't deep-dive
Online presence + typing indicators | AI chatbots / message translation | ML features, not distributed systems
Multi-device sync (phone + desktop) | Cross-platform message backup/restore | Backup is a storage/export concern, not real-time delivery
Push notifications for offline users | SMS fallback delivery | Carrier integration is a vendor concern, not architecture

💡 Interviewer signal

The strongest opening: "I'll focus on the real-time message delivery pipeline, because that's where the distributed systems complexity lives. The core challenge is maintaining message ordering, effectively exactly-once delivery (at-least-once delivery plus deduplication), and sub-500ms latency across 2 billion devices with persistent connections. Media upload is an async pipeline I'll cover separately." This shows you know where the hard problems are.
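
The delivery semantics named in that opening are commonly approximated by retrying sends until acknowledged (at-least-once) and deduplicating on the receiving side. The sketch below assumes each message carries a unique, client-generated `message_id`; the `Inbox` class and its `deliver` method are hypothetical names used only for illustration.

```python
class Inbox:
    """Receiver-side deduplication keyed on a client-generated message ID."""

    def __init__(self):
        self.seen_ids = set()
        self.messages = []

    def deliver(self, message_id: str, body: str) -> bool:
        """Store the message unless it was already seen.
        Returns True if stored, False if it was a duplicate.
        Either way the caller should acknowledge, so the sender stops retrying."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.messages.append(body)
        return True


inbox = Inbox()
first = inbox.deliver("msg-001", "hello")
retry = inbox.deliver("msg-001", "hello")  # sender retried after a lost ack
```

A production version would need to bound `seen_ids`, for example by expiring IDs older than the maximum retry window, which this sketch leaves out.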
