Push Mechanisms
Master real-time push patterns — WebSockets, Server-Sent Events, long polling, and push notifications. Understand when and why to use each for live data delivery.
The Big Picture — Why Push?
Traditional HTTP is pull-based: the client asks, the server answers. But what if the server has new data and the client doesn't know to ask? A new chat message, a stock price change, a live score update — the server needs to push data to the client without waiting to be asked.
The Phone Call vs Checking Voicemail
Polling is like checking your voicemail every 30 seconds: 'Any new messages? No. Any new messages? No. Any new messages? Yes!' — wasteful, slow, and you always find out late. Push is like your friend calling you directly when something happens. You're instantly notified, no wasted effort, no delay. That's the difference between pull-based (HTTP polling) and push-based (WebSockets, SSE) communication.
🔥 Key Insight
HTTP was designed for request-response: client asks, server answers. Real-time systems need the opposite: server initiates, client receives. Push mechanisms solve this by keeping a connection open or using external services to deliver data the moment it's available.
Overview of Push Mechanisms
Long Polling
Client sends a request, server holds it until data is available. Semi-real-time. Works over standard HTTP. The simplest push approximation.
WebSockets
Persistent, full-duplex TCP connection. Both client and server push data anytime. True real-time. The default for chat, gaming, and live data.
Server-Sent Events (SSE)
Server pushes updates over a persistent HTTP connection. One-way (server → client). Simpler than WebSockets. Great for feeds and notifications.
Push Notifications
Server sends alerts via OS-level services (APNs, FCM). Works when the app is closed. Not for continuous data — for discrete alerts.
| Mechanism | Direction | Connection | Latency | Complexity |
|---|---|---|---|---|
| HTTP Polling | Client → Server | New connection each time | High (up to poll interval) | Very low |
| Long Polling | Client-initiated, server-held | Held open, reconnects | Medium (near real-time) | Low |
| SSE | Server → Client | Persistent HTTP stream | Low (instant push) | Medium |
| WebSockets | Bidirectional | Persistent TCP | Very low (instant both ways) | High |
| Push Notifications | Server → Device (via OS) | External service | Medium (seconds) | Medium |
Long Polling
Long polling is the simplest push approximation. The client sends a request, and the server holds it open until new data is available (or a timeout occurs). When the client gets a response, it immediately sends a new request. This creates a near-real-time loop using standard HTTP.
```
Client                                    Server
  │                                         │
  │── GET /updates?since=100 ─────────────→ │
  │                 (holds request open...) │
  │               (waiting for new data...) │
  │             (30 seconds pass, no data)  │
  │←── 204 No Content (timeout) ────────────│
  │                                         │
  │── GET /updates?since=100 ─────────────→ │
  │                 (holds request open...) │
  │                 (new message arrives!)  │
  │←── 200 {messages: [...]} ───────────────│
  │                                         │
  │── GET /updates?since=105 ─────────────→ │  (immediately reconnects)
  │                                         │  (waiting again...)

Timeline:
  T=0s:    Client sends request
  T=0-30s: Server holds connection (no data yet)
  T=12s:   New data arrives → server responds immediately
  T=12s:   Client processes data, sends new request

→ Effective latency: 0-30 seconds (depends on when data arrives)
```
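The server-side "hold until data or timeout" step can be sketched as a small per-client mailbox. This is a minimal Node.js sketch under stated assumptions: the name `createMailbox` and the 30-second default are illustrative, not from any particular framework.

```javascript
// Minimal long-polling "hold" helper: a per-client mailbox that an HTTP
// handler can await. Resolves with queued messages as soon as any arrive,
// or with an empty array after `timeoutMs` (the handler then replies 204).
function createMailbox(timeoutMs = 30000) {
  const queue = [];
  let wakeUp = null; // resolver for the currently held request, if any

  return {
    // Called by business logic when new data arrives.
    push(message) {
      queue.push(message);
      if (wakeUp) {
        wakeUp(); // release the held request immediately
        wakeUp = null;
      }
    },
    // Called by the HTTP handler: await this before responding.
    async waitForMessages() {
      if (queue.length === 0) {
        await new Promise((resolve) => {
          wakeUp = resolve;
          setTimeout(resolve, timeoutMs); // give up after the timeout
        });
        wakeUp = null;
      }
      return queue.splice(0, queue.length); // drain and return
    },
  };
}
```

In a real endpoint the handler would `await mailbox.waitForMessages()`, reply 200 with the messages (or 204 on an empty result), and the client would immediately reconnect either way.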
Strengths
- ✅Works over standard HTTP (no special protocol)
- ✅Works through firewalls and proxies (it's just HTTP)
- ✅Simple to implement (regular HTTP endpoints)
- ✅Good fallback when WebSockets aren't available
- ✅Near real-time for low-frequency updates
Limitations
- ❌High overhead: each response requires a new HTTP request
- ❌Server holds many open connections (resource-intensive)
- ❌Latency gap between response and next request
- ❌HTTP headers sent on every reconnection (bandwidth waste)
- ❌Not suitable for high-frequency updates (>1/sec)
🎯 Interview Insight
Long polling is the "before WebSockets" answer. Mention it as a fallback or for environments where WebSockets aren't supported. For modern systems, WebSockets or SSE are preferred. Early chat systems (pre-2011) used long polling extensively.
WebSockets (Full-Duplex)
WebSockets provide a persistent, full-duplex connection between client and server. After an initial HTTP handshake, the connection upgrades to a raw TCP channel where both sides can send data at any time — no request-response cycle, no headers on every message.
```
1. HANDSHAKE (HTTP → WebSocket upgrade):

   Client → GET /chat HTTP/1.1
            Upgrade: websocket
            Connection: Upgrade

   Server → HTTP/1.1 101 Switching Protocols
            Upgrade: websocket

2. PERSISTENT CONNECTION (full-duplex):

   Client ──→ "Hello!"           ──→ Server
   Client ←── "Hi there!"        ←── Server
   Client ──→ "How are you?"     ──→ Server
   Server ──→ "New notification" ──→ Client   (server-initiated!)

   → No HTTP headers per message (~2-6 bytes framing vs ~800 bytes HTTP)
   → Either side sends anytime (no request-response pattern)
   → Connection stays open for minutes, hours, or days

3. CLOSE:

   Either side sends a close frame → clean shutdown
```
Chat Applications
Messages appear instantly for all participants. Server pushes new messages without clients asking. Slack, Discord, WhatsApp Web all use WebSockets.
Live Trading
Stock prices change multiple times per second. Server streams price updates continuously. Even 100ms delay means a different price.
Multiplayer Games
Player positions, actions, and game state sync in real-time. Bidirectional: client sends inputs, server sends world state. Latency must be <50ms.
Strengths
- ✅True real-time: no polling delay; latency is just the network round-trip (typically tens of milliseconds)
- ✅Full-duplex: both sides push data anytime
- ✅Low overhead: ~2-6 bytes per message (no HTTP headers)
- ✅Efficient for high-frequency updates
- ✅Native browser support (WebSocket API)
Challenges
- ❌Stateful connections: harder to scale horizontally
- ❌Load balancer complexity (sticky sessions or connection-aware)
- ❌Reconnection logic must be implemented by the client
- ❌No built-in caching, retry, or error handling
- ❌Scaling to millions of connections requires pub/sub (Redis, Kafka)
```
Problem:
  1M connected users across 100 servers.
  User A (Server 1) sends a message to User B (Server 47).
  Server 1 doesn't know about User B's connection.

Solution: Redis Pub/Sub (or Kafka)
  1. User A sends message → Server 1
  2. Server 1 publishes to Redis channel "chat:room:42"
  3. ALL servers subscribed to "chat:room:42" receive the message
  4. Server 47 finds User B's connection → pushes the message

  Server 1 ──→ Redis Pub/Sub ──→ Server 47 ──→ User B

Every server subscribes to channels for its connected users. Redis
broadcasts to all subscribers. This is how Slack, Discord, and every
large chat system works.
```
🎯 Interview Insight
WebSockets are the default answer for real-time bidirectional communication. Always mention the scaling challenge: stateful connections require pub/sub (Redis) for cross-server messaging. Say: "WebSockets for the client connection, Redis Pub/Sub for broadcasting across server instances."
Server-Sent Events (SSE)
SSE is a one-way push mechanism: the server streams updates to the client over a persistent HTTP connection. The client opens a connection, and the server sends events as they happen. Unlike WebSockets, SSE is unidirectional — the client can't send data back over the same connection.
```
Client opens connection:
  GET /events HTTP/1.1
  Accept: text/event-stream

Server responds with streaming headers:
  HTTP/1.1 200 OK
  Content-Type: text/event-stream
  Cache-Control: no-cache
  Connection: keep-alive

Server pushes events as they happen:
  data: {"type": "notification", "message": "New follower!"}

  data: {"type": "price_update", "symbol": "AAPL", "price": 178.50}

  data: {"type": "notification", "message": "Your order shipped!"}

  (connection stays open, server sends more events...)

Client-side (JavaScript):
  const source = new EventSource('/events');
  source.onmessage = (event) => {
    const data = JSON.parse(event.data);
    // Update UI with new data
  };
  // Auto-reconnects if connection drops!
```
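On the wire, each event is plain text: optional `id:` and `event:` lines, one `data:` line per line of payload, terminated by a blank line (the field names are the ones the SSE spec defines). A small serializer sketch for the server side:

```javascript
// Serialize one event in the text/event-stream wire format.
// A blank line marks the end of the event; multi-line payloads
// become multiple "data:" lines that the browser rejoins with "\n".
function formatSSE({ id, event, data }) {
  let out = "";
  if (id !== undefined) out += `id: ${id}\n`;
  if (event !== undefined) out += `event: ${event}\n`;
  for (const line of String(data).split("\n")) out += `data: ${line}\n`;
  return out + "\n"; // blank line = end of event
}

console.log(formatSSE({
  id: 7,
  event: "price_update",
  data: '{"symbol":"AAPL","price":178.50}',
}));
```

Setting `id:` is what enables resume: on reconnect the browser sends `Last-Event-ID`, and the server can replay everything after that point.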
Strengths
- ✅Simpler than WebSockets (standard HTTP, no upgrade)
- ✅Auto-reconnect built into the browser API
- ✅Works through HTTP proxies and firewalls
- ✅Event ID tracking (resume from where you left off)
- ✅Perfect for: notifications, live feeds, dashboards
Limitations
- ❌One-way only (server → client, no client → server)
- ❌Limited to ~6 connections per domain in HTTP/1.1
- ❌Text-only (no binary data without encoding)
- ❌Not suitable for bidirectional communication (chat, gaming)
- ❌Less efficient than WebSockets for high-frequency updates
| Feature | WebSockets | SSE |
|---|---|---|
| Direction | Bidirectional | Server → Client only |
| Protocol | WebSocket (TCP upgrade) | HTTP (standard) |
| Auto-reconnect | Manual (you implement it) | Built-in (browser handles it) |
| Binary data | Yes | No (text only) |
| Proxy/firewall | Can be blocked | Works everywhere (it's HTTP) |
| Complexity | High | Low |
| Best for | Chat, gaming, collaboration | Notifications, feeds, dashboards |
🎯 Interview Insight
SSE is underrated. For server-to-client-only updates (notifications, live feeds, dashboard metrics), SSE is simpler than WebSockets and works through any HTTP infrastructure. Use WebSockets only when you need bidirectional communication. Many systems that use WebSockets would be better served by SSE.
Push Notifications
Push notifications are delivered via OS-level services — Apple Push Notification Service (APNs) for iOS and Firebase Cloud Messaging (FCM) for Android/Web. They work even when the app is closed or the device is asleep. They're not for continuous data streams — they're for discrete alerts.
```
1. REGISTRATION:
   App starts → registers with APNs/FCM → gets a device token
   App sends device token to your backend
   Backend stores: user_id → device_token

2. SENDING:
   Event occurs (new message, order shipped, price alert)
   Backend → APNs/FCM API: "Send to device_token: 'Your order shipped!'"
   APNs/FCM → Device: notification appears in system tray

3. DELIVERY:
   Device online  → notification delivered immediately (~1-5 seconds)
   Device offline → APNs/FCM queues it, delivers when device reconnects
   App closed     → notification still appears (OS-level delivery)

Flow: Your Server → APNs (iOS) / FCM (Android) → User's Device
                    ↑
                    Third-party service handles delivery, queuing,
                    and device management
```
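The "sending" step is an authenticated HTTPS POST to the provider. A sketch of the request body for FCM's HTTP v1 send endpoint; the helper name, token value, and project ID are illustrative, and the OAuth2 bearer token the real request needs is omitted:

```javascript
// Build the JSON body for FCM's HTTP v1 send endpoint:
//   POST https://fcm.googleapis.com/v1/projects/<PROJECT_ID>/messages:send
// (authenticated with an OAuth2 bearer token, not shown here).
function buildFcmMessage(deviceToken, title, body) {
  return {
    message: {
      token: deviceToken, // from step 1 (registration)
      notification: { title, body },
    },
  };
}

const payload = buildFcmMessage(
  "device-token-123",
  "New message",
  "Alice: Hey, are you free tonight?"
);
console.log(JSON.stringify(payload, null, 2));
```

Note the division of labor: your backend only builds and posts this payload; queuing, retry, and waking the device are handled entirely by FCM/APNs.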
Messaging Apps
'Alice sent you a message.' Delivered even when WhatsApp is closed. Tapping the notification opens the conversation.
E-Commerce
'Your order has shipped!' 'Price drop on your wishlist item.' Drives re-engagement and conversions.
Alerts & Monitoring
'Server CPU at 95%.' 'Payment failed for customer X.' Critical alerts that need immediate attention.
Strengths
- ✅Works when app is closed (OS-level delivery)
- ✅Reliable delivery (APNs/FCM handle queuing and retry)
- ✅Cross-platform (iOS, Android, Web)
- ✅No persistent connection needed from your server
- ✅Drives user re-engagement
Limitations
- ❌Limited payload size (~4 KB for APNs, ~4 KB for FCM)
- ❌Delivery latency: 1-30 seconds (not instant)
- ❌Dependent on third-party services (APNs, FCM)
- ❌Users can disable notifications (opt-out)
- ❌Not for continuous data streams (use WebSockets/SSE)
🎯 Interview Insight
Push notifications complement WebSockets/SSE — they don't replace them. Use WebSockets for in-app real-time data. Use push notifications for when the user isn't in the app. A chat app uses both: WebSocket for live messages when the app is open, push notification to alert the user when the app is closed.
End-to-End Scenario
Let's design the real-time layer for a chat application with a live dashboard and offline notifications.
💬 Chat App — 10M Connected Users
Features: live chat, typing indicators, online presence, admin dashboard, offline alerts.
Requirements: sub-100ms message delivery, works when app is closed.
WebSockets → Live chat messages + typing indicators
Each connected user maintains a WebSocket connection. Messages are delivered in <50ms. Typing indicators ('Alice is typing...') are sent as lightweight WebSocket frames. Server uses Redis Pub/Sub to broadcast messages across 100+ server instances. 10M concurrent connections distributed across servers.
SSE → Admin dashboard (live metrics)
The admin dashboard shows: messages/sec, active users, error rates. SSE streams these metrics from the server every second. No bidirectional communication needed — the dashboard only receives data. Simpler than WebSockets, auto-reconnects on network issues.
Push Notifications → Offline message alerts
User B is offline (app closed). User A sends a message. The server detects B is not connected via WebSocket. Server sends a push notification via FCM/APNs: 'Alice: Hey, are you free tonight?' B's phone shows the notification. Tapping it opens the app and loads the conversation.
Long Polling → Fallback for restricted networks
Some corporate networks block WebSocket connections. The client detects the WebSocket failure and falls back to long polling. Messages are delivered with slightly higher latency (~1-2 seconds) but the app still works. Graceful degradation.
```
User online (app open):
  Client ⇄ WebSocket ⇄ Server ⇄ Redis Pub/Sub ⇄ Other Servers
  → Live messages, typing, presence (sub-50ms)

User offline (app closed):
  Server → FCM/APNs → Device notification
  → "Alice sent you a message" (1-5 seconds)

Admin dashboard:
  Browser ← SSE ← Server
  → Live metrics stream (1 update/sec)

Fallback (WebSocket blocked):
  Client → Long Polling → Server
  → Messages delivered with ~1-2s latency

Connection management:
  10M users / 100 servers = 100K connections per server
  Redis Pub/Sub for cross-server message routing
  Presence service tracks who's online (Redis SET)
```
Trade-offs & Decision Making
| Mechanism | Direction | Latency | Complexity | Scaling | Best For |
|---|---|---|---|---|---|
| Long Polling | Client-driven | Medium (0-30s) | Low | Easy (stateless HTTP) | Fallback, low-frequency updates |
| SSE | Server → Client | Low (instant push) | Medium | Moderate (persistent HTTP) | Notifications, feeds, dashboards |
| WebSockets | Bidirectional | Very low (<50ms) | High | Hard (stateful, pub/sub needed) | Chat, gaming, collaboration |
| Push Notifs | Server → Device | Medium (1-30s) | Medium | Easy (third-party handles it) | Offline alerts, re-engagement |
Decision Flowchart
```
Need bidirectional communication?
├── YES → WebSockets (chat, gaming, collaboration)
└── NO
    ↓
    Need to reach users when app is closed?
    ├── YES → Push Notifications (alerts, re-engagement)
    └── NO
        ↓
        Need server → client streaming?
        ├── YES → SSE (notifications, feeds, live metrics)
        └── NO
            ↓
            Need near-real-time with HTTP compatibility?
            ├── YES → Long Polling (fallback, restricted networks)
            └── NO → Regular HTTP polling is fine
```
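The flowchart can be expressed as a tiny function, asking the same questions in the same order (the function and flag names are illustrative):

```javascript
// Encode the decision flowchart: answer the questions in order,
// return the first mechanism that fits.
function choosePushMechanism({
  bidirectional = false,  // both sides send over the same connection?
  reachClosedApp = false, // must reach users when the app is closed?
  serverStream = false,   // server → client streaming only?
  nearRealTime = false,   // near-real-time over plain HTTP?
} = {}) {
  if (bidirectional) return "WebSockets";
  if (reachClosedApp) return "Push Notifications";
  if (serverStream) return "SSE";
  if (nearRealTime) return "Long Polling";
  return "HTTP Polling";
}

console.log(choosePushMechanism({ bidirectional: true })); // WebSockets
console.log(choosePushMechanism({ serverStream: true }));  // SSE
```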
🎯 The Practical Rule
Most real-time systems use multiple mechanisms together. WebSockets for in-app real-time. SSE for dashboards. Push notifications for offline users. Long polling as a fallback. Don't pick one — layer them based on the use case.
Interview Questions
Q:WebSockets vs SSE — when to use each?
A: WebSockets when you need bidirectional communication — chat (client sends messages AND receives them), gaming (client sends inputs, server sends state), collaborative editing. SSE when you only need server-to-client push — notifications, live feeds, dashboard metrics, stock tickers. SSE is simpler (standard HTTP, auto-reconnect, works through proxies) and should be preferred when bidirectional isn't needed. Many systems use WebSockets unnecessarily when SSE would suffice.
Q:Why is long polling inefficient?
A: Each response requires a new HTTP request — full headers (~800 bytes) sent every time. The server holds thousands of open connections waiting for data. There's a latency gap between receiving a response and establishing the next connection. For high-frequency updates, the overhead of constant reconnection is significant. Long polling was the best option before WebSockets (2011) — now it's primarily a fallback for restricted networks.
Q:How do you scale WebSocket connections?
A: WebSocket connections are stateful — each is tied to a specific server. To broadcast a message to all users in a chat room across 100 servers: use Redis Pub/Sub or Kafka. When a message arrives at Server 1, it publishes to a Redis channel. All servers subscribed to that channel receive the message and push it to their connected clients. Also: use connection-aware load balancing (sticky sessions or a connection registry), implement heartbeats to detect dead connections, and set connection limits per server (~100K connections per instance).
Q:When to use push notifications?
A: When the user isn't actively using the app. Push notifications are delivered by the OS (APNs/FCM) even when the app is closed. Use for: new messages when offline, order status updates, price alerts, breaking news. Don't use for: continuous data streams (use WebSockets), in-app updates (use SSE), or high-frequency data (push notifications have 1-30 second latency and limited payload). A chat app uses WebSockets when open + push notifications when closed.
Pitfalls
Using WebSockets when SSE would suffice
Building a notification feed with WebSockets when the client never sends data back. WebSockets add connection management complexity, load balancer configuration, and reconnection logic — all unnecessary when SSE provides server-to-client push with auto-reconnect built in.
✅Ask: 'Does the client need to send data over this connection?' If no → use SSE. If yes → use WebSockets. SSE is simpler, works through HTTP proxies, and auto-reconnects. Reserve WebSockets for truly bidirectional use cases.
Not handling connection drops
Assuming WebSocket connections stay open forever. In reality: mobile users switch networks (WiFi → cellular), connections time out, servers restart, load balancers close idle connections. Without reconnection logic, users silently stop receiving updates.
✅Implement: (1) heartbeat/ping-pong to detect dead connections, (2) automatic reconnection with exponential backoff, (3) message buffering on the server (deliver missed messages on reconnect), (4) connection state tracking (detect when a user goes offline). Libraries like Socket.IO handle most of this automatically.
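The exponential backoff in point (2) can be sketched as a pure delay schedule; the base and cap values here are illustrative, and the "full jitter" variant (a uniformly random delay up to the capped value) is one common choice for spreading out reconnects:

```javascript
// Delay before reconnect attempt n (0-based): base * 2^n, capped at
// maxMs, with full jitter (random delay in [0, cappedDelay)) so that
// thousands of clients disconnected at once don't reconnect in lockstep.
function reconnectDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * capped);
}

// Attempt 0 → up to 0.5s, attempt 3 → up to 4s, attempt 10+ → capped at 30s.
for (let n = 0; n < 5; n++) console.log(`attempt ${n}:`, reconnectDelay(n), "ms");
```

Injecting `random` as a parameter keeps the schedule testable; in production you simply call `reconnectDelay(attempt)` before each retry and reset `attempt` to 0 on a successful connection.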
Scaling issues with persistent connections
Running 1M WebSocket connections on 5 servers (200K each). One server crashes → 200K users disconnect simultaneously → all reconnect to the remaining 4 servers → each now handles 250K → cascading overload.
✅Distribute connections across many servers (50+ for 1M users = 20K each). Use graceful shutdown (drain connections before stopping a server). Implement connection limits per server. Use a connection registry (Redis) so any server can route messages to any user. Plan for thundering herd on reconnection (stagger reconnect with jitter).
Misusing push notifications for real-time streaming
Sending push notifications for every chat message, stock price update, or live score change. Push notifications have 1-30 second latency, limited payload (4 KB), and users will disable notifications if they receive 100/day. They're for alerts, not data streams.
✅Push notifications for: discrete, important events when the user isn't in the app (new message, order shipped, price alert). WebSockets/SSE for: continuous, in-app data streams (live chat, stock ticker, dashboard). A notification that says 'You have 47 new messages' is useful. 47 individual push notifications is spam.