
Push Mechanisms

Master real-time push patterns — WebSockets, Server-Sent Events, long polling, and push notifications. Understand when and why to use each for live data delivery.

26 min read · 10 sections
01

The Big Picture — Why Push?

Traditional HTTP is pull-based: the client asks, the server answers. But what if the server has new data and the client doesn't know to ask? A new chat message, a stock price change, a live score update — the server needs to push data to the client without waiting to be asked.

📞

The Phone Call vs Checking Voicemail

Polling is like checking your voicemail every 30 seconds: 'Any new messages? No. Any new messages? No. Any new messages? Yes!' — wasteful, slow, and you always find out late. Push is like your friend calling you directly when something happens. You're instantly notified, no wasted effort, no delay. That's the difference between pull-based (HTTP polling) and push-based (WebSockets, SSE) communication.

🔥 Key Insight

HTTP was designed for request-response: client asks, server answers. Real-time systems need the opposite: server initiates, client receives. Push mechanisms solve this by keeping a connection open or using external services to deliver data the moment it's available.

02

Overview of Push Mechanisms

Long Polling

Client sends a request, server holds it until data is available. Semi-real-time. Works over standard HTTP. The simplest push approximation.

🔌

WebSockets

Persistent, full-duplex TCP connection. Both client and server push data anytime. True real-time. The default for chat, gaming, and live data.

📡

Server-Sent Events (SSE)

Server pushes updates over a persistent HTTP connection. One-way (server → client). Simpler than WebSockets. Great for feeds and notifications.

🔔

Push Notifications

Server sends alerts via OS-level services (APNs, FCM). Works when the app is closed. Not for continuous data — for discrete alerts.

Mechanism          | Direction                     | Connection               | Latency                      | Complexity
HTTP Polling       | Client → Server               | New connection each time | High (up to poll interval)   | Very low
Long Polling       | Client-initiated, server-held | Held open, reconnects    | Medium (near real-time)      | Low
SSE                | Server → Client               | Persistent HTTP stream   | Low (instant push)           | Medium
WebSockets         | Bidirectional                 | Persistent TCP           | Very low (instant both ways) | High
Push Notifications | Server → Device (via OS)      | External service         | Medium (seconds)             | Medium
03

Long Polling

Long polling is the simplest push approximation. The client sends a request, and the server holds it open until new data is available (or a timeout occurs). When the client gets a response, it immediately sends a new request. This creates a near-real-time loop using standard HTTP.

Long Polling — How It Works
Client                              Server
  │                                    │
  │── GET /updates?since=100 ─────────→│
  │                                    │ (holds request open...)
  │                                    │ (waiting for new data...)
  │                                    │ (30 seconds pass, no data)
  │←── 204 No Content (timeout) ───────│
  │                                    │
  │── GET /updates?since=100 ─────────→│
  │                                    │ (holds request open...)
  │                                    │ (new message arrives!)
  │←── 200 {messages: [...]} ──────────│
  │                                    │
  │── GET /updates?since=105 ─────────→│  (immediately reconnects)
  │                                    │ (waiting again...)

Timeline:
  T=0s:    Client sends request
  T=0-12s: Server holds connection (no data yet)
  T=12s:   New data arrives → server responds immediately
  T=12s:   Client processes data, sends new request
Effective latency: 0-30 seconds (depends on when data arrives)
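The loop above can be sketched in client-side JavaScript. This is a sketch under assumptions: the `/updates?since=` endpoint answering 200 with `{messages: [...]}` or 204 on timeout is hypothetical, and the `nextCursor` helper is our own naming.

```javascript
// Advance the polling cursor: use the newest message id if any arrived,
// otherwise keep polling from the same point. (Pure helper, shown separately.)
function nextCursor(current, messages) {
  if (!messages || messages.length === 0) return current;
  return messages[messages.length - 1].id;
}

// Long-polling loop: the server holds each request open until data arrives
// or a timeout elapses; the client reconnects immediately after each response.
async function pollLoop(baseUrl, onMessages) {
  let since = 0;
  while (true) {
    try {
      const res = await fetch(`${baseUrl}/updates?since=${since}`);
      if (res.status === 200) {
        const body = await res.json();
        onMessages(body.messages);
        since = nextCursor(since, body.messages);
      }
      // A 204 (timeout, no new data) falls through: reconnect immediately.
    } catch (err) {
      // Network error: back off briefly before retrying.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```

Note the cursor (`since`) is what prevents missed or duplicated messages across the gap between one response and the next request.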

Strengths

  • Works over standard HTTP (no special protocol)
  • Works through firewalls and proxies (it's just HTTP)
  • Simple to implement (regular HTTP endpoints)
  • Good fallback when WebSockets aren't available
  • Near real-time for low-frequency updates

Limitations

  • High overhead: each response requires a new HTTP request
  • Server holds many open connections (resource-intensive)
  • Latency gap between response and next request
  • HTTP headers sent on every reconnection (bandwidth waste)
  • Not suitable for high-frequency updates (>1/sec)

🎯 Interview Insight

Long polling is the "before WebSockets" answer. Mention it as a fallback or for environments where WebSockets aren't supported. For modern systems, WebSockets or SSE are preferred. Early chat systems (pre-2011) used long polling extensively.

04

WebSockets (Full-Duplex)

WebSockets provide a persistent, full-duplex connection between client and server. After an initial HTTP handshake, the connection upgrades to a raw TCP channel where both sides can send data at any time — no request-response cycle, no headers on every message.

WebSocket — Connection Lifecycle
1. HANDSHAKE (HTTP → WebSocket upgrade):
   Client → GET /chat HTTP/1.1
            Upgrade: websocket
            Connection: Upgrade
   Server → HTTP/1.1 101 Switching Protocols
            Upgrade: websocket

2. PERSISTENT CONNECTION (full-duplex):
   Client ──→ "Hello!"           ──→ Server
   Client ←── "Hi there!"        ←── Server
   Client ──→ "How are you?"     ──→ Server
   Client ←── "New notification" ←── Server  (server-initiated!)
   
No HTTP headers per message (~2-6 bytes framing vs ~800 bytes HTTP)
Either side sends anytime (no request-response pattern)
Connection stays open for minutes, hours, or days

3. CLOSE:
   Either side sends a close frame → clean shutdown
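In the browser, the handshake above is hidden behind the `WebSocket` API. A minimal client sketch follows; the `wss://` URL is hypothetical, and the JSON `{type, ...}` envelope is an application convention, not part of the WebSocket protocol itself.

```javascript
// Serialize an app-level chat message as a JSON text frame.
// The {type, user, text} envelope is our own convention for this sketch.
function encodeChat(user, text) {
  return JSON.stringify({ type: "chat", user, text });
}

function connectChat(url, onEvent) {
  const ws = new WebSocket(url);           // handshake: HTTP 101 upgrade
  ws.onopen = () => ws.send(encodeChat("alice", "Hello!"));
  ws.onmessage = (e) => onEvent(JSON.parse(e.data)); // server pushes anytime
  ws.onclose = () => {
    // Connections do drop in practice; real clients reconnect here.
  };
  return ws;
}
```

Usage would be `connectChat("wss://example.com/chat", handleEvent)`; note there is no request-response pairing, so both sides simply send frames when they have something to say.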
💬

Chat Applications

Messages appear instantly for all participants. Server pushes new messages without clients asking. Slack, Discord, WhatsApp Web all use WebSockets.

📈

Live Trading

Stock prices change multiple times per second. Server streams price updates continuously. Even 100ms delay means a different price.

🎮

Multiplayer Games

Player positions, actions, and game state sync in real-time. Bidirectional: client sends inputs, server sends world state. Latency must be <50ms.

Strengths

  • True real-time: no polling delay, latency bounded only by the network
  • Full-duplex: both sides push data anytime
  • Low overhead: ~2-6 bytes per message (no HTTP headers)
  • Efficient for high-frequency updates
  • Native browser support (WebSocket API)

Challenges

  • Stateful connections: harder to scale horizontally
  • Load balancer complexity (sticky sessions or connection-aware)
  • Reconnection logic must be implemented by the client
  • No built-in caching, retry, or error handling
  • Scaling to millions of connections requires pub/sub (Redis, Kafka)
Scaling WebSockets — The Pub/Sub Pattern
Problem: 1M connected users across 100 servers.
  User A (Server 1) sends a message to User B (Server 47).
  Server 1 doesn't know about User B's connection.

Solution: Redis Pub/Sub (or Kafka)
  1. User A sends message → Server 1
  2. Server 1 publishes to Redis channel "chat:room:42"
  3. ALL servers subscribed to "chat:room:42" receive the message
  4. Server 47 finds User B's connection → pushes the message

  Server 1 ──→ Redis Pub/Sub ──→ Server 47 ──→ User B
  
  Every server subscribes to channels for its connected users.
  Redis broadcasts to all subscribers.
  This is how Slack, Discord, and every large chat system works.
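The routing pattern above can be sketched with an in-memory stand-in for Redis Pub/Sub. The `Broker` class below only mimics publish/subscribe semantics for illustration; a real deployment would call a Redis client instead.

```javascript
// Channel naming convention from the diagram: one channel per chat room.
function roomChannel(roomId) {
  return `chat:room:${roomId}`;
}

// Tiny in-memory stand-in for Redis Pub/Sub, to show the routing pattern.
// In production, each app server would publish/subscribe via Redis instead.
class Broker {
  constructor() { this.subs = new Map(); } // channel -> Set of callbacks
  subscribe(channel, cb) {
    if (!this.subs.has(channel)) this.subs.set(channel, new Set());
    this.subs.get(channel).add(cb);
  }
  publish(channel, message) {
    for (const cb of this.subs.get(channel) ?? []) cb(message);
  }
}

// Pattern: every server subscribes to channels for its connected users and
// forwards published messages to the matching local WebSocket connection.
const broker = new Broker();
const delivered = [];
// "Server 47" has User B connected in room 42:
broker.subscribe(roomChannel(42), (msg) => delivered.push(`to B: ${msg}`));
// "Server 1" receives User A's message and publishes it to the channel:
broker.publish(roomChannel(42), "Hey!");
```

The key property: Server 1 never needs to know which server holds User B's connection; it publishes to the room's channel and the broker fans the message out.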

🎯 Interview Insight

WebSockets are the default answer for real-time bidirectional communication. Always mention the scaling challenge: stateful connections require pub/sub (Redis) for cross-server messaging. Say: "WebSockets for the client connection, Redis Pub/Sub for broadcasting across server instances."

05

Server-Sent Events (SSE)

SSE is a one-way push mechanism: the server streams updates to the client over a persistent HTTP connection. The client opens a connection, and the server sends events as they happen. Unlike WebSockets, SSE is unidirectional — the client can't send data back over the same connection.

SSE — How It Works
Client opens connection:
  GET /events HTTP/1.1
  Accept: text/event-stream

Server responds with streaming headers:
  HTTP/1.1 200 OK
  Content-Type: text/event-stream
  Cache-Control: no-cache
  Connection: keep-alive

Server pushes events as they happen:
  data: {"type": "notification", "message": "New follower!"}

  data: {"type": "price_update", "symbol": "AAPL", "price": 178.50}

  data: {"type": "notification", "message": "Your order shipped!"}

  (connection stays open, server sends more events...)

Client-side (JavaScript):
  const source = new EventSource('/events');
  source.onmessage = (event) => {
    const data = JSON.parse(event.data);
    // Update UI with new data
  };
  // Auto-reconnects if connection drops!

Strengths

  • Simpler than WebSockets (standard HTTP, no upgrade)
  • Auto-reconnect built into the browser API
  • Works through HTTP proxies and firewalls
  • Event ID tracking (resume from where you left off)
  • Perfect for: notifications, live feeds, dashboards

Limitations

  • One-way only (server → client, no client → server)
  • Limited to ~6 connections per domain in HTTP/1.1
  • Text-only (no binary data without encoding)
  • Not suitable for bidirectional communication (chat, gaming)
  • Less efficient than WebSockets for high-frequency updates
Feature        | WebSockets                   | SSE
Direction      | Bidirectional                | Server → Client only
Protocol       | WebSocket (TCP upgrade)      | HTTP (standard)
Auto-reconnect | Manual (you implement it)    | Built-in (browser handles it)
Binary data    | Yes                          | No (text only)
Proxy/firewall | Can be blocked               | Works everywhere (it's HTTP)
Complexity     | High                         | Low
Best for       | Chat, gaming, collaboration  | Notifications, feeds, dashboards

🎯 Interview Insight

SSE is underrated. For server-to-client-only updates (notifications, live feeds, dashboard metrics), SSE is simpler than WebSockets and works through any HTTP infrastructure. Use WebSockets only when you need bidirectional communication. Many systems that use WebSockets would be better served by SSE.

06

Push Notifications

Push notifications are delivered via OS-level services — Apple Push Notification Service (APNs) for iOS and Firebase Cloud Messaging (FCM) for Android/Web. They work even when the app is closed or the device is asleep. They're not for continuous data streams — they're for discrete alerts.

Push Notification — How It Works
1. REGISTRATION:
   App starts → registers with APNs/FCM → gets a device token
   App sends device token to your backend
   Backend stores: user_id → device_token

2. SENDING:
   Event occurs (new message, order shipped, price alert)
   Backend → APNs/FCM API: "Send to device_token: 'Your order shipped!'"
   APNs/FCM → Device: notification appears in system tray

3. DELIVERY:
   Device online → notification delivered immediately (~1-5 seconds)
   Device offline → APNs/FCM queues it, delivers when device reconnects
   App closed → notification still appears (OS-level delivery)

Flow:
  Your Server → APNs (iOS) / FCM (Android) → User's Device

            Third-party service handles delivery,
            queuing, and device management
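On the backend, the sending step boils down to POSTing a small JSON payload to FCM (or APNs). A sketch of building the FCM HTTP v1 message shape; `buildFcmPayload` is our own name, and the project-specific endpoint URL plus OAuth2 auth needed to actually send are elided.

```javascript
// Build an FCM HTTP v1 message payload for one device token.
// The {message: {token, notification}} shape follows FCM's v1 API;
// APNs has an analogous JSON payload under an "aps" key.
function buildFcmPayload(deviceToken, title, body) {
  return {
    message: {
      token: deviceToken,
      notification: { title, body },
    },
  };
}

const payload = buildFcmPayload("abc123", "Order update", "Your order shipped!");
// Your server would POST JSON.stringify(payload) to FCM's messages:send
// endpoint with an OAuth2 bearer token; FCM handles queuing and delivery.
```

Note the payload cap (~4 KB): the notification carries a pointer to the data ("Your order shipped!"), and the app fetches details when opened.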
💬

Messaging Apps

'Alice sent you a message.' Delivered even when WhatsApp is closed. Tapping the notification opens the conversation.

🛒

E-Commerce

'Your order has shipped!' 'Price drop on your wishlist item.' Drives re-engagement and conversions.

🚨

Alerts & Monitoring

'Server CPU at 95%.' 'Payment failed for customer X.' Critical alerts that need immediate attention.

Strengths

  • Works when app is closed (OS-level delivery)
  • Reliable delivery (APNs/FCM handle queuing and retry)
  • Cross-platform (iOS, Android, Web)
  • No persistent connection needed from your server
  • Drives user re-engagement

Limitations

  • Limited payload size (~4 KB for APNs, ~4 KB for FCM)
  • Delivery latency: 1-30 seconds (not instant)
  • Dependent on third-party services (APNs, FCM)
  • Users can disable notifications (opt-out)
  • Not for continuous data streams (use WebSockets/SSE)

🎯 Interview Insight

Push notifications complement WebSockets/SSE — they don't replace them. Use WebSockets for in-app real-time data. Use push notifications for when the user isn't in the app. A chat app uses both: WebSocket for live messages when the app is open, push notification to alert the user when the app is closed.

07

End-to-End Scenario

Let's design the real-time layer for a chat application with a live dashboard and offline notifications.

💬 Chat App — 10M Connected Users

Features: live chat, typing indicators, online presence, admin dashboard, offline alerts.

Requirements: sub-100ms message delivery, works when app is closed.

1

WebSockets → Live chat messages + typing indicators

Each connected user maintains a WebSocket connection. Messages are delivered in <50ms. Typing indicators ('Alice is typing...') are sent as lightweight WebSocket frames. Server uses Redis Pub/Sub to broadcast messages across 100+ server instances. 10M concurrent connections distributed across servers.

2

SSE → Admin dashboard (live metrics)

The admin dashboard shows: messages/sec, active users, error rates. SSE streams these metrics from the server every second. No bidirectional communication needed — the dashboard only receives data. Simpler than WebSockets, auto-reconnects on network issues.

3

Push Notifications → Offline message alerts

User B is offline (app closed). User A sends a message. The server detects B is not connected via WebSocket. Server sends a push notification via FCM/APNs: 'Alice: Hey, are you free tonight?' B's phone shows the notification. Tapping it opens the app and loads the conversation.

4

Long Polling → Fallback for restricted networks

Some corporate networks block WebSocket connections. The client detects the WebSocket failure and falls back to long polling. Messages are delivered with slightly higher latency (~1-2 seconds) but the app still works. Graceful degradation.

Architecture — All Mechanisms Together
User online (app open):
  Client ←→ WebSocket ←→ Server ←→ Redis Pub/Sub ←→ Other Servers
Live messages, typing, presence (sub-50ms)

User offline (app closed):
  Server → FCM/APNs → Device notification
"Alice sent you a message" (1-5 seconds)

Admin dashboard:
  Browser ← SSE ← Server
Live metrics stream (1 update/sec)

Fallback (WebSocket blocked):
  Client ←→ Long Polling ←→ Server
Messages delivered with ~1-2s latency

Connection management:
  10M users ÷ 100 servers = 100K connections per server
  Redis Pub/Sub for cross-server message routing
  Presence service tracks who's online (Redis SET)
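The online-vs-offline branch in this scenario reduces to a presence check before choosing a delivery path. A sketch, assuming presence is tracked as a set of connected user ids (mirroring the Redis SET above; function and variable names are ours):

```javascript
// Decide how to deliver a message: over the live WebSocket if the recipient
// is connected, via push notification otherwise. `online` stands in for the
// Redis SET of currently connected user ids.
function routeMessage(online, userId) {
  return online.has(userId) ? "websocket" : "push_notification";
}

const online = new Set(["alice", "carol"]);
routeMessage(online, "alice"); // app open: deliver over the WebSocket
routeMessage(online, "bob");   // app closed: send via FCM/APNs instead
```

In production the set membership check would be a Redis `SISMEMBER` call, and servers would add/remove users as their WebSocket connections open and close.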
08

Trade-offs & Decision Making

Mechanism    | Direction       | Latency            | Complexity | Scaling                         | Best For
Long Polling | Client-driven   | Medium (0-30s)     | Low        | Easy (stateless HTTP)           | Fallback, low-frequency updates
SSE          | Server → Client | Low (instant push) | Medium     | Moderate (persistent HTTP)      | Notifications, feeds, dashboards
WebSockets   | Bidirectional   | Very low (<50ms)   | High       | Hard (stateful, pub/sub needed) | Chat, gaming, collaboration
Push Notifs  | Server → Device | Medium (1-30s)     | Medium     | Easy (third-party handles it)   | Offline alerts, re-engagement

Decision Flowchart

Which Mechanism to Use?
Need bidirectional communication?
  ├── YES → WebSockets (chat, gaming, collaboration)
  └── NO ↓

Need to reach users when app is closed?
  ├── YES → Push Notifications (alerts, re-engagement)
  └── NO ↓

Need server → client streaming?
  ├── YES → SSE (notifications, feeds, live metrics)
  └── NO ↓

Need near-real-time with HTTP compatibility?
  ├── YES → Long Polling (fallback, restricted networks)
  └── NO → Regular HTTP polling is fine

🎯 The Practical Rule

Most real-time systems use multiple mechanisms together. WebSockets for in-app real-time. SSE for dashboards. Push notifications for offline users. Long polling as a fallback. Don't pick one — layer them based on the use case.

09

Interview Questions

Q: WebSockets vs SSE — when to use each?

A: WebSockets when you need bidirectional communication — chat (client sends messages AND receives them), gaming (client sends inputs, server sends state), collaborative editing. SSE when you only need server-to-client push — notifications, live feeds, dashboard metrics, stock tickers. SSE is simpler (standard HTTP, auto-reconnect, works through proxies) and should be preferred when bidirectional isn't needed. Many systems use WebSockets unnecessarily when SSE would suffice.

Q: Why is long polling inefficient?

A: Each response requires a new HTTP request — full headers (~800 bytes) sent every time. The server holds thousands of open connections waiting for data. There's a latency gap between receiving a response and establishing the next connection. For high-frequency updates, the overhead of constant reconnection is significant. Long polling was the best option before WebSockets (2011) — now it's primarily a fallback for restricted networks.

Q: How do you scale WebSocket connections?

A: WebSocket connections are stateful — each is tied to a specific server. To broadcast a message to all users in a chat room across 100 servers: use Redis Pub/Sub or Kafka. When a message arrives at Server 1, it publishes to a Redis channel. All servers subscribed to that channel receive the message and push it to their connected clients. Also: use connection-aware load balancing (sticky sessions or a connection registry), implement heartbeats to detect dead connections, and set connection limits per server (~100K connections per instance).

Q: When to use push notifications?

A: When the user isn't actively using the app. Push notifications are delivered by the OS (APNs/FCM) even when the app is closed. Use for: new messages when offline, order status updates, price alerts, breaking news. Don't use for: continuous data streams (use WebSockets), in-app updates (use SSE), or high-frequency data (push notifications have 1-30 second latency and limited payload). A chat app uses WebSockets when open + push notifications when closed.

10

Pitfalls

🔌

Using WebSockets when SSE would suffice

Building a notification feed with WebSockets when the client never sends data back. WebSockets add connection management complexity, load balancer configuration, and reconnection logic — all unnecessary when SSE provides server-to-client push with auto-reconnect built in.

Ask: 'Does the client need to send data over this connection?' If no → use SSE. If yes → use WebSockets. SSE is simpler, works through HTTP proxies, and auto-reconnects. Reserve WebSockets for truly bidirectional use cases.

📡

Not handling connection drops

Assuming WebSocket connections stay open forever. In reality: mobile users switch networks (WiFi → cellular), connections time out, servers restart, load balancers close idle connections. Without reconnection logic, users silently stop receiving updates.

Implement: (1) heartbeat/ping-pong to detect dead connections, (2) automatic reconnection with exponential backoff, (3) message buffering on the server (deliver missed messages on reconnect), (4) connection state tracking (detect when a user goes offline). Libraries like Socket.IO handle most of this automatically.
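The reconnection timing in point (2) is commonly implemented as exponential backoff with full jitter; a sketch with illustrative constants:

```javascript
// Exponential backoff with full jitter: the delay ceiling doubles each
// attempt up to a cap, and a random factor staggers reconnects so a
// restarted server isn't hit by every client at once (thundering herd).
function reconnectDelayMs(attempt, baseMs = 500, capMs = 30000) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling; // full jitter: uniform in [0, ceiling)
}
```

A client would wait `reconnectDelayMs(attempt)` before each retry and reset `attempt` to zero once a connection succeeds.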

📈

Scaling issues with persistent connections

Running 1M WebSocket connections on 5 servers (200K each). One server crashes → 200K users disconnect simultaneously → all reconnect to the remaining 4 servers → each now handles 250K → cascading overload.

Distribute connections across many servers (50+ for 1M users = 20K each). Use graceful shutdown (drain connections before stopping a server). Implement connection limits per server. Use a connection registry (Redis) so any server can route messages to any user. Plan for thundering herd on reconnection (stagger reconnect with jitter).

🔔

Misusing push notifications for real-time streaming

Sending push notifications for every chat message, stock price update, or live score change. Push notifications have 1-30 second latency, limited payload (4 KB), and users will disable notifications if they receive 100/day. They're for alerts, not data streams.

Push notifications for: discrete, important events when the user isn't in the app (new message, order shipped, price alert). WebSockets/SSE for: continuous, in-app data streams (live chat, stock ticker, dashboard). A notification that says 'You have 47 new messages' is useful. 47 individual push notifications is spam.