Design a Video Streaming Platform (YouTube / Netflix)
An end-to-end interview-ready walkthrough — from capacity math through deep dives on resumable uploads, transcoding DAGs, adaptive bitrate streaming, CDN edge delivery, view-count pipelines, and recommendation feeds. Structured to mirror a 45-minute system design interview.
Requirements
A video streaming platform is one of the most complex systems you can be asked to design. The scope ranges from a simple "upload and play" service to a multi-petabyte global delivery network with ML-powered recommendations. Anchoring the requirements early prevents you from drowning in scope creep — and signals to the interviewer that you know how to frame a problem before solving it.
Functional Requirements
Core business logic & features
- 01. Video Upload: Users can upload large videos (up to 256 GB). Uploads must be resumable, so a dropped connection doesn't restart from zero.
- 02. Video Streaming: Users can watch videos with smooth playback. The player adapts quality based on network conditions (adaptive bitrate).
- 03. Video Processing: Uploaded videos are transcoded into multiple resolutions and codecs to support all devices and bandwidths.
- 04. Video Metadata: Each video has a title, description, thumbnail, tags, upload date, and view count. Users can search and browse.
- 05. Comments & Reactions: Users can like/dislike videos and post comments. Comments are threaded and paginated.
- 06. Personalized Feed: The homepage shows recommended videos based on watch history, subscriptions, and trending content.
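The resumability requirement in item 01 comes down to chunk bookkeeping: the server records which fixed-size chunks have arrived, and a reconnecting client sends only the missing ones. The following is a minimal sketch, not a definitive implementation. The 8 MiB chunk size and the names `UploadSession`, `mark_received`, and `missing_chunks` are illustrative assumptions, not part of any specific upload protocol.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field

CHUNK_SIZE = 8 * 1024 * 1024  # assumed chunk size (8 MiB); not from the source


@dataclass
class UploadSession:
    """Hypothetical server-side record of one resumable upload."""

    total_bytes: int
    chunk_size: int = CHUNK_SIZE
    received: set[int] = field(default_factory=set)

    @property
    def total_chunks(self) -> int:
        # The last chunk may be shorter than chunk_size, hence the ceiling.
        return math.ceil(self.total_bytes / self.chunk_size)

    def mark_received(self, index: int) -> None:
        if not 0 <= index < self.total_chunks:
            raise ValueError(f"chunk index {index} out of range")
        # Adding to a set is idempotent, so a retried chunk is harmless.
        self.received.add(index)

    def missing_chunks(self) -> list[int]:
        # What a reconnecting client still has to send.
        return [i for i in range(self.total_chunks) if i not in self.received]

    def byte_range(self, index: int) -> tuple[int, int]:
        # Half-open [start, end) byte range covered by one chunk.
        start = index * self.chunk_size
        end = min(start + self.chunk_size, self.total_bytes)
        return start, end

    def is_complete(self) -> bool:
        return len(self.received) == self.total_chunks
```

After a dropped connection, the client would ask the server for `missing_chunks()` and resume from there instead of restarting from byte zero.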
Non-Functional
System constraints
- Availability: 99.99% uptime for streaming. A video platform that buffers loses users permanently.
- Latency: Video playback starts in <2s. Seek operations complete in <500ms. These targets apply to users worldwide.
- Scale: 500 hours of video uploaded per minute. 1B+ video views per day. 100M+ DAU.
- Durability: Zero data loss. Once uploaded, a video must never be lost. Target 11 nines of object durability.
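It helps to turn the availability and scale targets above into concrete budgets before drawing any boxes. The sketch below is plain arithmetic on the numbers already stated; the function names are illustrative, and the per-second figure is a daily average that ignores peak-hour traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per (non-leap) year for a given availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def upload_hours_per_day(hours_per_minute: float) -> float:
    """Hours of new video arriving per day at a steady per-minute rate."""
    return hours_per_minute * 60 * 24


def average_per_second(events_per_day: float) -> float:
    """Average event rate per second; peak traffic will be higher."""
    return events_per_day / SECONDS_PER_DAY


print(f"Downtime budget at 99.99%: {downtime_minutes_per_year(0.9999):.2f} min/year")
print(f"Upload volume: {upload_hours_per_day(500):,.0f} hours of video/day")
print(f"Average view starts: {average_per_second(1_000_000_000):,.0f} per second")
```

So 99.99% availability leaves roughly 52.6 minutes of downtime per year, 500 hours per minute is 720,000 hours of new video per day, and 1B views per day averages about 11,600 view starts per second.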
🎯 Clarifying questions that change the design
Each of these steers you toward a fundamentally different architecture:
- Live streaming or VOD only? Live adds WebRTC/RTMP ingest, real-time transcoding, and sub-second latency targets. VOD is pre-processed and CDN-cached.
- What's the average video length? 5-min clips (TikTok) vs 2-hour movies (Netflix) changes storage, transcoding time, and chunking strategy.
- Global or single-region? Global means multi-region object storage, CDN edge nodes on every continent, and geo-routed DNS.
- How fresh do view counts need to be? Real-time counters vs eventually-consistent aggregates — different pipelines.
- DRM required? Encrypted segments, license servers, Widevine/FairPlay integration — adds an entire subsystem.
- Monetization model? Ads require ad-insertion points in manifests (SSAI). Subscription is simpler.
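The video-length question above is easy to quantify. This rough sketch compares a 5-minute clip with a 2-hour movie for a single rendition. The 6-second segment duration and the 5 Mbps bitrate are assumptions chosen for illustration, not values from this walkthrough.

```python
import math

SEGMENT_SECONDS = 6          # assumed segment duration
BITRATE_BPS = 5_000_000      # assumed bitrate for one rendition (5 Mbps)


def segments_per_rendition(duration_s: int, segment_s: int = SEGMENT_SECONDS) -> int:
    """Number of media segments one rendition is split into."""
    return math.ceil(duration_s / segment_s)


def rendition_bytes(duration_s: int, bitrate_bps: int = BITRATE_BPS) -> float:
    """Approximate stored size of one rendition, ignoring container overhead."""
    return duration_s * bitrate_bps / 8


for label, duration_s in [("5-min clip", 5 * 60), ("2-hour movie", 2 * 60 * 60)]:
    count = segments_per_rendition(duration_s)
    size_gb = rendition_bytes(duration_s) / 1e9
    print(f"{label}: {count} segments, ~{size_gb:.2f} GB per rendition")
```

Under these assumptions the clip is 50 segments and about 0.19 GB per rendition, while the movie is 1,200 segments and 4.5 GB. That is a 24x difference before multiplying by the number of renditions and codecs.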
In scope vs out of scope
| In Scope | Out of Scope | Why |
|---|---|---|
| Upload + transcode + stream (VOD) | Live streaming (RTMP ingest) | Live is a separate system — different latency model entirely |
| Adaptive bitrate (HLS/DASH) | DRM / content encryption | DRM adds Widevine/FairPlay — enterprise feature, not core architecture |
| Resumable chunked uploads | Client-side video editing | Editor is a frontend concern, not a distributed systems one |
| View counts + trending | Real-time ad auction / SSAI | Ad tech is its own 45-minute interview |
| Recommendation feed (high-level) | Full ML model training pipeline | We cover the serving layer, not the training infrastructure |
| Comments + likes | Content moderation ML | Moderation is a product/ML concern, not a systems one |
| CDN-based global delivery | P2P delivery (WebTorrent) | P2P is niche — CDN is the industry standard |
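For the in-scope adaptive bitrate item, the core player decision is simple: pick the highest rendition whose bitrate fits within the measured throughput, with some headroom. The sketch below is a simplified throughput-based heuristic. The bitrate ladder, the 0.8 safety factor, and the name `pick_rendition` are all hypothetical, and real players typically also consider buffer level.

```python
from typing import NamedTuple, Sequence


class Rendition(NamedTuple):
    name: str
    bitrate_kbps: int


# Hypothetical bitrate ladder; real ladders are tuned per codec and content.
LADDER = [
    Rendition("240p", 400),
    Rendition("480p", 1_000),
    Rendition("720p", 2_500),
    Rendition("1080p", 5_000),
]


def pick_rendition(
    throughput_kbps: float,
    ladder: Sequence[Rendition] = LADDER,
    safety: float = 0.8,  # assumed headroom so small dips don't cause stalls
) -> Rendition:
    """Return the highest-bitrate rendition that fits the throughput budget."""
    budget = throughput_kbps * safety
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    chosen = ordered[0]  # always fall back to the lowest rendition
    for rendition in ordered:
        if rendition.bitrate_kbps <= budget:
            chosen = rendition
    return chosen
```

With this ladder, a measured throughput of 4,000 kbps gives a 3,200 kbps budget and selects 720p, while 100 kbps falls back to 240p rather than refusing to play.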
💡 Interviewer signal
The strongest opening is: "I'll focus on the upload-to-playback pipeline — that's where the distributed systems complexity lives. Comments and likes are standard CRUD. The recommendation feed I'll cover at the serving layer, not the training side." This shows you know where the interesting problems are.