Design YouTube

The challenge: store petabytes of video data and stream it to millions of users simultaneously. Latency isn't just annoying; it's a product killer.

Requirements

Functional

  • 1. Upload video (Resume capability).
  • 2. View video (Streaming).
  • 3. Change quality (144p to 4K).
  • 4. Analytics / View counts.

Non-Functional

  • 1. Reliability: An interrupted upload must resume where it left off, not restart from zero.
  • 2. Availability and low latency: Videos must always be playable and start almost instantly.
  • 3. Throughput: Massive outbound bandwidth.
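
To make the "resume" requirement concrete, here is a minimal sketch of offset-based resumable uploads. Everything in it is an assumption for illustration: `UploadSession`, `upload`, and the `fail_after` knob are hypothetical names, and the "server" is just an in-memory object. The idea is that the server remembers how many bytes it has, and the client always asks for that offset before sending more.

```python
class UploadSession:
    """In-memory stand-in for server-side upload state (hypothetical)."""

    def __init__(self, total_size: int) -> None:
        self.total_size = total_size
        self.data = bytearray()

    @property
    def offset(self) -> int:
        # Number of bytes the server has received so far.
        return len(self.data)

    @property
    def complete(self) -> bool:
        return self.offset == self.total_size

    def append(self, offset: int, chunk: bytes) -> int:
        # Only accept writes that start exactly where the server left off,
        # so a retried or reordered request cannot corrupt the file.
        if offset != self.offset:
            raise ValueError(f"expected offset {self.offset}, got {offset}")
        if self.offset + len(chunk) > self.total_size:
            raise ValueError("chunk exceeds declared upload size")
        self.data.extend(chunk)
        return self.offset


def upload(session: UploadSession, payload: bytes, chunk_size: int, fail_after=None) -> None:
    """Send payload in chunks, starting from whatever the server already has."""
    sent = 0
    start = session.offset  # ask the server where to resume
    for pos in range(start, len(payload), chunk_size):
        if fail_after is not None and sent >= fail_after:
            # Simulated network drop, used only to demonstrate resuming.
            raise ConnectionError("simulated network drop")
        session.append(pos, payload[pos:pos + chunk_size])
        sent += 1
```

Calling `upload` again on the same session after a `ConnectionError` continues from `session.offset` instead of byte 0.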

Upload & Transcoding

You upload a raw 4K `.mov` file. We can't stream that. We need to convert it.

The Pipeline

Original Storage (S3) → Chunking → Transcoding Workers

Why Chunking?
Processing a 1-hour video as one file is slow. Split it into 5-minute chunks. Process in parallel.
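
The chunk arithmetic can be sketched as follows. `plan_chunks` is a hypothetical helper, and the sketch assumes chunks are cut purely by time; a real splitter would also need to cut on keyframe boundaries so each chunk can be decoded independently.

```python
def plan_chunks(duration_s: float, chunk_s: float = 300.0) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds that cover the whole video."""
    if duration_s <= 0 or chunk_s <= 0:
        raise ValueError("duration and chunk length must be positive")
    chunks = []
    start = 0.0
    while start < duration_s:
        # The last chunk may be shorter than chunk_s.
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks
```

With the numbers from the text, `plan_chunks(3600)` yields twelve 5-minute chunks that can be handed to workers independently.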
Why Transcoding?
Convert to different container formats (mp4, webm) and resolutions (360p, 720p, 1080p), so every device and connection speed has a version it can play.
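
Putting chunking and transcoding together, each (chunk, resolution) pair becomes an independent job. The sketch below fans those jobs out to a thread pool. `transcode` is a placeholder that only returns an output filename; a real worker would invoke an encoder there, and the filename pattern is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

RESOLUTIONS = ["360p", "720p", "1080p"]  # the resolutions named in the text


def transcode(chunk_id: int, resolution: str) -> str:
    # Placeholder: a real worker would run an encoder on the chunk here.
    return f"chunk{chunk_id:03d}_{resolution}.mp4"


def transcode_all(num_chunks: int, resolutions=RESOLUTIONS, max_workers: int = 4) -> dict[str, list[str]]:
    """Run one job per (chunk, resolution) pair and group outputs by resolution."""
    jobs = list(product(range(num_chunks), resolutions))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input jobs.
        outputs = list(pool.map(lambda job: transcode(*job), jobs))
    result = {res: [] for res in resolutions}
    for (_chunk_id, res), out in zip(jobs, outputs):
        result[res].append(out)
    return result
```

A 1-hour video split into 12 chunks produces 36 jobs here, and no job has to wait for another.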

Streaming (HLS/DASH)

Not just a file download

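The player doesn't download `movie.mp4` in one piece. It downloads a playlist that lists small chunks (for HLS, traditionally `.ts` files) and then fetches those chunks one after another.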

  • HLS (HTTP Live Streaming): Apple's standard.
  • MPEG-DASH: Open standard.

Example `master.m3u8`:
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360
360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=4000000,RESOLUTION=1920x1080
1080p.m3u8
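
A master playlist like the one above is plain text, so the packaging step can generate it from the list of renditions. A minimal sketch, assuming a hypothetical `Rendition` record and that each rendition's own playlist is named `<name>.m3u8`:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str        # e.g. "360p"; also used as the playlist filename stem
    bandwidth: int   # peak bits per second advertised to the player
    width: int
    height: int


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Emit one EXT-X-STREAM-INF entry per rendition."""
    lines = ["#EXTM3U"]
    for r in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth},RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.name}.m3u8")
    return "\n".join(lines) + "\n"
```

Passing the 360p and 1080p renditions with the values shown above reproduces the same playlist text.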

Adaptive Bitrate

Network Fluctuation

A user starts on WiFi (1080p), walks outside onto 4G (720p), then enters an elevator (360p).

Quality ladder: 1080p → 720p → 480p → 144p

The client measures how fast recent chunks downloaded and automatically switches to the quality level whose chunks it can fetch in time.
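
One simple way to express that client logic is a throughput-based rule: pick the highest quality whose bitrate fits inside a safety margin of the measured bandwidth. This is a sketch, not how any particular player does it. The 1080p bitrate comes from the master playlist above; the other bitrates and the `safety` factor are assumptions.

```python
# (label, required bits per second), highest quality first.
# Only the 1080p value appears in the playlist above; the rest are assumed.
LADDER = [
    ("1080p", 4_000_000),
    ("720p", 2_500_000),
    ("480p", 1_500_000),
    ("144p", 200_000),
]


def pick_rendition(throughput_bps: float, ladder=LADDER, safety: float = 0.8) -> str:
    """Return the best quality that fits within a fraction of measured throughput."""
    budget = throughput_bps * safety  # leave headroom for fluctuation
    for label, bitrate in ladder:
        if bitrate <= budget:
            return label
    # Nothing fits: fall back to the lowest quality rather than stalling.
    return ladder[-1][0]
```

The player would call something like this before requesting each chunk, which is why a quality switch can happen mid-video without restarting playback.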

CDN & Edge Optimization

We cannot serve 1 billion users from a single datacenter in US-East.

Edge Caching

Popular videos (viral hits) are cached at thousands of ISP Point-of-Presence (PoP) locations worldwide.

User in India → Mumbai Edge Node → Video Chunk

Long Tail

Unpopular videos (my cat video from 2012) are not cached at the edge. The first request falls through to the origin.

User → Edge (Miss) → Origin Server (S3 Standard, or a Glacier tier with instant retrieval) → Edge (Cache) → User
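
The miss-then-cache flow can be sketched as a small read-through cache with least-recently-used eviction. `EdgeCache` is a hypothetical class, the origin is modeled as a plain dict, and LRU is an assumed eviction policy; the text does not say which policy a real CDN uses.

```python
from collections import OrderedDict


class EdgeCache:
    """Read-through cache in front of an origin, evicting least recently used."""

    def __init__(self, origin: dict[str, bytes], capacity: int) -> None:
        self.origin = origin          # stand-in for the origin server
        self.capacity = capacity      # max number of chunks kept at the edge
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)   # mark as most recently used
            return self.cache[key]
        # Miss: fetch from origin, then keep a copy for the next viewer.
        self.misses += 1
        data = self.origin[key]
        self.cache[key] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # drop the least recently used
        return data
```

A viral video stays hot because every view refreshes its position, while the 2012 cat video is fetched once, served, and eventually evicted when more popular chunks need the space.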