Design a
Feed System
Twitter, Facebook, Instagram. The core feature is the "News Feed". How do you generate a personalized timeline for a user with 5,000 friends instantly?
Requirements
Functional
- 1. User can post a tweet.
- 2. User can see a timeline of tweets from people they follow.
- 3. Timeline is sorted by time (reverse chronological) or relevance.
- 4. Support media (images/videos).
Non-Functional
- 1. Extreme Read Load: Users read feeds much more than they post.
- 2. Low Latency: Feed generation must be under 200ms.
- 3. Eventual Consistency: It's okay if a tweet takes a few seconds to appear.
Pull Model (Fan-out on Read)
Simple but Slow
We don't pre-calculate anything. When User A requests their feed, we build it on the fly.
WHERE user_id IN (SELECT followee_id FROM follows WHERE follower_id = A)
ORDER BY created_at DESC
LIMIT 20;
- Simple architecture.
- Real-time updates (instant visibility).
- Space efficient (no duplicate data).
- N+1 Problem: If I follow 5,000 people, the DB query is massive.
- High latency for the user viewing the feed.
Push Model (Fan-out on Write)
Fast Reads, Heavy Writes
Each user has a pre-computed "Feed List" in memory (Redis). When I tweet, the system pushes that tweet ID to all my followers' lists.
If Justin Bieber has 100M followers, a single tweet triggers 100M writes to Redis. This causes massive lag.
Hybrid Model (The Solution)
Best of Both Worlds
We treat normal users and celebrities differently.
For Normal Users
Use Push Model.
When I (300 followers) tweet, push to all 300 feeds. Fast and efficient.
For Celebrities
Use Pull Model.
When Bieber tweets, do nothing. When a user loads their feed, we Merge their pre-computed feed with tweets from celebrities they follow (fetched at read time).
Caching Strategies
Redis List. Stores just Tweet IDs [101, 99, 85]. Not the content.
Redis Hash. Stores content (text, media_url). Key: tweet_id.
Stores user profile (name, avatar). Key: user_id.
Hydration: When reading feed, fetch IDs from Feed Cache, then fetch content from Tweet/User Cache in parallel (multiget).