Design a Chat App
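Discord, WhatsApp, Slack. Real-time messages, read receipts, and scaling to millions of concurrent connections. Plain request/response HTTP is not enough here, because the server has to push messages to clients without waiting for them to ask.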
Requirements
Functional
1. 1:1 Chat and Group Chat.
2. Messages should be delivered instantly (Real-time).
3. Message History (Persistence).
4. Online/Offline Status (Presence).
5. "Seen" status (Read Receipts).
Non-Functional
1. Low Latency: < 100ms for delivery.
2. High Scale: 10M concurrent users.
3. Consistency: Messages must appear in order.
WebSocket Architecture
Stateful Connections
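We cannot use simple REST load balancing, where any server can handle any request. A WebSocket is a long-lived connection pinned to one server. If User A is connected to Server 1 and User B is on Server 2, Server 1 has to find out where User B is connected and forward the message there.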
Data Model (Cassandra)
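Chat is write-heavy. A single-node RDBMS (Postgres) struggles with billions of inserts unless you shard it yourself. Discord stored its messages in Cassandra and later migrated to ScyllaDB, a Cassandra-compatible database.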
Messages Table
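The table definition itself is not shown here, so the following is only a sketch of what it could look like. Only `channel_id` and `message_id` come from the text; `author_id`, `content`, and the `MessageStore` class are assumptions for illustration. The in-memory class mimics the layout: one sorted run of messages per partition.

```python
from bisect import insort
from dataclasses import dataclass, field

# Hypothetical CQL for the messages table. Only channel_id and message_id
# are named in the text; the remaining columns are assumed.
MESSAGES_TABLE_CQL = """
CREATE TABLE messages (
    channel_id bigint,
    message_id bigint,
    author_id  bigint,
    content    text,
    PRIMARY KEY ((channel_id), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
"""


@dataclass(order=True)
class Message:
    message_id: int                          # the only field used for ordering
    author_id: int = field(compare=False)
    content: str = field(compare=False)


class MessageStore:
    """In-memory stand-in for the table: one sorted list per channel."""

    def __init__(self) -> None:
        self._partitions: dict[int, list[Message]] = {}

    def insert(self, channel_id: int, message: Message) -> None:
        # Partition key picks the list, clustering key keeps it sorted.
        insort(self._partitions.setdefault(channel_id, []), message)

    def latest(self, channel_id: int, limit: int) -> list[Message]:
        # Newest first, like reading the head of a DESC-ordered partition.
        rows = self._partitions.get(channel_id, [])
        return rows[::-1][:limit]
```

Loading the most recent page of a chat then touches a single partition and reads rows that are already in order.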
Why Cassandra?
- Log-structured merge tree (LSM) engine = Fast writes.
- Partition by `channel_id` keeps all messages for a chat together on disk.
- Cluster by `message_id` keeps them sorted by time.
- Easy horizontal scaling.
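Clustering by `message_id` only sorts by time if the ID itself encodes time. One common way to get that is a Snowflake-style ID with the timestamp in the high bits. The sketch below assumes a 42/10/12 bit split (timestamp, worker, sequence) and a custom epoch; `IdGenerator` and `timestamp_ms` are hypothetical names, not something the text specifies.

```python
import threading
import time

EPOCH_MS = 1_420_070_400_000  # assumed custom epoch: 2015-01-01 00:00:00 UTC


class IdGenerator:
    """Generates 64-bit IDs that sort by creation time."""

    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock went backwards (or we borrowed ahead): stay monotonic.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs used in this millisecond: borrow the next one
                    # instead of sleeping.
                    now = self._last_ms + 1
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - EPOCH_MS) << 22) | (self._worker_id << 12) | self._sequence


def timestamp_ms(message_id: int) -> int:
    """Recover the creation time (Unix ms) from an ID."""
    return (message_id >> 22) + EPOCH_MS
```

Because the timestamp occupies the most significant bits, comparing two IDs numerically compares their creation times first.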
Message Flow
1. User A sends a message to its Chat Server over WebSocket.
2. The Chat Server saves the message to Cassandra asynchronously.
3. The Chat Server looks up which server User B is connected to (using Redis/ZooKeeper).
4. The Chat Server forwards the message to that server.
5. That server pushes the message to User B over WebSocket.
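The flow above can be sketched in a single process. Everything here is a stand-in chosen for illustration: a plain dict plays the Redis/ZooKeeper session registry, a list plays the Cassandra write path, a direct method call plays the server-to-server hop, and an in-memory inbox plays the recipient's WebSocket.

```python
class ChatServer:
    """Toy chat server; all collaborators are in-memory stand-ins."""

    def __init__(self, name, registry, cluster, store):
        self.name = name
        self.registry = registry  # user_id -> server name (stand-in for Redis/ZooKeeper)
        self.cluster = cluster    # server name -> ChatServer (stand-in for RPC)
        self.store = store        # list of messages (stand-in for Cassandra)
        self.inboxes = {}         # user_id -> delivered messages (stand-in for sockets)
        cluster[name] = self

    def connect(self, user_id):
        # Record which server holds this user's connection.
        self.inboxes[user_id] = []
        self.registry[user_id] = self.name

    def handle_send(self, sender, recipient, text):
        message = {"from": sender, "to": recipient, "text": text}
        self.store.append(message)                 # step 2: persist
        target = self.registry.get(recipient)      # step 3: locate recipient
        if target is None:
            return False                           # offline: history only
        self.cluster[target].deliver(recipient, message)  # step 4: forward
        return True

    def deliver(self, user_id, message):
        self.inboxes[user_id].append(message)      # step 5: push to client


registry, cluster, store = {}, {}, []
server_1 = ChatServer("server-1", registry, cluster, store)
server_2 = ChatServer("server-2", registry, cluster, store)
server_1.connect("alice")
server_2.connect("bob")
server_1.handle_send("alice", "bob", "hi")
```

The message is persisted before the lookup, so a recipient who is offline can still read it from history on the next connect.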
Unread Counters
The Scalability Killer
Counting unread messages for every user in every channel is expensive. Running `SELECT count(*)` per user per channel is too slow.
Optimized Approach
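Store `last_read_message_id` for every user-channel pair. A message is unread if its `message_id` is greater than that marker, so marking a channel as read is a single small write.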
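A minimal sketch of the marker approach, assuming message IDs increase over time and each channel's IDs are available as a sorted list. `ReadState` and its method names are hypothetical.

```python
from bisect import bisect_right


class ReadState:
    """Tracks one last_read_message_id per (user, channel) pair."""

    def __init__(self):
        self._last_read = {}  # (user_id, channel_id) -> last_read_message_id

    def mark_read(self, user_id, channel_id, message_id):
        key = (user_id, channel_id)
        # Never move the marker backwards, e.g. on a late or duplicate ack.
        self._last_read[key] = max(self._last_read.get(key, 0), message_id)

    def unread_count(self, user_id, channel_id, channel_message_ids):
        # channel_message_ids must be sorted ascending.
        last = self._last_read.get((user_id, channel_id), 0)
        return len(channel_message_ids) - bisect_right(channel_message_ids, last)


state = ReadState()
channel_ids = [10, 20, 30, 40]
state.mark_read("alice", "general", 20)
unread = state.unread_count("alice", "general", channel_ids)
```

Against a real store the same question becomes a range read of rows with `message_id` above the marker, which can be capped (for example "99+") instead of counted exactly.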
Discord's Trick
They don't store the count. The client calculates it, or it is loaded only when you hover over the channel list. Push notifications often just say "New Message" rather than "5 New Messages".
Presence (Online/Offline)
How do I know if my friend is online?
Heartbeats
The client sends a "ping" every 30s. The server updates that user's last-seen timestamp in Redis. If the timestamp gets too old, the user is treated as offline.
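A sketch of the heartbeat check, with an in-memory dict standing in for Redis. The 30s interval is from the text; the 70s offline threshold (tolerating one missed ping) and the `PresenceTracker` name are assumptions.

```python
import time

HEARTBEAT_INTERVAL_S = 30  # from the text
OFFLINE_AFTER_S = 70       # assumed: two intervals plus some slack


class PresenceTracker:
    """Stores the last heartbeat time per user (stand-in for Redis)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last_seen = {}  # user_id -> timestamp of last ping

    def heartbeat(self, user_id):
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        last = self._last_seen.get(user_id)
        return last is not None and self._clock() - last <= OFFLINE_AFTER_S


# Usage with a fake clock so the example does not need to sleep.
now = [1000.0]
tracker = PresenceTracker(clock=lambda: now[0])
tracker.heartbeat("alice")
now[0] += 30
online_after_30s = tracker.is_online("alice")
now[0] += 41
online_after_71s = tracker.is_online("alice")
```

With Redis, the same effect is often achieved by writing the key with an expiry on each ping, so a missing key means offline and no cleanup job is needed.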
Fan-out
When status changes, who do we tell?
1. Look up User A's friends.
2. Filter only those who are currently online.
3. Push "User A is Online" event to them.
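The fan-out can be sketched as one function. It assumes the audience is the user's friend list, which the text implies but does not state; `friends_of`, `is_online`, and `push` are hypothetical hooks for the friend store, the presence check, and the WebSocket push.

```python
def fan_out_status(user_id, status, friends_of, is_online, push):
    """Notify the online friends of user_id about a status change.

    friends_of: dict mapping user_id -> iterable of friend ids
    is_online:  callable(user_id) -> bool
    push:       callable(user_id, event) that delivers the event
    Returns the list of users that were notified.
    """
    event = {"type": "presence", "user": user_id, "status": status}
    notified = []
    for friend_id in friends_of.get(user_id, ()):
        if is_online(friend_id):      # skip friends with no open connection
            push(friend_id, event)
            notified.append(friend_id)
    return notified


friends_of = {"alice": ["bob", "carol", "dave"]}
online = {"bob", "dave"}
sent = []
notified = fan_out_status(
    "alice",
    "online",
    friends_of,
    is_online=lambda u: u in online,
    push=lambda u, e: sent.append((u, e)),
)
```

The cost grows with the size of the friend list, which is why very large groups usually get presence on demand rather than pushed on every change.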