Day 19 — Designing a Chat / Messaging System at Scale
Chat exercises every hard design lever: fan-out vs fan-in, presence, ordering, push vs pull, media uploads, end-to-end encryption. WhatsApp / Slack / Teams patt…
Chat looks deceptively like CRUD on messages. In reality it's a real-time, ordered, multi-device, multi-region, encrypted distributed system. Let's tour the levers.
🧠 Concept
Why it matters & the mental model.
1. Connection layer
- Long-poll / SSE: simple and proxy-friendly. SSE is one-way (server → client), so the client sends its own messages over ordinary HTTP requests.
- WebSocket: bidirectional, persistent — the modern default for chat.
- MQTT (Facebook Messenger): lightweight pub/sub, great on mobile.
- gRPC streaming: structured, typed, for internal services.
Stateful connection servers (built on runtimes such as the Erlang VM, Akka, or Go) can hold hundreds of thousands to millions of connections per box and route each message to the right user's connection.
2. The architecture
3. Message flow
- Client A sends to edge over WS.
- Edge → router; router looks up B's connection (in Redis: user_id → edge_node).
- Forward to B's edge → push down WS.
- Persist to storage (idempotent on message_id).
- If B offline → enqueue push notification.
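The flow above can be sketched in a few lines of Python. This is a minimal in-memory sketch, not a production design: `EdgeNode`, `Router`, and the message fields are hypothetical names, plain dicts stand in for Redis and the message store, and a list stands in for the push queue. The sketch checks the store before delivering so that a retried message is not delivered twice; that ordering is a choice made here, not something the steps above prescribe.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    """Hypothetical edge server; `delivered` stands in for WebSocket sends."""
    name: str
    delivered: list = field(default_factory=list)

    def push(self, user_id: str, message: dict) -> None:
        self.delivered.append((user_id, message))


class Router:
    def __init__(self) -> None:
        self.registry = {}    # user_id -> EdgeNode (Redis in the text)
        self.store = {}       # message_id -> message (durable storage in the text)
        self.push_queue = []  # (user_id, message_id) awaiting push notification

    def connect(self, user_id: str, edge: EdgeNode) -> None:
        self.registry[user_id] = edge

    def disconnect(self, user_id: str) -> None:
        self.registry.pop(user_id, None)

    def send(self, message: dict) -> str:
        # Idempotent persist: a retry with the same message_id is a no-op.
        if message["message_id"] in self.store:
            return "duplicate"
        self.store[message["message_id"]] = message

        edge = self.registry.get(message["to"])
        if edge is None:
            # Recipient offline: enqueue a push notification instead.
            self.push_queue.append((message["to"], message["message_id"]))
            return "queued_push"
        edge.push(message["to"], message)
        return "delivered"
```

In a real deployment the registry lookup and the push to the recipient's edge would be network calls, and either can fail independently, which is why the persisted copy and the catch-up path matter.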
4. Ordering
- Per-conversation ordering (not global) is enough.
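- Server-assigned message_id that embeds a timestamp and sorts by time (Snowflake-style). Ids from different generator nodes are only roughly ordered, so assign them from a single sequencer per conversation when strict order matters.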
- Client treats out-of-order arrivals as "merge" using id order.
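As an illustration of the ordering scheme, here is a sketch of a Snowflake-style generator and a client-side merge. The bit layout (41-bit millisecond timestamp, 10-bit worker, 12-bit sequence) follows the commonly described Snowflake layout; the epoch value, the class name, and `merge_by_id` are assumptions made for this example.

```python
import threading
import time


class SnowflakeGenerator:
    """Snowflake-style ids: ms timestamp | 10-bit worker id | 12-bit sequence."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)

    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                now = self.last_ms  # clock went backwards: never emit a smaller id
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    now = self.last_ms + 1  # sequence exhausted: borrow the next ms
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence


def merge_by_id(existing: list, arrivals: list) -> list:
    """Client-side merge: drop duplicates and sort by id, whatever the arrival order."""
    by_id = {m["id"]: m for m in existing + arrivals}
    return [by_id[key] for key in sorted(by_id)]
```

Ids from one generator are strictly increasing. Ids from different workers can interleave within clock skew, which is why a single sequencer per conversation is the safer choice when strict order is required.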
🛠 Deep Dive
Internals, code, architecture.
5. Storage
Discord moved from Cassandra → ScyllaDB for messages (low-latency wide-column, range scans by conversation_id + ts). Schema:
- Partition key: conversation_id.
- Cluster key: message_id DESC (so reading recent is fast).
- TTL only if retention policy allows.
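To make the access pattern concrete, the sketch below models the schema in memory: one sorted list per partition, idempotent inserts, and newest-first pagination. It is a toy stand-in for a wide-column store, not Cassandra or ScyllaDB code; `MessageStore` and its methods are hypothetical names. Rows are kept in ascending order and reversed on read, which returns the same results a DESC clustering order would.

```python
import bisect
from collections import defaultdict


class MessageStore:
    def __init__(self):
        # conversation_id (partition key) -> list of (message_id, body), ascending
        self._partitions = defaultdict(list)

    def insert(self, conversation_id, message_id, body) -> bool:
        rows = self._partitions[conversation_id]
        i = bisect.bisect_left(rows, (message_id,))
        if i < len(rows) and rows[i][0] == message_id:
            return False  # same message_id already stored: idempotent no-op
        rows.insert(i, (message_id, body))
        return True

    def recent(self, conversation_id, limit=50, before=None):
        """Newest-first page; pass the oldest id of the previous page as `before`."""
        rows = self._partitions.get(conversation_id, [])
        end = len(rows) if before is None else bisect.bisect_left(rows, (before,))
        start = max(0, end - limit)
        return rows[start:end][::-1]
```

Every read touches a single partition and a contiguous range of the clustering key, which is the property the partition and cluster key choices above are meant to guarantee.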
6. Group / channel fan-out
Two extremes:
- Fan-out on write: when a message is sent to a group of N, write N inbox rows (one per user) → cheap reads, expensive writes for big groups. Twitter classic.
- Fan-out on read: write one row; on read, each user's timeline service fetches the subscribed channels and merges them → cheap writes, expensive reads. This suits celebrity-sized audiences, where fan-out on write would be prohibitive.
- Hybrid: small groups fan-out write, big channels fan-out read. WhatsApp / Slack do variants of this.
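The hybrid strategy can be sketched as a size check at publish time. Everything here is illustrative: the threshold of 100, the `FanoutService` name, and the in-memory dicts are assumptions, and the sketch ignores groups whose size crosses the threshold after messages were written.

```python
from collections import defaultdict

FANOUT_WRITE_LIMIT = 100  # hypothetical cut-over between the two strategies


class FanoutService:
    def __init__(self, members: dict):
        self.members = members                 # group_id -> set of user_ids
        self.inboxes = defaultdict(list)       # user_id -> rows (fan-out on write)
        self.channel_logs = defaultdict(list)  # group_id -> rows (fan-out on read)

    def publish(self, group_id, message_id, body) -> str:
        row = (message_id, group_id, body)
        recipients = self.members[group_id]
        if len(recipients) <= FANOUT_WRITE_LIMIT:
            # Small group: one inbox row per member, so reads are a single lookup.
            for user_id in recipients:
                self.inboxes[user_id].append(row)
            return "write"
        # Large channel: a single row, merged into timelines at read time.
        self.channel_logs[group_id].append(row)
        return "read"

    def timeline(self, user_id) -> list:
        rows = list(self.inboxes.get(user_id, []))
        for group_id, users in self.members.items():
            if user_id in users and len(users) > FANOUT_WRITE_LIMIT:
                rows.extend(self.channel_logs.get(group_id, []))
        return sorted(rows)  # message_id is the first tuple field
```

The write cost for a large channel stays constant regardless of audience size, while members of small groups still get a precomputed inbox.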
7. Presence
Lightweight signal: heartbeat from client every ~30s → Redis with TTL. Subscribers (chat list) get presence updates via pub/sub. Don't store presence in main DB.
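The heartbeat-plus-TTL idea fits in a few lines. The sketch below uses an in-memory dict instead of Redis, where a write such as `SET presence:<user_id> 1 EX 45` would play the same role. The 45-second TTL is an assumption, chosen to be somewhat longer than the ~30s heartbeat so one late heartbeat does not flip a user offline, and `PresenceStore` is a hypothetical name.

```python
import time


class PresenceStore:
    def __init__(self, ttl_seconds: float = 45.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so tests can control time
        self._expires_at = {}   # user_id -> expiry time

    def heartbeat(self, user_id: str) -> None:
        # Each heartbeat pushes the expiry forward, like refreshing a key's TTL.
        self._expires_at[user_id] = self.clock() + self.ttl

    def is_online(self, user_id: str) -> bool:
        expires = self._expires_at.get(user_id)
        return expires is not None and self.clock() < expires
```

No explicit "went offline" write is needed: a client that stops sending heartbeats simply expires.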
8. Push notifications
- When the app is backgrounded and the persistent connection is gone, a silent push (or a VoIP push on iOS, for calls) wakes the app.
- Display via APNs / FCM with a template that the app expands locally, so the notification reflects up-to-date data.
- Critical path: do NOT block message ack on push delivery (queue async).
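The last point, keeping push delivery off the ack path, might look like the sketch below. It uses a standard-library queue and a worker thread as stand-ins for a real message queue and push service; `handle_incoming`, `push_worker`, and the `send_push` callback are hypothetical names, and no APNs or FCM client is involved.

```python
import queue
import threading

push_queue = queue.Queue()  # stands in for a durable queue in production


def handle_incoming(message: dict, store: dict, recipient_online: bool) -> dict:
    """Persist, enqueue a push if needed, and ack without waiting on delivery."""
    store[message["message_id"]] = message
    if not recipient_online:
        push_queue.put(message)  # returns immediately; the queue is unbounded
    return {"ack": message["message_id"]}


def push_worker(send_push, stop: threading.Event) -> None:
    """Drain the queue in the background until told to stop and the queue is empty."""
    while not stop.is_set() or not push_queue.empty():
        try:
            message = push_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            send_push(message)  # slow or flaky provider call lives here
        finally:
            push_queue.task_done()
```

A slow push provider then delays only the notification, never the sender's ack. A production worker would also need retries and error handling, which this sketch leaves out.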
🚀 In Practice
Trade-offs, exercises, what to ship today.
9. Media (images, video, files)
- Client uploads to signed URL → object store (S3) directly.
- Message contains pointer + thumbnail.
- Recipient downloads from CDN.
- Optional virus scan + content moderation pipeline, run asynchronously.
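To show what "signed URL" means mechanically, here is a generic HMAC-based sketch. It is not S3's actual presigning scheme (S3 uses its own Signature Version 4 format); the secret, the query parameter names, and both function names are assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical server-side signing key


def _signature(object_key: str, expires_at: int, secret: bytes) -> str:
    payload = f"PUT\n{object_key}\n{expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_upload_url(base_url: str, object_key: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Issued by the chat backend; lets the client upload one object until expiry."""
    query = urlencode({"expires": expires_at, "sig": _signature(object_key, expires_at, secret)})
    return f"{base_url}/{object_key}?{query}"


def verify_upload_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Checked by the storage front end before accepting the upload."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path.lstrip("/"), expires_at, secret)
    # compare_digest avoids leaking the signature through timing differences.
    return now < expires_at and hmac.compare_digest(sig, expected)
```

Because the object key and expiry are both covered by the signature, a client cannot reuse the URL for a different object or extend its lifetime, and the upload never passes through the chat servers.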
10. E2EE (Signal protocol)
- Double Ratchet + X3DH key agreement.
- Server stores ciphertext only (and prekeys).
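- Multi-device: each device has its own keys; for 1:1 chats the sender encrypts once per recipient device. For groups, Sender Keys lets the sender encrypt the message once and distribute the sender key to each member device over the pairwise sessions.
- Trade-off: server-side search and ML on content are impossible, so those features move to the client.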
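One small piece of the Double Ratchet is easy to show in isolation: the symmetric-key chain step, which derives a fresh message key per message and then advances the chain key. The sketch below uses HMAC-SHA256 with the constants 0x01 and 0x02, which is the construction the Double Ratchet specification recommends for this step. It deliberately omits X3DH, the Diffie-Hellman ratchet, and the actual message encryption, so it is not usable as an encryption scheme on its own; the function name is hypothetical.

```python
import hashlib
import hmac


def kdf_chain_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance a sending or receiving chain by one message.

    Returns (next_chain_key, message_key). The old chain key should be
    discarded afterwards so that earlier message keys cannot be re-derived.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key
```

Both sides start from the same chain key and step in lockstep, so they derive identical message keys without ever sending them, and the server only ever sees ciphertext.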
11. Failure modes
- Edge node dies: clients reconnect to another edge, the user_id → edge_node mapping and presence are updated, and the conversation continues.
- Message dedup: idempotent on (sender_id, client_msg_id).
- Out-of-order during reconnect: catch-up cursor (resume from last_seen_message_id).
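The dedup and catch-up rules can be sketched together. This is an in-memory illustration with hypothetical names (`ConversationLog`, `append`, `catch_up`); a simple counter stands in for the server-assigned message_id.

```python
class ConversationLog:
    def __init__(self):
        self._messages = []  # dicts in ascending message_id order
        self._seen = {}      # (sender_id, client_msg_id) -> message_id
        self._next_id = 1

    def append(self, sender_id: str, client_msg_id: str, body: str) -> int:
        """Idempotent on (sender_id, client_msg_id): a retry returns the original id."""
        key = (sender_id, client_msg_id)
        if key in self._seen:
            return self._seen[key]
        message_id = self._next_id
        self._next_id += 1
        self._messages.append({"message_id": message_id, "sender_id": sender_id, "body": body})
        self._seen[key] = message_id
        return message_id

    def catch_up(self, last_seen_message_id: int, limit: int = 100) -> list:
        """Everything after the client's cursor, oldest first, for use after a reconnect."""
        newer = [m for m in self._messages if m["message_id"] > last_seen_message_id]
        return newer[:limit]
```

A client that reconnects after a dropped connection resends any unacked message with the same client_msg_id and then calls the catch-up path with its cursor, so it neither duplicates nor misses messages.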
12. What to take away
Strong answers cover: connection protocol + scale of connections per box, ordering scheme, storage schema with partition key, fan-out hybrid, presence in Redis, push pipeline, and one failure recovery story.
Resources
- 🎥 System Design Interview — Chat (Alex Xu)
- 📖 Discord — How we store billions of messages
- 📖 WhatsApp Engineering — 1M connections per server
- 📖 Signal Protocol — Double Ratchet
Practice Problem: Design Twitter (Medium)