system designintermediate 12m2026-06-09

News Feed / Timeline System — Fanout-on-Read vs Write, Ranking

Session 35 of the 48-session learning series.

Date: Mon, 2026-07-06 · Time: 18:00–20:00 IST · Track: 🏗️ System Design (SYS) · Parent 28-day topic: Day 19 · Est. read: 2 h

Why this session matters

This is Session 35 of 48 in the System Design track. Every consumer app — Twitter, Instagram, LinkedIn, TikTok — has a feed at its heart. The architectural choice between fanout-on-write and fanout-on-read shapes the entire system's cost, latency, and scale ceiling. Interview-favourite for a reason.

Agenda

Push (fanout-on-write) vs pull (fanout-on-read) — when each wins
Hybrid fanout — the celebrity user problem
Feed ranking — chrono → weighted → ML-ranked
Read path optimisation — cache, materialised feeds
Write path scalability — sharding, async pipelines, fan-out service

Pre-read (skim before the session)

Deep dive

1. The two fundamental approaches

Fanout-on-write (push) — when user A posts, the system writes A's post into the feed-cache of every follower immediately. Read is then a cheap lookup.

Fanout-on-read (pull) — when user X opens their feed, the system queries posts from everyone X follows and merges them in real time.

PUSH                              PULL
post → fanout                     post → store
   ├→ follower1.feed                ↓
   ├→ follower2.feed              feed open?
   └→ ... N followers              ↓
                                  query all-followed
read = single cache hit           merge top-K

The classic trade-off:

Model	Write cost	Read cost	Stale risk
Push	O(followers)	O(1)	low
Pull	O(1)	O(followed × posts/sec)	high

2. Who uses what

Twitter (early) — pull for celebrities, push for everyone else. Hybrid. Instagram — primarily push, ranking layer on top. Facebook — pull-heavy with aggressive caching. LinkedIn — hybrid; ranks at read time.

In 2026, the trend is hybrid with ML ranking dominating the experience over raw chronology.

3. The celebrity user problem

Lady Gaga has 80M followers. If she posts and we push to all 80M feeds: 80M writes + 80M cache invalidations. Slow, expensive, error-prone.

Solutions:

Pull-on-celebrities, push-on-normals — define a threshold (e.g. >1M followers); their posts are pulled at read time, mixed with pushed posts from normal accounts.
Lazy push — only push to active followers (logged in last 7 days).
Async fanout with priority queues — push immediately to active power-users; trickle to inactive followers.
Materialise per-celebrity feed shard — followers don't get the post in their feed; their feed-render pulls the celebrity post separately.

4. The data model

Per user, conceptually:

inbox: deque[(post_id, ts, score)]  # max ~500–2000 items
following: set[user_id]
followers: set[user_id]

In practice:

inbox lives in Redis (or per-user partitioned Kafka).
following/followers in graph DB or sharded RDBMS.
post_body in object store; ID is the cache key.

5. Write path

user posts
   ↓
write to posts.db (durable)
   ↓
publish event to Kafka topic "post.created"
   ↓
fanout service consumes:
   - look up followers
   - for active followers: push post_id into their inbox cache
   - for inactive: skip (lazy-load on next visit)
   - if celebrity: skip fanout entirely

Fan-out service is the heavy-iron. Typically: thousands of consumers, sharded by follower-id, retry queue for failed pushes.

6. Read path

user opens feed
   ↓
fetch user.inbox from Redis (push results)
   ↓
fetch celebrity_posts where author in user.celebrities (pull)
   ↓
fetch newly-posted items from active follows not yet pushed
   ↓
merge + rank + paginate
   ↓
hydrate post bodies, author info, engagement counts

Latency budget for the whole thing: < 200 ms. Each step has tight sub-budgets.

7. Ranking — chronological → ML

Era 1: pure chronological. Easy, predictable, falls apart at scale (firehose).

Era 2: weighted recency:

score = engagement_score * exp(-age_hours / half_life)

Tune by content type; tweak constantly.

Era 3: ML-ranked (where everyone is now):

Retrieve candidates (push inbox + pull celebrities + suggested = ~1000).
Score each with a click-prediction model (S27).
Re-rank for diversity, freshness, advertiser slots.
Serve top-50.

Per-request: ~5 ms ranking budget. Pre-compute features; serve model with low-latency stack.

8. Counters, likes, replies

Engagement signals power ranking. Showing accurate counts to millions in real time is its own subproblem:

Sharded counters — partition by post_id % N; merge at read.
Approximate counts — HyperLogLog for unique likers if exact not needed.
Lazy increment — write to Kafka; aggregator updates counter every N seconds.
Caching with staleness — TTL 30s on "like count"; user perceives "live".

9. Storage tiering

Hot — last 24 hours. In Redis + memory. Reads bypass disk.
Warm — last 30 days. In SSD-backed RDBMS / NoSQL.
Cold — older. In object store; rarely queried.

Most reads hit hot. Cost of cold storage is negligible; throwing posts away makes legal/UX problems.

10. Failure modes

Fan-out service backlog — bursty viral post stalls behind queue. Mitigate with sharded queues, autoscaling, backpressure.
Inbox cache cold — Redis flush wipes feed; cold start storms backend. Mitigate with cache warmer + soft fallback.
Celebrity broadcast storm — every follower reloads at the same moment. Mitigate with stagger + CDN-cached celebrity post.
Schema migration — feed format change; old client crashes. Mitigate with version field; tolerate forever.

11. Notification fan-out

Same shape as feed fan-out but lower latency, higher priority. Push notifications go through:

APNS / FCM / WebPush adapters.
Per-user preferences (do not disturb).
De-dup (don't notify same like twice).
Rate limit per user.

Often a separate service from feed fanout, sharing the same event bus.

12. Reality check

Build order for a new feed:

Pull-only, paginated, no ranking — get product launched.
Add Redis cache for hot posts; TTL = a few minutes.
Add push fanout for active users; keep pull as fallback.
Add celebrity special-case at the threshold that bites you.
Add ML ranker once you have engagement data to learn from.

Don't over-engineer day 1. But know where the road leads.

Link: https://leetcode.com/problems/design-twitter/
Difficulty: Medium
Why this problem: Implement post + follow + topK feed merge. The model problem for fanout systems. (Yes, we saw this in S27 — different lens here, system-design framing.)
Time-box: 30 minutes. Look up the editorial only after.

Post-session checklist

By the end of this session you should be able to:

Compare push vs pull fanout on write cost, read cost, staleness.
Identify the celebrity-user problem and 3 mitigations.
Sketch the write path including Kafka + fanout service + Redis inbox.
Sketch the read path including merge of pushed + pulled + suggested.
Explain why chronological-only feeds fail at scale.
Solve design-twitter — heap-based k-way merge of per-user post streams.

Generated from sessions_data.py + content_part*.py. To edit a video / leetcode / title, edit the data file and re-run write_sessions.py.

← previous

LLM Serving Part 2 — Speculative Decoding, Quantisation, Throughput

Multimodal LLMs — Vision, Language, Audio, Tool Use Combined