Kafka 101 for ML engineers
Topics, partitions, consumer groups — the parts of Kafka that actually matter when you put ML features behind it.
ML systems rarely fail because the model is bad. They fail because the feature pipeline upstream is bad. Kafka is one of the most common pieces of that pipeline. Here's the part you actually need to know.
The mental model
Kafka is a durable, partitioned, replayable log. That's the whole product.
- A topic is a named log.
- A topic is split into partitions, each an ordered append-only file.
- Producers append to a partition, chosen by hashing the record key when one is set.
- Consumers read sequentially and commit their offset back to Kafka.
flowchart LR
    P1[Producer A] --> T1[(Topic: user-events)]
    P2[Producer B] --> T1
    T1 --> CG1[Consumer Group: feature-store]
    T1 --> CG2[Consumer Group: realtime-ranker]
    CG1 --> FS[(Feature store)]
    CG2 --> RR[Realtime ranker]
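If the mental model feels abstract, a toy version fits in about thirty lines. This is a minimal in-memory sketch, not Kafka: `ToyTopic`, `produce`, and `poll` are hypothetical names, CRC32 stands in for the real key hash, and offsets are committed on every read for simplicity.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a topic: a list of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]
        # Committed offsets are tracked per (consumer group, partition).
        self.committed = defaultdict(int)

    def produce(self, key, value):
        """Append a record and return (partition, offset)."""
        # Assumption: CRC32 replaces the hash a real client would use.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1

    def poll(self, group, partition, max_records=10):
        """Read sequentially from the group's committed offset, then commit."""
        start = self.committed[(group, partition)]
        records = self.partitions[partition][start:start + max_records]
        self.committed[(group, partition)] = start + len(records)
        return records


topic = ToyTopic("user-events", num_partitions=3)
p, first = topic.produce("user-42", "impression")
_, second = topic.produce("user-42", "click")

batch = topic.poll("feature-store", p)     # reads both records, commits offset 2
replay = topic.poll("realtime-ranker", p)  # a different group starts from offset 0
```

The log itself never changes when someone reads it. Each group only moves its own offset, which is what makes the log replayable.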
Partition keys: get this right
Kafka guarantees order within a partition, not across partitions. If you key by user_id, every event for that user lands on the same partition, so order is preserved per user. Forget the key and a user's events scatter across partitions, and you'll be debugging "why did the model see a click before the impression" for a week.
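To see why the key matters, here is a small simulation of keyed versus unkeyed writes. It is a sketch under assumptions: CRC32 stands in for the client's real key hash, plain round-robin stands in for whatever your client does with records that have no key, and `partition_for` and `partitions_touched` are hypothetical helpers.

```python
import zlib
from itertools import count

NUM_PARTITIONS = 4


def partition_for(key):
    # Assumption: CRC32 replaces the hash a real client would use.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


events = [
    ("user-1", "impression"),
    ("user-2", "impression"),
    ("user-1", "click"),
    ("user-2", "click"),
]

# Keyed: the partition is a pure function of user_id.
keyed = {p: [] for p in range(NUM_PARTITIONS)}
for user_id, event in events:
    keyed[partition_for(user_id)].append((user_id, event))

# Unkeyed: records are spread over partitions with no regard for the user.
unkeyed = {p: [] for p in range(NUM_PARTITIONS)}
next_slot = count()
for user_id, event in events:
    unkeyed[next(next_slot) % NUM_PARTITIONS].append((user_id, event))


def partitions_touched(layout, user_id):
    """Return the set of partitions holding at least one event for user_id."""
    return {p for p, log in layout.items() if any(u == user_id for u, _ in log)}


user_1_home = partition_for("user-1")
user_1_events = [e for u, e in keyed[user_1_home] if u == "user-1"]
```

With the key, `user-1` lives on exactly one partition and the impression precedes the click there. Without it, the same two events sit on different partitions, and nothing constrains which one a consumer sees first.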
Consumer groups in one line
Consumers in the same group split partitions among themselves. Consumers in different groups each read the whole topic.
That single property is what lets your feature store, your monitoring, and your retraining job all coexist on one topic without stepping on each other.
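The split-within, duplicate-across rule can be sketched in a few lines. This assumes a simplified round-robin assignment; real clients let you choose the assignment strategy, and the member names and the `assign` helper below are made up for illustration.

```python
def assign(partitions, members):
    """Round-robin the partitions over the members of ONE consumer group."""
    ordered = sorted(members)
    assignment = {m: [] for m in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


partitions = list(range(6))

# Hypothetical groups and members reading the same six-partition topic.
groups = {
    "feature-store": ["fs-1", "fs-2", "fs-3"],
    "monitoring": ["mon-1"],
    "retraining": ["train-1", "train-2"],
}

# Each group gets its own independent assignment over ALL partitions.
plan = {group: assign(partitions, members) for group, members in groups.items()}
```

Inside `feature-store`, each of the three members owns two partitions. The single `monitoring` member owns all six. No group's assignment depends on any other group existing.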
Things that will bite you
The minimum useful diagram
For ML use cases, picture three flows on top of every topic:
- Hot path — realtime feature aggregation feeding the ranker.
- Warm path — minute-level rollups into your feature store.
- Cold path — periodic dump to your lake for training.
Same topic. Three consumer groups. Three SLAs. That's Kafka for ML in one sentence.
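In practice, "three consumer groups" mostly comes down to three values of `group.id`. Here is a rough sketch of what the configs might look like, using librdkafka-style property names. The broker address, the `lake-dump` group name, and the offset-reset choices are assumptions for illustration, not recommendations.

```python
# Hypothetical broker address; replace with your own cluster.
BASE = {"bootstrap.servers": "localhost:9092"}


def consumer_config(group_id, offset_reset):
    """Build a consumer config dict that differs only in group and reset policy."""
    return {**BASE, "group.id": group_id, "auto.offset.reset": offset_reset}


PATHS = {
    # Hot path: assumed to prefer fresh events when it has no committed offset.
    "hot": consumer_config("realtime-ranker", "latest"),
    # Warm path: assumed to want the full history for complete rollups.
    "warm": consumer_config("feature-store", "earliest"),
    # Cold path: assumed to dump everything to the lake.
    "cold": consumer_config("lake-dump", "earliest"),
}

group_ids = sorted(cfg["group.id"] for cfg in PATHS.values())
```

Because the group ids differ, each path keeps its own committed offsets and can lag, restart, or replay without touching the other two.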