Structured plan · 28 days · 5 tracks
The 28-Day Engineering Plan
One deep topic per day, rotating across Data Engineering, Machine Learning, AI & LLMs, OOP & Programming, and System Design. Each day is self-contained: a primary video, three readings, a hands-on exercise, a LeetCode problem, a reflection prompt, and ~1,000 words of distilled notes with diagrams.
28 days · self-paced · standalone, but stronger in sequence.
- D01
Day 01 — Transformer Internals — Attention, Embeddings, Positional Encoding
Every modern LLM, agent and RAG stack rests on the transformer. Knowing how Q/K/V flow through multi-head attention with residual streams is the unlock for prom…
- D02
Day 02 — Apache Spark Architecture — Driver, Executors, Shuffles, Catalyst
Spark is still the workhorse for petabyte ETL and feature engineering. Understanding the execution model is the difference between a 9-minute job and a 9-hour j…
- D03
Day 03 — Gradient Boosted Trees — XGBoost / LightGBM, Loss, Regularisation
On tabular data (still the majority of business ML) GBDTs often match or beat deep nets and are a common default at credit, fraud and ads shops. Knowing the loss math and the…
- D04
Day 04 — Designing a URL Shortener at Scale — IDs, Storage, Cache, CDN
The classic warm-up for every system design loop. It exercises ID generation, key-value modelling, caching, hot-key handling, analytics and CDN edge — all trans…
- D05
Day 05 — SOLID Principles + Strategy / Factory / Observer Patterns in Python
Clean OO design isn't legacy folklore — it's how you keep agent frameworks, data pipelines, and microservices maintainable. SOLID + a handful of patterns is the…
- D06
Day 06 — Retrieval-Augmented Generation (RAG) End-to-End
RAG is arguably the most-shipped LLM pattern in industry today: most internal knowledge bots, support agents and code-search tools are some flavour of it. Knowing chunking…
- D07
Day 07 — Apache Kafka Deep Dive — Partitions, Replication, Consumer Groups, Exactly-Once
Kafka is the universal log of modern data infra. Mastery of partition keys, consumer-group rebalancing, and EOS is the difference between a streaming system tha…
- D08
Day 08 — Embeddings, Vector Spaces & Contrastive Learning
Embeddings power search, RAG, recsys, clustering, deduplication and anomaly detection. Understanding *why* a contrastive objective produces useful vectors (vs s…
- D09
Day 09 — CAP, PACELC, Consensus — Raft, Quorums, and Realistic Trade-offs
Every distributed system you'll design has to make a CAP-style call. Understanding Raft / Paxos and quorum reads/writes lets you reason precisely instead of wav…
- D10
Day 10 — Concurrency Models — Threads, Asyncio, GIL, Actors
Every backend you build will block on IO or compute. Knowing *which* concurrency model to pick (and *why*) can cut latency by as much as 10× and prevent the classes of bugs…
- D11
Day 11 — Function Calling, Tool Use, and Agentic Loops
Tool calling turns LLMs from text generators into autonomous workers. Mastering the agent loop (plan → call → observe → continue) is the bedrock of every Copilo…
- D12
Day 12 — Lakehouse Architecture — Delta Lake / Iceberg / Hudi, ACID on Object Storage
The lakehouse is now the default analytics substrate (Databricks, Snowflake Iceberg, Microsoft Fabric, AWS Glue Iceberg). ACID + time travel + schema evolution…
- D13
Day 13 — MLOps — Experiment Tracking, Model Registry, CI/CD for Models
Models that don't ship don't matter. MLOps is the engineering wrapper that turns notebook experiments into versioned, monitored, retrainable production assets.
- D14
Day 14 — Sharding, Replication & Multi-Region Databases
The moment one database can't hold your data, you shard. The moment one region can't serve your users, you go multi-region. Both decisions cascade into every ot…
- D15
Day 15 — Memory Model & Garbage Collection — Heap, GC, Leaks, Profiling
High-throughput services live and die by GC. Knowing the heap layout, GC algorithms and how to read a flame graph is the difference between '99p = 80 ms' and '9…
- D16
Day 16 — LLM Evaluation — Benchmarks, LLM-as-Judge, RAGAS, Inspect
If you can't measure it, you can't ship it. Modern LLM eval is its own discipline — task-specific benchmarks, golden sets, LLM judges with rubrics, and slice-le…
- D17
Day 17 — Streaming with Flink / Spark Structured Streaming — Watermarks & Windows
Real-time analytics, fraud, IoT and personalisation all flow through stream processors. Watermarks, late data, and exactly-once semantics are the hard parts that…
- D18
Day 18 — Recommender Systems — Two-Tower, Multi-Stage Ranking
Recsys drives YouTube, TikTok, Amazon, Spotify, Pinterest — and is one of the highest-ROI ML problems anywhere. The two-tower retriever + multi-stage ranker is…
- D19
Day 19 — Designing a Chat / Messaging System at Scale
Chat exercises every hard design lever: fan-out vs fan-in, presence, ordering, push vs pull, media uploads, end-to-end encryption. WhatsApp / Slack / Teams patt…
- D20
Day 20 — Idiomatic Python (and C#) — Type Hints, Protocols, Dataclasses, Pattern Matching
Idiomatic code is the difference between a senior who writes maintainable systems and a junior who writes 'Python that runs'. Type hints + Protocols + dataclass…
- D21
Day 21 — LLM Serving — vLLM, Continuous Batching, KV Cache, Speculative Decoding
Inference cost and latency are the dominant operational concerns for any LLM product. vLLM-style continuous batching can give 5-20× throughput; speculative decodin…
- D22
Day 22 — Data Modelling — Dimensional, Data Vault, OBT for the Lakehouse Era
Storage is cheap, but a bad model rots a platform from inside. Knowing when to dimensional-model, when to use Data Vault, and when to flat-OBT determines whethe…
- D23
Day 23 — Multimodal LLMs — Vision-Language, Audio, and Tool-Use Combined
2025 is the year multimodal went default. GPT-4o, Claude 3.5 Sonnet vision, Gemini 1.5/2 — every serious agent now sees and hears. Understanding how visual toke…
- D24
Day 24 — Data Governance, Lineage, Quality — Catalogs, Contracts, Observability
At scale, governance isn't bureaucracy; it's how you keep trust in your data. Lineage, quality contracts, and observability tools are now first-class platform c…
- D25
Day 25 — Practical Fine-Tuning — LoRA / QLoRA, PEFT, Instruction Datasets, DPO
Fine-tuning is back as the way to specialise models for your domain and reduce inference cost. LoRA + QLoRA make it tractable on commodity GPUs; DPO / ORPO have…
- D26
Day 26 — Caching Strategies — CDN, Application Cache, Cache-Aside, Read-Through, Write-Through
Caching is often the single biggest lever for latency and cost, and cache invalidation is famously one of the two hard problems in computer science. Knowing the standard patterns + their failure mo…
- D27
Day 27 — API Design — REST, GraphQL, gRPC; Versioning, Pagination, Errors
APIs are contracts that outlive their authors. Bad API design ripples for years; good API design quietly enables product velocity. Knowing when to pick REST / G…
- D28
Day 28 — Putting It Together — A Production AI Agent (Capstone Day)
Final synthesis day. You've covered transformers, RAG, tools, evals, fine-tuning, serving, multimodal. Today you combine them into one complete agent design — a…
All 28 days available