ai · ml · intermediate · 6m · 2026-06-09
The 48-Session Learning Series
A 48-session deep-prep curriculum across LLMs, ML, system design, data engineering and OOP. From Jun 10 to Jul 17 2026: five and a half weeks of focused study.
Welcome to the 48-Session Learning Series, a structured curriculum I'm following from Jun 10 to Jul 17 2026 (about five and a half weeks). It's the rebuilt, re-paced version of an earlier 28-day deep-prep plan, broken into 48 thinner sessions so each topic gets the breathing room it deserves.
The original 28-day plan looked great on paper. In practice, several days were trying to cram a full afternoon of material into a single evening. Two-hour windows for "transformers from scratch" or "designing Twitter" don't actually deliver the depth they promise.
So I split the heavy topics in two, extended a few into adjacent specialist sessions, and ended up with 48 sessions covering the same surface area. The goal is unchanged: to be genuinely fluent across the LLM / ML / system-design / data-engineering / OOP stack. Only the pacing is better.
Weekday slot: 1 session · 18:00–20:00 IST
Weekend slot: 2 sessions · 09:00–11:00 + 14:30–16:30 IST
Cadence: 9 sessions / week
Track rotation: always interleaved — never two same-track sessions back-to-back
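The cadence above can be checked against the date range. The sketch below is a minimal sanity check, assuming the series runs on every calendar day from Jun 10 to Jul 17 2026 inclusive with no skipped days; the helper name `count_sessions` is hypothetical and not taken from the series repo.

```python
from datetime import date, timedelta


def count_sessions(start: date, end: date,
                   weekday_slots: int = 1, weekend_slots: int = 2) -> int:
    """Count sessions from start to end (inclusive).

    Assumes one slot on each weekday and two on each weekend day,
    matching the schedule listed above.
    """
    total = 0
    day = start
    while day <= end:
        # date.weekday() is 0-4 for Monday-Friday and 5-6 for Saturday-Sunday.
        total += weekend_slots if day.weekday() >= 5 else weekday_slots
        day += timedelta(days=1)
    return total


total = count_sessions(date(2026, 6, 10), date(2026, 7, 17))
full_week = count_sessions(date(2026, 6, 15), date(2026, 6, 21))  # Mon to Sun
print(total, full_week)  # 48 9
```

Under those assumptions the range holds 28 weekdays and 10 weekend days, which gives 28 + 20 = 48 sessions, and a full Monday-to-Sunday week gives the stated 9.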
🧠 LLM — LLMs & agents (13 sessions)
📈 ML — Machine learning (7 sessions)
🏗️ SYS — System design (10 sessions)
🗂️ DE — Data engineering (11 sessions)
🧱 OOP — OOP & languages (7 sessions)
Front-matter — date, time, track, parent topic, estimated read 2 h
Agenda — 5 bullets, what we'll cover
Pre-read — 3–5 papers / blog posts / official docs to skim before
Deep dive — explanations, math where useful, ASCII diagrams, code, real production numbers
Reading material — books / papers / docs to come back to later
In-depth research material — curated external links
Video reference — one YouTube URL, hand-picked
LeetCode problem — URL + difficulty + 2-line hint
Post-session checklist — what you should be able to do / explain
| # | Track | Title |
|---|---|---|
| 01 | 🧠 LLM | Transformers Part 1 — Attention, Q/K/V, Multi-Head |
| 02 | 🗂️ DE | Spark Part 1 — Driver, Executors, RDDs, Lazy Evaluation |
| 03 | 🏗️ SYS | URL Shortener Part 1 — Numbers, IDs, Storage |
| 04 | 🧱 OOP | SOLID Part 1 — SRP, OCP, LSP with Python Examples |
| 05 | 📈 ML | Gradient Boosted Trees Part 1 — Boosting Intuition, Trees, Loss |
| 06 | 🧠 LLM | Transformers Part 2 — Positional Encoding, RoPE, MLP, LayerNorm |
| 07 | 🗂️ DE | Spark Part 2 — Shuffles, Catalyst, AQE, Tuning |
| 08 | 🏗️ SYS | URL Shortener Part 2 — Cache, CDN, Hot Keys, Abuse |
| 09 | 🧠 LLM | RAG Part 1 — Why, Chunking, Embeddings, Vector Stores |
| 10 | 🗂️ DE | Kafka Part 1 — Brokers, Topics, Partitions, Producers |
| 11 | 🧱 OOP | SOLID Part 2 — ISP, DIP, and Design Patterns (Strategy, Factory, Observer) |
| 12 | 🏗️ SYS | CAP, PACELC, Quorums — How Distributed Systems Actually Trade Off |
| 13 | 🧠 LLM | RAG Part 2 — Retrieval, Re-Ranking, Generation, Evaluation |
| 14 | 📈 ML | GBDT Part 2 — XGBoost, LightGBM, Regularisation, In-Practice Tuning |
| 15 | 🗂️ DE | Kafka Part 2 — Replication, ISR, Consumer Groups, Exactly-Once |
| 16 | 🧱 OOP | Concurrency Models — Threads, Asyncio, GIL, Actors |
| 17 | 🧠 LLM | Embeddings, Vector Spaces, Contrastive Learning |
| 18 | 🏗️ SYS | Sharding & Replication — Partition Keys, Hot Spots, Multi-Region |
| 19 | 🗂️ DE | Lakehouse — Delta Lake, Iceberg, Hudi, ACID on Object Storage |
| 20 | 🧱 OOP | Memory Model, GC, Heap, GC Leaks, Profiling |
| 21 | 🧠 LLM | Function Calling, Tool Use, Agentic Loops |
| 22 | 🏗️ SYS | Designing a Chat System — Connections, Fanout, Storage, Delivery |
| 23 | 🗂️ DE | Streaming with Flink/Spark — Watermarks, Windows, State |
| 24 | 📈 ML | MLOps — Experiment Tracking, Model Registry, CI/CD for Models |
| 25 | 🧠 LLM | LLM Evaluation — Benchmarks, LLM-as-Judge, RAGAS, Inspect |
| 26 | 🏗️ SYS | Caching Strategies — CDN, Application Cache, Cache-Aside, Read-Through |
| 27 | 📈 ML | Recommender Systems — Two-Tower, Multi-Stage Ranking |
| 28 | 🗂️ DE | Data Modelling — Dimensional, Data Vault, OBT for the Lakehouse |
| 29 | 🧱 OOP | Idiomatic Python (and a Touch of C++) — Type Hints, Protocols, Dataclasses |
| 30 | 🧠 LLM | LLM Serving Part 1 — vLLM, KV Cache, Continuous Batching |
| 31 | 🏗️ SYS | API Design — REST, GraphQL, gRPC, Versioning, Pagination, Errors |
| 32 | 🗂️ DE | Data Governance — Lineage, Quality, Catalogs, Contracts, Observability |
| 33 | 📈 ML | Practical Fine-Tuning — LoRA, QLoRA, PEFT, Instruction Datasets |
| 34 | 🧠 LLM | LLM Serving Part 2 — Speculative Decoding, Quantisation, Throughput |
| 35 | 🏗️ SYS | News Feed / Timeline System — Fanout-on-Read vs Write, Ranking |
| 36 | 🧠 LLM | Multimodal LLMs — Vision, Language, Audio, Tool Use Combined |
| 37 | 🗂️ DE | Petabyte Cost Optimisation — Compression, Partitioning, Z-Order, File Sizing |
| 38 | 📈 ML | Feature Engineering & Feature Stores at Scale |
| 39 | 🏗️ SYS | Designing a Distributed Job Queue — Reliability, Backoff, Idempotency |
| 40 | 🗂️ DE | Change Data Capture — Debezium, Outbox Pattern, Snapshot+Stream |
| 41 | 🧱 OOP | Testing, Mocks, Property-Based Tests, Mutation Testing |
| 42 | 🧠 LLM | Prompt Engineering at Production Scale — Templates, Caching, Drift |
| 43 | 📈 ML | Online Learning, Bandits, Counterfactual Evaluation |
| 44 | 🏗️ SYS | Designing a Search Engine — Crawl, Index, Query, Ranking |
| 45 | 🧠 LLM | LLM Safety — Jailbreaks, Prompt Injection, Output Filtering, Red-Teaming |
| 46 | 🗂️ DE | Observability for Data Pipelines — SLAs, SLOs, Freshness, Data Tests |
| 47 | 🧱 OOP | Production Error Handling — Retries, Circuit Breakers, Timeouts, Bulkheads |
| 48 | 🧠 LLM | Capstone — Building a Production AI Agent End-to-End |
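The index above can be checked against the two promises made earlier: the per-track session counts and the rule that no two same-track sessions run back-to-back. This is a minimal sketch, with the track sequence transcribed by hand from the index; the names `TRACKS`, `counts` and `back_to_back` are illustrative and not part of the series repo.

```python
from collections import Counter

# Track of each session in index order (01 to 48), transcribed from the index.
TRACKS = (
    "LLM DE SYS OOP ML LLM DE SYS "    # 01-08
    "LLM DE OOP SYS LLM ML DE OOP "    # 09-16
    "LLM SYS DE OOP LLM SYS DE ML "    # 17-24
    "LLM SYS ML DE OOP LLM SYS DE "    # 25-32
    "ML LLM SYS LLM DE ML SYS DE "     # 33-40
    "OOP LLM ML SYS LLM DE OOP LLM"    # 41-48
).split()

# Sessions per track.
counts = Counter(TRACKS)

# 1-based session numbers whose track matches the session right after them.
back_to_back = [
    i + 1 for i in range(len(TRACKS) - 1) if TRACKS[i] == TRACKS[i + 1]
]

print(len(TRACKS), dict(counts), back_to_back)
```

For this transcription the counts come out as 13 LLM, 11 DE, 10 SYS, 7 ML and 7 OOP, and `back_to_back` is empty, so the interleaving rule holds across all 48 sessions.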
Repo: dinesh-coderepo/preparation/48-sessions
If you want to follow along, every session is open. Skip the ones you already know; double down on the ones that bite.