ai · ml · intermediate · 6 min · 2026-06-09

The 48-Session Learning Series

A 48-session deep-prep curriculum across LLMs, ML, system design, data engineering and OOP. From Jun 10 to Jul 17 2026 — five and a half weeks of focused study.

Welcome to the 48-Session Learning Series — a structured curriculum I'm following from Jun 10 → Jul 17 2026 (about five and a half weeks). It's the rebuilt-and-paced version of an earlier 28-day deep-prep plan, broken into 48 thinner sessions so each topic gets the breathing room it deserves.

Why 48 sessions

The original 28-day plan looked great on paper. In practice, several days were trying to cram a full afternoon of material into a single evening. Two-hour windows for "transformers from scratch" or "designing Twitter" don't actually deliver the depth they promise.

So I split the heavy topics in two, extended a few into adjacent specialist sessions, and ended up at 48 thinner sessions across the same surface area. Same goal — be genuinely fluent across the LLM / ML / system-design / data-engineering / OOP stack. Better pacing.

Rhythm

Weekday slot: 1 session · 18:00–20:00 IST
Weekend slot: 2 sessions per day · 09:00–11:00 + 14:30–16:30 IST
Cadence: 9 sessions / week (5 weekday + 4 weekend)
Track rotation: always interleaved — never two same-track sessions back-to-back
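
The arithmetic behind the rhythm is easy to check. Here is a minimal sketch, assuming one session on every weekday and two on every Saturday and Sunday with no skipped days; the names `sessions_on` and `count_sessions` are made up for illustration and are not from the repo.

```python
from datetime import date, timedelta

START = date(2026, 6, 10)  # first day of the series
END = date(2026, 7, 17)    # last day of the series (inclusive)


def sessions_on(day: date) -> int:
    """One session on weekdays, two on Saturday or Sunday."""
    return 2 if day.weekday() >= 5 else 1  # Monday is 0, Sunday is 6


def count_sessions(start: date, end: date) -> int:
    """Sum the sessions for every calendar day from start to end inclusive."""
    total = 0
    day = start
    while day <= end:
        total += sessions_on(day)
        day += timedelta(days=1)
    return total


total_days = (END - START).days + 1
print(total_days, count_sessions(START, END))  # prints: 38 48
```

Under those assumptions the 38 calendar days hold 28 weekday sessions and 20 weekend sessions, which is where the 48 comes from.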

Tracks

🧠 LLM — LLMs & agents (13 sessions)
📈 ML — Machine learning (7 sessions)
🏗️ SYS — System design (10 sessions)
🗂️ DE — Data engineering (11 sessions)
🧱 OOP — OOP & languages (7 sessions)

Each session contains

Front-matter — date, time, track, parent topic, estimated read 2 h
Agenda — 5 bullets, what we'll cover
Pre-read — 3–5 papers / blog posts / official docs to skim before
Deep dive — explanations, math where useful, ASCII diagrams, code, real production numbers
Reading material — books / papers / docs to come back to later
In-depth research material — curated external links
Video reference — one YouTube URL, hand-picked
LeetCode problem — URL + difficulty + 2-line hint
Post-session checklist — what you should be able to do / explain
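
One way to keep every session honest against this template is a small record type with a self-check. This is only a sketch: the `Session` class, its field names, and the `problems` method are hypothetical, and only the sections with a stated size (5 agenda bullets, 3 to 5 pre-reads, one video, one LeetCode link) are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Front-matter fields
    date: str
    time: str
    track: str
    parent_topic: str
    estimated_read_hours: float = 2.0
    # Body sections that have a stated size in the template
    agenda: list[str] = field(default_factory=list)
    pre_read: list[str] = field(default_factory=list)
    video_url: str = ""
    leetcode_url: str = ""

    def problems(self) -> list[str]:
        """Return template violations; an empty list means the session fits."""
        issues = []
        if len(self.agenda) != 5:
            issues.append(f"agenda has {len(self.agenda)} bullets, expected 5")
        if not 3 <= len(self.pre_read) <= 5:
            issues.append(f"pre-read has {len(self.pre_read)} items, expected 3-5")
        if not self.video_url:
            issues.append("missing video reference")
        if not self.leetcode_url:
            issues.append("missing LeetCode problem")
        return issues


# Illustrative draft with placeholder values, deliberately incomplete.
draft = Session(
    date="2026-06-10",
    time="18:00-20:00 IST",
    track="LLM",
    parent_topic="Transformers",
    agenda=["attention", "Q/K/V", "multi-head"],
)
print(draft.problems())
```

Running it on the incomplete draft lists four gaps: a short agenda, no pre-reads, no video, and no LeetCode link.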

The schedule

#	Track	Title
01	🧠 LLM	Transformers Part 1 — Attention, Q/K/V, Multi-Head
02	🗂️ DE	Spark Part 1 — Driver, Executors, RDDs, Lazy Evaluation
03	🏗️ SYS	URL Shortener Part 1 — Numbers, IDs, Storage
04	🧱 OOP	SOLID Part 1 — SRP, OCP, LSP with Python Examples
05	📈 ML	Gradient Boosted Trees Part 1 — Boosting Intuition, Trees, Loss
06	🧠 LLM	Transformers Part 2 — Positional Encoding, RoPE, MLP, LayerNorm
07	🗂️ DE	Spark Part 2 — Shuffles, Catalyst, AQE, Tuning
08	🏗️ SYS	URL Shortener Part 2 — Cache, CDN, Hot Keys, Abuse
09	🧠 LLM	RAG Part 1 — Why, Chunking, Embeddings, Vector Stores
10	🗂️ DE	Kafka Part 1 — Brokers, Topics, Partitions, Producers
11	🧱 OOP	SOLID Part 2 — ISP, DIP, and Design Patterns (Strategy, Factory, Observer)
12	🏗️ SYS	CAP, PACELC, Quorums — How Distributed Systems Actually Trade Off
13	🧠 LLM	RAG Part 2 — Retrieval, Re-Ranking, Generation, Evaluation
14	📈 ML	GBDT Part 2 — XGBoost, LightGBM, Regularisation, In-Practice Tuning
15	🗂️ DE	Kafka Part 2 — Replication, ISR, Consumer Groups, Exactly-Once
16	🧱 OOP	Concurrency Models — Threads, Asyncio, GIL, Actors
17	🧠 LLM	Embeddings, Vector Spaces, Contrastive Learning
18	🏗️ SYS	Sharding & Replication — Partition Keys, Hot Spots, Multi-Region
19	🗂️ DE	Lakehouse — Delta Lake, Iceberg, Hudi, ACID on Object Storage
20	🧱 OOP	Memory Model, GC, Heap, GC Leaks, Profiling
21	🧠 LLM	Function Calling, Tool Use, Agentic Loops
22	🏗️ SYS	Designing a Chat System — Connections, Fanout, Storage, Delivery
23	🗂️ DE	Streaming with Flink/Spark — Watermarks, Windows, State
24	📈 ML	MLOps — Experiment Tracking, Model Registry, CI/CD for Models
25	🧠 LLM	LLM Evaluation — Benchmarks, LLM-as-Judge, RAGAS, Inspect
26	🏗️ SYS	Caching Strategies — CDN, Application Cache, Cache-Aside, Read-Through
27	📈 ML	Recommender Systems — Two-Tower, Multi-Stage Ranking
28	🗂️ DE	Data Modelling — Dimensional, Data Vault, OBT for the Lakehouse
29	🧱 OOP	Idiomatic Python (and a Touch of C++) — Type Hints, Protocols, Dataclasses
30	🧠 LLM	LLM Serving Part 1 — vLLM, KV Cache, Continuous Batching
31	🏗️ SYS	API Design — REST, GraphQL, gRPC, Versioning, Pagination, Errors
32	🗂️ DE	Data Governance — Lineage, Quality, Catalogs, Contracts, Observability
33	📈 ML	Practical Fine-Tuning — LoRA, QLoRA, PEFT, Instruction Datasets
34	🧠 LLM	LLM Serving Part 2 — Speculative Decoding, Quantisation, Throughput
35	🏗️ SYS	News Feed / Timeline System — Fanout-on-Read vs Write, Ranking
36	🧠 LLM	Multimodal LLMs — Vision, Language, Audio, Tool Use Combined
37	🗂️ DE	Petabyte Cost Optimisation — Compression, Partitioning, Z-Order, File Sizing
38	📈 ML	Feature Engineering & Feature Stores at Scale
39	🏗️ SYS	Designing a Distributed Job Queue — Reliability, Backoff, Idempotency
40	🗂️ DE	Change Data Capture — Debezium, Outbox Pattern, Snapshot+Stream
41	🧱 OOP	Testing, Mocks, Property-Based Tests, Mutation Testing
42	🧠 LLM	Prompt Engineering at Production Scale — Templates, Caching, Drift
43	📈 ML	Online Learning, Bandits, Counterfactual Evaluation
44	🏗️ SYS	Designing a Search Engine — Crawl, Index, Query, Ranking
45	🧠 LLM	LLM Safety — Jailbreaks, Prompt Injection, Output Filtering, Red-Teaming
46	🗂️ DE	Observability for Data Pipelines — SLAs, SLOs, Freshness, Data Tests
47	🧱 OOP	Production Error Handling — Retries, Circuit Breakers, Timeouts, Bulkheads
48	🧠 LLM	Capstone — Building a Production AI Agent End-to-End
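
The track counts and the no-back-to-back rule can be checked mechanically against the table. A minimal sketch, with the track codes transcribed by hand from the schedule above; `back_to_back` is a hypothetical helper name.

```python
from collections import Counter

# Track codes for sessions 01..48, in schedule order (12 per line).
SCHEDULE = (
    "LLM DE SYS OOP ML LLM DE SYS LLM DE OOP SYS "
    "LLM ML DE OOP LLM SYS DE OOP LLM SYS DE ML "
    "LLM SYS ML DE OOP LLM SYS DE ML LLM SYS LLM "
    "DE ML SYS DE OOP LLM ML SYS LLM DE OOP LLM"
).split()


def back_to_back(schedule: list[str]) -> list[int]:
    """Return 1-based session numbers whose track repeats the previous session's."""
    return [
        i + 1
        for i in range(1, len(schedule))
        if schedule[i] == schedule[i - 1]
    ]


counts = Counter(SCHEDULE)
print(len(SCHEDULE))           # prints: 48
print(counts["LLM"], counts["DE"], counts["SYS"], counts["ML"], counts["OOP"])
# prints: 13 11 10 7 7
print(back_to_back(SCHEDULE))  # prints: []
```

The empty list confirms that no two consecutive sessions share a track, and the counts match the per-track totals listed under Tracks.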

Source

Repo: dinesh-coderepo/preparation/48-sessions

If you want to follow along, every session is open. Skip the ones you already know; double down on the ones that bite.
