ai ml · intermediate · 6m · 2026-06-09

The 48-Session Learning Series

A 48-session deep-prep curriculum across LLMs, ML, system design, data engineering and OOP. From Jun 10 to Jul 17 2026 — five and a half weeks of focused study.

Welcome to the 48-Session Learning Series — a structured curriculum I'm following from Jun 10 → Jul 17 2026 (about five and a half weeks). It's the rebuilt-and-paced version of an earlier 28-day deep-prep plan, broken into 48 thinner sessions so each topic gets the breathing room it deserves.

Why 48 sessions

The original 28-day plan looked great on paper. In practice, several days were trying to cram a full afternoon of material into a single evening. Two-hour windows for "transformers from scratch" or "designing Twitter" don't actually deliver the depth they promise.

So I split the heavy topics in two, extended a few into adjacent specialist sessions, and ended up at 48 thinner sessions across the same surface area. Same goal — be genuinely fluent across the LLM / ML / system-design / data-engineering / OOP stack. Better pacing.

Rhythm

  • Weekday slot: 1 session · 18:00–20:00 IST
  • Weekend slot: 2 sessions · 09:00–11:00 + 14:30–16:30 IST
  • Cadence: 9 sessions / week (5 weekday + 4 weekend)
  • Track rotation: always interleaved — never two same-track sessions back-to-back
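As a quick sanity check on the rhythm above, the session count can be derived from the date range. This is a minimal sketch that assumes both Jun 10 and Jul 17 2026 are study days and that no day is skipped; the names `sessions_on` and `total_sessions` are illustrative, not from the series repo.

```python
from datetime import date, timedelta

START, END = date(2026, 6, 10), date(2026, 7, 17)


def sessions_on(day: date) -> int:
    """Sessions on a given day: 1 on weekdays, 2 on weekends."""
    # date.weekday() returns 0 for Monday, so 5 and 6 are Saturday and Sunday.
    return 2 if day.weekday() >= 5 else 1


def total_sessions(start: date, end: date) -> int:
    """Sum sessions over the inclusive range [start, end]."""
    days = (end - start).days + 1  # +1 because both endpoints count
    return sum(sessions_on(start + timedelta(days=i)) for i in range(days))


print((END - START).days + 1)      # 38 calendar days
print(total_sessions(START, END))  # 48
```

Under those assumptions the range covers 38 days, 10 of them weekend days, which gives 28 + 20 = 48 sessions.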

Tracks

  • 🧠 LLM — LLMs & agents (13 sessions)
  • 📈 ML — Machine learning (7 sessions)
  • 🏗️ SYS — System design (10 sessions)
  • 🗂️ DE — Data engineering (11 sessions)
  • 🧱 OOP — OOP & languages (7 sessions)

Each session contains

  1. Front-matter — date, time, track, parent topic, estimated read 2 h
  2. Agenda — 5 bullets, what we'll cover
  3. Pre-read — 3–5 papers / blog posts / official docs to skim before
  4. Deep dive — explanations, math where useful, ASCII diagrams, code, real production numbers
  5. Reading material — books / papers / docs to come back to later
  6. In-depth research material — curated external links
  7. Video reference — one YouTube URL, hand-picked
  8. LeetCode problem — URL + difficulty + 2-line hint
  9. Post-session checklist — what you should be able to do / explain
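The first three items of that template could be modelled roughly as below. This is only a sketch: the class name, field names and the `problems` helper are hypothetical, the real session files may be laid out quite differently, and the example values (including the parent topic) are placeholders rather than content from the series.

```python
from dataclasses import dataclass, field
from datetime import date, time


@dataclass
class Session:
    # Front-matter fields from item 1; the field names are hypothetical.
    day: date
    start: time
    track: str
    parent_topic: str
    estimated_read_hours: float = 2.0
    # Items 2 and 3 of the template.
    agenda: list[str] = field(default_factory=list)
    pre_read: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return readable violations of the template's size rules."""
        issues = []
        if len(self.agenda) != 5:
            issues.append(f"agenda has {len(self.agenda)} bullets, expected 5")
        if not 3 <= len(self.pre_read) <= 5:
            issues.append(f"pre-read has {len(self.pre_read)} items, expected 3-5")
        return issues


# Placeholder values for illustration only.
example = Session(
    day=date(2026, 6, 10),
    start=time(18, 0),
    track="LLM",
    parent_topic="Transformers",
    agenda=[f"agenda item {i}" for i in range(1, 6)],
    pre_read=["paper", "blog post", "official docs"],
)
print(example.problems())  # [] when the template rules are met
```

Keeping the rules in one method makes it easy to lint every session file the same way before publishing.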

The schedule

# · Track · Title
01 · 🧠 LLM · Transformers Part 1 — Attention, Q/K/V, Multi-Head
02 · 🗂️ DE · Spark Part 1 — Driver, Executors, RDDs, Lazy Evaluation
03 · 🏗️ SYS · URL Shortener Part 1 — Numbers, IDs, Storage
04 · 🧱 OOP · SOLID Part 1 — SRP, OCP, LSP with Python Examples
05 · 📈 ML · Gradient Boosted Trees Part 1 — Boosting Intuition, Trees, Loss
06 · 🧠 LLM · Transformers Part 2 — Positional Encoding, RoPE, MLP, LayerNorm
07 · 🗂️ DE · Spark Part 2 — Shuffles, Catalyst, AQE, Tuning
08 · 🏗️ SYS · URL Shortener Part 2 — Cache, CDN, Hot Keys, Abuse
09 · 🧠 LLM · RAG Part 1 — Why, Chunking, Embeddings, Vector Stores
10 · 🗂️ DE · Kafka Part 1 — Brokers, Topics, Partitions, Producers
11 · 🧱 OOP · SOLID Part 2 — ISP, DIP, and Design Patterns (Strategy, Factory, Observer)
12 · 🏗️ SYS · CAP, PACELC, Quorums — How Distributed Systems Actually Trade Off
13 · 🧠 LLM · RAG Part 2 — Retrieval, Re-Ranking, Generation, Evaluation
14 · 📈 ML · GBDT Part 2 — XGBoost, LightGBM, Regularisation, In-Practice Tuning
15 · 🗂️ DE · Kafka Part 2 — Replication, ISR, Consumer Groups, Exactly-Once
16 · 🧱 OOP · Concurrency Models — Threads, Asyncio, GIL, Actors
17 · 🧠 LLM · Embeddings, Vector Spaces, Contrastive Learning
18 · 🏗️ SYS · Sharding & Replication — Partition Keys, Hot Spots, Multi-Region
19 · 🗂️ DE · Lakehouse — Delta Lake, Iceberg, Hudi, ACID on Object Storage
20 · 🧱 OOP · Memory Model, GC, Heap, GC Leaks, Profiling
21 · 🧠 LLM · Function Calling, Tool Use, Agentic Loops
22 · 🏗️ SYS · Designing a Chat System — Connections, Fanout, Storage, Delivery
23 · 🗂️ DE · Streaming with Flink/Spark — Watermarks, Windows, State
24 · 📈 ML · MLOps — Experiment Tracking, Model Registry, CI/CD for Models
25 · 🧠 LLM · LLM Evaluation — Benchmarks, LLM-as-Judge, RAGAS, Inspect
26 · 🏗️ SYS · Caching Strategies — CDN, Application Cache, Cache-Aside, Read-Through
27 · 📈 ML · Recommender Systems — Two-Tower, Multi-Stage Ranking
28 · 🗂️ DE · Data Modelling — Dimensional, Data Vault, OBT for the Lakehouse
29 · 🧱 OOP · Idiomatic Python (and a Touch of C++) — Type Hints, Protocols, Dataclasses
30 · 🧠 LLM · LLM Serving Part 1 — vLLM, KV Cache, Continuous Batching
31 · 🏗️ SYS · API Design — REST, GraphQL, gRPC, Versioning, Pagination, Errors
32 · 🗂️ DE · Data Governance — Lineage, Quality, Catalogs, Contracts, Observability
33 · 📈 ML · Practical Fine-Tuning — LoRA, QLoRA, PEFT, Instruction Datasets
34 · 🧠 LLM · LLM Serving Part 2 — Speculative Decoding, Quantisation, Throughput
35 · 🏗️ SYS · News Feed / Timeline System — Fanout-on-Read vs Write, Ranking
36 · 🧠 LLM · Multimodal LLMs — Vision, Language, Audio, Tool Use Combined
37 · 🗂️ DE · Petabyte Cost Optimisation — Compression, Partitioning, Z-Order, File Sizing
38 · 📈 ML · Feature Engineering & Feature Stores at Scale
39 · 🏗️ SYS · Designing a Distributed Job Queue — Reliability, Backoff, Idempotency
40 · 🗂️ DE · Change Data Capture — Debezium, Outbox Pattern, Snapshot+Stream
41 · 🧱 OOP · Testing, Mocks, Property-Based Tests, Mutation Testing
42 · 🧠 LLM · Prompt Engineering at Production Scale — Templates, Caching, Drift
43 · 📈 ML · Online Learning, Bandits, Counterfactual Evaluation
44 · 🏗️ SYS · Designing a Search Engine — Crawl, Index, Query, Ranking
45 · 🧠 LLM · LLM Safety — Jailbreaks, Prompt Injection, Output Filtering, Red-Teaming
46 · 🗂️ DE · Observability for Data Pipelines — SLAs, SLOs, Freshness, Data Tests
47 · 🧱 OOP · Production Error Handling — Retries, Circuit Breakers, Timeouts, Bulkheads
48 · 🧠 LLM · Capstone — Building a Production AI Agent End-to-End
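The schedule can be checked against the stated track counts and the interleaving rule with a few lines. This is a minimal sketch: `ORDER` is the track column of the schedule transcribed by hand, and the variable names are illustrative rather than anything from the series repo.

```python
from collections import Counter

# Track column of the schedule, sessions 01 to 48, transcribed by hand.
ORDER = (
    "LLM DE SYS OOP ML LLM DE SYS "   # 01-08
    "LLM DE OOP SYS LLM ML DE OOP "   # 09-16
    "LLM SYS DE OOP LLM SYS DE ML "   # 17-24
    "LLM SYS ML DE OOP LLM SYS DE "   # 25-32
    "ML LLM SYS LLM DE ML SYS DE "    # 33-40
    "OOP LLM ML SYS LLM DE OOP LLM"   # 41-48
).split()

# Sessions per track, to compare with the Tracks list.
counts = Counter(ORDER)

# 1-based numbers of sessions that share a track with the next session.
back_to_back = [
    i + 1 for i, (a, b) in enumerate(zip(ORDER, ORDER[1:])) if a == b
]

print(len(ORDER))     # 48
print(dict(counts))   # {'LLM': 13, 'DE': 11, 'SYS': 10, 'OOP': 7, 'ML': 7}
print(back_to_back)   # [] means no two adjacent sessions share a track
```

The counts match the Tracks list (13 + 7 + 10 + 11 + 7 = 48), and the empty list confirms that no track appears twice in a row.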

Source

Repo: dinesh-coderepo/preparation/48-sessions

If you want to follow along, every session is open. Skip the ones you already know; double down on the ones that bite.