Idiomatic Python (and a Touch of C++) — Type Hints, Protocols, Dataclasses
Session 29 of the 48-session learning series.
Date: Thu, 2026-07-02 · Time: 18:00–20:00 IST · Track: 🧱 OOP & Languages (OOP) · Parent 28-day topic: Day 20 · Est. read: 2 h
Why this session matters
This is Session 29 of 48 in the OOP & Languages track. "Idiomatic" Python looks like Python; novice Python looks like Java written in Python's syntax. The gap shows up in code reviews, in interviews, and in maintainability over years. A bit of modern C++ contrast keeps you sharp on what "fast" and "low-level" actually mean.
Agenda
- Type hints, generics, TypeVar, ParamSpec: modern Python typing
- Protocols vs ABCs: structural vs nominal subtyping
- Dataclasses, frozen, slots; when to use Pydantic instead
- Pythonic idioms: comprehensions, context managers, generators, dunder methods
- A short detour: equivalent C++ idioms (RAII, templates, concept)
Pre-read (skim before the session)
- PEP 484 — Type hints
- PEP 544 — Protocols (structural typing)
- PEP 695 — Type Parameter Syntax (3.12+)
- Brett Slatkin — Effective Python (2nd ed.)
Deep dive
1. Why type hints
They don't enforce anything at runtime. So why bother?
- mypy / pyright catch a real class of bugs (None passed to a str param) at PR time.
- IDE autocomplete becomes useful: methods, fields, and return values all surface.
- Documentation that doesn't drift: the type is the spec.
- Refactoring is safer: rename a field and the type checker finds every typed caller.
Cost: ~10% more keystrokes; pays back within a quarter on any non-trivial codebase.
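A minimal sketch of the "None passed to a str param" bug class, assuming a hypothetical `find_user` / `greet` pair and an in-memory `USERS` table (all invented for illustration). The commented-out call is the kind of line a checker such as mypy or pyright would be expected to reject; the rest runs as written.

```python
from typing import Optional

# Hypothetical in-memory "database", invented for this sketch.
USERS = {1: "ada", 2: "grace"}

def find_user(user_id: int) -> Optional[str]:
    """Return the user name, or None when the id is unknown."""
    return USERS.get(user_id)

def greet(name: str) -> str:
    return f"Hello, {name.title()}!"

# greet(find_user(3))
# A type checker flags the call above: Optional[str] is not str.
# At runtime it would fail with AttributeError on None.title().

def greet_user(user_id: int) -> str:
    name = find_user(user_id)
    if name is None:            # narrowing: below this check, name is a str
        return "Hello, stranger!"
    return greet(name)

print(greet_user(1))
print(greet_user(3))
```

The `if name is None` branch is what the checker pushes you to write; without hints, the missing branch tends to surface only in production.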
2. Modern typing essentials
from typing import Optional, Sequence, Iterator, Callable
from collections.abc import Mapping

def top_k(items: Sequence[int], k: int = 10) -> list[int]:
    return sorted(items, reverse=True)[:k]

def parse(text: str) -> Optional[dict]:
    ...

def stream() -> Iterator[bytes]:
    ...

Handler = Callable[[str, int], bool]
Python 3.9+: list[int] instead of List[int]. 3.10+: int | None instead of Optional[int]. 3.12+: cleaner type syntax (PEP 695).
3. Generics and TypeVar
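from typing import TypeVar

T = TypeVar("T")  # only needed for the pre-3.12 spelling

def first[T](items: list[T]) -> T:  # 3.12+ syntax (PEP 695); this T is scoped to the function
    return items[0]

def first_legacy(items: list[T]) -> T:  # pre-3.12: uses the module-level TypeVar
    return items[0]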
Use generics on containers, factories, and any function whose return type depends on input type.
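A small sketch of the container and "return type depends on input type" cases, using the pre-3.12 spelling so it runs on older interpreters. `Stack` and `first_or` are hypothetical names chosen for this example.

```python
from typing import Generic, TypeVar

T = TypeVar("T")

class Stack(Generic[T]):
    """A minimal typed stack."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)

def first_or(items: list[T], default: T) -> T:
    # The return type follows the element type of the arguments.
    return items[0] if items else default

ints: Stack[int] = Stack()
ints.push(1)
ints.push(2)
top = ints.pop()               # a checker infers int here
name = first_or([], "anon")    # and str here
```

With these annotations, `ints.push("x")` would be reported by a type checker even though it runs fine at runtime.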
4. Protocols (structural typing)
ABCs (abc.ABC) require explicit inheritance — nominal typing. Protocols check "does this object have these methods?" at type-check time — structural typing (Go interfaces, TS interfaces).
from typing import Protocol

class Readable(Protocol):
    def read(self, n: int = -1) -> bytes: ...

def consume(src: Readable) -> bytes:
    return src.read()

# Works with a binary file, BytesIO, anything with a matching .read() method; no inheritance required.
Use Protocols when:
- Defining duck-typed APIs.
- Decoupling from a specific class hierarchy.
- Mocking — your test double satisfies the Protocol; no inheritance ceremony.
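The mocking case can be sketched as follows. `FakeSource` is a hypothetical test double; `runtime_checkable` is added here only to show an `isinstance` check and is not required for static checking.

```python
import io
from typing import Protocol, runtime_checkable

@runtime_checkable
class Readable(Protocol):
    def read(self, n: int = -1) -> bytes: ...

def consume(src: Readable) -> bytes:
    return src.read()

class FakeSource:
    """Hypothetical test double: no base class, just the right method."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def read(self, n: int = -1) -> bytes:
        return self._payload if n < 0 else self._payload[:n]

real = consume(io.BytesIO(b"from BytesIO"))
fake = consume(FakeSource(b"from fake"))

# runtime_checkable only checks that the method exists, not its signature.
is_readable = isinstance(FakeSource(b""), Readable)
```

Because the match is structural, the test double lives entirely in the test module and the production code never has to import it.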
5. Dataclasses
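from dataclasses import dataclass, field

@dataclass(frozen=True, slots=True)  # slots=True requires Python 3.10+
class Point:
    x: float
    y: float
    label: str = "anon"
    # compare=False keeps the unhashable dict out of __eq__ and __hash__,
    # so Point instances stay usable as cache keys.
    metadata: dict = field(default_factory=dict, compare=False)

p = Point(1.0, 2.0)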
- frozen=True: immutable, and hashable as long as every compared field is itself hashable (good for cache keys).
- slots=True: no __dict__; smaller memory; roughly 20% faster attribute access (workload-dependent).
- field(default_factory=...): for mutable defaults; never field=[].
Default to dataclass for plain data carriers. Reach for Pydantic when you need validation/parsing from JSON.
6. Pydantic v2
from pydantic import BaseModel, Field

class User(BaseModel):
    id: int
    email: str = Field(pattern=r"[^@]+@[^@]+\.[^@]+")
    age: int | None = None

u = User.model_validate_json('{"id": 1, "email": "a@b.com"}')
Use Pydantic at the edges of your system (HTTP request parsing, config loading). Use dataclasses internally. Mixing them inside business logic creates redundant validation.
7. Comprehensions, generators, the itertools toolbox
squares = [x*x for x in range(10)]
even_squares = [x*x for x in range(10) if x % 2 == 0]
# users: any iterable of objects with .id and .email attributes
lookup = {u.id: u for u in users}
unique_emails = {u.email for u in users}

# Generator (lazy, low memory)
def stream_squares(n):
    for x in range(n):
        yield x * x

from itertools import chain, groupby, accumulate, pairwise  # pairwise needs Python 3.10+
Generators are the killer feature for ETL — process TB-sized streams without loading into RAM.
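A minimal sketch of a lazy ETL-style pipeline. The `key,value` line format, the function names, and the `raw_lines` list are assumptions made for this example; in practice the input could be any iterable of lines, such as an open file.

```python
from collections.abc import Iterable, Iterator
from itertools import groupby, islice

def read_records(lines: Iterable[str]) -> Iterator[tuple[str, int]]:
    """Parse 'key,value' lines lazily, skipping blank ones."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, value = line.split(",")
        yield key, int(value)

def totals_by_key(records: Iterable[tuple[str, int]]) -> Iterator[tuple[str, int]]:
    """Sum values per key; groupby assumes the input is already sorted by key."""
    for key, group in groupby(records, key=lambda rec: rec[0]):
        yield key, sum(value for _, value in group)

# Stand-in for a large file.
raw_lines = ["a,1", "a,2", "", "b,10", "b,5", "c,7"]

pipeline = totals_by_key(read_records(raw_lines))
first_two = list(islice(pipeline, 2))   # pulls only the input it needs
```

Each stage holds one record (plus one running total) at a time, which is why the same shape scales to inputs far larger than memory.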
8. Context managers
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str):
    t = time.perf_counter()
    try:
        yield
    finally:
        # Runs whether the block succeeds or raises.
        print(f"{label}: {time.perf_counter() - t:.3f}s")

with timed("query"):
    rows = db.fetch(...)
Use them for: timing, transactions, locks, temp-file cleanup, mocking. Anything with "set up, do work, always tear down" shape.
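The transaction case follows the same shape. This is a sketch only: `FakeConnection` is a hypothetical stand-in that records the statements it receives, not a real database driver API.

```python
from contextlib import contextmanager

class FakeConnection:
    """Hypothetical stand-in for a DB connection; it only records calls."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def execute(self, sql: str) -> None:
        self.log.append(sql)

@contextmanager
def transaction(conn: FakeConnection):
    conn.execute("BEGIN")              # set up
    try:
        yield conn                     # do work
    except Exception:
        conn.execute("ROLLBACK")       # tear down on failure, then re-raise
        raise
    else:
        conn.execute("COMMIT")         # tear down on success

ok = FakeConnection()
with transaction(ok) as tx:
    tx.execute("INSERT 1")

failed = FakeConnection()
try:
    with transaction(failed) as tx:
        tx.execute("INSERT 2")
        raise ValueError("boom")
except ValueError:
    pass
```

Re-raising after the rollback matters: swallowing the exception inside the context manager would hide the failure from the caller.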
9. Dunder methods
Implement the protocol the language expects:
| Want | Dunder |
|---|---|
| len(x) | __len__ |
| for ... in x | __iter__ |
| x[i] | __getitem__ |
| x == y | __eq__ |
| hash(x) | __hash__ (must match __eq__) |
| x + y | __add__ |
| print(x) | __str__ (user); __repr__ (dev) |
| with x: | __enter__ + __exit__ |
| x() | __call__ |
Always implement __repr__ on hand-written data-carrying classes (@dataclass generates one for you); debugging without it is misery.
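Several rows of the table can be sketched in one small class. `Playlist` is a hypothetical container invented for this example; it stores its items in a tuple so that hashing stays safe.

```python
from collections.abc import Iterator

class Playlist:
    """Hypothetical container used only to illustrate the table above."""

    def __init__(self, *tracks: str) -> None:
        self._tracks = tuple(tracks)     # immutable, so hashing is safe

    def __len__(self) -> int:
        return len(self._tracks)

    def __iter__(self) -> Iterator[str]:
        return iter(self._tracks)

    def __getitem__(self, index: int) -> str:
        return self._tracks[index]

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Playlist):
            return NotImplemented
        return self._tracks == other._tracks

    def __hash__(self) -> int:
        # Hash the same data __eq__ compares, so equal objects hash equally.
        return hash(self._tracks)

    def __add__(self, other: "Playlist") -> "Playlist":
        return Playlist(*self._tracks, *other._tracks)

    def __repr__(self) -> str:
        return f"Playlist({', '.join(map(repr, self._tracks))})"

a = Playlist("intro", "verse")
b = Playlist("outro")
combined = a + b
```

Returning `NotImplemented` from `__eq__` for foreign types lets Python try the other operand's comparison instead of silently answering `False`.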
10. The performance escape hatches
Python is slow; sometimes you need fast. Order of attempt:
- numpy/pandas/polars: vectorise. 100x is often within reach.
- numba @jit: JIT-compile a hot loop. 10–100x for numeric code.
- cython: compile a module. Static types optional; escape the GIL with nogil: blocks.
- pybind11/cffi: bind C/C++ for true native speed.
- Rewrite the hot path in Rust (pyo3). An increasingly common choice.
Profile before optimising. cProfile + snakeviz for CPU; tracemalloc for memory; py-spy for prod sampling.
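A minimal sketch of the CPU-profiling step using only the standard library (`cProfile` and `pstats`). `slow_sum` and `workload` are hypothetical functions invented as a profiling target; snakeviz and py-spy are separate tools and are not shown.

```python
import cProfile
import io
import pstats

def slow_sum(n: int) -> int:
    """Deliberately naive hot loop, invented as a profiling target."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def workload() -> int:
    return sum(slow_sum(20_000) for _ in range(20))

profiler = cProfile.Profile()
result = profiler.runcall(workload)   # profile just this call

# Render the top entries, sorted by cumulative time, into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Reading the report top-down by cumulative time usually points at the one or two functions worth moving down the escape-hatch list.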
11. A short C++ contrast
| Concept | Python | Modern C++ |
|---|---|---|
| Resource cleanup | with / __exit__ | RAII (destructors) |
| Polymorphism | duck typing / Protocols | virtual functions / concept (C++20) |
| Generics | TypeVar / Protocols | templates / concept |
| Immutability | frozen=True dataclass | const, constexpr |
| Threads | GIL: multiprocessing for CPU-bound work, asyncio for I/O | true parallel threads + std::atomic |
| Memory | reference counting + cyclic GC | ownership via unique_ptr / shared_ptr |
| Build | pip install | cmake / vcpkg (sigh) |
C++20 concepts are essentially compile-time Protocols: both describe what a type must support rather than what it inherits from. The convergence is real.
12. Reality check
Idiomatic Python checklist for a new project:
- Strict mode: mypy --strict (or pyright strict).
- Dataclasses or Pydantic: pick per layer, don't mix internally.
- ruff for lint + format (replaces black/isort/flake8 in 1 tool).
- pytest with pytest-cov, target 80% coverage on critical paths.
- Pre-commit hook: ruff + mypy + pytest fast tier.
You won't regret typed Python at 6 months. Lots of teams regret not adopting it earlier.
Reading material
- Fluent Python (Luciano Ramalho, 2nd ed.)
- Effective Python (Brett Slatkin, 2nd ed.)
- Python typing docs
- Pydantic v2 docs
In-depth research material
Video reference
▶︎ Python Typing Deep Dive (mCoding)
Pick a quiet 30 minutes during this session to actually watch it. Don't multitask.
LeetCode — Design HashMap
- Link: https://leetcode.com/problems/design-hashmap/
- Difficulty: Easy
- Why this problem: Implement __getitem__ / __setitem__ / __delitem__ from scratch; it is the dict-like Protocol in disguise.
- Time-box: 20 minutes. Look up the editorial only after.
Post-session checklist
By the end of this session you should be able to:
- Write a generic function with TypeVar (and 3.12+ syntax).
- Pick between Protocol and ABC for a given API surface.
- Use dataclasses with frozen=True, slots=True and field(default_factory=...).
- Implement __iter__, __len__, __eq__, __hash__ correctly.
- Pick from numpy → numba → cython → C++ binding as performance escape hatches.
- Solve design-hashmap: basic open addressing or chaining; the data-model contract.
Generated from sessions_data.py + content_part*.py. To edit a video / leetcode / title, edit the data file and re-run write_sessions.py.