engineering · advanced · 12 min · 2026-06-13

Day 15 — Memory Model & Garbage Collection — Heap, GC, Leaks, Profiling

High-throughput services live and die by GC. Knowing the heap layout, GC algorithms and how to read a flame graph is the difference between '99p = 80 ms' and '9…

Every managed runtime gives you the illusion of infinite memory. The bill comes due as pause times, OOMs, and gradually rising RSS. Understanding the model lets you predict and prevent all three.

🧠 Concept

Why it matters & the mental model.

1. Two object lifecycle models

  • Reference counting (CPython, Swift, Obj-C ARC): each object carries a count that is incremented when a reference is created and decremented when one is dropped; the object is freed when the count reaches 0. Pros: deterministic, low pause. Cons: doesn't handle cycles, so CPython also runs a cycle collector periodically, and ARC requires the programmer to break cycles with weak refs.
  • Tracing GC (JVM, Go, .NET, JavaScript): periodically traverse from GC roots, mark reachable, sweep / compact the rest. Pros: handles cycles, can compact. Cons: stop-the-world pauses (mitigated by generational / concurrent collectors).
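
The contrast between the two models can be seen directly in CPython. The sketch below is illustrative only: the Node class and helper names are made up for this example, and the behaviour shown (immediate freeing at refcount 0) is specific to CPython rather than guaranteed by the Python language.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at a peer, used to build a cycle."""

    def __init__(self, name):
        self.name = name
        self.peer = None


def make_single():
    n = Node("solo")
    # The weak reference does not keep the node alive.
    return weakref.ref(n)


def make_cycle():
    a, b = Node("a"), Node("b")
    a.peer, b.peer = b, a  # a -> b -> a: refcounts can never reach 0 on their own
    return weakref.ref(a)


gc.disable()  # keep the automatic cycle collector out of the way for the demo
try:
    solo_ref = make_single()
    cycle_ref = make_cycle()

    # Reference counting freed the lone node as soon as its last reference died.
    solo_freed_by_refcount = solo_ref() is None
    # The cycle is unreachable but still allocated.
    cycle_alive_before_gc = cycle_ref() is not None

    gc.collect()  # the tracing cycle collector finds and frees the cycle
    cycle_freed_by_collector = cycle_ref() is None
finally:
    gc.enable()

print(solo_freed_by_refcount, cycle_alive_before_gc, cycle_freed_by_collector)
```

In Swift or Obj-C there is no equivalent of gc.collect(), which is why ARC code has to make one side of such a cycle weak.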

2. Generational hypothesis

Most objects die young. Split the heap into young (eden + survivor) and old generations; collect young frequently (cheap), old rarely (expensive). Most modern tracing GCs use this (JVM, .NET, V8); Go's collector is a notable exception.
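
Even CPython's cycle collector leans on the same idea, and the standard gc module exposes its bookkeeping. This is a small inspection sketch, not a tuning recipe; the exact thresholds and how many generations are really in use vary between CPython versions, and this collector only supplements reference counting rather than managing an eden/survivor heap.

```python
import gc

# Allocation thresholds that trigger a collection of each generation.
thresholds = gc.get_threshold()
# Current allocation counters for each generation.
counts = gc.get_count()
# Per-generation statistics since interpreter start.
stats = gc.get_stats()

for generation, info in enumerate(stats):
    print(
        f"gen {generation}: "
        f"collections={info['collections']} collected={info['collected']}"
    )
print("thresholds:", thresholds)
print("counts:", counts)
```

On a long-running process the youngest generation typically shows far more collections than the oldest, which is the generational hypothesis at work.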

3. JVM collector zoo (high level)

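  • Serial / Parallel: stop-the-world collectors. Serial is single-threaded and suits small heaps; Parallel is throughput-optimised, and its full-GC pauses grow with heap size and can reach seconds on large heaps.
  • CMS (deprecated in JDK 9, removed in JDK 14): concurrent old-gen collection without compaction, so the old generation fragments.
  • G1: region-based, works toward a target pause time, default since JDK 9. Good general choice.
  • ZGC / Shenandoah: concurrent compaction with very short pauses (ZGC targets sub-millisecond) that stay roughly flat as the heap grows, up to TB scale. Use for latency-critical services.

Common tuning flags: -Xms = -Xmx (avoid heap resizing), -XX:MaxGCPauseMillis (a pause-time goal, mainly relevant to G1), -XX:+UseStringDeduplication (originally G1-only).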

🛠 Deep Dive

Internals, code, architecture.

4. Go's GC

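Concurrent, tri-colour mark-sweep with write barriers; non-generational and non-moving (so far). The design goal is sub-millisecond stop-the-world phases. Goroutine stacks grow and shrink on demand. GOGC=100 (the default) triggers the next collection once the heap has grown by 100% over the live heap left by the previous cycle, i.e. at roughly double.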
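
The GOGC rule is simple arithmetic, sketched here in Python for consistency with the other examples. The function name is made up for illustration, and the formula is deliberately simplified: since Go 1.18 the pacer also counts GC roots (stacks and globals), and a GOMEMLIMIT setting can trigger a collection earlier.

```python
MIB = 1024 * 1024


def next_gc_target(live_heap_bytes: int, gogc: int = 100) -> int:
    """Approximate heap size at which Go starts the next GC cycle.

    Simplified model: target = live heap * (1 + GOGC / 100).
    """
    return live_heap_bytes + live_heap_bytes * gogc // 100


live = 200 * MIB  # assumed live heap after the previous collection
for gogc in (50, 100, 200):
    target = next_gc_target(live, gogc)
    print(f"GOGC={gogc}: next GC at ~{target // MIB} MiB")
```

Lower GOGC values trade CPU for memory (more frequent collections, smaller peak heap); higher values do the opposite.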

5. .NET GC

Generational, with server and workstation modes. Server GC uses one heap and a dedicated GC thread per logical core and collects them in parallel, which gives much higher throughput on multicore machines. Background GC collects gen 2 concurrently with application threads.

6. Python (CPython)

Reference counting plus a cycle collector. No heap compaction (objects don't move). Small objects (up to 512 bytes) come from pymalloc's arenas and pools; larger ones go to the system allocator. Freed memory is often returned to those pools rather than to the OS, so RSS may not shrink even after del. A common mitigation is process recycling (e.g. gunicorn max_requests).

7. Common leaks

  • Global / module-level caches with no eviction.
  • Long-lived listeners holding references to short-lived objects (event bus, signals).
  • Closures capturing big locals (especially in async).
  • Connection pools without size limit.
  • Logging frameworks buffering records.
  • In JVM: ThreadLocals in pooled threads; classloader leaks in app servers.
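
Two of the leaks above have small, standard fixes: give caches an eviction policy, and let listener registries hold weak references. The sketch below shows both in Python; BoundedCache, EventBus and RequestHandler are hypothetical names for this example, not part of any library.

```python
import gc
import weakref
from collections import OrderedDict


class BoundedCache:
    """LRU cache that evicts the least recently used entry past max_entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the oldest entry

    def __len__(self):
        return len(self._data)


class EventBus:
    """Listener registry that does not keep its listeners alive."""

    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def publish(self, event):
        for listener in list(self._listeners):
            listener.on_event(event)

    def listener_count(self):
        return len(self._listeners)


class RequestHandler:
    def __init__(self):
        self.seen = []

    def on_event(self, event):
        self.seen.append(event)


cache = BoundedCache(max_entries=2)
for key in ("a", "b", "c"):
    cache.put(key, key.upper())  # inserting "c" evicts "a"

bus = EventBus()
handler = RequestHandler()
bus.subscribe(handler)
bus.publish("started")
count_while_alive = bus.listener_count()

del handler   # the short-lived object goes away...
gc.collect()  # (not needed on CPython, harmless elsewhere)
count_after_drop = bus.listener_count()  # ...and the bus forgets it
```

For simple function caches, functools.lru_cache(maxsize=...) gives the same bound with less code. Note that a WeakSet holds objects, not bound methods; registering methods weakly needs weakref.WeakMethod.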

🚀 In Practice

Trade-offs, exercises, what to ship today.

8. Profiling toolbox

  • py-spy / austin / scalene for Python: sampling profilers, low overhead, work on live processes.
  • async-profiler for JVM: CPU + alloc + lock profiling, flame graph output.
  • pprof for Go: built-in, go tool pprof.
  • dotnet-trace / dotnet-counters for .NET.
  • eBPF (bcc, bpftrace) for OS-level allocs, page faults, syscall flame graphs.

9. Reading a flame graph

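Width = share of samples in which a frame was on the stack (time spent in it and everything it calls); the x-axis is not a timeline. Look for wide flat plateaus at the top of a stack (the code actually burning CPU) and unexpected ancestors (a string formatting call eating 30% under your "fast" function). Diff two flame graphs to see what changed across deploys.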
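
Where the widths come from is easy to sketch: a sampling profiler records stacks, identical stacks are collapsed into "folded" lines, and a frame's width is the share of samples that contain it. The sample data and function names below are invented for illustration.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse root-first stacks into 'a;b;c' -> sample count."""
    return Counter(";".join(stack) for stack in samples)


def inclusive_share(folded, frame):
    """Fraction of all samples in which `frame` appears anywhere on the stack."""
    total = sum(folded.values())
    hits = sum(n for stack, n in folded.items() if frame in stack.split(";"))
    return hits / total if total else 0.0


# Ten hypothetical samples from a request handler.
samples = (
    [["main", "handle", "fast_path", "format_string"]] * 3
    + [["main", "handle", "fast_path"]] * 1
    + [["main", "handle", "db_query"]] * 5
    + [["main", "idle"]] * 1
)

folded = fold_stacks(samples)
for stack, count in sorted(folded.items()):
    print(stack, count)

for frame in ("fast_path", "format_string", "db_query"):
    print(f"{frame}: {inclusive_share(folded, frame):.0%} of samples")
```

Here format_string accounts for 3 of the 4 samples under fast_path, which is exactly the "unexpected ancestor" pattern: the function looks cheap until you see what sits on top of it.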

10. Practical tips

  • Pre-size collections when the size is known up front (new ArrayList<>(n) or new StringBuilder(n) in Java, make([]T, 0, n) in Go, [None] * n in Python).
  • Pool or reuse expensive objects (buffers, compiled regexes, JSON encoders).
  • Avoid object churn in hot loops; reuse instances or use array/numpy for primitives.
  • Stream, don't slurp: line-by-line iteration beats f.read().
  • For services: profile the first deploy under load, set heap to about 1.5× steady-state usage as a starting point, and monitor gc_pause_p99.
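
Two of these tips combine naturally: streaming a file through a single reusable buffer keeps peak memory at the buffer size instead of the file size. A minimal sketch, with hypothetical function names and an arbitrary 64 KiB chunk size:

```python
import hashlib
import os
import tempfile


def sha256_streaming(path, chunk_size=64 * 1024):
    """Hash a file using one preallocated buffer, whatever the file size."""
    digest = hashlib.sha256()
    buf = bytearray(chunk_size)  # allocated once, reused for every chunk
    view = memoryview(buf)       # slicing a memoryview does not copy
    with open(path, "rb") as f:
        while True:
            n = f.readinto(buf)  # fills buf in place, returns bytes read
            if not n:
                break
            digest.update(view[:n])
    return digest.hexdigest()


def sha256_slurp(path):
    """Same result, but holds the entire file in memory at once."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"line\n" * 100_000)
    path = tmp.name

try:
    streamed = sha256_streaming(path)
    slurped = sha256_slurp(path)
finally:
    os.remove(path)

print(streamed == slurped)
```

Both functions give the same digest; the difference only shows up in peak memory and allocator churn once files get large.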

11. What to take away

A question to be ready for: "How would you debug a memory leak in production?" Strong answers name the profiler, the heap dump tool, the difference between RSS / committed / used memory, and whether memory reaches a steady state after warmup. Bonus: distinguish a leak from caching working as designed.

    Practice Problem: Trapping Rain Water (Hard)