Deploying Web Applications in Azure with Docker

A self-sufficient, production-minded walkthrough — from Docker internals to a hardened deploy on Azure App Service / Container Apps.


What you'll leave with: a correct mental model of containers (not just the commands), a secure production-grade Dockerfile pattern, a working local dev loop with Docker Compose, and two deployment paths on Azure — App Service (classic) and Container Apps (modern, scale-to-zero) — with the exact CLI commands, gotchas, and cost/architecture tradeoffs.


1. A correct mental model: containers are not VMs

A very common misunderstanding is that containers are "lightweight VMs". They are not. A VM virtualises hardware (its own kernel, bootloader, emulated devices). A container virtualises the OS view for a group of processes — they all share the host kernel.

Three kernel features do the work:

| Feature | What it gives you | Example |
| --- | --- | --- |
| Namespaces | Isolated view of resources | pid (its own PID 1), net (its own NIC), mnt (its own filesystem root), uts (its own hostname), ipc, user |
| cgroups v2 | Resource limits & accounting | --memory=512m, --cpus=1.5 |
| Union FS (overlayfs) | Layered, copy-on-write images | Each RUN/COPY in a Dockerfile = one read-only layer; the container gets a thin R/W layer on top |

Consequences that matter:

  • If the host kernel lacks a feature, you can't get it inside the container (e.g. newer io_uring features, newer eBPF, modern seccomp profiles).
  • You cannot run Linux containers on Windows/macOS natively — Docker Desktop runs a tiny Linux VM for you; that's why file-sync can be slow.
  • Image size is dominated by layers, so ordering COPY and RUN correctly is a real perf/cost lever.

The runtime stack on Linux: docker CLI → dockerd → containerd → runc. runc is the OCI reference runtime that actually calls clone(), sets up the namespaces and cgroups, pivot_root()s into the new root filesystem, and execve()s your entrypoint. Knowing this lets you debug "why is my container dying with exit code 137?" (answer: it was OOM-killed by the memory cgroup; 137 = 128 + SIGKILL).
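
To make this tangible, here is a minimal sketch, assuming a Linux host with cgroup v2 (the default on recent kernels and Docker versions), that you can run inside any Python container to see exactly what the kernel set up for it:

# limits.py: a minimal sketch for poking at the cgroup limits and namespaces of this container.
# Run it inside a container, e.g.:
#   docker run --rm --memory=512m --cpus=1.5 -v "$PWD":/x python:3.12-slim python /x/limits.py
import os
from pathlib import Path

def read(path: str) -> str:
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "<not available>"

# cgroup v2 limits, as seen from inside the container
print("memory.max :", read("/sys/fs/cgroup/memory.max"))   # 536870912 with --memory=512m
print("cpu.max    :", read("/sys/fs/cgroup/cpu.max"))      # e.g. "150000 100000" with --cpus=1.5
print("cgroup     :", read("/proc/self/cgroup"))

# namespaces this process lives in (each symlink encodes a namespace-unique inode ID)
for ns in sorted(os.listdir("/proc/self/ns")):
    print(f"ns {ns:<8}:", os.readlink(f"/proc/self/ns/{ns}"))

print("pid inside container:", os.getpid())  # PID 1 when this is the container's entrypoint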


2. A production-grade Dockerfile pattern

The Dockerfile most tutorials show you (including the earlier version of this post) has at least four production problems:

  1. Runs as root.
  2. Uses COPY . /app before pip install, so every code change busts the pip cache.
  3. Uses a single stage — your build tools (gcc, dev headers, pip wheel cache) ship to production.
  4. No HEALTHCHECK, no EXPOSE discipline, no explicit WORKDIR.

Here is a pattern that fixes all of this for a Python/Flask app (the same shape works for Node/Go/Java):

# syntax=docker/dockerfile:1.7
# ---------- Stage 1: builder ----------
FROM python:3.12-slim AS builder

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1

WORKDIR /build

# Install only build-time OS deps, then drop them
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy ONLY dependency manifests first so this layer is cached on code-only changes
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# ---------- Stage 2: runtime ----------
FROM python:3.12-slim AS runtime

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8000

# Non-root user
RUN useradd --create-home --shell /usr/sbin/nologin --uid 10001 app
WORKDIR /app

# Bring in only the built site-packages — no gcc, no caches
COPY --from=builder /install /usr/local

# App code last = best cache behaviour
COPY --chown=app:app . .

USER app
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD python -c "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:8000/healthz', timeout=2).status==200 else 1)"

# Use gunicorn, not `python app.py`, in production
CMD ["gunicorn", "-b", "0.0.0.0:8000", "-w", "2", "--threads", "4", "app:app"]

Pair it with a strict .dockerignore:

.git
.venv
__pycache__
*.pyc
.env
.env.*
tests/
docs/
node_modules/

Why this matters, in concrete numbers:

  • Python base image: python:3.12 ≈ 1.0 GB, python:3.12-slim ≈ 130 MB; multi-stage + slim + no pip cache typically lands a Flask app around 170–220 MB.
  • Proper layer order: on a code-only edit, rebuild time drops from ~90 s to ~5 s because the pip install layer stays cached.
  • Non-root user: a compromised process can't install packages or write outside its home directory, which blocks most of the common image-level exploit paths.
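
The HEALTHCHECK above (and the /healthz milestone in the TODO list) assumes the app actually exposes such an endpoint. Here is a minimal sketch of that route; GIT_SHA is an env var you would inject yourself as a build arg or app setting, not something Docker or Azure sets for you:

# app.py: a minimal sketch of the Flask app the Dockerfile above expects (gunicorn target "app:app").
import os
import time

from flask import Flask, jsonify

app = Flask(__name__)
STARTED_AT = time.monotonic()

@app.get("/healthz")
def healthz():
    # Cheap and dependency-free: the Dockerfile HEALTHCHECK and platform probes hit this.
    return jsonify(
        status="ok",
        git_sha=os.environ.get("GIT_SHA", "unknown"),  # assumed env var, injected by you
        uptime_seconds=round(time.monotonic() - STARTED_AT, 1),
    )

@app.get("/")
def index():
    return jsonify(message="hello from the container")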


3. Local dev loop with Docker Compose

Real apps have more than one process. A docker-compose.yml lets you bring up the whole system with one command and a consistent network.

services:
  web:
    build: .
    ports: ["8000:8000"]
    environment:
      DATABASE_URL: postgresql://app:app@db:5432/app
      REDIS_URL: redis://cache:6379/0
    depends_on:
      db: { condition: service_healthy }
      cache: { condition: service_started }
    develop:
      watch:
        - action: sync
          path: ./src
          target: /app
        - action: rebuild
          path: ./requirements.txt

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes: ["pgdata:/var/lib/postgresql/data"]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 5s
      retries: 5

  cache:
    image: redis:7-alpine

volumes:
  pgdata:

Key points most tutorials get wrong:

  • Service name = DNS name. web reaches Postgres at host db, not localhost. Compose puts every service on a user-defined bridge network where Docker's embedded DNS resolves service names (a connection sketch at the end of this section shows it from the app's side).
  • depends_on alone does not wait for readiness. Use condition: service_healthy with a healthcheck.
  • Named volumes, not bind mounts, for databases. Bind mounts on macOS/Windows are slow and can corrupt Postgres WAL on some setups.
  • develop.watch (Compose v2.22+) replaces the old docker-compose up --build loop with fast file sync — much closer to a native dev experience.

Run: docker compose up --watch.
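
To see the service-name DNS point from the app's side, here is a minimal connection sketch; it assumes psycopg2-binary and redis are listed in requirements.txt, which this post's manifests don't show:

# db.py: a minimal sketch of how the web service reaches its siblings by service name.
import os

import psycopg2
import redis

# "db" and "cache" resolve via Docker's embedded DNS on the Compose network;
# localhost would only reach the web container itself.
DATABASE_URL = os.environ["DATABASE_URL"]   # postgresql://app:app@db:5432/app
REDIS_URL = os.environ["REDIS_URL"]         # redis://cache:6379/0

def check_dependencies() -> dict:
    with psycopg2.connect(DATABASE_URL) as conn, conn.cursor() as cur:
        cur.execute("SELECT 1")
        db_ok = cur.fetchone() == (1,)
    cache_ok = redis.Redis.from_url(REDIS_URL).ping()
    return {"postgres": db_ok, "redis": bool(cache_ok)}

if __name__ == "__main__":
    print(check_dependencies())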


4. Security hardening checklist

Production containers need more than "it works":

  • Non-root user (USER directive in Dockerfile; runAsNonRoot: true in Kubernetes).
  • Minimal base image: -slim, distroless, or Chainguard images. Avoid -alpine for Python (musl libc compatibility issues with many wheels).
  • Pin versions: FROM python:3.12.5-slim@sha256:<digest>, not python:latest.
  • No secrets in layers. Anything COPYed is recoverable via docker history. Inject secrets via env vars / Azure Key Vault at runtime (a sketch follows this list).
  • Scan on every build: docker scout cves <image>, trivy image <image>, or grype <image>.
  • Read-only root filesystem at runtime: --read-only, plus a tmpfs mount for /tmp if needed.
  • Drop capabilities: --cap-drop=ALL --cap-add=NET_BIND_SERVICE (only if you need to bind ports below 1024).
  • Resource limits: always set --memory and --cpus (or their Compose/K8s equivalents). Without them, one runaway container can OOM the whole host.
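
For the "no secrets in layers" item, here is a minimal sketch of runtime secret lookup with a managed identity; it assumes the azure-identity and azure-keyvault-secrets packages and a vault named kv-blog, none of which appear elsewhere in this post:

# secrets.py: a minimal sketch of reading secrets at runtime instead of baking them into layers.
# Assumes azure-identity and azure-keyvault-secrets are installed and the container runs with a
# managed identity that can read secrets from the (placeholder) kv-blog vault.
import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def get_database_url() -> str:
    # Prefer a platform-injected env var (App Service app setting / ACA secretref) ...
    if url := os.environ.get("DATABASE_URL"):
        return url
    # ... and fall back to reading Key Vault directly with the managed identity.
    client = SecretClient(
        vault_url="https://kv-blog.vault.azure.net",
        credential=DefaultAzureCredential(),
    )
    return client.get_secret("db-url").value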

A fast CI gate, in one line:

trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:${GIT_SHA}

5. Pushing the image: Azure Container Registry (ACR)

Prefer ACR over Docker Hub for Azure deploys — it's inside your VNET, supports managed-identity pulls, and avoids Docker Hub's public rate limits.

# One-time setup
az group create -n rg-blog -l eastus
az acr create  -n myblogacr -g rg-blog --sku Basic

# Every build
az acr login -n myblogacr
docker build -t myblogacr.azurecr.io/blog:$(git rev-parse --short HEAD) .
docker push     myblogacr.azurecr.io/blog:$(git rev-parse --short HEAD)

# Or build in the cloud (no local Docker needed)
az acr build -r myblogacr -t blog:$(git rev-parse --short HEAD) .

az acr build is a sleeper feature — it runs the build on ACR's build agents, which is often faster than your laptop and doesn't require Docker locally (great for CI runners).


6. Deployment path A — Azure App Service (classic PaaS)

Best for: a single long-running web app, simple scaling, 1+ instances always on, mature blue/green via deployment slots, custom domain + managed SSL in one click.

# Linux App Service plan (P1V3 recommended for prod; B1 for dev)
az appservice plan create -g rg-blog -n plan-blog --is-linux --sku B1

# Web app that pulls from ACR using a system-assigned managed identity
az webapp create \
  -g rg-blog -p plan-blog -n myblog-app \
  --deployment-container-image-name myblogacr.azurecr.io/blog:latest

# Give the webapp a managed identity and AcrPull on the registry
az webapp identity assign -g rg-blog -n myblog-app
APP_PRINCIPAL_ID=$(az webapp identity show -g rg-blog -n myblog-app --query principalId -o tsv)
ACR_ID=$(az acr show -n myblogacr --query id -o tsv)
az role assignment create --assignee $APP_PRINCIPAL_ID --role AcrPull --scope $ACR_ID

# Tell App Service to use MI for pulls (no registry username/password stored)
az webapp config set -g rg-blog -n myblog-app --generic-configurations '{"acrUseManagedIdentityCreds": true}'

# App settings — surface as env vars inside the container
az webapp config appsettings set -g rg-blog -n myblog-app --settings \
  WEBSITES_PORT=8000 \
  WEBSITES_ENABLE_APP_SERVICE_STORAGE=false \
  DATABASE_URL="@Microsoft.KeyVault(...)"

Gotchas that catch people:

  • WEBSITES_PORT — App Service will only route traffic to the port you declare here. If your container listens on 8080, set WEBSITES_PORT=8080.
  • Always-On — enable it on any non-Free tier, otherwise the first request after idle takes ~20s.
  • Deployment slots give you true zero-downtime deploys: deploy to a staging slot, test it, then az webapp deployment slot swap. DNS doesn't change, and the slot is warmed up before the swap (a smoke-test sketch follows this list).
  • Log streaming: az webapp log tail -g rg-blog -n myblog-app.
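
To gate the slot swap, here is a minimal smoke-test sketch you could run from CI against the staging slot; the -staging hostname follows App Service's default slot URL pattern, and /healthz is the endpoint sketched in §2:

# smoke_test.py: a minimal sketch to gate "az webapp deployment slot swap" in CI.
# The hostname below is the default URL shape for a slot named "staging" on myblog-app;
# adjust it to whatever your app and slot are actually called.
import json
import sys
import urllib.request

STAGING_URL = "https://myblog-app-staging.azurewebsites.net/healthz"

def main() -> int:
    with urllib.request.urlopen(STAGING_URL, timeout=10) as resp:
        body = json.load(resp)
    print("staging healthz:", body)
    # Swap only if the slot is healthy and running the SHA we just pushed.
    expected_sha = sys.argv[1] if len(sys.argv) > 1 else None
    if expected_sha and body.get("git_sha") != expected_sha:
        print(f"expected {expected_sha}, slot is running {body.get('git_sha')}")
        return 1
    return 0

if __name__ == "__main__":
    raise SystemExit(main())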

7. Deployment path B — Azure Container Apps (modern, serverless)

Best for: microservices, event-driven workloads, anything that benefits from scale-to-zero, Dapr, or KEDA-based scaling (e.g., scale on queue depth).

az provider register -n Microsoft.App
az provider register -n Microsoft.OperationalInsights

az containerapp env create \
  -g rg-blog -n cae-blog -l eastus

az containerapp create \
  -g rg-blog -n ca-blog \
  --environment cae-blog \
  --image myblogacr.azurecr.io/blog:latest \
  --registry-server myblogacr.azurecr.io \
  --registry-identity system \
  --target-port 8000 --ingress external \
  --min-replicas 0 --max-replicas 10 \
  --cpu 0.5 --memory 1Gi \
  --secrets "db-url=keyvaultref:https://kv-blog.vault.azure.net/secrets/db-url,identityref:system" \
  --env-vars "DATABASE_URL=secretref:db-url"

Why you'd pick this over App Service:

| Dimension | App Service | Container Apps |
| --- | --- | --- |
| Scale to zero | No (B1 and up are always on) | Yes |
| Per-second billing when idle | No | Yes |
| Scale on queue / custom metric | Limited | Yes (KEDA) |
| Multiple revisions / traffic split | Slots (2) | Revisions (N), % traffic split |
| Dapr sidecars | No | Yes |
| VNET integration | Regional or private endpoint | Internal env or workload profiles |
| Cold-start latency | n/a (always on) | ~2–5 s |

Rule of thumb: one user-facing monolith with steady traffic → App Service. Many small services or bursty/event-driven → Container Apps.
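
The cold-start row is easy to verify for your own app. A minimal timing sketch, assuming the app has scaled to zero and exposes the /healthz endpoint from §2 (replace the placeholder hostname with your app's real FQDN):

# coldstart.py: a minimal sketch to compare cold vs warm latency on a scaled-to-zero app.
# Let the app sit idle long enough to scale to zero first; the URL is a placeholder.
import time
import urllib.request

URL = "https://<your-container-app-fqdn>/healthz"  # placeholder, fill in your app's hostname

def timed_get(url: str) -> float:
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as resp:
        resp.read()
    return time.perf_counter() - start

cold = timed_get(URL)                       # first request wakes a replica: expect a few seconds
warm = [timed_get(URL) for _ in range(5)]   # subsequent requests hit the running replica
print(f"cold: {cold:.2f}s, warm avg: {sum(warm) / len(warm):.3f}s")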


8. Observability you should turn on day one

  • Log Analytics workspace attached to the App Service / ACA environment — stream stdout/stderr to KQL.
  • Application Insights SDK inside the app for distributed tracing (the OpenTelemetry Flask instrumentation is a few lines of code; a sketch follows this list).
  • Container health probes: App Service uses its own path-based Health check setting rather than the Dockerfile HEALTHCHECK; ACA has explicit liveness / readiness / startup probes you should configure.
  • Alerts on: HTTP 5xx rate > 1%, p95 latency > 1s, memory > 80%, restart count > 0 in 10 min.
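
For the Application Insights bullet, here is a minimal wiring sketch; it assumes the azure-monitor-opentelemetry distro (which enables the Flask instrumentation when opentelemetry-instrumentation-flask is present) and the standard APPLICATIONINSIGHTS_CONNECTION_STRING setting, none of which appear elsewhere in this post:

# telemetry.py: a minimal sketch of wiring the Flask app into Application Insights.
# Assumes azure-monitor-opentelemetry + opentelemetry-instrumentation-flask are installed
# and APPLICATIONINSIGHTS_CONNECTION_STRING is set as an app setting / env var.
from azure.monitor.opentelemetry import configure_azure_monitor

# Configure exporters and auto-instrumentation once, before the app serves traffic;
# the connection string is read from the environment.
configure_azure_monitor()

from app import app  # noqa: E402  (import the Flask app only after telemetry is configured)

# Point gunicorn here instead of app:app, e.g.:
#   CMD ["gunicorn", "-b", "0.0.0.0:8000", "telemetry:app"]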

A minimal KQL query for "who's erroring right now":

AppServiceConsoleLogs
| where TimeGenerated > ago(15m)
| where ResultDescription has_any ("ERROR","Exception","Traceback")
| summarize count() by bin(TimeGenerated, 1m), _ResourceId
| render timechart

9. Common mistakes to avoid (corrections to the original post)

The original draft of this post was an AI-generated placeholder with several issues worth calling out:

  1. "Use python:3.8-slim" — 3.8 reached end-of-life in Oct 2024. Use a supported Python (3.11/3.12 at the time of writing).
  2. CMD ["python", "app.py"] in production — Flask's dev server is single-threaded and debug-mode by default. Use gunicorn (sync + threads, or gthread/uvicorn workers for async frameworks).
  3. Pushing to Docker Hub for Azure deploys — works, but you pay rate-limit tax and lose managed-identity pulls. Use ACR.
  4. "Day 1 / Day 2" fake learning plan — replace with concrete milestones tied to the repo you actually ship.
  5. No .dockerignore — without it, COPY . . leaks your .git, .env, and __pycache__ into the image.

10. TODO — self-sufficient action list

After reading the above, you should be able to tick each of these off without any extra reading. If a step feels unclear, re-read the section linked in parentheses.

Foundations

  • [ ] Install Docker Desktop (or Colima on macOS for a lighter alternative), run docker run hello-world, then explain in your own words what namespaces/cgroups did (§1).
  • [ ] Run docker inspect <container> and find the PID, MountID, and cgroup path. Open /proc/<pid>/ns/ on the host and show the container's namespaces.

Build a real image

  • [ ] Port the two-stage Dockerfile in §2 to your actual app (Python or otherwise). Commit a .dockerignore.
  • [ ] Record image sizes before and after multi-stage. Target: ≤ 250 MB for a Python web app, ≤ 80 MB for a Go binary.
  • [ ] Add a HEALTHCHECK that hits a real /healthz endpoint and returns JSON including git SHA + uptime.

Local dev loop

  • [ ] Write a docker-compose.yml for app + Postgres + Redis with health-gated startup (§3). Verify docker compose up --watch hot-reloads on file change without full rebuild.

Security pass

  • [ ] Run trivy image against your image; get HIGH/CRITICAL count to zero (update base image / pin versions as needed).
  • [ ] Confirm with docker run --rm myapp whoami that the process is NOT root.
  • [ ] Add --read-only + tmpfs /tmp to your docker run command and verify the app still boots.

Ship to ACR

  • [ ] Create an ACR, push a tagged image built remotely with az acr build (no local Docker needed).
  • [ ] Tag with the git short SHA; never ship :latest to production.

Deploy — pick ONE path to start, then do the other

  • [ ] App Service path: provision Linux plan + webapp, wire managed-identity AcrPull, set WEBSITES_PORT, deploy, hit the public URL.
  • [ ] Add a staging slot, deploy v2 to it, verify, then swap → confirm zero downtime.
  • [ ] Container Apps path: provision an environment, deploy with --min-replicas 0, confirm scale-to-zero after idle, confirm cold-start behaviour.
  • [ ] Create a second revision, split traffic 50/50 between revisions, then promote to 100%.

Observability

  • [ ] Enable App Insights (SDK in code + connection string in env).
  • [ ] Write 3 KQL queries in Log Analytics: error rate by endpoint, p95 latency, slow DB calls.
  • [ ] Wire one alert: HTTP 5xx > 1% over 5 min → email/Teams.

Stretch (only after the above)

  • [ ] Migrate the deploy to Bicep or Terraform so the whole stack is azd up-able.
  • [ ] Add GitHub Actions: build → scan → push to ACR → az containerapp update --image … on merge to main.
  • [ ] Compare cost for your actual traffic pattern: App Service B1 24×7 vs Container Apps scale-to-zero. Document the break-even point.

When every box above is checked, flip status: workinprogress to status: published in the frontmatter.
