system design · intermediate · 30m read

MS Stack Ch 18 — CI/CD with Azure DevOps

YAML pipelines, templates + parameters + extends, agent pools, federated identity for Azure deploys, SDL gates (CodeQL, credential scanner, container scanning), multi-stage deploys, environments + approvals.

Chapter 18 of From Novice to Fluent on the Modern Microsoft Web Stack — a 22-chapter self-study plan.

Why this chapter

CI/CD is the conveyor belt between a commit and a customer. Inside Microsoft and most large enterprises, that conveyor belt is Azure DevOps Pipelines — defined in YAML, versioned in the repo, executed by hosted or self-hosted agents, deploying through environments that carry approvals and gates. The public-facing twin is GitHub Actions: different syntax, near-identical mental model. If you understand one well, the other is a weekend's reading.

The shipping-grade version of this skill is "I can write an azure-pipelines.yml that builds, tests, packages, and deploys an ASP.NET Core API to App Service with a manual approval before prod". The expert-tier version is "I can factor a pipeline into reusable templates, enforce SDL gates through an extends pipeline that consuming teams cannot bypass, deploy with federated workload identity (no secrets stored anywhere), and orchestrate a region-by-region rollout with bake times and automatic rollback".

Most of what differentiates the two tiers is not new YAML keywords. It is knowing what to factor where, what to never put in a pipeline, and what compliance gates the org expects to find baked in. That is the chapter's payload.

You finish this chapter when you can sit down with a blank repo, write a YAML pipeline that builds and deploys to three environments with the right approvals, and explain — without notes — how extends templates, federated identity, and SDL gates fit together.

Pipeline anatomy

Triggers, stages, jobs, steps, tasks, scripts — the YAML object model.

Reusable YAML

Templates, parameters, extends — DRY pipelines that scale to dozens of repos.

Agents + auth

Microsoft-hosted vs self-hosted; NuGetAuthenticate; service connections.

Artifacts

Pipeline vs build artifacts; publish and consume across jobs and stages.

Environments

Approvals, gates, deployment history, the audit trail prod needs.

SDL + secrets

CodeQL, CredScan, BinSkim, antimalware, PoliCheck; Key Vault-linked variable groups.

Concepts and depth

Pipeline anatomy: triggers, stages, jobs, steps

Every Azure DevOps pipeline is a tree. At the root sits a trigger that decides when the pipeline runs. Below that, stages group work into deployable units (CI, Deploy_Dev, Deploy_Prod). Each stage contains one or more jobs, and each job runs on a single agent as an ordered list of steps. Steps are either a task (a versioned, parameterised unit of work that Microsoft or third parties publish, such as UseDotNet@2, AzureWebApp@1, or PublishTestResults@2) or a script (a shell line you wrote yourself).

Triggers come in four flavours you actually use. CI triggers fire on a push to a branch; configure them with trigger: at the top of the YAML. PR triggers fire when a pull request is opened or updated; configure them with pr: for GitHub and Bitbucket Cloud repos, and with a build-validation branch policy for Azure Repos Git, which ignores the YAML pr: block. Scheduled triggers run on cron strings (nightly main builds, weekly dependency-scan refreshes); configure them with schedules:. Pipeline resource triggers fire when a different pipeline completes successfully, which is the lever that lets a downstream "release" pipeline react to an upstream "build" pipeline without polling. A fifth kind, manual, has no special syntax; any pipeline can be queued by a human from the UI.
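
The trigger kinds described above can be sketched in a single file. This is an illustrative sketch rather than a recommended configuration: the path filter src/api/*, the cron schedule, and the upstream pipeline name API.CI are placeholders.

```yaml
# CI trigger: run on pushes to main that touch the (hypothetical) API folder.
trigger:
  branches:
    include: [ main ]
  paths:
    include: [ src/api/* ]

# PR trigger: honoured for GitHub and Bitbucket Cloud repos.
# Azure Repos Git uses a build-validation branch policy instead.
pr:
  branches:
    include: [ main ]

# Scheduled trigger: cron strings are evaluated in UTC.
schedules:
- cron: '0 2 * * *'
  displayName: Nightly main build
  branches:
    include: [ main ]
  always: false   # skip the run if nothing changed since the last scheduled run

# Pipeline resource trigger: run when the upstream pipeline completes on main.
resources:
  pipelines:
  - pipeline: api-build   # local alias used by download: steps
    source: API.CI        # upstream pipeline name (assumed)
    trigger:
      branches:
        include: [ main ]
```

In practice a single pipeline rarely needs all four; a CI pipeline usually carries trigger: and pr:, while a release pipeline carries only the pipeline resource.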

Variables come in three layers. Inline variables: blocks live in the YAML and are good for non-sensitive build constants (buildConfiguration: 'Release'). Variable groups live in the Library and are referenced by name — they centralise values shared across pipelines and can be linked to Azure Key Vault so the real values never live in version control. Pipeline-run variables are typed by the operator when queuing a manual run; mark them settable at queue time to allow overrides.
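
As a rough illustration of the first two layers, the sketch below mixes an inline variable with two variable groups. The group names shared-build-settings and myapp-kv-secrets are hypothetical; the groups themselves would have to exist in the Library.

```yaml
# The list form (name/value, group) is required once a group is referenced.
variables:
- name: buildConfiguration       # inline, non-sensitive build constant
  value: Release
- group: shared-build-settings   # hypothetical Library variable group
- group: myapp-kv-secrets        # hypothetical group linked to Azure Key Vault

steps:
- script: dotnet build -c $(buildConfiguration)
  displayName: Build
```

The third layer has no YAML of its own: a variable becomes overridable at queue time when it is defined in the pipeline settings UI and marked settable there.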

A gotcha most newcomers hit is treating task vs script as a style preference. It is not. Tasks are versioned, take declared inputs, handle service-connection credentials for you, and produce structured logs the Pipelines UI can render. Scripts are raw shell. If a task exists for what you want to do (auth, publish, deploy), use the task. Reach for script: for genuinely custom logic, and never for credential handling.

Good enough to ship

• Single-file YAML with CI + PR triggers, one CI stage, one deploy stage
• Variables defined inline for non-secret build config
• Tasks (not scripts) for restore, build, test, publish, deploy
• condition: always() on test-publish so failures still surface results

Expert tier

• Path-filtered triggers per service in a monorepo
• pipeline: resource triggers chaining build → release pipelines
• Variable groups linked to Key Vault, with audit-logged read access
• Scripts are exception cases; everything else flows through versioned tasks

[Figure: Trigger (CI / PR / cron / pipeline) → Stage (CI · Deploy_Dev · Deploy_Prod) → Job (runs on one agent) → Step (task or script) → Artifact (published, consumed).]

The YAML object model: trigger → stage → job → step → artifact.

Reusable YAML: templates, parameters, extends

A pipeline that lives only in one repo can grow without discipline. A pipeline that lives in twenty repos cannot — the moment one team forgets to add a CodeQL step, the security org has a CVE story to write. The answer is templates.

Templates come in two shapes. Include templates (- template: build.yml) inline another file's jobs or steps into the consuming pipeline. They are the lightweight "extract a function" move. Extends templates (extends: template: secure-pipeline.yml) are the opposite: the consuming pipeline submits itself to a parent template that decides the overall structure. The parent owns the stages; the child only supplies parameters. This is how Microsoft's 1ES (One Engineering System) enforces SDL gates on every team — the extends template the org publishes runs the gates before and after the consuming team's build, and there is no YAML the consuming team can write to skip them.
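
From the consuming repo's side, an extends pipeline might look like the sketch below. It assumes a shared Azure Repos project and repo named Platform/pipeline-templates, a tag v1, and a template that accepts a buildSteps parameter; all three are hypothetical.

```yaml
# azure-pipelines.yml in the consuming repo
resources:
  repositories:
  - repository: platform                # local alias for the template repo
    type: git                           # Azure Repos Git
    name: Platform/pipeline-templates   # hypothetical project/repo
    ref: refs/tags/v1                   # pin to a tag so template changes are opt-in

extends:
  template: secure-pipeline.yml@platform
  parameters:
    buildSteps:                         # hypothetical parameter defined by the template
    - script: dotnet build -c Release
      displayName: Build
```

The consumer supplies only parameters here; stages, pools, and any gate steps come from the parent template.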

parameters: are the typed input contract. Unlike variables (string-only, evaluated at run time), parameters are typed (string, boolean, number, object, step, stepList, job, jobList, stage, stageList), validated at compile time, and expanded with the ${{ parameters.foo }} template expression syntax. Template expressions are evaluated before the pipeline starts; this is what makes parameters useful for conditional ${{ if eq(parameters.deployProd, true) }} blocks that include or omit whole stages.
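
A minimal sketch of a typed parameter gating a whole stage follows. The stage contents are placeholder echo steps, not a real build or deploy.

```yaml
parameters:
- name: deployProd
  displayName: Deploy to prod?
  type: boolean
  default: false

stages:
- stage: CI
  jobs:
  - job: Build
    steps:
    - script: echo "building"

# When deployProd is false, this stage is absent from the compiled pipeline.
- ${{ if eq(parameters.deployProd, true) }}:
  - stage: Deploy_Prod
    dependsOn: CI
    jobs:
    - job: Deploy
      steps:
      - script: echo "deploying to prod"
```

Because the decision happens at compile time, a run queued with deployProd unchecked never shows a skipped Deploy_Prod stage; the stage simply does not exist in that run.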

The non-obvious gotcha is expression syntax. Azure DevOps has three of them. $(var) is a runtime macro that the agent expands when it sees the line. ${{ expr }} is a compile-time template expression evaluated when the YAML is processed. $[ expr ] is a runtime expression evaluated when the job starts. Mix them up and your condition: lines silently evaluate to "always true" or "never". Read the docs once carefully and bookmark the page.
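
The three syntaxes can be seen side by side in a small sketch. The variable names buildConfiguration and isMain are illustrative only.

```yaml
parameters:
- name: configuration
  type: string
  default: Release

variables:
  # Template expression: replaced when the YAML is compiled, before the run starts.
  buildConfiguration: ${{ parameters.configuration }}
  # Runtime expression: evaluated at run time; it must be the entire value.
  isMain: $[ eq(variables['Build.SourceBranch'], 'refs/heads/main') ]

steps:
# Macro: expanded just before the step executes.
- script: echo "config=$(buildConfiguration) isMain=$(isMain)"
  displayName: Show values

# condition: takes a runtime expression written without the $[ ] wrapper.
- script: echo "only on main"
  displayName: Main-only step
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
```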

Good enough to ship

• templates/build.yml extracted, reused from one repo
• Typed parameters: for solution path and configuration
• Use ${{ if }} blocks to opt stages in or out

Expert tier

• Org-published extends template; consuming pipelines cannot bypass SDL
• Template repo versioned via resources.repositories with pinned ref
• Parameter contracts documented; breaking changes shipped behind feature flags

Agent pools: Microsoft-hosted vs self-hosted

A pool is a set of agents (machines) that pick up jobs. Microsoft-hosted pools (vmImage: ubuntu-latest, windows-latest, macOS-latest) are clean per-run VMs with a generous tool catalogue pre-installed (.NET SDKs, Node, Python, common CLIs). They cost parallel-job slots and have public egress. Self-hosted pools are agents you install on your own VMs or in a VM Scale Set; they retain state between runs (warm caches, pre-pulled images), can reach your private network, and are the only option for jobs that need on-prem connectivity or GPU.

The choice is rarely binary. Most repos start on Microsoft-hosted, switch a hot job (the one that downloads 4 GB of NuGet packages every run) to a self-hosted agent with a persistent package cache, and leave the rest. The trap is "let's move everything to self-hosted to save money" — you then own patching, scaling, isolation, and the inevitable "agent is offline" Slack message at 2am.

Demands are how a job declares it needs a particular capability (Agent.OS -equals Linux to match a value, or a bare capability name such as azurecli to require that it exists). On a self-hosted pool, demands let you target a subset of agents (the ones with the GPU, the ones with the proprietary driver).
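
As a sketch, a job that must land on a Linux GPU agent could declare its demands like this. The pool name gpu-pool and the user-defined capability cuda are assumptions; the capability would have to be registered on the agents.

```yaml
jobs:
- job: TrainModel
  pool:
    name: gpu-pool             # hypothetical self-hosted pool
    demands:
    - Agent.OS -equals Linux   # capability must equal this value
    - cuda                     # hypothetical capability; a bare name means "must exist"
  steps:
  - script: nvidia-smi
    displayName: Verify the GPU is visible
```

If no agent in the pool satisfies every demand, the job waits in the queue rather than running on an unsuitable agent.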

Microsoft-hosted (default)

• Clean VM per run
• No infra to manage
• Public egress only
• Use for: most repos

Self-hosted VM (private network)

• Reaches private endpoints
• Custom tooling installed
• You own patching + scaling
• Use for: enterprise integrations

Self-hosted VMSS (elastic at scale)

• Auto-scales agent count
• Image refreshed on schedule
• Cheaper for high-volume orgs
• Use for: platform teams

Authenticated package restore (NuGetAuthenticate)

Most enterprises host internal NuGet feeds on Azure Artifacts. Your pipeline must authenticate to those feeds before dotnet restore can fetch packages. The clean way is the NuGetAuthenticate@1 task — it injects credentials into the agent's NuGet config using the pipeline's service identity, so subsequent dotnet or nuget commands work without you handling tokens.

The legacy way (PAT in a NuGet config) still works and you will find it in older repos, but the PAT eventually expires, the renewal email goes to whoever set it up two years ago, and the pipeline starts failing at midnight on a renewal weekend. NuGetAuthenticate does not expire; it derives credentials from the pipeline's identity on each run.

steps:
- task: NuGetAuthenticate@1
- script: dotnet restore --configfile NuGet.config
  displayName: Restore (authenticated)

npm has the same pattern (npmAuthenticate@0), and Python has the TwineAuthenticate and PipAuthenticate tasks for Azure Artifacts feeds. The principle is identical: never store long-lived package-feed credentials in source.
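
For comparison, the npm version of the same pattern might look like the following. It assumes a checked-in .npmrc at the repo root that names the feed registry and contains no tokens.

```yaml
steps:
- task: npmAuthenticate@0
  inputs:
    workingFile: .npmrc    # assumed location; lists the registry, holds no credentials
- script: npm ci
  displayName: Install (authenticated)
```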

Good enough to ship

• NuGetAuthenticate@1 before any restore step
• NuGet.config checked in, packageSourceCredentials NOT checked in

Expert tier

• Internal feed mirrors NuGet.org with vetted versions only
• package source mapping pins each package to its source
• Restore time tracked as a pipeline KPI

Artifacts: pipeline vs build, publish and consume

An artifact is the binary output of one stage that a later stage will consume. There are two flavours and the naming is confusing.

Pipeline artifacts (publish: keyword and the PublishPipelineArtifact@1 task) are the modern default. They live in the pipeline's storage, compress on the fly, and are downloaded with download: current; artifact: <name> in deploy stages. They are scoped to the pipeline run.

Build artifacts (PublishBuildArtifacts@1) are the legacy form, slower, stored in a different backend, and rarely used in new pipelines unless you need cross-pipeline consumption with DownloadBuildArtifacts@1.

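Cross-pipeline consumption uses pipeline resources: a release pipeline declares an upstream pipeline as a resource, and download: <alias> pulls the artifact from the run of that pipeline the resource resolved to. That is the run that triggered the release or, when the release is queued manually or on a schedule, by default the latest successful run.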

resources:
  pipelines:
  - pipeline: api-build
    source: API.CI
    trigger:
      branches: { include: [ main ] }
 
stages:
- stage: Deploy
  jobs:
  - deployment: Deploy
    environment: prod
    strategy:
      runOnce:
        deploy:
          steps:
          - download: api-build
            artifact: api
          - task: AzureWebApp@1
            inputs: { ... }

Good enough to ship

• publish: to emit a pipeline artifact
• download: current to consume in a later stage
• One artifact per deployable unit (api, web, infra-templates)

Expert tier

• Pipeline resources chain CI → release across repos
• Artifact provenance recorded (commit SHA, build number, signer)
• Retention policy explicit; nothing kept > 90 days unless tagged

Environments, approvals, and gates

An environment in Azure DevOps is more than a string. It is a first-class object with its own resource page, deployment history, and checks that the pipeline must clear before a deployment job targeting that environment can run. Checks include manual approvals, business-hours windows, REST-call gates that ping a deployment-freeze service, query-work-item gates, evaluate-artifact policies, and required-template enforcement (you can demand that a deploy job extends a specific template).

The deployment history view is what makes incident response fast: every deployment is timestamped, linked to its commit, and attributed to the pipeline run that produced it. When prod breaks at 3am, the on-call engineer opens the environment page, sees "deploy 4271 ran at 02:51 UTC, commit a1b2c3", and has the suspect set in thirty seconds.

The gotcha is treating environments as decorative. Teams sometimes use one environment for all stages "for simplicity". Then there is no audit boundary, no approval gating, no history segmentation, and a misconfigured pipeline can push dev artifacts straight to prod. Separate environments per logical stage, full stop.

Good enough to ship

• One environment per stage (dev, staging, prod)
• Manual approval required on prod, at minimum two approvers
• Business-hours check blocks 5pm Friday deploys

Expert tier

• REST gate calls deployment-freeze service before each prod stage
• Required-template check enforces extends lineage
• Query-work-item gate verifies a change ticket is "Approved" status

Test publishing and code coverage

Tests that pass silently are wasted. The pipeline must surface results to the run page so failures are findable and trends are trackable. The pattern is: run your test runner with a structured-output flag, then publish those results with the appropriate task.

- script: dotnet test -c Release --no-build --logger trx --collect "XPlat Code Coverage"
  displayName: Test
- task: PublishTestResults@2
  condition: always()
  inputs:
    testRunner: VSTest
    testResultsFiles: '**/*.trx'
    failTaskOnFailedTests: true
- task: PublishCodeCoverageResults@2
  condition: always()
  inputs:
    summaryFileLocation: '**/coverage.cobertura.xml'

condition: always() is critical. Without it, the publish task is skipped when the test step fails, and you lose the very results you need to triage. The same pattern applies to npm: vitest run --reporter=junit --outputFile=test-results.xml then publish.
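
The npm variant mentioned above could be sketched as follows. It assumes vitest is a dev dependency of the project and that the report is written to the repo root.

```yaml
steps:
- script: npx vitest run --reporter=junit --outputFile=test-results.xml
  displayName: Test
- task: PublishTestResults@2
  condition: always()              # still publish when the test step fails
  inputs:
    testResultsFormat: JUnit
    testResultsFiles: 'test-results.xml'
    failTaskOnFailedTests: true
```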

For coverage, the lesson is "track the trend, not the threshold". A hard 80% gate creates incentive to write low-quality tests that touch lines without asserting anything meaningful. A trend dashboard ("coverage dropped 5% on this PR — explain") makes the conversation about whether the loss is justified.

Good enough to ship

• TRX/JUnit results published on every run with condition: always()
• Cobertura coverage summary published
• Failed tests fail the build

Expert tier

• Flaky-test detection (retry once, tag, report)
• Coverage trend dashboard not a hard gate
• Slow-test report surfaces the top 10 every run

Compliance gates: CodeQL, CredScan, BinSkim, antimalware, PoliCheck

Enterprise pipelines run a fixed catalogue of security and compliance scans. You should know each by name, by what it catches, and by how it integrates.

CodeQL (GitHub's static analysis engine) builds a queryable database of your code and runs vulnerability queries against it. Catches SQL injection, path traversal, unsafe deserialisation, hard-coded crypto keys. Runs as a task; results post to the Advanced Security tab when enabled.

CredScan (Credential Scanner) greps the repo for accidentally committed secrets, private keys, connection strings. Has a deep ruleset built from leaked-credential post-mortems. Fail the build on any high-confidence finding.

BinSkim scans compiled Windows PE binaries for safe-compile flags (ASLR, DEP, control-flow guard, stack protection). It will reject a DLL built without the modern hardening flags. Mostly relevant for native code; managed assemblies still benefit from a subset.

Antimalware runs Defender (or a configured engine) over published artifacts. Catches the unlikely-but-possible "build server compromised, injecting payload" attack class.

PoliCheck scans for prohibited terms (legal, trademark, geopolitically sensitive). Mostly relevant inside Microsoft and other multinationals; surprisingly useful for catching demo-code variable names that would not survive customer review.

Outside Microsoft, the open-source equivalents you will reach for are Semgrep (CodeQL alternative), GitLeaks / TruffleHog (CredScan), Trivy / Grype (container scanning, dependency CVEs), Syft / CycloneDX (SBOM generation), and cosign (Sigstore signing).

How are these injected? The answer is the extends template again. The org publishes a pipeline template that runs the SDL gates before and after the consuming team's build stage. The consuming team writes a short azure-pipelines.yml that supplies parameters; the org's template owns the gate steps. This is the only sustainable way — leaving each team to remember nine security tasks is a guarantee that one team will not.
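
The template side of that arrangement can be sketched as below. The file name secure-pipeline.yml and the buildSteps parameter are hypothetical, and the gate steps are placeholder echo commands standing in for whichever scanner tasks the org actually mandates.

```yaml
# secure-pipeline.yml, owned by the platform team (hypothetical)
parameters:
- name: buildSteps
  type: stepList
  default: []

stages:
- stage: Build
  jobs:
  - job: SecureBuild
    pool:
      vmImage: ubuntu-latest
    steps:
    - script: echo "pre-build gate, e.g. credential scan"
      displayName: Pre-build SDL gate (placeholder)
    # The consumer's steps are spliced in between the gates.
    - ${{ each step in parameters.buildSteps }}:
      - ${{ step }}
    - script: echo "post-build gate, e.g. static analysis and SBOM"
      displayName: Post-build SDL gate (placeholder)
```

Because the consumer can only pass steps through the parameter, the gates on either side run regardless of what the consuming team writes.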

Good enough to ship

• CodeQL + CredScan + dependency scan + container scan + SBOM
• Run on every PR, fail the build on high-severity findings
• Antimalware on published artifacts

Expert tier

• All gates injected by extends template, not consumer YAML
• Custom CodeQL queries tuned for your domain
• SBOM signed; provenance attestation aligned with SLSA

Service connections: secret-based vs federated workload identity

A service connection is the credential your pipeline uses to talk to Azure (or any external system). Two flavours, and the gap between them is the largest single security upgrade Azure DevOps has shipped in a decade.

Secret-based service connections store a client secret (or certificate) for a service principal in Azure DevOps. The pipeline uses it to authenticate when calling Azure. The problem is the secret lives somewhere persistent — if Azure DevOps were ever compromised, every team's prod environment is on the table. Worse, secrets expire, the renewal lands on a single person's calendar, and rotation drift causes outages.

Federated workload identity (Workload Identity Federation, or WIF) eliminates the stored secret entirely. The app registration or managed identity behind the service connection carries a federated credential that trusts Azure DevOps as an OIDC issuer. At pipeline run time, Azure DevOps mints a short-lived OIDC token, exchanges it with Microsoft Entra ID for an access token, and uses that token for API calls. The tokens are short-lived and are never stored anywhere. There is nothing to rotate, nothing to leak.

Configure WIF in two places: in Entra ID, register a federated credential on the application object that trusts the Azure DevOps service connection's subject claim; in Azure DevOps, create the service connection with "Workload Identity Federation" instead of "Secret". The migration tooling (UI and az CLI) has gotten very good — a five-minute change per connection.

GitHub Actions has the same pattern (azure/login@v2 with client-id, tenant-id, subscription-id, no client-secret). If you are setting up a new project in 2026, secret-based service connections should not exist anywhere in your topology.
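
A minimal GitHub Actions sketch of the same idea follows. The secret names AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_SUBSCRIPTION_ID are a common convention rather than a requirement, and they hold identifiers, not credentials; the Entra app is assumed to already have a federated credential for this repository.

```yaml
# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [ main ]

permissions:
  id-token: write   # lets the job request an OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: azure/login@v2
      with:
        client-id: ${{ secrets.AZURE_CLIENT_ID }}
        tenant-id: ${{ secrets.AZURE_TENANT_ID }}
        subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    - run: az group list --output table
```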

Good enough to ship

• All Azure service connections use federated identity
• Per-environment service principals (no shared one across dev/prod)
• Least-privilege RBAC on each principal

Expert tier

• Federated credential subject pins WIF to one service connection; pipeline permissions and a branch-control check limit which pipelines and branches can use it
• Just-in-time RBAC: principal gets Contributor for the deploy window, then loses it
• Audit log streamed to a central SIEM for review

Secrets in pipelines: variable groups linked to Key Vault

Secrets that pipelines need (database connection strings, third-party API keys) belong in Azure Key Vault. Variable groups in the Azure DevOps Library can be linked to a Key Vault, so referencing $(MySecret) in YAML transparently reads the latest value from KV at run time. Rotate the secret in KV; the next pipeline run picks up the new value with no YAML change.

Two iron rules. Never hardcode a secret in YAML — Git history is forever, and the next CredScan run will set off alarms. Never echo a secret in a script. Azure DevOps will redact it (***) in the log, but if you pipe it to a file, write it to an artifact, or send it to an external service, it leaks. Treat any line containing $(SecretVar) as radioactive.
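
When a script genuinely needs a secret, one way to pass it is an explicit env: mapping, since secret variables are not exposed to scripts as environment variables automatically. In this sketch the variable group myapp-kv-secrets, the secret DbConnectionString, and the script scripts/migrate.sh are all hypothetical.

```yaml
variables:
- group: myapp-kv-secrets          # hypothetical Key Vault-linked variable group

steps:
- checkout: self
- script: ./scripts/migrate.sh     # hypothetical script that reads DB_CONNECTION_STRING
  displayName: Run migrations
  env:
    DB_CONNECTION_STRING: $(DbConnectionString)   # explicit mapping for this step only
```

Scoping the mapping to the one step that needs it keeps the secret out of every other step's environment.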

Service-account-style secrets (API keys you cannot federate) should be rotated on a schedule, with the rotation itself automated via a pipeline that uses WIF to write the new value to KV. Manual rotation is where secrets go to expire silently.

Good enough to ship

• KV-linked variable groups for all secrets
• Any secret not in Key Vault set as a secret variable in the pipeline UI or a variable group (never a plain value: in YAML)
• Never piped to logs, files, or external services

Expert tier

• Automated rotation pipeline reads from a source-of-truth and writes to KV
• KV access reviews quarterly; orphaned identities removed
• Soft-delete + purge protection on every KV

Worked examples

A minimal CI + deploy pipeline

# azure-pipelines.yml
trigger:
  branches: { include: [ main ] }
pr:
  branches: { include: [ main ] }
 
pool:
  vmImage: ubuntu-latest
 
variables:
  buildConfiguration: 'Release'
  netVersion: '8.0.x'
 
stages:
- stage: CI
  jobs:
  - job: BuildAndTest
    steps:
    - task: UseDotNet@2
      inputs: { version: $(netVersion) }
    - task: NuGetAuthenticate@1
    - script: dotnet restore
      displayName: Restore
    - script: dotnet build -c $(buildConfiguration) --no-restore
      displayName: Build
    - script: dotnet test -c $(buildConfiguration) --no-build --logger trx --collect "XPlat Code Coverage"
      displayName: Test
    - task: PublishTestResults@2
      condition: always()
      inputs: { testRunner: VSTest, testResultsFiles: '**/*.trx', failTaskOnFailedTests: true }
    - task: PublishCodeCoverageResults@2
      condition: always()
      inputs: { summaryFileLocation: '**/coverage.cobertura.xml' }
    - script: dotnet publish src/Api -c $(buildConfiguration) -o $(Build.ArtifactStagingDirectory)/api --no-build
      displayName: Publish API
    - publish: $(Build.ArtifactStagingDirectory)/api
      artifact: api

What to notice:

UseDotNet@2 pins the SDK; never rely on the agent's pre-installed version.
NuGetAuthenticate@1 runs before restore — without it, internal feeds 401.
--no-restore and --no-build stop later steps from repeating work an earlier step already did, which shortens the run.
TRX + coverage published with condition: always() so failures surface.
publish: emits a pipeline artifact named api for downstream stages to consume.

Multi-stage deploy with environments and approvals

stages:
- stage: CI    # as above
- stage: Deploy_Dev
  dependsOn: CI
  jobs:
  - deployment: DeployDev
    environment: dev
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: api
          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'svc-conn-dev'
              appName: 'myapi-dev'
              package: $(Pipeline.Workspace)/api
 
- stage: Deploy_Staging
  dependsOn: Deploy_Dev
  jobs:
  - deployment: DeployStaging
    environment: staging     # ← approvals attached in UI
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: api
          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'svc-conn-staging'
              appName: 'myapi-staging'
              package: $(Pipeline.Workspace)/api
          - checkout: self     # deployment jobs do not check out the repo by default
          - script: ./tests/smoke.sh https://myapi-staging.azurewebsites.net
            displayName: Smoke test
 
- stage: Deploy_Prod
  dependsOn: Deploy_Staging
  jobs:
  - deployment: DeployProd
    environment: prod        # ← manual approval + business-hours check in UI
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: api
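          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'svc-conn-prod'
              appName: 'myapi-prod'
              deployToSlotOrASE: true
              resourceGroupName: 'rg-myapp-prod'   # required for slot deploys; name assumed
              slotName: 'staging'
              package: $(Pipeline.Workspace)/api
          - task: AzureAppServiceManage@0
            inputs:
              azureSubscription: 'svc-conn-prod'   # this task needs its own connection input
              action: 'Swap Slots'
              webAppName: 'myapi-prod'
              resourceGroupName: 'rg-myapp-prod'   # name assumed
              sourceSlot: 'staging'
              swapWithProduction: true             # the swap target is the production slot
              preserveVnet: true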

What to notice:

Three environments, three service connections (each scoped to its own subscription/RG).
Staging runs a smoke test against the deployed URL; staging is the last cheap place to catch regressions.
Prod uses slot-swap: deploy to the staging slot, warm it, then swap. Rollback is one more swap call.
dependsOn chains stages; failure in one short-circuits the next.

Reusable build template

# templates/build.yml
parameters:
- name: solution
  type: string
- name: configuration
  type: string
  default: Release
- name: publishProject
  type: string
- name: artifactName
  type: string
  default: app
 
jobs:
- job: Build
  pool: { vmImage: ubuntu-latest }
  steps:
  - task: UseDotNet@2
    inputs: { version: '8.0.x' }
  - task: NuGetAuthenticate@1
  - script: dotnet restore ${{ parameters.solution }}
  - script: dotnet build ${{ parameters.solution }} -c ${{ parameters.configuration }} --no-restore
  - script: dotnet test ${{ parameters.solution }} -c ${{ parameters.configuration }} --no-build --logger trx
  - task: PublishTestResults@2
    condition: always()
    inputs: { testRunner: VSTest, testResultsFiles: '**/*.trx' }
  - script: dotnet publish ${{ parameters.publishProject }} -c ${{ parameters.configuration }} -o $(Build.ArtifactStagingDirectory)/${{ parameters.artifactName }} --no-build
  - publish: $(Build.ArtifactStagingDirectory)/${{ parameters.artifactName }}
    artifact: ${{ parameters.artifactName }}

# azure-pipelines.yml — consumes the template
trigger: { branches: { include: [ main ] } }
 
stages:
- stage: CI
  jobs:
  - template: templates/build.yml
    parameters:
      solution: 'src/MyApp.sln'
      publishProject: 'src/MyApp.Api/MyApp.Api.csproj'
      artifactName: 'api'

What to notice:

Parameters are typed; mis-spelling one fails at compile time.
The template owns task versions and the build flow; consumers cannot drift.
For org-wide enforcement, switch from template: to extends: and the consumer can only fill in the blanks.

Federated identity for a no-secret deploy

- stage: Deploy
  jobs:
  - deployment: ProdDeploy
    environment: prod
    strategy:
      runOnce:
        deploy:
          steps:
          - checkout: self   # deployment jobs skip checkout by default; infra/ must be on disk
          - task: AzureCLI@2
            displayName: Deploy infra
            inputs:
              azureSubscription: 'svc-conn-prod-fed'   # Workload Identity Federation
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az deployment group create \
                  --resource-group rg-myapp-prod \
                  --template-file infra/main.bicep \
                  --parameters infra/prod.bicepparam
          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'svc-conn-prod-fed'
              appName: 'myapi-prod'
              package: $(Pipeline.Workspace)/api

What to notice:

One service connection name, two consuming tasks — they share the same federated identity.
No client-secret anywhere; nothing in Azure DevOps to rotate.
The Entra app's federated credential is scoped to this service connection's subject claim; a pipeline that is not authorized to use the connection cannot obtain its token.

Hands-on exercises

Goal: Convert a classic-UI pipeline to YAML. Steps: (1) Pick a small repo with a classic pipeline. (2) In the UI, export the pipeline to YAML (button is on the pipeline page). (3) Commit the YAML, point the pipeline to it, retire the classic definition. You're done when the YAML pipeline runs green and the classic one is disabled.
Goal: Extract a build template and reuse it across two pipelines. Steps: (1) Identify the common steps in two repos' pipelines. (2) Create templates/build.yml with parameters: for the moving parts. (3) Point both pipelines at the template, supplying parameters. (4) Confirm both pipelines still produce identical artifacts. You're done when changing the template once updates both pipelines on the next run.
Goal: Set up Workload Identity Federation for a service connection. Steps: (1) In Entra ID, find or create an app registration for the deploy. (2) Add a federated credential whose subject claim matches the Azure DevOps connection. (3) In Azure DevOps, create the service connection with WIF. (4) Run a deploy pipeline; verify in Azure that the principal authenticated. You're done when the same pipeline succeeds without any stored client secret.
Goal: Add CodeQL, dependency review, and CredScan; verify a fail. Steps: (1) Add the three tasks (CodeQL via Advanced Security, dependency review via OWASP Dependency-Check or Dependabot, CredScan via Microsoft Security DevOps task). (2) Intentionally commit a fake AWS key like AKIA1234567890ABCDEF. (3) Push, watch the build fail on the CredScan step. (4) Revert the commit. You're done when the build fails on the planted secret and passes after revert.
Goal: Configure a "prod" environment with manual approval + business-hours check. Steps: (1) In Azure DevOps → Environments, create prod. (2) Add an "Approvals" check listing two approvers. (3) Add a "Business hours" check restricted to 09:00–17:00 local time, weekdays only. (4) Point a deploy stage at this environment. (5) Try to deploy at 18:30; confirm the gate blocks. You're done when an off-hours deploy is queued and blocked by the gate.
Goal: Build a slot-swap deploy with auto-rollback. Steps: (1) Deploy to the staging slot. (2) Run a warmup HTTP request against the slot. (3) Swap slots. (4) Run a smoke test against production. (5) On smoke failure, run a second swap to roll back. You're done when a broken build is auto-rolled back without manual intervention.
Goal: Author a multi-region rollout. Steps: (1) Define a canary stage targeting one region. (2) Insert a 30-minute bake by querying Application Insights for error rate. (3) On healthy bake, fan out to wave1 (2 regions) and wave2 (remaining). (4) Wire each wave's success to be a precondition for the next. You're done when a bad canary blocks the wider rollout automatically.

Self-check questions

Explain the difference between a stage and a job. Why split into multiple jobs within one stage?
What does an "environment" give you that a stage doesn't?
Why federated identity over a stored client secret? Name two concrete failure modes that disappear.
What's the order: CI, SDL gates, deploy — and what would break if you reversed any pair?
What's the difference between runOnce, rolling, and canary deployment strategies, and when do you reach for each?
Why do path filters matter for monorepos, and what is the failure mode if you forget them?
What's a variable group, and when do you link to Key Vault vs embed values inline?
What's the difference between an extends template and a regular include template?
Why does condition: always() belong on the test-publish task?
Walk through what happens, second by second, when a pipeline using Workload Identity Federation calls az group list.
Name three SDL gates and what each catches that the others do not.
Why is script: a last resort compared to a task?

High-signal resources

Official docs

Azure Pipelines YAML schema — the canonical reference.
Workload Identity Federation for service connections.
Environments + approvals + checks.
Templates: parameters and extends.
Microsoft SDL practices.
SLSA framework — supply-chain integrity levels.

Books or courses

Continuous Delivery — Jez Humble & David Farley. The canonical text; every chapter still relevant.
Accelerate — Forsgren, Humble, Kim. The data behind why CI/CD discipline correlates with business outcomes.

Practitioner posts

The DevOps Handbook — Gene Kim et al; the playbook that consolidated the practices.
Honeycomb's CI/CD writeups — incident-driven posts on production CI.
The Twelve-Factor App — codebase, build, release, run separation; the conceptual base.
Microsoft DevOps Resource Center — broad collection of Azure DevOps and GitHub Actions guidance.

Weekly milestones

Day 1: Read the YAML schema overview + the pipeline anatomy section. Author a hello-world pipeline with a trigger, one stage, one job, two scripts. Answer self-check 1, 2, 9.
Day 2: Add a real build: dotnet restore/build/test, publish test results, publish an artifact. Run on PRs and on main. Answer self-check 6, 12.
Day 3: Extract a templates/build.yml. Consume from a second pipeline. Read the templates docs end to end. Answer self-check 8.
Day 4-5: Add Deploy_Dev, Deploy_Staging, Deploy_Prod with environments and a manual approval on prod. Migrate the prod service connection to Workload Identity Federation. Answer self-check 3, 4, 5, 10.
Day 6-7: Add CodeQL, CredScan, dependency review, container scan, SBOM. Wire them into your extends template if you have one, otherwise into the build stage. Answer self-check 7, 11.

How it shows up in the capstone

The capstone repo ships with azure-pipelines.yml and a templates/ folder. The root pipeline declares CI + Deploy_Dev + Deploy_Staging + Deploy_Prod stages; templates/build.yml owns the dotnet flow; an extends template (imagined to live in a shared platform repo) wraps everything in SDL gates. Service connections use federated identity end to end; secrets live in Key Vault and are read via linked variable groups.

The prod environment has a manual approval (two reviewers), a business-hours check, and a REST gate that pings a deployment-freeze service. The prod stage deploys to a staging slot, runs warmup and smoke tests, then swaps with production. A second pipeline triggers on the build-pipeline resource and runs a multi-region rollout for the rare cases where a single-region App Service is not enough.

When you can describe each of those decisions and defend why they are not over-engineering for the capstone's scope, you have the chapter.
