feat(course): build out all 27 modules, capstone, scaffold, and conventions

Scaffold the course repo and author the full curriculum in dependency-chain order, following the settled build decisions in handoff.md.

- Scaffold: course README, vendor-neutral AGENTS.md (dogfoods Module 5), _TEMPLATE.md (the fixed 9-section module shape), root .gitignore, ship config.
- Modules 1-2: reference exemplars (locked for tone/depth/lab style).
- Modules 3-27: full lessons + runnable labs, each following the template, respecting the chain, vendor/model-agnostic, with "feel the pain" labs.
- Module 8 hosting comparison web-researched and date-stamped (as of 2026-06-22), not written from memory; expansion-zone modules carry Verify-before-publish.
- Capstone: the full loop end to end on the running tasks-app example.

Lab code syntax-checked (Python/shell/YAML); every module has the 7 core template sections.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
2026-06-22 12:18:30 -04:00
parent 4bd586bbd0
commit fbec36cb67
117 changed files with 15131 additions and 1 deletions
@@ -0,0 +1,384 @@
+# Module 18 — Continuous Delivery and Deployment
+
+> **Merged isn't running.** This module closes the last gap in the pipeline — getting approved code
+> from `main` to something actually serving traffic, automatically, with a way back when it's wrong.
+
+---
+
+## Prerequisites
+
+- **Module 10 — Reviewing Code You Didn't Write.** The PR review gate. Auto-deploy is only safe
+  because a human (or an agent under supervision) signed off on the diff first.
+- **Module 14 — Continuous Integration.** You already have a pipeline that lints, builds, and tests
+  on every push. CD is not a new system — it's **more stages on that same pipeline**, after the
+  checks pass.
+- **Module 15 — Security Scanning.** Dependency, secret, and static-analysis gates on the same
+  pushes. These are part of what makes shipping without a human in the loop survivable.
+- **Module 16 — Containers and Reproducible Environments.** The container image is *what you ship*.
+  CD takes that image and runs it somewhere. This module assumes you can already build and tag an
+  image of the `tasks-app`.
+- **Module 17 — Secrets, Config, and Environments.** A running service needs configuration and
+  secrets at runtime — *what it needs to run*. CD wires those into the deploy step instead of baking
+  them into the image.
+
+If you've done 14–17, you have all the parts. This module is the assembly.
+
+---
+
+## Learning objectives
+
+By the end of this module you can:
+
+1. State the precise difference between continuous **delivery** and continuous **deployment**, and
+   decide which one a given project should use.
+2. Extend your CI pipeline with build-and-publish stages that turn a merge into a versioned,
+   deployable artifact.
+3. Wire a deploy step that takes that artifact, injects runtime config/secrets, and brings up the
+   new version — provider-neutrally.
+4. Add a health check and an automatic **rollback** so a bad deploy reverts itself instead of
+   staying down.
+5. Reason about the deploy gate the way this audience already reasons about change windows: what's
+   automated, what's manual, and where the stop button is.
+
+---
+
+## Key concepts
+
+### The gap nobody automated yet
+
+Walk the pipeline you've built so far. A change gets proposed (Module 9), implemented on a branch
+(Module 6), reviewed as a PR (Module 10), checked by CI (Module 14), scanned for vulnerabilities
+(Module 15). It merges. `main` is now correct, tested, and clean.
+
+And then nothing happens. The code that's "done" is sitting in a Git history. The thing your users
+touch is still running last week's version. Somebody — usually you, usually at 6pm — has to SSH in,
+pull, build, restart, and pray. That manual last mile is where most outages are actually born:
+inconsistent steps, a forgotten config flag, a half-restarted service, "wait, which version is in
+prod right now?"
+
+CI answered *"is this change good?"* CD answers the next question: ***"now get the good change
+running, the same way every time."*** It's the same instinct that made CI worth it — replace an
+error-prone manual ritual with an automated, repeatable one — pointed at the last step.
+
+### Delivery vs. deployment: the distinction that matters
+
+These two terms get used interchangeably and they are not the same thing. The difference is exactly
+one decision: **who pushes the button to prod.**
+
+- **Continuous Delivery** — every merge to `main` automatically produces a **deployable artifact**
+  (a built, tagged, tested container image, sitting in a registry) and deploys it as far as a
+  staging/pre-prod environment. Production deploy is **one click by a human**. The pipeline
+  guarantees the artifact is *ready to ship at any moment*; a person decides *when*.
+
+- **Continuous Deployment** — same pipeline, but there's **no button**. If it passes every gate, it
+  goes all the way to production automatically. Merge is the last human action.
+
+```
+                 merge to main
+                      │
+        ┌─────────────┴──────────────┐
+   CONTINUOUS DELIVERY          CONTINUOUS DEPLOYMENT
+        │                            │
+   build + test + scan          build + test + scan
+        │                            │
+   publish artifact             publish artifact
+        │                            │
+   deploy to staging            deploy to staging
+        │                            │
+   [human clicks "ship"]        deploy to prod  (automatic)
+        │                            │
+   deploy to prod                  done
+```
+
+Both are "CD." When someone says "we do CD," ask which one — the operational risk is completely
+different. Continuous deployment is not the more advanced/better option you graduate to; it's a
+different risk posture that's appropriate for some systems and reckless for others. A blog,
+internal dashboard, or stateless web service with good tests is a fine candidate. A billing engine,
+a database migration, or anything with a regulatory change-control requirement usually is not — and
+"a human clicks deploy" is a perfectly mature answer there, not a failure to automate.
+
+The honest default for most teams adopting this: **start with continuous *delivery*.** Get the
+artifact and the deploy step fully automated and trustworthy, keep the human on the prod button, and
+remove that button only once you trust the gates more than you trust the click.
+
+### The artifact is the unit of deploy
+
+Here's the discipline that makes CD reliable, and it comes straight from Module 16: **you deploy a
+built image, not a Git ref.** "Deploy `main`" is ambiguous — it means "go to the prod box, pull,
+and rebuild," and that rebuild can pull a different base image or dependency version than CI tested.
+"Deploy `tasks-app:9f3a2c1`" is not ambiguous. It's the exact bytes CI built and tested.
+
+So the build-and-publish stage does this once, centrally:
+
+1. Build the image from the merged code.
+2. Tag it with something **immutable and traceable** — the Git commit SHA is the standard choice
+   (`tasks-app:9f3a2c1`). Optionally also a moving tag like `:latest` or `:staging` for convenience,
+   but the SHA tag is the one you trust.
+3. Push it to a container registry — the durable, shared home for images, the same way a Git remote
+   (Module 8) is the durable home for commits.
+
+Every later deploy — to staging, to prod, a rollback — just says "run *this* tag." Build once, run
+the identical artifact everywhere. That single property is what kills "works on my machine" at the
+deploy layer.
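+
+As a rough illustration of the tagging rule above, here is a minimal Python sketch. The helper
+names (`image_ref`, `is_pinned`) and the 7-to-40 hex-character pattern are assumptions made for
+this example, not part of the lab files, and `registry.example` is a placeholder host.
+
+```python
+import re
+
+# Assumption for this sketch: a "commit SHA tag" is 7 to 40 lowercase hex characters.
+SHA_TAG = re.compile(r"^[0-9a-f]{7,40}$")
+
+
+def image_ref(name: str, sha: str, registry: str = "") -> str:
+    """Build an image reference pinned to a commit SHA, e.g. tasks-app:9f3a2c1."""
+    if not SHA_TAG.match(sha):
+        raise ValueError(f"not a commit SHA: {sha!r}")
+    prefix = f"{registry}/" if registry else ""
+    return f"{prefix}{name}:{sha}"
+
+
+def is_pinned(ref: str) -> bool:
+    """True when the tag looks like a commit SHA rather than a moving tag such as latest."""
+    _, sep, tag = ref.rpartition(":")
+    return bool(sep) and bool(SHA_TAG.match(tag))
+
+
+print(image_ref("tasks-app", "9f3a2c1", registry="registry.example"))
+print(is_pinned("tasks-app:latest"))
+```
+
+A deploy script could call something like `is_pinned` on its argument and refuse moving tags,
+which keeps "run *this* tag" honest.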
+
+### The deploy step, provider-neutrally
+
+The shape of a deploy is the same everywhere, whatever the target — a cloud platform, a Kubernetes
+cluster, a single VM, a PaaS:
+
+1. **Pull** the specific image tag onto the target.
+2. **Inject runtime config and secrets** (Module 17) — environment variables, mounted secret files,
+   a secrets-manager lookup. Never baked into the image; supplied at run time so the *same* image
+   runs in staging and prod with different config.
+3. **Start the new version** alongside or in place of the old one.
+4. **Health-check** it before sending real traffic.
+5. **Cut over** if healthy; **roll back** if not.
+
+This module is deliberately provider-agnostic on *where* — the same way Module 8 stayed neutral on
+hosts. The mechanics differ (a `kubectl` apply, a platform CLI, a `docker run`, a `compose up`), but
+the five steps don't. The lab does the simplest possible real version: a local container run. The
+logic is identical at scale.
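+
+The five steps can be sketched as one small function. This is a simulation under stated
+assumptions: `start` and `healthy` are hypothetical callables standing in for whatever the target
+actually uses (a platform CLI, `kubectl`, a container run), and `fake_start`/`fake_healthy` exist
+only to make the sketch runnable. It is not the lab's `deploy.sh`.
+
+```python
+from typing import Callable, Optional
+
+
+def deploy(
+    new_tag: str,
+    previous_tag: Optional[str],
+    start: Callable[[str], None],
+    healthy: Callable[[], bool],
+) -> str:
+    """Run the deploy steps and return the tag left serving."""
+    start(new_tag)                      # steps 1-3: pull, inject config, start
+    if healthy():                       # step 4: health-check
+        return new_tag                  # step 5a: cut over
+    if previous_tag is None:
+        raise RuntimeError("new version is unhealthy and there is nothing to roll back to")
+    start(previous_tag)                 # step 5b: roll back to the known-good tag
+    if not healthy():
+        raise RuntimeError("rollback failed; service is down")
+    return previous_tag
+
+
+# Simulated target: a dict records "what is running"; the tag "bad" never becomes healthy.
+running = {"tag": None}
+
+
+def fake_start(tag: str) -> None:
+    running["tag"] = tag
+
+
+def fake_healthy() -> bool:
+    return running["tag"] != "bad"
+
+
+print(deploy("9f3a2c1", None, fake_start, fake_healthy))
+print(deploy("bad", "9f3a2c1", fake_start, fake_healthy))
+```
+
+Passing the target-specific parts in as callables is one way to keep the control flow identical
+across targets, which is the point of the five-step framing.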
+
+### Health checks and rollback: the part beginners skip
+
+A deploy that can't tell whether it worked isn't a deploy, it's a gamble. The single most important
+thing CD adds over "SSH in and restart" is that **the pipeline verifies the new version is alive
+before trusting it, and reverses itself when it isn't.**
+
+A health check is a cheap, honest signal that the new version is actually serving — typically an
+endpoint like `/health` that returns `200` only when the app has started clean. The deploy step
+hits it after starting the new version and **waits for green before cutting over.**
+
+Rollback is the other half: if the health check fails, the deploy stops the broken new version and
+brings the **previous known-good image tag** back up. Because you deploy immutable tags, rollback is
+trivial — you still have `tasks-app:<previous-sha>`, so "go back" is just "run the old tag again."
+No rebuild, no git revert race, no scramble. (Reverting the *source* is still Module 12's job for the
+code; rollback here is about the *running artifact*.) The strategies have names you'll meet —
+blue-green (run old and new side by side, flip a switch), canary (send 5% of traffic to new, watch,
+ramp) — but they're all variations on "keep the old one ready until the new one proves itself."
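+
+A minimal sketch of the "wait for green" half, assuming a plain HTTP `/health` endpoint. The
+function names (`http_ok`, `wait_healthy`) and the default of ten attempts one second apart are
+choices made for this example, and the `sleep` parameter exists only so the loop can be exercised
+without real waiting.
+
+```python
+import time
+import urllib.error
+import urllib.request
+from typing import Callable
+
+
+def http_ok(url: str, timeout: float = 2.0) -> bool:
+    """One probe: True only when the endpoint answers with HTTP 200."""
+    try:
+        with urllib.request.urlopen(url, timeout=timeout) as resp:
+            return resp.status == 200
+    except (urllib.error.URLError, OSError):
+        # Non-2xx responses, refused connections, and timeouts all count as "not healthy".
+        return False
+
+
+def wait_healthy(
+    probe: Callable[[], bool],
+    attempts: int = 10,
+    delay: float = 1.0,
+    sleep: Callable[[float], None] = time.sleep,
+) -> bool:
+    """Poll the probe until it passes or the attempts run out."""
+    for attempt in range(attempts):
+        if probe():
+            return True
+        if attempt < attempts - 1:
+            sleep(delay)                # no point sleeping after the final attempt
+    return False
+```
+
+In use, the probe would be something like `lambda: http_ok("http://localhost:8000/health")`. A
+`False` result from `wait_healthy` is the signal to bring the previous tag back up.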
+
+> **Reframe for the ops reader:** you already know this instinct. It's the deployment equivalent of
+> a maintenance window with a back-out plan — except the back-out plan is automated, tested on every
+> single deploy, and takes seconds instead of a panicked hour. CD doesn't remove the discipline you
+> already have; it encodes it so it runs every time instead of only when someone remembers.
+
+---
+
+## The AI angle
+
+CI existed long before AI, and so did CD. What changed is the **rate**, and rate is everything for
+the merged-to-prod gate.
+
+AI writes and ships changes dramatically faster. More PRs open, more of them merge, and they merge sooner.
+That's the upside — and it means the volume of code flowing toward production goes *up*, while the
+human attention available to babysit each deploy stays flat. The gap between "merged" and "in prod"
+stops being a quiet formality and becomes the place where the speed either pays off or hurts you.
+
+Two consequences follow, and they pull in opposite directions:
+
+- **Automating the deploy matters more.** If a human has to hand-deploy every AI-generated change,
+  the manual last mile becomes the bottleneck that eats all the speed AI just gave you. CD is what
+  lets the throughput actually reach users.
+- **The gate matters more.** Faster shipping of code that *looks right* (the recurring AI failure
+  mode from Modules 1 and 14) means a bad change reaches prod faster too — unless something catches
+  it. This is the crucial point: **continuous deployment is only survivable because of the gates in
+  front of it.** Review (Module 10), CI tests (Module 14), and security scanning (Module 15) are not
+  bureaucracy you tolerate — they are the *entire reason* you're allowed to remove the human from the
+  deploy button. Take auto-deploy without those gates and you've built a machine that ships AI
+  mistakes to production at full speed.
+
+So the AI-era posture is specific: **strengthen the early gates, then automate the late ones.** The
+more you trust review + CI + scanning, the further right you can safely push automation — up to and
+including no human on the prod button. The strength of the gates is the dial that decides whether
+continuous *deployment* is responsible or reckless for a given repo. And when an agent itself is the
+one merging (Unit 5), this stops being theoretical: the deploy gate is the last thing standing
+between an autonomous contributor and your users.
+
+---
+
+## Hands-on lab
+
+**Lab language:** shell, driving the container tooling from Module 16. You'll extend the `tasks-app`
+into a tiny running service, then build a deploy script that ships it locally with a health check and
+automatic rollback — the whole CD motion, simulated on your own machine.
+
+This lab simulates deployment with a **local container run** so it works on any machine with no cloud
+account. The five deploy steps are real; only the *target* is your laptop instead of a server.
+
+**You'll need:**
+
+- A container runtime from Module 16 — Docker or Podman. (Commands below use `docker`; if you run
+  Podman, `alias docker=podman` or substitute.)
+- The `tasks-app` from Modules 1–2, now a Git repo.
+- `curl` (for the health check) and a bash-capable shell. On Windows, use WSL or Git Bash.
+- Your AI assistant — by now, ideally editor-integrated (Module 4).
+
+Starter files are in this module's `lab/` folder:
+
+- `serve.py` — turns the `tasks-app` into a minimal HTTP service with a `/health` endpoint, using
+  only the Python standard library (no dependencies). This is the long-running thing CD deploys.
+- `Dockerfile` — the Module 16 container image, adjusted to run the service.
+- `deploy.sh` — the deploy step: build, tag, run, health-check, cut over or roll back.
+- `cd-starter.yml` — the CD pipeline stages, written as GitHub Actions and extending the Module 14
+  CI file. GitLab/other-forge notes are in the comments.
+
+### Part A — Make something worth deploying
+
+A CLI that exits immediately is awkward to "deploy." Give the app a long-running face.
+
+1. Copy `lab/serve.py` and `lab/Dockerfile` into your `tasks-app` folder next to `tasks.py` and
+   `cli.py`. Read `serve.py`: it's under 70 lines, docstring included, wrapping the `TaskList` you
+   already have in a stdlib HTTP server with two routes: `/health` and `/tasks`.
+
+2. Run it locally first, no container, to see it work:
+
+   ```bash
+   python serve.py        # serves on http://localhost:8000
+   ```
+
+   In another terminal:
+
+   ```bash
+   curl localhost:8000/health     # {"status": "ok", "version": "dev"}
+   curl localhost:8000/tasks      # your tasks as JSON
+   ```
+
+   Stop it with Ctrl-C. Commit this (`git add . && git commit -m "Add HTTP service + Dockerfile"`).
+
+### Part B — Build and tag the artifact
+
+3. Build the image and tag it with the current commit SHA — the immutable, traceable tag:
+
+   ```bash
+   SHA=$(git rev-parse --short HEAD)
+   docker build -t tasks-app:$SHA -t tasks-app:latest .
+   docker images tasks-app        # see both tags pointing at one image
+   ```
+
+   That `:$SHA` tag is the unit of deploy. Everything downstream refers to *this exact image*.
+
+### Part C — Deploy it (with a net)
+
+4. Read `lab/deploy.sh`. It does the five steps: builds the image for the tag, replaces any running
+   `tasks-app` container with the new image, injects runtime config as env vars (Module 17; note
+   the `APP_VERSION` and the *absence* of any secret baked into the image), polls `/health` for up
+   to ten attempts, and on failure rolls back to the previous tag it recorded. Make it executable
+   and run it:
+
+   ```bash
+   chmod +x deploy.sh
+   ./deploy.sh $SHA
+   ```
+
+   Watch it build, run, health-check, and report the deploy healthy. Hit it:
+
+   ```bash
+   curl localhost:8000/health     # now reports the SHA you deployed
+   ```
+
+   Run `./deploy.sh` again after another commit and notice it records the prior version as the
+   rollback target. You now have continuous *delivery* in miniature: one command turns a commit into
+   a running, version-tagged service.
+
+### Part D — Break a deploy and watch it roll back
+
+5. Now prove the net works. The service honors a `BREAK=1` env var that makes `/health` return `500`
+   — a stand-in for "this build starts but is actually broken." Deploy a healthy version first so
+   there's a known-good to fall back to, then force a bad one:
+
+   ```bash
+   ./deploy.sh $SHA               # healthy baseline
+   BREAK=1 ./deploy.sh $SHA       # same image, but the new instance fails its health check
+   ```
+
+   The script starts the "new" version, the health check fails after roughly ten seconds of
+   retries, and it **automatically stops the broken instance and brings the previous good one back
+   up.** Confirm you're serving again:
+
+   ```bash
+   curl localhost:8000/health     # ok — the bad deploy reverted itself
+   ```
+
+   That automatic reversal — not the build, not the run — is the part that makes auto-deploy
+   something you can sleep through.
+
+### Part E — Wire it into the pipeline (read + reason)
+
+6. Open `lab/cd-starter.yml` and compare it to the Module 14 `ci-starter.yml`. It's the **same
+   pipeline with stages appended**: the lint/test gates run first (unchanged; the Module 15 scan is
+   only noted in a comment), and the build-publish-deploy stages run only on a `push` to `main` (a
+   merge), because each carries an `if:` on the `main` ref. Trace the `needs:`/dependency chain
+   that makes deploy run *only after* the checks pass.
+
+7. Find the one line that is the delivery-vs-deployment switch — the deploy-to-prod step gated behind
+   a manual approval (`environment:` with a required reviewer, commented in the file). Decide, for
+   the `tasks-app`, which side you'd choose and why, and ask your AI assistant to make the case for
+   the *other* choice. The goal isn't a "right" answer; it's being able to articulate the risk
+   posture either way.
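+
+To make the `needs:` chain concrete, here is a small Python sketch that models the four jobs as a
+dependency graph. The job names mirror `cd-starter.yml`; the `NEEDS` mapping and the `run_order`
+and `runnable` helpers are illustrative only, and this models the default skip-on-upstream-failure
+behavior rather than anything the forge actually executes.
+
+```python
+from graphlib import TopologicalSorter
+
+# Each job maps to the set of jobs it `needs`.
+NEEDS = {
+    "check": set(),
+    "build-and-publish": {"check"},
+    "deploy-staging": {"build-and-publish"},
+    "deploy-prod": {"deploy-staging"},
+}
+
+
+def run_order(needs):
+    """Order in which jobs can run so every job follows the jobs it needs."""
+    return list(TopologicalSorter(needs).static_order())
+
+
+def runnable(needs, failed):
+    """Jobs that complete when `failed` jobs fail: anything downstream of a failure is skipped."""
+    blocked = set(failed)
+    for job in run_order(needs):
+        if needs[job] & blocked:
+            blocked.add(job)
+    return [job for job in run_order(needs) if job not in blocked]
+
+
+print(run_order(NEEDS))
+print(runnable(NEEDS, {"check"}))
+```
+
+A failed `check` leaves nothing runnable, which is the property step 6 asks you to trace: no
+deploy job can start unless everything upstream of it passed.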
+
+> **A note on running the full pipeline:** actually executing `cd-starter.yml` end to end needs a
+> forge with a container registry and a deploy target wired up — that's environment-specific and
+> partly Module 19's territory (the runners and compute underneath). Parts A–D give you the deploy
+> *logic* runnable today on your own machine; the YAML shows how it slots into the automated
+> pipeline you already started in Module 14.
+
+---
+
+## Where it breaks
+
+Be honest about the edges — this is where teams get burned.
+
+- **The deploy is only as safe as the gates in front of it.** Continuous deployment with weak tests
+  and no review isn't "moving fast," it's an automated mistake-shipping machine. If you haven't done
+  the Module 10/14/15 work, do *delivery* (human on the button), not *deployment*. Auto-deploy is a
+  reward you earn by trusting your gates, not a default you turn on.
+- **Health checks lie.** A `200` from `/health` means "the process started," not "the feature
+  works." A shallow health check passes while the app returns garbage to users. Make the check
+  meaningful (does it reach its database? can it serve a real request?) and lean on canary/gradual
+  rollout for anything important — but know that no health check replaces real tests and real
+  monitoring.
+- **Rollback isn't free, and some things don't roll back.** Reverting the *running image* is cheap.
+  Reverting a **database migration**, a sent email, a charged credit card, or a published message is
+  not — those are forward-only. The cleaner the separation between code deploys and irreversible
+  state changes, the more rollback actually saves you. Don't assume "we can always roll back" covers
+  data.
+- **This lab simulates the target.** A local `docker run` is the deploy logic, not the deploy
+  reality. Real targets add networking, DNS cutover, load balancers, zero-downtime orchestration,
+  and multiple instances. The five steps hold; the operational surface around them is larger. The
+  *compute* that runs all of this — and why you might run your own — is Module 19.
+- **"Build once" only holds if you actually do.** The instant someone rebuilds on the prod box "just
+  to be sure," you've lost the guarantee that prod runs what CI tested. Deploy the artifact CI built.
+  No rebuilds downstream.
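+
+On the "health checks lie" point, one way to make a check more meaningful is to aggregate several
+dependency probes into the health response. The sketch below is an assumption-heavy illustration:
+`health_report` and the check names are hypothetical, the lambdas are stand-ins for real probes,
+and returning `503` on failure is a choice of this example (the lab's `serve.py` returns `500`).
+
+```python
+from typing import Callable, Dict, Tuple
+
+
+def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
+    """Run every named check and return (HTTP status, JSON-ready payload)."""
+    results = {}
+    for name, check in checks.items():
+        try:
+            results[name] = bool(check())
+        except Exception:
+            results[name] = False       # a crashing check counts as a failing one
+    ok = all(results.values())
+    payload = {"status": "ok" if ok else "unhealthy", "checks": results}
+    return (200 if ok else 503), payload
+
+
+# Stand-in probes; a real service would test its own database, queue, and so on.
+status, body = health_report({
+    "storage_readable": lambda: True,
+    "can_list_tasks": lambda: True,
+})
+print(status, body)
+```
+
+Even a check like this only proves the probes passed at that moment; it narrows the gap between
+"the process started" and "the feature works" without replacing tests or monitoring.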
+
+---
+
+## Check for understanding
+
+**You're done when:**
+
+- You can state the difference between continuous delivery and continuous deployment in one sentence
+  — *who clicks the prod button* — and say which one `tasks-app` should use and why.
+- `./deploy.sh` builds, tags by commit SHA, runs the container, and reports a healthy deploy you can
+  `curl`.
+- You have **watched a bad deploy roll itself back** to the previous good version, and the service
+  came back up on that version without you touching it.
+- You can point at the line in `cd-starter.yml` that turns delivery into deployment, and explain what
+  gates have to be trustworthy before you'd flip it.
+
+When a deploy is one command, a bad one reverts itself, and you can argue the delivery-vs-deployment
+call for a given repo, you've closed the merged-to-running gap. Module 19 goes underneath all of
+this — the runners and compute actually executing your CI/CD, and why you'd own them.
+
+---
+
+## Verify-before-publish
+
+This is expansion-zone material (Module 15+); some specifics drift. Re-check at build/publish time:
+
+- [ ] **Action/runner versions** in `cd-starter.yml` (`actions/checkout`, `actions/setup-python`,
+      any build/login/push actions) — pin to current major versions and confirm they still exist.
+- [ ] **Registry login + push syntax** — the standard build-and-push action names and auth flow
+      change; verify against current forge docs rather than the comments here.
+- [ ] **Manual-approval mechanism** — the way a forge gates a job behind human approval
+      (GitHub `environment` protection rules, GitLab `when: manual`, others) shifts in naming/UI.
+      Confirm the delivery-vs-deployment switch still maps to the current feature.
+- [ ] **Container runtime commands** — confirm `docker`/`podman` flags used in `deploy.sh`
+      (`build`, `run -d -p -e`, `rm -f`) and the `HEALTHCHECK` options in the `Dockerfile` match
+      current CLI behavior.
+- [ ] **Cross-references** to Modules 16, 17, and 19 still match those modules' final content.
@@ -0,0 +1,24 @@
+# The Module 16 container image for the tasks-app, set to run the HTTP service from serve.py.
+#
+# This is *what you ship* (Module 16). Continuous delivery/deployment (this module) builds this
+# once, tags it with the commit SHA, and runs that exact artifact everywhere.
+#
+# Note what is NOT here: no secrets, no environment-specific config. Those are injected at run time
+# (Module 17), which is why the same image can run in staging and prod unchanged.
+
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# The app is dependency-free (stdlib only), so there is nothing to pip install. Copy the source.
+COPY tasks.py cli.py serve.py ./
+
+# Document the port the service listens on.
+EXPOSE 8000
+
+# A built-in container health check. The deploy step also checks /health from outside, but this
+# lets the runtime itself know whether the container is healthy.
+HEALTHCHECK --interval=5s --timeout=3s --retries=3 \
+    CMD python -c "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:8000/health').status==200 else 1)"
+
+CMD ["python", "serve.py"]
@@ -0,0 +1,87 @@
+# Starter CD pipeline for the tasks-app — GitHub Actions flavor, extending the Module 14 CI file.
+#
+# The whole idea: CD is not a new system. It is MORE STAGES on the SAME pipeline, after the checks
+# pass. The lint/test gates below are the Module 14 pipeline, unchanged. Everything from the
+# `build-and-publish` job down is new in this module.
+#
+# Where this file goes: .github/workflows/cd.yml (or fold it into your existing ci.yml). On GitLab,
+# the same shape is stages in .gitlab-ci.yml with `needs:`/`rules:`; Forgejo/Gitea use Actions-
+# compatible YAML. The concept — gated stages from merge to running — is identical everywhere.
+#
+# VERIFY BEFORE PUBLISH: action versions, the registry login/build-push action names, and the
+# manual-approval mechanism all drift. Check current forge docs at build time (see README checklist).
+
+name: CD
+
+on:
+  push:
+    branches: [main]      # only a MERGE to main triggers a deploy
+  pull_request:           # PRs still run the gates, but never deploy
+
+jobs:
+  # ---- The Module 14 gates: nothing ships without passing these first. ----------------------------
+  check:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install pytest ruff
+      - run: ruff check .          # lint
+      - run: pytest -q             # test
+      # In a real pipeline a security-scan job (Module 15) would also gate here.
+
+  # ---- Build the artifact ONCE and publish it. The unit of deploy is an immutable, SHA-tagged image.
+  build-and-publish:
+    needs: check                   # only runs if the gates passed
+    if: github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      # Log in to your container registry (Module 16's images need a durable home, like a Git remote
+      # is for commits). Registry/credentials are provider-specific — supply them as secrets,
+      # never inline (Module 17).
+      # - uses: docker/login-action@v3
+      #   with:
+      #     registry: ${{ vars.REGISTRY }}
+      #     username: ${{ secrets.REGISTRY_USER }}
+      #     password: ${{ secrets.REGISTRY_TOKEN }}
+
+      # Build and push, tagging with the commit SHA (immutable + traceable) and :staging (moving).
+      # - uses: docker/build-push-action@v6
+      #   with:
+      #     push: true
+      #     tags: |
+      #       ${{ vars.REGISTRY }}/tasks-app:${{ github.sha }}
+      #       ${{ vars.REGISTRY }}/tasks-app:staging
+      - run: echo "build + push tasks-app:${{ github.sha }} (wire up the registry steps above)"
+
+  # ---- Deploy to a NON-prod environment automatically. Safe to do on every merge. ----------------
+  deploy-staging:
+    needs: build-and-publish
+    if: github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+    steps:
+      # The five deploy steps live in deploy.sh in this folder. On a real target this would run the
+      # platform's deploy (kubectl / platform CLI / compose) against the SHA-tagged image, inject
+      # runtime config + secrets (Module 17), health-check, and roll back on failure.
+      - run: echo "deploy tasks-app:${{ github.sha }} to STAGING, health-check, roll back if red"
+
+  # ---- THIS JOB IS THE DELIVERY-vs-DEPLOYMENT SWITCH. ---------------------------------------------
+  #
+  # `environment: production` makes this job wait for human approval, but only once a required
+  # reviewer is set on the 'production' environment in the forge; without that rule it runs
+  # unattended. With the rule in place this is CONTINUOUS DELIVERY: the artifact is auto-built and
+  # staged; a person clicks to ship to prod.
+  #
+  # Delete the `environment:` block and this becomes CONTINUOUS DEPLOYMENT: merge -> prod, no human.
+  # Only remove it once you trust your review + CI + security gates (Modules 10/14/15) more than you
+  # trust the click. On GitLab the equivalent switch is `when: manual` vs. automatic.
+  deploy-prod:
+    needs: deploy-staging
+    if: github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+    environment: production        # <-- required-reviewer gate = delivery. Remove = deployment.
+    steps:
+      - run: echo "deploy tasks-app:${{ github.sha }} to PRODUCTION (gated on human approval)"
@@ -0,0 +1,95 @@
+#!/usr/bin/env bash
+#
+# deploy.sh — the deploy step of CD, simulated with a local container run.
+#
+# The five steps of any deploy, provider-neutral (see the module README):
+#   1. build/pull the specific image tag        4. health-check before trusting it
+#   2. inject runtime config + secrets          5. cut over if healthy, ROLL BACK if not
+#   3. start the new version
+#
+# The *target* here is your own machine instead of a server, but the logic is the real thing.
+#
+# Usage:
+#   ./deploy.sh <tag>        # e.g. ./deploy.sh $(git rev-parse --short HEAD)
+#   BREAK=1 ./deploy.sh <tag>   # force the new version's health check to fail, to demo rollback
+#
+# Requires: docker (or `alias docker=podman`), curl.
+
+set -euo pipefail
+
+IMAGE="tasks-app"
+CONTAINER="tasks-app"
+PORT="8000"
+STATE_FILE=".deploy-state"          # records the last good tag, for rollback
+TAG="${1:-$(git rev-parse --short HEAD)}"
+
+say() { printf '\n=== %s\n' "$*"; }
+
+# --- Step 1: build the artifact for this tag (in real CD this was already built+pushed by CI) -----
+say "Building ${IMAGE}:${TAG}"
+docker build -t "${IMAGE}:${TAG}" .
+
+# Load the last known-good tag (written after a healthy deploy) so we can roll back to it.
+PREVIOUS=""
+if [ -f "${STATE_FILE}" ]; then
+  PREVIOUS="$(cat "${STATE_FILE}")"
+fi
+
+# --- Steps 2 + 3: start the new version with runtime config/secrets injected (Module 17) ----------
+# Note: APP_VERSION is config supplied at run time, NOT baked into the image. A real deploy would
+# also pass secrets here (e.g. --env-file, a mounted secret, or a secrets-manager lookup) — never
+# committed, never in the image.
+start_version() {
+  local tag="$1"
+  # In-place replacement: the old container is removed first, so there is a brief gap in service.
+  docker rm -f "${CONTAINER}" >/dev/null 2>&1 || true
+  docker run -d --name "${CONTAINER}" \
+    -p "${PORT}:8000" \
+    -e "APP_VERSION=${tag}" \
+    ${BREAK:+-e "BREAK=${BREAK}"} \
+    "${IMAGE}:${tag}" >/dev/null
+}
+
+say "Starting ${IMAGE}:${TAG}"
+start_version "${TAG}"
+
+# --- Step 4: health-check the new version before trusting it --------------------------------------
+healthy() {
+  for _ in $(seq 1 10); do
+    if curl -fs "http://localhost:${PORT}/health" >/dev/null 2>&1; then
+      return 0
+    fi
+    sleep 1
+  done
+  return 1
+}
+
+say "Health-checking http://localhost:${PORT}/health"
+if healthy; then
+  # --- Step 5a: cut over. Record this as the new known-good for the next deploy's rollback target.
+  echo "${TAG}" > "${STATE_FILE}"
+  say "DEPLOY OK — ${IMAGE}:${TAG} is live and healthy"
+  curl -s "http://localhost:${PORT}/health"; echo
+  exit 0
+fi
+
+# --- Step 5b: ROLLBACK. The new version failed its health check. ----------------------------------
+say "HEALTH CHECK FAILED for ${IMAGE}:${TAG} — rolling back"
+docker rm -f "${CONTAINER}" >/dev/null 2>&1 || true
+
+if [ -z "${PREVIOUS}" ]; then
+  echo "No previous known-good version to roll back to. Service is DOWN." >&2
+  echo "(Deploy a healthy version first, then re-run the BREAK=1 deploy to see rollback work.)" >&2
+  exit 1
+fi
+
+# Rollback is trivial because we deploy immutable tags: just run the old one again. No rebuild.
+say "Restoring previous good version ${IMAGE}:${PREVIOUS}"
+BREAK="" start_version "${PREVIOUS}"     # clear BREAK so the good version comes up clean
+if healthy; then
+  say "ROLLED BACK — ${IMAGE}:${PREVIOUS} is live and healthy. The bad deploy reverted itself."
+  curl -s "http://localhost:${PORT}/health"; echo
+  exit 1   # exit non-zero: the deploy you asked for did NOT ship, even though service recovered
+else
+  echo "Rollback FAILED — service is DOWN. Investigate ${IMAGE}:${PREVIOUS}." >&2
+  exit 2
+fi
@@ -0,0 +1,67 @@
+"""Minimal HTTP face for the tasks-app, so there is something long-running to *deploy*.
+
+Standard library only — no pip install, so the container image stays tiny and the lab has no
+dependencies to drift. It reuses the TaskList from tasks.py (Modules 1-2) unchanged.
+
+Run it:
+    python serve.py            # serves on http://localhost:8000
+
+Endpoints:
+    GET /health   -> {"status": "ok", "version": <APP_VERSION>}   (200)
+    GET /tasks    -> the current tasks as JSON
+
+Two environment knobs make this realistic for the CD lab (config injected at run time, Module 17):
+    APP_VERSION   what /health reports as the running version (set by deploy.sh to the commit SHA)
+    BREAK=1       force /health to return 500 — a stand-in for "this build starts but is broken",
+                  used in Part D to trigger an automatic rollback.
+"""
+
+import json
+import os
+from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
+from pathlib import Path
+
+from tasks import Task, TaskList
+
+STATE = Path(__file__).parent / "tasks.json"
+PORT = int(os.environ.get("PORT", "8000"))
+APP_VERSION = os.environ.get("APP_VERSION", "dev")
+BREAK = os.environ.get("BREAK") == "1"
+
+
+def load() -> TaskList:
+    if not STATE.exists():
+        return TaskList()
+    raw = json.loads(STATE.read_text())
+    return TaskList(tasks=[Task(**t) for t in raw])
+
+
+class Handler(BaseHTTPRequestHandler):
+    def _send(self, code: int, payload: dict) -> None:
+        body = json.dumps(payload).encode()
+        self.send_response(code)
+        self.send_header("Content-Type", "application/json")
+        self.send_header("Content-Length", str(len(body)))
+        self.end_headers()
+        self.wfile.write(body)
+
+    def do_GET(self) -> None:
+        if self.path == "/health":
+            # A real health check would also confirm dependencies (db, etc.) are reachable.
+            if BREAK:
+                self._send(500, {"status": "unhealthy", "version": APP_VERSION})
+            else:
+                self._send(200, {"status": "ok", "version": APP_VERSION})
+        elif self.path == "/tasks":
+            tlist = load()
+            self._send(200, {"tasks": [t.__dict__ for t in tlist.tasks]})
+        else:
+            self._send(404, {"error": "not found"})
+
+    def log_message(self, *args) -> None:  # keep the lab output clean
+        pass
+
+
+if __name__ == "__main__":
+    print(f"serving tasks-app version={APP_VERSION} on http://localhost:{PORT}")
+    ThreadingHTTPServer(("0.0.0.0", PORT), Handler).serve_forever()