# Module 18: Continuous Delivery and Deployment

> **Merged isn't running.** This module closes the last gap in the pipeline: getting approved code
> from `main` to something actually serving traffic, automatically, with a way back when it's wrong.

---

## Prerequisites

- **Module 10: Reviewing Code You Didn't Write.** The PR review gate. Auto-deploy is only safe
  because a human (or an agent under supervision) signed off on the diff first.
- **Module 14: Continuous Integration.** You already have a pipeline that lints, builds, and tests
  on every push. CD is not a new system; it's **more stages on that same pipeline**, after the
  checks pass.
- **Module 15: Security Scanning.** Dependency, secret, and static-analysis gates on the same
  pushes. These are part of what makes shipping without a human in the loop survivable.
- **Module 16: Containers and Reproducible Environments.** The container image is *what you ship*.
  CD takes that image and runs it somewhere. This module assumes you can already build and tag an
  image of the `tasks-app`.
- **Module 17: Secrets, Config, and Environments.** A running service needs configuration and
  secrets at runtime, *what it needs to run*. CD wires those into the deploy step instead of baking
  them into the image.

If you've done 14–17, you have all the parts. This module is the assembly.

---

## Learning objectives

By the end of this module you can:

1. State the precise difference between continuous **delivery** and continuous **deployment**, and
   decide which one a given project should use.
2. Extend your CI pipeline with build-and-publish stages that turn a merge into a versioned,
   deployable artifact.
3. Wire a deploy step that takes that artifact, injects runtime config/secrets, and brings up the
   new version, provider-neutrally.
4. Add a health check and an automatic **rollback** so a bad deploy reverts itself instead of
   staying down.
5. Reason about the deploy gate the way this audience already reasons about change windows: what's
   automated, what's manual, and where the stop button is.

---

## Key concepts

### The gap nobody automated yet

Walk the pipeline you've built so far. A change gets proposed (Module 9), implemented on a branch
(Module 6), reviewed as a PR (Module 10), checked by CI (Module 14), scanned for vulnerabilities
(Module 15). It merges. `main` is now correct, tested, and clean.

And then nothing happens. The code that's "done" is sitting in a Git history. The thing your users
touch is still running last week's version. Somebody (usually you, usually at 6pm) has to SSH in,
pull, build, restart, and pray. That manual last mile is where most outages are actually born:
inconsistent steps, a forgotten config flag, a half-restarted service, "wait, which version is in
prod right now?"

CI answered *"is this change good?"* CD answers the next question: ***"now get the good change
running, the same way every time."*** It's the same instinct that made CI worth it, the one that
replaces an error-prone manual ritual with an automated, repeatable one, now pointed at the last
step.

### Delivery vs. deployment: the distinction that matters

These two terms get used interchangeably and they are not the same thing. The difference is exactly
one decision: **who pushes the button to prod.**

- **Continuous Delivery:** every merge to `main` automatically produces a **deployable artifact**
  (a built, tagged, tested container image, sitting in a registry) and deploys it as far as a
  staging/pre-prod environment. Production deploy is **one click by a human**. The pipeline
  guarantees the artifact is *ready to ship at any moment*; a person decides *when*.

- **Continuous Deployment:** same pipeline, but there's **no button**. If it passes every gate, it
  goes all the way to production automatically. Merge is the last human action.

```
                    merge to main
                          │
            ┌─────────────┴──────────────┐
   CONTINUOUS DELIVERY          CONTINUOUS DEPLOYMENT
            │                            │
   build + test + scan          build + test + scan
            │                            │
    publish artifact             publish artifact
            │                            │
    deploy to staging            deploy to staging
            │                            │
  [human clicks "ship"]    ──► deploy to prod (automatic)
            │                            │
     deploy to prod                    done
```

Both are "CD." When someone says "we do CD," ask which one; the operational risk is completely
different. Continuous deployment is not the more advanced/better option you graduate to; it's a
different risk posture that's appropriate for some systems and reckless for others. A blog,
internal dashboard, or stateless web service with good tests is a fine candidate. A billing engine,
a database migration, or anything with a regulatory change-control requirement usually is not, and
"a human clicks deploy" is a perfectly mature answer there, not a failure to automate.

The honest default for most teams adopting this: **start with continuous *delivery*.** Get the
artifact and the deploy step fully automated and trustworthy, keep the human on the prod button, and
remove that button only once you trust the gates more than you trust the click.

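
The one-decision difference can be written down as a tiny gate. This is only an illustrative sketch: `prod_gate`, its mode names, and the approver argument are hypothetical and do not come from the lab files or from any forge's API.

```bash
# Hypothetical gate: should the production deploy run right now?
#   $1 = "delivery" or "deployment"
#   $2 = who approved the release (empty when nobody has clicked "ship")
prod_gate() {
  local mode="$1" approved_by="${2:-}"
  case "$mode" in
    deployment)
      echo "deploy" ;;          # no button: passing the gates is the approval
    delivery)
      if [ -n "$approved_by" ]; then
        echo "deploy"           # a human decided when
      else
        echo "wait"             # artifact is ready, but nobody has shipped it
      fi ;;
    *)
      echo "unknown mode: $mode" >&2
      return 2 ;;
  esac
}

# Example calls:
#   prod_gate delivery ""        prints "wait"
#   prod_gate delivery alice     prints "deploy"
#   prod_gate deployment ""      prints "deploy"
```

Everything upstream of this function is identical in both modes; only the answer to "who approved?" changes.
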
### The artifact is the unit of deploy

Here's the discipline that makes CD reliable, and it comes straight from Module 16: **you deploy a
built image, not a Git ref.** "Deploy `main`" is ambiguous; it means "go to the prod box, pull,
and rebuild," and that rebuild can pull a different base image or dependency version than CI tested.
"Deploy `tasks-app:9f3a2c1`" is not ambiguous. It's the exact bytes CI built and tested.

So the build-and-publish stage does this once, centrally:

1. Build the image from the merged code.
2. Tag it with something **immutable and traceable**: the Git commit SHA is the standard choice
   (`tasks-app:9f3a2c1`). Optionally also a moving tag like `:latest` or `:staging` for convenience,
   but the SHA tag is the one you trust.
3. Push it to a container registry, the durable home for images the same way a Git remote
   (Module 8) is the durable home for commits.

Every later deploy (to staging, to prod, a rollback) just says "run *this* tag." Build once, run
the identical artifact everywhere. That single property is what kills "works on my machine" at the
deploy layer.

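
A rough sketch of that build-and-publish stage, assuming a Git checkout and a running Docker engine. The function names and the `REGISTRY` variable are assumptions made for illustration (the lab itself never pushes anywhere); `git rev-parse --short HEAD`, `docker build -t`, and `docker push` are the standard commands.

```bash
# Derive the immutable image reference for the current commit.
# $1 = registry prefix (hypothetical, e.g. registry.example.com/team); may be empty.
image_ref() {
  local registry="$1" sha
  sha="$(git rev-parse --short HEAD)" || return 1
  if [ -n "$registry" ]; then
    echo "${registry}/tasks-app:${sha}"
  else
    echo "tasks-app:${sha}"
  fi
}

# Build once, optionally publish, and print the reference later stages should deploy.
build_and_publish() {
  local ref
  ref="$(image_ref "${REGISTRY:-}")" || return 1
  docker build -t "$ref" . >&2 || return 1     # build output goes to stderr
  if [ -n "${REGISTRY:-}" ]; then
    docker push "$ref" >&2 || return 1         # publish only when a registry is set
  fi
  echo "$ref"                                  # stdout carries only the reference
}

# Example: ref="$(build_and_publish)" then hand "$ref" to every later deploy.
```

Keeping the reference as the stage's only output is the point: staging, prod, and rollback all receive the same string rather than rebuilding.
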
### The deploy step, provider-neutrally

The shape of a deploy is the same everywhere, whatever the target (a cloud platform, a Kubernetes
cluster, a single VM, a PaaS):

1. **Pull** the specific image tag onto the target.
2. **Inject runtime config and secrets** (Module 17): environment variables, mounted secret files,
   a secrets-manager lookup. Never baked into the image; supplied at run time so the *same* image
   runs in staging and prod with different config.
3. **Start the new version** alongside or in place of the old one.
4. **Health-check** it before sending real traffic.
5. **Cut over** if healthy; **roll back** if not.

This module is deliberately provider-agnostic on *where*, the same way Module 8 stayed neutral on
hosts. The mechanics differ (a `kubectl` apply, a platform CLI, a `docker run`, a `compose up`), but
the five steps don't. The lab does the simplest possible real version: a local container run. The
logic is identical at scale.

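
The five steps can be sketched as a skeleton whose provider-specific parts are swappable hooks. Every function name here is hypothetical, and the hooks only print what a real target would do; this is not the lab's `deploy.sh`, just the control flow it shares with any other target.

```bash
# Provider-specific hooks. These placeholders only print; a real target would
# call docker, kubectl, or a platform CLI here. HEALTH_RESULT is a test knob.
pull_image()   { echo "pull $1"; }
start_new()    { echo "start $1 with runtime config"; }
wait_healthy() { echo "health-check $1"; return "${HEALTH_RESULT:-0}"; }
cut_over()     { echo "cut over to $1"; }
roll_back()    { echo "roll back to $1"; }

# The provider-neutral part: the order of the steps and the branch at the end.
#   $1 = image tag to deploy, $2 = previous known-good tag
deploy() {
  local image="$1" previous="$2"
  pull_image "$image" || return 1      # 1. pull the exact tag
  start_new "$image"  || return 1      # 2 + 3. inject config, start the new version
  if wait_healthy "$image"; then       # 4. health-check before real traffic
    cut_over "$image"                  # 5a. healthy: switch over
  else
    roll_back "$previous"              # 5b. unhealthy: restore the old tag
    return 1
  fi
}

# Example: deploy tasks-app:9f3a2c1 tasks-app:4b7e0d2
```

Swapping targets means rewriting the five hooks; `deploy` itself does not change.
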
### Health checks and rollback: the part beginners skip

A deploy that can't tell whether it worked isn't a deploy, it's a gamble. The single most important
thing CD adds over "SSH in and restart" is that **the pipeline verifies the new version is alive
before trusting it, and reverses itself when it isn't.**

A health check is a cheap, honest signal that the new version is actually serving: typically an
endpoint like `/health` that returns `200` only when the app has started clean. The deploy step
hits it after starting the new version and **waits for green before cutting over.**

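
"Waits for green" usually means a bounded polling loop. A minimal sketch, assuming `curl` is available; the function name, the default attempt count, and the delay are illustrative choices, not values taken from the lab's `deploy.sh`.

```bash
# Poll a health URL until it answers HTTP 200 or the attempts run out.
#   $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 1)
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-1}" i=1 code
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -o /dev/null: drop the body, -w: print only the status code.
    # A refused connection makes curl fail; "|| true" keeps the loop going.
    code="$(curl -s -o /dev/null --max-time 2 -w '%{http_code}' "$url" || true)"
    [ "$code" = "200" ] && return 0
    i=$((i + 1))
    sleep "$delay"                     # pause before the next attempt
  done
  return 1                             # never went green: caller should roll back
}

# Example: wait_for_health http://localhost:8000/health 15 2 || echo "unhealthy"
```

The bound matters as much as the check: without a maximum number of attempts, a broken deploy hangs the pipeline instead of failing it.
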
Rollback is the other half. If the health check fails, the deploy stops the broken new version and
brings the **previous known-good image tag** back up. Because you deploy immutable tags, rollback is
trivial: you still have `tasks-app:<previous-sha>`, so "go back" is just "run the old tag again."
No rebuild, no git revert race, no scramble. (Reverting the *source* is still Module 12's job for the
code; rollback here is about the *running artifact*.) The strategies have names you'll meet:
blue-green (run old and new side by side, flip a switch) and canary (send 5% of traffic to new,
watch, ramp). They're all variations on "keep the old one ready until the new one proves itself."

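
Rollback only works if the deploy remembers what was running before. One simple way to do that is a pair of small state files; the file names and function names below are assumptions for illustration, and the lab's `deploy.sh` may record its rollback target differently.

```bash
# Hypothetical location of the "currently deployed tag" record.
STATE_FILE="${STATE_FILE:-.deployed-tag}"

# Call this only after a deploy has passed its health check.
# It shifts the current tag to "previous" and records the new one.
record_deploy() {
  local new="$1"
  if [ -f "$STATE_FILE" ]; then
    cp "$STATE_FILE" "$STATE_FILE.previous"
  fi
  echo "$new" > "$STATE_FILE"
}

# Print the previous known-good tag; fails when there is nothing to fall back to.
rollback_target() {
  [ -f "$STATE_FILE.previous" ] && cat "$STATE_FILE.previous"
}

# Example:
#   record_deploy tasks-app:4b7e0d2
#   record_deploy tasks-app:9f3a2c1
#   rollback_target                  # prints tasks-app:4b7e0d2
```

Because a failed deploy never calls `record_deploy`, the recorded tags always name versions that were healthy at least once.
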
> **Reframe for the ops reader:** you already know this instinct. It's the deployment equivalent of
> a maintenance window with a back-out plan, except the back-out plan is automated, tested on every
> single deploy, and takes seconds instead of a panicked hour. CD doesn't remove the discipline you
> already have; it encodes it so it runs every time instead of only when someone remembers.

---

## The AI angle

CI existed long before AI, and so did CD. What changed is the **rate**, and rate is everything for
the merged-to-prod gate.

AI writes and ships changes dramatically faster. More PRs open, more merge, and they merge sooner.
That's the upside, and it means the volume of code flowing toward production goes *up*, while the
human attention available to babysit each deploy stays flat. The gap between "merged" and "in prod"
stops being a quiet formality and becomes the place where that speed either pays off or hurts you.

Two consequences follow, and they pull in opposite directions:

- **Automating the deploy matters more.** If a human has to hand-deploy every AI-generated change,
  the manual last mile becomes the bottleneck that eats all the speed AI just gave you. CD is what
  lets the throughput actually reach users.
- **The gate matters more.** Faster shipping of code that *looks right* (the recurring AI failure
  mode from Modules 1 and 14) means a bad change reaches prod faster too, unless something catches
  it. This is the crucial point: **continuous deployment is only survivable because of the gates in
  front of it.** Review (Module 10), CI tests (Module 14), and security scanning (Module 15) are not
  bureaucracy you tolerate. They are the *entire reason* you're allowed to remove the human from the
  deploy button. Take auto-deploy without those gates and you've built a machine that ships AI
  mistakes to production at full speed.

So the AI-era posture is specific: **strengthen the early gates, then automate the late ones.** The
more you trust review + CI + scanning, the further right you can safely push automation, up to and
including no human on the prod button. The strength of the gates is the dial that decides whether
continuous *deployment* is responsible or reckless for a given repo. And when an agent itself is the
one merging (Unit 5), this stops being theoretical: the deploy gate is the last thing standing
between an autonomous contributor and your users.

---

## Hands-on lab

> **Starting point (this lab is skip-friendly).** This lab is self-contained and does not depend
> on the earlier labs. Its files live in `modules/18-continuous-delivery-and-deployment/lab/`. Copy them into a working folder
> and make a first commit so you start clean:
>
> ```bash
> cp -r ~/ai-workflow-course/modules/18-continuous-delivery-and-deployment/lab ~/ai-workflow-course/18-continuous-delivery-and-deployment-lab
> cd ~/ai-workflow-course/18-continuous-delivery-and-deployment-lab && git init -b main && git add -A && git commit -m "start: module 18"
> ```

**Lab language:** shell, driving the container tooling from Module 16. You'll extend the `tasks-app`
into a tiny running service, then build a deploy script that ships it locally with a health check and
automatic rollback, the whole CD motion simulated on your own machine.

This lab simulates deployment with a **local container run** so it works on any machine with no cloud
account. The five deploy steps are real; only the *target* is your laptop instead of a server.

**You'll need:**

- A container runtime from Module 16: Docker or Podman. (Commands below use `docker`; if you run
  Podman, `alias docker=podman` or substitute.) As in Module 16, the engine must be **running**
  before you build or deploy. On macOS/Windows start Docker Desktop (or `podman machine start`);
  `docker --version` succeeds even when the engine is stopped, so confirm it's live with
  `docker info` first, or `deploy.sh`'s build step fails with "Cannot connect to the Docker daemon."
- The `tasks-app` from Modules 1–2, now a Git repo.
- `curl` (for the health check) and a bash-capable shell. On Windows, use WSL or Git Bash.
- Claude Code (sub your own agent), editor-integrated as of Module 4. From here you **direct it** to
  do the setup, commit, build, and deploy work, then you **verify** the result; you don't type those
  commands by hand.

Starter files are in this module's `lab/` folder:

- `serve.py`: turns the `tasks-app` into a minimal HTTP service with a `/health` endpoint, using
  only the Python standard library (no dependencies). This is the long-running thing CD deploys.
- `Dockerfile`: the Module 16 container image, adjusted to run the service.
- `deploy.sh`: the deploy step: build, tag, run, health-check, cut over or roll back.
- `cd-starter.yml`: the CD pipeline stages, written as GitHub Actions and extending the Module 14
  CI file. GitLab/other-forge notes are in the comments.

### Part A: Make something worth deploying

A CLI that exits immediately is awkward to "deploy." Give the app a long-running face.

1. Direct Claude Code to bring the starter files into your `tasks-app` folder next to `tasks.py` and
   `cli.py`: *"Copy `serve.py`, `Dockerfile`, and `deploy.sh` from this module's `lab/` into the
   tasks-app folder."* Then **read `serve.py` yourself**; it's ~40 lines wrapping the `TaskList` you
   already have in a stdlib HTTP server with two routes, `/health` and `/tasks`. Verify the three
   files landed next to `tasks.py`/`cli.py`.

2. Run the service locally first, no container, to see it work:

   ```bash
   python serve.py            # serves on http://localhost:8000
   ```

   In another terminal:

   ```bash
   curl localhost:8000/health   # {"status": "ok", "version": "dev"}
   curl localhost:8000/tasks    # your tasks as JSON
   ```

   Stop it with Ctrl-C. Now have Claude Code commit the new files: *"Stage and commit the HTTP
   service and Dockerfile with a clear message."* **Verify** the commit before moving on: read the
   diff it staged and confirm no secret, state file, or junk got swept in (it should be just
   `serve.py`, `Dockerfile`, and `deploy.sh`).

### Part B: Build and tag the artifact

3. Have Claude Code build the image and tag it with the current commit SHA, the immutable, traceable
   tag: *"Build the container image and tag it with the short commit SHA and also `:latest`."*
   Getting the SHA is git work the agent drives. **Verify** the result yourself:

   ```bash
   docker images tasks-app    # both tags point at one image; note the SHA
   ```

   That `:<sha>` tag is the unit of deploy. Everything downstream refers to *this exact image*.

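
If you would rather not compare image IDs by eye, the "both tags point at one image" check can be scripted. A small sketch: `same_image` is a hypothetical helper, while `docker image inspect --format '{{.Id}}'` is the standard way to print an image's ID.

```bash
# Succeeds when two tags resolve to the same image ID.
# Returns 2 when either tag cannot be inspected (for example, it does not exist).
same_image() {
  local a b
  a="$(docker image inspect --format '{{.Id}}' "$1")" || return 2
  b="$(docker image inspect --format '{{.Id}}' "$2")" || return 2
  [ "$a" = "$b" ]
}

# Example, using the short SHA of the commit you just built:
#   sha="$(git rev-parse --short HEAD)"
#   same_image "tasks-app:$sha" tasks-app:latest && echo "tags match"
```
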
### Part C: Deploy it (with a net)

4. **Read `lab/deploy.sh` yourself** before running it. It does the five steps: stops any running
   `tasks-app` container, starts the new image with runtime config injected as env vars (Module 17,
   note the `APP_VERSION` and the *absence* of any secret baked into the image), polls `/health`
   until green, and on failure rolls back to the previous tag it recorded.

   Now direct Claude Code to run the deploy against the SHA you just built: *"Run `deploy.sh` for the
   current commit SHA and report whether it came up healthy."* The agent makes the script executable
   and runs it. **Verify** the deploy yourself:

   ```bash
   curl localhost:8000/health   # now reports the SHA you deployed
   ```

   Ask the agent to commit a trivial change and deploy again, then read back what it recorded as the
   rollback target. You now have continuous *delivery* in miniature: one command turns a commit into
   a running, version-tagged service.

### Part D: Break a deploy and watch it roll back

5. Now prove the net works. The service honors a `BREAK=1` env var that makes `/health` return
   `500`, a stand-in for "this build starts but is actually broken." First have the agent deploy a
   healthy version so there's a known-good to fall back to, then trigger the broken one yourself so
   you watch it happen:

   ```bash
   ./deploy.sh            # healthy baseline (defaults to the current commit SHA)
   BREAK=1 ./deploy.sh    # same image, but the new instance fails its health check
   ```

   The script starts the "new" version, the health check fails, and it **automatically stops the
   broken instance and brings the previous good one back up.** Confirm you're still serving:

   ```bash
   curl localhost:8000/health   # ok, the bad deploy reverted itself
   ```

   That automatic reversal, not the build and not the run, is the part that makes auto-deploy
   something you can sleep through.

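
To make "the bad deploy reverted itself" a check rather than an impression, compare the version `/health` reports before and after the broken deploy. A sketch under one assumption: that the response keeps the `{"status": ..., "version": ...}` shape shown in Part A. Both helper names are hypothetical.

```bash
# Extract the "version" field from the /health JSON without requiring jq.
served_version() {
  curl -s "$1" | sed -n 's/.*"version": *"\([^"]*\)".*/\1/p'
}

# Succeeds when a version was being served and it is unchanged afterwards.
#   $1 = version before the broken deploy, $2 = version after it
rollback_held() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example:
#   before="$(served_version localhost:8000/health)"
#   BREAK=1 ./deploy.sh
#   after="$(served_version localhost:8000/health)"
#   rollback_held "$before" "$after" && echo "rollback held at $after"
```
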
### Part E: Wire it into the pipeline (read + reason)

6. Open `lab/cd-starter.yml` and compare it to the Module 14 `ci-starter.yml`. It's the **same
   pipeline with stages appended**: the lint/test/scan gates run first (unchanged), and the
   build-publish-deploy stages run only on a `push` to `main` (a merge). Trace the
   `needs:`/dependency chain that makes deploy run *only after* the checks pass.

7. Find the one line that is the delivery-vs-deployment switch: the deploy-to-prod step gated behind
   a manual approval (`environment:` with a required reviewer, commented in the file). Decide, for
   the `tasks-app`, which side you'd choose and why, and ask Claude Code to make the case for the
   *other* choice. The goal isn't a "right" answer; it's being able to articulate the risk posture
   either way.

> **A note on running the full pipeline:** actually executing `cd-starter.yml` end to end needs a
> forge with a container registry and a deploy target wired up; that's environment-specific and
> partly Module 19's territory (the runners and compute underneath). Parts A–D give you the deploy
> *logic* runnable today on your own machine; the YAML shows how it slots into the automated
> pipeline you already started in Module 14.

---

## Where it breaks

Be honest about the edges: this is where teams get burned.

- **The deploy is only as safe as the gates in front of it.** Continuous deployment with weak tests
  and no review isn't "moving fast," it's an automated mistake-shipping machine. If you haven't done
  the Module 10/14/15 work, do *delivery* (human on the button), not *deployment*. Auto-deploy is a
  reward you earn by trusting your gates, not a default you turn on.
- **Health checks lie.** A `200` from `/health` means "the process started," not "the feature
  works." A shallow health check passes while the app returns garbage to users. Make the check
  meaningful (does it reach its database? can it serve a real request?) and lean on canary/gradual
  rollout for anything important, but know that no health check replaces real tests and real
  monitoring.
- **Rollback isn't free, and some things don't roll back.** Reverting the *running image* is cheap.
  Reverting a **database migration**, a sent email, a charged credit card, or a published message is
  not. Those are forward-only. The cleaner the separation between code deploys and irreversible
  state changes, the more rollback actually saves you. Don't assume "we can always roll back" covers
  data.
- **This lab simulates the target.** A local `docker run` is the deploy logic, not the deploy
  reality. Real targets add networking, DNS cutover, load balancers, zero-downtime orchestration,
  and multiple instances. The five steps hold; the operational surface around them is larger. The
  *compute* that runs all of this (and why you might run your own) is Module 19.
- **"Build once" only holds if you actually do.** The instant someone rebuilds on the prod box "just
  to be sure," you've lost the guarantee that prod runs what CI tested. Deploy the artifact CI built.
  No rebuilds downstream.

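
On the "health checks lie" point, one modest step up from a bare `/health` probe is to also exercise a real route. A minimal sketch, assuming the lab's two routes (`/health` and `/tasks`) and `curl`; the helper name is hypothetical, and a `200` from `/tasks` is still only a stronger signal, not proof that the feature works.

```bash
# Require HTTP 200 from the health endpoint AND from a real request path.
#   $1 = base URL, e.g. http://localhost:8000
deep_check() {
  local base="$1" path code
  for path in /health /tasks; do
    code="$(curl -s -o /dev/null -w '%{http_code}' "${base}${path}" || true)"
    if [ "$code" != "200" ]; then
      echo "FAIL ${path} -> ${code}" >&2   # say which probe failed, and how
      return 1
    fi
  done
  return 0
}

# Example: deep_check http://localhost:8000 || echo "treat this deploy as unhealthy"
```
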
---

## Check for understanding

**You're done when:**

- You can state the difference between continuous delivery and continuous deployment in one sentence
  (*who clicks the prod button*) and say which one `tasks-app` should use and why.
- `./deploy.sh` builds, tags by commit SHA, runs the container, and reports a healthy deploy you can
  `curl`.
- You have **watched a bad deploy roll itself back** to the previous good version, and the service
  stayed up.
- You can point at the line in `cd-starter.yml` that turns delivery into deployment, and explain what
  gates have to be trustworthy before you'd flip it.

When a deploy is one command, a bad one reverts itself, and you can argue the delivery-vs-deployment
call for a given repo, you've closed the merged-to-running gap. Module 19 goes underneath all of
this: the runners and compute actually executing your CI/CD, and why you'd own them.

---

## Verify-before-publish

This is expansion-zone material (Module 15+); some specifics drift. Re-check at build/publish time:

- [ ] **Action/runner versions** in `cd-starter.yml` (`actions/checkout`, `actions/setup-python`,
  any build/login/push actions); pin to current major versions and confirm they still exist.
- [ ] **Registry login + push syntax:** the standard build-and-push action names and auth flow
  change; verify against current forge docs rather than the comments here.
- [ ] **Manual-approval mechanism:** the way a forge gates a job behind human approval
  (GitHub `environment` protection rules, GitLab `when: manual`, others) shifts in naming/UI.
  Confirm the delivery-vs-deployment switch still maps to the current feature.
- [ ] **Container runtime commands:** confirm `docker`/`podman` flags used in `deploy.sh`
  (`run`, `--health-*`, `inspect`) match current CLI behavior.
- [ ] **Cross-references** to Modules 16, 17, and 19 still match those modules' final content.