feat(course): build out all 27 modules, capstone, scaffold, and conventions
Scaffold the course repo and author the full curriculum in dependency-chain order, following the
settled build decisions in handoff.md.

- Scaffold: course README, vendor-neutral AGENTS.md (dogfoods Module 5), _TEMPLATE.md (the fixed
  9-section module shape), root .gitignore, ship config.
- Modules 1-2: reference exemplars (locked for tone/depth/lab style).
- Modules 3-27: full lessons + runnable labs, each following the template, respecting the chain,
  vendor/model-agnostic, with "feel the pain" labs.
- Module 8 hosting comparison web-researched and date-stamped (as of 2026-06-22), not written from
  memory; expansion-zone modules carry Verify-before-publish.
- Capstone: the full loop end to end on the running tasks-app example.

Lab code syntax-checked (Python/shell/YAML); every module has the 7 core template sections.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
@@ -0,0 +1,384 @@
# Module 18 — Continuous Delivery and Deployment

> **Merged isn't running.** This module closes the last gap in the pipeline — getting approved code
> from `main` to something actually serving traffic, automatically, with a way back when it's wrong.

---

## Prerequisites

- **Module 10 — Reviewing Code You Didn't Write.** The PR review gate. Auto-deploy is only safe
  because a human (or an agent under supervision) signed off on the diff first.
- **Module 14 — Continuous Integration.** You already have a pipeline that lints, builds, and tests
  on every push. CD is not a new system — it's **more stages on that same pipeline**, after the
  checks pass.
- **Module 15 — Security Scanning.** Dependency, secret, and static-analysis gates on the same
  pushes. These are part of what makes shipping without a human in the loop survivable.
- **Module 16 — Containers and Reproducible Environments.** The container image is *what you ship*.
  CD takes that image and runs it somewhere. This module assumes you can already build and tag an
  image of the `tasks-app`.
- **Module 17 — Secrets, Config, and Environments.** A running service needs configuration and
  secrets at runtime — *what it needs to run*. CD wires those into the deploy step instead of baking
  them into the image.

If you've done 14–17, you have all the parts. This module is the assembly.

---

## Learning objectives

By the end of this module you can:

1. State the precise difference between continuous **delivery** and continuous **deployment**, and
   decide which one a given project should use.
2. Extend your CI pipeline with build-and-publish stages that turn a merge into a versioned,
   deployable artifact.
3. Wire a deploy step that takes that artifact, injects runtime config/secrets, and brings up the
   new version — provider-neutrally.
4. Add a health check and an automatic **rollback** so a bad deploy reverts itself instead of
   staying down.
5. Reason about the deploy gate the way ops teams already reason about change windows: what's
   automated, what's manual, and where the stop button is.

---

## Key concepts

### The gap nobody automated yet

Walk the pipeline you've built so far. A change gets proposed (Module 9), implemented on a branch
(Module 6), reviewed as a PR (Module 10), checked by CI (Module 14), scanned for vulnerabilities
(Module 15). It merges. `main` is now correct, tested, and clean.

And then nothing happens. The code that's "done" is sitting in a Git history. The thing your users
touch is still running last week's version. Somebody — usually you, usually at 6pm — has to SSH in,
pull, build, restart, and pray. That manual last mile is where most outages are actually born:
inconsistent steps, a forgotten config flag, a half-restarted service, "wait, which version is in
prod right now?"

CI answered *"is this change good?"* CD answers the next question: ***"now get the good change
running, the same way every time."*** It's the same instinct that made CI worth it — replace an
error-prone manual ritual with an automated, repeatable one — pointed at the last step.

### Delivery vs. deployment: the distinction that matters

These two terms get used interchangeably and they are not the same thing. The difference is exactly
one decision: **who pushes the button to prod.**

- **Continuous Delivery** — every merge to `main` automatically produces a **deployable artifact**
  (a built, tagged, tested container image, sitting in a registry) and deploys it as far as a
  staging/pre-prod environment. Production deploy is **one click by a human**. The pipeline
  guarantees the artifact is *ready to ship at any moment*; a person decides *when*.

- **Continuous Deployment** — same pipeline, but there's **no button**. If it passes every gate, it
  goes all the way to production automatically. Merge is the last human action.

```
                    merge to main
                          │
            ┌─────────────┴──────────────┐
   CONTINUOUS DELIVERY         CONTINUOUS DEPLOYMENT
            │                            │
   build + test + scan          build + test + scan
            │                            │
    publish artifact             publish artifact
            │                            │
    deploy to staging            deploy to staging
            │                            │
  [human clicks "ship"] ──► deploy to prod (automatic)
            │                            │
     deploy to prod                    done
```

Both are "CD." When someone says "we do CD," ask which one — the operational risk is completely
different. Continuous deployment is not the more advanced/better option you graduate to; it's a
different risk posture that's appropriate for some systems and reckless for others. A blog,
internal dashboard, or stateless web service with good tests is a fine candidate. A billing engine,
a database migration, or anything with a regulatory change-control requirement usually is not — and
"a human clicks deploy" is a perfectly mature answer there, not a failure to automate.

The honest default for most teams adopting this: **start with continuous *delivery*.** Get the
artifact and the deploy step fully automated and trustworthy, keep the human on the prod button, and
remove that button only once you trust the gates more than you trust the click.
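
The one decision that separates the two modes can be sketched as a small gate function. This is a hypothetical illustration, not part of the lab files: the names `Mode` and `may_deploy_to_prod` are invented here, and the sketch assumes the pipeline can report two facts, whether every gate passed and whether a human approved.

```python
from enum import Enum


class Mode(Enum):
    DELIVERY = "delivery"      # a human approves the production deploy
    DEPLOYMENT = "deployment"  # passing every gate is enough


def may_deploy_to_prod(mode: Mode, gates_passed: bool, human_approved: bool) -> bool:
    """Return True when the pipeline is allowed to ship to production."""
    if not gates_passed:
        # Neither mode ships a build that failed review, tests, or scanning.
        return False
    if mode is Mode.DELIVERY:
        # Delivery: the artifact is ready, but a person decides when it ships.
        return human_approved
    # Deployment: merge was the last human action.
    return True
```

Both modes share the first check; only the approval branch differs, which is why switching between them is a small configuration change and a large change in risk posture.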

### The artifact is the unit of deploy

Here's the discipline that makes CD reliable, and it comes straight from Module 16: **you deploy a
built image, not a Git ref.** "Deploy `main`" is ambiguous — it means "go to the prod box, pull,
and rebuild," and that rebuild can pull a different base image or dependency version than CI tested.
"Deploy `tasks-app:9f3a2c1`" is not ambiguous. It's the exact bytes CI built and tested.

So the build-and-publish stage does this once, centrally:

1. Build the image from the merged code.
2. Tag it with something **immutable and traceable** — the Git commit SHA is the standard choice
   (`tasks-app:9f3a2c1`). Optionally also a moving tag like `:latest` or `:staging` for convenience,
   but the SHA tag is the one you trust.
3. Push it to a container registry — the durable, shared home for images, the same way a Git remote
   (Module 8) is the durable home for commits.

Every later deploy — to staging, to prod, a rollback — just says "run *this* tag." Build once, run
the identical artifact everywhere. That single property is what kills "works on my machine" at the
deploy layer.
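
As a rough sketch of the tagging rule, the helper below builds the immutable reference a deploy would name. The function `image_ref` and the registry host `registry.example.com` are assumptions made up for illustration; the only rule taken from the text is that the trusted tag is a commit SHA, not a moving tag.

```python
import re

# A short or full Git commit SHA: 7 to 40 lowercase hex characters.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def image_ref(registry: str, name: str, sha: str) -> str:
    """Build an immutable image reference of the form registry/name:sha.

    Moving tags such as "latest" or "staging" are rejected so a caller
    cannot ask for an ambiguous artifact.
    """
    if not _SHA_RE.fullmatch(sha):
        raise ValueError(f"expected a commit SHA, got {sha!r}")
    return f"{registry}/{name}:{sha}"


if __name__ == "__main__":
    # Hypothetical registry host, used only for this example.
    print(image_ref("registry.example.com", "tasks-app", "9f3a2c1"))
```

Staging, prod, and rollback would all pass the same string around, so "which version is running" always has a one-word answer.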

### The deploy step, provider-neutrally

The shape of a deploy is the same everywhere, whatever the target — a cloud platform, a Kubernetes
cluster, a single VM, a PaaS:

1. **Pull** the specific image tag onto the target.
2. **Inject runtime config and secrets** (Module 17) — environment variables, mounted secret files,
   a secrets-manager lookup. Never baked into the image; supplied at run time so the *same* image
   runs in staging and prod with different config.
3. **Start the new version** alongside or in place of the old one.
4. **Health-check** it before sending real traffic.
5. **Cut over** if healthy; **roll back** if not.

This module is deliberately provider-agnostic on *where* — the same way Module 8 stayed neutral on
hosts. The mechanics differ (a `kubectl` apply, a platform CLI, a `docker run`, a `compose up`), but
the five steps don't. The lab does the simplest possible real version: a local container run. The
logic is identical at scale.
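
One way to see that the five steps are independent of the provider is to write them against injected callables. The sketch below is an assumption-laden illustration, not the lab's implementation: `Target` and `deploy` are hypothetical names, and each field of `Target` stands in for whatever the real platform offers (`kubectl`, a platform CLI, `docker run`).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Target:
    """Hypothetical adapter: each field wraps one provider-specific operation."""
    pull: Callable[[str], None]         # step 1: fetch the image tag
    start: Callable[[str, dict], None]  # steps 2 + 3: inject config, start the version
    is_healthy: Callable[[], bool]      # step 4: health check
    cut_over: Callable[[str], None]     # step 5a: send traffic to the tag


def deploy(target: Target, tag: str, config: dict, previous: Optional[str]) -> str:
    """Run the five steps and return the tag that ends up live."""
    target.pull(tag)
    target.start(tag, config)
    if target.is_healthy():
        target.cut_over(tag)
        return tag
    if previous is None:
        raise RuntimeError(f"{tag} is unhealthy and there is no version to roll back to")
    # Step 5b: roll back by starting the previous known-good tag again.
    # For simplicity this sketch reuses the same runtime config.
    target.start(previous, config)
    if not target.is_healthy():
        raise RuntimeError(f"rollback to {previous} also failed its health check")
    target.cut_over(previous)
    return previous
```

With fake callables that record their calls, a failed health check on the new tag followed by a passing one on the old tag returns the previous tag, which mirrors what `deploy.sh` does in Part D of the lab.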

### Health checks and rollback: the part beginners skip

A deploy that can't tell whether it worked isn't a deploy, it's a gamble. The single most important
thing CD adds over "SSH in and restart" is that **the pipeline verifies the new version is alive
before trusting it, and reverses itself when it isn't.**

A health check is a cheap, honest signal that the new version is actually serving — typically an
endpoint like `/health` that returns `200` only when the app has started clean. The deploy step
hits it after starting the new version and **waits for green before cutting over.**

Rollback is the other half: if the health check fails, the deploy stops the broken new version and
brings the **previous known-good image tag** back up. Because you deploy immutable tags, rollback is
trivial — you still have `tasks-app:<previous-sha>`, so "go back" is just "run the old tag again."
No rebuild, no git revert race, no scramble. (Reverting the *source* is still Module 12's job for the
code; rollback here is about the *running artifact*.) The strategies have names you'll meet —
blue-green (run old and new side by side, flip a switch), canary (send 5% of traffic to new, watch,
ramp) — but they're all variations on "keep the old one ready until the new one proves itself."
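
The "wait for green" step is a bounded polling loop. The lab's `deploy.sh` does it with `curl -fs` in a shell loop; below is a minimal Python sketch of the same idea using only the standard library. The function name `wait_until_healthy` and its default values are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll url until it answers 200, or give up after the given number of attempts."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=3) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or an error status such as 500:
            # the new version is not healthy yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A caller would roll back when this returns `False`. The bound on attempts matters: without it, a version that never comes up would stall the deploy instead of triggering the rollback.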

> **Reframe for the ops reader:** you already know this instinct. It's the deployment equivalent of
> a maintenance window with a back-out plan — except the back-out plan is automated, tested on every
> single deploy, and takes seconds instead of a panicked hour. CD doesn't remove the discipline you
> already have; it encodes it so it runs every time instead of only when someone remembers.

---

## The AI angle

CI existed long before AI, and so did CD. What changed is the **rate**, and rate is everything for
the merged-to-prod gate.

AI-assisted contributors write and ship changes faster: more PRs open, more of them merge, and they
merge sooner. That's the upside — and it means the volume of code flowing toward production goes
*up*, while the human attention available to babysit each deploy stays flat. The gap between
"merged" and "in prod" stops being a quiet formality and becomes the place where the speed either
pays off or hurts you.

Two consequences follow, and they pull in opposite directions:

- **Automating the deploy matters more.** If a human has to hand-deploy every AI-generated change,
  the manual last mile becomes the bottleneck that eats all the speed AI just gave you. CD is what
  lets the throughput actually reach users.
- **The gate matters more.** Faster shipping of code that *looks right* (the recurring AI failure
  mode from Modules 1 and 14) means a bad change reaches prod faster too — unless something catches
  it. This is the crucial point: **continuous deployment is only survivable because of the gates in
  front of it.** Review (Module 10), CI tests (Module 14), and security scanning (Module 15) are not
  bureaucracy you tolerate — they are the *entire reason* you're allowed to remove the human from the
  deploy button. Take auto-deploy without those gates and you've built a machine that ships AI
  mistakes to production at full speed.

So the AI-era posture is specific: **strengthen the early gates, then automate the late ones.** The
more you trust review + CI + scanning, the further right you can safely push automation — up to and
including no human on the prod button. The strength of the gates is the dial that decides whether
continuous *deployment* is responsible or reckless for a given repo. And when an agent itself is the
one merging (Unit 5), this stops being theoretical: the deploy gate is the last thing standing
between an autonomous contributor and your users.

---

## Hands-on lab

**Lab language:** shell, driving the container tooling from Module 16. You'll extend the `tasks-app`
into a tiny running service, then build a deploy script that ships it locally with a health check and
automatic rollback — the whole CD motion, simulated on your own machine.

This lab simulates deployment with a **local container run** so it works on any machine with no cloud
account. The five deploy steps are real; only the *target* is your laptop instead of a server.

**You'll need:**

- A container runtime from Module 16 — Docker or Podman. (Commands below use `docker`; if you run
  Podman, `alias docker=podman` or substitute.)
- The `tasks-app` from Modules 1–2, now a Git repo.
- `curl` (for the health check) and a bash-capable shell. On Windows, use WSL or Git Bash.
- Your AI assistant — by now, ideally editor-integrated (Module 4).

Starter files are in this module's `lab/` folder:

- `serve.py` — turns the `tasks-app` into a minimal HTTP service with a `/health` endpoint, using
  only the Python standard library (no dependencies). This is the long-running thing CD deploys.
- `Dockerfile` — the Module 16 container image, adjusted to run the service.
- `deploy.sh` — the deploy step: build, tag, run, health-check, cut over or roll back.
- `cd-starter.yml` — the CD pipeline stages, written as GitHub Actions and extending the Module 14
  CI file. GitLab/other-forge notes are in the comments.

### Part A — Make something worth deploying

A CLI that exits immediately is awkward to "deploy." Give the app a long-running face.

1. Copy `lab/serve.py` and `lab/Dockerfile` into your `tasks-app` folder next to `tasks.py` and
   `cli.py`. Read `serve.py` — it's ~40 lines wrapping the `TaskList` you already have in a stdlib
   HTTP server with two routes: `/health` and `/tasks`.

2. Run it locally first, no container, to see it work:

   ```bash
   python serve.py            # serves on http://localhost:8000
   ```

   In another terminal:

   ```bash
   curl localhost:8000/health      # {"status": "ok", "version": "dev"}
   curl localhost:8000/tasks       # your tasks as JSON
   ```

   Stop it with Ctrl-C. Commit this (`git add . && git commit -m "Add HTTP service + Dockerfile"`).

### Part B — Build and tag the artifact

3. Build the image and tag it with the current commit SHA — the immutable, traceable tag:

   ```bash
   SHA=$(git rev-parse --short HEAD)
   docker build -t tasks-app:$SHA -t tasks-app:latest .
   docker images tasks-app         # see both tags pointing at one image
   ```

   That `:$SHA` tag is the unit of deploy. Everything downstream refers to *this exact image*.

### Part C — Deploy it (with a net)

4. Copy `lab/deploy.sh` into your `tasks-app` folder and read it. It does the five steps: builds the
   image for the tag, stops any running `tasks-app` container, starts the new image with runtime
   config injected as env vars (Module 17 — note the `APP_VERSION` and the *absence* of any secret
   baked into the image), polls `/health` until green, and on failure rolls back to the previous tag
   it recorded. Make it executable and run it:

   ```bash
   chmod +x deploy.sh
   ./deploy.sh $SHA
   ```

   Watch it build, run, health-check, and report the deploy healthy. Hit it:

   ```bash
   curl localhost:8000/health      # now reports the SHA you deployed
   ```

   Run `./deploy.sh` again after another commit and notice it records the prior version as the
   rollback target. You now have continuous *delivery* in miniature: one command turns a commit into
   a running, version-tagged service.

### Part D — Break a deploy and watch it roll back

5. Now prove the net works. The service honors a `BREAK=1` env var that makes `/health` return `500`
   — a stand-in for "this build starts but is actually broken." Deploy a healthy version first so
   there's a known-good to fall back to, then force a bad one:

   ```bash
   ./deploy.sh $SHA                # healthy baseline
   BREAK=1 ./deploy.sh $SHA        # same image, but the new instance fails its health check
   ```

   The script starts the "new" version, the health check fails, and it **automatically stops the
   broken instance and brings the previous good one back up.** Confirm you're still serving:

   ```bash
   curl localhost:8000/health      # ok — the bad deploy reverted itself
   ```

   That automatic reversal — not the build, not the run — is the part that makes auto-deploy
   something you can sleep through.

### Part E — Wire it into the pipeline (read + reason)

6. Open `lab/cd-starter.yml` and compare it to the Module 14 `ci-starter.yml`. It's the **same
   pipeline with stages appended**: the lint/test gates run first (unchanged, with a comment marking
   where the Module 15 scan would also gate), and the build-publish-deploy stages run only on a push
   to `main` (a merge), which each of those jobs enforces with `if: github.ref == 'refs/heads/main'`.
   Trace the `needs:` dependency chain that makes deploy run *only after* the checks pass.

7. Find the one line that is the delivery-vs-deployment switch — the deploy-to-prod step gated behind
   a manual approval (`environment:` with a required reviewer, commented in the file). Decide, for
   the `tasks-app`, which side you'd choose and why, and ask your AI assistant to make the case for
   the *other* choice. The goal isn't a "right" answer; it's being able to articulate the risk
   posture either way.

> **A note on running the full pipeline:** actually executing `cd-starter.yml` end to end needs a
> forge with a container registry and a deploy target wired up — that's environment-specific and
> partly Module 19's territory (the runners and compute underneath). Parts A–D give you the deploy
> *logic* runnable today on your own machine; the YAML shows how it slots into the automated
> pipeline you already started in Module 14.

---

## Where it breaks

Be honest about the edges — this is where teams get burned.

- **The deploy is only as safe as the gates in front of it.** Continuous deployment with weak tests
  and no review isn't "moving fast," it's an automated mistake-shipping machine. If you haven't done
  the Module 10/14/15 work, do *delivery* (human on the button), not *deployment*. Auto-deploy is a
  reward you earn by trusting your gates, not a default you turn on.
- **Health checks lie.** A `200` from `/health` means "the process started," not "the feature
  works." A shallow health check passes while the app returns garbage to users. Make the check
  meaningful (does it reach its database? can it serve a real request?) and lean on canary/gradual
  rollout for anything important — but know that no health check replaces real tests and real
  monitoring.
- **Rollback isn't free, and some things don't roll back.** Reverting the *running image* is cheap.
  Reverting a **database migration**, a sent email, a charged credit card, or a published message is
  not — those are forward-only. The cleaner the separation between code deploys and irreversible
  state changes, the more rollback actually saves you. Don't assume "we can always roll back" covers
  data.
- **This lab simulates the target.** A local `docker run` is the deploy logic, not the deploy
  reality. Real targets add networking, DNS cutover, load balancers, zero-downtime orchestration,
  and multiple instances. The five steps hold; the operational surface around them is larger. The
  *compute* that runs all of this — and why you might run your own — is Module 19.
- **"Build once" only holds if you actually do.** The instant someone rebuilds on the prod box "just
  to be sure," you've lost the guarantee that prod runs what CI tested. Deploy the artifact CI built.
  No rebuilds downstream.
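
To make the "health checks lie" point concrete, here is a sketch of a deeper check that reports on each dependency instead of only proving the process started. It is a hypothetical helper, not code from the lab's `serve.py`: the name `health_report`, the check names, and the choice of `503` for a failed dependency are all assumptions.

```python
from typing import Callable, Dict, Tuple


def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every named dependency check and return (HTTP status, JSON-able body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as a failed dependency, not a crashed endpoint.
            results[name] = False
    ok = all(results.values())
    body = {"status": "ok" if ok else "degraded", "checks": results}
    return (200 if ok else 503), body


if __name__ == "__main__":
    # Hypothetical checks: a real service would probe its own store or database here.
    print(health_report({"task_store_readable": lambda: True}))
```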

---

## Check for understanding

**You're done when:**

- You can state the difference between continuous delivery and continuous deployment in one sentence
  — *who clicks the prod button* — and say which one `tasks-app` should use and why.
- `./deploy.sh` builds, tags by commit SHA, runs the container, and reports a healthy deploy you can
  `curl`.
- You have **watched a bad deploy roll itself back** to the previous good version, and the service
  came back up without manual intervention.
- You can point at the line in `cd-starter.yml` that turns delivery into deployment, and explain what
  gates have to be trustworthy before you'd flip it.

When a deploy is one command, a bad one reverts itself, and you can argue the delivery-vs-deployment
call for a given repo, you've closed the merged-to-running gap. Module 19 goes underneath all of
this — the runners and compute actually executing your CI/CD, and why you'd own them.

---

## Verify-before-publish

This is expansion-zone material (Module 15+); some specifics drift. Re-check at build/publish time:

- [ ] **Action/runner versions** in `cd-starter.yml` (`actions/checkout`, `actions/setup-python`,
  any build/login/push actions) — pin to current major versions and confirm they still exist.
- [ ] **Registry login + push syntax** — the standard build-and-push action names and auth flow
  change; verify against current forge docs rather than the comments here.
- [ ] **Manual-approval mechanism** — the way a forge gates a job behind human approval
  (GitHub `environment` protection rules, GitLab `when: manual`, others) shifts in naming/UI.
  Confirm the delivery-vs-deployment switch still maps to the current feature.
- [ ] **Container runtime commands** — confirm the `docker`/`podman` subcommands and flags used in
  `deploy.sh` (`build -t`, `run -d -p -e`, `rm -f`) and the `HEALTHCHECK` options in the
  `Dockerfile` (`--interval`, `--timeout`, `--retries`) match current CLI behavior.
- [ ] **Cross-references** to Modules 16, 17, and 19 still match those modules' final content.
@@ -0,0 +1,24 @@
# The Module 16 container image for the tasks-app, set to run the HTTP service from serve.py.
#
# This is *what you ship* (Module 16). Continuous delivery/deployment (this module) builds this
# once, tags it with the commit SHA, and runs that exact artifact everywhere.
#
# Note what is NOT here: no secrets, no environment-specific config. Those are injected at run time
# (Module 17), which is why the same image can run in staging and prod unchanged.

FROM python:3.12-slim

WORKDIR /app

# The app is dependency-free (stdlib only), so there is nothing to pip install. Copy the source.
COPY tasks.py cli.py serve.py ./

# Document the port the service listens on.
EXPOSE 8000

# A built-in container health check. The deploy step also checks /health from outside, but this
# lets the runtime itself know whether the container is healthy.
HEALTHCHECK --interval=5s --timeout=3s --retries=3 \
  CMD python -c "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:8000/health').status==200 else 1)"

CMD ["python", "serve.py"]
@@ -0,0 +1,87 @@
# Starter CD pipeline for the tasks-app — GitHub Actions flavor, extending the Module 14 CI file.
#
# The whole idea: CD is not a new system. It is MORE STAGES on the SAME pipeline, after the checks
# pass. The lint/test gates below are the Module 14 pipeline, unchanged. Everything from the
# `build-and-publish` job down is new in this module.
#
# Where this file goes: .github/workflows/cd.yml (or fold it into your existing ci.yml). On GitLab,
# the same shape is stages in .gitlab-ci.yml with `needs:`/`rules:`; Forgejo/Gitea use Actions-
# compatible YAML. The concept — gated stages from merge to running — is identical everywhere.
#
# VERIFY BEFORE PUBLISH: action versions, the registry login/build-push action names, and the
# manual-approval mechanism all drift. Check current forge docs at build time (see README checklist).

name: CD

on:
  push:
    branches: [main]          # only a MERGE to main triggers a deploy
  pull_request:               # PRs still run the gates, but never deploy

jobs:
  # ---- The Module 14 gates: nothing ships without passing these first. ----------------------------
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pytest ruff
      - run: ruff check .          # lint
      - run: pytest -q             # test
      # In a real pipeline a security-scan job (Module 15) would also gate here.

  # ---- Build the artifact ONCE and publish it. The unit of deploy is an immutable, SHA-tagged image.
  build-and-publish:
    needs: check                   # only runs if the gates passed
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Log in to your container registry (Module 16's images need a durable home, like a Git remote
      # is for commits). Registry/credentials are provider-specific — supply them as secrets,
      # never inline (Module 17).
      # - uses: docker/login-action@v3
      #   with:
      #     registry: ${{ vars.REGISTRY }}
      #     username: ${{ secrets.REGISTRY_USER }}
      #     password: ${{ secrets.REGISTRY_TOKEN }}

      # Build and push, tagging with the commit SHA (immutable + traceable) and :staging (moving).
      # - uses: docker/build-push-action@v6
      #   with:
      #     push: true
      #     tags: |
      #       ${{ vars.REGISTRY }}/tasks-app:${{ github.sha }}
      #       ${{ vars.REGISTRY }}/tasks-app:staging
      - run: echo "build + push tasks-app:${{ github.sha }} (wire up the registry steps above)"

  # ---- Deploy to a NON-prod environment automatically. Safe to do on every merge. ----------------
  deploy-staging:
    needs: build-and-publish
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      # The five deploy steps live in deploy.sh in this folder. On a real target this would run the
      # platform's deploy (kubectl / platform CLI / compose) against the SHA-tagged image, inject
      # runtime config + secrets (Module 17), health-check, and roll back on failure.
      - run: echo "deploy tasks-app:${{ github.sha }} to STAGING, health-check, roll back if red"

  # ---- THIS JOB IS THE DELIVERY-vs-DEPLOYMENT SWITCH. ---------------------------------------------
  #
  # As written, `environment: production` requires a human to approve before this job runs (set a
  # required reviewer on the 'production' environment in the forge). That is CONTINUOUS DELIVERY:
  # the artifact is auto-built and staged; a person clicks to ship to prod.
  #
  # Delete the `environment:` block and this becomes CONTINUOUS DEPLOYMENT: merge -> prod, no human.
  # Only remove it once you trust your review + CI + security gates (Modules 10/14/15) more than you
  # trust the click. On GitLab the equivalent switch is `when: manual` vs. automatic.
  deploy-prod:
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production        # <-- required-reviewer gate = delivery. Remove = deployment.
    steps:
      - run: echo "deploy tasks-app:${{ github.sha }} to PRODUCTION (gated on human approval)"
@@ -0,0 +1,95 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# deploy.sh — the deploy step of CD, simulated with a local container run.
|
||||
#
|
||||
# The five steps of any deploy, provider-neutral (see the module README):
|
||||
# 1. build/pull the specific image tag 4. health-check before trusting it
|
||||
# 2. inject runtime config + secrets 5. cut over if healthy, ROLL BACK if not
|
||||
# 3. start the new version
|
||||
#
|
||||
# The *target* here is your own machine instead of a server, but the logic is the real thing.
|
||||
#
|
||||
# Usage:
|
||||
# ./deploy.sh <tag> # e.g. ./deploy.sh $(git rev-parse --short HEAD)
|
||||
# BREAK=1 ./deploy.sh <tag># force the new version's health check to fail, to demo rollback
|
||||
#
|
||||
# Requires: docker (or `alias docker=podman`), curl.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
IMAGE="tasks-app"
|
||||
CONTAINER="tasks-app"
|
||||
PORT="8000"
|
||||
STATE_FILE=".deploy-state" # records the last good tag, for rollback
|
||||
TAG="${1:-$(git rev-parse --short HEAD)}"
|
||||
|
||||
say() { printf '\n=== %s\n' "$*"; }
|
||||
|
||||
# --- Step 1: build the artifact for this tag (in real CD this was already built+pushed by CI) -----
|
||||
say "Building ${IMAGE}:${TAG}"
|
||||
docker build -t "${IMAGE}:${TAG}" .
|
||||
|
||||
# Remember what is currently running so we can roll back to it.
|
||||
PREVIOUS=""
|
||||
if [ -f "${STATE_FILE}" ]; then
|
||||
PREVIOUS="$(cat "${STATE_FILE}")"
|
||||
fi
|
||||
|
||||
# --- Steps 2 + 3: start the new version with runtime config/secrets injected (Module 17) ----------
|
||||
# Note: APP_VERSION is config supplied at run time, NOT baked into the image. A real deploy would
|
||||
# also pass secrets here (e.g. --env-file, a mounted secret, or a secrets-manager lookup) — never
|
||||
# committed, never in the image.
|
||||
start_version() {
|
||||
local tag="$1"
|
||||
docker rm -f "${CONTAINER}" >/dev/null 2>&1 || true
|
||||
docker run -d --name "${CONTAINER}" \
|
||||
-p "${PORT}:8000" \
|
||||
-e "APP_VERSION=${tag}" \
|
||||
${BREAK:+-e "BREAK=${BREAK}"} \
|
||||
"${IMAGE}:${tag}" >/dev/null
|
||||
}
|
||||
|
||||
say "Starting ${IMAGE}:${TAG}"
|
||||
start_version "${TAG}"
|
||||
|
||||
# --- Step 4: health-check the new version before trusting it --------------------------------------
healthy() {
  for _ in $(seq 1 10); do
    if curl -fs "http://localhost:${PORT}/health" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  return 1
}

say "Health-checking http://localhost:${PORT}/health"
if healthy; then
  # --- Step 5a: cut over. Record this as the new known-good for the next deploy's rollback target.
  echo "${TAG}" > "${STATE_FILE}"
  say "DEPLOY OK — ${IMAGE}:${TAG} is live and healthy"
  curl -s "http://localhost:${PORT}/health"; echo
  exit 0
fi

# --- Step 5b: ROLLBACK. The new version failed its health check. ----------------------------------
say "HEALTH CHECK FAILED for ${IMAGE}:${TAG} — rolling back"
docker rm -f "${CONTAINER}" >/dev/null 2>&1 || true

if [ -z "${PREVIOUS}" ]; then
  echo "No previous known-good version to roll back to. Service is DOWN." >&2
  echo "(Deploy a healthy version first, then re-run the BREAK=1 deploy to see rollback work.)" >&2
  exit 1
fi

# Rollback is trivial because we deploy immutable tags: just run the old one again. No rebuild.
say "Restoring previous good version ${IMAGE}:${PREVIOUS}"
BREAK="" start_version "${PREVIOUS}"   # clear BREAK so the good version comes up clean
if healthy; then
  say "ROLLED BACK — ${IMAGE}:${PREVIOUS} is live and healthy. The bad deploy reverted itself."
  curl -s "http://localhost:${PORT}/health"; echo
  exit 1   # exit non-zero: the deploy you asked for did NOT ship, even though service recovered
else
  echo "Rollback FAILED — service is DOWN. Investigate ${IMAGE}:${PREVIOUS}." >&2
  exit 2
fi
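The branch structure at the end of `deploy.sh` (ship, roll back, or report an outage) can be separated from Docker and exercised on its own. The following is a minimal Python sketch of that decision logic and is not part of the lab. The names `deploy`, `start`, and `is_healthy` are hypothetical; the two callables stand in for `start_version` and the curl-based `healthy` loop. The returned codes mirror the script's exit codes 0, 1, and 2.

```python
from typing import Callable, Optional, Tuple


def deploy(
    tag: str,
    previous: Optional[str],
    start: Callable[[str], None],
    is_healthy: Callable[[], bool],
) -> Tuple[int, Optional[str]]:
    """Return (exit_code, tag_now_live), following the branches in deploy.sh.

    `start` and `is_healthy` are injected stand-ins (assumptions for this sketch)
    for "run this image tag" and "poll /health until it answers or we give up".
    """
    start(tag)
    if is_healthy():
        return 0, tag            # new version is live; caller records it as known-good

    if not previous:
        return 1, None           # nothing to fall back to: service is down

    start(previous)              # immutable tags: rerun the old image, no rebuild
    if is_healthy():
        return 1, previous       # service recovered, but the requested deploy did not ship
    return 2, None               # rollback failed too: service is down
```

Because the start and health-check steps are passed in, each outcome can be checked with plain functions and no container runtime, which is one way to test rollback behavior before trusting it in a pipeline.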
@@ -0,0 +1,67 @@
"""Minimal HTTP face for the tasks-app, so there is something long-running to *deploy*.

Standard library only — no pip install, so the container image stays tiny and the lab has no
dependencies to drift. It reuses the TaskList from tasks.py (Modules 1-2) unchanged.

Run it:
    python serve.py                 # serves on http://localhost:8000

Endpoints:
    GET /health   -> {"status": "ok", "version": <APP_VERSION>}   (200)
    GET /tasks    -> the current tasks as JSON

Two environment knobs make this realistic for the CD lab (config injected at run time, Module 17):
    APP_VERSION   what /health reports as the running version (set by deploy.sh to the commit SHA)
    BREAK=1       force /health to return 500 — a stand-in for "this build starts but is broken",
                  used in Part D to trigger an automatic rollback.
"""

import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

from tasks import Task, TaskList

STATE = Path(__file__).parent / "tasks.json"
PORT = int(os.environ.get("PORT", "8000"))
APP_VERSION = os.environ.get("APP_VERSION", "dev")
BREAK = os.environ.get("BREAK") == "1"


def load() -> TaskList:
    if not STATE.exists():
        return TaskList()
    raw = json.loads(STATE.read_text())
    return TaskList(tasks=[Task(**t) for t in raw])


class Handler(BaseHTTPRequestHandler):
    def _send(self, code: int, payload: dict) -> None:
        body = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        if self.path == "/health":
            # A real health check would also confirm dependencies (db, etc.) are reachable.
            if BREAK:
                self._send(500, {"status": "unhealthy", "version": APP_VERSION})
            else:
                self._send(200, {"status": "ok", "version": APP_VERSION})
        elif self.path == "/tasks":
            tlist = load()
            self._send(200, {"tasks": [t.__dict__ for t in tlist.tasks]})
        else:
            self._send(404, {"error": "not found"})

    def log_message(self, *args) -> None:  # keep the lab output clean
        pass


if __name__ == "__main__":
    print(f"serving tasks-app version={APP_VERSION} on http://localhost:{PORT}")
    ThreadingHTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
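The `/health` contract above (200 when fine, 500 when `BREAK=1`) is what the retry loop in `deploy.sh` depends on. The following is a rough, self-contained Python sketch of that interaction and is not part of the lab. `make_handler`, `wait_healthy`, and `probe` are hypothetical names; the handler is a stripped-down stand-in for `serve.py` (it does not import `tasks.py`), and the server binds to an ephemeral port rather than 8000.

```python
import json
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(broken: bool):
    """Build a handler whose /health mimics serve.py with BREAK set or unset."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            code = 500 if broken else 200
            body = json.dumps({"status": "unhealthy" if broken else "ok"}).encode()
            self.send_response(code)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args) -> None:  # keep output quiet
            pass

    return HealthHandler


def wait_healthy(url: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll url until it returns 2xx; give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # 4xx/5xx and connection errors both count as "not healthy yet"
        time.sleep(delay)
    return False


def probe(broken: bool) -> bool:
    """Start a throwaway server on a free port and health-check it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), make_handler(broken))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return wait_healthy(f"http://127.0.0.1:{port}/health", attempts=3, delay=0.05)
    finally:
        server.shutdown()
        server.server_close()
```

Like `curl -f`, `urllib.request.urlopen` treats a 500 response as a failure (it raises `HTTPError`), so a process that starts but reports itself unhealthy is handled the same way as one that never comes up.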