Build out all 27 modules + capstone (#1)

Co-authored-by: claude <claude@jpaul.io>
Co-committed-by: claude <claude@jpaul.io>
2026-06-22 12:19:01 -04:00
parent 4bd586bbd0
commit 2684095e2f
117 changed files with 15131 additions and 1 deletions
@@ -0,0 +1,319 @@
+# Capstone — The Full Loop
+
+> **One feature, taken end to end, with every module doing its job in sequence.** This is the finale:
+> not new material, but proof that the twenty-seven pieces you learned separately are actually one
+> motion. By the end you'll have shipped a real change to `tasks-app` — prompt to running container —
+> and felt the thing the whole course was for: the model did the typing, but the *workflow* is what
+> made it safe and repeatable.
+
+---
+
+## This is a finale, not a module
+
+There's nothing to learn here that the modules didn't already teach. The capstone exists to **wire it
+together**. Every step below names the module it comes from, so you can see the dependency chain you
+climbed now collapse into a single fluent pass. If a step feels unfamiliar, that's a pointer back to
+the module to re-read — not new content to absorb.
+
+You'll do it twice:
+
+1. **The main loop** — you driving, the AI assisting. The full pipeline, by hand, once.
+2. **The stretch variant (optional)** — the *same* feature run the Unit 5 way, with agents inside the
+   pipeline, so you watch the workflow start to run itself.
+
+---
+
+## Prerequisites
+
+All of it. Concretely, you need the `tasks-app` repo in the state the course left it:
+
+- A Git repo (Module 2) with a committed AI instructions file at the root (Module 5), a remote on
+  your forge (Module 8), and a protected `main` that requires a PR to merge (Module 11).
+- `test_tasks.py` and a green test suite (Module 13).
+- A CI workflow that lints and tests on every push and PR (Module 14), with a security-scan step
+  wired in (Module 15), running on a runner you understand (Module 19).
+- A `Dockerfile` and `.dockerignore` (Module 16), `serve.py` exposing `/health` and `/tasks`
+  (Module 18), `.env`/`.env.example` for config (Module 17), and a `deploy.sh` that tags by commit
+  SHA, injects env, health-checks, and rolls back (Module 18).
+
+If any of those is missing, build it from its module first. The capstone assumes the machine is
+already standing; it doesn't re-pour the foundation.
+
+---
+
+## The feature we're shipping
+
+Pick something small enough to finish in one sitting and real enough to touch the whole stack. We'll
+add **due dates**:
+
+- A task can carry an optional due date: `python cli.py add "file taxes" --due 2026-07-15`.
+- A new `overdue` command lists pending tasks whose due date has already passed.
+- The deployed service grows a matching `GET /overdue` endpoint, so the change is visible in the
+  running container, not just the CLI.
+
+This deliberately spans the core (`tasks.py`), the CLI (`cli.py`), and the deployable service
+(`serve.py`) — one feature, three surfaces, exactly the kind of change that used to mean three
+copy-paste sessions and a prayer (Module 1). And it has a built-in trap for the review step: "is a
+task due *today* overdue?" is the kind of off-by-one an AI will answer confidently and wrongly.
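
The `--due` flag and the `overdue` command described above could be wired with the standard library's `argparse`. This is a minimal sketch under stated assumptions, not the course's actual `cli.py`: the names `parse_due` and `build_parser` are hypothetical, and the sketch only parses arguments without calling into `tasks.py`.

```python
import argparse
from datetime import date, datetime


def parse_due(text: str) -> date:
    """Turn a YYYY-MM-DD string into a date (hypothetical helper).

    Raising ValueError lets argparse reject bad input with a usage error.
    """
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `add` and `overdue` subcommands (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    commands = parser.add_subparsers(dest="command", required=True)

    add = commands.add_parser("add", help="add a task")
    add.add_argument("title")
    # --due is optional: when omitted, args.due stays None.
    add.add_argument("--due", type=parse_due, default=None, metavar="YYYY-MM-DD")

    commands.add_parser("overdue", help="list pending tasks past their due date")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["add", "file taxes", "--due", "2026-07-15"])
    print(args.command, args.title, args.due)  # add file taxes 2026-07-15
```

Parsing the date at the CLI boundary means the core only ever sees a real `date` or `None`, which keeps the later comparison free of string handling.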
+
+---
+
+## The loop, step by step
+
+Read this once as a map before you touch the keyboard. Each arrow is a module.
+
+**Prompt → issue (M9).** Don't start in your editor. Start with the work written down. File an issue:
+*"Add optional due dates to tasks, an `overdue` command, and a `/overdue` endpoint."* Acceptance
+criteria in the body. Label it. The issue is the contract the rest of the loop closes against.
+
+**Issue → branch (M6/M11).** Never work on `main`. Branch named after the issue:
+`git switch -c 47-due-dates`. The branch is a sandbox you can throw away wholesale (M6) — which is the
+only reason letting the AI loose on three files at once is a calm decision instead of a gamble.
+
+**Branch → AI implementation (M4), config already in place (M5).** Now the AI edits the files
+directly in your editor or CLI — no browser, no paste. It already knows your conventions because the
+committed instructions file has been in the repo since the first commit (M5): core logic in
+`tasks.py`, CLI wiring in `cli.py`, standard library only, run the tests before claiming done. You
+didn't re-explain any of that. That's the file earning its keep.
+
+**Implementation → tests (M13).** The feature isn't done when it runs; it's done when it's *pinned*.
+Have the AI extend `test_tasks.py` with cases for the new logic — and write the boundary cases
+yourself or demand them by name, because the boundary is exactly where the AI guesses: due yesterday
+(overdue), due tomorrow (not), **due today (not — yet)**, no due date at all (never overdue, never
+crashes).
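
The four boundary cases named above can be pinned one test each. The sketch below is illustrative only: the `overdue(tasks, today)` signature and the dict-shaped task (`title`, `done`, `due` as an ISO string or `None`) are assumptions, and the stand-in function exists just so the block runs on its own. In the real repo the tests would import from `tasks.py`.

```python
import unittest
from datetime import date, timedelta


def overdue(tasks, today):
    """Stand-in for the real function: pending tasks due strictly before today."""
    return [
        t for t in tasks
        if not t["done"]
        and t["due"] is not None
        and date.fromisoformat(t["due"]) < today
    ]


class OverdueBoundaryTests(unittest.TestCase):
    TODAY = date(2026, 7, 15)  # fixed "today" so results never depend on the clock

    def task(self, due, done=False):
        # Hypothetical task shape; adjust to whatever tasks.py really stores.
        return {"title": "t", "done": done, "due": due.isoformat() if due else None}

    def test_due_yesterday_is_overdue(self):
        t = self.task(self.TODAY - timedelta(days=1))
        self.assertEqual(overdue([t], self.TODAY), [t])

    def test_due_today_is_not_overdue(self):
        t = self.task(self.TODAY)
        self.assertEqual(overdue([t], self.TODAY), [])

    def test_due_tomorrow_is_not_overdue(self):
        t = self.task(self.TODAY + timedelta(days=1))
        self.assertEqual(overdue([t], self.TODAY), [])

    def test_no_due_date_is_never_overdue(self):
        t = self.task(None)
        self.assertEqual(overdue([t], self.TODAY), [])

    def test_completed_task_is_not_listed(self):
        t = self.task(self.TODAY - timedelta(days=1), done=True)
        self.assertEqual(overdue([t], self.TODAY), [])
```

Passing `today` in as a parameter, rather than reading the clock inside the function, is the design choice that makes the "due today" case testable at all. Run it with `python -m unittest`.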
+
+**Secrets stay clean (M17).** This feature needs no new secret — it reads the system clock. The
+discipline is that nothing got hardcoded *anyway*: the service still reads its config from the
+environment via `.env`, and `.env.example` documents any new keys. The win here is a non-event, which
+is the point — the failure mode (M17: AI hardcodes a value) simply didn't happen, because the pattern
+was already there.
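
As a rough illustration of that pattern, a service might collect its settings in one place and read them only from the environment. The key names `TASKS_HOST`, `TASKS_PORT`, and `TASKS_FILE` are hypothetical, and this sketch assumes something else (for example `deploy.sh`) has already injected the values. It does not parse `.env` itself.

```python
import os


def load_config(env=None):
    """Read service settings from the environment, with local-friendly defaults.

    TASKS_HOST, TASKS_PORT, and TASKS_FILE are hypothetical key names.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("TASKS_HOST", "127.0.0.1"),
        "port": int(env.get("TASKS_PORT", "8000")),
        "tasks_file": env.get("TASKS_FILE", "tasks.json"),
    }
```

Accepting an optional mapping makes the function testable without mutating the real environment. A new feature that needs a setting adds one key here and one line to `.env.example`.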
+
+**Tests → PR (M10/M11).** Push the branch, open a PR, and put `Closes #47` in the description so the
+merge closes the issue automatically (M11). The PR is the review gate even though it's your own code —
+*especially* because an AI wrote most of it.
+
+**PR → CI → security scan (M14/M15/M19).** Opening the PR triggers the pipeline on your runner (M19):
+lint, build, tests (M14), then the security gate (M15) — dependency audit, secret scan, SAST. The
+feature added no dependencies, so SCA should be quiet; the secret scan confirms you didn't smuggle a
+key into a fixture. CI is the tireless reviewer that catches the code that *looks* right (M14); the
+security scan catches the failure classes a build check never would (M15).
+
+**Review (M10).** Green CI is necessary, not sufficient. Read the diff like you didn't write it
+(M10). Go straight for the plausibility trap: open `overdue()` and check the comparison. Did it use
+`<` or `<=`? Does a task due today show up as overdue? Does a task with no due date crash the
+comparison or get silently treated as overdue? This is the single least-automatable skill in the
+course, and the capstone is where you prove you have it.
+
+**Merge (M11).** Once CI is green and the diff is honest, squash-merge. Issue #47 closes itself. `main`
+is now ahead by one clean, tested, scanned commit.
+
+**Merge → containerized deploy (M16/M18).** The merge to `main` triggers delivery (M18): CI builds the
+image from your `Dockerfile` (M16), tags it with the new commit SHA (immutable, not `latest`), runs
+`deploy.sh` to start the container with env injected (M17), polls `/health`, and — if health fails —
+rolls back to the previous SHA. Hit `GET /overdue` on the running container. The feature is live, in a
+reproducible artifact, behind a health check that can undo itself.
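
The poll-then-roll-back decision can be sketched in a few lines. `deploy.sh` is a shell script whose contents this section does not show, so what follows is only the decision logic expressed in Python, with hypothetical names (`http_ok`, `wait_healthy`, `choose_image`) and arbitrary retry counts.

```python
import time
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if `url` answers HTTP 200, False on any error or other status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: refused, timed out, 4xx/5xx.
        return False


def wait_healthy(probe, attempts=10, delay=1.0, sleep=time.sleep):
    """Call `probe` up to `attempts` times; True as soon as it succeeds."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False


def choose_image(new_sha, previous_sha, probe, **poll):
    """Pick the image tag that should stay running after the deploy attempt."""
    return new_sha if wait_healthy(probe, **poll) else previous_sha
```

A caller might use `choose_image(new, old, lambda: http_ok("http://localhost:8000/health"))`. Because both tags are immutable commit SHAs, "roll back" reduces to starting the previous tag again.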
+
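+**If it goes wrong (M12).** Something slips past every gate eventually. A bad merge reverts cleanly
+with `git revert <sha>`: a new commit, safe on shared history, no rewriting what teammates pulled
+(M12). Because the squash-merge above lands on `main` as an ordinary single-parent commit, no `-m`
+flag is needed; `git revert -m 1 <merge-sha>` applies only if your forge created a true merge commit,
+and it errors on a squashed one. A bad deploy is already handled by `deploy.sh`'s rollback to the
+last good SHA. Recovery is a discipline you rehearsed, not a panic.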
+
+That's the whole motion. Notice what carried it: not the model. **The model wrote the diff; the
+workflow is everything that made the diff safe to merge and trivial to undo.** Swap the model next
+quarter and every arrow above is unchanged. That's the Module 1 thesis — *the model is the cheap,
+swappable part; the workflow is the durable skill* — now demonstrated rather than asserted.
+
+---
+
+## Hands-on lab
+
+**Lab language:** shell + Python, on the `tasks-app` repo. You'll use your editor-integrated or CLI
+agent (M4) for the implementation; everything else is your normal toolchain.
+
+**You'll need:** the `tasks-app` repo in the prerequisite state above, your agentic tool, your forge
+account, and a working Docker install.
+
+### Part A — Issue and branch (M9, M6, M11)
+
+1. File the issue on your forge. Title: *"Task due dates + `overdue` command + `/overdue` endpoint."*
+   In the body, write the acceptance criteria as you'd hand them to a contributor you don't trust to
+   guess:
+
+   - `add` takes an optional `--due YYYY-MM-DD`.
+   - `overdue` lists pending tasks with a due date strictly before today.
+   - A task due **today** is **not** overdue. A task with **no** due date is **never** overdue.
+   - `serve.py` exposes `GET /overdue` returning the same set as the CLI.
+
+2. Branch off `main`, named for the issue:
+
+   ```bash
+   cd ~/workflow-course/tasks-app
+   git switch main && git pull
+   git switch -c 47-due-dates        # use your real issue number
+   ```
+
+### Part B — Implement with the AI (M4, M5)
+
+3. In your editor/CLI agent, give it the issue, not a vague wish:
+
+   > *"Implement issue #47. Add an optional due date to tasks (core in `tasks.py`), wire `--due` into
+   > the `add` command and a new `overdue` command in `cli.py`, and add a `GET /overdue` endpoint to
+   > `serve.py`. Follow the acceptance criteria exactly. Run the tests before you tell me it's done."*
+
+   You should *not* have to specify "stdlib only" or "don't touch `tasks.json`" — that's in the
+   committed instructions file (M5). If the agent reaches for a date library or hand-edits the JSON,
+   your file needs a line; that's signal, not failure.
+
+4. Run it by hand to confirm it's real:
+
+   ```bash
+   python cli.py add "file taxes" --due 2026-07-15
+   python cli.py add "renew domain" --due 2020-01-01
+   python cli.py overdue        # should list "renew domain" only (if 2026-07-15 has passed, use a later --due above)
+   ```
+
+### Part C — Tests (M13)
+
+5. Have the AI extend `test_tasks.py`, then **read the test names** and confirm the boundaries are
+   actually covered. If "due today" and "no due date" aren't each their own test, add them — by hand
+   or by demanding them. Run the suite:
+
+   ```bash
+   pytest        # or: python -m unittest
+   ```
+
+   Commit only when it's green:
+
+   ```bash
+   git add -A && git commit -m "Add task due dates, overdue command, and /overdue endpoint"
+   ```
+
+### Part D — PR, CI, security, review (M10, M11, M14, M15, M19)
+
+6. Push and open the PR with the closing keyword:
+
+   ```bash
+   git push -u origin 47-due-dates
+   # open the PR on your forge; put "Closes #47" in the description
+   ```
+
+7. Watch the pipeline run on your runner (M19): lint + tests (M14), then the security scan (M15).
+   Don't proceed until it's green.
+
+8. **Review the diff as if a stranger wrote it** (M10). Open `overdue()` and answer, from the code:
+
+   - Is the comparison strict (`<` today) or inclusive (`<=`)? A task due today must **not** appear.
+   - What happens for a task whose `due` is `None`? It must be skipped: no crash, and never counted.
+
+   If either is wrong — and an AI gets at least one of these wrong more often than you'd like — request
+   the fix on the branch, let CI re-run, and review again. Catching this *here*, before merge, is the
+   entire point of the gate.
+
+### Part E — Merge and deploy (M11, M16, M18, M17)
+
+9. With CI green and the diff honest, squash-merge. Issue #47 closes itself.
+
+10. Let delivery run, or run it locally if that's your setup (M18):
+
+    ```bash
+    ./deploy.sh           # builds image tagged by commit SHA, injects env, health-checks, can roll back
+    curl localhost:8000/overdue
+    ```
+
+    You should see your overdue task served from the running container — the feature live in a
+    reproducible artifact (M16), configured from the environment (M17), behind a self-rolling-back
+    health check (M18).
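
One way to keep `GET /overdue` returning the same set as the CLI is to have the handler delegate to a single callable that both surfaces share. The sketch below assumes `serve.py` is built on the standard library's `http.server`. The names `route` and `make_handler` are hypothetical, and `list_overdue` stands in for whatever `tasks.py` actually exposes.

```python
import json
from http.server import BaseHTTPRequestHandler


def route(path, list_overdue):
    """Map a request path to (status, JSON body) without touching the network."""
    if path == "/overdue":
        return 200, json.dumps(list_overdue()).encode("utf-8")
    return 404, json.dumps({"error": "not found"}).encode("utf-8")


def make_handler(list_overdue):
    """Build a handler class bound to one source of overdue tasks."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = route(self.path, list_overdue)
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler
```

Keeping `route` free of sockets means the endpoint's behavior can be covered by the same fast test suite as the core, with the HTTP layer reduced to a thin wrapper.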
+
+### Part F — Rehearse recovery (M12)
+
+11. Prove you can undo it. Find the commit the merge put on `main` and revert it on a throwaway
+    branch, just to watch it work, then delete the branch:
+
+    ```bash
+    git switch main && git pull       # fetch the merged commit first
+    git switch -c throwaway-revert-test
+    git revert <sha>                  # clean undo as a new commit; add -m 1 only for a true merge commit
+    pytest && git switch main && git branch -D throwaway-revert-test
+    ```
+
+    You just confirmed the escape hatch is real *before* you ever need it in anger.
+
+---
+
+## Stretch variant — run the same feature the Unit 5 way (optional)
+
+Everything above had you in the driver's seat. Now run the **identical** feature with agents *inside*
+the pipeline and watch how much of the loop keeps running when you step back. Do this only after the
+main loop succeeded — you can't supervise a pipeline you haven't run by hand.
+
+The feature, the branch flow, the gates, and the deploy are unchanged. What changes is *who does each
+step*:
+
+1. **Issue-to-PR agent does the first pass (M25).** Assign the issue to an autonomous agent instead of
+   opening your editor. It reads issue #47, creates the branch, implements across `tasks.py`,
+   `cli.py`, and `serve.py`, writes tests, and opens the PR — all landing as a reviewable PR behind
+   CI, exactly like a human contributor's. It is allowed to *propose*, never to merge. The supervision
+   is structural: the same CI (M14) and security (M15) gates stand whether the author is a human or an
+   agent.
+
+2. **An assistive reviewer comments first (M24).** Before you look, an AI reviewer reads the diff
+   against your committed rubric and posts comments on the PR — flagging, ideally, the very `overdue()`
+   boundary you hunted by hand. It comments; it does not approve and does not merge (M24). A human
+   still decides. You read its comments, then read the diff yourself, and notice the reviewer caught
+   the off-by-one — or notice it *missed* it, which is its own lesson about not trusting the assistant
+   blindly.
+
+3. **Evals tell you whether to trust any of it (M27).** Turn the boundary cases from Part C into an
+   eval set — due yesterday, due today, due tomorrow, no due date — and score the agent's
+   implementation against it. Now do the thing the whole course was building to: **swap the model**
+   behind the agent and re-run the *same* eval. If the new model's `overdue()` regresses on the
+   "due today" case, the eval catches it before the PR ever merges. That's the close of the thesis —
+   evals are how you judge a model swap, so the swap you *will* make stays safe (M27).
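
A very small version of that eval could look like the sketch below. It is illustrative only: the three candidate functions are hand-written stand-ins for what different models might produce, and the `is_overdue(due, today)` signature and the `score` helper are assumptions. A real eval would run the agent and import the code it generated.

```python
from datetime import date, timedelta

TODAY = date(2026, 7, 15)  # fixed reference date so scores are reproducible

# (case name, due date, expected "is overdue?") taken from the boundary list.
CASES = [
    ("due yesterday", TODAY - timedelta(days=1), True),
    ("due today", TODAY, False),
    ("due tomorrow", TODAY + timedelta(days=1), False),
    ("no due date", None, False),
]


def score(is_overdue):
    """Return (passed, total, failed case names) for one candidate."""
    failures = []
    for name, due, expected in CASES:
        try:
            got = is_overdue(due, TODAY)
        except Exception as exc:  # a crash counts as a failed case
            got = f"raised {type(exc).__name__}"
        if got != expected:
            failures.append(name)
    return len(CASES) - len(failures), len(CASES), failures


# Hypothetical candidates standing in for three models' output.
def strict(due, today):
    return due is not None and due < today


def inclusive(due, today):
    return due is not None and due <= today


def crashes_on_none(due, today):
    return due < today


if __name__ == "__main__":
    for candidate in (strict, inclusive, crashes_on_none):
        print(candidate.__name__, score(candidate))
```

Here `inclusive` fails only "due today" and `crashes_on_none` fails only "no due date", which is the kind of per-case signal that lets a model swap be judged on the same cases every time.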
+
+When this runs, look at what's left for you: filing a crisp issue, reading a diff the assistant
+already annotated, and reading an eval score. The agent drafted; the gates held; the eval judged. The
+workflow didn't just make AI safe to use — it started running itself, with you supervising instead of
+typing. That only works because every catch-net from Units 2–3 was already in place. Take those away
+and "let an agent open a PR" is reckless; with them, it's just another contributor (M11).
+
+---
+
+## Where it breaks
+
+- **A finale is not a shortcut.** The loop is fluent *because* you climbed the modules. Running the
+  capstone without the foundation — no protected `main`, no CI, no tests — isn't "the full loop," it's
+  the copy-paste problem with extra steps. The pipeline's value is entirely in the gates; skip them
+  and you've kept the ceremony and thrown away the safety.
+- **Green CI is not correctness.** Every gate in this loop is a filter, not a guarantee. CI proves the
+  tests pass; it can't prove the tests test the right thing. The `overdue()` boundary trap passes a
+  weak test suite happily. The human review step (M10) is load-bearing and stays load-bearing — the
+  automation raises the floor, it doesn't remove the ceiling.
+- **The stretch variant moves the work, it doesn't delete it.** An issue-to-PR agent doesn't reduce
+  the importance of a well-written issue — it *raises* it, because a vague issue now produces a vague
+  PR with no human in the authoring loop to course-correct. You trade typing for specifying and
+  judging. That's a better trade, not a free one.
+- **Evals are only as honest as their cases.** An eval set that omits the "due today" boundary will
+  bless a broken model swap. The eval doesn't know what you forgot to test (M27). It scales your
+  judgment; it doesn't supply it.
+
+---
+
+## Check for understanding
+
+**You're done when:**
+
+- You shipped the due-dates feature from a filed issue to a running container, and
+  `curl .../overdue` returns the right tasks from the deployed artifact.
+- Issue #47 closed itself on merge, `main` is one clean commit ahead, and you caught (or consciously
+  verified) the `overdue()` boundary in review rather than in production.
+- You can point at each step and name the module it came from without looking — and explain why the
+  *order* is the dependency chain, not an arbitrary checklist.
+- You can state, from what you just did rather than from the syllabus, why the model is the swappable
+  part: every step would survive replacing the model, and the stretch variant's eval is exactly how
+  you'd prove a swap was safe.
+
+If you ran the stretch variant, add one more: you watched an agent author the PR and an assistant
+review it, and you can say precisely which catch-nets from earlier units made handing that work to an
+agent a calm decision instead of a leap.
+
+That's the course. The model wrote the code. **You built the workflow that made the code matter** —
+and that's the part that's still yours when the next model ships.