2026-06-22 19:15:32 -04:00

The Full Loop: One Feature, End to End — and the End of the Copy-Paste Problem

We started this whole thing with a confession: the AI was never your problem. It writes good code. The problem was everything around the code: the copy, the paste, the hand-merge, the "wait, what did I change?", the no-undo, the cold-start every morning. That loop. I named it in the very first post and asked you to feel it deliberately, until it itched.

This is the post where we close it.

Not with another tool. We're out of new tools. The capstone doesn't teach you anything new. It takes the twenty-seven things you already learned, separately, in their own little modules, and runs them as one continuous motion. That's the whole payoff, and it's a payoff you can't get from any single lesson, because the point isn't any single lesson. The point is that they connect.

If you've been following the series here on the blog, this is the part where the pile of tips stops being a pile.

One feature. Three surfaces. Every gate.

Here's the trick that makes a capstone honest: pick something small enough to finish in one sitting but real enough to touch the whole stack. We're adding due dates to the running tasks-app:

  • A task can carry an optional due date: python cli.py add "file taxes" --due 2026-09-15.
  • A new overdue command lists pending tasks whose due date has already passed.
  • The deployed service grows a matching GET /overdue endpoint, so the change is visible in the running container — not just the CLI.

That's deliberately three surfaces — the core (tasks.py), the CLI (cli.py), and the deployable service (serve.py). One feature, three files. Which, if you remember the very first seam we ever named, is exactly the kind of change that used to mean three copy-paste sessions and a prayer. We're going to do it once, as a single fluent pass, and not paste anything anywhere.
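
For the CLI surface, here is a minimal sketch of how the --due flag and the overdue command could be wired with argparse. The course's real cli.py isn't reproduced in this post, so build_parser and the argument names are assumptions for illustration only.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `add` and `overdue` subcommands (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    sub = parser.add_subparsers(dest="command", required=True)

    add = sub.add_parser("add", help="add a task")
    add.add_argument("title")
    # --due is optional. date.fromisoformat raises ValueError on a malformed
    # date, which argparse turns into a normal usage error.
    add.add_argument("--due", type=date.fromisoformat, default=None)

    sub.add_parser("overdue", help="list pending tasks past their due date")
    return parser
```

Parsing the date at the edge means the core only ever sees a date object or None, never a raw string, so a typo like 2026-9-99 fails at the command line instead of deep inside tasks.py.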

And it has a trap baked in, which we'll get to.

The loop, as one breath

Read this once as a map before you touch the keyboard. Every arrow is a module you already climbed — I'll name them, because watching the dependency chain collapse into a single pass is the entire experience.

Prompt → issue. Don't start in your editor. Start with the work written down. File an issue — "Add optional due dates, an overdue command, and a /overdue endpoint" — with acceptance criteria in the body. The issue is the contract everything else closes against.

Issue → branch. Never work on main. git switch -c 47-due-dates. The branch is a sandbox you can throw away wholesale — which is the only reason turning an AI loose on three files at once is a calm decision instead of a gamble.

Branch → AI implementation, with the config already in place. Now the AI edits the files directly, in your editor or CLI. No browser. No paste. And here's the quiet hero of the whole loop: it already knows your conventions — stdlib only, core logic in tasks.py, run the tests before claiming done — because the committed instructions file has been sitting in the repo since the first commit. You don't re-explain a thing. That's the file we committed back in the Module 5 post earning its keep, silently, on a day you forgot it was even there.

Implementation → tests. The feature isn't done when it runs; it's done when it's pinned. Have the AI extend test_tasks.py — but write the boundary cases yourself, or demand them by name, because the boundary is exactly where the AI guesses: due yesterday (overdue), due tomorrow (not), due today (not — yet), no due date at all (never overdue, never crashes).
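
Those four cases fit in a short unittest class. This is a sketch, not the course's test_tasks.py: is_overdue is a hypothetical stand-in defined inline so the file runs on its own, and the fixed TODAY is an assumption that keeps the tests off the system clock.

```python
import unittest
from datetime import date, timedelta
from typing import Optional


def is_overdue(due: Optional[date], today: date) -> bool:
    """Hypothetical stand-in for the core check that would live in tasks.py."""
    return due is not None and due < today


class OverdueBoundaryTests(unittest.TestCase):
    # A fixed "today" so the result never depends on when the tests run.
    TODAY = date(2026, 9, 15)

    def test_due_yesterday_is_overdue(self):
        self.assertTrue(is_overdue(self.TODAY - timedelta(days=1), self.TODAY))

    def test_due_tomorrow_is_not_overdue(self):
        self.assertFalse(is_overdue(self.TODAY + timedelta(days=1), self.TODAY))

    def test_due_today_is_not_overdue_yet(self):
        self.assertFalse(is_overdue(self.TODAY, self.TODAY))

    def test_no_due_date_is_never_overdue(self):
        # Must return False, not raise a TypeError on the comparison.
        self.assertFalse(is_overdue(None, self.TODAY))
```

Saved as a test module, this runs with python -m unittest. Passing today in as an argument, rather than calling date.today() inside the function, is what makes the "due today" case testable at all.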

Tests → PR → CI → security scan. Push the branch, open a PR, put Closes #47 in the description. Opening it triggers the pipeline on your runner: lint, build, tests, then the security gate (dependency audit, secret scan, SAST). CI is the tireless reviewer that catches the code that looks right but isn't; the scan catches the failure classes a build check never would.

Review. Green CI is necessary, not sufficient. Read the diff like a stranger wrote it — and go straight for the trap. Open overdue(). Did it use < or <=? Does a task due today show up as overdue? Does a task with no due date crash the comparison, or get silently treated as overdue? This is the single least-automatable skill in the whole course, and the capstone is where you prove you've got it. (An AI gets one of these wrong more often than you'd like. That's not a knock on the AI — it's the reason the gate exists.)
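
To make the trap concrete, here is one way the reviewed function might look next to the kind of draft the review is meant to catch. The Task shape (a dict with title, done, and due) and both function names are assumptions; the real tasks.py may store tasks differently.

```python
from datetime import date
from typing import List, Optional, TypedDict


class Task(TypedDict):
    title: str
    done: bool
    due: Optional[date]


def overdue(tasks: List[Task], today: date) -> List[Task]:
    """Pending tasks whose due date is strictly before today."""
    return [
        t for t in tasks
        if not t["done"]             # completed tasks are never overdue
        and t["due"] is not None     # no due date: skip before comparing
        and t["due"] < today         # strict: due today is not overdue yet
    ]


def overdue_first_draft(tasks: List[Task], today: date) -> List[Task]:
    """A plausible draft carrying both review traps.

    `<=` reports tasks due today, and a task with due=None raises TypeError
    because None cannot be compared with a date.
    """
    return [t for t in tasks if not t["done"] and t["due"] <= today]
```

The two versions differ by one character and one guard, which is why the diff has to be read rather than skimmed. A test suite that only checks "due yesterday" passes both.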

Merge → containerized deploy. Squash-merge. Issue #47 closes itself. The merge to main triggers delivery: CI builds the image from your Dockerfile, tags it with the new commit SHA (immutable, not latest), runs deploy.sh to start the container with env injected, polls /health, and — if health fails — rolls itself back to the previous SHA. Then you curl localhost:8000/overdue and watch your overdue task come back from the running container.
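
The deploy step's control flow is small enough to sketch. This post doesn't show deploy.sh, so what follows is a Python rendering of the logic described above, not the script itself: start and healthy are hypothetical callables standing in for "run the container at this SHA" and "did /health answer OK", and the retry counts are made-up defaults.

```python
import time
from typing import Callable


def deploy_with_rollback(
    new_sha: str,
    previous_sha: str,
    start: Callable[[str], None],
    healthy: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 1.0,
) -> str:
    """Start new_sha, poll health, fall back to previous_sha on failure.

    Returns the SHA that is left running.
    """
    start(new_sha)
    for _ in range(attempts):
        if healthy():
            return new_sha
        time.sleep(delay_seconds)
    # Health never came up: put the previous image back.
    start(previous_sha)
    return previous_sha
```

Rollback only works here because both tags are immutable SHAs. If the image were tagged latest, "the previous version" would no longer name anything you could start.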

The feature is live. In a reproducible artifact. Behind a health check that can undo itself.

[Screenshot: a green CI pipeline on the PR, with lint, tests, and the security scan all passing]

What actually carried it

Stop and notice what just happened, because it's easy to miss when it goes smoothly: not one step of that loop depended on which model wrote the code.

The model wrote the diff. The workflow is everything that made the diff safe to merge and trivial to undo — the branch, the tests, the gate, the review, the immutable tag, the rollback. Swap the model next quarter and every arrow above is unchanged. That's the line this whole series hangs on, and now you've done it rather than read it: the model is the cheap, swappable part. The workflow around it is the skill that lasts.

That's also the answer to the copy-paste problem, all the way down. Seam one — more than one file? The AI touched three and you never hand-merged a thing. Seam two — more than one day? The issue and the committed config carry the context, so there's no cold-start to reconstruct. Seam three — no undo, no record, no safety? Every change is a commit, every commit is reviewed, every deploy can roll back, and you literally rehearsed the revert before you needed it. The loop that used to be a high-wire act with no net is now a pipeline with nets at every seam.

The stretch variant — watch it start running itself

Here's where it gets genuinely fun. Everything above had you in the driver's seat. Now run the identical feature the Unit 5 way, with agents inside the pipeline, and watch how much of the loop keeps running when you step back.

  • An issue-to-PR agent does the first pass. Assign issue #47 to an autonomous agent instead of opening your editor. It reads the issue, cuts the branch, implements across all three files, writes tests, and opens the PR — landing as a reviewable PR behind CI, exactly like a human contributor's. It's allowed to propose, never to merge.
  • An assistive reviewer comments first. Before you even look, an AI reviewer reads the diff against your rubric and posts comments — flagging, ideally, the very overdue() boundary you'd have hunted by hand. It comments; it does not approve. A human still decides. (Sometimes it catches the off-by-one. Sometimes it misses it — which is its own lesson about not trusting the assistant blindly.)
  • Evals tell you whether to trust any of it. Turn the boundary cases into an eval set, score the agent's implementation, then do the thing the whole course was building toward: swap the model and re-run the same eval. If the new model regresses on "due today," the eval catches it before the PR ever merges.
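
Here is a sketch of what "turn the boundary cases into an eval set" could mean in its smallest form. Everything in it is illustrative: CASES, score, and the two candidate functions stand in for real model output, which you would load from each agent's branch rather than write inline.

```python
from datetime import date, timedelta
from typing import Callable, List, Optional, Tuple

TODAY = date(2026, 9, 15)

# (case name, due date, expected "is overdue" answer)
CASES = [
    ("due yesterday", TODAY - timedelta(days=1), True),
    ("due tomorrow", TODAY + timedelta(days=1), False),
    ("due today", TODAY, False),
    ("no due date", None, False),
]

Candidate = Callable[[Optional[date], date], bool]


def score(candidate: Candidate) -> Tuple[float, List[str]]:
    """Return (fraction of cases passed, names of the failed cases)."""
    failures = []
    for name, due, expected in CASES:
        try:
            ok = candidate(due, TODAY) == expected
        except Exception:
            ok = False  # a crash counts as a failed case
        if not ok:
            failures.append(name)
    return (len(CASES) - len(failures)) / len(CASES), failures


def candidate_a(due: Optional[date], today: date) -> bool:
    """Stand-in for one model's implementation."""
    return due is not None and due < today


def candidate_b(due: Optional[date], today: date) -> bool:
    """Stand-in for a second model that regresses on 'due today'."""
    return due is not None and due <= today
```

Scoring candidate_a passes every case, while candidate_b fails exactly one and the failure list names it: "due today". The eval set stays fixed and only the candidate changes, which is what makes the comparison across models meaningful.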

When this runs, look at what's left for you: filing a crisp issue, reading a diff the assistant already annotated, reading an eval score. The agent drafted. The gates held. The eval judged. The workflow didn't just make AI safe to use — it started running itself, with you supervising instead of typing.

And it only works because every catch-net from the earlier units was already in place. Take them away and "let an agent open a PR" is reckless. With them, it's just another contributor.

Where it breaks (one last honest section)

I'm not going to drop the honesty in the finale.

  • A finale is not a shortcut. The loop is fluent because you climbed the modules. Run the capstone without the foundation — no protected main, no CI, no tests — and it isn't "the full loop," it's the copy-paste problem with extra steps. All the value is in the gates; skip them and you've kept the ceremony and thrown away the safety.
  • Green CI is not correctness. Every gate is a filter, not a guarantee. CI proves the tests pass; it can't prove the tests test the right thing. That overdue() boundary sails through a weak test suite happily. The human review step is load-bearing and stays load-bearing — automation raises the floor, it doesn't remove the ceiling.
  • The stretch variant moves the work; it doesn't delete it. An issue-to-PR agent raises the importance of a well-written issue, because a vague issue now produces a vague PR with no human in the authoring loop to course-correct. You trade typing for specifying and judging. Better trade. Not a free one.

That's the course

We started seventeen posts ago with a loop that broke at three seams, and a promise that the fix was never a smarter model — it was the scaffolding around it. You've now built that scaffolding, one piece at a time, and in this last lab you watched the pieces stop being pieces. One feature went from a sentence you typed to a container serving traffic, and you can point at every step and name the module it came from.

The model wrote the code. You built the workflow that made the code matter — and that's the part that's still yours when the next model ships, and the one after that.

So here's my actual ask, and it's the last one. If you've only been reading along here on the blog: go take The Workflow. It's free, it's self-paced, every module ends at a concrete "you're done when," and the capstone above is waiting for you at the end of it. And when you've shipped your own version of this loop — your own feature, your own three surfaces, your own green pipeline — come back and tell me what you built. Drop it in the comments. I read every one of them, and watching people close their own copy-paste loop is genuinely the whole reason I made this.

Go build something. Then ship it the right way.