Reframe sweep M7-27 + capstone (AI drives git, lesson=theory, de-slop) (#93)
Sync course wiki / sync-wiki (push) Successful in 11s

Co-authored-by: claude <claude@jpaul.io>
Co-committed-by: claude <claude@jpaul.io>
This commit was merged in pull request #93.
2026-06-22 21:58:36 -04:00
committed by Claude (agent)
parent a29823f4b3
commit 513d7e7ac8
38 changed files with 1735 additions and 1424 deletions
@@ -56,7 +56,7 @@ something that matters.** You're not asked to build it. You're asked to change o
without breaking the other thousand things you've never read.
This is where AI is simultaneously most tempting and most dangerous. Tempting, because "just ask the
AI to figure it out" feels like exactly the leverage you need against 200,000 lines you don't know.
AI to figure it out" feels like exactly the help you need against 200,000 lines you don't know.
Dangerous, because the AI's two default failure modes get *worse* the bigger and less familiar the
codebase is:
@@ -64,7 +64,7 @@ codebase is:
model whether or not the real auth lives there. It confidently describes structure it inferred
from names, not from reading. In a small repo you'd catch it. In a huge one you won't.
- **It rewrites instead of edits.** Ask for a small change and it hands you a "cleaned-up" version of
the whole file reformatted, renamed, restructured burying your one-line fix in a 300-line diff
the whole file (reformatted, renamed, restructured), burying your one-line fix in a 300-line diff
nobody can review. In code you wrote, that's annoying. In code you didn't, it's how an invisible
regression ships.
@@ -90,7 +90,7 @@ table — and crucially, a list of **open questions the code didn't answer.** A
trustworthy. A map with no gaps is fiction. This phase is **read-only**; nothing changes on disk.
**3. Change — the smallest scoped, tested, reviewable diff.** Only now do you edit. One change, one
branch (Module 6). Find the blast radius first every caller of what you're touching and if you
branch (Module 6). Find the blast radius first: every caller of what you're touching. If you
can't enumerate them, you're not ready. Make the minimal edit, add a test that fails without it,
run the *full* existing suite, and self-review the diff like it's someone else's PR (Module 10). No
drive-by reformatting. No "while I was in here." The diff a reviewer sees should be exactly the
@@ -99,7 +99,7 @@ change and nothing else.
### Context is the bottleneck, not intelligence
A frontier model is plenty smart enough to understand any one file in your repo. What it *can't* do
is hold all 200,000 lines in its head at once — the context window is finite, and stuffing it full of
is hold all 200,000 lines in its head at once. The context window is finite, and stuffing it full of
irrelevant code makes the model worse, not better. So the skill here isn't "give the AI more." It's
**give the AI the right slice, and a way to fetch more on demand.**
@@ -116,7 +116,7 @@ of access that turn a guessing model into a grounded one:
- **The filesystem and code search** — so it can grep for every caller of a function instead of
assuming it found them all.
- **Language-server intelligence** go-to-definition, find-references, type info so "where is this
- **Language-server intelligence** (go-to-definition, find-references, type info) so "where is this
used?" is answered by the toolchain, not by the model's guess.
- **The surrounding systems** — the issue tracker (Module 9), CI results (Module 14), the running
app's logs — so the AI maps the code *and* the context it lives in.
@@ -146,16 +146,16 @@ in unfamiliar code," they encode *exactly* what careful means, as steps the AI f
Onboard a human to a legacy codebase and the advice is familiar: read the README, ask a senior dev.
What's specific here is that **the AI is both the thing reading the codebase and the thing most
likely to confidently misread it** — and the bigger the repo, the wider that gap between "sounds
likely to confidently misread it.** The bigger the repo, the wider that gap between "sounds
authoritative" and "is correct."
So the AI-specific discipline is verification, not exploration. The model is genuinely excellent at
the grunt work of orientation reading a hundred files, summarizing structure, tracing a call path
which is exactly the work that's tedious and slow for a human. But it will narrate a wrong map with
the grunt work of orientation: reading a hundred files, summarizing structure, tracing a call path.
That's exactly the work that's tedious and slow for a human. But it will narrate a wrong map with
the same fluent confidence as a right one. Your job shifts from "explore the code" (let the AI do
that) to "make the AI prove its map against real files, and keep its changes small enough that a
wrong map can't do much damage." The whole earlier toolchain version control, branches, review,
tests, recovery is what turns "the AI might be wrong about this huge system" from a catastrophe
wrong map can't do much damage." The whole earlier toolchain (version control, branches, review,
tests, recovery) is what turns "the AI might be wrong about this huge system" from a catastrophe
into a revertable diff.
---
@@ -167,7 +167,8 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
**You'll need:**
- Git, Python 3.10+, and your agentic AI tool from Module 4.
- Git, Python 3.10+, and the agentic AI tool from Module 4. The lab uses Claude Code as the worked
example (confirm it's installed with `claude --version`, or substitute your own agent's
equivalent); the steps survive a tool swap.
- A real, small-to-medium open-source repo to clone. Pick something with **tests** and a clear
build/test command, in a language you can at least read. Good traits: a few thousand lines, an
obvious entry point, a documented install (`pip install -e .`, `npm install`, `go mod download`,
@@ -208,38 +209,44 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
### Part C — One small, scoped, tested change
6. Pick a genuinely small change a clearer error message, a fixed edge case, a tiny missing
validation, a documented-but-unhandled input. Something a single function owns. First **install
the project's dependencies** the way its README says — typically `pip install -e .` (Python),
`npm install` (JS/TS), `go mod download` (Go), or the equivalent — *then* run the existing tests
to establish a green baseline (`python -m unittest`, `pytest`, `npm test`, `go test ./...` —
whatever `ORIENT.md` and the README confirmed). A fresh clone usually won't run green until its
deps are installed; if it still won't go green on a clean clone *after* a documented install,
that's a setup problem, not your baseline — pick another repo rather than change code on top of an
environment you can't trust.
6. Pick a genuinely small change: a clearer error message, a fixed edge case, a tiny missing
validation, a documented-but-unhandled input. Something a single function owns. Now load the
`safe-change` skill (`lab/skills/safe-change.md`) and let Claude Code (sub your own agent) do the
setup the skill assigns it. Tell it to install the project's dependencies the way the README says
(typically `pip install -e .` for Python, `npm install` for JS/TS, `go mod download` for Go) and
run the existing tests to establish a green baseline. **Your job is to verify the result**, not to
type the commands. Confirm the suite is actually green, and apply the judgment the skill leaves to
you: a fresh clone usually won't run green until its deps are installed, but if it still won't go
green on a clean clone *after* a documented install, that's a setup problem rather than your
baseline. Pick another repo before you change code on top of an environment you can't trust.
7. Branch, then load the `safe-change` skill (`lab/skills/safe-change.md`) and work the change with
the AI:
7. Direct the AI through the change with the `safe-change` skill loaded. Its first action is to
create the branch (Step 1 of the skill), so you don't type `git switch` yourself; **verify** it
did by running:
```bash
git switch -c scoped-change
git status # confirm you're on e.g. scoped-change, not the default branch
```
Make it find the blast radius (every caller) before editing. Keep the edit minimal. Add a test
that fails without the change and passes with it. Run the **full** suite.
Then direct the rest: make it find the blast radius (every caller) before editing, keep the edit
minimal, and add a test that fails without the change and passes with it. Have it run the **full**
suite and confirm green.
8. **Review the diff like it's a stranger's PR (Module 10):**
8. **Review the diff like it's a stranger's PR (Module 10).** This part you do by hand; reviewing
what the AI wrote is the skill that doesn't transfer to the AI:
```bash
git diff
```
Every changed line should be necessary and explainable. If the AI snuck in a reformat or a
rename, revert it — that's the sprawl this whole module exists to prevent. Commit only when the
diff is exactly the change and nothing more.
rename, tell it to revert that and keep only the scoped change. Once the diff is exactly the
change and nothing more, instruct the AI to commit it, then verify the result with
`git show` so the commit holds only what you approved.
9. Write the PR description the `safe-change` skill asks for: what changed, why, the blast radius,
how you tested it, and what you deliberately did *not* touch.
9. Have the AI draft the PR description the `safe-change` skill asks for (what changed, why, the
blast radius, how it was tested, and what it deliberately did *not* touch), then edit it into your
own words before it goes up.
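The "diff is exactly the change and nothing more" check in step 8 can be partly mechanized. The sketch below is one possible way to flag sprawl from the text that `git diff --numstat` prints (one `added<TAB>deleted<TAB>path` line per file). The function names and the thresholds (`max_files`, `max_lines`) are hypothetical choices for illustration, not part of the `safe-change` skill.

```python
def summarize_numstat(numstat: str) -> dict[str, tuple[int, int]]:
    """Parse `git diff --numstat` text into {path: (added, deleted)}."""
    summary = {}
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        # Binary files report "-" for both counts; treat them as zero lines.
        summary[path] = (
            0 if added == "-" else int(added),
            0 if deleted == "-" else int(deleted),
        )
    return summary


def sprawl_warnings(summary, max_files=2, max_lines=40):
    """Return human-readable warnings when a diff looks bigger than 'scoped'.

    The limits are illustrative assumptions; tune them per repo.
    """
    warnings = []
    if len(summary) > max_files:
        warnings.append(f"{len(summary)} files changed (limit {max_files})")
    total = sum(added + deleted for added, deleted in summary.values())
    if total > max_lines:
        warnings.append(f"{total} lines changed (limit {max_lines})")
    return warnings


# A scoped change: one source file plus its test.
SCOPED = "3\t1\tsrc/app/errors.py\n12\t0\ttests/test_errors.py\n"
# The same change with a drive-by reformat of an unrelated file.
SPRAWLED = SCOPED + "140\t138\tsrc/app/utils.py\n"

print(sprawl_warnings(summarize_numstat(SCOPED)))
print(sprawl_warnings(summarize_numstat(SPRAWLED)))
```

To run it against a real working tree, feed it the output of `subprocess.run(["git", "diff", "--numstat"], capture_output=True, text=True, check=True).stdout`. A clean result only says the diff is small; reading every changed line by hand is still the review.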
---
@@ -247,7 +254,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
- **A confident map is still just a hypothesis.** The AI will produce a fluent, plausible
architecture summary for a repo it half-read. Fluency is not correctness. The citation-checking in
Part B isn't optional ceremony it's the only thing standing between you and changing code based on
Part B isn't optional ceremony; it's the only thing standing between you and changing code based on
a fiction. Verify at least a few claims by hand, every time.
- **The context window is a hard ceiling.** On a truly large monorepo, the AI cannot see everything,
and it usually won't *tell* you what it didn't read. Its map is only as good as the slice it
@@ -256,7 +263,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
a claim to distrust.
- **"Small change" can hide a big blast radius.** A one-line edit to a heavily-called function can
ripple through code you never opened. The blast-radius search in the `safe-change` skill is the
defense, but it's only as good as the AI's ability to find *every* caller dynamic dispatch,
defense, but it's only as good as the AI's ability to find *every* caller: dynamic dispatch,
reflection, config-driven wiring, and string-based lookups all defeat naive search. When in doubt,
the tests are your backstop, which is why a repo *without* tests is genuinely dangerous to change
this way.
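To make the "naive search misses callers" point concrete, here is a minimal sketch for Python code only. It walks the syntax tree with the standard `ast` module and reports calls that name a function directly. The sample module, the function name `charge`, and the helper `find_direct_callers` are all invented for illustration.

```python
import ast


def find_direct_callers(source: str, target: str) -> list[int]:
    """Return line numbers of calls that name `target` directly.

    Matches `target(...)` and `obj.target(...)`. It cannot see calls made
    through string lookups, getattr, or config-driven wiring.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = None  # e.g. handlers["charge"](...) or getattr(...)(...)
        if name == target:
            lines.append(node.lineno)
    return sorted(lines)


SAMPLE = '''\
def charge(order):
    return order

def checkout(order):
    return charge(order)

def retry(order, handlers):
    return handlers["charge"](order)

def dispatch(module, order):
    return getattr(module, "charge")(order)
'''

print(find_direct_callers(SAMPLE, "charge"))
```

The sample has three places that can end up calling `charge`, but the search reports only line 5. The string-keyed lookup in `retry` and the `getattr` call in `dispatch` are invisible to it, which is why the full test suite stays in the loop as the backstop.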
@@ -287,7 +294,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
one-off heroics session.
If your change is a clean, tested, reviewable one-liner in a system you couldn't have described an
hour ago and you trust it you've got the motion.
hour ago, and you trust it, you've got the motion.
---
@@ -1,7 +1,7 @@
# Skill: Map this repo
A navigation playbook (a Module 21 skill) for orienting in a codebase you didn't write.
Point your agentic tool at this file as a skill, or paste it in as instructions. The goal is a
Point Claude Code (or sub your own agent) at this file as a skill, or paste it in as instructions. The goal is a
**read-only** mental model — no edits happen here.
## When to use
@@ -11,7 +11,7 @@ At the start of any session on an unfamiliar repo, before any change is discusse
- **Read only.** Do not edit, create, or delete files while mapping. No exceptions.
- **Cite real paths.** Every claim about the code must point to a file and, ideally, a line range.
If you can't cite it, say "unverified" instead of guessing.
- **Breadth before depth.** Establish the whole shape before diving into any one area.
- **Breadth before depth.** Establish the whole shape before going deep on any one area.
- **No conclusions from file names alone.** A file called `auth.py` may not be where auth lives.
## Steps