Reframe sweep M7-27 + capstone (AI drives git, lesson=theory, de-slop) (#93)
Sync course wiki / sync-wiki (push) Successful in 11s
Co-authored-by: claude <claude@jpaul.io>
Co-committed-by: claude <claude@jpaul.io>
This commit was merged in pull request #93.
@@ -56,7 +56,7 @@ something that matters.** You're not asked to build it. You're asked to change o
without breaking the other thousand things you've never read.
This is where AI is simultaneously most tempting and most dangerous. Tempting, because "just ask the
AI to figure it out" feels like exactly the leverage you need against 200,000 lines you don't know.
AI to figure it out" feels like exactly the help you need against 200,000 lines you don't know.
Dangerous, because the AI's two default failure modes get *worse* the bigger and less familiar the
codebase is:
@@ -64,7 +64,7 @@ codebase is:
model whether or not the real auth lives there. It confidently describes structure it inferred
from names, not from reading. In a small repo you'd catch it. In a huge one you won't.
- **It rewrites instead of edits.** Ask for a small change and it hands you a "cleaned-up" version of
the whole file — reformatted, renamed, restructured — burying your one-line fix in a 300-line diff
the whole file (reformatted, renamed, restructured) burying your one-line fix in a 300-line diff
nobody can review. In code you wrote, that's annoying. In code you didn't, it's how an invisible
regression ships.
@@ -90,7 +90,7 @@ table — and crucially, a list of **open questions the code didn't answer.** A
trustworthy. A map with no gaps is fiction. This phase is **read-only**; nothing changes on disk.
**3. Change — the smallest scoped, tested, reviewable diff.** Only now do you edit. One change, one
branch (Module 6). Find the blast radius first — every caller of what you're touching — and if you
branch (Module 6). Find the blast radius first, every caller of what you're touching, and if you
can't enumerate them, you're not ready. Make the minimal edit, add a test that fails without it,
run the *full* existing suite, and self-review the diff like it's someone else's PR (Module 10). No
drive-by reformatting. No "while I was in here." The diff a reviewer sees should be exactly the
@@ -99,7 +99,7 @@ change and nothing else.
### Context is the bottleneck, not intelligence
A frontier model is plenty smart enough to understand any one file in your repo. What it *can't* do
is hold all 200,000 lines in its head at once — the context window is finite, and stuffing it full of
is hold all 200,000 lines in its head at once. The context window is finite, and stuffing it full of
irrelevant code makes the model worse, not better. So the skill here isn't "give the AI more." It's
**give the AI the right slice, and a way to fetch more on demand.**
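
One way to picture "the right slice" is a budgeted selection: rank files by how often they mention the symbol you care about, then take them in order until the budget runs out. The sketch below is illustrative only. `pick_slice` is a hypothetical helper, the in-memory `files` mapping stands in for a real repo, and character count is a crude stand-in for tokens.

```python
def pick_slice(files: dict[str, str], symbol: str, budget_chars: int) -> list[str]:
    """Return paths that mention `symbol`, most mentions first, within a size budget.

    `files` maps a path to its text. Character count is only a rough
    approximation of how much context-window space a file would use.
    """
    # Keep only files that mention the symbol at all.
    relevant = [(text.count(symbol), path) for path, text in files.items()]
    relevant = [(score, path) for score, path in relevant if score > 0]
    # Most mentions first; ties broken by path so the order is stable.
    relevant.sort(key=lambda item: (-item[0], item[1]))

    chosen: list[str] = []
    used = 0
    for _, path in relevant:
        size = len(files[path])
        if used + size > budget_chars:
            continue  # would overflow the budget; a smaller file may still fit
        chosen.append(path)
        used += size
    return chosen
```

Plain text matching will miss files that matter for other reasons, so treat the result as a starting slice. Fetching more on demand is still the agent's job.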
@@ -116,7 +116,7 @@ of access that turn a guessing model into a grounded one:
- **The filesystem and code search** — so it can grep for every caller of a function instead of
assuming it found them all.
- **Language-server intelligence** — go-to-definition, find-references, type info — so "where is this
- **Language-server intelligence** (go-to-definition, find-references, type info) so "where is this
used?" is answered by the toolchain, not by the model's guess.
- **The surrounding systems** — the issue tracker (Module 9), CI results (Module 14), the running
app's logs — so the AI maps the code *and* the context it lives in.
@@ -146,16 +146,16 @@ in unfamiliar code," they encode *exactly* what careful means, as steps the AI f
Onboard a human to a legacy codebase and the advice is familiar: read the README, ask a senior dev.
What's specific here is that **the AI is both the thing reading the codebase and the thing most
likely to confidently misread it** — and the bigger the repo, the wider that gap between "sounds
likely to confidently misread it.** The bigger the repo, the wider that gap between "sounds
authoritative" and "is correct."
So the AI-specific discipline is verification, not exploration. The model is genuinely excellent at
the grunt work of orientation — reading a hundred files, summarizing structure, tracing a call path —
which is exactly the work that's tedious and slow for a human. But it will narrate a wrong map with
the grunt work of orientation: reading a hundred files, summarizing structure, tracing a call path.
That's exactly the work that's tedious and slow for a human. But it will narrate a wrong map with
the same fluent confidence as a right one. Your job shifts from "explore the code" (let the AI do
that) to "make the AI prove its map against real files, and keep its changes small enough that a
wrong map can't do much damage." The whole earlier toolchain — version control, branches, review,
tests, recovery — is what turns "the AI might be wrong about this huge system" from a catastrophe
wrong map can't do much damage." The whole earlier toolchain (version control, branches, review,
tests, recovery) is what turns "the AI might be wrong about this huge system" from a catastrophe
into a revertable diff.
---
@@ -167,7 +167,8 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
**You'll need:**
- Git, Python 3.10+, and your agentic AI tool from Module 4.
- Git, Python 3.10+, and the agentic AI tool from Module 4. The lab uses Claude Code as the worked
example (`claude --version # sub your own agent`); the steps survive a tool swap.
- A real, small-to-medium open-source repo to clone. Pick something with **tests** and a clear
build/test command, in a language you can at least read. Good traits: a few thousand lines, an
obvious entry point, a documented install (`pip install -e .`, `npm install`, `go mod download`,
@@ -208,38 +209,44 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
### Part C — One small, scoped, tested change
6. Pick a genuinely small change — a clearer error message, a fixed edge case, a tiny missing
validation, a documented-but-unhandled input. Something a single function owns. First **install
the project's dependencies** the way its README says — typically `pip install -e .` (Python),
`npm install` (JS/TS), `go mod download` (Go), or the equivalent — *then* run the existing tests
to establish a green baseline (`python -m unittest`, `pytest`, `npm test`, `go test ./...` —
whatever `ORIENT.md` and the README confirmed). A fresh clone usually won't run green until its
deps are installed; if it still won't go green on a clean clone *after* a documented install,
that's a setup problem, not your baseline — pick another repo rather than change code on top of an
environment you can't trust.
6. Pick a genuinely small change: a clearer error message, a fixed edge case, a tiny missing
validation, a documented-but-unhandled input. Something a single function owns. Now load the
`safe-change` skill (`lab/skills/safe-change.md`) and let Claude Code (sub your own agent) do the
setup the skill assigns it. Tell it to install the project's dependencies the way the README says
(typically `pip install -e .` for Python, `npm install` for JS/TS, `go mod download` for Go) and
run the existing tests to establish a green baseline. **Your job is to verify the result**, not to
type the commands. Confirm the suite is actually green, and apply the judgment the skill leaves to
you: a fresh clone usually won't run green until its deps are installed, but if it still won't go
green on a clean clone *after* a documented install, that's a setup problem rather than your
baseline. Pick another repo before you change code on top of an environment you can't trust.
7. Branch, then load the `safe-change` skill (`lab/skills/safe-change.md`) and work the change with
the AI:
7. Direct the AI through the change with the `safe-change` skill loaded. Its first action is to
create the branch (Step 1 of the skill), so you don't type `git switch` yourself; **verify** it
did by running:
```bash
git switch -c scoped-change
git status # confirm you're on e.g. scoped-change, not the default branch
```
Make it find the blast radius (every caller) before editing. Keep the edit minimal. Add a test
that fails without the change and passes with it. Run the **full** suite.
Then direct the rest: make it find the blast radius (every caller) before editing, keep the edit
minimal, and add a test that fails without the change and passes with it. Have it run the **full**
suite and confirm green.
8. **Review the diff like it's a stranger's PR (Module 10):**
8. **Review the diff like it's a stranger's PR (Module 10).** This part you do by hand; reviewing
what the AI wrote is the skill that doesn't transfer to the AI:
```bash
git diff
```
Every changed line should be necessary and explainable. If the AI snuck in a reformat or a
rename, revert it — that's the sprawl this whole module exists to prevent. Commit only when the
diff is exactly the change and nothing more.
rename, tell it to revert that and keep only the scoped change. Once the diff is exactly the
change and nothing more, instruct the AI to commit it, then verify the result with
`git show` so the commit holds only what you approved.
9. Write the PR description the `safe-change` skill asks for: what changed, why, the blast radius,
how you tested it, and what you deliberately did *not* touch.
9. Have the AI draft the PR description the `safe-change` skill asks for (what changed, why, the
blast radius, how it was tested, and what it deliberately did *not* touch), then edit it into your
own words before it goes up.
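
The review in step 8 stays a human job, but a quick mechanical pre-check can tell you where to look first. The sketch below parses the text that `git diff --numstat` prints (added count, deleted count, and path, separated by tabs) and flags files you didn't expect plus an oversized total. `flag_sprawl`, the expected-path set, and the 20-line default are illustrative assumptions, not part of the `safe-change` skill. Renamed paths, which `--numstat` prints in an `old => new` form, are not handled specially and would show up as unexpected.

```python
def flag_sprawl(numstat: str, expected_paths: set[str], max_changed: int = 20) -> list[str]:
    """Return warnings for a diff that touches more than the scoped change should.

    `numstat` is the raw text output of `git diff --numstat`.
    """
    warnings: list[str] = []
    total = 0
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        if path not in expected_paths:
            warnings.append(f"unexpected file: {path}")
        if added == "-" or deleted == "-":
            # git prints "-" for both counts when the file is binary.
            warnings.append(f"binary change: {path}")
            continue
        total += int(added) + int(deleted)
    if total > max_changed:
        warnings.append(f"{total} changed lines exceeds limit of {max_changed}")
    return warnings
```

An empty list means only that the diff is small and confined to the expected files. Whether each changed line is necessary and explainable is still yours to judge.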
---
@@ -247,7 +254,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
- **A confident map is still just a hypothesis.** The AI will produce a fluent, plausible
architecture summary for a repo it half-read. Fluency is not correctness. The citation-checking in
Part B isn't optional ceremony — it's the only thing standing between you and changing code based on
Part B isn't optional ceremony; it's the only thing standing between you and changing code based on
a fiction. Verify at least a few claims by hand, every time.
- **The context window is a hard ceiling.** On a truly large monorepo, the AI cannot see everything,
and it usually won't *tell* you what it didn't read. Its map is only as good as the slice it
@@ -256,7 +263,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
a claim to distrust.
- **"Small change" can hide a big blast radius.** A one-line edit to a heavily-called function can
ripple through code you never opened. The blast-radius search in the `safe-change` skill is the
defense, but it's only as good as the AI's ability to find *every* caller — dynamic dispatch,
defense, but it's only as good as the AI's ability to find *every* caller: dynamic dispatch,
reflection, config-driven wiring, and string-based lookups all defeat naive search. When in doubt,
the tests are your backstop, which is why a repo *without* tests is genuinely dangerous to change
this way.
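
To see why a naive caller search falls short, here is a minimal sketch using Python's standard `ast` module. It assumes the target repo is Python and matches call sites by name only. `find_call_sites` and the sample source are hypothetical, written for illustration.

```python
import ast


def find_call_sites(source: str, func_name: str) -> list[int]:
    """Return sorted line numbers where `func_name` is called by name.

    Matches plain calls (`send(...)`) and attribute calls (`obj.send(...)`).
    This is name matching, not type resolution: unrelated methods that share
    the name are included, and dynamic lookups are missed entirely.
    """
    lines: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = node.func
        if isinstance(callee, ast.Name) and callee.id == func_name:
            lines.append(node.lineno)
        elif isinstance(callee, ast.Attribute) and callee.attr == func_name:
            lines.append(node.lineno)
    return sorted(lines)


SAMPLE = '''\
def send(msg):
    return msg

send("a")
client.send("b")
getattr(client, "send")("c")
'''

print(find_call_sites(SAMPLE, "send"))  # prints [4, 5]
```

The `getattr` call on line 6 of the sample really does invoke `send`, yet the search never reports it. That is the string-based lookup gap in miniature, and it is why the full test suite remains the backstop.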
@@ -287,7 +294,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
one-off heroics session.
If your change is a clean, tested, reviewable one-liner in a system you couldn't have described an
hour ago — and you trust it — you've got the motion.
hour ago, and you trust it, you've got the motion.
---
@@ -1,7 +1,7 @@
# Skill: Map this repo
A navigation playbook (a Module 21 skill) for orienting in a codebase you didn't write.
Point your agentic tool at this file as a skill, or paste it in as instructions. The goal is a
Point Claude Code (or sub your own agent) at this file as a skill, or paste it in as instructions. The goal is a
**read-only** mental model — no edits happen here.
## When to use
@@ -11,7 +11,7 @@ At the start of any session on an unfamiliar repo, before any change is discusse
- **Read only.** Do not edit, create, or delete files while mapping. No exceptions.
- **Cite real paths.** Every claim about the code must point to a file and, ideally, a line range.
If you can't cite it, say "unverified" instead of guessing.
- **Breadth before depth.** Establish the whole shape before diving into any one area.
- **Breadth before depth.** Establish the whole shape before going deep on any one area.
- **No conclusions from file names alone.** A file called `auth.py` may not be where auth lives.
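
The "cite real paths" rule is cheap to spot-check mechanically. The sketch below assumes citations are written as `path:start-end`. That format is an assumption for illustration (the skill does not prescribe one), and `check_citation` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed citation shape: "relative/path.py:12-40". Not a format the skill mandates.
CITATION = re.compile(r"^(?P<path>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def check_citation(citation: str, repo_root: Path) -> str:
    """Classify a citation as 'exists', 'missing file',
    'line range out of bounds', or 'unparseable'."""
    match = CITATION.match(citation.strip())
    if match is None:
        return "unparseable"
    target = repo_root / match["path"]
    if not target.is_file():
        return "missing file"
    start, end = int(match["start"]), int(match["end"])
    text = target.read_text(encoding="utf-8", errors="replace")
    line_count = len(text.splitlines())
    if not (1 <= start <= end <= line_count):
        return "line range out of bounds"
    return "exists"
```

A result of `exists` shows only that the cited lines are there. Whether they say what the map claims still takes a person opening the file and reading them.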
## Steps