fix(M7-27+capstone): apply AI-drives-git reframe, lesson=theory, de-slop course-wide
Phase 2 sweep — all modules are post-pivot, so the learner directs the AI agent
(Claude Code as the worked example) to do the git/setup work and verifies, instead
of typing commands by hand; no re-teaching basics. Lesson sections are theory with
example output; all execution lives in the labs. De-slopped ("prose" etc. gone
course-wide, em-dash density thinned). /path/to placeholders -> ~/ai-workflow-course.
Every deliberate teaching device verified intact: M10 ai-change.patch trap,
M12 bad-clear-snippet, M13/M27 planted pending_count bug, M15 secret+typosquat+MD5,
M18 BREAK=1, M21 absent-.gitignore, M22 poisoned skill, M24 no-op patch, M25 --simulate.
Labs compile/parse (py/sh/yaml/json); no junk.
Closes #83
Closes #86
Closes #89
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
This commit is contained in:
@@ -1,8 +1,8 @@
|
||||
# Module 10 — Reviewing Code You Didn't Write
|
||||
|
||||
> **The AI wrote a diff that reads beautifully and is wrong in one line you'll skim right past.**
|
||||
> Reviewing for *plausibility traps* — not just bugs — is the highest-leverage, least-taught skill
|
||||
> in this whole space. This module gives you a gate to run it at and a checklist to run.
|
||||
> Reviewing for *plausibility traps*, not just bugs, is a skill almost nobody teaches. This module
|
||||
> gives you a gate to run it at and a checklist to run.
|
||||
|
||||
---
|
||||
|
||||
@@ -11,13 +11,13 @@
|
||||
- **Module 2 — Version Control as a Safety Net.** You read changes with `git diff`. This module
|
||||
turns that one-off habit into a disciplined review pass over a whole change.
|
||||
- **Module 8 — Remotes and Hosting.** Your repo lives on a host now, and a change arrives as a
|
||||
*pull request* (GitHub/Gitea/Forgejo) or *merge request* (GitLab) — same thing, different name.
|
||||
*pull request* (GitHub/Gitea/Forgejo) or *merge request* (GitLab): same thing, different name.
|
||||
We'll write "PR" throughout; it's the unit of review.
|
||||
- **Module 9 — Issues and the Task Layer** (helpful, not required). A PR usually answers an issue;
|
||||
the issue is the "what I asked for" you review the diff against.
|
||||
|
||||
If you only have Modules 1–2, you can still do the core skill of this module locally — reviewing a
|
||||
diff between two branches with `git diff` — and skip the part where you open it as a PR on a host.
|
||||
If you only have Modules 1–2, you can still do the core skill of this module locally (reviewing a
|
||||
diff between two branches with `git diff`) and skip the part where you open it as a PR on a host.
|
||||
|
||||
---
|
||||
|
||||
@@ -26,11 +26,11 @@ diff between two branches with `git diff` — and skip the part where you open i
|
||||
By the end of this module you can:
|
||||
|
||||
1. Use a pull request as a **review gate**: nothing reaches the main branch without passing through
|
||||
a diff someone (or something) signed off on — even on a solo repo.
|
||||
a diff someone (or something) signed off on, even on a solo repo.
|
||||
2. Read an AI-generated diff the right way: against the request, deletions first, the diff over the
|
||||
AI's own description of it.
|
||||
3. Name and spot the four **plausibility traps** — invented APIs, silent scope creep, deleted
|
||||
edge-case handling, and convincing-but-wrong logic — that pass a human skim and a quick run.
|
||||
3. Name and spot the four **plausibility traps** (invented APIs, silent scope creep, deleted
|
||||
edge-case handling, convincing-but-wrong logic) that pass a human skim and a quick run.
|
||||
4. Run a repeatable **AI-diff review checklist** and end every review with an explicit
|
||||
*approve* / *request changes* decision you can defend.
|
||||
|
||||
@@ -42,7 +42,7 @@ By the end of this module you can:
|
||||
|
||||
A pull request proposes merging a branch into another (usually `main`) and pauses there so the
|
||||
change can be looked at *before* it lands. On a team that pause is where review happens. The trap
|
||||
is treating it as a rubber stamp — "looks good, merge" — which is exactly how bad changes get the
|
||||
is treating it as a rubber stamp ("looks good, merge"), which is exactly how bad changes get the
|
||||
institutional blessing of "it was reviewed."
|
||||
|
||||
Reframe it the way you already think about change control: **a PR is a change gate, and merge is a
|
||||
@@ -51,7 +51,7 @@ The cheapest place to catch a problem is in the diff, before the door closes. Yo
|
||||
(that's Module 12), but recovery is always more expensive than the review you skipped.
|
||||
|
||||
This holds **even when you're the only human on the repo.** That's not bureaucracy for its own
|
||||
sake — the syllabus's own course repo opens a PR for every module for exactly two reasons that
|
||||
sake. The syllabus's own course repo opens a PR for every module for exactly two reasons that
|
||||
apply to you solo:
|
||||
|
||||
- **Traceability.** The PR is a durable record of *what changed and why*, linked to the issue it
|
||||
@@ -65,23 +65,23 @@ When the author is an AI, both reasons get sharper. The AI produced the change w
|
||||
confidence and no memory of why; the PR is where a human supplies the judgment and the record the
|
||||
AI can't.
|
||||
|
||||
### Why this is a genuinely new skill
|
||||
### Why this is a new skill
|
||||
|
||||
You already know how to review human code. Reviewing AI code is *not the same activity*, and
|
||||
assuming it is gets people burned.
|
||||
|
||||
When a human writes a function, the bugs cluster where the human was uncertain — the gnarly edge,
|
||||
When a human writes a function, the bugs cluster where the human was uncertain: the gnarly edge,
|
||||
the bit they rushed, the TODO they meant to come back to. You can often *feel* the soft spots, and
|
||||
the code's roughness is a signal: confusing code is suspicious code.
|
||||
|
||||
AI output inverts that signal. It is **uniformly fluent.** The variable names are good, the
|
||||
structure is clean, the comment above the broken line confidently states the correct intention,
|
||||
and the one wrong line looks exactly as polished as the forty right ones. The fluency is constant;
|
||||
the correctness is not — and your eye has spent a career using fluency as a proxy for correctness.
|
||||
the correctness is not, and your eye has spent a career using fluency as a proxy for correctness.
|
||||
That proxy is now actively misleading.
|
||||
|
||||
So the question shifts. With human code you mostly ask *"is this good code?"* With AI code you have
|
||||
to ask *"is this code true?"* — does it do what it claims, against the request I actually made,
|
||||
to ask *"is this code true?"*: does it do what it claims, against the request I actually made,
|
||||
using things that actually exist. That's reviewing for **plausibility traps**: code engineered (by
|
||||
a process optimizing for plausible-looking output) to pass exactly the skim you're tempted to give
|
||||
it.
|
||||
@@ -92,15 +92,15 @@ These are the failure modes to hunt for specifically. They're not random bugs; t
|
||||
characteristic ways fluent-but-untrue code goes wrong.
|
||||
|
||||
**1. Invented APIs.** The model reaches for a function, method, keyword argument, flag, config key,
|
||||
or endpoint that *should* exist by analogy — and doesn't, or exists with a different signature.
|
||||
or endpoint that *should* exist by analogy, and doesn't, or exists with a different signature.
|
||||
It's the same generative move behind hallucinated package names (the supply-chain version of this
|
||||
gets its own treatment in Module 15). The tell is that it reads *more* natural than the real API,
|
||||
because it was generated to be plausible rather than recalled from docs. Classic shape: assuming
|
||||
`list.pop(i, default)` works because `dict.pop(k, default)` does. Verify every unfamiliar
|
||||
symbol against real docs or source — confidence in the surrounding prose is not evidence.
|
||||
symbol against real docs or source. Confidence in the surrounding words is not evidence.
|
||||
|
||||
**2. Silent scope creep.** You asked for one thing; the diff does that thing *and* quietly
|
||||
"improves" three others it was never asked to touch — reformatting a file, reshuffling imports,
|
||||
"improves" three others it was never asked to touch: reformatting a file, reshuffling imports,
|
||||
renaming a variable across the module, "simplifying" an unrelated function. Each extra edit is an
|
||||
unrequested change you now have to review with no stated intent behind it, and it's where
|
||||
regressions hide. The discipline: **every hunk must trace back to the request.** Anything that
|
||||
@@ -109,7 +109,7 @@ own PR."
|
||||
|
||||
**3. Deleted edge-case handling.** The most dangerous trap, because it lives in the `-` lines you
|
||||
skim. While implementing the feature, the model drops a bounds check, removes a `None` guard,
|
||||
collapses a `try/except` into the happy path, or — worst — *replaces a real error with a silent
|
||||
collapses a `try/except` into the happy path, or, worst, *replaces a real error with a silent
|
||||
swallow* (`except: pass`) under the banner of "making it robust." The code now looks cleaner and
|
||||
passes every test you'd casually run, because you'd test the path that works. The bad input that
|
||||
the deleted guard existed to catch now fails silently. **Read every deletion. Deletions are where
|
||||
@@ -118,29 +118,35 @@ behavior disappears.**
|
||||
**4. Convincing-but-wrong logic.** An inverted condition (`if not x` where it meant `if x`), an
|
||||
off-by-one, `<` where it meant `<=`, `and` where it meant `or`, a filter quietly dropped from a
|
||||
comprehension. On the happy path it often produces a believable-enough result, and the comment
|
||||
above it cheerfully describes the *correct* behavior — so the comment actively vouches for the bug.
|
||||
above it cheerfully describes the *correct* behavior, so the comment actively vouches for the bug.
|
||||
The defense is to **trace one real call through the changed code yourself** instead of trusting the
|
||||
narration.
|
||||
|
||||
A real AI diff usually has *most lines correct* and one trap buried in legitimate work — which is
|
||||
what makes it dangerous. The feature genuinely works when you try it; the trap is somewhere you
|
||||
A real AI diff usually has *most lines correct* and one trap buried in legitimate work, which is
|
||||
what makes it dangerous. The feature really does work when you try it; the trap is somewhere you
|
||||
didn't look.
|
||||
|
||||
### How to actually read the diff
|
||||
|
||||
Mechanics first. You want the change as one reviewable unit, separate from the code you wrote it in:
|
||||
You want the change as one reviewable unit, separate from the editor you generated it in. On your
|
||||
host's PR page that's the default view: the whole change as a diff, with line comments,
|
||||
file-by-file navigation, and CI results attached. The same change reads as a block of `+`/`-`
|
||||
lines, for example a hunk that quietly drops a guard:
|
||||
|
||||
```bash
|
||||
git fetch # get the branch the PR is built from
|
||||
git diff main..feature-branch # the whole change, as one diff
|
||||
```diff
|
||||
def charge(amount):
|
||||
- if amount <= 0:
|
||||
- raise ValueError("amount must be positive")
|
||||
gateway.charge(amount)
|
||||
```
|
||||
|
||||
On your host's PR page you get the same diff with line comments, file-by-file navigation, and the
|
||||
CI results attached — use it. But the content of the review is the same whether you read it in the
|
||||
browser or the terminal.
|
||||
That block is the unit of review, whether you read it in the browser or have the agent pull it up
|
||||
in the terminal. You already know the git for this from Module 2, and from Module 4 on the agent
|
||||
fetches the branch and surfaces the diff for you. Your job is the reading, and reading the `-`
|
||||
lines first: the deleted guard above is exactly the kind of thing a skim sails past.
|
||||
|
||||
Then run the pass in this order (the full version is in
|
||||
[`lab/ai-diff-review-checklist.md`](lab/ai-diff-review-checklist.md) — keep it open while you work):
|
||||
Run the pass in this order (the full version is in
|
||||
[`lab/ai-diff-review-checklist.md`](lab/ai-diff-review-checklist.md), keep it open while you work):
|
||||
|
||||
1. **State the request in one sentence.** This is your scope yardstick. If it answers an issue
|
||||
(Module 9), that's your sentence.
|
||||
@@ -148,14 +154,14 @@ Then run the pass in this order (the full version is in
|
||||
what it *did*. Only the diff is real.
|
||||
3. **Scope check.** Every hunk maps to the request. Flag everything that doesn't.
|
||||
4. **Deletions first.** Read every `-` line and ask what behavior just left the codebase.
|
||||
5. **Verify the unfamiliar.** Every API, flag, and key you don't personally know exists —
|
||||
5. **Verify the unfamiliar.** Every API, flag, and key you don't personally know exists:
|
||||
check it.
|
||||
6. **Trace one real call**, including a failure case. Not the happy path — the bad input.
|
||||
6. **Trace one real call**, including a failure case. Not the happy path, the bad input.
|
||||
7. **Decide.** Approve only if you can explain every hunk. Otherwise request changes. The burden of
|
||||
proof is on the diff, not on you.
|
||||
|
||||
That last point is the whole posture: **a diff is guilty until proven correct.** "It runs" is the
|
||||
weakest evidence there is — the traps above are *designed* to run.
|
||||
weakest evidence there is; the traps above are *designed* to run.
|
||||
|
||||
---
|
||||
|
||||
@@ -164,20 +170,20 @@ weakest evidence there is — the traps above are *designed* to run.
|
||||
Every other module here makes a tool more valuable because of AI. This module is the one where the
|
||||
*human stays in the loop on purpose*, and it's worth being precise about why.
|
||||
|
||||
The thing AI is best at — producing fluent, confident, well-structured output — is precisely the
|
||||
The thing AI is best at, producing fluent, confident, well-structured output, is precisely the
|
||||
thing that defeats the review reflex you built reviewing humans. You learned to trust clean code
|
||||
and distrust messy code; AI produces uniformly clean code regardless of whether it's correct, so
|
||||
that heuristic now points the wrong way. Reviewing AI diffs means consciously *overriding* an
|
||||
instinct that served you well for years.
|
||||
|
||||
And the volume cuts against you. AI makes generating a 300-line PR almost free, which quietly
|
||||
shifts the bottleneck from *writing* to *reviewing* — and tempts everyone to review at the speed
|
||||
they generate. The economics of the team now hinge on review being the gate that writing no longer
|
||||
is. The fluent-but-wrong line costs nothing to produce and everything to miss.
|
||||
And the volume cuts against you. AI makes generating a 300-line PR almost free, which shifts the
|
||||
bottleneck from *writing* to *reviewing* and tempts everyone to review at the speed they generate.
|
||||
Review is now the gate that writing no longer is. The fluent-but-wrong line costs nothing to
|
||||
produce and everything to miss.
|
||||
|
||||
This is the human half of a loop you'll keep building. Module 11 wires this review gate into the
|
||||
full issue → branch → PR → review → merge motion with humans *and* agents as contributors. Much
|
||||
later, Module 24 looks at AI *reviewers* that comment on PRs automatically — but an automated
|
||||
later, Module 24 looks at AI *reviewers* that comment on PRs automatically, but an automated
|
||||
reviewer is an assistant to this skill, not a replacement for it. You can't supervise a review bot
|
||||
you couldn't do yourself.
|
||||
|
||||
@@ -190,28 +196,41 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
|
||||
**You'll need:**
|
||||
|
||||
- Git, Python 3.10+, and your AI assistant.
|
||||
- Git, Python 3.10+, and your coding agent (Claude Code in the examples; sub your own).
|
||||
- The starter base app in [`lab/tasks-app/`](lab/tasks-app/) (`tasks.py`, `cli.py`). It's the
|
||||
Module 1/2 app with one addition: `complete()` validates the index and `done` turns a bad index
|
||||
into a clean error. Note that behavior — the trap will mess with it.
|
||||
into a clean error. Note that behavior; the trap will mess with it.
|
||||
- The planted AI change in [`lab/ai-change.patch`](lab/ai-change.patch).
|
||||
- The review checklist in [`lab/ai-diff-review-checklist.md`](lab/ai-diff-review-checklist.md).
|
||||
- **Optional (Part A as a real PR):** the repo you pushed to a host in Module 8. If you don't have
|
||||
one, do Part A locally as a branch — the review skill in Parts B–C is identical either way.
|
||||
one, do Part A locally as a branch; the review skill in Parts B–C is identical either way.
|
||||
|
||||
### Part A — Open a PR as a gate
|
||||
|
||||
1. Set up the base app as a repo and confirm its baseline behavior. This `review-lab` is a
|
||||
throwaway repo *separate* from the `tasks-app` you've built up across earlier modules — you can
|
||||
delete it when you're done, and nothing here touches your main app. (Use your real course path in
|
||||
place of `/path/to/`, the same copy-it-in move from Module 5.)
|
||||
1. Have your agent set up the base app as a throwaway `review-lab` repo, then confirm the baseline
|
||||
behavior yourself. This `review-lab` is *separate* from the `tasks-app` you've built up across
|
||||
earlier modules; you can delete it when you're done, and nothing here touches your main app. From
|
||||
Module 4 on the agent drives the git and setup, so direct Claude Code (sub your own agent) to
|
||||
scaffold it:
|
||||
|
||||
> *"Make a new directory `~/ai-workflow-course/review-lab` and copy the two Python files from
|
||||
> `~/ai-workflow-course/the-workflow-course/modules/10-reviewing-code-you-didnt-write/lab/tasks-app/`
|
||||
> into it. Add a `.gitignore` that ignores `tasks.json` and `__pycache__/` so runtime state stays
|
||||
> out of the diffs. Initialize a git repo on a branch named `main`, stage everything, and make one
|
||||
> commit: `base: tasks-app`."*
|
||||
|
||||
The branch name is load-bearing: the steps below diff against `main` and switch back to it, so
|
||||
verify the agent actually used `main` (not whatever its default is). Confirm the result:
|
||||
|
||||
```bash
|
||||
mkdir -p ~/ai-workflow-course/review-lab && cd ~/ai-workflow-course/review-lab
|
||||
cp /path/to/modules/10-reviewing-code-you-didnt-write/lab/tasks-app/*.py .
|
||||
printf 'tasks.json\n__pycache__/\n' > .gitignore # keep generated runtime state out of your review diffs (Module 2)
|
||||
git init -qb main && git add . && git commit -qm "base: tasks-app" # -b main so the git switch main / git diff main.. steps below resolve
|
||||
cd ~/ai-workflow-course/review-lab
|
||||
git log --oneline # one commit, "base: tasks-app", on branch main
|
||||
git status # clean tree; tasks.json ignored, not tracked
|
||||
```
|
||||
|
||||
Then see the baseline behavior with your own eyes, because the trap is going to change it:
|
||||
|
||||
```bash
|
||||
python cli.py add "write the review module"
|
||||
python cli.py done 99 # baseline: prints "error: no task at index 99", exits non-zero
|
||||
echo "exit code: $?"
|
||||
@@ -219,36 +238,35 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
|
||||
Remember that last result. A bad index is a clean, loud error today.
|
||||
|
||||
2. Make a small honest change of your own on a branch — ask your AI for a one-line tweak, e.g.
|
||||
*"make the empty-list message say '(nothing to do)' instead of '(no tasks yet)'"* — apply it,
|
||||
commit it, and open it as a PR:
|
||||
2. Now practice the gate on a trivial, honest change. Tell the agent to make a one-line tweak on
|
||||
its own branch and put it up for review:
|
||||
|
||||
```bash
|
||||
git switch -c tweak-empty-message
|
||||
# apply the AI's one-line change to tasks.py, then:
|
||||
git add . && git commit -m "Friendlier empty-list message"
|
||||
```
|
||||
> *"On a new branch `tweak-empty-message`, change the empty-list message in `tasks.py` from
|
||||
> '(no tasks yet)' to '(nothing to do)'. Commit it as 'Friendlier empty-list message'. If this
|
||||
> repo has a remote, push the branch and open a pull request; otherwise leave it on the branch."*
|
||||
|
||||
If you have a Module 8 remote: `git push -u origin tweak-empty-message`, then open the PR on
|
||||
your host and read your own diff in the PR view. If you're local-only:
|
||||
`git diff main..tweak-empty-message`. Either way, **review your own one-line change as a diff
|
||||
before merging it.** Get used to the gate on a trivial change so it's a reflex on a dangerous
|
||||
one. Merge it when you're satisfied (`git switch main && git merge tweak-empty-message`).
|
||||
Your job is the review, not the plumbing. Read the resulting diff before it lands: on the PR page
|
||||
if the agent opened one, or with `git diff main..tweak-empty-message` if you're local-only. It's
|
||||
one line, and that's the point. Make reading-before-merging a reflex on a trivial change so it's
|
||||
automatic on a dangerous one. Once you've read it and it's exactly what you asked for, tell the
|
||||
agent to merge it into `main`.
|
||||
|
||||
### Part B — Review the AI's diff (the real exercise)
|
||||
|
||||
3. Now a teammate-who-is-an-AI has opened a PR. The prompt it was given was exactly:
|
||||
**"Add a `delete <index>` command to the tasks app."** Bring its change in on its own branch.
|
||||
`git apply` lays the AI's proposed change onto this branch as if it were its PR, so you can read
|
||||
it before deciding whether to keep it — exactly what you'd be doing in a real PR review. (Again,
|
||||
use your real course path in place of `/path/to/`.)
|
||||
**"Add a `delete <index>` command to the tasks app."** The change is captured as a patch in the
|
||||
lab so the review is reproducible. Have the agent stage it as that teammate's PR, on its own
|
||||
branch:
|
||||
|
||||
```bash
|
||||
git switch main
|
||||
git switch -c ai-delete-command
|
||||
git apply /path/to/modules/10-reviewing-code-you-didnt-write/lab/ai-change.patch
|
||||
git add . && git commit -m "Add delete command"
|
||||
```
|
||||
> *"From `main`, create a branch `ai-delete-command`. Apply the patch at
|
||||
> `~/ai-workflow-course/the-workflow-course/modules/10-reviewing-code-you-didnt-write/lab/ai-change.patch`
|
||||
> to the working tree, then commit it as 'Add delete command'. Don't review or 'fix' it; just
|
||||
> land it on the branch so I can review it."*
|
||||
|
||||
`git apply` is how the lab injects the incoming change so you can read it before deciding whether
|
||||
to keep it, exactly what you'd do in a real PR review. Telling the agent not to clean it up
|
||||
matters: left to its own judgment it might "helpfully" repair the planted problem before you
|
||||
ever see it.
|
||||
|
||||
4. **Review it before you run it.** Open the checklist and read the diff as one unit:
|
||||
|
||||
@@ -275,15 +293,15 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
```
|
||||
|
||||
In the base app, `done 99` was a clean error with a non-zero exit. After this "add a delete
|
||||
command" change, it prints `updated` and exits `0` — silently claiming success while marking
|
||||
command" change, it prints `updated` and exits `0`, silently claiming success while marking
|
||||
nothing. The diff *only said* it was adding `delete`. While in the file it also rewrote
|
||||
`complete()` to swallow the `IndexError` "for robustness," deleting the edge-case handling and
|
||||
turning a loud failure into a silent lie. That's three traps in one small hunk: **scope creep**
|
||||
(it touched `complete`, which the request never mentioned), **deleted edge-case handling**, and
|
||||
**convincing-but-wrong logic** wearing a reassuring comment.
|
||||
|
||||
6. Play it out. On your host's PR you'd leave a line comment on the `complete()` hunk —
|
||||
*"out of scope, and this swallows the error `done` relied on; please drop it"* — and **request
|
||||
6. Play it out. On your host's PR you'd leave a line comment on the `complete()` hunk
|
||||
(*"out of scope, and this swallows the error `done` relied on; please drop it"*) and **request
|
||||
changes** rather than approve. The feature you were asked for was fine; the PR still doesn't
|
||||
merge. That's the gate doing its job.
|
||||
|
||||
@@ -293,11 +311,11 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
|
||||
- **A checklist is a floor, not a ceiling.** It catches the characteristic traps reliably; it will
|
||||
not catch a deep logic error that requires understanding the whole system. For changes in code
|
||||
you don't know, reviewing the diff in isolation isn't enough — that harder case (pointing AI at
|
||||
you don't know, reviewing the diff in isolation isn't enough; that harder case (pointing AI at
|
||||
an unfamiliar codebase, and reviewing safely there) is Module 23.
|
||||
- **Tests catch what review misses, and vice versa.** This module is human review; it pairs with
|
||||
automated testing and CI (Modules 13–14), which catch the regressions a tired reviewer skims
|
||||
past. Neither replaces the other — the trap in this lab passes a casual run *and* would pass a
|
||||
past. Neither replaces the other: the trap in this lab passes a casual run *and* would pass a
|
||||
test suite that only tests the happy path. Review is what notices the test you *should* have.
|
||||
- **Review fatigue is real and AI makes it worse.** Twenty fluent PRs in a day will wear down the
|
||||
exact attention this skill needs, and a rubber-stamped review is worse than none because it
|
||||
@@ -305,7 +323,7 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
small and single-purpose so each one is reviewable in full. A PR too big to review honestly
|
||||
should be sent back to be split, not skimmed.
|
||||
- **You can't review what you don't understand.** If a diff uses an API or a corner of the language
|
||||
you don't know, "looks fine" is not a review — that's the moment to verify it exists and does
|
||||
you don't know, "looks fine" is not a review; that's the moment to verify it exists and does
|
||||
what it claims, or to pull in someone who knows. The honest output of a review is sometimes
|
||||
"I'm not qualified to approve this," and that's a valid result.
|
||||
|
||||
@@ -315,17 +333,17 @@ real change, then review a diff the "AI" produced and catch the trap planted in
|
||||
|
||||
**You're done when:**
|
||||
|
||||
- You've opened (or branched) a change and reviewed it as a diff *before* merging — the gate is a
|
||||
reflex, even on a one-liner.
|
||||
- You've opened (or branched) a change and reviewed it as a diff *before* merging, so the gate is a
|
||||
reflex even on a one-liner.
|
||||
- You found the planted trap in `ai-change.patch` by reading the diff against the one-sentence
|
||||
request, and named *why* it's a trap (it changed `complete()`, which the request never mentioned,
|
||||
and swallowed the error `done` depended on).
|
||||
- You confirmed it by running the **failure** case (`done 99`) and seeing the silent `updated` +
|
||||
exit `0`, instead of trusting the happy path (`delete 0`) that worked fine.
|
||||
- You can name the four plausibility traps from memory — invented APIs, silent scope creep, deleted
|
||||
edge-case handling, convincing-but-wrong logic — and you treat a diff as guilty until proven
|
||||
- You can name the four plausibility traps from memory (invented APIs, silent scope creep, deleted
|
||||
edge-case handling, convincing-but-wrong logic) and you treat a diff as guilty until proven
|
||||
correct.
|
||||
|
||||
When "it runs" stops feeling like sufficient evidence and "I read every `-` line" starts feeling
|
||||
mandatory, you've got the skill. Module 11 takes this gate and wires it into the full collaboration
|
||||
loop — issues, branches, PRs, and merges — with both humans and agents as contributors.
|
||||
loop (issues, branches, PRs, and merges) with both humans and agents as contributors.
|
||||
|
||||
@@ -7,7 +7,7 @@ file; it's to interrogate **the change** against the prompt you gave. Work top t
|
||||
|
||||
- [ ] **What did I actually ask for?** Write the request in one sentence. Every changed line
|
||||
should trace back to it.
|
||||
- [ ] **Read the diff, not the prose.** Ignore the AI's summary of what it did; the diff is the
|
||||
- [ ] **Read the diff, not the summary.** Ignore the AI's account of what it did; the diff is the
|
||||
only ground truth. (`git diff main..<branch>`)
|
||||
|
||||
## 1. Scope — did it change only what was asked?
|
||||
|
||||
Reference in New Issue
Block a user