ai-workflow-course/blog/11-reviewing-code-you-didnt-write.md

<!--
Suggested title:    The AI's Code Looks Right. That's the Problem.
Alt title:          Reviewing Code You Didn't Write: Plausibility Traps and the PR as a Gate
Slug:               the-workflow-reviewing-ai-code
Meta description:   AI writes uniformly clean code whether it's correct or not — which breaks the
                    review instinct you spent years building. Here's how to read an AI diff for
                    plausibility traps, and why the pull request is the gate that catches them.
Tags:               AI, code review, pull requests, git, developer workflow, plausibility traps
-->

# The AI's Code Looks Right. That's the Problem.

Here's a thing I had to unlearn the hard way: I'd spent years using how *clean* code looks as a shortcut for whether it's *correct*. Messy, confusing code made me slow down and squint. Tidy code with good variable names and a clear comment on top got a nod and a merge. That heuristic served me well for a long time.

Then I started reviewing code an AI wrote, and that instinct walked me straight into a wall.

This is the eleventh post in my walk through [The Workflow](https://git.jpaul.io/justin/ai-workflow-course), my free course on the toolchain *around* AI coding. And I'll say this plainly, the way the course does: reviewing a diff you didn't write is one of the most important and least-taught skills in this whole space. If you take one habit from the entire series, I'd be tempted to point at this one. So this post gets the weight it deserves.

## Why your review instinct is now lying to you

Think about where bugs live in code a *human* wrote. They cluster where the human was uncertain — the gnarly edge case, the bit they rushed, the function with the TODO they meant to come back to. You can often *feel* the soft spots. The roughness is a signal. Confusing code is suspicious code, and your eye learned to slow down right where it mattered.

AI output inverts that signal completely. It is **uniformly fluent.** The variable names are good. The structure is clean. The comment above the broken line confidently states the *correct* intention. And the one wrong line looks exactly as polished as the forty right ones around it. The fluency is constant; the correctness is not — and you've spent a career using fluency as a proxy for correctness. That proxy is now actively misleading you.

So the question you're asking has to change. With human code, you mostly ask *"is this good code?"* With AI code, you have to ask something colder: *"is this code true?"* Does it actually do what it claims? Against the request I actually made? Using things that actually exist? That's a different activity, and assuming it's the same one is how people get burned.

## The four plausibility traps

I call these plausibility traps because that's exactly what they are — code produced by a process optimizing for *plausible-looking output*, engineered (not on purpose, but effectively) to pass the quick skim you're tempted to give it. They're not random bugs. They're the characteristic ways fluent-but-untrue code goes wrong, and once you can name them you start seeing them.

**1. Invented APIs.** The model reaches for a function, a keyword argument, a config key, a flag, an endpoint that *should* exist by analogy — and doesn't, or exists with a different signature. The tell is that it reads *more* natural than the real API, because it was generated to be plausible rather than recalled from docs. Classic shape: assuming `list.pop(i, default)` works because `dict.pop(k, default)` does. The fix is unglamorous — verify every unfamiliar symbol against real docs or source. Confidence in the surrounding prose is not evidence.

**2. Silent scope creep.** You asked for one thing. The diff does that thing *and* quietly "improves" three others it was never asked to touch — reformats a file, reshuffles imports, renames a variable across the module, "simplifies" an unrelated function. Each extra edit is an unrequested change you now have to review with no stated intent behind it, and it's exactly where regressions hide. The discipline: every hunk must trace back to the request. Anything that doesn't is guilty until proven innocent.

**3. Deleted edge-case handling.** This is the most dangerous one, because it lives in the `-` lines you skim. While building the feature, the model drops a bounds check, removes a `None` guard, or — the worst version — replaces a real error with a silent swallow (`except: pass`) under the banner of "making it robust." The code now looks *cleaner* and passes every test you'd casually run, because you'd test the path that works. The bad input the deleted guard existed to catch now fails silently. **Read every deletion.** Deletions are where behavior disappears.

**4. Convincing-but-wrong logic.** An inverted condition (`if not x` where it meant `if x`), an off-by-one, `<` where it meant `<=`, `and` where it meant `or`, a filter quietly dropped from a comprehension. On the happy path it produces a believable-enough result, and the comment above it cheerfully narrates the *correct* behavior — so the comment actively vouches for the bug. The defense is to trace one real call through the changed code yourself instead of trusting the narration.

A real AI diff usually has *most lines correct* and one trap buried in legitimate work. That's the whole danger. The feature genuinely works when you try it. The trap is somewhere you didn't look.

## The pull request is a gate, not a formality

So where do you run this review? At a gate. And the gate already has a name you know: the **pull request** (or merge request, if you're on GitLab — same thing).

A PR proposes merging a branch into `main` and *pauses there* so the change can be looked at before it lands. The trap is treating that pause as a rubber stamp — "looks good, merge" — which is exactly how bad changes get the institutional blessing of "well, it was reviewed."

Reframe it the way you already think about change control: **a PR is a change gate, and merge is a one-way door.** Once it's on `main`, it's in everyone's next clone, in CI, on its way to a deploy. The cheapest place to catch a problem is in the diff, before the door closes.

And here's the part people resist: this holds **even when you're the only human on the repo.** Not for bureaucracy's sake. For two reasons that genuinely pay off solo. *Traceability* — the PR is a durable record of what changed and why, linked to the issue it answers; `git log` tells you the change happened, the PR tells you the reasoning. And *a forced read* — opening the PR makes you look at the whole change as one diff, away from the chat you generated it in. That context switch is where you catch the thing you were too close to see. When the author is an AI with total confidence and zero memory of why, both reasons get sharper.

[insert a screenshot referencing a pull request diff view on GitHub/Gitea with a line comment on a deletion here]

## Let me show you a trap

Talk is cheap, so here's the lab the course runs, compressed. You've got a tiny `tasks-app` — a command-line to-do list. In the base version, `complete()` validates the index, so `done 99` on a list with three tasks gives you a clean, loud error and a non-zero exit code:

```bash
python cli.py done 99        # prints "error: no task at index 99", exits non-zero
echo "exit code: $?"
```

Remember that. A bad index is a clean failure today.

Now a teammate-who-is-an-AI opens a PR. The prompt it was given was exactly one sentence: **"Add a `delete <index>` command to the tasks app."** You bring its change in on its own branch and read it as one unit before deciding anything:

```bash
git switch -c ai-delete-command
git apply /path/to/lab/ai-change.patch
git diff main..ai-delete-command
```

The diff adds a `delete` command. It works — try `delete 0`, the task goes away, clean exit. If you stopped there, you'd approve it. The feature you asked for is genuinely fine.

But run the *failure* path, not the happy one:

```bash
python cli.py done 99         # the trap
echo "exit code: $?"
```

In the base app that was a loud error. After this "add a delete command" change, it prints `updated` and exits `0` — silently claiming success while marking nothing. Why? Because while it was in the file, the AI also rewrote `complete()` to swallow the `IndexError` "for robustness." That's *three* traps in one small hunk: **scope creep** (it touched `complete()`, which the request never mentioned), **deleted edge-case handling** (the guard `done` relied on is gone), and **convincing-but-wrong logic** wearing a reassuring comment. The diff *said* it was adding `delete`. It quietly turned a loud failure into a silent lie.

That's the whole lesson in one hunk. The feature works. The trap is in the part the description didn't mention and you didn't run.

## How to actually read the diff

Mechanically, you want the change as one reviewable unit, separate from the chat you generated it in — `git diff main..feature-branch` in the terminal, or the PR page on your host (which gives you the same diff plus line comments and CI results). The content of the review is the same either way. The pass goes in this order:

1. **State the request in one sentence.** That's your scope yardstick. If it answers an issue, that's your sentence.
2. **Read the diff, not the AI's summary.** The summary is what it *intended*. The diff is what it *did*. Only the diff is real.
3. **Scope check.** Every hunk maps to the request. Flag everything that doesn't.
4. **Deletions first.** Read every `-` line and ask what behavior just left the codebase.
5. **Verify the unfamiliar.** Every API, flag, and key you don't personally know exists — check it.
6. **Trace one real call,** including a failure case. Not the happy path. The bad input.
7. **Decide.** Approve only if you can explain every hunk. Otherwise request changes.

That last point is the whole posture: **a diff is guilty until proven correct.** "It runs" is the weakest evidence there is — the traps above are *designed* to run.

## The AI angle

Every other tool in this course gets *more* valuable because of AI. This is the one module where the human stays in the loop on purpose, and it's worth being precise about why.

The thing AI is best at — fluent, confident, well-structured output — is precisely the thing that defeats the review reflex you built reviewing humans. You learned to trust clean code and distrust messy code; AI produces uniformly clean code regardless of correctness, so that heuristic now points the wrong way. Reviewing AI diffs means *consciously overriding* an instinct that served you well for years.

And the volume cuts against you. AI makes generating a 300-line PR almost free, which quietly moves the bottleneck from *writing* to *reviewing* — and tempts everyone to review at the speed they generate. The whole economics of a team now hinge on review being the gate that writing no longer is. The fluent-but-wrong line costs nothing to produce and everything to miss.

## Where it breaks (because I like to be honest)

A few caveats, because I'd rather you trust me than oversell you:

- **A checklist is a floor, not a ceiling.** It reliably catches the characteristic traps. It will *not* catch a deep logic error that needs you to understand the whole system. Reviewing an isolated diff in code you don't know is a harder case (a later module's problem).
- **Tests catch what review misses, and vice versa.** This is *human* review; it pairs with testing and CI, not replaces them. The trap in that lab passes a casual run *and* would pass a test suite that only tests the happy path. Review is what notices the test you *should* have written.
- **Review fatigue is real, and AI makes it worse.** Twenty fluent PRs in a day will wear down the exact attention this skill needs, and a rubber-stamped review is worse than none — it launders the change as "reviewed." The mitigation is small PRs. A change too big to review honestly should be sent back to be split, not skimmed.
- **You can't review what you don't understand.** If a diff uses a corner of the language you don't know, "looks fine" isn't a review. Verify it, or pull in someone who can. "I'm not qualified to approve this" is a valid and honest result.

## You're done when

"It runs" stops feeling like sufficient evidence, and "I read every `-` line" starts feeling mandatory. You can name the four traps from memory — invented APIs, silent scope creep, deleted edge-case handling, convincing-but-wrong logic — and you treat every diff as guilty until proven correct. That's the skill.

Next up, I take this review gate and wire it into the full collaboration loop — issue to branch to PR to review to merge — with both humans *and* agents as contributors. The gate you just learned is what makes letting an agent open PRs survivable.

If you've been burned by a clean-looking AI diff that turned out to be quietly wrong — I want to hear that story. Drop it in the comments. I read them, and the traps you've hit are exactly what makes this lesson sharper.