Build out all 27 modules + capstone (#1)

Co-authored-by: claude <claude@jpaul.io>
Co-committed-by: claude <claude@jpaul.io>
This commit was merged in pull request #1.
2026-06-22 12:19:01 -04:00
committed by Claude (agent)
parent 4bd586bbd0
commit 2684095e2f
117 changed files with 15131 additions and 1 deletions
@@ -0,0 +1,299 @@
# Module 23 — Working with Existing Codebases
> **Every module so far quietly assumed you started the project. Most of your real work won't be
> like that.** This module is about pointing AI at a large codebase you *didn't* write — and making
> changes that don't break a system nobody fully understands.
---
## Prerequisites
This module needs only the **Module 4** tooling to *attempt* — an agentic, editor-integrated AI that
can read and edit your files. But it's placed at the back on purpose, because the basics are exactly
what make changing unfamiliar code survivable. Lean on:
- **Module 2 — Version control as a safety net.** You're about to let an AI touch code you don't
understand. The commit you can return to is the only reason that's not reckless.
- **Module 6 — Branches.** Every change here happens on a branch, isolated from working code.
- **Module 10 — Reviewing code you didn't write.** The core skill of this whole course, now aimed at
a diff in a codebase you *also* didn't write. Double the unfamiliarity, double the discipline.
- **Module 12 — Revert, reset, and recovery.** When a change in a system you don't understand goes
wrong, recovery is how you get out clean.
- **Module 13 — Testing.** The existing test suite is your contract for "did I break anything I
can't see?"
- **Module 20 — MCP servers.** Real, structured access to the code and the tools around it, instead
of pasting fragments.
- **Module 21 — Skills.** Where you codify the navigation and safe-change playbooks this module
teaches, so you don't re-explain them every session.
---
## Learning objectives
By the end of this module you can:
1. Give an AI enough **factual, verifiable context** about a large repo to be useful in it, instead
of letting it work from a few pasted fragments.
2. Have the AI **map and explain** an unfamiliar area — architecture, entry points, where things
live — and verify that map against the actual files *before* anything is touched.
3. Scope a change down to the **smallest reviewable diff** that solves the problem, and refuse the
sweeping rewrite the AI will happily offer.
4. Use **MCP (Module 20)** to give the AI real access to the code and surrounding tools, and
**skills (Module 21)** to make your navigation and safe-change process repeatable.
5. Make one **small, scoped, tested, reviewable** change to a codebase you didn't write — and know
why it's safe.
---
## Key concepts
### The greenfield assumption, and why it was a lie
Everything up to now used `tasks-app`: a tiny project you stood up, understood completely, and grew.
That made the lessons clean. It also made them unrepresentative. The dominant reality for an IT pro
is the opposite: a codebase that's **large, old, written by people who've left, and load-bearing for
something that matters.** You're not asked to build it. You're asked to change one thing in it
without breaking the other thousand things you've never read.
This is where AI is simultaneously most tempting and most dangerous. Tempting, because "just ask the
AI to figure it out" feels like exactly the leverage you need against 200,000 lines you don't know.
Dangerous, because the AI's two default failure modes get *worse* the bigger and less familiar the
codebase is:
- **It maps from vibes.** A file named `auth.py` becomes "the authentication module" in its mental
model whether or not the real auth lives there. It confidently describes structure it inferred
from names, not from reading. In a small repo you'd catch it. In a huge one you won't.
- **It rewrites instead of edits.** Ask for a small change and it hands you a "cleaned-up" version of
the whole file — reformatted, renamed, restructured — burying your one-line fix in a 300-line diff
nobody can review. In code you wrote, that's annoying. In code you didn't, it's how an invisible
regression ships.
The entire job of this module is to deny the AI both of those defaults: **force it to map from the
real files, and force every change to stay small and reviewable.**
### The motion: orient, map, then change
Three phases, strictly in order. Skipping ahead is the mistake.
**1. Orient — establish ground truth before any opinion.** Before the AI gets to reason about the
codebase, give it facts it can't hallucinate: the actual file list, the real entry points, the
languages by volume, the build and test commands, the biggest files (often the spine of the system),
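the recent commit history. This is mechanical and cheap: a script produces most of it. The lab's
`orient.py` reports file-type counts, project signals, likely test commands, the largest files, the
top-level layout, and recent commits; it does not detect entry points, so confirm those while
mapping. It anchors everything that follows in reality. You're not asking the AI "what is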
this project?" cold; you're handing it the facts and asking it to *interpret* them.
**2. Map — explain the area before touching it.** Now the AI builds a mental model, and the only
acceptable model is one **traced through real files with citations.** Don't accept "the request
flows through the controller layer." Demand: "trace one request from entry point to response, naming
each file it passes through." The deliverable is an architecture summary plus a "where things live"
table — and crucially, a list of **open questions the code didn't answer.** A map with honest gaps is
trustworthy. A map with no gaps is fiction. This phase is **read-only**; nothing changes on disk.
**3. Change — the smallest scoped, tested, reviewable diff.** Only now do you edit. One change, one
branch (Module 6). Find the blast radius first — every caller of what you're touching — and if you
can't enumerate them, you're not ready. Make the minimal edit, add a test that fails without it,
run the *full* existing suite, and self-review the diff like it's someone else's PR (Module 10). No
drive-by reformatting. No "while I was in here." The diff a reviewer sees should be exactly the
change and nothing else.
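A minimal sketch of the blast-radius step, assuming a plain-text search is an acceptable first pass. `find_usages` is a hypothetical helper, not part of the lab files, and the `.py` default is only an example; `git grep` or a language server would answer the same question with less code.

```python
import re
from pathlib import Path


def find_usages(root: str, symbol: str, exts: tuple[str, ...] = (".py",)) -> list[tuple[str, int, str]]:
    """Return (relative_path, line_number, line) for every whole-word match of `symbol`."""
    # \b keeps `parse_date` from matching `parse_dates`.
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in exts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

If the list is longer than you can read in one sitting, that's the signal you're not ready to edit. Text search also misses callers that are wired up dynamically, so treat the result as a floor, not a ceiling.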
### Context is the bottleneck, not intelligence
A frontier model is plenty smart enough to understand any one file in your repo. What it *can't* do
is hold all 200,000 lines in its head at once — the context window is finite, and stuffing it full of
irrelevant code makes the model worse, not better. So the skill here isn't "give the AI more." It's
**give the AI the right slice, and a way to fetch more on demand.**
That reframes the orientation pack: its job is to be a small, high-signal index that lets the AI
decide what to read next, not a dump of the whole tree. And it's exactly why the next two tools
matter so much in this module.
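One way to picture "the right slice" is as a budget. The sketch below is illustrative only: `pick_slice` is a hypothetical helper, and four characters per token is a rough rule of thumb assumed for the example, not a property of any particular model.

```python
CHARS_PER_TOKEN = 4  # assumption: a coarse estimate, good enough for ordering decisions


def pick_slice(sizes: dict[str, int], priority: list[str], token_budget: int) -> tuple[list[str], list[str]]:
    """Walk files in priority order and take each one that still fits the budget.

    `sizes` maps path -> size in characters. Returns (included, deferred);
    the deferred files are the ones to fetch on demand later.
    """
    included: list[str] = []
    deferred: list[str] = []
    remaining = token_budget
    for path in priority:
        cost = sizes[path] // CHARS_PER_TOKEN + 1
        if cost <= remaining:
            included.append(path)
            remaining -= cost
        else:
            deferred.append(path)
    return included, deferred
```

The interesting output is the deferred list: it names what the AI has *not* seen, which is exactly what it tends not to tell you.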
### Where MCP earns its place (Module 20)
Pasting files into a chat doesn't scale past a handful of them, and it makes the AI work blind
between pastes. **MCP (Module 20) gives the AI real, structured access to the codebase and the tools
around it** so it can navigate on its own instead of waiting for you to feed it fragments. The kinds
of access that turn a guessing model into a grounded one:
- **The filesystem and code search** — so it can grep for every caller of a function instead of
assuming it found them all.
- **Language-server intelligence** — go-to-definition, find-references, type info — so "where is this
used?" is answered by the toolchain, not by the model's guess.
- **The surrounding systems** — the issue tracker (Module 9), CI results (Module 14), the running
app's logs — so the AI maps the code *and* the context it lives in.
The orientation pack is the cold-start. MCP is how the AI keeps the map accurate as it digs, by
pulling real answers from real tools instead of inferring them.
### Where skills earn their place (Module 21)
The orient/map/change motion is the same on every repo. That makes it a perfect candidate for a
**skill (Module 21)** — a committed, reusable playbook so you don't re-explain "map before you touch,
cite real files, keep the diff small" every single session. This module ships two starter skills in
`lab/skills/`:
- **`map-this-repo`** — the read-only navigation playbook: orient, find entry points, trace one path
end to end, produce a cited architecture summary with honest open questions.
- **`safe-change`** — the safe-change playbook: branch first, find the blast radius, baseline the
tests, make the minimal edit, cover it, self-review, and a set of **stop conditions** that tell the
AI to escalate to a human instead of pushing on.
These are the more structured counterparts of the committed instructions file from Module 5: instead
of "be careful in unfamiliar code," they encode *exactly* what careful means, as steps the AI follows
every time.
---
## The AI angle
A generic "onboarding to a legacy codebase" guide would tell a human to read the README and ask a
senior dev. What's specific here is that **the AI is both the thing reading the codebase and the
thing most likely to confidently misread it** — and the bigger the repo, the wider that gap between
"sounds authoritative" and "is correct."
So the AI-specific discipline is verification, not exploration. The model is genuinely excellent at
the grunt work of orientation — reading a hundred files, summarizing structure, tracing a call path —
which is exactly the work that's tedious and slow for a human. But it will narrate a wrong map with
the same fluent confidence as a right one. Your job shifts from "explore the code" (let the AI do
that) to "make the AI prove its map against real files, and keep its changes small enough that a
wrong map can't do much damage." The whole earlier toolchain — version control, branches, review,
tests, recovery — is what turns "the AI might be wrong about this huge system" from a catastrophe
into a revertable diff.
---
## Hands-on lab
**Lab language:** shell + the provided Python script (`orient.py`); you run it, you don't write it.
This lab does **not** use `tasks-app` — the entire point is a codebase you *didn't* write.
**You'll need:**
- Git, Python 3.10+, and your agentic AI tool from Module 4.
- A real, small-to-medium open-source repo to clone. Pick something with **tests** and a clear
build/test command, in a language you can at least read. Good traits: a few thousand lines, an
obvious entry point, a green test suite. (Avoid giant frameworks for a first run — you want a
system you can't fully hold in your head, but whose test suite finishes in under a minute.)
- The starter files from this module's `lab/` folder: `orient.py` and `skills/`.
### Part A — Clone and orient
1. Clone your chosen repo and copy `orient.py` into its root:
```bash
git clone <repo-url> unfamiliar-repo
cd unfamiliar-repo
# copy modules/23-working-with-existing-codebases/lab/orient.py into this folder
python orient.py > ORIENT.md
```
2. Read `ORIENT.md` yourself first. In 30 seconds you should know the language, the likely entry
point, the probable test command, and which files are biggest. These are **facts** — the AI can't
argue with them. (Don't commit `ORIENT.md`; it's scratch context.)
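If you'd rather not rely on remembering, git keeps a per-clone ignore list at `.git/info/exclude` that is never committed. A small sketch, assuming a standard clone where `.git` is a directory (worktrees and submodules store it elsewhere); `exclude_locally` is a hypothetical helper, not part of the lab files.

```python
from pathlib import Path


def exclude_locally(repo_root: str, pattern: str = "ORIENT.md") -> bool:
    """Add `pattern` to .git/info/exclude. Returns True if added, False if already listed."""
    git_dir = Path(repo_root) / ".git"
    if not git_dir.is_dir():
        raise FileNotFoundError(f"{git_dir} is not a directory; is this a standard clone?")
    exclude = git_dir / "info" / "exclude"
    exclude.parent.mkdir(exist_ok=True)
    text = exclude.read_text(encoding="utf-8") if exclude.exists() else ""
    if pattern in text.splitlines():
        return False
    # Make sure the new entry starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    exclude.write_text(text + prefix + pattern + "\n", encoding="utf-8")
    return True
```

Unlike editing `.gitignore`, this leaves no trace in the repo you're visiting, which fits the rule that your diff should contain only the change.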
### Part B — Map before you touch (read-only)
3. Start a fresh AI session, load the `map-this-repo` skill (`lab/skills/map-this-repo.md`) or paste
it as instructions, and give it `ORIENT.md` as the opening context.
4. Ask it to produce the architecture summary: what the project does, a "where things live" table,
the confirmed build/test command, and a traced path for one real operation end to end —
**with every claim citing a real file.** Demand the list of open questions it couldn't resolve.
5. **Verify the map.** Open two or three files it cited and confirm they say what it claimed. This is
the step everyone wants to skip and the one that catches the confident-but-wrong map. If a
citation doesn't hold up, the map is suspect — push back and make it re-trace.
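Part of that verification can be mechanized. The sketch below assumes you asked the AI to cite in the form `` `path/to/file.ext:START-END` `` (that format is an assumption for this example, not something the skill mandates), and `check_citations` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed citation format: `path/to/file.ext:12-40` or `path/to/file.ext:12`, in backticks.
CITATION = re.compile(r"`([^`\s:]+):(\d+)(?:-(\d+))?`")


def check_citations(summary: str, repo_root: str) -> list[str]:
    """Return a list of problems: cited files that don't exist or line ranges out of bounds."""
    problems = []
    root = Path(repo_root)
    for match in CITATION.finditer(summary):
        rel = match.group(1)
        start = int(match.group(2))
        end = int(match.group(3) or match.group(2))
        target = root / rel
        if not target.is_file():
            problems.append(f"{rel}: file not found")
            continue
        with target.open("rb") as fh:
            total = sum(1 for _ in fh)
        if start < 1 or end < start or end > total:
            problems.append(f"{rel}:{start}-{end}: outside the file's {total} lines")
    return problems
```

This only proves the cited lines exist. Whether they say what the map claims still takes your own eyes on the file.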
### Part C — One small, scoped, tested change
6. Pick a genuinely small change — a clearer error message, a fixed edge case, a tiny missing
validation, a documented-but-unhandled input. Something a single function owns. Run the existing
tests first to establish a green baseline (`pytest`, `npm test`, `go test ./...` — whatever
`ORIENT.md` and the README confirmed).
7. Branch, then load the `safe-change` skill (`lab/skills/safe-change.md`) and work the change with
the AI:
```bash
git switch -c scoped-change
```
Make it find the blast radius (every caller) before editing. Keep the edit minimal. Add a test
that fails without the change and passes with it. Run the **full** suite.
8. **Review the diff like it's a stranger's PR (Module 10):**
```bash
git diff
```
Every changed line should be necessary and explainable. If the AI snuck in a reformat or a
rename, revert it — that's the sprawl this whole module exists to prevent. Commit only when the
diff is exactly the change and nothing more.
9. Write the PR description the `safe-change` skill asks for: what changed, why, the blast radius,
how you tested it, and what you deliberately did *not* touch.
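A quick way to spot a snuck-in reformat before reading the diff line by line is to look at per-file churn. The sketch below parses the output of `git diff --numstat` (tab-separated added, deleted, path; binary files show `-` for both counts). `heavy_files` and its 20-line threshold are assumptions chosen for illustration.

```python
def heavy_files(numstat: str, max_lines_per_file: int = 20) -> list[tuple[str, int]]:
    """Return (path, changed_lines) for files whose added+deleted count exceeds the threshold."""
    flagged = []
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-" "-" and carry no line counts
        churn = int(added) + int(deleted)
        if churn > max_lines_per_file:
            flagged.append((path, churn))
    return flagged


sample = "3\t1\tsrc/dates.py\n6\t0\ttests/test_dates.py\n140\t138\tsrc/utils.py\n-\t-\tdocs/logo.png\n"
print(heavy_files(sample))  # [('src/utils.py', 278)]
```

A file with near-equal additions and deletions in the hundreds, on a change you scoped to one function, is the classic signature of a reformat.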
---
## Where it breaks
- **A confident map is still just a hypothesis.** The AI will produce a fluent, plausible
architecture summary for a repo it half-read. Fluency is not correctness. The citation-checking in
Part B isn't optional ceremony — it's the only thing standing between you and changing code based on
a fiction. Verify at least a few claims by hand, every time.
- **The context window is a hard ceiling.** On a truly large monorepo, the AI cannot see everything,
and it usually won't *tell* you what it didn't read. Its map is only as good as the slice it
actually loaded. MCP-backed search and language-server tools (Module 20) shrink this problem by
letting it fetch on demand, but they don't erase it — treat "I've reviewed the whole codebase" as
a claim to distrust.
- **"Small change" can hide a big blast radius.** A one-line edit to a heavily called function can
ripple through code you never opened. The blast-radius search in the `safe-change` skill is the
defense, but it's only as good as the AI's ability to find *every* caller — dynamic dispatch,
reflection, config-driven wiring, and string-based lookups all defeat naive search. When in doubt,
the tests are your backstop, which is why a repo *without* tests is genuinely dangerous to change
this way.
- **The AI doesn't respect house style by default.** It writes in *its* idiom, not the repo's. In an
existing codebase that's a tell that screams "an outsider touched this" and quietly degrades
consistency. The committed instructions file (Module 5) and the `safe-change` skill's
"match local conventions" rule help, but you'll still catch drift in review.
- **Some changes shouldn't be a small diff.** A genuine architectural problem won't be fixed by the
smallest-possible edit, and forcing it to be makes things worse. This module's discipline is for
the common case — a scoped change in a system you don't own. Recognizing when a change is actually
a *project* (and escalating it as one) is its own judgment call the tooling won't make for you.
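To make the blast-radius caveat concrete, here is a contrived sketch of a string-based lookup defeating a naive search. Everything in it (`send_email`, `notify`, `naive_call_sites`) is hypothetical and exists only for the demonstration.

```python
import re

SOURCE = '''
def send_email(user):
    return f"email to {user}"

def send_sms(user):
    return f"sms to {user}"

def notify(user, channel):
    # String-based lookup: the function name is assembled at runtime.
    handler = globals()["send_" + channel]
    return handler(user)
'''


def naive_call_sites(source: str, name: str) -> list[int]:
    """Line numbers where `name(` appears, skipping the `def` line itself."""
    pattern = re.compile(rf"(?<!def )\b{re.escape(name)}\(")
    return [n for n, line in enumerate(source.splitlines(), 1) if pattern.search(line)]


namespace: dict = {}
exec(SOURCE, namespace)  # load the sample module so it can actually be called

print(naive_call_sites(SOURCE, "send_email"))  # [] : the search finds no callers
print(namespace["notify"]("ana", "email"))     # email to ana : yet it is called
```

The search reports zero callers for a function that runs in production. Only a test that exercises `notify` would catch a breaking change to `send_email`.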
---
## Check for understanding
**You're done when:**
- You can hand an AI a factual orientation pack and get back an architecture summary whose citations
you've **personally verified** against the real files — including the open questions it couldn't
resolve.
- You've made one change to a codebase you didn't write that is on its own branch, covered by a test
that fails without it, passing the full existing suite, and whose `git diff` is *exactly* the
change with no drive-by edits.
- You can explain why the orient -> map -> change order is non-negotiable, and name the two AI
failure modes (mapping from vibes, rewriting instead of editing) this module is built to deny.
- You can point to where MCP (Module 20) and skills (Module 21) make this repeatable rather than a
one-off act of heroics.
If your change is a clean, tested, reviewable one-liner in a system you couldn't have described an
hour ago — and you trust it — you've got the motion.
---
## Verify-before-publish
This is an expansion-zone module; the durable motion is stable, but the tooling around it moves.
- [ ] Confirm `orient.py` runs unchanged on current Python (3.10+) and a freshly cloned repo on
macOS, Linux, and Windows (git-bash / PowerShell).
- [ ] Re-check the MCP capabilities cited (filesystem, code search, language-server intelligence,
issue/CI/log access) against what's actually common in the current MCP ecosystem — the menu of
available servers changes fast. Keep it described as capabilities, not specific products.
- [ ] Verify the cross-references still point to the right modules if any renumbering happened
(2, 4, 5, 6, 9, 10, 12, 13, 14, 16, 20, 21; Module 16 is referenced only from `orient.py`).
- [ ] Re-confirm the `SIGNALS`/`TEST_HINTS` tables in `orient.py` still reflect common manifests and
test runners; add any that have become standard, but keep it language-agnostic.
- [ ] Sanity-check the suggested "small-to-medium repo with a fast test suite" lab guidance still
lands — recommend nothing by name that could rot.
@@ -0,0 +1,191 @@
#!/usr/bin/env python3
"""orient.py — build a factual orientation pack for a repo you didn't write.
Run it from the root of a cloned repo. It prints a Markdown summary of *ground truth*
about the codebase — size, languages, project signals, the biggest (often most central)
files, the top-level layout, and likely build/test commands — that you can paste in as the
opening context for an AI session before asking it to map or change anything.
The point is NOT to replace the AI's own exploration. It's to anchor that exploration in
facts the model can't hallucinate: real file names, real counts, real entry points. The AI
then verifies and deepens this; you never let it map from vibes alone.
No dependencies. Standard library only. Works on any OS with Python 3.10+ and git.
python orient.py # print the pack
python orient.py > ORIENT.md # save it to hand to the AI (don't commit it)
"""
from __future__ import annotations
import subprocess
import sys
from collections import Counter
from pathlib import Path
# Files whose mere presence tells you how the project is built, tested, shipped, and configured.
# (key file/dir -> what its presence means). Kept tool- and language-agnostic on purpose.
SIGNALS: dict[str, str] = {
"pyproject.toml": "Python project (PEP 621 / poetry / hatch)",
"setup.py": "Python project (legacy setuptools)",
"requirements.txt": "Python dependencies (pip)",
"package.json": "Node/JS project",
"pnpm-lock.yaml": "Node project (pnpm)",
"yarn.lock": "Node project (yarn)",
"go.mod": "Go module",
"Cargo.toml": "Rust crate",
"pom.xml": "Java/Maven project",
"build.gradle": "Java/Kotlin/Gradle project",
"Gemfile": "Ruby project",
"composer.json": "PHP project",
"Makefile": "Make targets (often the real entry point for build/test)",
"Dockerfile": "Containerized (Module 16)",
"docker-compose.yml": "Multi-service local stack (Module 16)",
"compose.yaml": "Multi-service local stack (Module 16)",
".github": "GitHub Actions / project meta",
".gitea": "Gitea Actions",
".gitlab-ci.yml": "GitLab CI",
"tox.ini": "Python test matrix",
"README.md": "Has a README — read it first",
"CONTRIBUTING.md": "Has contributor guidance — read before changing",
"ARCHITECTURE.md": "Has an architecture doc — rare and valuable",
"AGENTS.md": "Has a committed AI instructions file (Module 5)",
"CLAUDE.md": "Has a committed AI instructions file (Module 5)",
}
# Common test-runner hints keyed off a present signal file.
TEST_HINTS: dict[str, str] = {
"pyproject.toml": "pytest (or: python -m pytest)",
"tox.ini": "tox",
"package.json": "npm test (check the \"scripts\" block for the real command)",
"go.mod": "go test ./...",
"Cargo.toml": "cargo test",
"Makefile": "make test (if a 'test' target exists)",
"pom.xml": "mvn test",
"Gemfile": "bundle exec rspec (or rake test)",
}
CODE_EXTS = {
".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs", ".java", ".kt", ".rb",
".php", ".c", ".h", ".cc", ".cpp", ".hpp", ".cs", ".swift", ".scala", ".sh",
}


def git(*args: str) -> str:
    """Run a git command, return stdout (stripped), or "" on failure."""
    try:
        out = subprocess.run(
            ["git", *args],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return ""


def tracked_files() -> list[str]:
    """List every file git tracks, one path per entry."""
    listing = git("ls-files")
    return [line for line in listing.splitlines() if line]


def line_count(path: str) -> int:
    """Count lines in a file; unreadable files count as 0."""
    try:
        with open(path, "rb") as fh:
            return sum(1 for _ in fh)
    except OSError:
        return 0


def main() -> int:
    if not Path(".git").exists() and not git("rev-parse", "--is-inside-work-tree"):
        print("Not inside a git repository. cd into a cloned repo first.", file=sys.stderr)
        return 1

    files = tracked_files()
    if not files:
        print("No tracked files found (is this an empty or non-git repo?).", file=sys.stderr)
        return 1

    out: list[str] = []
    w = out.append

    # --- identity -----------------------------------------------------------
    remote = git("remote", "get-url", "origin") or "(no origin remote)"
    branch = git("rev-parse", "--abbrev-ref", "HEAD") or "(unknown)"
    total_commits = git("rev-list", "--count", "HEAD") or "?"
    w("# Repo orientation pack\n")
    w(f"- **Origin:** {remote}")
    w(f"- **Branch:** {branch}")
    w(f"- **Total commits:** {total_commits}")
    w(f"- **Tracked files:** {len(files)}")

    # --- languages ----------------------------------------------------------
    ext_counts: Counter[str] = Counter()
    for f in files:
        ext = Path(f).suffix.lower() or "(none)"
        ext_counts[ext] += 1
    w("\n## Languages / file types (top 15 by file count)\n")
    for ext, n in ext_counts.most_common(15):
        marker = " <- code" if ext in CODE_EXTS else ""
        w(f"- `{ext}`: {n}{marker}")

    # --- project signals ----------------------------------------------------
    present = {name for name in SIGNALS if Path(name).exists()}
    w("\n## Project signals (what's present at the root)\n")
    if present:
        for name in SIGNALS:
            if name in present:
                w(f"- `{name}`: {SIGNALS[name]}")
    else:
        w("- (none of the usual manifests/CI/docs at the root; look one level down)")

    # --- likely test command ------------------------------------------------
    hints = [TEST_HINTS[name] for name in TEST_HINTS if name in present]
    w("\n## Likely build/test command (verify before trusting)\n")
    if hints:
        for h in hints:
            w(f"- `{h}`")
    else:
        w("- No obvious runner detected. Search the README and CI config for the real command.")

    # --- biggest files (often the spine) ------------------------------------
    sized = sorted(
        ((line_count(f), f) for f in files if Path(f).suffix.lower() in CODE_EXTS),
        reverse=True,
    )[:15]
    w("\n## Largest code files (often where the core logic lives)\n")
    if sized:
        for n, f in sized:
            w(f"- {n:>6} lines `{f}`")
    else:
        w("- (no recognized source files)")

    # --- top-level layout ---------------------------------------------------
    top_dirs: Counter[str] = Counter()
    for f in files:
        head = f.split("/", 1)[0]
        top_dirs[head] += 1
    w("\n## Top-level layout (entries by tracked-file count)\n")
    for name, n in sorted(top_dirs.items(), key=lambda kv: (-kv[1], kv[0])):
        # An entry is a directory if any tracked path under it contains a "/".
        kind = "dir" if "/" in next(p for p in files if p.split("/", 1)[0] == name) else "file"
        # Keep the trailing slash inside the backticks and separate the count from the name.
        w(f"- `{name}{'/' if kind == 'dir' else ''}`: {n}")

    # --- recent activity ----------------------------------------------------
    recent = git("log", "--oneline", "-10")
    w("\n## Last 10 commits (the project's recent direction)\n")
    w("```")
    w(recent or "(no history)")
    w("```")

    w("\n---")
    w("> Generated by orient.py. These are *facts*, not conclusions. Hand them to the AI as the")
    w("> opening context, then make it verify and map the areas you actually care about before")
    w("> it changes anything.")

    print("\n".join(out))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -0,0 +1,32 @@
# Skill: Map this repo
A navigation playbook (a Module 21 skill) for orienting in a codebase you didn't write.
Point your agentic tool at this file as a skill, or paste it in as instructions. The goal is a
**read-only** mental model — no edits happen here.
## When to use
At the start of any session on an unfamiliar repo, before any change is discussed.
## Rules
- **Read only.** Do not edit, create, or delete files while mapping. No exceptions.
- **Cite real paths.** Every claim about the code must point to a file and, ideally, a line range.
If you can't cite it, say "unverified" instead of guessing.
- **Breadth before depth.** Establish the whole shape before diving into any one area.
- **No conclusions from file names alone.** A file called `auth.py` may not be where auth lives.
## Steps
1. Read the orientation pack (from `orient.py`), the README, and any `CONTRIBUTING`,
`ARCHITECTURE`, or committed AI-instructions file. Treat these as claims to verify, not truth.
2. Identify the **entry points**: how does this thing start? (CLI `main`, web server, library
exports.) Name the exact file(s).
3. Trace **one representative request/command end to end** — from entry point to where it does its
real work and back. List the files it passes through, in order.
4. Produce an **architecture summary** (max ~1 page):
- One paragraph: what this project does and how it's structured.
- A "where things live" table: concern -> directory/file.
- The build/test/run commands, confirmed against the README or CI config.
- 3-5 things that surprised you or look risky to touch.
5. List **open questions** you could not resolve from the code. Do not paper over them.
## Output
A single Markdown summary. End with: "Verified against: <list of files actually read>."
@@ -0,0 +1,39 @@
# Skill: Safe scoped change
A safe-change playbook (a Module 21 skill) for modifying a codebase you don't fully understand.
Use it only **after** `map-this-repo` has produced an architecture summary. The whole bet of this
skill is: small, scoped, tested, reviewable — never a sweeping rewrite.
## When to use
When making a concrete change to an unfamiliar repo.
## Rules
- **One change, one branch.** Create a branch first (Module 6). Never work on the default branch.
- **Smallest diff that solves it.** Touch the fewest files possible. If the change wants to sprawl,
stop and re-scope — sprawl in code you don't understand is how you break things invisibly.
- **No drive-by edits.** Do not reformat, rename, or "clean up" unrelated code. Those bury the real
change and make the diff unreviewable (Module 10).
- **Match local conventions.** Mirror the surrounding code's style, naming, and patterns — not your
own defaults.
- **Tests are the contract.** A change isn't done until it's covered (Module 13) and the existing
suite still passes.
## Steps
1. **State the change in one sentence** and the acceptance criterion ("done when X").
2. **Find the blast radius first:** search for every caller/usage of what you're about to touch.
List them. If you can't enumerate them, you're not ready to change it.
3. **Run the existing tests before touching anything** — establish a green baseline. If they were
already red, note it; don't let a pre-existing failure get blamed on you.
4. **Make the minimal edit.** Keep it to the files identified in step 2.
5. **Add or extend a test** that fails without your change and passes with it.
6. **Run the full suite.** All green, including the baseline tests.
7. **Self-review the diff** as if reviewing someone else's PR (Module 10): is every changed line
necessary and explained? Revert anything that isn't.
8. **Write the PR description:** what changed, why, blast radius, how it was tested, what you did
NOT touch and why.
## Stop conditions (escalate to a human instead of pushing on)
- The change requires touching more than ~3 files or a "core" file from the architecture summary.
- You can't enumerate the callers of what you're changing.
- A test you don't understand starts failing.
- The fix needs a design decision the existing code doesn't settle.
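
The first stop condition is the one that can be checked mechanically. A hedged sketch, assuming you feed it the paths from `git diff --name-only` and a hand-written set of "core" files taken from the architecture summary; `stop_reasons` is a hypothetical helper, and the other three conditions still need judgment.

```python
def stop_reasons(changed: list[str], core_files: set[str], max_files: int = 3) -> list[str]:
    """Return the reasons (if any) this change should be escalated instead of pushed on."""
    reasons = []
    if len(changed) > max_files:
        reasons.append(f"touches {len(changed)} files (limit {max_files})")
    for path in sorted(set(changed) & core_files):
        reasons.append(f"touches core file {path}")
    return reasons
```

An empty list is not permission to proceed; it only means this one condition didn't trip.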