@@ -1,29 +1,29 @@
# Module 23: Working with Existing Codebases

> **Every module so far quietly assumed you started the project. Most of your real work won't be
> like that.** This module is about pointing AI at a large codebase you *didn't* write, and making
> changes that don't break a system nobody fully understands.

---

## Prerequisites

This module needs only the **Module 4** tooling to *attempt*: an agentic, editor-integrated AI that
can read and edit your files. But it's placed at the back on purpose, because the basics are exactly
what make changing unfamiliar code survivable. Lean on:

- **Module 2: Version control as a safety net.** You're about to let an AI touch code you don't
  understand. The commit you can return to is the only reason that's not reckless.
- **Module 6: Branches.** Every change here happens on a branch, isolated from working code.
- **Module 10: Reviewing code you didn't write.** The core skill of this whole course, now aimed at
  a diff in a codebase you *also* didn't write. Double the unfamiliarity, double the discipline.
- **Module 12: Revert, reset, and recovery.** When a change in a system you don't understand goes
  wrong, recovery is how you get out clean.
- **Module 13: Testing.** The existing test suite is your contract for "did I break anything I
  can't see?"
- **Module 20: MCP servers.** Real, structured access to the code and the tools around it, instead
  of pasting fragments.
- **Module 21: Skills.** Where you codify the navigation and safe-change playbooks this module
  teaches, so you don't re-explain them every session.

---
@@ -34,13 +34,13 @@ By the end of this module you can:

1. Give an AI enough **factual, verifiable context** about a large repo to be useful in it, instead
   of letting it work from a few pasted fragments.
2. Have the AI **map and explain** an unfamiliar area (architecture, entry points, where things
   live) and verify that map against the actual files *before* anything is touched.
3. Scope a change down to the **smallest reviewable diff** that solves the problem, and refuse the
   sweeping rewrite the AI will happily offer.
4. Use **MCP (Module 20)** to give the AI real access to the code and surrounding tools, and
   **skills (Module 21)** to make your navigation and safe-change process repeatable.
5. Make one **small, scoped, tested, reviewable** change to a codebase you didn't write, and know
   why it's safe.

---
@@ -75,21 +75,21 @@ real files, and force every change to stay small and reviewable.**

Three phases, strictly in order. Skipping ahead is the mistake.

**1. Orient: establish ground truth before any opinion.** Before the AI gets to reason about the
codebase, give it facts it can't hallucinate: the actual file list, the real entry points, the
languages by volume, the build and test commands, the biggest files (often the spine of the system),
the recent commit history. This is mechanical and cheap; a script produces it (the lab's `orient.py`
does exactly this). It anchors everything that follows in reality. You're not asking the AI "what is
this project?" cold; you're handing it the facts and asking it to *interpret* them.

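
A minimal sketch of the kind of facts an orientation script might collect. It is not the lab's `orient.py`; the function name `orientation_facts` and the `SKIP_DIRS` ignore list are illustrative assumptions, and a fuller version would also record the build/test commands and recent commit history.

```python
import os
from collections import Counter
from pathlib import Path

# Assumed ignore list; adjust for the repo you are orienting in.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def orientation_facts(root, top_n=5):
    """Collect mechanical facts about a repo: file count, bytes per extension, biggest files."""
    root = Path(root)
    bytes_by_ext = Counter()
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # broken symlink or unreadable file
            bytes_by_ext[path.suffix or "(none)"] += size
            sizes.append((size, path.relative_to(root).as_posix()))
    # Largest first; ties broken by path so the output is stable.
    sizes.sort(key=lambda item: (-item[0], item[1]))
    return {
        "file_count": len(sizes),
        "bytes_by_extension": bytes_by_ext.most_common(),
        "biggest_files": [rel for _, rel in sizes[:top_n]],
    }
```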
**2. Map: explain the area before touching it.** Now the AI builds a mental model, and the only
acceptable model is one **traced through real files with citations.** Don't accept "the request
flows through the controller layer." Demand: "trace one request from entry point to response, naming
each file it passes through." The deliverable is an architecture summary plus a "where things live"
table, and crucially a list of **open questions the code didn't answer.** A map with honest gaps is
trustworthy. A map with no gaps is fiction. This phase is **read-only**; nothing changes on disk.

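
Part of checking a cited map can be mechanical. The sketch below assumes the AI was asked to write citations as backticked `path:line` references, which is a hypothetical convention and not something the module prescribes. It only confirms that each cited file exists and is long enough; whether the file says what the map claims still needs a human read.

```python
import re
from pathlib import Path

# Assumed citation format: `relative/path.ext:LINE` inside backticks.
CITATION = re.compile(r"`([\w./-]+):(\d+)`")


def check_citations(summary, root):
    """Return a list of citations in `summary` that cannot point at a real line under `root`."""
    problems = []
    for rel, line_no in CITATION.findall(summary):
        path = Path(root) / rel
        if not path.is_file():
            problems.append(f"{rel}: file not found")
            continue
        line_count = len(path.read_text(errors="replace").splitlines())
        if not 1 <= int(line_no) <= line_count:
            problems.append(f"{rel}:{line_no}: out of range (file has {line_count} lines)")
    return problems
```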
**3. Change: the smallest scoped, tested, reviewable diff.** Only now do you edit. One change, one
branch (Module 6). Find the blast radius first (every caller of what you're touching); if you
can't enumerate the callers, you're not ready. Make the minimal edit, add a test that fails without it,
run the *full* existing suite, and self-review the diff like it's someone else's PR (Module 10). No

@@ -114,12 +114,12 @@ between pastes. **MCP (Module 20) gives the AI real, structured access to the co

around it** so it can navigate on its own instead of waiting for you to feed it fragments. The kinds
of access that turn a guessing model into a grounded one:

- **The filesystem and code search**, so it can grep for every caller of a function instead of
  assuming it found them all.
- **Language-server intelligence** (go-to-definition, find-references, type info) so "where is this
  used?" is answered by the toolchain, not by the model's guess.
- **The surrounding systems**: the issue tracker (Module 9), CI results (Module 14), the running
  app's logs, so the AI maps the code *and* the context it lives in.

The orientation pack is the cold-start. MCP is how the AI keeps the map accurate as it digs, by
pulling real answers from real tools instead of inferring them.

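
To make "every caller of a function" concrete, here is a rough sketch of a call-site search, assuming the target repo is Python. The helper name `find_call_sites` is hypothetical. It matches by name only, so it over-reports unrelated functions that share the name and misses dynamic dispatch; a language server's find-references is the more reliable source when one is available.

```python
import ast
from pathlib import Path


def find_call_sites(root, func_name):
    """List (relative_path, line) for every call whose callee is named `func_name`."""
    hits = []
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files this interpreter cannot parse
        for node in ast.walk(tree):
            if not isinstance(node, ast.Call):
                continue
            callee = node.func
            # Plain calls expose .id (ast.Name); method or module calls expose .attr (ast.Attribute).
            name = getattr(callee, "id", None) or getattr(callee, "attr", None)
            if name == func_name:
                hits.append((path.relative_to(root).as_posix(), node.lineno))
    return sorted(hits)
```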
@@ -127,13 +127,13 @@ pulling real answers from real tools instead of inferring them.

### Where skills earn their place (Module 21)

The orient/map/change motion is the same on every repo. That makes it a perfect candidate for a
**skill (Module 21)**: a committed, reusable playbook so you don't re-explain "map before you touch,
cite real files, keep the diff small" every single session. This module ships two starter skills in
`lab/skills/`:

- **`map-this-repo`**: the read-only navigation playbook. Orient, find entry points, trace one path
  end to end, and produce a cited architecture summary with honest open questions.
- **`safe-change`**: the safe-change playbook. Branch first, find the blast radius, baseline the
  tests, make the minimal edit, cover it, and self-review, with a set of **stop conditions** that
  tell the AI to escalate to a human instead of pushing on.

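
One stop condition of this kind could be checked mechanically. The sketch below is an assumption about what such a check might look like, not the contents of the `safe-change` skill: the function name and both thresholds are hypothetical. It expects the text printed by `git diff --numstat`, which emits one tab-separated `added deleted path` row per changed file.

```python
def diff_stop_conditions(numstat_text, max_files=3, max_lines=80):
    """Return reasons to stop and escalate; an empty list means the diff is within limits."""
    files = 0
    lines = 0
    for row in numstat_text.strip().splitlines():
        added, deleted, _path = row.split("\t", 2)
        files += 1
        # Binary files are reported as "-" for both counts; count them as files only.
        if added != "-":
            lines += int(added) + int(deleted)
    reasons = []
    if files > max_files:
        reasons.append(f"{files} files changed (limit {max_files})")
    if lines > max_lines:
        reasons.append(f"{lines} lines changed (limit {max_lines})")
    return reasons
```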
@@ -163,7 +163,7 @@ into a revertable diff.

## Hands-on lab

**Lab language:** shell + the provided Python script (`orient.py`); you run it, you don't write it.
This lab does **not** use `tasks-app`; the entire point is a codebase you *didn't* write.

**You'll need:**

@@ -172,14 +172,14 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di

- A real, small-to-medium open-source repo to clone. Pick something with **tests** and a clear
  build/test command, in a language you can at least read. Good traits: a few thousand lines, an
  obvious entry point, a documented install (`pip install -e .`, `npm install`, `go mod download`,
  …), and a test suite that **goes green on a clean clone after that documented install**. Confirm
  that before you rely on it as a baseline. (Avoid giant frameworks for a first run; you want a
  system you can't fully hold in your head, but whose test suite finishes in under a minute.)
  **First time? Pick a small Python repo**, so the Module 13 testing toolchain you already have
  transfers with the least friction.
- The starter files from this module's `lab/` folder: `orient.py` and `skills/`.

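
Confirming the baseline can be scripted. This is a sketch under assumptions: `baseline` is a hypothetical helper, and the test command is whatever the repo documents (for a Python repo it might be `["pytest", "-q"]`). It records whether the suite is green and whether it finishes within the one-minute budget suggested above.

```python
import subprocess
import sys
import time


def baseline(test_command, cwd=".", limit_seconds=60.0):
    """Run the repo's test command once and report pass/fail plus wall-clock duration."""
    start = time.monotonic()
    result = subprocess.run(test_command, cwd=cwd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return {
        "green": result.returncode == 0,   # exit code 0 is the usual "all tests passed" signal
        "seconds": round(elapsed, 2),
        "fast_enough": elapsed < limit_seconds,
    }
```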
### Part A: Clone and orient

1. Clone your chosen repo and copy `orient.py` into its root:

@@ -191,23 +191,23 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di
```

2. Read `ORIENT.md` yourself first. In 30 seconds you should know the language, the likely entry
   point, the probable test command, and which files are biggest. These are **facts**; the AI can't
   argue with them. (Don't commit `ORIENT.md`; it's scratch context.)

### Part B: Map before you touch (read-only)

3. Start a fresh AI session, load the `map-this-repo` skill (`lab/skills/map-this-repo.md`) or paste
   it as instructions, and give it `ORIENT.md` as the opening context.

4. Ask it to produce the architecture summary: what the project does, a "where things live" table,
   the confirmed build/test command, and a traced path for one real operation end to end,
   **with every claim citing a real file.** Demand the list of open questions it couldn't resolve.

5. **Verify the map.** Open two or three files it cited and confirm they say what it claimed. This is
   the step everyone wants to skip and the one that catches the confident-but-wrong map. If a
   citation doesn't hold up, the map is suspect; push back and make it re-trace.

### Part C: One small, scoped, tested change

6. Pick a genuinely small change: a clearer error message, a fixed edge case, a tiny missing
   validation, a documented-but-unhandled input. Something a single function owns. Now load the

@@ -256,10 +256,10 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di

  architecture summary for a repo it half-read. Fluency is not correctness. The citation-checking in
  Part B isn't optional ceremony; it's the only thing standing between you and changing code based on
  a fiction. Verify at least a few claims by hand, every time.
- **The context window is a hard ceiling.** On a genuinely large monorepo, the AI cannot see everything,
  and it usually won't *tell* you what it didn't read. Its map is only as good as the slice it
  actually loaded. MCP-backed search and language-server tools (Module 20) shrink this problem by
  letting it fetch on demand, but they don't erase it; treat "I've reviewed the whole codebase" as
  a claim to distrust.
- **"Small change" can hide a big blast radius.** A one-line edit to a heavily called function can
  ripple through code you never opened. The blast-radius search in the `safe-change` skill is the

@@ -273,7 +273,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di

  "match local conventions" rule help, but you'll still catch drift in review.
- **Some changes shouldn't be a small diff.** A genuine architectural problem won't be fixed by the
  smallest-possible edit, and forcing it to be makes things worse. This module's discipline is for
  the common case: a scoped change in a system you don't own. Recognizing when a change is actually
  a *project* (and escalating it as one) is its own judgment call the tooling won't make for you.

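
For a feel of the context-window ceiling mentioned in this list, a back-of-envelope estimate helps. The sketch assumes roughly four characters per token, a crude rule of thumb that varies by tokenizer and by language; the function names and the skip list are hypothetical. Treat the result as an order of magnitude, not a measurement.

```python
import os
from pathlib import Path


def estimate_tokens(root, chars_per_token=4.0, skip_dirs=(".git", "node_modules")):
    """Roughly estimate how many tokens the text files under `root` would occupy."""
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so they are never walked.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            try:
                text = Path(dirpath, name).read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            total_chars += len(text)
    return int(total_chars / chars_per_token)


def fits_in_context(root, context_tokens):
    """True if the whole repo could plausibly be loaded at once under the estimate above."""
    return estimate_tokens(root) <= context_tokens
```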
---
@@ -283,7 +283,7 @@ This lab does **not** use `tasks-app` — the entire point is a codebase you *di

**You're done when:**

- You can hand an AI a factual orientation pack and get back an architecture summary whose citations
  you've **personally verified** against the real files, including the open questions it couldn't
  resolve.
- You've made one change to a codebase you didn't write that is on its own branch, covered by a test
  that fails without it, passing the full existing suite, and whose `git diff` is *exactly* the

@@ -305,11 +305,11 @@ This is an expansion-zone module; the durable motion is stable, but the tooling

- [ ] Confirm `orient.py` runs unchanged on current Python (3.10+) and a freshly cloned repo on
      macOS, Linux, and Windows (git-bash / PowerShell).
- [ ] Re-check the MCP capabilities cited (filesystem, code search, language-server intelligence,
      issue/CI/log access) against what's actually common in the current MCP ecosystem; the menu of
      available servers changes fast. Keep it described as capabilities, not specific products.
- [ ] Verify the cross-references still point to the right modules if any renumbering happened
      (2, 4, 6, 9, 10, 12, 13, 14, 20, 21).
- [ ] Re-confirm the `SIGNALS`/`TEST_HINTS` tables in `orient.py` still reflect common manifests and
      test runners; add any that have become standard, but keep it language-agnostic.
- [ ] Sanity-check the suggested "small-to-medium repo with a fast test suite" lab guidance still
      lands; recommend nothing by name that could rot.