fix(testing/ci/tooling): consistent unittest, venv guidance, runnable lab commands

- #9: standardize the test chain on stdlib unittest (nothing-to-install, which
  keeps M13's claims true and its planted bug intact). Aligned M5/M14/M16 prose,
  M14 lab/test_tasks.py, and ci/gitlab starters; ruff stays the only pip install.
- #20: add venv / PEP 668 / which-python guidance to M20 (+ M14/M15 local
  installs); point MCP config at the venv's absolute python.
- #21: replace M21 Part D's empty `git diff HEAD~1` with `git log -p` (no
  .gitignore added — device preserved).
- #22: add a dependency-install step before M23's green baseline on a fresh clone.
- #23: M24 reviewer/triage now tolerate code-fence-wrapped JSON (stdlib only);
  feature.patch trap untouched.
- #28: fix M27 Part D CI snippet path (working-directory) and require the gate to
  target a varying candidate; swapped_model regression kept as the fixture.

Closes #9
Closes #20
Closes #21
Closes #22
Closes #23
Closes #28

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
This commit is contained in:
2026-06-22 16:07:47 -04:00
parent a6a3cfdc50
commit f98eacb196
17 changed files with 216 additions and 82 deletions
+11 -10
View File
@@ -47,8 +47,8 @@ committed instructions file from the repo, and you control what's in it.**
> repo-root config file). Some tools even read more than one filename — point them all at the same
> content if so. The principle outlives any one vendor's filename.
Without this file, you re-explain your project every session: "we use 4-space indent," "the tests are
`pytest`, run them before you say you're done," "don't touch the generated `tasks.json`." You say it,
Without this file, you re-explain your project every session: "we use 4-space indent," "run the tests
with `python -m unittest` before you say you're done," "don't touch the generated `tasks.json`." You say it,
the AI complies, the session ends, the memory evaporates (Module 1's second seam), and tomorrow you
say it all again. The instructions file is where that knowledge stops being something you retype and
becomes something the project *carries*.
@@ -62,8 +62,8 @@ a briefing for an agent that will edit this code. Keep it to what changes the AI
uses. "Core logic lives in `tasks.py`; the CLI front end is `cli.py`; state persists to
`tasks.json`."
- **Build and test commands** — the exact commands, copy-pasteable. "Run the app with
`python cli.py <command>`. Run tests with `pytest`. Don't claim a change works until the tests
pass." This single line stops the AI from inventing a test runner you don't use.
`python cli.py <command>`. Run tests with `python -m unittest`. Don't claim a change works until
the tests pass." This single line stops the AI from inventing a test runner you don't use.
- **Coding standards** — formatting, typing, error handling, the libraries you do and don't want.
"Use the standard library only — no third-party packages. Type-hint public functions."
- **"Don't touch these files."** — the off-limits list. Generated files, vendored code, secrets,
@@ -83,7 +83,7 @@ useful for personal preferences, but it's the wrong home for project knowledge,
lives: on *your* laptop, invisible to everyone else.
Picture a two-person project with no committed instructions file. You've trained your local setup to
run `pytest` and avoid `tasks.json`. Your teammate's setup hasn't — their agent reformats whole files
run `python -m unittest` and avoid `tasks.json`. Your teammate's setup hasn't — their agent reformats whole files
and hand-edits the generated JSON. You're both "using AI on the same repo," but you're getting
different behavior, and neither of you can see the other's configuration. That's **drift**: the same
codebase, diverging because the rules live in two heads instead of one file.
@@ -176,7 +176,8 @@ editor-integrated AI (Module 4) for the part where the AI obeys the file.
- The `tasks-app` repo from Module 2 (already a Git repo with some history).
- Your agentic coding tool from Module 4, and knowledge of which filename it reads for repo-level
instructions (check its docs — see the note in *Key concepts*).
- Optionally `pytest` (`pip install pytest`) so the AI has a real test command to honor.
- Optionally, a test command for the AI to honor — Python's built-in `python -m unittest` works with
nothing to install (you'll write a real suite in Module 13; until then it simply reports no tests).
### Part A — Write the instructions file
@@ -192,8 +193,8 @@ editor-integrated AI (Module 4) for the part where the AI obeys the file.
2. Open it in your editor and make it true for *your* project. The starter is filled in for the
`tasks-app`, but read every line and confirm it matches reality — wrong instructions are worse
than none. At minimum, set the real test command (or delete the line if you didn't install
`pytest`).
than none. At minimum, set the real test command (or delete the line if you don't have tests
yet).
3. Commit it. This is the point of the whole module:
@@ -272,8 +273,8 @@ Be honest about what a committed instructions file does and doesn't buy you:
- **Bloat kills it.** A 300-line instructions file is read the way *you* read a 300-line terms-of-
service: not really. Every line you add dilutes the rest. Keep it to what actually changes behavior,
and prune lines the model already honors without being told.
- **Stale instructions are worse than none.** A file that says "tests are `pytest`" after you've moved
to something else will actively misdirect the AI. The file is code-adjacent — it has to be
- **Stale instructions are worse than none.** A file that says "run the tests with `python -m
unittest`" after you've switched to a different runner will actively misdirect the AI. The file is code-adjacent — it has to be
maintained like code, and reviewed like code. That's exactly why committing it (so changes are
visible) matters.
- **The team payoff isn't here yet.** On a solo local repo, the "no more drift between teammates"
@@ -26,7 +26,7 @@ minute but real enough to have more than one file. Keep it that way — don't gr
## Build and test commands
- Run the app: `python cli.py <command>` (e.g. `python cli.py list`).
- Run the tests: `pytest` <!-- EDIT: set this to your real test command, or delete if you have no tests yet -->
- Run the tests: `python -m unittest` <!-- EDIT: set this to your real test command, or delete if you have no tests yet -->
- Do not claim a change works until you have actually run it. If tests exist, they must pass first.
## Coding standards