fix(testing/ci/tooling): consistent unittest, venv guidance, runnable lab commands

- #9: standardize the test chain on stdlib unittest (nothing-to-install, which
  keeps M13's claims true and its planted bug intact). Aligned M5/M14/M16 prose,
  M14 lab/test_tasks.py, and ci/gitlab starters; ruff stays the only pip install.
- #20: add venv / PEP 668 / which-python guidance to M20 (+ M14/M15 local
  installs); point MCP config at the venv's absolute python.
- #21: replace M21 Part D's empty `git diff HEAD~1` with `git log -p` (no
  .gitignore added — device preserved).
- #22: add a dependency-install step before M23's green baseline on a fresh clone.
- #23: M24 reviewer/triage now tolerate code-fence-wrapped JSON (stdlib only);
  feature.patch trap untouched.
- #28: fix M27 Part D CI snippet path (working-directory) and require the gate to
  target a varying candidate; swapped_model regression kept as the fixture.

Closes #9
Closes #20
Closes #21
Closes #22
Closes #23
Closes #28

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
2026-06-22 16:07:47 -04:00
parent a6a3cfdc50
commit f98eacb196
17 changed files with 216 additions and 82 deletions
+22 -13
@@ -78,8 +78,8 @@ Almost every CI configuration, on every forge, is the same four moves:
4. **Run the checks** — lint, then test. Any check that exits non-zero fails the whole run.
That last point is the load-bearing one. CI's entire enforcement mechanism is the **exit code**.
-Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `pytest` exits
-non-zero if a test fails. `ruff check` exits non-zero if it finds a lint problem. CI runs your
+Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `python -m
+unittest` exits non-zero if a test fails. `ruff check` exits non-zero if it finds a lint problem. CI runs your
commands and watches those exit codes; one failure turns the run red. You're not learning a new
testing system — you're wiring the tools you already have to a trigger.
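The exit-code rule can be sketched in a few lines of Python. This is a rough illustration of what a runner does, not how any particular forge implements it. `run_checks` and the stand-in commands are hypothetical, chosen so the example needs nothing beyond the interpreter itself:

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order; return the first non-zero exit code, or 0."""
    for command in commands:
        print("$ " + " ".join(command))
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop at the first failure, the way a CI job skips later steps.
            return result.returncode
    return 0


# Hypothetical stand-ins for "lint" and "test": the second one exits with 3,
# so the third command is never started.
stand_ins = [
    [sys.executable, "-c", "print('lint ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never runs')"],
]
exit_code = run_checks(stand_ins)
print("run result:", "green" if exit_code == 0 else "red")
```

Swapping the stand-ins for `["ruff", "check", "."]` and `[sys.executable, "-m", "unittest"]` would give the same behaviour with the real checks, assuming `ruff` is on the PATH.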
@@ -125,18 +125,19 @@ jobs:
with:
python-version: "3.12"
- name: Install tools
-run: pip install pytest ruff
+run: pip install ruff
- name: Lint
run: ruff check .
- name: Test
-run: pytest -q
+run: python -m unittest
```
Reading it top to bottom: `on:` is the trigger (push and pull request). `runs-on:` picks the clean
machine. The `steps:` are the four moves — checkout, set up Python, install the tools, then the two
checks. `uses:` pulls in a pre-built action (someone else's reusable step); `run:` is just a shell
command. The linter runs first because it's cheap; the tests run last because they're the
-expensive, decisive check.
+expensive, decisive check. Only the linter needs a `pip install` here — the tests run on Python's
+standard-library `unittest` runner from Module 13, so there's nothing to install for them.
This file lives *in the repo*, committed and versioned like everything else. That's deliberate and
on-thesis: your pipeline is code, it's reviewed as a diff in a PR (Module 10), and a teammate or an
@@ -151,9 +152,9 @@ When CI goes red, the skill is triage, and it's fast once you know the shape:
2. **The first red step is the cause.** Steps run in order and stop at the first failure; everything
after it is skipped, not broken. Don't get distracted by the skipped steps.
3. **Read that step's log.** It's the same output the tool prints in your terminal — a failing
-`pytest` assertion, a `ruff` finding with a file and line number. CI didn't invent a new error
+`unittest` assertion, a `ruff` finding with a file and line number. CI didn't invent a new error
format; it's showing you the command's own output.
-4. **Reproduce it locally.** Run the exact command from the failed step (`pytest -q` or
+4. **Reproduce it locally.** Run the exact command from the failed step (`python -m unittest` or
`ruff check .`) on your machine. It will fail the same way, because CI ran the same command. Fix
it locally, confirm it's green locally, push again.
@@ -225,14 +226,21 @@ your machine first.
```bash
cd ~/workflow-course/tasks-app
-pip install pytest ruff
-pytest -q # should report all tests passing
-ruff check . # should report no issues (or fix what it flags)
+pip install ruff
+python -m unittest # should report all tests passing
+ruff check . # should report no issues (or fix what it flags)
```
If both are clean locally, CI will be green. If not, fix it here — it's faster than waiting on a
runner.
+> **If `pip install` is refused** with "externally-managed-environment" (PEP 668 — common on
+> recent Debian/Ubuntu and Homebrew Python), install into a per-project virtual environment
+> instead: `python3 -m venv .venv && source .venv/bin/activate` (Windows:
+> `.venv\Scripts\activate`), then re-run `pip install ruff`. Only the linter needs installing — the
+> stdlib `unittest` runner needs nothing. (`pipx` or `pip install --break-system-packages` also
+> work; a venv is the clean default.)
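If it is unclear which interpreter is active after following the note above, Python can report it directly. A minimal sketch, assuming the environment was created with the standard `venv` module; `in_virtualenv` is a hypothetical helper name, not something from the course files:

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the Python installation the venv was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
```

Run it with the same `python` you use for `python -m unittest`. After activating `.venv`, the second line should read `True`, and `python -m pip install ruff` will then install into that environment.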
### Part B — Add the workflow and watch it pass
2. Put the workflow where your forge looks for it:
@@ -288,7 +296,7 @@ and watch CI stop it.
bad one, instead of rewriting history other people may have pulled.
```bash
-pytest -q # fails locally too — same command, same failure
+python -m unittest # fails locally too — same command, same failure
git revert HEAD # new commit that undoes "Simplify pending()" (Module 12)
git push # CI re-runs on the fixed code and goes green again
```
@@ -371,5 +379,6 @@ Re-check at build time:
- [ ] **Forge UI labels.** The tab names in the lab ("Actions," "CI/CD," "Pipelines") and the
workflow file locations (`.github/workflows/`, `.gitlab-ci.yml`, `.forgejo/`, `.gitea/`) match
what the current forge versions actually use.
-- [ ] **Tool names.** The example linter and test runner (`ruff`, `pytest`) are current, installable,
-and still behave as described — or swap in the equivalents the rest of the course uses.
+- [ ] **Tool names.** The example linter (`ruff`) is current, installable, and still behaves as
+described — or swap in the equivalent the rest of the course uses. (The test runner is Python's
+standard-library `unittest`, which ships with Python — no install, nothing to drift.)
@@ -33,14 +33,16 @@ jobs:
with:
python-version: "3.12"
-# Step 3: install the tools the checks need — the test runner and the linter from Module 13.
+# Step 3: install the linter (ruff), the new tool this module adds. The test runner is
+# Python's standard-library unittest from Module 13 — nothing to install for it.
- name: Install tools
-run: pip install pytest ruff
+run: pip install ruff
# Step 4: lint. Style and obvious-mistake check. Fails the job on any finding (non-zero exit).
- name: Lint
run: ruff check .
-# Step 5: test. The Module 13 tests. A single failing assertion fails the whole job.
+# Step 5: test. The Module 13 tests, run with the stdlib unittest runner. A single failing
+# assertion fails the whole job.
- name: Test
-run: pytest -q
+run: python -m unittest
@@ -17,6 +17,6 @@ check:
# of "runs-on: ubuntu-latest" plus "set up Python".
image: python:3.12
script:
-- pip install pytest ruff
-- ruff check . # lint
-- pytest -q # test
+- pip install ruff
+- ruff check . # lint
+- python -m unittest # test (stdlib runner from Module 13 — nothing to install)
@@ -1,36 +1,41 @@
"""Tests for the tasks-app core logic — the kind of suite Module 13 has you write.
Reproduced here so this module's lab is self-contained: if you already wrote tests in Module 13,
-use those instead. Run locally with `pytest -q` from the project folder. CI runs exactly this.
+use those instead. Standard-library `unittest`, exactly like Module 13 — nothing to install.
+Run locally with `python -m unittest` from the project folder. CI runs exactly this.
"""
+import unittest
from tasks import TaskList
-def test_add_appends_a_task():
-    tl = TaskList()
-    tl.add("write the CI lesson")
-    assert len(tl.tasks) == 1
-    assert tl.tasks[0].title == "write the CI lesson"
-    assert tl.tasks[0].done is False
+class TestTaskList(unittest.TestCase):
+    def test_add_appends_a_task(self):
+        tl = TaskList()
+        tl.add("write the CI lesson")
+        self.assertEqual(len(tl.tasks), 1)
+        self.assertEqual(tl.tasks[0].title, "write the CI lesson")
+        self.assertFalse(tl.tasks[0].done)
+    def test_complete_marks_a_task_done(self):
+        tl = TaskList()
+        tl.add("ship it")
+        tl.complete(0)
+        self.assertTrue(tl.tasks[0].done)
+    def test_pending_excludes_completed_tasks(self):
+        tl = TaskList()
+        tl.add("a")
+        tl.add("b")
+        tl.complete(0)
+        pending = tl.pending()
+        self.assertEqual(len(pending), 1)
+        self.assertEqual(pending[0].title, "b")
+    def test_render_is_friendly_when_empty(self):
+        self.assertEqual(TaskList().render(), "(no tasks yet)")
-def test_complete_marks_a_task_done():
-    tl = TaskList()
-    tl.add("ship it")
-    tl.complete(0)
-    assert tl.tasks[0].done is True
-def test_pending_excludes_completed_tasks():
-    tl = TaskList()
-    tl.add("a")
-    tl.add("b")
-    tl.complete(0)
-    pending = tl.pending()
-    assert len(pending) == 1
-    assert pending[0].title == "b"
-def test_render_is_friendly_when_empty():
-    assert TaskList().render() == "(no tasks yet)"
+if __name__ == "__main__":
+    unittest.main()