Use python3 as the canonical command name course-wide (#104)

Most current systems (default Debian/Ubuntu, recent macOS) install Python
only as `python3`, with no bare `python` on PATH, so learners who copied
`python cli.py ...` into their host shell hit "command not found".

Convert host-shell `python <cmd>` -> `python3 <cmd>` across module/lab
READMEs, lab `.py` docstrings & usage strings, blog posts, lab prompt and
instruction files, the M04 verify.sh message, and the M10/M24 lab patches.
Module 01's convention note (and its blog/02 mirror) is rewritten so
`python3` is canonical and `python` is the documented fallback.
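The "canonical name, documented fallback" convention can be sketched as a small lookup. This is an illustration only, not code from the course: `pick_python` is a hypothetical helper, and the only library call it relies on is the standard `shutil.which`.

```python
import shutil


def pick_python(which=shutil.which):
    """Return the interpreter name to type, preferring python3.

    Hypothetical helper: `which` is injectable so the lookup can be
    exercised without depending on the real PATH.
    """
    for name in ("python3", "python"):  # canonical name first, fallback second
        if which(name):
            return name
    return None  # neither name is on PATH


if __name__ == "__main__":
    print(pick_python())
```

On a system that ships only `python3`, the first candidate is found and the fallback is never consulted.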

Stop-lines respected: Docker image tags (`python:3.12-slim`), `.venv/.../python`
and `...\.venv\Scripts\python.exe` paths, the M20 `"command": "python"`
teaching example and surrounding venv prose, container-internal invocations
(M16/M18 Dockerfiles, M16 README `docker run` examples), and CI-workflow
`run:` steps fed by `actions/setup-python` / `image: python:3.12` are left
as `python` on purpose.
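A minimal sketch of how a line-level rewrite could honor some of these stop-lines, assuming a regex pass over each line. The names `STOP_PATTERNS`, `BARE_PYTHON`, and `convert_line` are hypothetical and this is not the tooling used for this commit. Container-internal invocations and CI `run:` steps depend on surrounding context, so a per-line check like this cannot detect them and they would still need manual review.

```python
import re

# Hypothetical stop-line patterns: any line matching one is left untouched.
STOP_PATTERNS = [
    re.compile(r"python:\d"),              # Docker image tags, e.g. python:3.12-slim
    re.compile(r"\.venv[/\\]"),            # virtualenv interpreter paths
    re.compile(r'"command":\s*"python"'),  # the M20 teaching example
]

# A bare `python` followed by whitespace, not part of a path or a longer word.
BARE_PYTHON = re.compile(r"(?<![\w./\\-])python(?=\s)")


def convert_line(line):
    """Rewrite host-shell `python <cmd>` to `python3 <cmd>` unless a stop-line applies."""
    if any(p.search(line) for p in STOP_PATTERNS):
        return line
    return BARE_PYTHON.sub("python3", line)


if __name__ == "__main__":
    for sample in (
        "python run_eval.py candidates/current_model",
        "FROM python:3.12-slim",
        ".venv/bin/python cli.py",
    ):
        print(convert_line(sample))
```

Because the match requires whitespace after `python`, lines that already say `python3` pass through unchanged, so the pass is safe to run more than once.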

pip was left out of scope: most occurrences are prose or CI/container-internal,
and `pip3` does not fix the PEP 668 externally-managed-environment refusal that
the course already addresses with venvs. The M01 note is worded to stay
consistent with bare `pip` (use whichever pip pairs with your Python).

Build (tools/build_wiki.py) and tools/check.sh both pass.

Closes #104

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01GAEzanEoGJT5o1VizQar47
2026-06-23 20:18:04 -04:00
parent 7f439212ac
commit 3221f7abe8
102 changed files with 380 additions and 378 deletions
+6 -6
@@ -245,7 +245,7 @@ output.
```bash
cd modules/27-evals/lab
-python run_eval.py candidates/current_model
+python3 run_eval.py candidates/current_model
echo "exit code: $?"
```
@@ -258,7 +258,7 @@ output.
2. Now simulate the swap: run the *exact same eval set* against the other candidate:
```bash
-python run_eval.py candidates/swapped_model
+python3 run_eval.py candidates/swapped_model
echo "exit code: $?"
```
@@ -275,7 +275,7 @@ output.
yourself and read the scorecard:
```bash
-python run_eval.py candidates/my_run_1
+python3 run_eval.py candidates/my_run_1
```
4. Now actually swap something. Either change the model Claude Code uses, or change the *prompt* (ask
@@ -309,10 +309,10 @@ output.
`working-directory:` line makes the CI job `cd` into the lab folder first, so the
`candidates/...` path and `run_eval.py`'s own `from eval_set import CASES` resolve exactly as they
did on your machine. (Drop it and point a repo-root job straight at
-`python modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
+`python3 modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
won't exist from the repo root: the gate crashes with a *false* failure, which is worse than no
gate. If the agent prefers a single line, it can spell both paths out from the repo root:
-`python modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
+`python3 modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
--threshold 1.0`.)
Below threshold exits non-zero and the pipeline blocks, exactly like a failing test. The guardrail
@@ -388,5 +388,5 @@ This is an expansion-zone module over fast-moving ground. Re-check at build/publ
- [ ] **Module cross-references.** Confirm Modules 13, 14, 10, and 24–26 still carry the
responsibilities referenced here (tests, CI gating, review, the agent autonomy ladder) and that
none were renumbered.
-- [ ] **Lab still runs.** `python run_eval.py candidates/current_model` exits 0 at 100%, and
+- [ ] **Lab still runs.** `python3 run_eval.py candidates/current_model` exits 0 at 100%, and
`candidates/swapped_model` exits 1 below threshold, on a current Python 3.x.
+1 -1
@@ -14,7 +14,7 @@ than pretending. NOTHING here pins a provider.
EVAL_JUDGE_MODEL # the model name to ask for
Run it standalone to grade one sample:
-python llm_judge.py "Add count command" "fix"
+python3 llm_judge.py "Add count command" "fix"
"""
import json
+3 -3
@@ -1,9 +1,9 @@
"""Run the eval set against one candidate and print a scorecard.
Usage:
-python run_eval.py candidates/current_model
-python run_eval.py candidates/swapped_model
-python run_eval.py candidates/current_model --threshold 0.9
+python3 run_eval.py candidates/current_model
+python3 run_eval.py candidates/swapped_model
+python3 run_eval.py candidates/current_model --threshold 0.9
A "candidate" is a directory containing a tasks.py that an agent produced. The
runner imports that tasks.py, runs every case in eval_set.py against it, prints