Use python3 as the canonical command name course-wide (#104)
CI / check (pull_request) Successful in 7s
Most current systems (default Debian/Ubuntu, recent macOS) install Python only as `python3`, with no bare `python` on PATH, so learners who copied `python cli.py ...` into their host shell hit "command not found".

Convert host-shell `python <cmd>` -> `python3 <cmd>` across module/lab READMEs, lab `.py` docstrings & usage strings, blog posts, lab prompt and instruction files, the M04 verify.sh message, and the M10/M24 lab patches. Module 01's convention note (and its blog/02 mirror) is rewritten so `python3` is canonical and `python` is the documented fallback.

Stop-lines respected: Docker image tags (`python:3.12-slim`), `.venv/.../python` and `...\.venv\Scripts\python.exe` paths, the M20 `"command": "python"` teaching example and surrounding venv prose, container-internal invocations (M16/M18 Dockerfiles, M16 README `docker run` examples), and CI-workflow `run:` steps fed by `actions/setup-python` / `image: python:3.12` are left as `python` on purpose.

pip was left out of scope: most occurrences are prose or CI/container-internal, and `pip3` does not fix the PEP 668 externally-managed-environment refusal that the course already addresses with venvs. The M01 note is worded to stay consistent with bare `pip` (use whichever pip pairs with your Python).

Build (tools/build_wiki.py) and tools/check.sh both pass.

Closes #104

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01GAEzanEoGJT5o1VizQar47
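The conversion rule and its stop-lines can be pictured as a small line filter. This is only an illustrative sketch, not the tooling used for this commit: `STOP_PATTERNS`, `BARE_PYTHON`, and `convert_line` are hypothetical names, and the patterns cover just a few of the stop-lines listed above (container-internal invocations, for instance, cannot be recognized by a regex alone).

```python
import re

# Hypothetical stop-line patterns: a line matching any of them is left untouched.
STOP_PATTERNS = [
    re.compile(r"python:\d"),              # Docker image tags such as python:3.12-slim
    re.compile(r"\.venv[/\\]"),            # virtualenv interpreter paths
    re.compile(r'"command":\s*"python"'),  # the JSON config teaching example
]

# A bare `python` command word: not glued to a preceding path or word character,
# and followed by whitespace (so `python3`, `python.exe`, `python:3.12` never match).
BARE_PYTHON = re.compile(r"(?<![\w./\\:-])python(?=\s)")


def convert_line(line: str) -> str:
    """Rewrite host-shell `python <cmd>` to `python3 <cmd>` unless a stop-line applies."""
    if any(pattern.search(line) for pattern in STOP_PATTERNS):
        return line
    return BARE_PYTHON.sub("python3", line)


if __name__ == "__main__":
    for sample in ("python cli.py list", "FROM python:3.12-slim"):
        print(convert_line(sample))
```

Because `python3` is not followed by whitespace right after the word `python`, running the filter twice leaves already-converted lines unchanged.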
@@ -148,11 +148,13 @@ purpose** so you recognize it later.
 - Python 3.10 or newer (`python --version` or `python3 --version` to check).
 - Your usual AI chat assistant, open in a browser tab.

-> **One command name, the whole course through:** whichever of `python` / `python3` just printed a
-> 3.10+ version is the command to use in *every* lab from here on. The labs are written with
-> `python`; if that's "command not found" on your machine (common on current macOS and default
-> Debian/Ubuntu, where Python is installed only as `python3`), read it as `python3` (and `pip3`
-> wherever a lab uses `pip`). This note holds course-wide; we won't repeat it.
+> **One command name, the whole course through:** the labs are written with `python3`, the command
+> name current macOS and default Debian/Ubuntu actually ship (they install Python only as `python3`,
+> with no bare `python` on PATH). Run `python3 --version`; if it prints a 3.10+ version, use `python3`
+> in *every* lab from here on. If `python3` is "command not found" but `python --version` shows a
+> 3.10+ version (older or some Windows setups), read every `python3` in the labs as `python` instead.
+> Where a lab runs `pip`, use whichever pairs with your Python (`pip3` commonly goes with `python3`).
+> This note holds course-wide; we won't repeat it.

 ### Get the course materials

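The check the rewritten note describes (prefer `python3`, fall back to `python` only when it reports 3.10+) can be sketched in a few lines. This is an illustration, not part of the course materials: `pick_python`, `parse_version`, and `MIN_VERSION` are hypothetical names.

```python
import shutil
import subprocess
from typing import Optional, Tuple

MIN_VERSION = (3, 10)  # the course's stated minimum


def parse_version(text: str) -> Tuple[int, int]:
    """Turn a `--version` line such as 'Python 3.12.4' into (3, 12)."""
    number = text.strip().split()[-1]
    major, minor = number.split(".")[:2]
    return int(major), int(minor)


def pick_python(candidates=("python3", "python")) -> Optional[str]:
    """Return the first command name on PATH that reports MIN_VERSION or newer."""
    for name in candidates:
        if shutil.which(name) is None:
            continue  # the shell would say "command not found"
        result = subprocess.run([name, "--version"], capture_output=True, text=True)
        output = result.stdout or result.stderr
        try:
            if parse_version(output) >= MIN_VERSION:
                return name
        except (ValueError, IndexError):
            continue  # unparseable output: try the next candidate
    return None


if __name__ == "__main__":
    print(pick_python() or "no Python 3.10+ found on PATH")
```

The candidate order mirrors the note: `python3` is tried first, and bare `python` is only the fallback.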
@@ -193,8 +195,8 @@ You now have every module's files locally, including this one's under
 3. Run it in your terminal to confirm it works:

 ```bash
-python cli.py add "finish module 1"
-python cli.py list
+python3 cli.py add "finish module 1"
+python3 cli.py list
 ```

 You should see your task listed. **This is your "real local project, an editor, and a terminal."**
@@ -205,14 +207,14 @@ You now have every module's files locally, including this one's under
 Now reproduce each failure deliberately. Keep the AI strictly in the **browser chat**; no
 editor-integrated tools yet (those arrive in Module 4). This is the "before" picture on purpose.

-1. **Seam 1 (multiple files).** First mark a task done so there's something to hide. Run `python
-cli.py done 0`, then `python cli.py list` shows it as `[x]`. Now paste *only* `cli.py` into your
+1. **Seam 1 (multiple files).** First mark a task done so there's something to hide. Run `python3
+cli.py done 0`, then `python3 cli.py list` shows it as `[x]`. Now paste *only* `cli.py` into your
 chat and ask: *"Make the `list` command hide tasks that are already done."* Apply whatever it
-gives you and run `python cli.py list`. The clean version of this change lives in `tasks.py`, the
+gives you and run `python3 cli.py list`. The clean version of this change lives in `tasks.py`, the
 file you *didn't* paste: open it and you'll see `render()` already owns the `[x]`/`[ ]`
 box-and-index formatting, and a `pending()` helper already returns exactly the not-done tasks. But
 the chat never saw that file, so it had to do one of two things. Either it guessed at methods it
-couldn't see (and `python cli.py list` errors out), or it reached into the raw task list and
+couldn't see (and `python3 cli.py list` errors out), or it reached into the raw task list and
 *re-created* that box-and-index formatting inside `cli.py`, duplicating logic that already existed
 one file over. Either way, *you* had to be the one who knew the change really belonged in the
 other file.
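To make this seam concrete, here is a guess at the shape of the file the chat never saw. It is a sketch under assumptions, not the course's actual `tasks.py`: only the names `render()` and `pending()` come from the text above, while the `Task` fields, the `hide_done` parameter, and the exact output format are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    title: str
    done: bool = False


@dataclass
class TaskList:
    tasks: List[Task] = field(default_factory=list)

    def pending(self) -> List[Task]:
        """Return only the tasks that are not done yet."""
        return [task for task in self.tasks if not task.done]

    def render(self, hide_done: bool = False) -> str:
        """Own the box-and-index formatting in one place.

        Indexes are kept from the full list, so `done <index>` still
        refers to the same task when finished tasks are hidden.
        """
        lines = []
        for index, task in enumerate(self.tasks):
            if hide_done and task.done:
                continue
            box = "[x]" if task.done else "[ ]"
            lines.append(f"{index}. {box} {task.title}")
        return "\n".join(lines)


if __name__ == "__main__":
    demo = TaskList([Task("finish module 1", done=True), Task("set up my editor")])
    print(demo.render(hide_done=True))
```

With a layout like this, hiding done tasks is a one-line change where `cli.py` calls `render()`, which is exactly the change a chat that only saw `cli.py` could not know to make.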
@@ -251,7 +253,7 @@ Be honest about the limits of this module's claims:

 **You're done when:**

-- You can run `python cli.py list` in your terminal and see output; your project, editor, and
+- You can run `python3 cli.py list` in your terminal and see output; your project, editor, and
 terminal are working together.
 - You can name the three seams where copy-paste breaks (more than one file, more than one day, no
 undo) without looking back at the lesson.
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index>]")
         return 1

     command = argv[0]
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index>]")
         return 1

     command = argv[0]
@@ -204,7 +204,7 @@ and your AI assistant.

 This is the habit that replaces "paste it back and hope." You're reading exactly what changed,
 nothing more, nothing less. Confirm it does what you asked and didn't touch anything it shouldn't.
-Run it (`python cli.py count`), then commit:
+Run it (`python3 cli.py count`), then commit:

 ```bash
 git add .
@@ -223,7 +223,7 @@ and your AI assistant.
 git status # shows tasks.py as modified
 git restore tasks.py # discard the change; back to your last commit, byte for byte
 git diff # empty: nothing changed. you're clean.
-python cli.py list # works again
+python3 cli.py list # works again
 ```

 You just recovered from a bad AI change in one command, with zero retyping and zero guesswork.
@@ -258,7 +258,7 @@ and your AI assistant.

 9. Close the loop and leave the repo clean. The cold session just told you what's in progress and
 what to do next: finish the `delete <index>` command. Do that with the AI (paste in `cli.py` the
-same way as Part B), run it to confirm it works (`python cli.py delete 1`), then commit:
+same way as Part B), run it to confirm it works (`python3 cli.py delete 1`), then commit:

 ```bash
 git add .
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index>]")
         return 1

     command = argv[0]
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -71,7 +71,7 @@ echo "Running delete-command check with: $PY"

 # Delete the middle task (index 1 = "beta").
 if ! "$PY" cli.py delete 1 >/dev/null 2>&1; then
-echo "FAIL: 'python cli.py delete 1' errored. Is the delete command wired up in cli.py?" >&2
+echo "FAIL: 'python3 cli.py delete 1' errored. Is the delete command wired up in cli.py?" >&2
 exit 1
 fi

@@ -48,7 +48,7 @@ committed instructions file from the repo, and you control what's in it.**
 > content if so. The principle outlives any one vendor's filename.

 Without this file, you re-explain your project every session: "we use 4-space indent," "run the tests
-with `python -m unittest` before you say you're done," "don't touch the generated `tasks.json`." You say it,
+with `python3 -m unittest` before you say you're done," "don't touch the generated `tasks.json`." You say it,
 the AI complies, the session ends, the memory evaporates (Module 1's second seam), and tomorrow you
 say it all again. The instructions file is where that knowledge stops being something you retype and
 becomes something the project *carries*.
@@ -62,7 +62,7 @@ a briefing for an agent that will edit this code. Keep it to what changes the AI
 uses. "Core logic lives in `tasks.py`; the CLI front end is `cli.py`; state persists to
 `tasks.json`."
 - **Build and test commands**: the exact commands, copy-pasteable. "Run the app with
-`python cli.py <command>`. Run tests with `python -m unittest`. Don't claim a change works until
+`python3 cli.py <command>`. Run tests with `python3 -m unittest`. Don't claim a change works until
 the tests pass." This single line stops the AI from inventing a test runner you don't use.
 - **Coding standards**: formatting, typing, error handling, the libraries you do and don't want.
 "Use the standard library only, no third-party packages. Type-hint public functions."
@@ -83,7 +83,7 @@ useful for personal preferences, but it's the wrong home for project knowledge,
 lives: on *your* laptop, invisible to everyone else.

 Picture a two-person project with no committed instructions file. You've trained your local setup to
-run `python -m unittest` and avoid `tasks.json`. Your teammate's setup hasn't, so their agent reformats whole files
+run `python3 -m unittest` and avoid `tasks.json`. Your teammate's setup hasn't, so their agent reformats whole files
 and hand-edits the generated JSON. You're both "using AI on the same repo," but you're getting
 different behavior, and neither of you can see the other's configuration. That's **drift**: the same
 codebase, diverging because the rules live in two heads instead of one file.
@@ -215,7 +215,7 @@ editor-integrated AI (Module 4) for the part where the AI obeys the file.
 - The `tasks-app` repo from Module 2 (already a Git repo with some history).
 - Your agentic coding tool from Module 4, and knowledge of which filename it reads for repo-level
 instructions (check its docs; see the note in *Key concepts*).
-- Optionally, a test command for the AI to honor; Python's built-in `python -m unittest` works with
+- Optionally, a test command for the AI to honor; Python's built-in `python3 -m unittest` works with
 nothing to install (you'll write a real suite in Module 13; until then it simply reports no tests).

 ### Part A: Write the instructions file and let the AI commit the config
@@ -321,7 +321,7 @@ Be honest about what a committed instructions file does and doesn't buy you:
 - **Bloat kills it.** A 300-line instructions file is read the way *you* read a 300-line terms-of-
 service: not really. Every line you add dilutes the rest. Keep it to what actually changes behavior,
 and prune lines the model already honors without being told.
-- **Stale instructions are worse than none.** A file that says "run the tests with `python -m
+- **Stale instructions are worse than none.** A file that says "run the tests with `python3 -m
 unittest`" after you've switched to a different runner will actively misdirect the AI. The file is
 code-adjacent: it has to be maintained like code, and reviewed like code. That's exactly why
 committing it (so changes are
@@ -25,8 +25,8 @@ minute but real enough to have more than one file. Keep it that way; don't grow

 ## Build and test commands

-- Run the app: `python cli.py <command>` (e.g. `python cli.py list`).
-- Run the tests: `python -m unittest` <!-- EDIT: set this to your real test command, or delete if you have no tests yet -->
+- Run the app: `python3 cli.py <command>` (e.g. `python3 cli.py list`).
+- Run the tests: `python3 -m unittest` <!-- EDIT: set this to your real test command, or delete if you have no tests yet -->
 - Do not claim a change works until you have actually run it. If tests exist, they must pass first.

 ## Coding standards
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -164,9 +164,9 @@ decide:

 ```python
 <<<<<<< HEAD
-print("usage: python cli.py [add <title> | list | done <index> | stats]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
 =======
-print("usage: python cli.py [add <title> | list | done <index> | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
 >>>>>>> experiment
 ```

@@ -295,9 +295,9 @@ the one job that's still yours: verify the result.

 ```bash
 git diff # read what it actually changed
-python cli.py add "ship module 6" --priority high
-python cli.py add "water plants" --priority low
-python cli.py list # see if priorities work and sort
+python3 cli.py add "ship module 6" --priority high
+python3 cli.py add "water plants" --priority low
+python3 cli.py list # see if priorities work and sort
 ```

 Once the diff looks right and the feature runs, tell the agent:
@@ -312,7 +312,7 @@ the one job that's still yours: verify the result.
 > *"Switch back to `main`."*

 ```bash
-python cli.py list # no priorities; main is exactly as you left it
+python3 cli.py list # no priorities; main is exactly as you left it
 ```

 Your bold change exists only on the branch. `main` never saw it, and that's the whole point.
@@ -331,7 +331,7 @@ Then verify the result yourself:

 ```bash
 git log --oneline --graph # straight line = fast-forward merge
-python cli.py list # the feature is now on main
+python3 cli.py list # the feature is now on main
 git branch # experiment/priorities is gone
 ```

@@ -343,7 +343,7 @@ Then verify:

 ```bash
 git log --oneline # no trace of the experiment on main
-python cli.py list # main is untouched, exactly as before
+python3 cli.py list # main is untouched, exactly as before
 git branch # the branch is gone
 ```

@@ -411,9 +411,9 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu

 ```python
 <<<<<<< HEAD
-print("usage: python cli.py [add <title> | list | done <index> | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
 =======
-print("usage: python cli.py [add <title> | list | done <index> | stats]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
 >>>>>>> feature/stats
 ```

@@ -446,7 +446,7 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu
 should have produced a single, marker-free line listing both commands, e.g.:

 ```python
-print("usage: python cli.py [add <title> | list | done <index> | stats | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats | purge]")
 ```

 **Here is the punchline of the whole module: you have no idea yet whether that's right, so verify.**
@@ -458,9 +458,9 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu
 ```bash
 git diff HEAD~1 # what the merge actually changed; confirm no markers remain
 git log --oneline --graph # the fork-and-join: this is a merge commit
-python cli.py # run with no args, see the merged usage string
-python cli.py stats # both commands actually work
-python cli.py purge
+python3 cli.py # run with no args, see the merged usage string
+python3 cli.py stats # both commands actually work
+python3 cli.py purge
 ```

 If the usage line lists both commands and both run, the AI's silent resolution was correct. If it
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -323,8 +323,8 @@ This is the part to actually *do simultaneously*, not one then the other.
 writing them.) Give each worktree its own task and list it:

 ```bash
-cd ~/ai-workflow-course/tasks-app-wipe && python cli.py add "from worktree A" && python cli.py list
-cd ~/ai-workflow-course/tasks-app-remaining && python cli.py add "from worktree B" && python cli.py list
+cd ~/ai-workflow-course/tasks-app-wipe && python3 cli.py add "from worktree A" && python3 cli.py list
+cd ~/ai-workflow-course/tasks-app-remaining && python3 cli.py add "from worktree B" && python3 cli.py list
 ```

 Each `list` shows only its own task: worktree A never sees "from worktree B" and vice versa. Each
@@ -349,8 +349,8 @@ This is the part to actually *do simultaneously*, not one then the other.
 5. *Now* the new commands exist: run each in its own worktree to watch it work:

 ```bash
-cd ~/ai-workflow-course/tasks-app-wipe && python cli.py wipe # agent A's new command
-cd ~/ai-workflow-course/tasks-app-remaining && python cli.py remaining # agent B's new command
+cd ~/ai-workflow-course/tasks-app-wipe && python3 cli.py wipe # agent A's new command
+cd ~/ai-workflow-course/tasks-app-remaining && python3 cli.py remaining # agent B's new command
 ```

 `remaining` counts a single pending task, the one you added to worktree B in step 3, because B's
@@ -378,9 +378,9 @@ Then **verify** the result before you trust it, the same way you did in Module 6
 ```bash
 cd ~/ai-workflow-course/tasks-app
 git diff # no conflict markers remain
-python cli.py list # the app still runs
-python cli.py wipe # both new commands work
-python cli.py remaining
+python3 cli.py list # the app still runs
+python3 cli.py wipe # both new commands work
+python3 cli.py remaining
 ```

 Now tear down the worktrees. Direct the coordinating session:
@@ -8,8 +8,8 @@ Add a `wipe` command to this task app that removes **all** tasks.

 - Put the deletion logic on `TaskList` in `tasks.py` (a `wipe()` method that empties the list),
 and wire a `wipe` command into the dispatch in `cli.py` that calls it and saves.
-- Running `python cli.py wipe` should empty the list and print a short confirmation like
+- Running `python3 cli.py wipe` should empty the list and print a short confirmation like
 `wiped all tasks`.
-- After `wipe`, `python cli.py list` should print `(no tasks yet)`.
+- After `wipe`, `python3 cli.py list` should print `(no tasks yet)`.

 Make the change, then stop. I'll review the diff, then have you commit it on this branch.
@@ -8,7 +8,7 @@ Add a `remaining` command to this task app that prints how many tasks are still

 - Reuse the existing `pending()` method on `TaskList` in `tasks.py`; don't reimplement it.
 - Wire a `remaining` command into the dispatch in `cli.py`.
-- Running `python cli.py remaining` should print something like `2 pending` (the number of tasks not
+- Running `python3 cli.py remaining` should print something like `2 pending` (the number of tasks not
 marked done).

 Make the change, then stop. I'll review the diff, then have you commit it on this branch.
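What the two prompts above ask for can be sketched together. This is one plausible outcome under assumptions, not the agents' actual diffs: `wipe()`, `pending()`, and the two confirmation strings come from the prompts, while the `Task` fields and the `run` dispatcher are hypothetical stand-ins for whatever `tasks.py` and `cli.py` really contain (saving to `tasks.json` is left out).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    title: str
    done: bool = False


@dataclass
class TaskList:
    tasks: List[Task] = field(default_factory=list)

    def pending(self) -> List[Task]:
        """Existing helper the `remaining` prompt says to reuse."""
        return [task for task in self.tasks if not task.done]

    def wipe(self) -> None:
        """New method from the `wipe` prompt: remove every task."""
        self.tasks.clear()


def run(command: str, tlist: TaskList) -> str:
    """Hypothetical stand-in for the dispatch in cli.py; returns the text to print."""
    if command == "wipe":
        tlist.wipe()
        return "wiped all tasks"
    if command == "remaining":
        return f"{len(tlist.pending())} pending"
    raise ValueError(f"unknown command: {command}")


if __name__ == "__main__":
    demo = TaskList([Task("alpha", done=True), Task("beta"), Task("gamma")])
    print(run("remaining", demo))
    print(run("wipe", demo))
```

Both changes touch the same dispatch in `cli.py`, which is why merging the two branches is a likely place for a conflict to appear.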
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.

@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
|
||||
## Run it
|
||||
|
||||
```bash
|
||||
python cli.py add "read module 1"
|
||||
python cli.py add "set up my editor"
|
||||
python cli.py list
|
||||
python cli.py done 0
|
||||
python cli.py list
|
||||
python3 cli.py add "read module 1"
|
||||
python3 cli.py add "set up my editor"
|
||||
python3 cli.py list
|
||||
python3 cli.py done 0
|
||||
python3 cli.py list
|
||||
```
|
||||
|
||||
Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
|
||||
|
||||
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -105,8 +105,8 @@ well-formed version of the same bug:

 > **Title:** `done` command crashes on an out-of-range or non-integer index
 >
-> **Context:** `python cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and
-> dumps a traceback. `python cli.py done abc` raises `ValueError`. Either way the user sees a stack
+> **Context:** `python3 cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and
+> dumps a traceback. `python3 cli.py done abc` raises `ValueError`. Either way the user sees a stack
 > trace instead of a helpful message.
 >
 > **Acceptance criteria:**
@@ -264,7 +264,7 @@ plenty it still can't do. Because it's carried forward across modules, skip anyt
 already built (a `delete` command, task priorities) and pick work that's genuinely still missing.
 Good candidates:

-1. **A bug**: `python cli.py done 99` (an out-of-range index) and `python cli.py done abc` (a
+1. **A bug**: `python3 cli.py done 99` (an out-of-range index) and `python3 cli.py done abc` (a
 non-integer) both crash with an uncaught traceback. Run them and watch.
 2. **A small, patterned feature**: an `undone <index>` command that clears a task's done flag,
 mirroring the existing `done` command (it's the inverse).
@@ -18,16 +18,16 @@

 ## Context / problem

-`python cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and dumps a Python
-traceback. `python cli.py done abc` raises `ValueError` the same way. The user sees a stack trace
+`python3 cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and dumps a Python
+traceback. `python3 cli.py done abc` raises `ValueError` the same way. The user sees a stack trace
 instead of a helpful message, and the process exits as if it crashed.

 Reproduce:

 ```
-python cli.py add "first"
-python cli.py done 99 # IndexError traceback
-python cli.py done abc # ValueError traceback
+python3 cli.py add "first"
+python3 cli.py done 99 # IndexError traceback
+python3 cli.py done abc # ValueError traceback
 ```

 ## Acceptance criteria
@@ -61,7 +61,7 @@ command, which already takes an index and flips a task's state; this is simply i

 ## Acceptance criteria

-- [ ] `python cli.py undone <index>` clears the done flag on the task at that index and saves.
+- [ ] `python3 cli.py undone <index>` clears the done flag on the task at that index and saves.
 - [ ] `undone` with an out-of-range or non-integer index prints a clear error and exits non-zero
 (same behavior as the fixed `done`, see Issue 1).
 - [ ] `list` after `undone` shows that task as not done (`[ ]`).
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -243,8 +243,8 @@ real change, then review a diff the "AI" produced and catch the trap planted in
 Then see the baseline behavior with your own eyes, because the trap is going to change it:

 ```bash
-python cli.py add "write the review module"
-python cli.py done 99 # baseline: prints "error: no task at index 99", exits non-zero
+python3 cli.py add "write the review module"
+python3 cli.py done 99 # baseline: prints "error: no task at index 99", exits non-zero
 echo "exit code: $?"
 ```
@@ -296,12 +296,12 @@ real change, then review a diff the "AI" produced and catch the trap planted in
 5. Now verify your read by running the *failure* path, not the happy one:

 ```bash
-python cli.py add "a real task"
-python cli.py delete 0 # the requested feature: works fine on the happy path
-python cli.py add "another"
-python cli.py done 99 # the trap: compare this to your Part A baseline
+python3 cli.py add "a real task"
+python3 cli.py delete 0 # the requested feature: works fine on the happy path
+python3 cli.py add "another"
+python3 cli.py done 99 # the trap: compare this to your Part A baseline
 echo "exit code: $?"
-python cli.py list # did task 99 (which doesn't exist) get marked done? did anything?
+python3 cli.py list # did task 99 (which doesn't exist) get marked done? did anything?
 ```

 In the base app, `done 99` was a clean error with a non-zero exit. After this "add a delete
@@ -6,8 +6,8 @@ index 91e9276..2189230 100644
  def main(argv: list[str]) -> int:
      tlist = load()
      if not argv:
--        print("usage: python cli.py [add <title> | list | done <index>]")
-+        print("usage: python cli.py [add <title> | list | done <index> | delete <index>]")
+-        print("usage: python3 cli.py [add <title> | list | done <index>]")
++        print("usage: python3 cli.py [add <title> | list | done <index> | delete <index>]")
          return 1

      command = argv[0]
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
-python cli.py done 0
+python3 cli.py add "write the lesson"
+python3 cli.py list
+python3 cli.py done 0

 State is kept in tasks.json next to this file. The `done` command turns a bad index into a
 clean error message and a non-zero exit code; note that behavior before you review the AI
@@ -33,7 +33,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index>]")
         return 1

     command = argv[0]
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
-python cli.py done 0
+python3 cli.py add "write the lesson"
+python3 cli.py list
+python3 cli.py done 0

 State is kept in tasks.json next to this file. The `done` command turns a bad index into a
 clean error message and a non-zero exit code; note that behavior before you review the AI
@@ -33,7 +33,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index>]")
         return 1

     command = argv[0]
@@ -355,11 +355,11 @@ the server say *no* is the point: "never commit to `main`" is now a rule, not a
 the CLI), and it does what you asked. Run it:

 ```bash
-python cli.py add "keeper" ; python cli.py add "trash"
-python cli.py list # note the index shown next to "trash"
-python cli.py done <trash-index> # use the index "list" just printed, NOT a fixed 1
-python cli.py clear-done # expect it to remove the completed one
-python cli.py list # "keeper" remains, "trash" is gone
+python3 cli.py add "keeper" ; python3 cli.py add "trash"
+python3 cli.py list # note the index shown next to "trash"
+python3 cli.py done <trash-index> # use the index "list" just printed, NOT a fixed 1
+python3 cli.py clear-done # expect it to remove the completed one
+python3 cli.py list # "keeper" remains, "trash" is gone
 ```

 Read the index off `list` rather than assuming it: `done` is positional, and your `tasks-app` has
@@ -19,7 +19,7 @@ After working through a list, the completed items pile up as noise. There's curr
 clear them out short of editing `tasks.json` by hand.

 **Acceptance criteria**
-- `python cli.py clear-done` removes all completed tasks and keeps all pending ones.
+- `python3 cli.py clear-done` removes all completed tasks and keeps all pending ones.
 - It prints how many tasks were removed.
 - The removal logic lives in `tasks.py` (a `TaskList` method), not in `cli.py`.
 - Running it when nothing is done is a no-op that removes 0 tasks (no crash).
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -295,9 +295,9 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 4. **Now feel the bug.** It passes the first skim:

 ```bash
-python cli.py add "ship it"
-python cli.py clear # prints "cleared all tasks", looks fine!
-python cli.py list # CRASHES: it corrupted tasks.json, load() blows up
+python3 cli.py add "ship it"
+python3 cli.py clear # prints "cleared all tasks", looks fine!
+python3 cli.py list # CRASHES: it corrupted tasks.json, load() blows up
 ```

 This is the AI plausibility trap made concrete: the change reviewed fine and "worked," and broke
@@ -337,8 +337,8 @@ that *you* hold the judgment: which undo, which parent, whether it actually work

 ```bash
 rm -f tasks.json # drop the corrupted state file the bug wrote
-python cli.py add "back to normal"
-python cli.py list # works again, the clear command is gone
+python3 cli.py add "back to normal"
+python3 cli.py list # works again, the clear command is gone
 git log --oneline # the bad merge is STILL there, with a revert after it
 ```
@@ -360,7 +360,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work

 ```bash
 git log --oneline -1 # "Add version command"
-python cli.py version # prints the version
+python3 cli.py version # prints the version
 ```

 2. Now destroy it the way an over-eager "clean up the history" cleanup (or an agent) would, with a
@@ -369,7 +369,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 ```bash
 git reset --hard HEAD~1
 git log --oneline -2 # the "Add version command" commit is GONE from the branch
-python cli.py version 2>/dev/null || echo "command no longer exists"
+python3 cli.py version 2>/dev/null || echo "command no longer exists"
 ```

 It's not in `log`. It feels permanently lost. It isn't.
@@ -384,7 +384,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work

 ```bash
 git log --oneline -1 # "Add version command" is back
-python cli.py version # works again
+python3 cli.py version # works again
 ```

 You just recovered a commit that `log` swore was gone. Note the honest limit: step 2's `--hard`
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -49,7 +49,7 @@ that runs a piece of your code and asserts that the result is what it should be.
 holds, the test passes silently. If it doesn't, the test fails loudly and tells you exactly which
 expectation broke.

-You've already been testing, by hand. Every time you ran `python cli.py list` and eyeballed the
+You've already been testing, by hand. Every time you ran `python3 cli.py list` and eyeballed the
 output, you ran a manual test: *do something, check the result looks right.* The problem with the
 manual version is the same problem copy-paste had in Module 1: it doesn't scale across files or
 across time. You can't re-run "eyeball every command" on every change, so you don't, so regressions
@@ -71,12 +71,12 @@ class TestTaskList(unittest.TestCase):
 self.assertEqual(tl.tasks[0].title, "write the tests")
 ```

-The whole suite runs from the project folder with a single command: `python -m unittest`
+The whole suite runs from the project folder with a single command: `python3 -m unittest`
 auto-discovers files named `test_*.py`, and `-v` prints each test name and its result. A verbose run
 looks like:

 ```text
-$ python -m unittest -v
+$ python3 -m unittest -v
 test_add_appends_a_task (test_tasks.TestTaskList) ... ok

 ----------------------------------------------------------------------
@@ -167,7 +167,7 @@ intent has to come from you.

 One more framing before the lab. A test file just sitting in your repo is useful when you remember to
 run it; like the manual eyeball check, you eventually won't. The full payoff comes in
-**Module 14**, where Continuous Integration runs this exact `python -m unittest` command
+**Module 14**, where Continuous Integration runs this exact `python3 -m unittest` command
 automatically on every push, so a regression can't reach `main` without something going red first.

 That's why this module comes immediately before CI: **tests are the content CI runs.** You can't
@@ -256,7 +256,7 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 2. Run it:

 ```bash
-python -m unittest -v
+python3 -m unittest -v
 ```

 You should see one test, and `OK`. That's the entire mechanism. Everything else is more of these.
@@ -286,7 +286,7 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 5. Run the suite:

 ```bash
-python -m unittest -v
+python3 -m unittest -v
 ```

 At least one `pending_count` test should **FAIL**, with something like
@@ -310,8 +310,8 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 return len(self.pending())
 ```

-Re-run `python -m unittest -v`; green. Confirm the app agrees:
-`python cli.py add a && python cli.py add b && python cli.py done 0 && python cli.py count`
+Re-run `python3 -m unittest -v`; green. Confirm the app agrees:
+`python3 cli.py add a && python3 cli.py add b && python3 cli.py done 0 && python3 cli.py count`
 should report **1 task(s) pending**.

 > Using your own app from earlier modules instead? If your `count` command was already correct,
@@ -365,7 +365,7 @@ The honest limits, because a green suite invites overconfidence:

 **You're done when:**

-- You can run `python -m unittest -v` in your `tasks-app` and see your own tests pass.
+- You can run `python3 -m unittest -v` in your `tasks-app` and see your own tests pass.
 - You watched an intent-encoding test **fail**, traced it to the real `pending_count` bug, fixed the
 *code*, and watched it pass.
 - You can articulate, in your own words, the difference between a test that asserts current behavior
@@ -1,10 +1,10 @@
 """Reference test suite for the Module 13 lab. Peek only after you've tried it yourself.

-Named `reference_test_tasks.py` (not `test_*.py`) on purpose, so `python -m unittest discover`
+Named `reference_test_tasks.py` (not `test_*.py`) on purpose, so `python3 -m unittest discover`
 does NOT pick it up automatically. To run it, copy it next to your working `tasks.py` (e.g.
 `~/ai-workflow-course/work/tasks-app/`) and run, from that directory:

-python -m unittest reference_test_tasks
+python3 -m unittest reference_test_tasks

 It assumes `tasks.py` is importable, which is why you run it from the tasks-app directory.
@@ -15,11 +15,11 @@ has a `count` command (the Module 2 lab added one). The planted bug in this copy
 ## Run it

 ```bash
-python cli.py add "write the tests"
-python cli.py add "fix the bug"
-python cli.py done 0
-python cli.py list
-python cli.py count
+python3 cli.py add "write the tests"
+python3 cli.py add "fix the bug"
+python3 cli.py done 0
+python3 cli.py list
+python3 cli.py count
 ```

 Requires Python 3.10+. No third-party packages; tests use the standard library `unittest`.
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
-python cli.py count
+python3 cli.py add "write the lesson"
+python3 cli.py list
+python3 cli.py count

 State is kept in tasks.json next to this file. Same minimal app from Modules 1 and 2, with a
 `count` command bolted on.
@@ -32,7 +32,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -15,11 +15,11 @@ has a `count` command (the Module 2 lab added one). The planted bug in this copy
 ## Run it

 ```bash
-python cli.py add "write the tests"
-python cli.py add "fix the bug"
-python cli.py done 0
-python cli.py list
-python cli.py count
+python3 cli.py add "write the tests"
+python3 cli.py add "fix the bug"
+python3 cli.py done 0
+python3 cli.py list
+python3 cli.py count
 ```

 Requires Python 3.10+. No third-party packages; tests use the standard library `unittest`.
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
-python cli.py count
+python3 cli.py add "write the lesson"
+python3 cli.py list
+python3 cli.py count

 State is kept in tasks.json next to this file. Same minimal app from Modules 1 and 2, with a
 `count` command bolted on.
@@ -32,7 +32,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -78,7 +78,7 @@ Almost every CI configuration, on every forge, is the same four moves:
 4. **Run the checks**: lint, then test. Any check that exits non-zero fails the whole run.

 That last point is the load-bearing one. CI's entire enforcement mechanism is the **exit code**.
-Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `python -m
+Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `python3 -m
 unittest` exits non-zero if a test fails. `ruff check` exits non-zero if it finds a lint problem. CI runs your
 commands and watches those exit codes; one failure turns the run red. You're not learning a new
 testing system; you're wiring the tools you already have to a trigger.
@@ -154,7 +154,7 @@ When CI goes red, the skill is triage, and it's fast once you know the shape:
 3. **Read that step's log.** It's the same output the tool prints in your terminal: a failing
 `unittest` assertion, a `ruff` finding with a file and line number. CI didn't invent a new error
 format; it's showing you the command's own output.
-4. **Reproduce it locally.** The same command from the failed step (`python -m unittest` or
+4. **Reproduce it locally.** The same command from the failed step (`python3 -m unittest` or
 `ruff check .`) fails the same way on your own machine, because CI ran exactly that command. That
 reproducibility is the point: fix locally, confirm green locally, push again.
@@ -254,7 +254,7 @@ your machine first.
 that CI is nothing more than these same two commands is what makes the rest of the module click.

 ```bash
-python -m unittest # should report all tests passing
+python3 -m unittest # should report all tests passing
 ruff check . # should report no issues (or fix what it flags)
 ```
@@ -325,7 +325,7 @@ and watch CI stop it.
 `git restore` (Module 12). What the agent runs looks like:

 ```bash
-python -m unittest # fails locally too: same command, same failure
+python3 -m unittest # fails locally too: same command, same failure
 git revert --no-edit HEAD # new commit that undoes "Simplify pending()" (Module 12)
 git push # CI re-runs on the fixed code and goes green again
 ```
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -2,7 +2,7 @@

 Reproduced here so this module's lab is self-contained: if you already wrote tests in Module 13,
 use those instead. Standard-library `unittest`, exactly like Module 13, nothing to install.
-Run locally with `python -m unittest` from the project folder. CI runs exactly this.
+Run locally with `python3 -m unittest` from the project folder. CI runs exactly this.
 """

 import unittest
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-python cli.py add "write the lesson"
-python cli.py list
+python3 cli.py add "write the lesson"
+python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -90,10 +90,10 @@ Set one for a single command:

 ```bash
 # macOS / Linux
-TASKS_API_KEY="sk-live-..." python sync.py
+TASKS_API_KEY="sk-live-..." python3 sync.py

 # Windows PowerShell
-$env:TASKS_API_KEY="sk-live-..."; python sync.py
+$env:TASKS_API_KEY="sk-live-..."; python3 sync.py
 ```

 Read it back in code, and **fail loudly if it's missing**, because a silent empty string is worse
@@ -307,7 +307,7 @@ type the commands by hand. Then you'll make it select config per environment.

 ```bash
 cd ~/ai-workflow-course/tasks-app
-python sync.py
+python3 sync.py
 ```

 It prints a simulated request, including `Authorization: Bearer sk-live-...`. Open `sync.py` and
@@ -407,7 +407,7 @@ type the commands by hand. Then you'll make it select config per environment.
 5. Run it reading from your `.env`:

 ```bash
-python sync.py # loads .env -> dev URL, key from the file
+python3 sync.py # loads .env -> dev URL, key from the file
 ```

 6. Now prove the 12-factor point: **same code, different environment, no edit.** Override at the
@@ -415,13 +415,13 @@ type the commands by hand. Then you'll make it select config per environment.

 ```bash
 # macOS / Linux
-APP_ENV=staging python sync.py
-APP_ENV=prod TASKS_API_KEY="sk-live-prod-key" python sync.py
+APP_ENV=staging python3 sync.py
+APP_ENV=prod TASKS_API_KEY="sk-live-prod-key" python3 sync.py
 ```

 ```powershell
 # Windows PowerShell
-$env:APP_ENV="staging"; python sync.py
+$env:APP_ENV="staging"; python3 sync.py
 ```

 Watch the backend URL change with `APP_ENV` while the source never does. That's config in the
@@ -483,7 +483,7 @@ left behind a `.env.example` so the next person (or agent) knows what to supply.
 - `sync.py` runs entirely from the environment, and `grep "sk-live" sync.py` prints nothing.
 - A real `.env` exists, contains your secret, and does **not** appear in `git status`, while
 `.env.example` is tracked.
-- `APP_ENV=staging python sync.py` and the default run hit different backend URLs with **zero**
+- `APP_ENV=staging python3 sync.py` and the default run hit different backend URLs with **zero**
 source edits between them.
 - You can state, in one sentence, why deleting a committed secret and re-committing does not fix the
 leak, and what the actual fix is (rotation).
@@ -11,7 +11,7 @@ Your job in the lab is to refactor BOTH out of the source and into the environme
 ahead and fix it yet; first run it as-is so you can see the smell.

 Run it:
-python sync.py
+python3 sync.py

 It does not actually hit the network (so the lab works offline, on any OS); it simulates the
 request and prints what it *would* send.
@@ -250,7 +250,7 @@ A CLI that exits immediately is awkward to "deploy." Give the app a long-running
 2. Run the service locally first, no container, to see it work:

 ```bash
-python serve.py # serves on http://localhost:8000
+python3 serve.py # serves on http://localhost:8000
 ```

 In another terminal:
@@ -4,7 +4,7 @@ Standard library only, no pip install, so the container image stays tiny and the
 dependencies to drift. It reuses the TaskList from tasks.py (Modules 1-2) unchanged.

 Run it:
-    python serve.py # serves on http://localhost:8000
+    python3 serve.py # serves on http://localhost:8000

 Endpoints:
     GET /health -> {"status": "ok", "version": <APP_VERSION>} (200)
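The `GET /health` contract in the docstring above can be sketched with the standard library alone. This is not the lab's `serve.py`, whose body is not part of this diff: `health_payload`, `HealthHandler`, `make_server`, the `"dev"` fallback, and reading the version from an `APP_VERSION` environment variable are all assumptions made for illustration.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_payload() -> dict[str, str]:
    """Body for GET /health; the version is read from the environment."""
    # "dev" is an assumed fallback for when APP_VERSION is unset.
    return {"status": "ok", "version": os.environ.get("APP_VERSION", "dev")}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404, "not found")
            return
        body = json.dumps(health_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    """Bind the server; call .serve_forever() on the result to start serving."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)
```

Keeping the payload in its own function means the response shape can be checked without opening a socket.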
@@ -330,7 +330,7 @@ That's the entire client/server loop, end to end, with zero code you wrote. Now
 > contents so I can read it."*

 Then open the copied file yourself and read it. (It reuses `tasks.py` and shares the same
-`tasks.json`, so anything it changes shows up in `python cli.py list`.) The whole server is two
+`tasks.json`, so anything it changes shows up in `python3 cli.py list`.) The whole server is two
 tools:

 ```python
@@ -411,14 +411,14 @@ That's the entire client/server loop, end to end, with zero code you wrote. Now
 the way you'd verify any runtime effect, by reading the *state*, not the repo:

 ```bash
-python cli.py list # the new task is there, because the server wrote the same tasks.json
+python3 cli.py list # the new task is there, because the server wrote the same tasks.json
 cat tasks.json # the raw state the server changed, end to end
 ```

 The AI just changed real state in a real system through a tool call. Notice what you did *not*
 reach for: `git diff`. `tasks.json` is deliberately gitignored (Module 2's `.gitignore` treats it
 as generated runtime state, not source), so `git diff` stays empty here, and that's correct, not a
-bug. The proof the task list changed is the live state (`python cli.py list` / `cat tasks.json`),
+bug. The proof the task list changed is the live state (`python3 cli.py list` / `cat tasks.json`),
 not version control; runtime data the app owns is exactly the kind of thing you keep *out* of
 history. No copy-paste, no script you ran by hand, no pasting `tasks.json` into a chat. That's
 "hands."
@@ -477,7 +477,7 @@ The caveats, and one of them is large enough that it gets its own module.
 connected with `list_tasks` and `add_task` available.
 - You asked the AI a question and it answered by **calling a tool** against the live system, and you
 asked it to add a task and then **verified the change outside the AI** by reading the runtime state
-(`python cli.py list` / `cat tasks.json`), not `git diff`, because `tasks.json` is deliberately
+(`python3 cli.py list` / `cat tasks.json`), not `git diff`, because `tasks.json` is deliberately
 gitignored (Module 2).
 - You can explain the client/server model in one breath (*servers expose tools/resources/prompts;
 the client (your agentic tool) discovers and calls them on the AI's behalf*) and why "it's a
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-    python cli.py add "write the lesson"
-    python cli.py list
+    python3 cli.py add "write the lesson"
+    python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -11,10 +11,10 @@ Setup (once):
     pip install "mcp[cli]"

 Drop this file into your tasks-app folder, next to tasks.py and cli.py (it reuses them, and shares
-the same tasks.json, so a task the AI adds through this server shows up in `python cli.py list`).
+the same tasks.json, so a task the AI adds through this server shows up in `python3 cli.py list`).

 Sanity-check that it starts (it will sit waiting for a client to talk to it; Ctrl-C to stop):
-    python tasks_mcp_server.py
+    python3 tasks_mcp_server.py

 You don't normally run it by hand, though. Your agentic tool launches it for you; see the lab.
 """
@@ -87,7 +87,7 @@ This is the distinction to lock in, because the two are siblings and easy to con
 | Analogy | The standing house rules posted on the wall | A labeled recipe card you pull out when you cook that dish |

 They're complementary. The instructions file is the right home for facts true *all the time* ("tests
-run with `python -m unittest`"). A skill is the right home for a procedure you run *sometimes* ("here
+run with `python3 -m unittest`"). A skill is the right home for a procedure you run *sometimes* ("here
 is exactly how we add a command"). Module 5 even told you this was coming: start with the always-on
 file; graduate a procedure into a skill when it earns its own page.

@@ -147,7 +147,7 @@ On paper this is just "write a runbook." The AI-specific twist is what changes t

 - **The AI will execute the playbook, not just read it.** A runbook for a human is a reminder; a skill
 for an agent is something it *performs*. The precision pays off immediately: vague step, vague
-result; imperative step ("run `python -m unittest`; do not claim success until it's green"), reliable
+result; imperative step ("run `python3 -m unittest`; do not claim success until it's green"), reliable
 result.
 - **The AI is confidently incomplete without one.** Asked to "add a command," it'll happily stop at
 the code and skip the test, the changelog, the clean commit, and sound finished doing it. The skill
@@ -222,8 +222,8 @@ seen, producing all four parts without you listing the steps.
 5. Watch it perform the procedure. A correctly-followed skill will, without you saying any of it:
 - add `clear()` to `tasks.py` and wire a `clear` branch into `cli.py` (logic in the right file);
 - add a real test to `test_tasks.py` that asserts the list is empty afterward (not just "no crash");
-- run `python -m unittest` and show it green;
-- smoke-test `python cli.py clear` and show the output;
+- run `python3 -m unittest` and show it green;
+- smoke-test `python3 cli.py clear` and show the output;
 - add a `CHANGELOG.md` line;
 - stage code + test + changelog into one commit, **without** `tasks.json`.

@@ -232,8 +232,8 @@ seen, producing all four parts without you listing the steps.
 6. Don't take the AI's word for it. Check against the skill's own done-criteria:

 ```bash
-python -m unittest # green, and a clear-related test is present
-python cli.py add "x" && python cli.py clear && python cli.py list # -> (no tasks yet)
+python3 -m unittest # green, and a clear-related test is present
+python3 cli.py add "x" && python3 cli.py clear && python3 cli.py list # -> (no tasks yet)
 git show --stat HEAD # one commit: tasks.py, cli.py, test_tasks.py, CHANGELOG.md; no tasks.json
 ```

@@ -318,6 +318,6 @@ time:
 that the example skill format stays generic (when-to-use / inputs / steps / done-criteria).
 - [ ] **Dependency chain intact.** Confirm Module 20 (MCP) and Module 22 (securing servers/skills) are
 still numbered as referenced, and that nothing here leans on a tool introduced after Module 20.
-- [ ] **Lab still runs.** `python -m unittest` is green in `lab/tasks-app/`, and the `clear`-command
+- [ ] **Lab still runs.** `python3 -m unittest` is green in `lab/tasks-app/`, and the `clear`-command
 walkthrough still matches the starter files (`add`/`list`/`done`/`count`, `test_tasks.py`,
 `CHANGELOG.md`).
@@ -24,7 +24,7 @@ Ask for these if they weren't given:

 - Core logic lives in `tasks.py` (the `TaskList` class). The CLI front end is `cli.py`. State
 persists to `tasks.json`. **Never edit `tasks.json` by hand; it's generated.**
-- Tests live in `test_tasks.py` and run with `python -m unittest`. Standard library only; no
+- Tests live in `test_tasks.py` and run with `python3 -m unittest`. Standard library only; no
 third-party packages, no new dependencies.
 - The human-facing change log is `CHANGELOG.md`, newest entry on top.

@@ -42,10 +42,10 @@ Ask for these if they weren't given:
 crash." Assert the end state (e.g. after `clear()`, `len(tasks) == 0` and `pending()` is empty).
 A test that passes against a broken implementation is worse than no test.

-4. **Run the tests.** `python -m unittest` from the project root. Do not claim success until it's
+4. **Run the tests.** `python3 -m unittest` from the project root. Do not claim success until it's
 green. If it fails, fix the code, not the test, and run again.

-5. **Smoke-test the CLI.** Actually run it: `python cli.py COMMAND_NAME`, then `python cli.py list`
+5. **Smoke-test the CLI.** Actually run it: `python3 cli.py COMMAND_NAME`, then `python3 cli.py list`
 to confirm the visible result. Paste what you ran and what it printed.

 6. **Add a `CHANGELOG.md` entry.** One line under the top heading, present tense:
@@ -57,8 +57,8 @@ Ask for these if they weren't given:

 ## Done when

-- `python -m unittest` is green and includes a new test that actually exercises `COMMAND_NAME`.
-- `python cli.py COMMAND_NAME` does `WHAT_IT_DOES` and you've shown the output.
+- `python3 -m unittest` is green and includes a new test that actually exercises `COMMAND_NAME`.
+- `python3 cli.py COMMAND_NAME` does `WHAT_IT_DOES` and you've shown the output.
 - `CHANGELOG.md` has a new top line for the command.
 - One commit contains the code, the test, and the changelog line, and nothing else (no
 `tasks.json`, no unrelated reformatting).
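The skill's rule to assert the end state rather than "no crash" can be illustrated with a small `unittest` case. The `Task` and `TaskList` classes below are hypothetical stand-ins written for this sketch (the lab's real `tasks.py` is not part of this diff); only the `clear()`, `pending()`, and `tasks` names come from the skill text.

```python
import unittest
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    done: bool = False


@dataclass
class TaskList:
    """Hypothetical stand-in for the lab's TaskList in tasks.py."""
    tasks: list[Task] = field(default_factory=list)

    def add(self, title: str) -> None:
        self.tasks.append(Task(title))

    def pending(self) -> list[Task]:
        return [t for t in self.tasks if not t.done]

    def clear(self) -> None:
        self.tasks.clear()


class ClearTests(unittest.TestCase):
    def test_clear_empties_the_list(self) -> None:
        tlist = TaskList()
        tlist.add("a")
        tlist.add("b")
        tlist.clear()
        # Assert the end state, not merely that clear() returned without raising.
        self.assertEqual(len(tlist.tasks), 0)
        self.assertEqual(tlist.pending(), [])
```

A `clear()` that silently did nothing would still "not crash", but it would fail both assertions here, which is the property the skill asks for.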
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-    python cli.py add "write the lesson"
-    python cli.py list
-    python cli.py count
+    python3 cli.py add "write the lesson"
+    python3 cli.py list
+    python3 cli.py count

 State is kept in tasks.json next to this file. The same minimal app from Module 1 onward; the
 target your "add a command" skill extends.
@@ -32,7 +32,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -1,6 +1,6 @@
 """Test suite for the tasks-app. Run from this folder with:

-    python -m unittest
+    python3 -m unittest

 Your "add a command" skill should ADD a test here for every new command. The point is to assert
 intended behavior, not just that nothing crashed.
@@ -1,9 +1,9 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-    python cli.py add "write the lesson"
-    python cli.py list
-    python cli.py count
+    python3 cli.py add "write the lesson"
+    python3 cli.py list
+    python3 cli.py count

 State is kept in tasks.json next to this file. The same minimal app from Module 1 onward; the
 target your "add a command" skill extends.
@@ -32,7 +32,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count]")
         return 1

     command = argv[0]
@@ -1,6 +1,6 @@
 """Test suite for the tasks-app. Run from this folder with:

-    python -m unittest
+    python3 -m unittest

 Your "add a command" skill should ADD a test here for every new command. The point is to assert
 intended behavior, not just that nothing crashed.
@@ -265,14 +265,14 @@ normal question) and the attacker (you plant content the agent reads).

 ```bash
 cd ~/ai-workflow-course/tasks-app
-python cli.py add "$(cat ~/ai-workflow-course/modules/22-securing-third-party-mcp-and-skills/lab/poisoned-task.txt)"
-python cli.py list
+python3 cli.py add "$(cat ~/ai-workflow-course/modules/22-securing-third-party-mcp-and-skills/lab/poisoned-task.txt)"
+python3 cli.py list
 ```

 `poisoned-task.txt` contains a normal-looking task followed by an injected instruction (a fake
 "system" directive telling the assistant to reveal local secrets / run a command and hide it).

-2. **Be the victim.** Paste the full output of `python cli.py list` into your agent's chat (Claude
+2. **Be the victim.** Paste the full output of `python3 cli.py list` into your agent's chat (Claude
 Code in these examples; sub your own) and ask the thing you'd actually ask: *"Here's my task list,
 summarize what's pending and tell me what to
 work on first."* Watch what happens. Depending on the model, it may flag the injection, or it may
@@ -303,7 +303,7 @@ normal question) and the attacker (you plant content the agent reads).

 ```bash
 # the "tool" the agent is allowed to call in read-only mode
-python cli.py list # works
+python3 cli.py list # works
 # the tool it is NOT exposed (a write); in a least-privilege setup this path is simply absent
 ```

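The read-only idea in the hunk above, where a write path is absent rather than denied case by case, can be sketched as an allowlist. Everything here is hypothetical and written for illustration: `READ_ONLY_COMMANDS`, `build_argv`, and the choice of `list` and `count` as the read-only subset are assumptions, not part of the lab files.

```python
# Hypothetical read-only tool surface: only commands listed here can be reached at all.
READ_ONLY_COMMANDS = frozenset({"list", "count"})


def build_argv(command: str, *args: str) -> list[str]:
    """Return the argv to run for an exposed command, or raise if it is not exposed."""
    if command not in READ_ONLY_COMMANDS:
        # Writes such as "add" or "clear" are not filtered one by one; they are simply absent.
        raise PermissionError(f"command not exposed in read-only mode: {command!r}")
    return ["python3", "cli.py", command, *args]
```

An allowlist fails closed: a command added to the CLI later stays unreachable until someone deliberately exposes it.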
@@ -21,7 +21,7 @@ Set your Notion token and run the sync:

 ```
 export NOTION_TOKEN="secret_..."
-python tools/sync.py
+python3 tools/sync.py
 ```

 ## Usage notes for the AI assistant
@@ -196,7 +196,7 @@ This lab does **not** use `tasks-app`; the entire point is a codebase you *didn'
 git clone <repo-url> unfamiliar-repo
 cd unfamiliar-repo
 # copy modules/23-working-with-existing-codebases/lab/orient.py into this folder
-python orient.py > ORIENT.md
+python3 orient.py > ORIENT.md
 ```

 2. Read `ORIENT.md` yourself first. In 30 seconds you should know the language, the likely entry
@@ -12,8 +12,8 @@ then verifies and deepens this; you never let it map from vibes alone.

 No dependencies. Standard library only. Works on any OS with Python 3.10+ and git.

-    python orient.py # print the pack
-    python orient.py > ORIENT.md # save it to hand to the AI (don't commit it)
+    python3 orient.py # print the pack
+    python3 orient.py > ORIENT.md # save it to hand to the AI (don't commit it)
 """

 from __future__ import annotations
@@ -61,7 +61,7 @@ SIGNALS: dict[str, str] = {

 # Common test-runner hints keyed off a present signal file.
 TEST_HINTS: dict[str, str] = {
-    "pyproject.toml": "pytest (or: python -m pytest)",
+    "pyproject.toml": "pytest (or: python3 -m pytest)",
     "tox.ini": "tox",
     "package.json": "npm test (check the \"scripts\" block for the real command)",
     "go.mod": "go test ./...",
@@ -184,7 +184,7 @@ This is the real production loop with the forge plumbing simulated locally.

 **You'll need:**

-- Python 3.10+ (`python --version`).
+- Python 3.10+ (`python3 --version`).
 - The lab files in `~/ai-workflow-course/modules/24-assistive-agents/lab/`.
 - Claude Code (`claude --version`; sub your own agent), the editor/CLI agent from Module 4.

@@ -205,7 +205,7 @@ it runs the scripts and writes the files. You verify at the gate.

 ```
 You: In ~/ai-workflow-course/modules/24-assistive-agents/lab, run
-`python reviewer.py apply ai-review.sample.json` and show me the output.
+`python3 reviewer.py apply ai-review.sample.json` and show me the output.
 ```

 Read what comes back: comments sorted by severity, a recommendation, and then the **human decision
@@ -215,7 +215,7 @@ it runs the scripts and writes the files. You verify at the gate.
 the reviewer, and write its JSON review to a file:

 ```
-You: Run `python reviewer.py prompt`, follow the rubric in that output to review the diff, and
+You: Run `python3 reviewer.py prompt`, follow the rubric in that output to review the diff, and
 save your review as JSON to my-review.json.
 ```

@@ -226,7 +226,7 @@ it runs the scripts and writes the files. You verify at the gate.
 3. Have the agent render its own review through the gate:

 ```
-You: Run `python reviewer.py apply my-review.json` and show me the result.
+You: Run `python3 reviewer.py apply my-review.json` and show me the result.
 ```

 4. **Make the human decision. This part stays yours.** Open `feature.patch` and check the agent's
@@ -243,7 +243,7 @@ A new issue just arrived: `sample-issue.md` (the `done` command crashes on an em
 1. See the loop with the canned response:

 ```
-You: Run `python triage.py apply ai-triage.sample.json` and show me the output.
+You: Run `python3 triage.py apply ai-triage.sample.json` and show me the output.
 ```

 Read the suggested labels, the route, and the **human confirm gate**. The agent applied nothing.
@@ -252,14 +252,14 @@ A new issue just arrived: `sample-issue.md` (the `done` command crashes on an em
 and save its suggestion:

 ```
-You: Run `python triage.py prompt`, follow it to triage the issue using only the committed
+You: Run `python3 triage.py prompt`, follow it to triage the issue using only the committed
 taxonomy, and save your JSON suggestion to my-triage.json.
 ```

 3. Render the suggestion through the gate:

 ```
-You: Run `python triage.py apply my-triage.json` and show me the result.
+You: Run `python3 triage.py apply my-triage.json` and show me the result.
 ```

 4. **Watch the guardrail.** The script validates every suggested label against the committed
@@ -340,8 +340,8 @@ This is expansion-zone material; the agent-tooling landscape moves fast. Re-chec
 merge/close, so comment/label-only is actually grantable? Name two that do.
 - [ ] Is the turnkey "AI review bot / app" framing still accurate, or has the dominant pattern shifted
 (e.g. baked into the forge, or into editor agents)? Keep the description vendor-neutral.
-- [ ] Confirm the lab scripts run on a current Python (`python reviewer.py apply ai-review.sample.json`
-and `python triage.py apply ai-triage.sample.json`) with no dependencies.
+- [ ] Confirm the lab scripts run on a current Python (`python3 reviewer.py apply ai-review.sample.json`
+and `python3 triage.py apply ai-triage.sample.json`) with no dependencies.
 - [ ] Re-verify the cross-references resolve to the right module numbers (9, 10, 13, 14, 15, 22, 25)
 if any modules were renumbered.
 - [ ] Check that nothing here pins a specific LLM vendor or a specific bot's config filename.
@@ -6,8 +6,8 @@ index 91e9276..b2c4f1a 100644
  def main(argv: list[str]) -> int:
      tlist = load()
      if not argv:
--        print("usage: python cli.py [add <title> | list | done <index>]")
-+        print("usage: python cli.py [add <title> | list | done <index> | clear]")
+-        print("usage: python3 cli.py [add <title> | list | done <index>]")
++        print("usage: python3 cli.py [add <title> | list | done <index> | clear]")
          return 1

      command = argv[0]
@@ -4,8 +4,8 @@ This stands in for a forge-native reviewer (an app/bot triggered when a PR opens
 runner from Module 19) without needing any hosted account. It does the two deterministic halves of
 the job and leaves the one judgment call (what actually happens to the PR) to you.

-    python reviewer.py prompt # assemble the prompt: rubric + diff, for the agent to review
-    python reviewer.py apply ai-review.sample.json # ingest the agent's JSON, render it, gate it
+    python3 reviewer.py prompt # assemble the prompt: rubric + diff, for the agent to review
+    python3 reviewer.py apply ai-review.sample.json # ingest the agent's JSON, render it, gate it

 The point of this module: the agent produces comments and a recommendation. It never approves,
 never requests-changes-as-a-gate, never merges. The `apply` step ends at a HUMAN DECISION, every
@@ -1,12 +1,12 @@
 Title: `done` command crashes on an empty list

-When I run `python cli.py done 0` right after a fresh checkout, before adding any tasks, it throws
+When I run `python3 cli.py done 0` right after a fresh checkout, before adding any tasks, it throws
 an IndexError and dumps a stack trace instead of a friendly message. Every other command handles the
 empty-list case fine, so this one feels like an oversight.

 Steps to reproduce:
 1. Delete tasks.json (or clone fresh).
-2. Run `python cli.py done 0`.
+2. Run `python3 cli.py done 0`.
 3. See the traceback.

 Expected: a clear message like "no task at index 0", exit non-zero, no traceback.
@@ -4,8 +4,8 @@ Stands in for a forge-native triage agent (triggered when an issue opens) withou
 It assembles the prompt, then validates and renders the AI's suggestion, and stops at a human
 confirm. The agent proposes labels and a route; it does not apply them.

-    python triage.py prompt # taxonomy + issue -> prompt for the agent
-    python triage.py apply ai-triage.sample.json # validate + render + confirm gate
+    python3 triage.py prompt # taxonomy + issue -> prompt for the agent
+    python3 triage.py apply ai-triage.sample.json # validate + render + confirm gate

 The validation step matters: the agent may only use labels that exist in label-taxonomy.md. A
 hallucinated label is rejected. Stdlib only, no pip install.
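The validation rule in the docstring above (only labels from the committed taxonomy survive) amounts to a membership check. This sketch assumes the taxonomy has already been parsed into a set; the `validate_labels` helper is hypothetical, and how `triage.py` actually reads `label-taxonomy.md` is not shown in this diff.

```python
def validate_labels(suggested: list[str], taxonomy: set[str]) -> tuple[list[str], list[str]]:
    """Split suggested labels into (accepted, rejected) against the committed taxonomy."""
    accepted = [label for label in suggested if label in taxonomy]
    rejected = [label for label in suggested if label not in taxonomy]
    return accepted, rejected
```

With a hypothetical taxonomy of `{"bug", "enhancement", "cli"}`, a suggestion of `["bug", "cli", "urgent-fire"]` would keep the first two and reject the third. Nothing is applied either way; the result still goes to the human confirm gate.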
@@ -256,7 +256,7 @@ out of the agent's `git add -A`, so the change you review in Part B is clean. Th

 ```bash
 # Simulate an agent that produces a BROKEN change, then run the gate on it:
-python agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
+python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
 ```

 The orchestrator creates and switches to its own `agent/issue-delete-command` branch first (the same
@@ -270,7 +270,7 @@ reached `main`.
 ### Part B: See a good change land as a PR proposal

 ```bash
-python agent_runner.py issue-to-pr issue-delete-command.md --simulate good
+python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate good
 ```

 This time the planted change is correct. The gate passes, the script commits to the branch and prints
@@ -284,7 +284,7 @@ stops at a PR; it never merges.
 ### Part C: Run the self-healing loop

 ```bash
-python agent_runner.py self-heal --simulate bad
+python3 agent_runner.py self-heal --simulate bad
 ```

 The orchestrator switches to its own `agent/self-heal` branch (again, you direct the automation, not
@@ -302,7 +302,7 @@ Two ways to go from simulation to a genuine autonomous run:

 ```bash
 export AGENT_CMD='your-agent-cli --print --prompt-file {prompt_file}' # your tool's one-shot mode
-python agent_runner.py issue-to-pr issue-delete-command.md
+python3 agent_runner.py issue-to-pr issue-delete-command.md
 ```

 The script builds the prompt from the issue **and** your committed config (Module 5), runs your
@@ -62,7 +62,7 @@ jobs:
 # In the triggered case, write the issue body to a file for the agent to read. Read it from
 # $BODY so the shell treats it as data, not as script text.
 printf '%s' "$BODY" > issue.md
-python modules/25-autonomous-agents/lab/agent_runner.py issue-to-pr issue.md
+python3 modules/25-autonomous-agents/lab/agent_runner.py issue-to-pr issue.md

 # The agent's output is a PROPOSAL. Open the PR; do NOT merge. CI + security + review decide.
 # (Use your forge's PR-creation step or CLI here; kept generic to stay vendor-neutral.)
@@ -14,10 +14,10 @@ not a human watching it type.
 Run it two ways:

 1. Simulated (no agent needed, fully deterministic); see the machinery and the gates:
-    python agent_runner.py issue-to-pr issue-delete-command.md --simulate good
-    python agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
-    python agent_runner.py self-heal --simulate bad
-    python agent_runner.py self-heal --simulate stuck
+    python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate good
+    python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
+    python3 agent_runner.py self-heal --simulate bad
+    python3 agent_runner.py self-heal --simulate stuck

 Simulation works on a SELF-CONTAINED demo target (agent_demo.py + test_agent_demo.py) so it is
 deterministic and never corrupts your real tasks-app files. The gate it runs (ruff + pytest) is
@@ -26,7 +26,7 @@ Run it two ways:
 2. Real agent: drives your own agentic tool against the actual issue. Point AGENT_CMD at your
 tool's non-interactive / one-shot mode, then drop --simulate:
     export AGENT_CMD='your-agent-cli --print --prompt-file {prompt_file}'
-    python agent_runner.py issue-to-pr issue-delete-command.md
+    python3 agent_runner.py issue-to-pr issue-delete-command.md

 Language: Python 3.10+. Standard library only.
 """
@@ -20,7 +20,7 @@ is a patterned change, not a design problem.

 ## Acceptance criteria

-- `python cli.py delete <index>` removes the task at that 0-based index and saves the list.
+- `python3 cli.py delete <index>` removes the task at that 0-based index and saves the list.
 - After deleting, the remaining tasks keep their relative order.
 - `delete` with an out-of-range or non-integer index prints a clear error (e.g.
 `no task at index 99`) and exits non-zero, instead of dumping a traceback.
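The acceptance criteria above translate into a small amount of index handling. This is only a sketch under assumptions: `delete_task` is a hypothetical helper operating on a plain list of titles (the lab's `TaskList` is not shown in this diff), saving to `tasks.json` is left out, and only the `no task at index 99` wording comes from the issue; the other messages are invented here.

```python
def delete_task(tasks: list[str], raw_index: str) -> tuple[int, str]:
    """Delete tasks[raw_index] in place; return (exit_code, message) for the CLI to use."""
    try:
        index = int(raw_index)
    except ValueError:
        return 1, f"not a valid index: {raw_index!r}"
    # Reject negatives too: Python would otherwise count them from the end of the list.
    if not 0 <= index < len(tasks):
        return 1, f"no task at index {index}"
    removed = tasks.pop(index)  # pop keeps the remaining items in their original order
    return 0, f"deleted: {removed}"
```

Returning the exit code instead of raising lets the CLI print the message and exit non-zero without a traceback, which is the behavior the third criterion asks for.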
@@ -401,7 +401,7 @@ thing you're waiting on.

 ```bash
 cd ~/ai-workflow-course/tasks-app
-python cli.py list && python cli.py count && python cli.py clear # all three features live
+python3 cli.py list && python3 cli.py count && python3 cli.py clear # all three features live
 ```

 If any of those three commands fails, the resolution was wrong. That's why you verify the result
@@ -15,7 +15,7 @@ You are working in this worktree only. Do not touch any other folder.

 **Acceptance criteria:**

-- `python cli.py count` prints the number of pending tasks and exits 0.
+- `python3 cli.py count` prints the number of pending tasks and exits 0.
 - No other files change. (`README.md`, `CHANGELOG.md`, and `tasks.py` are owned by other agents;
 stay out of them.)

@@ -17,8 +17,8 @@ You are working in this worktree only. Do not touch any other folder.

 **Acceptance criteria:**

-- `python cli.py clear` removes all tasks and prints `cleared`.
-- `python cli.py list` afterward shows `(no tasks yet)`.
+- `python3 cli.py clear` removes all tasks and prints `cleared`.
+- `python3 cli.py list` afterward shows `(no tasks yet)`.

 When done, commit your work on this branch with a message referencing #44, then push the branch. Stop
 there; the human opens and reviews the PR, and should expect a conflict against `feature/42-count` at
@@ -15,11 +15,11 @@ This is the running example for **Module 1** (where you feel the copy-paste prob
 ## Run it

 ```bash
-python cli.py add "read module 1"
-python cli.py add "set up my editor"
-python cli.py list
-python cli.py done 0
-python cli.py list
+python3 cli.py add "read module 1"
+python3 cli.py add "set up my editor"
+python3 cli.py list
+python3 cli.py done 0
+python3 cli.py list
 ```

 Requires Python 3.10+ (it uses `list[Task]` style type hints). No third-party packages.
@@ -1,8 +1,8 @@
 """Tiny command-line front end for the demo task app.

 Run it:
-    python cli.py add "write the lesson"
-    python cli.py list
+    python3 cli.py add "write the lesson"
+    python3 cli.py list

 State is kept in tasks.json next to this file. It's intentionally minimal; the point of this app
 is to be a realistic-but-small thing you change with an AI, not a product.
@@ -31,7 +31,7 @@ def save(tlist: TaskList) -> None:
 def main(argv: list[str]) -> int:
     tlist = load()
     if not argv:
-        print("usage: python cli.py [add <title> | list | done <index> | count | delete <index>]")
+        print("usage: python3 cli.py [add <title> | list | done <index> | count | delete <index>]")
         return 1

     command = argv[0]
@@ -245,7 +245,7 @@ output.

 ```bash
 cd modules/27-evals/lab
-python run_eval.py candidates/current_model
+python3 run_eval.py candidates/current_model
 echo "exit code: $?"
 ```

@@ -258,7 +258,7 @@ output.
 2. Now simulate the swap: run the *exact same eval set* against the other candidate:

 ```bash
-python run_eval.py candidates/swapped_model
+python3 run_eval.py candidates/swapped_model
 echo "exit code: $?"
 ```

@@ -275,7 +275,7 @@ output.
|
||||
yourself and read the scorecard:
|
||||
|
||||
```bash
|
||||
python run_eval.py candidates/my_run_1
|
||||
python3 run_eval.py candidates/my_run_1
|
||||
```
|
||||
|
||||
4. Now actually swap something. Either change the model Claude Code uses, or change the *prompt* (ask
|
@@ -309,10 +309,10 @@ output.
 `working-directory:` line makes the CI job `cd` into the lab folder first, so the
 `candidates/...` path and `run_eval.py`'s own `from eval_set import CASES` resolve exactly as they
 did on your machine. (Drop it and point a repo-root job straight at
-`python modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
+`python3 modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
 won't exist from the repo root: the gate crashes with a *false* failure, which is worse than no
 gate. If the agent prefers a single line, it can spell both paths out from the repo root:
-`python modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
+`python3 modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
 --threshold 1.0`.)

 Below threshold exits non-zero and the pipeline blocks, exactly like a failing test. The guardrail
@@ -388,5 +388,5 @@ This is an expansion-zone module over fast-moving ground. Re-check at build/publ
 - [ ] **Module cross-references.** Confirm Modules 13, 14, 10, and 24–26 still carry the
 responsibilities referenced here (tests, CI gating, review, the agent autonomy ladder) and that
 none were renumbered.
-- [ ] **Lab still runs.** `python run_eval.py candidates/current_model` exits 0 at 100%, and
+- [ ] **Lab still runs.** `python3 run_eval.py candidates/current_model` exits 0 at 100%, and
 `candidates/swapped_model` exits 1 below threshold, on a current Python 3.x.

@@ -14,7 +14,7 @@ than pretending. NOTHING here pins a provider.
 EVAL_JUDGE_MODEL # the model name to ask for

 Run it standalone to grade one sample:
-python llm_judge.py "Add count command" "fix"
+python3 llm_judge.py "Add count command" "fix"
 """

 import json

@@ -1,9 +1,9 @@
 """Run the eval set against one candidate and print a scorecard.

 Usage:
-python run_eval.py candidates/current_model
-python run_eval.py candidates/swapped_model
-python run_eval.py candidates/current_model --threshold 0.9
+python3 run_eval.py candidates/current_model
+python3 run_eval.py candidates/swapped_model
+python3 run_eval.py candidates/current_model --threshold 0.9

 A "candidate" is a directory containing a tasks.py that an agent produced. The
 runner imports that tasks.py, runs every case in eval_set.py against it, prints