docs(wiki): sync from modules/ @ 95e59119
+14
-12
@@ -151,11 +151,13 @@ purpose** so you recognize it later.
 - Python 3.10 or newer (`python --version` or `python3 --version` to check).
 - Your usual AI chat assistant, open in a browser tab.

-> **One command name, the whole course through:** whichever of `python` / `python3` just printed a
-> 3.10+ version is the command to use in *every* lab from here on. The labs are written with
-> `python`; if that's "command not found" on your machine (common on current macOS and default
-> Debian/Ubuntu, where Python is installed only as `python3`), read it as `python3` (and `pip3`
-> wherever a lab uses `pip`). This note holds course-wide; we won't repeat it.
+> **One command name, the whole course through:** the labs are written with `python3`, the command
+> name current macOS and default Debian/Ubuntu actually ship (they install Python only as `python3`,
+> with no bare `python` on PATH). Run `python3 --version`; if it prints a 3.10+ version, use `python3`
+> in *every* lab from here on. If `python3` is "command not found" but `python --version` shows a
+> 3.10+ version (older or some Windows setups), read every `python3` in the labs as `python` instead.
+> Where a lab runs `pip`, use whichever pairs with your Python (`pip3` commonly goes with `python3`).
+> This note holds course-wide; we won't repeat it.

 ### Get the course materials
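The rule in the rewritten interpreter note (prefer `python3`, fall back to `python`, require 3.10+) can be sketched in a few lines. This is a minimal sketch and not part of the course files: `find_python` and `meets_minimum` are hypothetical helper names, and the only real calls are `shutil.which` and `subprocess.run` from the standard library.

```python
import shutil
import subprocess


def find_python(candidates=("python3", "python"), which=shutil.which):
    """Return the first command name in `candidates` found on PATH, else None."""
    for name in candidates:
        if which(name):
            return name
    return None


def meets_minimum(version_text, minimum=(3, 10)):
    """True if text like 'Python 3.12.1' names a version >= `minimum`."""
    words = version_text.split()
    if not words:
        return False
    try:
        major, minor = words[-1].split(".")[:2]
        return (int(major), int(minor)) >= minimum
    except ValueError:
        # Output that does not end in a dotted version number.
        return False


if __name__ == "__main__":
    command = find_python()
    if command is None:
        print("neither python3 nor python is on PATH")
    else:
        result = subprocess.run([command, "--version"], capture_output=True, text=True)
        # Very old interpreters print the version on stderr rather than stdout.
        if meets_minimum(result.stdout or result.stderr):
            print(f"use `{command}` in every lab")
        else:
            print(f"`{command}` does not report a 3.10+ version")
```

The `which` parameter exists only so the lookup can be exercised without touching the real PATH; in normal use the default is enough.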
@@ -196,8 +198,8 @@ You now have every module's files locally, including this one's under
 3. Run it in your terminal to confirm it works:

 ```bash
-python cli.py add "finish module 1"
-python cli.py list
+python3 cli.py add "finish module 1"
+python3 cli.py list
 ```

 You should see your task listed. **This is your "real local project, an editor, and a terminal."**
@@ -208,14 +210,14 @@ You now have every module's files locally, including this one's under
 Now reproduce each failure deliberately. Keep the AI strictly in the **browser chat**; no
 editor-integrated tools yet (those arrive in Module 4). This is the "before" picture on purpose.

-1. **Seam 1 (multiple files).** First mark a task done so there's something to hide. Run `python
-cli.py done 0`, then `python cli.py list` shows it as `[x]`. Now paste *only* `cli.py` into your
+1. **Seam 1 (multiple files).** First mark a task done so there's something to hide. Run `python3
+cli.py done 0`, then `python3 cli.py list` shows it as `[x]`. Now paste *only* `cli.py` into your
 chat and ask: *"Make the `list` command hide tasks that are already done."* Apply whatever it
-gives you and run `python cli.py list`. The clean version of this change lives in `tasks.py`, the
+gives you and run `python3 cli.py list`. The clean version of this change lives in `tasks.py`, the
 file you *didn't* paste: open it and you'll see `render()` already owns the `[x]`/`[ ]`
 box-and-index formatting, and a `pending()` helper already returns exactly the not-done tasks. But
 the chat never saw that file, so it had to do one of two things. Either it guessed at methods it
-couldn't see (and `python cli.py list` errors out), or it reached into the raw task list and
+couldn't see (and `python3 cli.py list` errors out), or it reached into the raw task list and
 *re-created* that box-and-index formatting inside `cli.py`, duplicating logic that already existed
 one file over. Either way, *you* had to be the one who knew the change really belonged in the
 other file.
@@ -254,7 +256,7 @@ Be honest about the limits of this module's claims:
 **You're done when:**

-- You can run `python cli.py list` in your terminal and see output; your project, editor, and
+- You can run `python3 cli.py list` in your terminal and see output; your project, editor, and
 terminal are working together.
 - You can name the three seams where copy-paste breaks (more than one file, more than one day, no
 undo) without looking back at the lesson.
@@ -210,7 +210,7 @@ and your AI assistant.
 This is the habit that replaces "paste it back and hope." You're reading exactly what changed,
 nothing more, nothing less. Confirm it does what you asked and didn't touch anything it shouldn't.
-Run it (`python cli.py count`), then commit:
+Run it (`python3 cli.py count`), then commit:

 ```bash
 git add .
@@ -229,7 +229,7 @@ and your AI assistant.
 git status # shows tasks.py as modified
 git restore tasks.py # discard the change; back to your last commit, byte for byte
 git diff # empty: nothing changed. you're clean.
-python cli.py list # works again
+python3 cli.py list # works again
 ```

 You just recovered from a bad AI change in one command, with zero retyping and zero guesswork.
@@ -264,7 +264,7 @@ and your AI assistant.
 9. Close the loop and leave the repo clean. The cold session just told you what's in progress and
 what to do next: finish the `delete <index>` command. Do that with the AI (paste in `cli.py` the
-same way as Part B), run it to confirm it works (`python cli.py delete 1`), then commit:
+same way as Part B), run it to confirm it works (`python3 cli.py delete 1`), then commit:

 ```bash
 git add .
@@ -54,7 +54,7 @@ committed instructions file from the repo, and you control what's in it.**
 > content if so. The principle outlives any one vendor's filename.

 Without this file, you re-explain your project every session: "we use 4-space indent," "run the tests
-with `python -m unittest` before you say you're done," "don't touch the generated `tasks.json`." You say it,
+with `python3 -m unittest` before you say you're done," "don't touch the generated `tasks.json`." You say it,
 the AI complies, the session ends, the memory evaporates (Module 1's second seam), and tomorrow you
 say it all again. The instructions file is where that knowledge stops being something you retype and
 becomes something the project *carries*.
@@ -68,7 +68,7 @@ a briefing for an agent that will edit this code. Keep it to what changes the AI
 uses. "Core logic lives in `tasks.py`; the CLI front end is `cli.py`; state persists to
 `tasks.json`."
 - **Build and test commands**: the exact commands, copy-pasteable. "Run the app with
-`python cli.py <command>`. Run tests with `python -m unittest`. Don't claim a change works until
+`python3 cli.py <command>`. Run tests with `python3 -m unittest`. Don't claim a change works until
 the tests pass." This single line stops the AI from inventing a test runner you don't use.
 - **Coding standards**: formatting, typing, error handling, the libraries you do and don't want.
 "Use the standard library only, no third-party packages. Type-hint public functions."
@@ -89,7 +89,7 @@ useful for personal preferences, but it's the wrong home for project knowledge,
 lives: on *your* laptop, invisible to everyone else.

 Picture a two-person project with no committed instructions file. You've trained your local setup to
-run `python -m unittest` and avoid `tasks.json`. Your teammate's setup hasn't, so their agent reformats whole files
+run `python3 -m unittest` and avoid `tasks.json`. Your teammate's setup hasn't, so their agent reformats whole files
 and hand-edits the generated JSON. You're both "using AI on the same repo," but you're getting
 different behavior, and neither of you can see the other's configuration. That's **drift**: the same
 codebase, diverging because the rules live in two heads instead of one file.
@@ -221,7 +221,7 @@ editor-integrated AI (Module 4) for the part where the AI obeys the file.
 - The `tasks-app` repo from Module 2 (already a Git repo with some history).
 - Your agentic coding tool from Module 4, and knowledge of which filename it reads for repo-level
 instructions (check its docs; see the note in *Key concepts*).
-- Optionally, a test command for the AI to honor; Python's built-in `python -m unittest` works with
+- Optionally, a test command for the AI to honor; Python's built-in `python3 -m unittest` works with
 nothing to install (you'll write a real suite in Module 13; until then it simply reports no tests).

 ### Part A: Write the instructions file and let the AI commit the config
@@ -327,7 +327,7 @@ Be honest about what a committed instructions file does and doesn't buy you:
 - **Bloat kills it.** A 300-line instructions file is read the way *you* read a 300-line terms-of-
 service: not really. Every line you add dilutes the rest. Keep it to what actually changes behavior,
 and prune lines the model already honors without being told.
-- **Stale instructions are worse than none.** A file that says "run the tests with `python -m
+- **Stale instructions are worse than none.** A file that says "run the tests with `python3 -m
 unittest`" after you've switched to a different runner will actively misdirect the AI. The file is
 code-adjacent: it has to be maintained like code, and reviewed like code. That's exactly why
 committing it (so changes are
@@ -170,9 +170,9 @@ decide:
 ```python
 <<<<<<< HEAD
-print("usage: python cli.py [add <title> | list | done <index> | stats]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
 =======
-print("usage: python cli.py [add <title> | list | done <index> | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
 >>>>>>> experiment
 ```
@@ -301,9 +301,9 @@ the one job that's still yours: verify the result.
 ```bash
 git diff # read what it actually changed
-python cli.py add "ship module 6" --priority high
-python cli.py add "water plants" --priority low
-python cli.py list # see if priorities work and sort
+python3 cli.py add "ship module 6" --priority high
+python3 cli.py add "water plants" --priority low
+python3 cli.py list # see if priorities work and sort
 ```

 Once the diff looks right and the feature runs, tell the agent:
@@ -318,7 +318,7 @@ the one job that's still yours: verify the result.
 > *"Switch back to `main`."*

 ```bash
-python cli.py list # no priorities; main is exactly as you left it
+python3 cli.py list # no priorities; main is exactly as you left it
 ```

 Your bold change exists only on the branch. `main` never saw it, and that's the whole point.
@@ -337,7 +337,7 @@ Then verify the result yourself:
 ```bash
 git log --oneline --graph # straight line = fast-forward merge
-python cli.py list # the feature is now on main
+python3 cli.py list # the feature is now on main
 git branch # experiment/priorities is gone
 ```
@@ -349,7 +349,7 @@ Then verify:
 ```bash
 git log --oneline # no trace of the experiment on main
-python cli.py list # main is untouched, exactly as before
+python3 cli.py list # main is untouched, exactly as before
 git branch # the branch is gone
 ```
@@ -417,9 +417,9 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu
 ```python
 <<<<<<< HEAD
-print("usage: python cli.py [add <title> | list | done <index> | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
 =======
-print("usage: python cli.py [add <title> | list | done <index> | stats]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
 >>>>>>> feature/stats
 ```
@@ -452,7 +452,7 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu
 should have produced a single, marker-free line listing both commands, e.g.:

 ```python
-print("usage: python cli.py [add <title> | list | done <index> | stats | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats | purge]")
 ```

 **Here is the punchline of the whole module: you have no idea yet whether that's right, so verify.**
@@ -464,9 +464,9 @@ Merge conflicts have an outsized reputation for difficulty. You'll engineer a gu
 ```bash
 git diff HEAD~1 # what the merge actually changed; confirm no markers remain
 git log --oneline --graph # the fork-and-join: this is a merge commit
-python cli.py # run with no args, see the merged usage string
-python cli.py stats # both commands actually work
-python cli.py purge
+python3 cli.py # run with no args, see the merged usage string
+python3 cli.py stats # both commands actually work
+python3 cli.py purge
 ```

 If the usage line lists both commands and both run, the AI's silent resolution was correct. If it
@@ -329,8 +329,8 @@ This is the part to actually *do simultaneously*, not one then the other.
 writing them.) Give each worktree its own task and list it:

 ```bash
-cd ~/ai-workflow-course/tasks-app-wipe && python cli.py add "from worktree A" && python cli.py list
-cd ~/ai-workflow-course/tasks-app-remaining && python cli.py add "from worktree B" && python cli.py list
+cd ~/ai-workflow-course/tasks-app-wipe && python3 cli.py add "from worktree A" && python3 cli.py list
+cd ~/ai-workflow-course/tasks-app-remaining && python3 cli.py add "from worktree B" && python3 cli.py list
 ```

 Each `list` shows only its own task: worktree A never sees "from worktree B" and vice versa. Each
@@ -355,8 +355,8 @@ This is the part to actually *do simultaneously*, not one then the other.
 5. *Now* the new commands exist: run each in its own worktree to watch it work:

 ```bash
-cd ~/ai-workflow-course/tasks-app-wipe && python cli.py wipe # agent A's new command
-cd ~/ai-workflow-course/tasks-app-remaining && python cli.py remaining # agent B's new command
+cd ~/ai-workflow-course/tasks-app-wipe && python3 cli.py wipe # agent A's new command
+cd ~/ai-workflow-course/tasks-app-remaining && python3 cli.py remaining # agent B's new command
 ```

 `remaining` counts a single pending task, the one you added to worktree B in step 3, because B's
@@ -384,9 +384,9 @@ Then **verify** the result before you trust it, the same way you did in Module 6
 ```bash
 cd ~/ai-workflow-course/tasks-app
 git diff # no conflict markers remain
-python cli.py list # the app still runs
-python cli.py wipe # both new commands work
-python cli.py remaining
+python3 cli.py list # the app still runs
+python3 cli.py wipe # both new commands work
+python3 cli.py remaining
 ```

 Now tear down the worktrees. Direct the coordinating session:
@@ -111,8 +111,8 @@ well-formed version of the same bug:
 > **Title:** `done` command crashes on an out-of-range or non-integer index
 >
-> **Context:** `python cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and
-> dumps a traceback. `python cli.py done abc` raises `ValueError`. Either way the user sees a stack
+> **Context:** `python3 cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and
+> dumps a traceback. `python3 cli.py done abc` raises `ValueError`. Either way the user sees a stack
 > trace instead of a helpful message.
 >
 > **Acceptance criteria:**
@@ -270,7 +270,7 @@ plenty it still can't do. Because it's carried forward across modules, skip anyt
 already built (a `delete` command, task priorities) and pick work that's genuinely still missing.
 Good candidates:

-1. **A bug**: `python cli.py done 99` (an out-of-range index) and `python cli.py done abc` (a
+1. **A bug**: `python3 cli.py done 99` (an out-of-range index) and `python3 cli.py done abc` (a
 non-integer) both crash with an uncaught traceback. Run them and watch.
 2. **A small, patterned feature**: an `undone <index>` command that clears a task's done flag,
 mirroring the existing `done` command (it's the inverse).
@@ -249,8 +249,8 @@ real change, then review a diff the "AI" produced and catch the trap planted in
 Then see the baseline behavior with your own eyes, because the trap is going to change it:

 ```bash
-python cli.py add "write the review module"
-python cli.py done 99 # baseline: prints "error: no task at index 99", exits non-zero
+python3 cli.py add "write the review module"
+python3 cli.py done 99 # baseline: prints "error: no task at index 99", exits non-zero
 echo "exit code: $?"
 ```
@@ -302,12 +302,12 @@ real change, then review a diff the "AI" produced and catch the trap planted in
 5. Now verify your read by running the *failure* path, not the happy one:

 ```bash
-python cli.py add "a real task"
-python cli.py delete 0 # the requested feature: works fine on the happy path
-python cli.py add "another"
-python cli.py done 99 # the trap: compare this to your Part A baseline
+python3 cli.py add "a real task"
+python3 cli.py delete 0 # the requested feature: works fine on the happy path
+python3 cli.py add "another"
+python3 cli.py done 99 # the trap: compare this to your Part A baseline
 echo "exit code: $?"
-python cli.py list # did task 99 (which doesn't exist) get marked done? did anything?
+python3 cli.py list # did task 99 (which doesn't exist) get marked done? did anything?
 ```

 In the base app, `done 99` was a clean error with a non-zero exit. After this "add a delete
@@ -361,11 +361,11 @@ the server say *no* is the point: "never commit to `main`" is now a rule, not a
 the CLI), and it does what you asked. Run it:

 ```bash
-python cli.py add "keeper" ; python cli.py add "trash"
-python cli.py list # note the index shown next to "trash"
-python cli.py done <trash-index> # use the index "list" just printed, NOT a fixed 1
-python cli.py clear-done # expect it to remove the completed one
-python cli.py list # "keeper" remains, "trash" is gone
+python3 cli.py add "keeper" ; python3 cli.py add "trash"
+python3 cli.py list # note the index shown next to "trash"
+python3 cli.py done <trash-index> # use the index "list" just printed, NOT a fixed 1
+python3 cli.py clear-done # expect it to remove the completed one
+python3 cli.py list # "keeper" remains, "trash" is gone
 ```

 Read the index off `list` rather than assuming it: `done` is positional, and your `tasks-app` has
@@ -301,9 +301,9 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 4. **Now feel the bug.** It passes the first skim:

 ```bash
-python cli.py add "ship it"
-python cli.py clear # prints "cleared all tasks", looks fine!
-python cli.py list # CRASHES: it corrupted tasks.json, load() blows up
+python3 cli.py add "ship it"
+python3 cli.py clear # prints "cleared all tasks", looks fine!
+python3 cli.py list # CRASHES: it corrupted tasks.json, load() blows up
 ```

 This is the AI plausibility trap made concrete: the change reviewed fine and "worked," and broke
@@ -343,8 +343,8 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 ```bash
 rm -f tasks.json # drop the corrupted state file the bug wrote
-python cli.py add "back to normal"
-python cli.py list # works again, the clear command is gone
+python3 cli.py add "back to normal"
+python3 cli.py list # works again, the clear command is gone
 git log --oneline # the bad merge is STILL there, with a revert after it
 ```
@@ -366,7 +366,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 ```bash
 git log --oneline -1 # "Add version command"
-python cli.py version # prints the version
+python3 cli.py version # prints the version
 ```

 2. Now destroy it the way an over-eager "clean up the history" cleanup (or an agent) would, with a
@@ -375,7 +375,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 ```bash
 git reset --hard HEAD~1
 git log --oneline -2 # the "Add version command" commit is GONE from the branch
-python cli.py version 2>/dev/null || echo "command no longer exists"
+python3 cli.py version 2>/dev/null || echo "command no longer exists"
 ```

 It's not in `log`. It feels permanently lost. It isn't.
@@ -390,7 +390,7 @@ that *you* hold the judgment: which undo, which parent, whether it actually work
 ```bash
 git log --oneline -1 # "Add version command" is back
-python cli.py version # works again
+python3 cli.py version # works again
 ```

 You just recovered a commit that `log` swore was gone. Note the honest limit: step 2's `--hard`
@@ -55,7 +55,7 @@ that runs a piece of your code and asserts that the result is what it should be.
 holds, the test passes silently. If it doesn't, the test fails loudly and tells you exactly which
 expectation broke.

-You've already been testing, by hand. Every time you ran `python cli.py list` and eyeballed the
+You've already been testing, by hand. Every time you ran `python3 cli.py list` and eyeballed the
 output, you ran a manual test: *do something, check the result looks right.* The problem with the
 manual version is the same problem copy-paste had in Module 1: it doesn't scale across files or
 across time. You can't re-run "eyeball every command" on every change, so you don't, so regressions
@@ -77,12 +77,12 @@ class TestTaskList(unittest.TestCase):
 self.assertEqual(tl.tasks[0].title, "write the tests")
 ```

-The whole suite runs from the project folder with a single command: `python -m unittest`
+The whole suite runs from the project folder with a single command: `python3 -m unittest`
 auto-discovers files named `test_*.py`, and `-v` prints each test name and its result. A verbose run
 looks like:

 ```text
-$ python -m unittest -v
+$ python3 -m unittest -v
 test_add_appends_a_task (test_tasks.TestTaskList) ... ok

 ----------------------------------------------------------------------
@@ -173,7 +173,7 @@ intent has to come from you.
 One more framing before the lab. A test file just sitting in your repo is useful when you remember to
 run it; like the manual eyeball check, you eventually won't. The full payoff comes in
-**Module 14**, where Continuous Integration runs this exact `python -m unittest` command
+**Module 14**, where Continuous Integration runs this exact `python3 -m unittest` command
 automatically on every push, so a regression can't reach `main` without something going red first.

 That's why this module comes immediately before CI: **tests are the content CI runs.** You can't
@@ -262,7 +262,7 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 2. Run it:

 ```bash
-python -m unittest -v
+python3 -m unittest -v
 ```

 You should see one test, and `OK`. That's the entire mechanism. Everything else is more of these.
@@ -292,7 +292,7 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 5. Run the suite:

 ```bash
-python -m unittest -v
+python3 -m unittest -v
 ```

 At least one `pending_count` test should **FAIL**, with something like
@@ -316,8 +316,8 @@ Do this once yourself so the tool isn't magic. From inside your working copy of
 return len(self.pending())
 ```

-Re-run `python -m unittest -v`; green. Confirm the app agrees:
-`python cli.py add a && python cli.py add b && python cli.py done 0 && python cli.py count`
+Re-run `python3 -m unittest -v`; green. Confirm the app agrees:
+`python3 cli.py add a && python3 cli.py add b && python3 cli.py done 0 && python3 cli.py count`
 should report **1 task(s) pending**.

 > Using your own app from earlier modules instead? If your `count` command was already correct,
@@ -371,7 +371,7 @@ The honest limits, because a green suite invites overconfidence:
 **You're done when:**

-- You can run `python -m unittest -v` in your `tasks-app` and see your own tests pass.
+- You can run `python3 -m unittest -v` in your `tasks-app` and see your own tests pass.
 - You watched an intent-encoding test **fail**, traced it to the real `pending_count` bug, fixed the
 *code*, and watched it pass.
 - You can articulate, in your own words, the difference between a test that asserts current behavior
@@ -84,7 +84,7 @@ Almost every CI configuration, on every forge, is the same four moves:
 4. **Run the checks**: lint, then test. Any check that exits non-zero fails the whole run.

 That last point is the load-bearing one. CI's entire enforcement mechanism is the **exit code**.
-Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `python -m
+Every tool you'd run in a terminal returns 0 for success and non-zero for failure. `python3 -m
 unittest` exits non-zero if a test fails. `ruff check` exits non-zero if it finds a lint problem. CI runs your
 commands and watches those exit codes; one failure turns the run red. You're not learning a new
 testing system; you're wiring the tools you already have to a trigger.
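The exit-code rule is easy to try without any CI service. What follows is a minimal sketch using only the standard library: `run_checks` is a hypothetical stand-in for what a CI runner does with your commands, and the two child commands are toy processes that do nothing except set an exit code.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order and stop at the first failure.

    Returns 0 if every command exited 0, otherwise the first non-zero
    exit code, which is all a CI run needs in order to go red.
    """
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Toy stand-ins for a passing and a failing check. sys.executable is the
# interpreter running this script, so no command name is assumed.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(3)"]

if __name__ == "__main__":
    print("all green:", run_checks([PASSING, PASSING]))
    print("one red:  ", run_checks([PASSING, FAILING, PASSING]))
```

Swapping the toy commands for your real test and lint commands gives the same behavior: the first non-zero exit decides the result, and nothing after it runs.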
@@ -160,7 +160,7 @@ When CI goes red, the skill is triage, and it's fast once you know the shape:
 3. **Read that step's log.** It's the same output the tool prints in your terminal: a failing
 `unittest` assertion, a `ruff` finding with a file and line number. CI didn't invent a new error
 format; it's showing you the command's own output.
-4. **Reproduce it locally.** The same command from the failed step (`python -m unittest` or
+4. **Reproduce it locally.** The same command from the failed step (`python3 -m unittest` or
 `ruff check .`) fails the same way on your own machine, because CI ran exactly that command. That
 reproducibility is the point: fix locally, confirm green locally, push again.
@@ -260,7 +260,7 @@ your machine first.
 that CI is nothing more than these same two commands is what makes the rest of the module click.

 ```bash
-python -m unittest # should report all tests passing
+python3 -m unittest # should report all tests passing
 ruff check . # should report no issues (or fix what it flags)
 ```
@@ -331,7 +331,7 @@ and watch CI stop it.
 `git restore` (Module 12). What the agent runs looks like:

 ```bash
-python -m unittest # fails locally too: same command, same failure
+python3 -m unittest # fails locally too: same command, same failure
 git revert --no-edit HEAD # new commit that undoes "Simplify pending()" (Module 12)
 git push # CI re-runs on the fixed code and goes green again
 ```
@@ -96,10 +96,10 @@ Set one for a single command:
 ```bash
 # macOS / Linux
-TASKS_API_KEY="sk-live-..." python sync.py
+TASKS_API_KEY="sk-live-..." python3 sync.py

 # Windows PowerShell
-$env:TASKS_API_KEY="sk-live-..."; python sync.py
+$env:TASKS_API_KEY="sk-live-..."; python3 sync.py
 ```

 Read it back in code, and **fail loudly if it's missing**, because a silent empty string is worse
@@ -313,7 +313,7 @@ type the commands by hand. Then you'll make it select config per environment.
 ```bash
 cd ~/ai-workflow-course/tasks-app
-python sync.py
+python3 sync.py
 ```

 It prints a simulated request, including `Authorization: Bearer sk-live-...`. Open `sync.py` and
@@ -413,7 +413,7 @@ type the commands by hand. Then you'll make it select config per environment.
 5. Run it reading from your `.env`:

 ```bash
-python sync.py # loads .env -> dev URL, key from the file
+python3 sync.py # loads .env -> dev URL, key from the file
 ```

 6. Now prove the 12-factor point: **same code, different environment, no edit.** Override at the
@@ -421,13 +421,13 @@ type the commands by hand. Then you'll make it select config per environment.
 ```bash
 # macOS / Linux
-APP_ENV=staging python sync.py
-APP_ENV=prod TASKS_API_KEY="sk-live-prod-key" python sync.py
+APP_ENV=staging python3 sync.py
+APP_ENV=prod TASKS_API_KEY="sk-live-prod-key" python3 sync.py
 ```

 ```powershell
 # Windows PowerShell
-$env:APP_ENV="staging"; python sync.py
+$env:APP_ENV="staging"; python3 sync.py
 ```

 Watch the backend URL change with `APP_ENV` while the source never does. That's config in the
@@ -489,7 +489,7 @@ left behind a `.env.example` so the next person (or agent) knows what to supply.
 - `sync.py` runs entirely from the environment, and `grep "sk-live" sync.py` prints nothing.
 - A real `.env` exists, contains your secret, and does **not** appear in `git status`, while
 `.env.example` is tracked.
-- `APP_ENV=staging python sync.py` and the default run hit different backend URLs with **zero**
+- `APP_ENV=staging python3 sync.py` and the default run hit different backend URLs with **zero**
 source edits between them.
 - You can state, in one sentence, why deleting a committed secret and re-committing does not fix the
 leak, and what the actual fix is (rotation).
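The pattern this checklist describes (environment picks the backend, the key never lives in source, a missing key fails loudly) can be sketched as follows. This is a minimal sketch under stated assumptions, not the course's `sync.py`: the `BACKENDS` table, its placeholder URLs, the `dev` default, and the `load_config` name are all invented for illustration; only `APP_ENV` and `TASKS_API_KEY` come from the text.

```python
import os
from collections.abc import Mapping

# Hypothetical per-environment backends; .invalid hosts never resolve.
BACKENDS = {
    "dev": "https://dev.tasks.invalid",
    "staging": "https://staging.tasks.invalid",
    "prod": "https://tasks.invalid",
}


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the run's config from environment variables only."""
    app_env = env.get("APP_ENV", "dev")  # assumed default: dev
    if app_env not in BACKENDS:
        raise ValueError(f"unknown APP_ENV {app_env!r}; expected one of {sorted(BACKENDS)}")

    api_key = env.get("TASKS_API_KEY", "")
    if not api_key:
        # Fail loudly: an empty key would otherwise surface later as a
        # confusing authentication error far from the real cause.
        raise RuntimeError("TASKS_API_KEY is not set")

    return {"env": app_env, "url": BACKENDS[app_env], "api_key": api_key}


if __name__ == "__main__":
    config = load_config()
    # Print where the request would go, never the secret itself.
    print(f"{config['env']}: {config['url']}")
```

Passing the environment in as a mapping keeps the function easy to check with a plain dict, while the default still reads the real process environment.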
@@ -256,7 +256,7 @@ A CLI that exits immediately is awkward to "deploy." Give the app a long-running
 2. Run the service locally first, no container, to see it work:

 ```bash
-python serve.py # serves on http://localhost:8000
+python3 serve.py # serves on http://localhost:8000
 ```

 In another terminal:
@@ -336,7 +336,7 @@ That's the entire client/server loop, end to end, with zero code you wrote. Now
 > contents so I can read it."*

 Then open the copied file yourself and read it. (It reuses `tasks.py` and shares the same
-`tasks.json`, so anything it changes shows up in `python cli.py list`.) The whole server is two
+`tasks.json`, so anything it changes shows up in `python3 cli.py list`.) The whole server is two
 tools:

 ```python
@@ -417,14 +417,14 @@ That's the entire client/server loop, end to end, with zero code you wrote. Now
|
||||
the way you'd verify any runtime effect, by reading the *state*, not the repo:
|
||||
|
||||
```bash
|
||||
python cli.py list # the new task is there, because the server wrote the same tasks.json
|
||||
python3 cli.py list # the new task is there, because the server wrote the same tasks.json
|
||||
cat tasks.json # the raw state the server changed, end to end
|
||||
```
|
||||
|
||||
The AI just changed real state in a real system through a tool call. Notice what you did *not*
|
||||
reach for: `git diff`. `tasks.json` is deliberately gitignored (Module 2's `.gitignore` treats it
|
||||
as generated runtime state, not source), so `git diff` stays empty here, and that's correct, not a
|
||||
bug. The proof the task list changed is the live state (`python cli.py list` / `cat tasks.json`),
|
||||
bug. The proof the task list changed is the live state (`python3 cli.py list` / `cat tasks.json`),
|
||||
not version control; runtime data the app owns is exactly the kind of thing you keep *out* of
|
||||
history. No copy-paste, no script you ran by hand, no pasting `tasks.json` into a chat. That's
|
||||
"hands."
@@ -483,7 +483,7 @@ The caveats, and one of them is large enough that it gets its own module.
connected with `list_tasks` and `add_task` available.
- You asked the AI a question and it answered by **calling a tool** against the live system, and you
asked it to add a task and then **verified the change outside the AI** by reading the runtime state
(`python cli.py list` / `cat tasks.json`), not `git diff`, because `tasks.json` is deliberately
(`python3 cli.py list` / `cat tasks.json`), not `git diff`, because `tasks.json` is deliberately
gitignored (Module 2).
- You can explain the client/server model in one breath (*servers expose tools/resources/prompts;
the client (your agentic tool) discovers and calls them on the AI's behalf*) and why "it's a

@@ -93,7 +93,7 @@ This is the distinction to lock in, because the two are siblings and easy to con
| Analogy | The standing house rules posted on the wall | A labeled recipe card you pull out when you cook that dish |

They're complementary. The instructions file is the right home for facts true *all the time* ("tests
run with `python -m unittest`"). A skill is the right home for a procedure you run *sometimes* ("here
run with `python3 -m unittest`"). A skill is the right home for a procedure you run *sometimes* ("here
is exactly how we add a command"). Module 5 even told you this was coming: start with the always-on
file; graduate a procedure into a skill when it earns its own page.

@@ -153,7 +153,7 @@ On paper this is just "write a runbook." The AI-specific twist is what changes t

- **The AI will execute the playbook, not just read it.** A runbook for a human is a reminder; a skill
for an agent is something it *performs*. The precision pays off immediately: vague step, vague
result; imperative step ("run `python -m unittest`; do not claim success until it's green"), reliable
result; imperative step ("run `python3 -m unittest`; do not claim success until it's green"), reliable
result.
- **The AI is confidently incomplete without one.** Asked to "add a command," it'll happily stop at
the code and skip the test, the changelog, the clean commit, and sound finished doing it. The skill
@@ -228,8 +228,8 @@ seen, producing all four parts without you listing the steps.
5. Watch it perform the procedure. A correctly-followed skill will, without you saying any of it:
- add `clear()` to `tasks.py` and wire a `clear` branch into `cli.py` (logic in the right file);
- add a real test to `test_tasks.py` that asserts the list is empty afterward (not just "no crash");
- run `python -m unittest` and show it green;
- smoke-test `python cli.py clear` and show the output;
- run `python3 -m unittest` and show it green;
- smoke-test `python3 cli.py clear` and show the output;
- add a `CHANGELOG.md` line;
- stage code + test + changelog into one commit, **without** `tasks.json`.

@@ -238,8 +238,8 @@ seen, producing all four parts without you listing the steps.
6. Don't take the AI's word for it. Check against the skill's own done-criteria:

```bash
python -m unittest # green, and a clear-related test is present
python cli.py add "x" && python cli.py clear && python cli.py list # -> (no tasks yet)
python3 -m unittest # green, and a clear-related test is present
python3 cli.py add "x" && python3 cli.py clear && python3 cli.py list # -> (no tasks yet)
git show --stat HEAD # one commit: tasks.py, cli.py, test_tasks.py, CHANGELOG.md; no tasks.json
```

@@ -324,7 +324,7 @@ time:
that the example skill format stays generic (when-to-use / inputs / steps / done-criteria).
- [ ] **Dependency chain intact.** Confirm Module 20 (MCP) and Module 22 (securing servers/skills) are
still numbered as referenced, and that nothing here leans on a tool introduced after Module 20.
- [ ] **Lab still runs.** `python -m unittest` is green in `lab/tasks-app/`, and the `clear`-command
- [ ] **Lab still runs.** `python3 -m unittest` is green in `lab/tasks-app/`, and the `clear`-command
walkthrough still matches the starter files (`add`/`list`/`done`/`count`, `test_tasks.py`,
`CHANGELOG.md`).

@@ -271,14 +271,14 @@ normal question) and the attacker (you plant content the agent reads).

```bash
cd ~/ai-workflow-course/tasks-app
python cli.py add "$(cat ~/ai-workflow-course/modules/22-securing-third-party-mcp-and-skills/lab/poisoned-task.txt)"
python cli.py list
python3 cli.py add "$(cat ~/ai-workflow-course/modules/22-securing-third-party-mcp-and-skills/lab/poisoned-task.txt)"
python3 cli.py list
```

`poisoned-task.txt` contains a normal-looking task followed by an injected instruction (a fake
"system" directive telling the assistant to reveal local secrets / run a command and hide it).

2. **Be the victim.** Paste the full output of `python cli.py list` into your agent's chat (Claude
2. **Be the victim.** Paste the full output of `python3 cli.py list` into your agent's chat (Claude
Code in these examples; sub your own) and ask the thing you'd actually ask: *"Here's my task list,
summarize what's pending and tell me what to
work on first."* Watch what happens. Depending on the model, it may flag the injection, or it may
@@ -309,7 +309,7 @@ normal question) and the attacker (you plant content the agent reads).

```bash
# the "tool" the agent is allowed to call in read-only mode
python cli.py list # works
python3 cli.py list # works
# the tool it is NOT exposed (a write); in a least-privilege setup this path is simply absent
```

@@ -202,7 +202,7 @@ This lab does **not** use `tasks-app`; the entire point is a codebase you *didn'
git clone <repo-url> unfamiliar-repo
cd unfamiliar-repo
# copy modules/23-working-with-existing-codebases/lab/orient.py into this folder
python orient.py > ORIENT.md
python3 orient.py > ORIENT.md
```

2. Read `ORIENT.md` yourself first. In 30 seconds you should know the language, the likely entry

+9
-9
@@ -190,7 +190,7 @@ This is the real production loop with the forge plumbing simulated locally.

**You'll need:**

- Python 3.10+ (`python --version`).
- Python 3.10+ (`python3 --version`).
- The lab files in `~/ai-workflow-course/modules/24-assistive-agents/lab/`.
- Claude Code (`claude --version`; sub your own agent), the editor/CLI agent from Module 4.

@@ -211,7 +211,7 @@ it runs the scripts and writes the files. You verify at the gate.

```
You: In ~/ai-workflow-course/modules/24-assistive-agents/lab, run
`python reviewer.py apply ai-review.sample.json` and show me the output.
`python3 reviewer.py apply ai-review.sample.json` and show me the output.
```

Read what comes back: comments sorted by severity, a recommendation, and then the **human decision
@@ -221,7 +221,7 @@ it runs the scripts and writes the files. You verify at the gate.
the reviewer, and write its JSON review to a file:

```
You: Run `python reviewer.py prompt`, follow the rubric in that output to review the diff, and
You: Run `python3 reviewer.py prompt`, follow the rubric in that output to review the diff, and
save your review as JSON to my-review.json.
```

@@ -232,7 +232,7 @@ it runs the scripts and writes the files. You verify at the gate.
3. Have the agent render its own review through the gate:

```
You: Run `python reviewer.py apply my-review.json` and show me the result.
You: Run `python3 reviewer.py apply my-review.json` and show me the result.
```

4. **Make the human decision. This part stays yours.** Open `feature.patch` and check the agent's
@@ -249,7 +249,7 @@ A new issue just arrived: `sample-issue.md` (the `done` command crashes on an em
1. See the loop with the canned response:

```
You: Run `python triage.py apply ai-triage.sample.json` and show me the output.
You: Run `python3 triage.py apply ai-triage.sample.json` and show me the output.
```

Read the suggested labels, the route, and the **human confirm gate**. The agent applied nothing.
@@ -258,14 +258,14 @@ A new issue just arrived: `sample-issue.md` (the `done` command crashes on an em
and save its suggestion:

```
You: Run `python triage.py prompt`, follow it to triage the issue using only the committed
You: Run `python3 triage.py prompt`, follow it to triage the issue using only the committed
taxonomy, and save your JSON suggestion to my-triage.json.
```

3. Render the suggestion through the gate:

```
You: Run `python triage.py apply my-triage.json` and show me the result.
You: Run `python3 triage.py apply my-triage.json` and show me the result.
```

4. **Watch the guardrail.** The script validates every suggested label against the committed
@@ -346,8 +346,8 @@ This is expansion-zone material; the agent-tooling landscape moves fast. Re-chec
merge/close, so comment/label-only is actually grantable? Name two that do.
- [ ] Is the turnkey "AI review bot / app" framing still accurate, or has the dominant pattern shifted
(e.g. baked into the forge, or into editor agents)? Keep the description vendor-neutral.
- [ ] Confirm the lab scripts run on a current Python (`python reviewer.py apply ai-review.sample.json`
and `python triage.py apply ai-triage.sample.json`) with no dependencies.
- [ ] Confirm the lab scripts run on a current Python (`python3 reviewer.py apply ai-review.sample.json`
and `python3 triage.py apply ai-triage.sample.json`) with no dependencies.
- [ ] Re-verify the cross-references resolve to the right module numbers (9, 10, 13, 14, 15, 22, 25)
if any modules were renumbered.
- [ ] Check that nothing here pins a specific LLM vendor or a specific bot's config filename.

+4
-4
@@ -262,7 +262,7 @@ out of the agent's `git add -A`, so the change you review in Part B is clean. Th

```bash
# Simulate an agent that produces a BROKEN change, then run the gate on it:
python agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
```

The orchestrator creates and switches to its own `agent/issue-delete-command` branch first (the same
@@ -276,7 +276,7 @@ reached `main`.
### Part B: See a good change land as a PR proposal

```bash
python agent_runner.py issue-to-pr issue-delete-command.md --simulate good
python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate good
```

This time the planted change is correct. The gate passes, the script commits to the branch and prints
@@ -290,7 +290,7 @@ stops at a PR; it never merges.
### Part C: Run the self-healing loop

```bash
python agent_runner.py self-heal --simulate bad
python3 agent_runner.py self-heal --simulate bad
```

The orchestrator switches to its own `agent/self-heal` branch (again, you direct the automation, not
@@ -308,7 +308,7 @@ Two ways to go from simulation to a genuine autonomous run:

```bash
export AGENT_CMD='your-agent-cli --print --prompt-file {prompt_file}' # your tool's one-shot mode
python agent_runner.py issue-to-pr issue-delete-command.md
python3 agent_runner.py issue-to-pr issue-delete-command.md
```

The script builds the prompt from the issue **and** your committed config (Module 5), runs your

@@ -407,7 +407,7 @@ thing you're waiting on.

```bash
cd ~/ai-workflow-course/tasks-app
python cli.py list && python cli.py count && python cli.py clear # all three features live
python3 cli.py list && python3 cli.py count && python3 cli.py clear # all three features live
```

If any of those three commands fails, the resolution was wrong. That's why you verify the result

+6
-6
@@ -251,7 +251,7 @@ output.

```bash
cd modules/27-evals/lab
python run_eval.py candidates/current_model
python3 run_eval.py candidates/current_model
echo "exit code: $?"
```

@@ -264,7 +264,7 @@ output.
2. Now simulate the swap: run the *exact same eval set* against the other candidate:

```bash
python run_eval.py candidates/swapped_model
python3 run_eval.py candidates/swapped_model
echo "exit code: $?"
```

@@ -281,7 +281,7 @@ output.
yourself and read the scorecard:

```bash
python run_eval.py candidates/my_run_1
python3 run_eval.py candidates/my_run_1
```

4. Now actually swap something. Either change the model Claude Code uses, or change the *prompt* (ask
@@ -315,10 +315,10 @@ output.
`working-directory:` line makes the CI job `cd` into the lab folder first, so the
`candidates/...` path and `run_eval.py`'s own `from eval_set import CASES` resolve exactly as they
did on your machine. (Drop it and point a repo-root job straight at
`python modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
`python3 modules/27-evals/lab/run_eval.py candidates/current_model`, and `candidates/`
won't exist from the repo root: the gate crashes with a *false* failure, which is worse than no
gate. If the agent prefers a single line, it can spell both paths out from the repo root:
`python modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
`python3 modules/27-evals/lab/run_eval.py modules/27-evals/lab/candidates/current_model
--threshold 1.0`.)

Below threshold exits non-zero and the pipeline blocks, exactly like a failing test. The guardrail
@@ -394,7 +394,7 @@ This is an expansion-zone module over fast-moving ground. Re-check at build/publ
- [ ] **Module cross-references.** Confirm Modules 13, 14, 10, and 24–26 still carry the
responsibilities referenced here (tests, CI gating, review, the agent autonomy ladder) and that
none were renumbered.
- [ ] **Lab still runs.** `python run_eval.py candidates/current_model` exits 0 at 100%, and
- [ ] **Lab still runs.** `python3 run_eval.py candidates/current_model` exits 0 at 100%, and
`candidates/swapped_model` exits 1 below threshold, on a current Python 3.x.
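
The exit-code contract this checklist item relies on can be sketched in a few lines. This is not the lab's `run_eval.py`: `pass_rate`, `gate_exit_code`, and the boolean result lists are hypothetical names chosen for illustration. The only behavior taken from the text is that a pass rate at or above the threshold exits 0 and anything below it exits 1.

```python
def pass_rate(results):
    """Fraction of eval cases that passed; an empty run counts as 0.0."""
    return sum(results) / len(results) if results else 0.0


def gate_exit_code(results, threshold=1.0):
    """Return 0 when the pass rate meets the threshold, else 1 (a blocking failure)."""
    return 0 if pass_rate(results) >= threshold else 1
```

A command-line wrapper would hand this value to `sys.exit`, which is what lets a CI job treat a low score the same way it treats a failing test.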

+5
-5
@@ -53,7 +53,7 @@ already standing; it doesn't re-pour the foundation.
Pick something small enough to finish in one sitting and real enough to touch the whole stack. We'll
add **due dates**:

- A task can carry an optional due date: `python cli.py add "file taxes" --due <YYYY-MM-DD>`.
- A task can carry an optional due date: `python3 cli.py add "file taxes" --due <YYYY-MM-DD>`.
- A new `overdue` command lists pending tasks whose due date has already passed.
- The deployed service grows a matching `GET /overdue` endpoint, so the change is visible in the
running container, not just the CLI.
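
One way to picture the `overdue` rule before building it is a minimal sketch, assuming each task is a dict with hypothetical `title`, `done`, and optional `due` keys, where `due` is an ISO `YYYY-MM-DD` string. The real `tasks.py` schema may differ, and the `overdue` function below is illustrative only. It also makes one design choice explicit: a task due today has not "already passed," so the comparison is strict.

```python
from datetime import date


def overdue(tasks, today=None):
    """Return pending tasks whose due date is strictly before today."""
    today = today or date.today()
    return [
        t for t in tasks
        if not t.get("done", False)                 # completed tasks are never overdue
        and t.get("due")                            # tasks without a due date are skipped
        and date.fromisoformat(t["due"]) < today    # strict: due today is not overdue yet
    ]
```

Passing `today` explicitly keeps the rule testable with fixed dates, so a test does not go stale as the calendar moves.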
@@ -190,9 +190,9 @@ agent), your forge account, and a working Docker install.
in the future, one safely in the past) so the assertion below holds whenever you run this:

```bash
python cli.py add "file taxes" --due <a date a few months out> # future → NOT overdue
python cli.py add "renew domain" --due 2020-01-01 # past → overdue
python cli.py overdue # should list "renew domain", not "file taxes"
python3 cli.py add "file taxes" --due <a date a few months out> # future → NOT overdue
python3 cli.py add "renew domain" --due 2020-01-01 # past → overdue
python3 cli.py overdue # should list "renew domain", not "file taxes"
```

> *Verify-before-publish: refresh the example due dates so the "future" one is still in the future
@@ -205,7 +205,7 @@ agent), your forge account, and a working Docker install.
them by name. Confirm the suite is green:

```bash
pytest # or: python -m unittest
pytest # or: python3 -m unittest
```

Once it's green, tell the AI to commit the change. Then verify what it actually staged and wrote: