This commit was merged in pull request #105.
@@ -52,7 +52,7 @@ Here's what to get in place. You'll use all of it for the rest of the course.

 **A code editor.** Any will do, but a graphical one like VS Code is the easiest starting point; later modules build on editor-integrated AI tools, and VS Code is the path of least resistance there.

-**Python 3.10 or newer.** Check with `python --version` or `python3 --version`. Whichever one prints a 3.10+ version is the command you'll use everywhere from here on. (On current macOS and default Ubuntu, it's usually `python3`; if `python` says "command not found," just read every `python` in the labs as `python3`.)
+**Python 3.10 or newer.** The labs are written with `python3`, the command current macOS and default Ubuntu actually ship (they install Python only as `python3`, with no bare `python` on PATH). Check with `python3 --version`; if it prints a 3.10+ version, use `python3` everywhere from here on. If `python3` says "command not found" but `python --version` shows 3.10+ (older or some Windows setups), just read every `python3` in the labs as `python` instead.

 **Your usual AI chat assistant,** open in a browser tab. Any of them. Remember: model-agnostic.

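The 3.10+ requirement in the hunk above can also be checked from inside Python itself. This is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and not part of the course labs.

```python
import sys

MINIMUM = (3, 10)  # the floor stated in the setup notes: Python 3.10 or newer


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor; patch releases don't matter here.
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "ok" if meets_minimum() else "too old for the labs"
    print(f"Python {found}: {verdict}")
```

Saved under any filename and run with whichever command your machine has (`python3` or `python`), it reports on the interpreter that command actually launches.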
@@ -120,7 +120,7 @@ git commit -m "Add count command"
 git status # shows tasks.py as modified
 git restore tasks.py # discard the change, back to your last commit, byte for byte
 git diff # empty. nothing changed. you're clean.
-python cli.py list # works again
+python3 cli.py list # works again
 ```

 That's it. You just recovered from a bad AI change in one command, with zero retyping and zero guesswork. Sit with how *cheap* that was for a second; that cheapness is the thing that lets you say yes to riskier AI work for the rest of the course.

@@ -10,7 +10,7 @@ Tags: AI, developer workflow, version control, configuration, AGEN

 # Commit the AI's Config, Not Just the Code

-I used to start every AI coding session the same way: by giving the same little speech. "We use four-space indent. Run the tests with `python -m unittest` before you tell me it works. The logic goes in `tasks.py`, not crammed into the CLI file. And whatever you do, don't hand-edit `tasks.json`; it's generated."
+I used to start every AI coding session the same way: by giving the same little speech. "We use four-space indent. Run the tests with `python3 -m unittest` before you tell me it works. The logic goes in `tasks.py`, not crammed into the CLI file. And whatever you do, don't hand-edit `tasks.json`; it's generated."

 The AI would nod (figuratively), do exactly that, and we'd have a great session. Then I'd close the tab. The next morning I'd open a fresh one, and the AI had forgotten every word of it. So I'd give the speech again. And again. I was a broken record reading my own project back to a goldfish.

@@ -27,7 +27,7 @@ Different vendors look for different filenames, and honestly, the names keep cha
 So what goes in it? Not a prompt, and not your README. This is a briefing for an agent that's about to edit your code. Keep it to things that actually change the AI's behavior:

 - **Project conventions**: the layout and patterns this codebase actually uses. *"Core logic lives in `tasks.py`; the CLI front end is `cli.py`; state persists to `tasks.json`."*
-- **Build and test commands**: the exact, copy-pasteable commands. *"Run tests with `python -m unittest`. Don't claim a change works until they pass."* That one line stops the AI from inventing a test runner you don't use.
+- **Build and test commands**: the exact, copy-pasteable commands. *"Run tests with `python3 -m unittest`. Don't claim a change works until they pass."* That one line stops the AI from inventing a test runner you don't use.
 - **Coding standards**: *"Standard library only, no third-party packages. Type-hint public functions."*
 - **The don't-touch list**: generated files, vendored code, secrets. *"Never edit `tasks.json` by hand; it's generated."*
 - **House style**: the taste calls that otherwise come back wrong every time. *"Keep functions small. Don't reformat files you aren't changing."*

@@ -79,9 +79,9 @@ Let it edit `tasks.py` and `cli.py` freely. This is a multi-file change: exactly

 ```bash
 git diff # read what it actually changed
-python cli.py add "ship module 6" --priority high
-python cli.py add "water plants" --priority low
-python cli.py list # see if priorities work and sort
+python3 cli.py add "ship module 6" --priority high
+python3 cli.py add "water plants" --priority low
+python3 cli.py list # see if priorities work and sort
 git add .
 git commit -m "Add task priorities (experiment)"
 ```
@@ -90,7 +90,7 @@ The payoff: prove the isolation. Switch back to `main` and watch the whole featu

 ```bash
 git switch main
-python cli.py list # no priorities: main is exactly as you left it
+python3 cli.py list # no priorities: main is exactly as you left it
 ```

 Sit with that for a second. Your bold change exists *only* on the branch. `main` never saw it. That's the entire point of the module in two commands.
@@ -103,7 +103,7 @@ Sit with that for a second. Your bold change exists *only* on the branch. `main`
 git switch main
 git merge experiment/priorities # likely a fast-forward: main slides up to the branch
 git log --oneline --graph # straight line = fast-forward
-python cli.py list # the feature is now on main
+python3 cli.py list # the feature is now on main
 git branch -d experiment/priorities # branch did its job; -d is the safe delete
 ```

@@ -127,9 +127,9 @@ Most merges just work; Git is genuinely good at combining changes that touch *di

 ```python
 <<<<<<< HEAD
-print("usage: python cli.py [add <title> | list | done <index> | purge]")
+print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
 =======
-print("usage: python cli.py [add <title> | list | done <index> | stats]")
+print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
 >>>>>>> feature/stats
 ```

@@ -149,8 +149,8 @@ It resolves silently and the merge lands. And here is the only part that's still

 ```bash
 git diff HEAD~1 # what the merge actually changed; confirm no markers, both commands present
-python cli.py # run it: see the merged usage string
-python cli.py stats && python cli.py purge # both actually work
+python3 cli.py # run it: see the merged usage string
+python3 cli.py stats && python3 cli.py purge # both actually work
 ```

 That `git diff` after *every* merge is the whole skill now. Not "edit the markers by hand," which the AI did for you before you could blink, but "know a conflict can happen and check the silent resolution," because a resolution that runs cleanly can still be wrong and it won't leave an error behind to warn you. (And if your AI's edits didn't happen to collide (they're nondeterministic), the course ships a little `make-conflict.sh` helper that manufactures one deterministically so you can still see the markers at least once.)

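The "confirm no markers" half of that post-merge check is mechanical enough to sketch in Python. The snippet below is an illustration, not a course tool: `find_conflict_markers` is a hypothetical helper, and the `resolved` string is an assumed merged usage line, not the course's actual one.

```python
import re

# Git's default conflict markers: seven repeated characters at the start of a
# line, followed by a space or the end of the line.
MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKER.match(line)
    ]


conflicted = """<<<<<<< HEAD
print("usage: python3 cli.py [add <title> | list | done <index> | purge]")
=======
print("usage: python3 cli.py [add <title> | list | done <index> | stats]")
>>>>>>> feature/stats
"""

# Hypothetical resolution that keeps both commands.
resolved = 'print("usage: python3 cli.py [add <title> | list | done <index> | purge | stats]")\n'

print(find_conflict_markers(conflicted))  # three marker lines
print(find_conflict_markers(resolved))    # none
```

A scan like this only catches leftover markers. It says nothing about whether the resolution kept both commands, so reading the diff and running both commands still matters.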
@@ -119,8 +119,8 @@ git worktree list
 Then you point one editor/AI session at `tasks-app-wipe` and a second at `tasks-app-remaining`, and let both work at the same time. While they run, you can prove the isolation from a third terminal:

 ```bash
-cd ~/ai-workflow-course/tasks-app-wipe && python cli.py add "from worktree A" && python cli.py list
-cd ~/ai-workflow-course/tasks-app-remaining && python cli.py add "from worktree B" && python cli.py list
+cd ~/ai-workflow-course/tasks-app-wipe && python3 cli.py add "from worktree A" && python3 cli.py list
+cd ~/ai-workflow-course/tasks-app-remaining && python3 cli.py add "from worktree B" && python3 cli.py list
 ```

 Each `list` shows only its own task. Worktree A never sees "from worktree B." Each worktree even has its own `tasks.json` runtime state: separate files, separate state, while both agents work. Total isolation. When they're done, each commit lands on its own branch, and bringing both home is trivial because it's all already in one repo:

@@ -53,7 +53,7 @@ Nobody (human or agent) can do anything with that without coming back to ask you

 > **Title:** `done` command crashes on an out-of-range or non-integer index
 >
-> **Context:** `python cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and dumps a traceback. `python cli.py done abc` raises `ValueError`. Either way the user sees a stack trace instead of a helpful message.
+> **Context:** `python3 cli.py done 99` on a list with 3 tasks raises an uncaught `IndexError` and dumps a traceback. `python3 cli.py done abc` raises `ValueError`. Either way the user sees a stack trace instead of a helpful message.
 >
 > **Acceptance criteria:**
 > - `done <index>` with an out-of-range index prints a clear error (e.g. `no task at index 99`) and exits non-zero.
@@ -109,7 +109,7 @@ The reframe: writing a clear issue used to be a courtesy to your teammates. Now

 The lab is deliberately low-stakes: you're writing issues, not code, so your AI assistant can stay in a browser tab. Against the `tasks-app` repo you pushed to a forge:

-1. **Find three real pieces of work.** A bug (`python cli.py done 99` and `done abc` both crash (run them and watch)), a small patterned feature (`delete <index>`, mirroring `done`), and a judgment-heavy one (task priorities).
+1. **Find three real pieces of work.** A bug (`python3 cli.py done 99` and `done abc` both crash (run them and watch)), a small patterned feature (`delete <index>`, mirroring `done`), and a judgment-heavy one (task priorities).
 2. **Draft all three as well-formed issues:** title, context with repro steps, acceptance criteria, out-of-scope. This is a great place to *use* the AI: paste a file, ask it to draft acceptance criteria, then **edit them down.** The model over-produces; tightening its draft is exactly the skill.
 3. **Create, label, and route them.** Assign the priorities feature to a human (it has open design questions). Earmark the bug and the `delete` feature for an agent: actual agent assignee, an `agent-ready` label, or just a note saying "suitable for an issue-to-PR agent." The mechanism doesn't matter yet; the *decision* does.
 4. **Write one sentence per issue explaining why it went where it went**, in terms of the issue's clarity, not the model's smarts. That sentence *is* the routing skill.

@@ -55,7 +55,7 @@ And here's the part people resist: this holds **even when you're the only human
 Talk is cheap, so here's the lab the course runs, compressed. You've got a tiny `tasks-app`, a command-line to-do list. In the base version, `complete()` validates the index, so `done 99` on a list with three tasks gives you a clean, loud error and a non-zero exit code:

 ```bash
-python cli.py done 99 # prints "error: no task at index 99", exits non-zero
+python3 cli.py done 99 # prints "error: no task at index 99", exits non-zero
 echo "exit code: $?"
 ```

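For readers without the repo open, here is roughly what "`complete()` validates the index" could look like. This is a sketch under assumptions, not the course's actual `tasks.py`: the task shape (a dict with `title` and `done`), the zero-based index, and the `run_done` wrapper are all hypothetical.

```python
def complete(tasks, index):
    """Mark tasks[index] as done; raise IndexError if there is no such task."""
    # Explicit bounds check: without it, a negative index would silently
    # complete a task counted from the end of the list.
    if not 0 <= index < len(tasks):
        raise IndexError(f"no task at index {index}")
    tasks[index]["done"] = True
    return tasks[index]


def run_done(tasks, raw_index):
    """Handle `done <index>`; return the process exit code (0 ok, 1 error)."""
    try:
        task = complete(tasks, int(raw_index))
    except ValueError:
        # int() rejected the argument, e.g. `done abc`.
        print(f"error: index must be an integer, got {raw_index!r}")
        return 1
    except IndexError as exc:
        print(f"error: {exc}")
        return 1
    print(f"done: {task['title']}")
    return 0
```

A CLI entry point would pass the return value to `sys.exit()`, which is what lets `echo "exit code: $?"` show a non-zero value on failure.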
@@ -74,7 +74,7 @@ The diff adds a `delete` command. It works: try `delete 0`, the task goes away,
 But run the *failure* path, not the happy one:

 ```bash
-python cli.py done 99 # the trap
+python3 cli.py done 99 # the trap
 echo "exit code: $?"
 ```

@@ -33,7 +33,7 @@ That "one done" case is the one where a correct implementation and a buggy one g

 A test file sitting in your repo is useful right up until you forget to run it, which, like every manual check, you eventually will. Continuous Integration removes the "eventually." It's a grand name for a mundane core: **the same checks you'd run by hand (lint, build, test) bound to a trigger, on a clean machine you don't control, on every single push.**

-The magic is entirely in *automatically*. You don't run CI; pushing runs it. It can't be skipped by forgetting, it doesn't get tired on the fortieth push of the day, and its whole enforcement mechanism is the humble exit code: `python -m unittest` returns non-zero when a test fails, and one non-zero turns the run red. The actual config is shorter than this paragraph:
+The magic is entirely in *automatically*. You don't run CI; pushing runs it. It can't be skipped by forgetting, it doesn't get tired on the fortieth push of the day, and its whole enforcement mechanism is the humble exit code: `python3 -m unittest` returns non-zero when a test fails, and one non-zero turns the run red. The actual config is shorter than this paragraph:

 ```yaml
 name: CI

@@ -40,9 +40,9 @@ The lab makes this concrete and local: no hosted bot account required. You run a

 ```bash
 cd modules/24-assistive-agents/lab
-python reviewer.py prompt # builds: your committed rubric + the diff
+python3 reviewer.py prompt # builds: your committed rubric + the diff
 # (paste into your AI, save its JSON to my-review.json)
-python reviewer.py apply my-review.json
+python3 reviewer.py apply my-review.json
 ```

 The diff it's reviewing has a real trap planted in it: a new `clear` command that prints "cleared all tasks" but never actually calls `save()`, so `tasks.json` is untouched. Did your AI catch it? Either way, *you* make the merge call, and you learn exactly how much this reviewer is worth before the stakes go up.
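To see why that trap is easy to miss, here is a self-contained sketch of the same shape of bug. None of it is the lab's code: `save`, `load`, `clear_buggy`, `clear_fixed`, and `tasks_left_after` are hypothetical stand-ins, and storing tasks as a JSON list is an assumption.

```python
import json
import tempfile
from pathlib import Path


def save(path, tasks):
    """Persist the task list to disk as JSON."""
    Path(path).write_text(json.dumps(tasks))


def load(path):
    """Read the task list back from disk."""
    return json.loads(Path(path).read_text())


def clear_buggy(path):
    """Reports success, but only empties the in-memory copy."""
    tasks = load(path)
    tasks.clear()
    print("cleared all tasks")  # the file on disk was never rewritten


def clear_fixed(path):
    """Same command with the missing save() call added."""
    tasks = load(path)
    tasks.clear()
    save(path, tasks)
    print("cleared all tasks")


def tasks_left_after(command):
    """Run `command` on a throwaway tasks.json and count what is still on disk."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "tasks.json"
        save(path, [{"title": "water plants", "done": False}])
        command(path)
        return len(load(path))
```

Both versions print the same message and exit cleanly, so only a check on the persisted state (or a reviewer who notices the missing call) tells them apart.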
@@ -70,7 +70,7 @@ The lab runs the whole thing locally against the `tasks-app`, and the best part

 ```bash
 git checkout -b agent/delete-command
-python agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
+python3 agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
 # → ruff + pytest run, a test fails, the script refuses to call the work ready.
 # Exit code non-zero. No PR. Nothing reached main.
 ```
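The gate in those comments boils down to "run every check, refuse if any exit code is non-zero." A minimal sketch of that idea follows; it is not the lab's `agent_runner.py`, the names `run_checks` and `gate` are hypothetical, and the two stand-in commands replace the real lint and test runs so the example works without ruff or pytest installed.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, argv) check in order; return the names that failed."""
    failed = []
    for name, argv in checks:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def gate(checks):
    """Return the exit code a runner would use: 0 only if every check passed."""
    failed = run_checks(checks)
    if failed:
        print("not ready, failed checks: " + ", ".join(failed))
        return 1
    print("all checks passed")
    return 0


# Stand-ins for real commands: child Python processes that exit 0 or 1.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]

if __name__ == "__main__":
    sys.exit(gate([("lint", PASSING), ("tests", FAILING)]))
```

The design choice worth noticing is that the runner never inspects the agent's own claim of success; it only trusts exit codes from checks it ran itself.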
@@ -124,8 +124,8 @@ The lab is the punchline of the whole series. You run the same eval set against

 ```bash
 cd modules/27-evals/lab
-python run_eval.py candidates/current_model # 100%, exit 0, your baseline
-python run_eval.py candidates/swapped_model # 60%, exit 1, blocked
+python3 run_eval.py candidates/current_model # 100%, exit 0, your baseline
+python3 run_eval.py candidates/swapped_model # 60%, exit 1, blocked
 ```

 The "swapped model" is a stand-in for the day a cheaper model ships, or your provider deprecates the one you're on, or someone edits the agent's prompt. The easy cases still pass (this output would sail through a casual skim), but the eval caught a regression a skim would have missed, *and the non-zero exit code means a pipeline would have blocked the merge.* That's a **regression eval**, and it's the moment this course's thesis stops being a slogan and becomes a procedure you run from the keyboard.

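The pass-rate-to-exit-code step is small enough to sketch. This is an illustration of the idea, not the lab's `run_eval.py`: the function names, the five-case result lists, and the 100% threshold are assumptions chosen to reproduce the 100% and 60% figures above.

```python
def pass_rate(results):
    """Fraction of eval cases that passed; `results` is a list of booleans."""
    if not results:
        raise ValueError("no eval cases were run")
    return sum(results) / len(results)


def eval_exit_code(results, threshold=1.0):
    """Return 0 if the pass rate meets the threshold, else 1 (blocks a pipeline)."""
    rate = pass_rate(results)
    print(f"pass rate: {rate:.0%}")
    return 0 if rate >= threshold else 1


# Hypothetical per-case outcomes for the two candidates.
current_model = [True, True, True, True, True]    # 5 of 5 cases pass
swapped_model = [True, True, True, False, False]  # 3 of 5 cases pass

print(eval_exit_code(current_model))  # pass rate: 100%, then 0
print(eval_exit_code(swapped_model))  # pass rate: 60%, then 1
```

A real script would hand that return value to `sys.exit()`, so the CI run goes red exactly when the candidate falls below the baseline.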
@@ -22,7 +22,7 @@ If you've been following the series here on the blog, this is the part where the

 Here's the trick that makes a capstone honest: pick something *small* enough to finish in one sitting but *real* enough to touch the whole stack. We're adding due dates to the running `tasks-app`:

-- A task can carry an optional due date: `python cli.py add "file taxes" --due 2026-09-15`.
+- A task can carry an optional due date: `python3 cli.py add "file taxes" --due 2026-09-15`.
 - A new `overdue` command lists pending tasks whose due date has already passed.
 - The deployed service grows a matching `GET /overdue` endpoint, so the change is visible in the *running container*, not just the CLI.

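The core of the `overdue` rule from that list fits in a few lines of Python. This is a sketch under assumptions, not the capstone's implementation: the task shape (dicts with `title`, an optional ISO-format `due`, and `done`) is hypothetical, as is treating a task due today as not yet overdue.

```python
from datetime import date


def overdue(tasks, today=None):
    """Return pending tasks whose due date is strictly before `today`."""
    today = today or date.today()
    return [
        task
        for task in tasks
        if not task.get("done")  # only pending tasks count
        and task.get("due")  # the due date is optional
        and date.fromisoformat(task["due"]) < today
    ]


tasks = [
    {"title": "file taxes", "due": "2026-09-15", "done": False},
    {"title": "renew passport", "due": "2026-09-01", "done": True},
    {"title": "water plants", "done": False},
]

for task in overdue(tasks, today=date(2026, 9, 16)):
    print(task["title"])  # file taxes
```

Passing `today` in explicitly keeps the function testable. A CLI command and a `GET /overdue` handler could both call it with the default.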