Reframe Module 9 worked-examples off already-built features (#40) #70

Merged
claude merged 1 commit from fix/p2-module-9-worked-examples into main 2026-06-22 17:51:58 -04:00
2 changed files with 46 additions and 35 deletions
+19 -14
@@ -165,16 +165,17 @@ for both.
 So how do you decide? A useful heuristic, which is really a property of the *issue*, not the model:
 **Hand it to an agent when the issue is well-scoped, has concrete acceptance criteria, and follows
-a pattern already in the codebase.** The `delete <index>` command is a strong candidate — it mirrors
-the existing `done` command almost exactly, "done" is unambiguous, and a human can verify the result
-in seconds. The bug above is another: contained, reproducible, testable.
+a pattern already in the codebase.** An `undone <index>` command — the inverse of `done` — is a
+strong candidate: it mirrors the existing command almost exactly, "clear the done flag" is
+unambiguous, and a human can verify the result in seconds. The bug above is another: contained,
+reproducible, testable.
 **Keep it with a human when the issue carries genuine ambiguity, design judgment, or cross-cutting
-risk.** "Add task priorities" sounds small but isn't: how many levels? Does the list re-sort? How are
-priorities displayed and stored? Those are product decisions an agent will *answer confidently and
-probably wrongly*, because nothing in the issue tells it the right call. A human resolves the
-ambiguity first (often by splitting it into clear sub-issues — at which point the pieces may become
-agent-ready).
+risk.** "Add due dates" sounds small but isn't: what date format does the user type? Does the list
+re-sort by date? How are overdue tasks shown, and in whose timezone? Those are product decisions an
+agent will *answer confidently and probably wrongly*, because nothing in the issue tells it the
+right call. A human resolves the ambiguity first (often by splitting it into clear sub-issues — at
+which point the pieces may become agent-ready).
 Notice the heuristic doesn't ask how smart the model is. It asks how well-specified the *work* is.
 A vague issue degrades gracefully with a human — they ask you a question — and catastrophically with
@@ -244,13 +245,17 @@ from whichever forge's web form you happen to be filling in.
 ### Part A — Find the work
-Look at the `tasks-app` and find three real pieces of work. You already know two from earlier
-modules; the app is deliberately thin, so there's plenty. Good candidates:
+Look at the `tasks-app` and find three real pieces of work. The app is deliberately thin, so there's
+plenty it still can't do. Because it's carried forward across modules, skip anything you may have
+already built (a `delete` command, task priorities) and pick work that's genuinely still missing.
+Good candidates:
 1. **A bug** — `python cli.py done 99` (an out-of-range index) and `python cli.py done abc` (a
 non-integer) both crash with an uncaught traceback. Run them and watch.
-2. **A small, patterned feature** — a `delete <index>` command, mirroring the existing `done` command.
-3. **A judgment-heavy feature** — task priorities (levels? sorting? display? storage?).
+2. **A small, patterned feature** — an `undone <index>` command that clears a task's done flag,
+mirroring the existing `done` command (it's the inverse).
+3. **A judgment-heavy feature** — due dates on tasks (date format? sorting? overdue display?
+storage?).
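One possible shape for the fix behind candidate 1, as a minimal sketch rather than the app's actual code: `parse_index` and `run_done` are hypothetical names, and 1-based indexing is an assumption, since the diff does not show how `cli.py` numbers tasks. The point is only that both bad inputs end in a clear message and a non-zero exit code instead of a traceback.

```python
import sys


def parse_index(raw: str, task_count: int) -> int:
    """Turn a CLI argument into a valid task index (assumed 1-based)."""
    try:
        index = int(raw)
    except ValueError:
        # Covers inputs such as "abc".
        raise ValueError(f"index must be an integer, got {raw!r}") from None
    if not 1 <= index <= task_count:
        # Covers inputs such as "99" when only a few tasks exist.
        raise ValueError(f"index {index} is out of range (1-{task_count})")
    return index


def run_done(raw: str, task_count: int) -> int:
    """Return a process exit code instead of letting a traceback escape."""
    try:
        index = parse_index(raw, task_count)
    except ValueError as err:
        print(f"error: {err}", file=sys.stderr)
        return 1
    print(f"would mark task {index} as done")
    return 0
```

A real entry point would pass the return value to `sys.exit`, so a shell or CI step sees the failure.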
 ### Part B — Draft three well-formed issues
@@ -269,9 +274,9 @@ On your forge:
 2. Apply a small label set to each: a **type** (`bug`/`feature`), a **priority**, and — for the ones
 that qualify — a **`ready`** label meaning the acceptance criteria are solid enough to start.
 3. **Route them.** This is the module's core exercise:
-- Assign the **judgment-heavy feature (priorities) to a human** — yourself. It has unresolved
+- Assign the **judgment-heavy feature (due dates) to a human** — yourself. It has unresolved
 design questions; it is not agent-ready as written.
-- Earmark the **bug** and the **`delete` feature for an agent.** They're well-scoped, patterned,
+- Earmark the **bug** and the **`undone` feature for an agent.** They're well-scoped, patterned,
 and easy to verify. Use whatever your forge offers: an actual agent assignee, an `agent-ready`
 label, or just a note in the issue saying "suitable for an issue-to-PR agent (Module 25)." The
 mechanism doesn't matter yet; the *decision* does.
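The routing decision can be written down as a small check on the issue itself. This is an illustrative sketch only: the `Issue` fields and the `route` function are hypothetical and not part of `tasks-app` or any forge API. It shows that every input describes the issue and none describes the model.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical summary of an issue; every field describes the work."""
    title: str
    well_scoped: bool
    has_acceptance_criteria: bool
    follows_existing_pattern: bool
    open_design_questions: int = 0


def route(issue: Issue) -> str:
    """Return "agent" only when nothing is left for the assignee to decide."""
    agent_ready = (
        issue.well_scoped
        and issue.has_acceptance_criteria
        and issue.follows_existing_pattern
        and issue.open_design_questions == 0
    )
    return "agent" if agent_ready else "human"


bug = Issue("done crashes on a bad index", True, True, True)
undone = Issue("Add an undone <index> command", True, True, True)
# The open-question count is illustrative, not taken from the issue text.
due_dates = Issue("Support due dates on tasks", False, False, False,
                  open_design_questions=5)

for issue in (bug, undone, due_dates):
    print(f"{issue.title}: {route(issue)}")
```

Splitting the due-dates issue into resolved sub-issues would flip those flags, which is the point at which the pieces may become agent-ready.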
@@ -6,6 +6,10 @@
 context (with repro for the bug), concrete acceptance criteria, and a stated scope.
 Note how the routing call is a property of the ISSUE (clear vs. ambiguous), not the model.
+Because the tasks-app carries forward across modules, some commands you might reach for (a
+`delete` command, task priorities) may already exist from earlier labs. These examples
+deliberately target work the app does NOT have yet, so each reads as a genuine open issue.
 -->
 # Issue 1 — bug — route to AGENT
@@ -47,54 +51,56 @@ Changing how tasks are stored, numbered, or displayed.
 # Issue 2 — feature — route to AGENT
-# Title: Add a `delete <index>` command to remove a task
+# Title: Add an `undone <index>` command to mark a completed task as not done
 ## Context / problem
-There's no way to remove a task once added — only `add`, `list`, and `done`. Users accumulate stale
-tasks with no way to clear them. The command should mirror the existing `done <index>` command,
-which already takes an index and mutates the list.
+You can mark a task `done`, but there's no way to undo it — flag the wrong index by mistake and the
+only "fix" is to delete the task and re-add it. The command should mirror the existing `done <index>`
+command, which already takes an index and flips a task's state; this is simply its inverse.
 ## Acceptance criteria
-- [ ] `python cli.py delete <index>` removes the task at that index and saves.
-- [ ] `delete` with an out-of-range or non-integer index prints a clear error and exits non-zero
+- [ ] `python cli.py undone <index>` clears the done flag on the task at that index and saves.
+- [ ] `undone` with an out-of-range or non-integer index prints a clear error and exits non-zero
 (same behavior as the fixed `done`, see Issue 1).
-- [ ] `list` after a delete shows the remaining tasks, re-indexed.
-- [ ] Usage text mentions the new `delete` command.
+- [ ] `list` after `undone` shows that task as not done (`[ ]`).
+- [ ] Usage text mentions the new `undone` command.
 ## Out of scope
-Bulk delete / `clear all` (separate issue if wanted). Changing the storage format.
+A general multi-step undo / command history (separate concern). Changing the storage format.
 ## Proposed approach (optional)
-Add a `remove(index)` method on `TaskList` in `tasks.py` and wire a `delete` branch in `cli.py`,
-parallel to the existing `done` handling.
+Add a `reopen(index)` method on `TaskList` in `tasks.py` — the inverse of the existing `complete` —
+and wire an `undone` branch in `cli.py`, parallel to the existing `done` handling.
 ---
 - **Type:** feature
 - **Priority:** med
 - **Ready:** yes
-- **Route to:** agent — well-scoped and patterned directly on existing code; low ambiguity, easy to
-verify.
+- **Route to:** agent — well-scoped and patterned directly on existing code (the inverse of `done`);
+low ambiguity, easy to verify.
 # Issue 3 — feature — route to HUMAN
-# Title: Support task priorities
+# Title: Support due dates on tasks
 ## Context / problem
-Users want to mark some tasks as more important so the list reflects what to do first. Today every
-task is equal. This is desirable but underspecified — several product decisions have to be made
-before any code is written.
+Users want to attach a due date to a task so the list can reflect what's coming up, not just what
+exists. Today a task is only a title and a done flag. This is desirable but underspecified — several
+product decisions have to be made before any code is written.
 Open questions (resolve before this is `ready`):
-- How many priority levels? (high/med/low, or a numeric scale?)
-- Does `list` re-sort by priority, or just display it inline?
-- How is a priority set — at `add` time (a flag?) or with a separate command?
-- How is it stored, and what's the default for existing tasks?
+- What date format does the user type, and how forgiving is parsing? (ISO `2026-06-30` only, or
+relative like `tomorrow` / `friday`?)
+- Does `list` re-sort by due date, group by it, or just display it inline?
+- How is a due date set — at `add` time (a flag?) or with a separate command? Can it be cleared?
+- How are overdue tasks surfaced — highlighted, flagged, sorted to the top — and in whose timezone?
+- How is it stored, and what's the default for the existing tasks that have none?
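Two of these open questions are easy to demonstrate in a few lines. The sketch below is not a proposed implementation. `parse_due_strict` and `is_overdue` are hypothetical helpers, and "overdue means the local calendar date is past the due date" is itself one of the decisions the issue leaves open. It shows that an ISO-only parser rejects `tomorrow`, and that the same instant is overdue in one timezone and not in another.

```python
from datetime import date, datetime, timedelta, timezone


def parse_due_strict(raw: str) -> date:
    """ISO-only parsing: accepts 2026-06-30, rejects words like 'tomorrow'."""
    return date.fromisoformat(raw)


def is_overdue(due: date, now_utc: datetime, utc_offset_hours: int) -> bool:
    """Assumed rule: overdue once the user's local calendar date passes `due`."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_today = now_utc.astimezone(local_zone).date()
    return local_today > due


due = parse_due_strict("2026-06-30")

# 02:00 UTC on 1 July 2026 is still 22:00 on 30 June at UTC-4.
now = datetime(2026, 7, 1, 2, 0, tzinfo=timezone.utc)
overdue_in_utc = is_overdue(due, now, utc_offset_hours=0)
overdue_in_utc_minus_4 = is_overdue(due, now, utc_offset_hours=-4)

try:
    parse_due_strict("tomorrow")
    relative_accepted = True
except ValueError:
    relative_accepted = False

print(overdue_in_utc, overdue_in_utc_minus_4, relative_accepted)
```

Each differing result corresponds to a product decision that nothing in the issue makes, which is why the issue is routed to a human first.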
 ## Acceptance criteria