Build out all 27 modules + capstone (#1)
Co-authored-by: claude <claude@jpaul.io> Co-committed-by: claude <claude@jpaul.io>
This commit was merged in pull request #1.
@@ -0,0 +1,82 @@
# Reference: an autonomous agent running as a RUNNER JOB (Module 19), triggered and scheduled.
#
# This is the "for real" version of agent_runner.py: instead of you launching the agent, the forge
# launches it on a runner in response to an event or a timer, and the agent opens a PR. That PR then
# hits your NORMAL gates (CI in Module 14, security scanning in Module 15, human review in Module 10)
# exactly like a human's PR. The supervision is structural; this file just automates the start.
#
# GitHub Actions flavor (same as Module 14's ci.yml), so it goes in .github/workflows/. Equivalents:
#   * GitLab: a job with `rules:` on $CI_PIPELINE_SOURCE, plus a pipeline schedule configured in the
#     project (schedules are not declared in the YAML itself).
#   * Forgejo/Gitea: the same YAML under .forgejo/workflows/ or .gitea/workflows/.
#
# DO NOT enable this blindly. Read the security notes at the bottom first: an unattended agent with a
# write token is automation acting in your name. This is the last thing you turn on, on purpose.

name: agent-issue-to-pr

on:
  # TRIGGERED: fire when an issue gets the `agent` label. Event in -> agent runs -> PR out.
  issues:
    types: [labeled]
  # SCHEDULED: also attempt work overnight. This is "the workflow runs itself", so keep it cheap.
  schedule:
    - cron: "0 6 * * *"   # 06:00 UTC daily; adjust to your timezone and budget.

jobs:
  agent:
    # Only run the triggered path when the label is actually `agent` (labeled events fire for ANY
    # label). The scheduled path has no label, so allow it through too.
    if: ${{ github.event_name == 'schedule' || github.event.label.name == 'agent' }}
    runs-on: ubuntu-latest   # whose compute this is; see Module 19 for self-hosted runners.

    # Least privilege (Module 17): grant ONLY what opening a PR needs. Not admin, not secrets access.
    permissions:
      contents: write        # create the branch and commit
      pull-requests: write   # open the PR
      issues: read           # read the issue body (the agent's brief)

    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install gate tools
        run: pip install pytest ruff

      - name: Run the agent on a fresh branch
        env:
          # The agent's model credentials come from a SCOPED secret you set in the forge, never
          # hardcoded here (Module 17). Keep this provider-neutral: it's whatever your agent needs.
          AGENT_API_KEY: ${{ secrets.AGENT_API_KEY }}
          # Point AGENT_CMD at your agentic tool's non-interactive / one-shot mode.
          AGENT_CMD: "your-agent-cli --print --prompt-file {prompt_file}"
          # Hand the UNTRUSTED issue body to the shell through the environment. Interpolating it
          # with an expression directly inside `run:` would let issue text execute as shell code.
          ISSUE_BODY: ${{ github.event.issue.body }}
        run: |
          # Commits need an identity on a fresh runner. These values are placeholders; use your bot's.
          git config user.name "agent-bot"
          git config user.email "agent-bot@example.invalid"
          git switch -c "agent/issue-${{ github.event.issue.number || github.run_id }}"
          # In the triggered case, write the issue body to a file for the agent to read.
          printf '%s' "$ISSUE_BODY" > issue.md
          python modules/25-autonomous-agents/lab/agent_runner.py issue-to-pr issue.md

      # The agent's output is a PROPOSAL. Open the PR; do NOT merge. CI + security + review decide.
      # (Use your forge's PR-creation step or CLI here; kept generic to stay vendor-neutral.)
      - name: Open a pull request for review
        run: |
          git push -u origin HEAD
          echo "Open a PR from this branch via your forge's API/CLI. It must pass CI (Module 14),"
          echo "security scanning (Module 15), and human review (Module 10) before anyone merges it."

# --- Security notes (read before enabling) -------------------------------------------------------
# * Prompt injection (Module 22): github.event.issue.body is UNTRUSTED input that lands straight in
#   the agent's context. A malicious issue can try to redirect the agent ("ignore your instructions,
#   exfiltrate secrets..."). Scope the token tightly so a hijack can't do much, and never give this
#   job access to deployment or admin secrets.
# * Script injection: for the same reason, never interpolate the issue body into a `run:` script
#   with an expression. Pass it to the step through `env:` and quote the variable in the shell.
# * No auto-merge. This file stops at "open a PR". Wiring an agent to merge its own work to main
#   removes the human gate and is out of scope for this course.
# * Sandbox (Module 16): for agents you trust less, run the agent step inside a container with no
#   network beyond what it needs.
# * Cost: a scheduled agent that re-attempts the same impossible issue every night burns runner
#   minutes. Cap retries (agent_runner.py does) and consider a label the agent removes when it gives
#   up, so it doesn't retry forever.
@@ -0,0 +1,258 @@
"""Module 25 lab: an autonomous-but-supervised agent orchestrator.

This is the smallest honest version of the two patterns in the module:

  * issue-to-pr: read an issue, let an agent implement it, run the gate, produce a PR PROPOSAL.
  * self-heal:   run the gate; on failure, feed the failure back to the agent for a fix,
                 bounded by a retry cap; produce a PR PROPOSAL.

The load-bearing idea is in one place and you should be able to point at it: the agent NEVER merges.
Every path that passes the gate ends at `propose_pr()`: a branch, a commit, and the command *you*
would run to open the PR. The CI/review/security gates (Modules 14/15/10) and recovery (Module 12)
are what supervise it, not a human watching it type.

Run it two ways:

1. Simulated (no agent needed, fully deterministic), to see the machinery and the gates:
       python agent_runner.py issue-to-pr issue-delete-command.md --simulate good
       python agent_runner.py issue-to-pr issue-delete-command.md --simulate bad
       python agent_runner.py self-heal --simulate bad
       python agent_runner.py self-heal --simulate stuck

   Simulation works on a SELF-CONTAINED demo target (agent_demo.py + test_agent_demo.py) so it is
   deterministic and never corrupts your real tasks-app files. The gate it runs (ruff + pytest) is
   the real one: the same checks Module 14's CI runs.

2. Real agent, which drives your own agentic tool against the actual issue. Point AGENT_CMD at your
   tool's non-interactive / one-shot mode, then drop --simulate:
       export AGENT_CMD='your-agent-cli --print --prompt-file {prompt_file}'
       python agent_runner.py issue-to-pr issue-delete-command.md
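
A minimal sketch of how an AGENT_CMD template can turn into an argument list. It is illustrative
only: the path below is made up, and splitting before substituting is one reasonable choice (it
keeps a path containing spaces as a single argument), not a statement about what your tool expects.

```python
import shlex

template = "your-agent-cli --print --prompt-file {prompt_file}"
prompt_file = "/tmp/agent brief.md"  # made-up path, with a space on purpose

# Split the template into arguments first, then substitute inside each argument,
# so the path is never re-split on its own whitespace.
argv = [part.replace("{prompt_file}", prompt_file) for part in shlex.split(template)]
print(argv)  # ['your-agent-cli', '--print', '--prompt-file', '/tmp/agent brief.md']
```

A template without {prompt_file} has nothing to substitute; in that case the prompt goes to the
tool on stdin instead.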

Language: Python 3.10+. Standard library only.
"""

from __future__ import annotations

import argparse
import os
import shlex
import subprocess
import sys
import tempfile
from pathlib import Path

# self-heal runs the gate at most this many times, which allows RETRY_CAP - 1 fix attempts before
# it hands off to a human.
RETRY_CAP = 3

# Demo target the simulator works on, so simulation never touches your real cli.py / tasks.py.
DEMO_SRC = Path("agent_demo.py")
DEMO_TEST = Path("test_agent_demo.py")

# Vendor-neutral: where your committed AI config (Module 5) might live. Override with AGENT_CONFIG.
CONFIG_CANDIDATES = ["AGENTS.md", ".agent/instructions.md", "agent-config.md"]


# --------------------------------------------------------------------------------------------------
# The gate: the same lint + test checks Module 14 runs in CI, run locally so they're reproducible.
# This is the structural supervision. It does not care whether a human or an agent wrote the change.
# --------------------------------------------------------------------------------------------------
def run_gate() -> tuple[bool, str]:
    """Run ruff then pytest in the current directory. Return (passed, combined_output)."""
    out: list[str] = []
    ok = True
    for label, cmd in (("ruff (lint)", ["ruff", "check", "."]),
                       ("pytest (tests)", ["pytest", "-q"])):
        out.append(f"\n=== gate: {label} -> {' '.join(cmd)} ===")
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            # A missing tool must fail the gate, never silently pass it.
            out.append(f"  ! {cmd[0]} not installed; run `pip install pytest ruff`. Treating as a gate FAIL.")
            ok = False
            continue
        out.append(proc.stdout.rstrip())
        if proc.stderr.strip():
            out.append(proc.stderr.rstrip())
        if proc.returncode != 0:
            ok = False
            out.append(f"  -> FAILED ({label})")
    return ok, "\n".join(out)


# --------------------------------------------------------------------------------------------------
# The agent: real (your tool) or simulated (deterministic, for the lab).
# --------------------------------------------------------------------------------------------------
def find_config() -> Path | None:
    env = os.environ.get("AGENT_CONFIG")
    if env and Path(env).exists():
        return Path(env)
    for name in CONFIG_CANDIDATES:
        if Path(name).exists():
            return Path(name)
    return None


def build_prompt(task: str, *, issue_path: Path | None = None, failure: str | None = None) -> str:
    """Assemble the agent's brief: standing config (Module 5) + the specific task (issue or failure)."""
    parts = ["You are working in a Git repository on the current branch. Make the change directly in",
             "the files. Do not commit, push, or merge; just edit. Follow the project's conventions."]
    config = find_config()
    if config:
        parts += ["", f"# Project conventions (from {config})", config.read_text()]
    if issue_path:
        parts += ["", "# Task (issue to implement)", issue_path.read_text()]
    if failure:
        parts += ["", "# A CI check just failed. Fix the CODE so it passes; do not weaken or delete",
                  "# the test to make it pass. Here is the failing output:", "```", failure, "```"]
    return "\n".join(parts)


def run_real_agent(prompt: str) -> None:
    """Drive the learner's agentic tool via AGENT_CMD. Template may contain {prompt_file}; otherwise
    the prompt is piped to stdin. Kept vendor-neutral on purpose."""
    template = os.environ["AGENT_CMD"]
    with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as fh:
        fh.write(prompt)
        prompt_file = fh.name
    try:
        if "{prompt_file}" in template:
            # Split first, then substitute per argument, so a temp path containing spaces or
            # backslashes stays one intact argument instead of being re-parsed by shlex.
            cmd = [part.replace("{prompt_file}", prompt_file) for part in shlex.split(template)]
            proc = subprocess.run(cmd)
        else:
            proc = subprocess.run(shlex.split(template), input=prompt, text=True)
        if proc.returncode != 0:
            sys.exit(f"agent command exited non-zero ({proc.returncode}); aborting.")
    finally:
        # Runs even when sys.exit() fires above, so the temp prompt file never lingers.
        os.unlink(prompt_file)


# Simulated agent: writes a self-contained demo module so the gate has something real to judge.
def simulate_implement(variant: str) -> None:
    DEMO_TEST.write_text(
        "from agent_demo import discount\n\n\n"
        "def test_discount_takes_a_percentage():\n"
        "    # 10% off 200 is 180. A flat subtraction (200 - 10 = 190) is the plausible-but-wrong bug.\n"
        "    assert discount(200, 10) == 180\n"
    )
    if variant == "good":
        DEMO_SRC.write_text("def discount(price, pct):\n    return price - price * pct / 100\n")
    else:  # 'bad': plausible but wrong, treats the percent as a flat amount.
        DEMO_SRC.write_text("def discount(price, pct):\n    return price - pct\n")


def simulate_fix(variant: str, attempt: int) -> None:
    if variant == "stuck":
        # The "agent" keeps producing plausible, still-wrong fixes; the loop must give up, not run forever.
        DEMO_SRC.write_text(f"def discount(price, pct):\n    return price - pct - {attempt}\n")
    else:  # 'bad': the first fix is the correct formula, so the gate goes green on its second run.
        DEMO_SRC.write_text("def discount(price, pct):\n    return price - price * pct / 100\n")


# --------------------------------------------------------------------------------------------------
# The endpoint every passing path shares: a PR PROPOSAL. Never a merge.
# --------------------------------------------------------------------------------------------------
def in_git_repo() -> bool:
    return subprocess.run(["git", "rev-parse", "--is-inside-work-tree"],
                          capture_output=True).returncode == 0


def propose_pr(message: str) -> None:
    print("\n" + "=" * 80)
    print("GATE PASSED. Proposing a PR, NOT merging. A human reviews the diff (Module 10).")
    print("=" * 80)
    if in_git_repo():
        subprocess.run(["git", "add", "-A"])
        subprocess.run(["git", "commit", "-m", message])
        branch = subprocess.run(["git", "rev-parse", "--abbrev-ref", "HEAD"],
                                capture_output=True, text=True).stdout.strip()
        print("\nReview the change you're about to propose:")
        print("  git show HEAD        # or: git diff main..HEAD")
        print("\nThen open the PR (nothing has left your machine yet):")
        print(f"  git push -u origin {branch}")
        print("  # ...and open a pull request on your forge. CI + security gates run there.")
    else:
        print("\n(Not a Git repo; skipping commit. In your tasks-app this would commit to the branch.)")
    print("\nThe agent stops here. It cannot merge. That is the whole safety model.")


def reject(reason: str, gate_output: str) -> None:
    print(gate_output)
    print("\n" + "=" * 80)
    print(f"GATE FAILED: {reason}")
    print("No PR proposed. The branch is left as-is for you to inspect or discard:")
    print("  git restore .        # throw away the agent's edits to tracked files (Module 2)")
    print("  git clean -n         # list untracked files the agent created; -f deletes them")
    print("=" * 80)


# --------------------------------------------------------------------------------------------------
# The two patterns.
# --------------------------------------------------------------------------------------------------
def cmd_issue_to_pr(issue_path: Path, simulate: str | None) -> int:
    print(f"[issue-to-pr] brief: {issue_path}")
    if simulate:
        print(f"[issue-to-pr] simulating a '{simulate}' agent on the self-contained demo target.")
        simulate_implement(simulate)
    else:
        run_real_agent(build_prompt("implement", issue_path=issue_path))

    ok, gate_output = run_gate()
    if ok:
        print(gate_output)
        propose_pr(f"Agent: implement {issue_path.stem}")
        return 0
    reject("the agent's change does not pass the gate", gate_output)
    return 1


def cmd_self_heal(simulate: str | None) -> int:
    # Establish a failing state to heal. In a real pipeline this is "CI just went red on a push".
    if simulate:
        print(f"[self-heal] simulating a red build ('{simulate}') on the demo target.")
        simulate_implement("bad")
    else:
        print("[self-heal] running the gate on the current working tree to find the failure...")

    # Each pass runs the gate once. Passes 1..RETRY_CAP-1 may ask for a fix; the last pass only checks.
    for attempt in range(1, RETRY_CAP + 1):
        ok, gate_output = run_gate()
        print(gate_output)
        if ok:
            print(f"\n[self-heal] gate is green after {attempt - 1} fix attempt(s).")
            propose_pr("Agent: self-healing fix for failing CI")
            return 0
        if attempt == RETRY_CAP:
            break  # final gate run was still red and no fix attempts remain
        print(f"\n[self-heal] gate red, attempt {attempt}/{RETRY_CAP - 1}: asking the agent for a fix.")
        if simulate:
            simulate_fix(simulate, attempt)
        else:
            run_real_agent(build_prompt("fix", failure=gate_output))

    print("\n" + "=" * 80)
    print(f"SELF-HEAL GAVE UP after {RETRY_CAP - 1} attempts. Handing off to a human, NOT looping forever.")
    print("This cap is what stops an agent burning a runner bill chasing a flaky or impossible fix.")
    print("=" * 80)
    return 2


def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(description="Autonomous-but-supervised agent orchestrator (Module 25).")
    sub = parser.add_subparsers(dest="command", required=True)

    p_itp = sub.add_parser("issue-to-pr", help="implement an issue and propose a PR")
    p_itp.add_argument("issue", type=Path, help="path to the issue markdown file")
    p_itp.add_argument("--simulate", choices=["good", "bad"], help="run without a real agent")

    p_sh = sub.add_parser("self-heal", help="fix a failing gate, bounded by a retry cap, and propose a PR")
    p_sh.add_argument("--simulate", choices=["bad", "stuck"], help="run without a real agent")

    args = parser.parse_args(argv)
    if not args.simulate and "AGENT_CMD" not in os.environ:
        sys.exit("No --simulate and no AGENT_CMD set. Set AGENT_CMD to your agent's headless command, "
                 "or pass --simulate to run the deterministic demo.")

    if args.command == "issue-to-pr":
        return cmd_issue_to_pr(args.issue, args.simulate)
    return cmd_self_heal(args.simulate)


if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1:]))
@@ -0,0 +1,35 @@
<!--
The agent's INPUT for Module 25. This is a well-formed issue in the Module 9 format: title,
context, acceptance criteria, scope. It is deliberately a good candidate for an agent: well-scoped,
concrete, and mirroring a pattern already in the codebase (the existing `done` command).

The orchestrator (agent_runner.py) reads this file and pairs it with your committed AI config
(Module 5) to build the agent's brief. Edit it and you change what the agent attempts.
-->

# Add a `delete <index>` command to the CLI

**Type:** feature · **Priority:** p2 · **Labels:** `cli`, `ready`, `agent`

## Context

`tasks-app` can `add`, `list`, and mark a task `done`, but there's no way to remove a task. Once a
task is added by mistake it stays forever. The `done` command already takes an index and mutates the
list through a method on `TaskList`, so a `delete` command should follow the exact same shape: this
is a patterned change, not a design problem.

## Acceptance criteria

- `python cli.py delete <index>` removes the task at that 0-based index and saves the list.
- After deleting, the remaining tasks keep their relative order.
- `delete` with an out-of-range or non-integer index prints a clear error (e.g.
  `no task at index 99`) and exits non-zero, instead of dumping a traceback.
- The logic lives on `TaskList` (a `remove(index)` method or equivalent), mirroring how `complete`
  works; `cli.py` only parses arguments and calls it.
- A test covers: a successful delete removes the right task, and an out-of-range delete is handled.
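
For orientation, the test criterion could be pinned down along these lines. This is a sketch under
stated assumptions, not the project's code: the `TaskList` below is a hypothetical stand-in with
just enough behaviour to run, and raising `IndexError` is one assumed way to surface the error so
that `cli.py` can print it and exit non-zero.

```python
from dataclasses import dataclass, field


@dataclass
class TaskList:
    """Hypothetical stand-in for the real TaskList; only what this sketch needs."""
    tasks: list[str] = field(default_factory=list)

    def remove(self, index: int) -> str:
        """Remove and return the task at a 0-based index; reject anything out of range."""
        # Reject negatives explicitly: list.pop(-1) would silently delete the last task.
        if not 0 <= index < len(self.tasks):
            raise IndexError(f"no task at index {index}")
        return self.tasks.pop(index)


def test_delete_removes_the_right_task_and_keeps_order():
    tl = TaskList(["a", "b", "c"])
    assert tl.remove(1) == "b"
    assert tl.tasks == ["a", "c"]


def test_delete_out_of_range_is_handled():
    tl = TaskList(["a"])
    try:
        tl.remove(99)
    except IndexError as exc:
        assert str(exc) == "no task at index 99"
    else:
        raise AssertionError("expected IndexError for an out-of-range index")
    # A rejected delete must leave the list untouched.
    assert tl.tasks == ["a"]
```

The real tests should exercise the project's own `TaskList`; non-integer input is a parsing
concern and belongs in `cli.py`.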

## Out of scope

- Changing how tasks are stored or numbered.
- Bulk delete, undo, or a confirmation prompt.
- Reworking the existing `add` / `list` / `done` commands.