fix(M7-27+capstone): apply AI-drives-git reframe, lesson=theory, de-slop course-wide

Phase 2 sweep — all modules are post-pivot, so the learner directs the AI agent
(Claude Code as the worked example) to do the git/setup work and verifies, instead
of typing commands by hand; no re-teaching basics. Lesson sections are theory with
example output; all execution lives in the labs. De-slopped ("prose" etc. gone
course-wide, em-dash density thinned). /path/to placeholders -> ~/ai-workflow-course.
Every deliberate teaching device verified intact: M10 ai-change.patch trap,
M12 bad-clear-snippet, M13/M27 planted pending_count bug, M15 secret+typosquat+MD5,
M18 BREAK=1, M21 absent-.gitignore, M22 poisoned skill, M24 no-op patch, M25 --simulate.
Labs compile/parse (py/sh/yaml/json); no junk.
Closes #83
Closes #86
Closes #89
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
@@ -1,7 +1,7 @@
 # Module 19 — Runners: The Compute Behind the Automation

 > **Every green check in the last five modules ran on someone else's computer. This module is where
-> you find out whose — and decide whether it should be yours.** Owning the runner is what turns "I
+> you find out whose, and decide whether it should be yours.** Owning the runner is what turns "I
 > use a CI pipeline" into "I own the pipeline, end to end."

 ---
@@ -85,7 +85,7 @@ A **self-hosted runner** runs that exact same loop — register, poll, execute,
 machine *you* own: a spare server, a VM in your own cloud account, a box in your homelab, a beefy
 workstation under a desk. You install the forge's runner agent, register it with a token, and it
 starts pulling jobs. To the pipeline author, almost nothing changes; the workflow just targets your
-runner instead of a hosted one (more on the targeting mechanic below).
+runner instead of a hosted one (the targeting mechanic is below).

 This is the compute analogue of the Module 8 decision. There, you chose between pushing your repo to
 a hosted forge versus self-hosting one. Here, you choose between renting compute to run your
@@ -110,8 +110,8 @@ Don't self-host for the vibe of it. Self-host when one of these actually applies
 (Module 18) needs to deploy to a server on your private network. Your tests need a database that
 lives on an internal VLAN. A hosted runner sits on the public internet and cannot reach any of
 that without you punching holes in your firewall. A self-hosted runner placed *inside* your
-network already has line-of-sight — no inbound holes, no VPN gymnastics. (This is also exactly why
-it's a security problem; hold that thought.)
+network already has line-of-sight, with no inbound holes and no VPN gymnastics. (This is also
+exactly why it's a security problem; hold that thought.)

 4. **Custom or specialized hardware.** GPUs for ML work, a specific CPU architecture, more RAM than
 any hosted tier offers, a hardware security module, a USB device for hardware-in-the-loop tests.
@@ -125,44 +125,50 @@ If none of these apply, stay on hosted. "I want to" is not on the list.

 ### The mechanic: register, target, run

-The shape is the same on every forge; only the command names and config filenames differ. The
-pattern, vendor-neutral:
+The shape is the same on every forge; only the command names and config filenames differ. Three
+moving parts, vendor-neutral.

-- **Get a registration token** from the forge — at the repo, org, or instance level, in the
-forge's settings under its "Runners" or "CI/CD" section. The token is short-lived and proves you're
-allowed to attach a runner here.
-- **Run the runner agent's register/config command** on your machine, pointing it at your forge URL
-and handing it the token. This writes a small local config/identity file and starts the agent
-polling. Concretely, the agent and command differ per forge — for example:
-- GitHub-style Actions: a `config` script that registers the agent, then a `run` script (or a
-service) that starts polling.
-- GitLab: a `gitlab-runner register` command, then the runner runs as a service.
-- Forgejo/Gitea: an `act_runner register` command (Actions-compatible), then `act_runner daemon`.
+A **registration token** ties a runner to a forge. It's generated in the forge's settings, under its
+"Runners" or "CI/CD" section, at the repo, org, or instance level. It's short-lived and proves the
+runner is allowed to attach here. Because it lives behind the forge's web UI, this is the one part of
+standing up a runner that stays a human-in-the-browser step.

-All three do the same two things: *register an identity*, then *start the poll loop.* Don't memorize
-the flags — read your forge's runner docs at build time (the commands drift; see the checklist).
-- **Label the runner and target it from the workflow.** A runner advertises **labels** (e.g.
-`self-hosted`, `linux`, `gpu`, `internal-net`). Your job selects runners by label — in
-Actions-style YAML that's the `runs-on:` field; in GitLab it's `tags:`. So changing a job from
-hosted to your own runner is often a one-line edit:
+A **register/config command** turns that token into a running agent. The agent and its flags vary by
+forge: GitHub-style Actions uses a `config` script then a `run` script (or a service); GitLab uses
+`gitlab-runner register`; Forgejo/Gitea use `act_runner register` then `act_runner daemon`. Every one
+does the same two things, though: write a small local identity file, then start the poll loop. A
+successful registration confirms the runner and it shows up online in the forge. What that looks like:

-```yaml
-# before — hosted:
-runs-on: ubuntu-latest
-# after — your runner, selected by label:
-runs-on: [self-hosted, linux, internal-net]
-```
+```text
+$ act_runner register --instance https://git.example.com --token *** --labels self-hosted,linux
+INFO Runner registered successfully.
+INFO Runner self-hosted is now online.
+```

-That one line is the whole "I now own this pipeline" switch. Everything else in your Module 14
-workflow stays identical, because the runner runs the same loop either way.
+The flags drift between releases, so they're something to look up against current runner docs rather
+than memorize (see the checklist).
+
+A **label** is how a workflow picks a runner. A runner advertises labels (`self-hosted`, `linux`,
+`gpu`, `internal-net`); a job selects them with `runs-on:` in Actions-style YAML, or `tags:` in
+GitLab. So moving a job from hosted to your own runner is one line:
+
+```yaml
+# before — hosted:
+runs-on: ubuntu-latest
+# after — your runner, selected by label:
+runs-on: [self-hosted, linux, internal-net]
+```
+
+That one line is the whole "I now own this pipeline" switch. Everything else in your Module 14
+workflow stays identical, because the runner runs the same loop either way.

 ### Ephemeral vs. persistent — the property that matters most

 A hosted runner is **ephemeral**: fresh machine per job, destroyed after. A self-hosted runner is
 **persistent by default**: the same machine, with the same disk, runs job after job. That difference
-is the source of nearly every self-hosted runner security incident, so it gets its own section
-below — but flag it now. The clean-room guarantee you got for free with hosted runners is something
-you have to *rebuild on purpose* when you self-host.
+is the source of nearly every self-hosted runner security incident, so it gets its own section below;
+flag it now. The clean-room guarantee you got for free with hosted runners is something you have to
+*rebuild on purpose* when you self-host.

 ---

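The label mechanic in the hunk above can be modelled in a few lines. This is a minimal sketch, not any forge's scheduler: it assumes the all-labels-must-match rule that Actions-style `runs-on:` lists and GitLab `tags:` follow, and the `Runner` class, the `eligible_runners` helper, and the runner names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    """Hypothetical stand-in for a registered runner and the labels it advertises."""
    name: str
    labels: frozenset


def eligible_runners(runs_on, runners):
    """Return names of runners that advertise every label the job asks for."""
    # `runs-on:` may be a single label or a list of labels.
    wanted = {runs_on} if isinstance(runs_on, str) else set(runs_on)
    return [r.name for r in runners if wanted <= r.labels]


# Hypothetical fleet: one hosted image and two self-hosted machines.
RUNNERS = [
    Runner("hosted-1", frozenset({"ubuntu-latest"})),
    Runner("homelab-1", frozenset({"self-hosted", "linux", "internal-net"})),
    Runner("gpu-box", frozenset({"self-hosted", "linux", "gpu"})),
]

# before: hosted
print(eligible_runners("ubuntu-latest", RUNNERS))
# after: your runner, selected by label
print(eligible_runners(["self-hosted", "linux", "internal-net"], RUNNERS))
```

Under that assumption the first call selects only `hosted-1` and the second only `homelab-1`, which is why the switch is a one-line edit: the job body is unchanged and only the label set it asks for differs.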
@@ -180,7 +186,7 @@ biggest line item. When you reach Module 25 and stand up an agent that runs unat
 *this* is the machine it runs on.

 **2. The agent needs hands, and the self-hosted runner is the hands.** A self-hosted runner inside
-your network is the most direct way to give an automated agent real reach — deploy access, internal
+your network is the most direct way to give an automated agent real reach: deploy access, internal
 databases, private services. That's the payoff and the peril in one sentence. The same property that
 makes a self-hosted runner useful for an unattended agent (it can touch your real systems) is exactly
 what makes it dangerous when the code it runs isn't yours. Which brings us to the part you cannot skip.
@@ -214,17 +220,20 @@ a repo also works). If a real runner is too heavy right now, Track A alone satis
 would see if they got code execution on it.
 - For Track B: a forge you can register a runner against, and a spare machine or VM to be the runner
 (your laptop is fine for a one-off; don't leave it registered).
-- Your AI assistant.
+- Claude Code (sub your own agent).

 ### Track A — Find out whose computer you've been using (everyone)

-1. **Make the invisible visible.** Copy `lab/whoami-runner.yml` into your repo's workflow directory
-(the same place your Module 14 `ci.yml` lives — for Actions-style forges that's
-`.github/`/`.forgejo/`/`.gitea/` under `workflows/`; the file comments tell you where). Commit and
-push. It runs the same lint-and-test as Module 14, then prints the runner's hostname, OS, user,
-whether it looks ephemeral, and whether it can reach the public internet. The receipt step carries
-`if: always()` so it still prints even when lint or test fail — a diagnostic shouldn't disappear on
-a red build (the job still reports red). On GitLab CI the same idea is `when: always` on the job.
+1. **Make the invisible visible.** Direct Claude Code (sub your own agent) to place
+`lab/whoami-runner.yml` in the same workflow directory your Module 14 `ci.yml` lives in, then
+commit and push it. State the goal, not the path: *"Drop this whoami-runner workflow into the right
+workflows directory for this forge, commit it, and push."* The agent resolves the directory for an
+Actions-style forge (`.github/`/`.forgejo/`/`.gitea/` under `workflows/`). **You verify:** the run
+shows up on the forge. It runs the same lint-and-test as Module 14, then prints the runner's
+hostname, OS, user, whether it looks ephemeral, and whether it can reach the public internet. The
+receipt step carries `if: always()` so it still prints even when lint or test fail — a diagnostic
+shouldn't disappear on a red build (the job still reports red). On GitLab CI the same idea is
+`when: always` on the job.

 2. **Read the receipt.** Open the job logs on your forge and read the `Where did this run?` step.
 You're now able to answer, for a real job, the question this module opened with: *whose computer
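To make the "receipt" idea in Track A step 1 concrete, here is a rough sketch of gathering the same kind of facts from inside a job. It is not the contents of `lab/whoami-runner.yml`; the `runner_receipt` function is hypothetical, and it assumes the `CI` environment variable that many CI systems set for their jobs. The ephemeral and internet-reach checks are left out.

```python
import os
import platform
import socket


def runner_receipt(env=None):
    """Collect a few facts about the machine this code is running on."""
    env = os.environ if env is None else env
    return {
        "hostname": socket.gethostname(),
        "os": f"{platform.system()} {platform.release()}",
        # USER on Unix-like systems, USERNAME on Windows; "unknown" if neither is set.
        "user": env.get("USER") or env.get("USERNAME") or "unknown",
        # Assumption: the CI system exports a non-empty CI variable to its jobs.
        "running_in_ci": bool(env.get("CI")),
    }


for key, value in runner_receipt().items():
    print(f"{key}: {value}")
```

Run on a laptop and then inside a CI job, the two printouts differ in exactly the fields the step asks the learner to read: which host, which OS, which user.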
@@ -243,27 +252,29 @@ a repo also works). If a real runner is too heavy right now, Track A alone satis
 private hosts on your network are reachable. This is not hypothetical. A workflow step is a shell
 command; whatever the script can see, a malicious workflow step can see too.

-4. **Walk the tradeoff with your AI, grounded in that output.** Paste the `inspect-runner.sh` output
-into your AI and ask: *"If this machine were a self-hosted CI runner and someone opened a pull
-request with a malicious workflow step, what could they reach or steal? Rank it worst-first."*
-Read the answer against your real output. This is the honest version of "why you'd run your own" —
-the network reach that makes a self-hosted runner *useful* is the exact same reach that makes a
-compromised one *catastrophic.*
+4. **Walk the tradeoff with Claude Code (sub your own agent), grounded in that output.** Paste the
+`inspect-runner.sh` output into the agent and ask: *"If this machine were a self-hosted CI runner
+and someone opened a pull request with a malicious workflow step, what could they reach or steal?
+Rank it worst-first."* Read the answer against your real output. This is the honest version of "why
+you'd run your own" — the network reach that makes a self-hosted runner *useful* is the exact same
+reach that makes a compromised one *catastrophic.*

 ### Track B — Own the pipeline (if you can attach a runner)

 5. **Get a registration token.** In your forge's settings, find the Runners / CI/CD section and
 generate a runner registration token (repo-level is the tightest scope — start there).

-6. **Register the runner.** On your runner machine, download your forge's runner agent and run its
-register command, pointing at your forge URL with the token, and give it a clear label like
-`self-hosted`. The exact command is forge-specific — open your forge's runner docs and follow the
-register step (the Key concepts section names the three common agents). When it's registered, start
-the agent so it begins polling. Confirm it shows as **online** in the forge's Runners list.
+6. **Register the runner.** Hand this to Claude Code (sub your own agent) on your runner machine:
+*"Look up the current runner-agent docs for my forge, then download the agent, register it against
+my forge URL with this token, label it `self-hosted`, and start it polling."* The commands are
+forge-specific and drift between releases, which is exactly why you let the agent fetch the current
+docs instead of running a half-remembered command. **You verify:** the runner shows as **online**
+in the forge's Runners list.

-7. **Aim CI at your runner — the one-line switch.** Edit the `runs-on:` (or `tags:`) line in your
-`tasks-app` CI workflow to select your runner's label instead of the hosted image, exactly as
-shown in Key concepts. Commit and push.
+7. **Aim CI at your runner — the one-line switch.** Tell Claude Code (sub your own agent): *"Change
+the `runs-on:` (or `tags:`) line in the `tasks-app` CI workflow to target my `self-hosted` runner
+instead of the hosted image, then commit and push."* That's the before/after edit from Key
+concepts. **You verify:** from the job log, the run executed on your own runner.

 8. **Watch your own machine do the work.** Open the job logs. The lint-and-test pass from Module 14
 now runs on hardware you own. Re-run the `whoami-runner.yml` workflow too and compare its output to
@@ -271,9 +282,10 @@ a repo also works). If a real runner is too heavy right now, Track A alone satis
 machine. Run it twice and look for leftovers (a `pip` cache, files from the previous run). That
 persistence is the thing to respect.

-9. **Clean up.** If this was a one-off on your laptop, **remove the runner** from the forge and stop
-the agent. A registered-but-forgotten runner is a standing liability — exactly the kind of stale
-backdoor the security section warns about.
+9. **Clean up.** Have Claude Code (sub your own agent) stop and unregister the runner agent on your
+machine. Then **remove the runner** from the forge's Runners list yourself; that side is a forge-UI
+step. **You verify:** the runner disappears from the list. A registered-but-forgotten runner is a
+standing liability, exactly the kind of stale backdoor the security section warns about.

 ---

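The "run it twice and look for leftovers" check in step 8 can be illustrated with a marker file. This is a small sketch under stated assumptions: the marker name and the `check_and_mark` helper are hypothetical, and a temporary directory stands in for a persistent runner's disk.

```python
import tempfile
from pathlib import Path

MARKER = ".runner-was-here"  # hypothetical marker filename


def check_and_mark(workdir):
    """Return True if an earlier job left its marker in workdir, then leave ours."""
    marker = Path(workdir) / MARKER
    seen_before = marker.exists()
    marker.write_text("left by an earlier job\n")
    return seen_before


# One directory reused across two "jobs" models a persistent runner's disk.
with tempfile.TemporaryDirectory() as persistent_disk:
    print(check_and_mark(persistent_disk))  # first job: nothing left behind yet
    print(check_and_mark(persistent_disk))  # second job: sees the first job's file
```

On an ephemeral runner every job would behave like the first call, because the disk is new each time. On a persistent runner the second call is the normal case, and the leftover could just as easily be a credential or a poisoned cache as a harmless marker.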