# Module 21 — Skills: Teaching the AI Your Playbook

> **Stop re-explaining your own procedures.** A skill is a repeatable workflow written down once,
> committed, and invoked on demand — so the AI does the thing *your* way, the same way, every time,
> without you narrating the steps again.

---

## Prerequisites

- **Module 2** — you commit, read diffs, and treat the repo as durable memory. Skills live in that repo and are versioned exactly like code.
- **Module 3** — markdown-as-versioned-text, and the `CHANGELOG.md` convention this module's lab writes to.
- **Module 4** — the AI lives in your editor/CLI and reads your files directly. A skill is a file it loads; a browser chat can't pick one up automatically.
- **Module 5 — the one this builds on directly.** You committed an always-on instructions file that tells the AI how the project works in general. This module is its **structured big sibling**: the same write-it-down-and-commit instinct, but for *specific repeatable procedures* invoked on demand.
- **Module 13** — what a real test is (and why "it didn't crash" isn't one). The lab's procedure includes writing one.
- *Helpful, not required:* **Module 20 (MCP)** — a skill's steps can call the real tools an MCP server exposes, which is where playbooks get genuinely powerful.

---

## Learning objectives

By the end of this module you can:

1. Explain the difference between an **always-on instructions file (Module 5)** and a **skill** — and say when each is the right tool.
2. Write a skill: a structured, named, invokable playbook for a recurring task, in your tool's format, built on the format-agnostic essentials (when-to-use, inputs, ordered steps, done-criteria).
3. Have the AI **execute** a skill end to end and verify it followed every step.
4. Keep skills in version control so a procedure is shareable, reviewable, and recoverable like any other artifact.
5. Recognize when a one-off prompt has earned promotion into a durable skill — and when it hasn't.

---

## Key concepts

### The pain: you keep narrating the same procedure

You've written the Module 5 instructions file, and it's working — the AI knows your layout, your test command, your off-limits files. But there's a class of knowledge it doesn't cover: **multi-step procedures you run again and again.**

"Add a new CLI command" is the canonical example. Done properly it's never one edit — it's: put the logic in the right file, wire the CLI, write a test that actually checks the behavior, run the tests, smoke-test the command, add a changelog line, commit it as one clean change.

The AI can do every step. But left to a bare prompt — *"add a `clear` command"* — it'll usually give you the code and forget the test, or skip the changelog, or commit `tasks.json` along for the ride. So you spell out the seven steps. It works. Next week you add another command and **you spell out the same seven steps again.**

That re-narration is the exact pain Module 1 named, one level up: not re-explaining the *project* each session, but re-explaining the *procedure* each time you run it. A skill is where that procedure stops being something you retype and becomes something the repo carries.

### What a skill is

A **skill** is a named, structured, invokable set of instructions for one repeatable procedure, stored as a file in the repo and loaded **on demand** when that procedure is the task at hand. Strip the vendor branding and every skill has the same four parts:

- **A name and a "when to use it."** So both you and the AI know which playbook applies — and, just as importantly, when it *doesn't*.
- **Inputs.** The few things the procedure needs to be told (here: the command name and what it does).
- **Ordered steps.** The actual procedure — the commands, the files, the checks, in sequence, with the non-negotiables marked ("run the tests before claiming success," "don't stage `tasks.json`").
- **Done-criteria.** How the AI (and you) know it's actually finished, not just "produced something."

That's it. A skill is a checklist precise enough that an agent can execute it and you can verify it did.

### Skill vs. the Module 5 instructions file

This is the distinction to lock in, because the two are siblings and easy to conflate:

| | **Committed instructions file (Module 5)** | **Skill (this module)** |
|---|---|---|
| Scope | How the project works, *in general* | How to do *one specific procedure* |
| When it loads | **Always on** — read every session | **On demand** — invoked when relevant |
| Shape | Ambient briefing: conventions, commands, don't-touch list | A playbook: when-to-use, inputs, ordered steps, done-criteria |
| Analogy | The standing house rules posted on the wall | A labeled recipe card you pull out when you cook that dish |

They're complementary. The instructions file is the right home for facts true *all the time* ("tests run with `python -m unittest`"). A skill is the right home for a procedure you run *sometimes* ("here is exactly how we add a command"). Module 5 even told you this was coming: start with the always-on file; graduate a procedure into a skill when it earns its own page.

### Why "on demand" is the whole point

Module 5 warned that **bloat kills an instructions file** — a 300-line always-on briefing gets read the way you read a terms-of-service. So you *can't* solve the re-narration problem by stuffing every procedure into the always-on file; you'd drown the signal that makes it work.

Skills are the escape hatch. Because a skill loads only when its procedure is the task, you can write it in full detail — every step, every guardrail — without taxing every unrelated session. Ten skills cost the AI nothing on a session that invokes none of them. This is **progressive disclosure**: keep the always-on context lean, and pull in the deep procedure exactly when it's needed.
It's the same reason you don't tape every recipe you own to the kitchen wall.

### Skills live in version control

This is what makes a skill more than a snippet in a notes app, and it's why this module sits where it does in the course. A skill is a file in the repo, so everything you already learned about versioned text applies to it directly:

- **Recoverable and historied (Module 2).** A skill has a `git log`. You can see when a step was added and why, and `git restore` a botched edit. The procedure is a checkpoint like any other.
- **Shareable (Modules 8 & 11).** Push the repo and the whole team — and every agent that later operates on it — inherits the same playbook. Nobody runs their own private version of "how we add a command." It's the Module 5 anti-drift argument, applied to procedures.
- **Reviewable (Module 10).** Changing how the AI performs a procedure arrives as a **diff in a PR**. Tightening "add a test" into "add a test that asserts the end state, not just no-crash" is a reviewable change to your team's workflow — not an invisible tweak in one person's setup.

A prompt you keep in your head dies with the session. A skill in the repo is durable, shared capability. That's the upgrade: from one-off prompting to a versioned, reviewable asset.

### Naming the pattern, not the vendor

"Skills" is one name for this. Tools also call them custom commands, slash commands, recipes, prompts, playbooks, or modes, and they load them differently — some auto-discover a dedicated folder, some need you to point at a file, some let your always-on instructions file say *"when asked to add a command, follow `add-command.md`."* **The durable pattern is the same in all of them: a named, invokable file of structured steps for a repeatable procedure, kept in the repo.** Learn the pattern; map it onto whatever your tool calls it. As with everything in this course, the model and the tool are swappable; the playbook you wrote is the part that lasts.
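
Because the pattern is tool-agnostic, the four parts of a skill can be checked mechanically whatever your tool calls them. The sketch below is one hedged way to do that in Python: it scans a markdown skill file for headings and reports which of the four parts are missing. The heading names in `REQUIRED_SECTIONS`, the function `missing_sections`, and the sample text are all hypothetical choices for illustration; the lab's starter file may label its sections differently.

```python
import re

# Hypothetical heading names for the four parts; rename to match your skill file.
REQUIRED_SECTIONS = ("when to use", "inputs", "steps", "done when")


def missing_sections(skill_text: str) -> list[str]:
    """Return the required parts that have no markdown heading in skill_text."""
    # Collect every heading title (lines starting with 1-6 '#'), lowercased.
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+?)\s*$", skill_text, flags=re.MULTILINE)
    ]
    return [
        name
        for name in REQUIRED_SECTIONS
        if not any(heading.startswith(name) for heading in headings)
    ]


# Illustrative skill text that is deliberately missing its done-criteria.
EXAMPLE_SKILL = """\
# Skill: add a tasks-app command

## When to use
Adding a new CLI command to tasks-app.

## Inputs
- command name
- what the command does

## Steps
1. Put the logic in tasks.py.
2. Wire the command into cli.py.
"""

print(missing_sections(EXAMPLE_SKILL))  # -> ['done when']
```

A check like this only proves the sections exist, not that the steps are any good; that still takes a reviewer reading the diff.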

### Skills compose with your tools

A skill's steps aren't limited to editing files. They can drive the test runner, the CLI, Git — and, once you have **Module 20's MCP** servers wired up, the real systems behind them (open the issue, hit the staging API, query the database). A skill is where you encode *"use these hands, in this order, to get this outcome."* The deeper your toolchain, the more a written playbook is worth — because there are more steps to get wrong, and more value in getting them right every time.

---

## The AI angle

A generic automation course would call this "write a runbook." The AI-specific twist is what makes it land:

- **The AI will execute the playbook, not just read it.** A runbook for a human is a reminder; a skill for an agent is something it *performs*. The precision pays off immediately — vague step, vague result; imperative step ("run `python -m unittest`; do not claim success until it's green"), reliable result.
- **The AI is confidently incomplete without one.** Asked to "add a command," it'll happily stop at the code and skip the test, the changelog, the clean commit — and sound finished doing it. The skill is how you make *complete* the default instead of a thing you have to keep catching.
- **The skill outlives the model.** Swap models next quarter and the playbook carries over unchanged. You encoded the *procedure*, not the prompt that happened to coax it out of this month's model. The workflow is the durable skill; the model is the swappable part — here, literally.

---

## Hands-on lab

**Lab language:** markdown (the skill file) plus shell and Python (the `tasks-app`). You'll write a skill, then have your editor-integrated AI (Module 4) execute it.

You'll write a skill for the procedure from *Key concepts* — **add a new `tasks-app` command, end to end: code + test + changelog + clean commit** — and then watch the AI run it on a command it's never seen, producing all four parts without you listing the steps.
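
Of those four parts, the test is the one most often done weakly, so it helps to see the difference before you start. The following is a minimal sketch, assuming a hypothetical in-memory `TaskList` class rather than the lab's real `tasks.py` (whose API is not shown here): one test only proves `clear()` doesn't crash, the other asserts the end state.

```python
import unittest


class TaskList:
    """Hypothetical in-memory stand-in for the tasks-app logic (not the lab's tasks.py)."""

    def __init__(self) -> None:
        self.items: list[str] = []

    def add(self, title: str) -> None:
        self.items.append(title)

    def clear(self) -> None:
        self.items.clear()


class ClearTests(unittest.TestCase):
    def test_clear_does_not_crash(self) -> None:
        # Weak: this would still pass if clear() silently did nothing.
        TaskList().clear()

    def test_clear_leaves_list_empty(self) -> None:
        # Real: set up state, act, then assert on the end state.
        tasks = TaskList()
        tasks.add("x")
        tasks.add("y")
        tasks.clear()
        self.assertEqual(tasks.items, [])


# Run the suite programmatically so the sketch is self-contained;
# in the lab you would simply run `python -m unittest`.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClearTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test is the shape the skill's "write a real test" step is asking for: it would fail if `clear()` left anything behind, which is exactly what the first test cannot detect.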

**You'll need:**

- Your agentic coding tool from Module 4, and knowledge of how it loads a procedure (a skills/commands folder it auto-discovers, or simply pointing it at a file by name — check its docs).
- A Python 3.10+ `tasks-app`. Use the snapshot in this module's `lab/tasks-app/` (it has `add`, `list`, `done`, `count`, a `test_tasks.py`, and a `CHANGELOG.md`), or carry forward your own from earlier modules. Make it a Git repo if it isn't: `git init && git add . && git commit -m "Start"`.

### Part A — Install the skill

1. Copy this module's starter skill, `lab/add-command-skill.md`, into your `tasks-app` repo wherever your tool expects procedures. If your tool auto-discovers a folder, put it there under a clear name (e.g. `add-command.md`). If it doesn't, just drop it at the repo root — you'll invoke it by name.

   ```bash
   cd ~/workflow-course/tasks-app
   cp /path/to/modules/21-skills-teaching-the-ai-your-playbook/lab/add-command-skill.md add-command.md
   ```

2. Read it. The whole file is short on purpose — when-to-use, inputs, seven ordered steps, and done-criteria. Confirm every project fact in it matches *your* app (test command, file names, the off-limits `tasks.json`). A skill with wrong facts misdirects the AI worse than no skill.

3. **Commit it.** This is the point — the procedure now lives in version control:

   ```bash
   git add add-command.md
   git commit -m "Add skill: add a tasks-app command end to end"
   ```

### Part B — Invoke it

4. Start a **fresh** AI session in your editor and invoke the skill the way your tool does it — its slash command / skill name, or plainly: *"Follow `add-command.md` to add a `clear` command that removes all tasks."* Crucially, **don't list the steps yourself.** The skill is supposed to supply them.

5. Watch it perform the procedure.
   A correctly-followed skill will, without you saying any of it:

   - add `clear()` to `tasks.py` and wire a `clear` branch into `cli.py` (logic in the right file);
   - add a real test to `test_tasks.py` that asserts the list is empty afterward (not just "no crash");
   - run `python -m unittest` and show it green;
   - smoke-test `python cli.py clear` and show the output;
   - add a `CHANGELOG.md` line;
   - stage code + test + changelog into one commit, **without** `tasks.json`.

### Part C — Verify it followed the playbook

6. Don't take the AI's word for it. Check against the skill's own done-criteria:

   ```bash
   python -m unittest                                                  # green, and a clear-related test is present
   python cli.py add "x" && python cli.py clear && python cli.py list  # -> (no tasks yet)
   git show --stat HEAD                                                # one commit: tasks.py, cli.py, test_tasks.py, CHANGELOG.md — no tasks.json
   ```

If a step was skipped, that's the lab working: it shows you exactly where your wording was too soft. Tighten that line, commit the skill change, and run it again on a second command (`high` to flag a task, say). **A skill you improve once and reuse forever is the deliverable** — not the one `clear` command.

### Part D — See it as a reviewable, reusable asset

7. Look at what you built:

   ```bash
   git log --oneline add-command.md      # the procedure's own history
   git log -p -1 -- add-command.md       # if you tightened it in Part C — your workflow change as a diff
   ```

That diff *is* a change to how your team adds commands — readable, attributable, revertable. In a team repo (Modules 8, 11) it reaches everyone on `git pull`; behind review (Module 10) it lands as a PR someone approves. You've turned a procedure you used to narrate into a versioned capability.

---

## Where it breaks

- **A skill is guidance, not enforcement — same caveat as Module 5.** It strongly biases the AI; it doesn't bind it. The agent can still skip a step, especially a soft one, especially late in a long session.
  The steps that *can't* be skipped are the ones backed by **CI (Module 14)** — the test the skill tells it to write only truly gates anything once a pipeline runs it on every push. Write the done-criteria as hard checks, and let CI be the backstop.
- **Skills rot.** A playbook that says "tests run with X" after you've moved to Y will confidently march the AI off a cliff. Skills are code-adjacent: review them, update them, delete the ones you no longer run. Committing them (so changes are visible) is what makes that maintainable.
- **Don't skillify everything.** A skill earns its place when a procedure is *repeated*, *multi-step*, and *gets done wrong without one*. A one-off task doesn't need a playbook, and a pile of near-duplicate skills is its own kind of bloat — now you're maintaining ten files and the AI has to pick the right one. Promote a prompt to a skill the third time you've typed it, not the first.
- **Overlap with the always-on file causes drift.** If a fact lives in both your Module 5 instructions file *and* a skill, you'll eventually update one and not the other. Keep general facts in the always-on file and *reference* them from skills; don't duplicate them.
- **A skill is not a security boundary.** "Don't stage `tasks.json`" is a convention, not a permission. An installed third-party skill is untrusted code that runs against your repo — vetting, permissions, and prompt-injection defense are **Module 22's** job, immediately next, for exactly this reason.

---

## Check for understanding

**You're done when:**

- Your `tasks-app` repo has a committed skill file for "add a command," with `git log` showing the commit that added it.
- You've invoked that skill and watched a fresh AI session produce **all four** parts — code, a real test, a changelog entry, and one clean commit — *without you listing the steps that session*.
- You've verified it against the skill's done-criteria (tests green, command works, the commit contains the right files and not `tasks.json`) rather than trusting the AI's summary.
- You can state, in one sentence, when to put knowledge in the always-on instructions file (Module 5) versus a skill: general facts go in the file that's always read; a specific repeatable procedure goes in a playbook invoked on demand.

When adding the *next* command is "invoke the skill" instead of "re-explain the seven steps," the playbook is doing its job.

Module 22 comes next, and not by accident: Unit 4 just gave the AI hands — MCP servers and skills — and the very next thing is securing them, because an installed skill or server is untrusted code running in your environment.

---

## Verify-before-publish

This is expansion-zone material; the *concept* is durable but tool specifics drift. Re-check at build time:

- [ ] **Skill terminology and mechanics.** Confirm how mainstream agentic tools name and load skills (skills / custom commands / slash commands / recipes / prompts), whether they auto-discover a folder or need an explicit pointer, and any required file format/frontmatter — without pinning the lesson to one vendor. Update the "Naming the pattern" paragraph if the common vocabulary has shifted.
- [ ] **No vendor leaked in.** Verify the module still names the *pattern*, not one implementation, and that the example skill format stays generic (when-to-use / inputs / steps / done-criteria).
- [ ] **Dependency chain intact.** Confirm Module 20 (MCP) and Module 22 (securing servers/skills) are still numbered as referenced, and that nothing here leans on a tool introduced after Module 20.
- [ ] **Lab still runs.** `python -m unittest` is green in `lab/tasks-app/`, and the `clear`-command walkthrough still matches the starter files (`add`/`list`/`done`/`count`, `test_tasks.py`, `CHANGELOG.md`).