Merge: v21.0.0 release prep

chore(release): v21.0.0
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:34:15 +01:00 · 2026-06-19 10:34:15 +01:00 · 2026-06-19 10:32:29 +01:00 · 2026-06-19 09:31:45 +00:00 · 2026-06-19 10:19:05 +01:00 · 2026-06-19 10:05:17 +01:00
465 changed files with 68217 additions and 173 deletions
@@ -1,8 +1,8 @@
 {
  "$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
  "name": "pm-claude-skills",
-  "version": "14.0.0",
+  "version": "20.2.0",
-  "description": "PM stands for Professional, not just Product Management. 167 Claude Skills + 4 agent templates across 26 bundles covering 18 professions — engineering, customer success, legal, finance, HR, sales, design, Figma, marketing, social media, writers, and more. Built by a PM, used by everyone. Building blocks for the Anthropic agent template architecture.",
+  "description": "PM stands for Professional, not just Product Management. 174 Claude Skills + 4 agent templates across 26 bundles covering 18 professions — engineering, customer success, legal, finance, HR, sales, design, Figma, marketing, social media, writers, and more. Built by a PM, used by everyone. Building blocks for the Anthropic agent template architecture.",
  "owner": {
    "name": "Mohit Aggarwal",
    "email": "mohit15856@gmail.com"
@@ -34,8 +34,8 @@
    },
    {
      "name": "pm-delivery",
-      "description": "Sprint & delivery skills: Sprint Planning, Technical Spec, A/B Test Planner, Go-to-Market Planner, Launch Checklist, Sprint Brief, Retro Analysis, PPTX Slide Auditor, User Story Writer. Write production-ready user stories with Given/When/Then acceptance criteria, edge cases, and definition of done.",
+      "description": "Sprint & delivery skills: Sprint Planning, Technical Spec, A/B Test Planner, Go-to-Market Planner, Launch Checklist, Sprint Brief, Retro Analysis, PPTX Slide Auditor, User Story Writer, Launch Readiness. Write production-ready user stories with Given/When/Then acceptance criteria, plus a cross-functional pre-launch readiness assessment with an explicit Go / Conditional Go / No-Go recommendation.",
-      "version": "3.2.0",
+      "version": "3.3.0",
      "category": "productivity",
      "source": "./plugins/pm-delivery",
      "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -82,8 +82,8 @@
    },
    {
      "name": "pm-engineering",
-      "description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record, Debugging Log Analyser, PR Description Writer, System Design Interview, Changelog Generator, Test Strategy Doc, Runbook Writer, CI/CD Playbook, SLO & Error Budget, Developer Onboarding Doc, On-Call Runbook, Security Threat Model, Performance Budget, Database Schema Design, Database Migration Plan, Technical Debt Register, RFC Writer, Capacity Planning, Load Testing Plan, Disaster Recovery Plan, Feature Flag Guide, Dependency Audit, Service Catalog Entry, Monitoring Setup Guide, Local Dev Setup, API Versioning Strategy, Infra-as-Code Review, Engineering Weekly Report, Tech Radar, Sprint Velocity Analysis, Microservices Decomposition, Engineering Hiring Rubric, Context Mode, Claude Superpowers. 37 structured skills for engineering teams, SREs, technical PMs, and Claude Code power users.",
+      "description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record, Debugging Log Analyser, PR Description Writer, System Design Interview, Changelog Generator, Test Strategy Doc, Runbook Writer, CI/CD Playbook, SLO & Error Budget, Developer Onboarding Doc, On-Call Runbook, Security Threat Model, Performance Budget, Database Schema Design, Database Migration Plan, Technical Debt Register, RFC Writer, Capacity Planning, Load Testing Plan, Disaster Recovery Plan, Feature Flag Guide, Dependency Audit, Service Catalog Entry, Monitoring Setup Guide, Local Dev Setup, API Versioning Strategy, Infra-as-Code Review, Engineering Weekly Report, Tech Radar, Sprint Velocity Analysis, Microservices Decomposition, Engineering Hiring Rubric, Context Mode, Claude Superpowers, Skill Security Auditor. 38 structured skills for engineering teams, SREs, technical PMs, and Claude Code power users — including a security audit for any SKILL.md / system prompt before you install or merge it.",
-      "version": "4.1.0",
+      "version": "4.2.0",
      "category": "productivity",
      "source": "./plugins/pm-engineering",
      "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -202,8 +202,8 @@
    },
    {
      "name": "pm-writers",
-      "description": "Writers & Content Creators skills: Instagram Post Downloader, AEO Optimizer, Thumbnail Creator, Substack Notes Scraper, Notes Humanizer. Download Instagram carousels as PDFs, restructure articles for AI citation, generate thumbnail candidates via Gemini, export Substack Notes analytics to Excel, and strip AI writing patterns from any text.",
+      "description": "Writers & Content Creators skills: Instagram Post Downloader, AEO Optimizer, Thumbnail Creator, Substack Notes Scraper, Notes Humanizer, YouTube Script Writer. Download Instagram carousels as PDFs, restructure articles for AI citation, generate thumbnail candidates via Gemini, export Substack Notes analytics to Excel, strip AI writing patterns from any text, and write retention-optimized YouTube scripts with hooks and visual/audio cues.",
-      "version": "1.0.0",
+      "version": "1.1.0",
      "category": "productivity",
      "source": "./plugins/pm-writers",
      "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -10,6 +10,10 @@ on:
    paths:
      - 'skills/**'
      - 'web/**'
      - 'evals/results.json'
      - 'skill-tiers.json'
      - 'scripts/build-docs.mjs'
      - 'scripts/build-leaderboard.mjs'
      - '.github/workflows/deploy-playground.yml'
  workflow_dispatch:
@@ -38,6 +42,12 @@ jobs:
      - name: Rebuild skills.json from SKILL.md files
        run: node web/build-skills.mjs
      - name: Build the static skill catalog (web/catalog.html)
        run: node scripts/build-docs.mjs
      - name: Build the skill leaderboard (web/leaderboard.html)
        run: node scripts/build-leaderboard.mjs
      - name: Configure Pages
        uses: actions/configure-pages@v5
@@ -0,0 +1,70 @@
 name: Update Skill Leaderboard
 # Runs the eval harness with your ANTHROPIC_API_KEY secret, commits the real
 # results (evals/results.json), and lets the Pages deploy re-render the public
 # leaderboard with real numbers. Manual trigger so it never burns tokens by
 # surprise. (Uncomment the schedule to re-run, e.g. monthly, after model upgrades.)
 on:
  workflow_dispatch:
    inputs:
      models:
        description: 'Comma-separated model ids to score'
        required: false
        default: 'claude-sonnet-4-6,claude-haiku-4-5-20251001'
      judge:
        description: 'Judge model id'
        required: false
        default: 'claude-opus-4-8'
  # schedule:
  #   - cron: '0 6 1 * *'   # 06:00 on the 1st of each month
 permissions:
  contents: write
  pull-requests: write
 concurrency:
  group: eval-leaderboard
  cancel-in-progress: false
 jobs:
  evaluate:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Run evals
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          if [ -z "$ANTHROPIC_API_KEY" ]; then
            echo "::error::ANTHROPIC_API_KEY secret is not set. Add it in Settings → Secrets and variables → Actions."
            exit 1
          fi
          node evals/run-evals.mjs \
            --models "${{ github.event.inputs.models || 'claude-sonnet-4-6,claude-haiku-4-5-20251001' }}" \
            --judge "${{ github.event.inputs.judge || 'claude-opus-4-8' }}"
      - name: Build the leaderboard page (sanity check)
        run: node scripts/build-leaderboard.mjs
      - name: Open a PR with the refreshed results
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: evals/results.json
          branch: eval-results
          delete-branch: true
          commit-message: "chore(evals): refresh leaderboard results"
          title: "chore(evals): refresh leaderboard results"
          body: |
            Auto-generated by the **Update Skill Leaderboard** workflow.
            Merging this publishes the **real** numbers on the live leaderboard — the
            Pages deploy is triggered by changes to `evals/results.json`.
@@ -0,0 +1,51 @@
 name: Generate Sample Outputs
 # Generates real model outputs for the sample-output gallery using the
 # ANTHROPIC_API_KEY repo secret — the key never leaves GitHub. Generates a
 # sample for every eval-case skill that doesn't already have one (it never
 # overwrites hand-written samples), rebuilds web/samples.json, and commits.
 #
 # Run it from the Actions tab → "Generate Sample Outputs" → Run workflow.
 on:
  workflow_dispatch: {}
 permissions:
  contents: write
  pull-requests: write
 jobs:
  generate:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Generate missing samples + rebuild gallery
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          if [ -z "$ANTHROPIC_API_KEY" ]; then
            echo "::error::ANTHROPIC_API_KEY secret is not set."
            exit 1
          fi
          node scripts/build-samples.mjs --generate-missing
      # main is protected (requires PRs), so open a PR instead of pushing directly.
      - name: Open a PR with the new samples
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: |
            examples/samples
            web/samples.json
          branch: sample-outputs
          delete-branch: true
          commit-message: "chore(samples): generate sample outputs for the gallery"
          title: "chore(samples): generate sample outputs for the gallery"
          body: |
            Auto-generated by the **Generate Sample Outputs** workflow using the
            ANTHROPIC_API_KEY secret. Adds real model outputs to the sample gallery
            (examples.html). Hand-written samples are never overwritten.
@@ -36,7 +36,7 @@ jobs:
      - name: Verify release tag matches package.json version
        if: github.event_name == 'release'
        run: |
-          TAG="${GITHUB_REF_NAME#v}"
+          TAG="${GITHUB_REF_NAME#[vV]}"   # strip a leading v or V (v17.0.0 / V17.0.0)
          PKG="$(node -p "require('./package.json').version")"
          echo "release tag: $TAG | package.json: $PKG"
          if [ "$TAG" != "$PKG" ]; then
@@ -0,0 +1,71 @@
 name: Auto PR description
 # Dogfoods our own Action: when a PR is opened with an empty body, run the
 # pr-description-writer skill on the diff and fill it in. A living demo of
 # `uses: ./action`. Requires the ANTHROPIC_API_KEY repo secret; skips quietly
 # without it (and on forks, which can't read secrets).
 on:
  pull_request:
    types: [opened]
 permissions:
  contents: read
  pull-requests: write
 jobs:
  describe:
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    env:
      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    steps:
      - name: Check for API key and an empty PR body
        id: gate
        uses: actions/github-script@v7
        with:
          script: |
            const hasKey = !!process.env.ANTHROPIC_API_KEY;
            const body = (context.payload.pull_request.body || '').trim();
            if (!hasKey) core.info('ANTHROPIC_API_KEY not set — skipping.');
            if (body) core.info('PR already has a description — skipping.');
            core.setOutput('go', String(hasKey && !body));
      - name: Checkout
        if: steps.gate.outputs.go == 'true'
        uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - name: Collect the diff
        if: steps.gate.outputs.go == 'true'
        id: diff
        run: |
          {
            echo "text<<DIFF_EOF"
            echo "Title: ${{ github.event.pull_request.title }}"
            echo "Commits:"; git log --oneline origin/${{ github.base_ref }}..HEAD | head -30
            echo; echo "Changed files:"; git diff --stat origin/${{ github.base_ref }}...HEAD | tail -40
            echo "DIFF_EOF"
          } >> "$GITHUB_OUTPUT"
      - name: Write the PR description with the skill
        if: steps.gate.outputs.go == 'true'
        id: skill
        uses: ./action
        with:
          skill: pr-description-writer
          input: ${{ steps.diff.outputs.text }}
          api_key: ${{ secrets.ANTHROPIC_API_KEY }}
      - name: Update the PR body
        if: steps.gate.outputs.go == 'true'
        uses: actions/github-script@v7
        env:
          BODY: ${{ steps.skill.outputs.result }}
        with:
          script: |
            await github.rest.pulls.update({
              owner: context.repo.owner, repo: context.repo.repo,
              pull_number: context.issue.number,
              body: process.env.BODY + '\n\n<sub>✍️ Drafted by the pm-claude-skills GitHub Action (pr-description-writer).</sub>',
            });
@@ -0,0 +1,31 @@
 name: Skill Security Audit
 # Scans installable skill content (skills/*/SKILL.md and each skill's scripts/)
 # for prompt injection, data exfiltration, dynamic code execution, destructive
 # shell, hardcoded secrets, and hidden text. Fails on HIGH-severity findings.
 on:
  push:
    branches: [main]
    paths:
      - 'skills/**'
      - 'scripts/skill-audit.mjs'
      - '.github/workflows/skill-audit.yml'
  pull_request:
    paths:
      - 'skills/**'
      - 'scripts/skill-audit.mjs'
      - '.github/workflows/skill-audit.yml'
 jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Run the skill security auditor
        run: node scripts/skill-audit.mjs
@@ -0,0 +1,42 @@
 name: Skill of the Week
 # Picks a featured skill each week, composes X + LinkedIn posts, and (optionally)
 # auto-publishes via a webhook. The post text always lands in the job summary so
 # you can copy-paste even without the webhook configured.
 #
 # To auto-publish: add a repo secret POST_WEBHOOK_URL pointing at a Zapier / Make /
 # Buffer / Slack incoming webhook that takes { text, linkedin, skill, link }.
 on:
  schedule:
    - cron: '0 9 * * 1' # every Monday 09:00 UTC
  workflow_dispatch: {}
 permissions:
  contents: write
 jobs:
  feature:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Pick & compose the skill of the week
        env:
          POST_WEBHOOK_URL: ${{ secrets.POST_WEBHOOK_URL }}
        run: node scripts/skill-of-the-week.mjs
      - name: Commit the featured skill (if it changed)
        run: |
          if ! git diff --quiet -- web/skill-of-the-week.json; then
            git config user.name "github-actions[bot]"
            git config user.email "github-actions[bot]@users.noreply.github.com"
            git add web/skill-of-the-week.json
            git commit -m "chore: skill of the week"
            git push
          else
            echo "No change to web/skill-of-the-week.json"
          fi
@@ -10,3 +10,7 @@ venv/
 *.swp
 .idea/
 .vscode/
 # Generated docs catalog (built in CI for Pages)
 web/catalog.html
 web/leaderboard.html
@@ -9,7 +9,130 @@ each new wave of skills bumps the **major** version, extensions and fixes bump
 ## [Unreleased]
 ## [21.0.0] — Workflow Recipes, Eval-Verified Quality & a Smarter Playground — 2026-06-19
 The biggest update yet — the 174 skills become a *system*, not just a catalog.
 ### Added
 - **Workflow Recipes** — chain skills into one flow, where each output feeds the next. Five
  cross-profession recipes ship as slash commands and over MCP: `/ship-a-feature`,
  `/close-the-quarter`, `/launch-a-product`, `/rescue-an-account`, `/run-discovery`. Defined in
  `workflows.json`, documented in `WORKFLOWS.md` (generated + validated by `scripts/build-workflows.mjs`).
 - **Eval-verified quality** — real eval scores (structure, completeness, usefulness, grounding;
  judged by Opus 4.8) now surface as badges in the Playground and leaderboard. Eval coverage
  expanded from 6 to 15 skills.
 - **One-click MCP** — `claude mcp add pm-skills -- npx -y pm-claude-skills-mcp` makes every skill
  and recipe available in any MCP client (Claude Code, Claude Desktop, Cursor, Windsurf). New
  `list_workflows` / `get_workflow` MCP tools.
 - **Playground upgrades** — a "which skill do I need?" recommender, a Compare toggle (run inputs
  with vs. without the skill, side by side), and shareable deep-links that prefill inputs.
 - **Sample-output gallery** (`examples.html`) — 18 real example outputs so you can see what each
  skill produces before running anything. Generated via a workflow using the API-key secret.
 - **Skill of the week** — a scheduled workflow composes weekly X/LinkedIn posts; an optional
  webhook auto-publishes.
 ### Changed
 - README leads with a problem-solution hook, a workflow lifecycle diagram, a "hero five"
  quick-start, and an animated demo (plus a Compare-mode demo).
 ## [20.2.0] — Community PRs & New Skill — 2026-06-18
 ### Added
 - **New skill: YouTube Script Writer** (experimental) — retention-optimized video scripts with
  3 title/thumbnail concepts, 3 hook variations, a video/audio cue script table, and SEO
  metadata. Thanks @prajwal-28 (#50). Library is now **174 skills**.
 - **Feature-prioritisation helper script** — a dependency-free (stdlib-only) Python helper that
  computes RICE/ICE rankings from JSON/CSV/stdin, so scoring is consistent across sessions.
  Thanks @zeotrix (#48, closes #39).
 ### Changed
 - **Safer installs** — the CLI now resolves the install target and refuses system-critical
  directories (`/`, `/usr`, `/etc`, `/root`, …) so a mistyped `--target` can't clobber the
  system. Thanks @MatrixNeoKozak (#47).
 - **README catalog reconciled to the real count** — the headline, badge, table of contents, and
  "All Skills" catalog now say **174** (was a stale 167); added catalog entries for Skill
  Security Auditor (#168), Launch Readiness (#169), and YouTube Script Writer (#170).
 ### Fixed
 - **`skillcheck` frontmatter parser** tolerates leading whitespace and CRLF/LF line endings, so
  skills authored on Windows no longer produce false negatives. Thanks @MatrixNeoKozak (#47).
 - **`npm run check` now guards `web/skills.json`** — it rebuilds the file and fails on any drift,
  so a stale playground index can't pass locally and then break CI.
 ## [20.1.0] — Star Nudges & Eval Hardening — 2026-06-18
 ### Added
 - **Star the repo, from anywhere you use it.** Tasteful, non-spammy calls-to-action that turn
  npm/CLI users into stargazers — no `postinstall` hook: a prompt after a successful
  `npx pm-claude-skills add`, in `--help`, in `list`, in the MCP server's startup banner, a
  CTA below the README badges (npm renders it on the package page), and a `funding` field in
  `package.json` so npm shows a Fund/Sponsor link.
 - **One-click leaderboard updates in CI** — `.github/workflows/eval-leaderboard.yml`
  ("Update Skill Leaderboard") runs the evals with the `ANTHROPIC_API_KEY` secret, commits
  `evals/results.json`, and the Pages deploy re-renders the public leaderboard with real
  numbers — no local key needed. The deploy workflow now also triggers on
  `evals/results.json`.
 ### Changed
 - **Leaderboard workflow opens a PR** instead of pushing to `main` (which the branch
  ruleset blocks). After it runs, merge the auto-created results PR to publish real numbers.
 - **Faster, hang-proof evals.** The Anthropic client now has a per-request timeout (120s)
  and limited retries (429/5xx/timeout); the eval harness runs cases concurrently
  (default 4). The leaderboard workflow has a 20-minute job timeout. A 24-call run that
  was sequential now finishes in a few minutes and can't stall a job indefinitely.
 ## [20.0.0] — Agentic Tooling — 2026-06-18
 ### Added
 - **Dogfooded Action** — `.github/workflows/pr-description.yml` uses our own GitHub Action
  (`uses: ./action`) to auto-write this repo's PR descriptions when a PR opens with an
  empty body (skips quietly without the `ANTHROPIC_API_KEY` secret and on forks).
 - **GitHub Action** ([`action/`](action/)) — run any skill in CI: `uses:
  mohitagw15856/pm-claude-skills/action@main` to auto-write PR descriptions,
  changelogs, release notes, or code-review checklists. Composite action +
  dependency-free runner.
 - **`generate` command** — `npx pm-claude-skills generate --from <url|file>` turns a
  team's documentation into a `SKILL.md` that follows the authoring standard
  (`bin/generate.mjs`, needs `ANTHROPIC_API_KEY`).
 - **Skill evals + Leaderboard** — `evals/run-evals.mjs` scores skill output across models
  with an LLM judge (structure / completeness / usefulness / grounding);
  `scripts/build-leaderboard.mjs` renders a public `web/leaderboard.html` (built in the
  Pages deploy, linked from the README, catalog, and playground).
 - Shared, dependency-free Anthropic client (`bin/lib/anthropic.mjs`) used by all three.
 ## [19.0.0] — Security Auditor, Personas & Catalog — 2026-06-18
 ### Added
 - **Skill Security Auditor** — `scripts/skill-audit.mjs` scans installable content
  (`skills/*/SKILL.md` + each skill's `scripts/`) for prompt injection, data
  exfiltration, dynamic code execution, destructive shell, hardcoded secrets, and hidden
  text. HIGH findings fail CI (`skill-audit.yml`); a `security audit` badge in the README.
  Plus a new **`skill-security-auditor`** skill that teaches the same review for any skill.
 - **Personas (output-styles)** — 4 Claude Code output styles in [`output-styles/`](output-styles/)
  (Startup CTO, Growth Marketer, Solo Founder, Product Leader). `--agent claude` now also
  installs `~/.claude/output-styles/`.
 - **Orchestration guide** — [`ORCHESTRATION.md`](ORCHESTRATION.md): Skill Chain,
  Multi-Agent Handoff, Domain Deep-Dive, and Solo Sprint patterns for combining skills,
  subagents, and commands.
 - **Static skill catalog** — `scripts/build-docs.mjs` generates a server-rendered,
  SEO-indexable `web/catalog.html` of all skills (linked from the README and Playground;
  built in the Pages deploy).
 - **Public roadmap** — [`ROADMAP.md`](ROADMAP.md) with now/next/later and a "good first
  issues" list to grow contributors.
 ## [18.0.0] — Windsurf, Aider & an MCP Server — 2026-06-17
 ### Added
 - **MCP server** — `mcp/server.mjs`, a zero-dependency Model Context Protocol server
  (stdio) exposing `list_skills`, `search_skills`, and `get_skill` so MCP clients (Claude
  Desktop, Cline, …) pull skills on demand. Published as a second bin,
  `npx pm-claude-skills-mcp`.
 - **Windsurf & Aider targets** — two more export platforms (`exports/windsurf/*.md`
  workspace rules, `exports/aider/*.md` conventions) and install support in `install.sh`,
  the `npx` CLI, and one-line `windsurf-install.sh` / `aider-install.sh`. The library now
  exports to **5 platforms** (ChatGPT, Gemini, Cursor, Windsurf, Aider).
 - **Hero demo placement** — README "See it in action" block linking to the live Playground,
  ready to swap a `playground-demo.gif` in (recording guide in `web/docs-assets/README.md`).
 - **Automated npm publishing** — `.github/workflows/npm-publish.yml` publishes the package
  to npm (with provenance) when a GitHub Release is published. Requires a one-time
  `NPM_TOKEN` repo secret; no local npm needed.
@@ -165,7 +288,10 @@ Earlier releases (v1.0.0 – v5.0.0) predate this changelog. See the
 [article series](README.md#-the-article-series) for the full history of how the
 library grew from the first PM toolkit to 100+ skills.
-[Unreleased]: https://github.com/mohitagw15856/pm-claude-skills/compare/v17.0.0...HEAD
+[Unreleased]: https://github.com/mohitagw15856/pm-claude-skills/compare/v20.0.0...HEAD
 [20.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v19.0.0...v20.0.0
 [19.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v18.0.0...v19.0.0
 [18.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v17.0.0...v18.0.0
 [17.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v16.0.0...v17.0.0
 [16.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v15.0.0...v16.0.0
 [15.0.0]: https://github.com/mohitagw15856/pm-claude-skills/compare/v14.0.0...v15.0.0
@@ -0,0 +1,86 @@
 # Orchestration — Combining Skills, Subagents & Commands
 A single skill answers one question well. Real work is a sequence of them. This guide
 shows four patterns for chaining the library's [skills](skills/), [subagents](agents/), and
 [slash commands](commands/) into end-to-end workflows.
 > These are usage patterns, not new software — they work today in Claude Code (and any
 > tool that has the skills installed). Install everything first:
 > `npx pm-claude-skills add --agent claude`.
 ---
 ## 1. Skill Chain (sequential)
 Run skills in order, feeding each output into the next. Best for a known process.
 **Example — "new feature, from idea to sprint":**
 ```
 /rice          → rank the candidate features
 /prd           → write the PRD for the top one
 /sprint-plan   → break it into a calibrated sprint
 ```
 Each step's output becomes the next step's input. The helper scripts (RICE, capacity)
 compute the numbers so the chain stays grounded in data, not vibes.
 ## 2. Multi-Agent Handoff
 Delegate phases to focused [subagents](agents/); each owns its domain and hands off.
 **Example — "launch a feature":**
 ```
 pm-partner      → frames the problem, writes the PRD
 sprint-master   → plans delivery, tracks the sprint
 launch-captain  → positioning, GTM plan, launch checklist
 cs-guardian     → post-launch account health & churn watch
 ```
 In Claude Code, just describe the work and Claude delegates by each subagent's
 `description`; or name one explicitly ("use the launch-captain subagent").
 ## 3. Domain Deep-Dive
 Pick one bundle and run its skills together for a thorough, single-domain pass.
 **Example — Customer Success review of an account:**
 ```
 cs-health-scorecard  → score the account (weighted /100 + RAG)
 churn-analysis       → diagnose risk drivers
 renewal-playbook     → build the renewal plan
 qbr-deck             → package it for the QBR
 ```
 Use the `cs-guardian` subagent to run the whole sequence with shared context.
 ## 4. Solo Sprint (one assistant, many skills)
 No subagents — a single session pulls in whichever skills the task needs, on demand.
 This is the natural mode for the [MCP server](mcp/): the assistant calls `search_skills`,
 then `get_skill`, and applies the result.
 **Example:** *"Search the skills for anything about pricing, then apply the best one to
 this offering."* → `search_skills("pricing")` → `get_skill("pricing-strategy")` → output.
 ---
 ## Picking a pattern
 | You have… | Use |
 |---|---|
 | A known, repeatable process | **Skill Chain** |
 | Distinct phases with different expertise | **Multi-Agent Handoff** |
 | One domain to cover thoroughly | **Domain Deep-Dive** |
 | An open-ended ask, tools installed via MCP | **Solo Sprint** |
 ## Tips
 - **Carry context forward.** Paste or reference the previous step's output so each skill
  builds on the last instead of starting cold.
 - **Compute, don't guess.** When a skill ships a helper script (RICE, sprint capacity,
  customer health), run it — chained estimates drift fast.
 - **Audit anything you didn't write.** Before chaining a skill from elsewhere, run it
  through `skill-security-auditor` (or `node scripts/skill-audit.mjs`).
@@ -1,28 +1,138 @@
-# 🧠 PM Skills — 167 Professional Agent Skills for Claude, ChatGPT, Gemini, Cursor, Codex & Hermes
+# 🧠 PM Skills — 174 Professional Agent Skills for Claude, ChatGPT, Gemini, Cursor, Codex & Hermes
 > Open-source **Agent Skills** (`SKILL.md`) + subagents + slash commands for every profession — one source, every AI coding tool.
 [![Stars](https://img.shields.io/github/stars/mohitagw15856/pm-claude-skills?style=social)](https://github.com/mohitagw15856/pm-claude-skills/stargazers)
-[![Skills](https://img.shields.io/badge/skills-167-blue)](https://github.com/mohitagw15856/pm-claude-skills)
+[![npm](https://img.shields.io/npm/v/pm-claude-skills?logo=npm&color=cb3837)](https://www.npmjs.com/package/pm-claude-skills)
 [![npm downloads](https://img.shields.io/npm/dm/pm-claude-skills?logo=npm&color=cb3837&label=installs)](https://www.npmjs.com/package/pm-claude-skills)
 [![Skills](https://img.shields.io/badge/skills-174-blue)](https://github.com/mohitagw15856/pm-claude-skills)
 [![Subagents](https://img.shields.io/badge/subagents-4-blueviolet)](agents/)
 [![Commands](https://img.shields.io/badge/slash%20commands-6-blueviolet)](commands/)
 [![Personas](https://img.shields.io/badge/personas-4-blueviolet)](output-styles/)
 [![Platforms](https://img.shields.io/badge/works%20with-Claude%20%7C%20ChatGPT%20%7C%20Gemini%20%7C%20Cursor%20%7C%20Codex%20%7C%20Hermes-8A2BE2)](#-works-with--cross-tool-compatibility)
 [![SkillCheck](https://img.shields.io/github/actions/workflow/status/mohitagw15856/pm-claude-skills/skillcheck.yml?branch=main&label=SkillCheck)](.github/workflows/skillcheck.yml)
-[![Version](https://img.shields.io/badge/version-17.0.0-brightgreen)](https://github.com/mohitagw15856/pm-claude-skills/releases)
+[![Security Audit](https://img.shields.io/github/actions/workflow/status/mohitagw15856/pm-claude-skills/skill-audit.yml?branch=main&label=security%20audit)](.github/workflows/skill-audit.yml)
 [![Version](https://img.shields.io/badge/version-21.0.0-brightgreen)](https://github.com/mohitagw15856/pm-claude-skills/releases)
 [![Install](https://img.shields.io/badge/Install%20in%20Claude%20Code-2%20minutes-orange)](https://github.com/mohitagw15856/pm-claude-skills#-quick-install-2-minutes)
 [![License](https://img.shields.io/badge/license-MIT-lightgrey)](LICENSE)
 [![Sponsor](https://img.shields.io/badge/sponsor-❤️-ff69b4)](https://github.com/sponsors/mohitagw15856)
 ### ⭐ If this saves you time, [star the repo](https://github.com/mohitagw15856/pm-claude-skills) — it's the #1 way to help others find it.
 > **PM stands for Professional, not just Product Management.**
-> 167 professional skills + 4 agent templates across 26 bundles covering 18 professions. Built for Claude Code — and now portable to ChatGPT, Gemini, and Hermes Agent. Built by a PM, used by everyone.
+> 174 professional skills + 4 agent templates across 26 bundles covering 18 professions. Built for Claude Code — and now portable to ChatGPT, Gemini, and Hermes Agent. Built by a PM, used by everyone.
 A community-built library of professional skills for every field — product management, engineering, customer success, marketing, social media, writers, design, legal, finance, HR, sales, operations, research, and more. Each skill is a structured `SKILL.md` file that teaches an AI assistant how to produce professional-grade outputs for your workflows. Skills run natively in **Claude Code** and **Hermes Agent** (same open `SKILL.md` standard), and ship as ready-to-paste exports for **ChatGPT** and **Gemini** — see [Works With](#-works-with--cross-tool-compatibility).
-**🆕 Latest release (v17.0.0 — Agents, Commands & the npx CLI):** install into any tool with one cross-platform command — `npx pm-claude-skills add --agent <tool>`. Adds 4 Claude Code subagents, 6 slash commands, Cursor `.mdc` exports, a SkillCheck validator, and a skill scaffolder. See the [changelog](#-changelog).
+**🆕 Latest release (v20.2.0 — Community PRs & New Skill):** a new **YouTube Script Writer** skill (**174 total**), a stdlib **feature-prioritisation** helper, safer installs, and robust frontmatter parsing — all from community contributors. See the [changelog](#-changelog).
 ### ▶ See it in action — [try the live Skill Playground](https://mohitagw15856.github.io/pm-claude-skills/)
 <!-- Demo GIF generated by web/docs-assets/record-demo.mjs (Playwright). The streamed
     output is a representative mock so no API key is needed; re-run with a live key
     for a real call. Static fallback: web/docs-assets/playground.png -->
 [![Skill Playground demo — pick a skill, fill the form, run it with your own Claude key](web/docs-assets/playground-demo.gif)](https://mohitagw15856.github.io/pm-claude-skills/)
 <sub>👆 Pick any skill, fill a short form, and run it with your own key — no install required.</sub>
 ---
 ## 👋 New here? Start in 30 seconds
 **The problem:** Ask any AI for a PRD, an exec update, or a launch plan and you get *generic* — plausible-sounding filler you still have to rewrite from scratch. The model doesn't know what "good" actually looks like for professional work.
 **What this fixes:** Each skill is a battle-tested `SKILL.md` that teaches the AI the real structure, rigour, and judgement a senior professional uses — so the first draft is one you can *ship*, not one you have to redo.
 **Try these 5 first** — no install needed, run them right in the [live Playground](https://mohitagw15856.github.io/pm-claude-skills/):
 | Skill | Give it… | Get back… |
 |-------|----------|-----------|
 | 📊 [Executive Update](skills/executive-update) | messy progress notes | a tight 250-word briefing for your CEO or board |
 | 📋 [PRD Template](skills/prd-template) | a vague feature idea | a structured PRD with scope, success metrics & risks |
 | 🎯 [RICE Prioritisation](skills/rice-prioritisation) | a pile of backlog ideas | a ranked, defensible priority list |
 | 🔭 [Competitor Teardown](skills/competitor-teardown) | "what are rivals up to?" | a positioning map, feature gaps & strategy |
 | 📝 [Meeting Notes](skills/meeting-notes) | a raw transcript | decisions, owners & next steps |
 → Want proof first? See [**real sample outputs**](https://mohitagw15856.github.io/pm-claude-skills/examples.html) from each skill. Like what you see? [**Install in 2 minutes**](#-quick-install-2-minutes) · [browse all 174 skills](#️-all-174-skills) · [**⭐ star the repo**](https://github.com/mohitagw15856/pm-claude-skills/stargazers) so others find it.
 ---
 ## 🔄 One library, the whole professional workflow
 These 174 skills aren't a random catalog — they cover the **full arc of professional work**, end to end. Wherever you are in the loop, there's a skill for it:
 ```
  DISCOVER  →   DECIDE   →    BUILD    →    SHIP    →   MEASURE   →  COMMUNICATE
 ──────────   ──────────   ──────────   ──────────   ──────────   ─────────────
  frame the    prioritise    design &     launch &     track &      report up
  problem,     & spec the    engineer     release      analyse      & out
  research     work          the work     the work     results
        └────────────────────── feeds the next discovery ──────────────────────┘
 ```
 | Phase | What you're doing | Start with these skills |
 |-------|-------------------|--------------------------|
 | **🔍 Discover** | Frame the problem, research, validate | `ambiguity-resolver` · `user-research-synthesis` · `competitive-analysis` · `discovery-interview-guide` |
 | **🎯 Decide** | Prioritise, spec, set goals | `rice-prioritisation` · `prd-template` · `okr-builder` · `roadmap-narrative` |
 | **🔨 Build** | Design & engineer the work | `technical-spec-template` · `design-critique` · `sprint-planning` · `architecture-decision-record` |
 | **🚀 Ship** | Launch & release | `go-to-market` · `launch-readiness` · `product-launch-checklist` · `runbook-writer` |
 | **📊 Measure** | Track outcomes & analyse | `metrics-framework` · `cohort-analysis` · `ab-test-planner` · `churn-analysis` |
 | **📣 Communicate** | Report up and out | `executive-update` · `board-deck-narrative` · `stakeholder-update` · `qbr-deck` |
 > New here? Start with the [**top-tier skills**](#️-skill-tiers--start-with-the-strongest), or jump straight to [**all 174 skills**](#️-all-174-skills) grouped by profession.
 ---
 ## 🧩 Workflow Recipes — chain skills into one flow
 Individual skills are great. **Chaining** them is the superpower. A *recipe* runs several skills in sequence and **passes each output forward as context** — so a fuzzy idea comes out the other end as a finished, joined-up set of artifacts. No other skills library chains across professions like this.
 ```
 /ship-a-feature  "a referral program for B2B users"
  ambiguity-resolver → prd-template → rice-prioritisation → roadmap-narrative → go-to-market
   frame the problem    spec it        prioritise it        place on roadmap     launch plan
        └──────────────── each stage's output feeds the next ────────────────┘
 ```
 | Recipe | What it does | Lifecycle |
 |--------|--------------|-----------|
 | `/ship-a-feature` | idea → PRD → priority → roadmap → launch plan | Discover → Ship |
 | `/close-the-quarter` | metrics → churn → exec update → board deck | Measure → Communicate |
 | `/launch-a-product` | competitors → positioning → GTM → checklist → press release | Decide → Ship |
 | `/rescue-an-account` | health score → churn cause → escalation → renewal plan | Measure → Communicate |
 | `/run-discovery` | frame → interview guide → synthesis → prioritise | Discover → Decide |
 → Full detail and how to add your own in [**WORKFLOWS.md**](WORKFLOWS.md). Recipes run as slash commands in Claude Code, or over MCP via the `get_workflow` tool.
 ---
 ## ✅ Eval-verified quality — not just quantity
 Most skill libraries ask you to trust the count. This one is **scored**. An [eval harness](evals/) runs each skill against a held-out test case, then an LLM judge (Opus 4.8) rates the output on four dimensions — **structure, completeness, usefulness, grounding** — averaged across two models.
 The flagship skills score consistently high (out of 5):
 | Skill | Eval score | Skill | Eval score |
 |-------|:---------:|-------|:---------:|
 | `prd-template` | 🟢 **4.9** | `cs-health-scorecard` | 🟢 **4.9** |
 | `rice-prioritisation` | 🟢 **4.9** | `sprint-planning` | 🟢 **4.8** |
 | `competitive-analysis` | 🟢 **4.5** | `executive-summary` | 🟢 **4.5** |
 These scores show up as badges in the [Playground](https://mohitagw15856.github.io/pm-claude-skills/) and the [🏆 leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html). Coverage is expanding — run it yourself with `node evals/run-evals.mjs` (needs an API key). *Honest note: 6 skills are eval-scored today; the rest are reviewed against the [authoring standard](SKILL-AUTHORING-STANDARD.md) but not yet auto-scored.*
 **See the difference for yourself.** The Playground's *Compare* toggle runs the same inputs with and without the skill, side by side — structured, shippable output on the left; generic mush on the right:
 [![Compare mode — the same prompt with and without the skill, side by side](web/docs-assets/compare-demo.gif)](https://mohitagw15856.github.io/pm-claude-skills/)
 ---
 ## Contents
 - [👋 New here? Start in 30 seconds](#-new-here-start-in-30-seconds)
 - [🔄 One library, the whole professional workflow](#-one-library-the-whole-professional-workflow)
 - [🧩 Workflow Recipes — chain skills into one flow](#-workflow-recipes--chain-skills-into-one-flow)
 - [✅ Eval-verified quality](#-eval-verified-quality--not-just-quantity)
 - [🚀 Quick Install](#-quick-install-2-minutes)
 - [🔌 Works With — Cross-Tool Compatibility](#-works-with--cross-tool-compatibility)
 - [🤖 Subagents & Slash Commands](#-subagents--slash-commands)
@@ -30,7 +140,7 @@ A community-built library of professional skills for every field — product man
 - [📦 Plugin Directory](#-plugin-directory)
 - [🤖 Building Blocks for Agent Templates](#-building-blocks-for-agent-templates)
 - [🏷️ Skill Tiers — start with the strongest](#️-skill-tiers--start-with-the-strongest)
- [🗂️ All 167 Skills](#️-all-167-skills)
+- [🗂️ All 174 Skills](#️-all-174-skills)
 - [📋 Changelog](#-changelog)
 - [🤝 Contributing](#-contributing--add-your-skill)
 - [🔗 Related Projects](#-related-projects)
@@ -39,12 +149,20 @@ A community-built library of professional skills for every field — product man
 ## 🚀 Quick Install (2 minutes)
-**Any tool, one command** (Windows/macOS/Linux — needs Node):
+**Any tool, one command** — via the [`pm-claude-skills`](https://www.npmjs.com/package/pm-claude-skills) npm package (Windows/macOS/Linux, needs Node):
 ```bash
 npx pm-claude-skills add --agent claude     # or: codex · cursor · hermes · openclaw
 ```
 **Or one-line MCP** — make all 174 skills + 5 workflow recipes available in *every* session of any MCP client (Claude Code, Claude Desktop, Cursor, Windsurf), no per-file install:
 ```bash
 claude mcp add pm-skills -- npx -y pm-claude-skills-mcp
 ```
 Your assistant can then *"search the skills for churn"* or *"run the ship-a-feature workflow"* on demand. Details: [mcp/README.md](mcp/README.md).
 **In Claude Code**, run:
 /plugin marketplace add mohitagw15856/pm-claude-skills
@@ -106,10 +224,12 @@ body as a system prompt — for those we ship ready-made [exports](#ready-to-use
 | **Claude Code** (CLI / desktop / web / IDE) | Native. Install via the plugin marketplace; Claude loads a skill automatically when your request matches its description. | ✅ Yes |
 | **Hermes Agent** (Nous Research) | Native — same open `SKILL.md` standard. Run `python3 scripts/sync-hermes-skills.py` to install into `~/.hermes/skills/`; Hermes auto-discovers them. | ✅ Yes |
 | **OpenAI Codex · OpenClaw** | Native `SKILL.md`. One-line install (see [below](#one-line-install-for-coding-agents)) or `./scripts/install.sh --agent codex`. | ✅ Yes |
-| **Cursor** | Generated `.mdc` rules. One-line install into `.cursor/rules/`, or copy from [`exports/cursor/`](exports/cursor/). | ⚙️ By description |
+| **Cursor** | Generated `.mdc` rules → `.cursor/rules/`. `npx pm-claude-skills add --agent cursor`. | ⚙️ By description |
 | **Windsurf** | Generated `.md` workspace rules → `.windsurf/rules/`. `npx pm-claude-skills add --agent windsurf`. | ⚙️ By description |
 | **Aider** | Generated conventions files you load with `aider --read`. `npx pm-claude-skills add --agent aider`. | ⚙️ `--read` |
 | **MCP clients** (Claude Desktop, Cline, …) | Run the [MCP server](mcp/) — the assistant searches & pulls skills on demand via `npx -y pm-claude-skills-mcp`. | ✅ On demand |
 | **Claude.ai & Claude API** | Upload a skill, or paste the body in as a system prompt / project instruction. | ⚙️ Manual |
-| **Other `SKILL.md`-aware agents** (Gemini CLI, Aider, Windsurf, …) | Point the agent at `skills/`, or use `./scripts/install.sh`. The frameworks are tool-agnostic; only discovery differs per tool. | ⚙️ Varies by tool |
+| **ChatGPT & Gemini** | Copy a ready-made [export](#ready-to-use-exports) into a Custom GPT or Gem's instructions. You keep the full framework and output format. | ❌ Paste per use |
 | **ChatGPT & Gemini** | Copy a ready-made export (below) into a Custom GPT or Gem's instructions. You keep the full framework and output format. | ❌ Paste per use |
 **What's verified vs. what varies:** the skill **bodies** — the frameworks, rubrics, and
 output templates that do the actual work — are model-agnostic and have been used across
@@ -125,7 +245,7 @@ maintained twice:
 - **ChatGPT** — copy any [`exports/chatgpt/<bundle>/<skill>/SYSTEM_PROMPT.md`](exports/chatgpt/) straight into a Custom GPT's instructions.
 - **Google Gemini** — copy any [`exports/gemini/<bundle>/<skill>/GEM_INSTRUCTIONS.md`](exports/gemini/) into a Gem's instructions.
- **Cursor** — copy any [`exports/cursor/<bundle>/<skill>/<skill>.mdc`](exports/cursor/) into `.cursor/rules/` (or use the one-liner below).
+- **Cursor** (`.mdc`) · **Windsurf** (`.md`) · **Aider** (`.md`) — generated rule/conventions files in [`exports/cursor/`](exports/cursor/), [`exports/windsurf/`](exports/windsurf/), [`exports/aider/`](exports/aider/) (or use the installers below).
 ### One-command install for coding agents
@@ -133,9 +253,11 @@ maintained twice:
 ```bash
 npx pm-claude-skills add --agent claude     # skills + subagents + commands → ~/.claude/
-npx pm-claude-skills add --agent codex      # OpenAI Codex
+npx pm-claude-skills add --agent codex      # OpenAI Codex (or: hermes · openclaw)
 npx pm-claude-skills add --agent cursor     # .mdc rules → ./.cursor/rules
-npx pm-claude-skills list                   # claude · hermes · codex · openclaw · cursor
+npx pm-claude-skills add --agent windsurf   # .md rules → ./.windsurf/rules
 npx pm-claude-skills add --agent aider      # conventions → load with: aider --read
 npx pm-claude-skills list                   # all supported agents + default paths
 ```
 Add `--link` (symlink), `--target <path>`, or `--dry-run` to any `add`.
@@ -181,19 +303,63 @@ It's not just skills. The library also ships **Claude Code subagents** and **sla
 `/prd` · `/rice` · `/sprint-plan` · `/health-scorecard` · `/retro` · `/exec-summary`
-Install everything for Claude Code in one go (skills **+** subagents **+** commands):
+**Personas** ([`output-styles/`](output-styles/)) — Claude Code output styles that change the assistant's whole voice and default skill loadout. Switch with `/output-style`:
 `Startup CTO` · `Growth Marketer` · `Solo Founder` · `Product Leader`
 Install everything for Claude Code in one go (skills **+** subagents **+** commands **+** personas):
 ```bash
-./scripts/install.sh --agent claude       # ~/.claude/{skills,agents,commands}
+npx pm-claude-skills add --agent claude   # ~/.claude/{skills,agents,commands,output-styles}
 ```
-Commands whose skill ships a Python helper (RICE, sprint capacity, customer health) run it to **compute** results, not estimate them.
+Commands whose skill ships a Python helper (RICE, sprint capacity, customer health) run it to **compute** results, not estimate them. To string these together, see the [orchestration patterns](ORCHESTRATION.md) (skill chains & multi-agent handoffs).
 ---
 ## 🧩 MCP Server — Skills on Demand
 For MCP clients (Claude Desktop, Cline, …), there's a zero-dependency [**MCP server**](mcp/) so your assistant **searches and pulls skills on demand** instead of installing 172 files. It exposes three tools — `list_skills`, `search_skills`, `get_skill` — over stdio (no network, nothing leaves your machine).
 ```json
 {
  "mcpServers": {
    "pm-claude-skills": { "command": "npx", "args": ["-y", "pm-claude-skills-mcp"] }
  }
 }
 ```
 Then ask: *"search the skills for customer churn, then apply the best one to my account."* Full setup in [`mcp/README.md`](mcp/).
 ---
 ## ⚙️ AI-Powered Tooling
 Three ways to put the library to work beyond installing files:
 **🤖 Run a skill in your CI — [GitHub Action](action/).** Auto-write PR descriptions, changelogs, release notes, or run a code-review checklist on every PR:
 ```yaml
 - uses: mohitagw15856/pm-claude-skills/action@main
  with:
    skill: pr-description-writer
    input: ${{ steps.diff.outputs.text }}
    api_key: ${{ secrets.ANTHROPIC_API_KEY }}
 ```
 **🏗️ Turn your docs into a skill — `generate`.** Point it at a URL or file and it writes a `SKILL.md` that follows the authoring standard:
 ```bash
 ANTHROPIC_API_KEY=sk-ant-… npx pm-claude-skills generate --from ./team-process.md
 ```
 **🏆 Skill Leaderboard — [evals](evals/).** An LLM-as-judge harness scores each skill across Claude models on structure, completeness, usefulness, and grounding. **[View the leaderboard →](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html)**
 ---
 ## 🌐 Skill Playground — Try Any Skill in Your Browser
-**▶ Live: [mohitagw15856.github.io/pm-claude-skills](https://mohitagw15856.github.io/pm-claude-skills/)**
+**▶ Live: [mohitagw15856.github.io/pm-claude-skills](https://mohitagw15856.github.io/pm-claude-skills/)** · 📚 [Browse the full skill catalog](https://mohitagw15856.github.io/pm-claude-skills/catalog.html)
 Don't want to install anything yet? Run any of these skills from a **zero-backend web app** using **your own Claude API key**. Pick a skill, fill in the auto-generated form, and Claude streams the result. Your key is stored only in your browser (`localStorage`) and sent directly to the Anthropic API — nothing touches a server we own.
@@ -263,7 +429,7 @@ Not sure which plugin to install? Here's what each one covers:
 On May 5, 2026, Anthropic [released their first agent templates](https://www.anthropic.com/news/finance-agents) — pre-packaged Claude agents that combine **skills, connectors, and subagents** into ready-to-run workflows for financial services.
-This library is the largest open-source collection of professional skills available — covering 17 professions beyond financial services. **The 167 skills here are the building blocks for agent templates outside of finance.**
+This library is the largest open-source collection of professional skills available — covering 17 professions beyond financial services. **The 174 skills here are the building blocks for agent templates outside of finance.**
 ### What is an agent template?
@@ -344,15 +510,61 @@ More templates will follow. If you want to contribute one, see the [template con
 The highlights are below. For the structured, [Keep a Changelog](https://keepachangelog.com/)-format history, see **[CHANGELOG.md](CHANGELOG.md)**.
-### 🆕 What's New in v17.0.0 — Agents, Commands & the npx CLI
+### 🆕 What's New in v20.2.0 — Community PRs & New Skill
 - **New skill: YouTube Script Writer** (experimental) — retention-optimized video scripts with hook variations, a video/audio cue table, and SEO metadata. Thanks @prajwal-28 (#50). **Now 174 skills.**
 - **Feature-prioritisation helper** — a dependency-free Python script that computes RICE/ICE rankings consistently across sessions. Thanks @zeotrix (#48).
 - **Safer installs + robust parsing** — the CLI refuses system-critical install targets, and `skillcheck` tolerates CRLF/whitespace in frontmatter. Thanks @MatrixNeoKozak (#47).
 - **Catalog reconciled to 174** — the headline, badge, and skill catalog now reflect the true count, with entries added for Skill Security Auditor, Launch Readiness, and YouTube Script Writer.
 <details>
 <summary><strong>v20.1.0 — Star Nudges & Eval Hardening</strong> (click to expand)</summary>
 - **Star the repo, from anywhere you use it** — tasteful, non-spammy CTAs (no `postinstall`): after a successful `npx pm-claude-skills add`, in `--help`, in `list`, in the MCP server banner, below the README badges, and a `funding` link on npm.
 - **One-click leaderboard in CI** — the "Update Skill Leaderboard" workflow runs the evals with your `ANTHROPIC_API_KEY` secret and opens a results PR; merge it to publish real numbers.
 - **Faster, hang-proof evals** — per-request timeout + retries in the API client and concurrent eval runs, so a CI run finishes in minutes and can't stall.
 </details>
 <details>
 <summary><strong>v20.0.0 — Agentic Tooling</strong> (click to expand)</summary>
 The library starts *doing* the work, not just describing it:
 - **GitHub Action** ([`action/`](action/)) — run any skill in a repo's CI (auto PR descriptions, changelogs, release notes, reviews). `uses: mohitagw15856/pm-claude-skills/action@main`. We dogfood it to write this repo's own PR descriptions.
 - **`generate` command** — `npx pm-claude-skills generate --from <url|file>` turns your docs into a standard-compliant `SKILL.md`.
 - **Skill evals + Leaderboard** — LLM-as-judge scoring of skills across models, rendered as a public [leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html).
 </details>
 <details>
 <summary><strong>v19.0.0 — Security Auditor, Personas & Catalog</strong> (click to expand)</summary>
 - **Skill Security Auditor** — scans every skill (and its scripts) for prompt injection, exfiltration, unsafe code, secrets, hidden text; HIGH fails CI. Plus a `skill-security-auditor` skill.
 - **4 personas** (output-styles), an [orchestration guide](ORCHESTRATION.md), a server-rendered **skill catalog**, and a public [roadmap](ROADMAP.md).
 </details>
 <details>
 <summary><strong>v18.0.0 — Windsurf, Aider & an MCP Server</strong> (click to expand)</summary>
 - **Two more install targets** — **Windsurf** and **Aider** (now 5 export platforms / 7 tools).
 - **MCP server** (`npx pm-claude-skills-mcp`) — search & pull skills on demand from MCP clients.
 - **Automated npm publishing** workflow; README hero demo placement.
 </details>
 <details>
 <summary><strong>v17.0.0 — Agents, Commands & the npx CLI</strong> (click to expand)</summary>
 The library grows past "just skills" and gets one-command, cross-platform install:
- **`npx pm-claude-skills add --agent <tool>`** — a cross-platform Node installer (Windows/macOS/Linux, no bash, no git) for claude · hermes · codex · openclaw · cursor.
+- **`npx pm-claude-skills add --agent <tool>`** — a cross-platform Node installer (Windows/macOS/Linux, no bash, no git).
- **4 Claude Code subagents** (`agents/`) and **6 slash commands** (`commands/`) built on the strongest skills; `--agent claude` installs skills + agents + commands together.
+- **4 Claude Code subagents** (`agents/`) and **6 slash commands** (`commands/`); `--agent claude` installs skills + agents + commands together.
- **Cursor `.mdc` exports** — third generated platform alongside ChatGPT and Gemini.
+- **Cursor `.mdc` exports**, a **SkillCheck validator** + CI badge, and a **skill scaffolder** (`npm run new-skill`).
- **SkillCheck validator** (`scripts/skillcheck.mjs`) with a CI badge, and a **skill scaffolder** (`npm run new-skill`) that emits a valid `SKILL.md`.
+- **One-line shell installers** plus a unified `scripts/install.sh`.
- **One-line shell installers** for Codex / OpenClaw / Cursor, plus a unified `scripts/install.sh`.
+
 </details>
 <details>
 <summary><strong>v16.0.0 — Multi-Platform</strong> (click to expand)</summary>
@@ -549,7 +761,7 @@ This repo was built alongside a published article series. Read the full story:
 A 170+ skill library doesn't have 170 equally-mature skills, and pretending otherwise
 wastes your time. Skills are tiered honestly so you can start with the best work:
- 🟢 **Production-Ready (46)** — battle-tested, stable output, used in real work. Includes the three skills with computed Python helpers (sprint planning, RICE, customer health). **Start here.**
+- 🟢 **Production-Ready (47)** — battle-tested, stable output, used in real work. Includes the three skills with computed Python helpers (sprint planning, RICE, customer health). **Start here.**
 - 🔵 **Stable** — solid, reliable, well-structured; the default for most of the library.
 - 🟡 **Experimental** — newer or dependent on an external tool/API/scrape (Gemini, Gmail, browser automation, social scraping). Useful, but more setup and more moving parts.
@@ -559,12 +771,12 @@ If you're new, install `pm-essentials` and try a couple of Production-Ready skil
 ---
-## 🗂️ All 167 Skills
+## 🗂️ All 174 Skills
 The [Plugin Directory](#-plugin-directory) above summarises every bundle. Expand below for the full per-skill breakdown with folder paths.
 <details>
-<summary><strong>Browse all 167 skills by profession</strong> (click to expand)</summary>
+<summary><strong>Browse all 174 skills by profession</strong> (click to expand)</summary>
 ### 🛠️ Product Management (Skills 1–37)
 **Bundles:** `pm-essentials` · `pm-discovery` · `pm-planning` · `pm-delivery` · `pm-analytics` · `pm-strategy` · `pm-advanced` · `pm-rituals`
@@ -601,7 +813,7 @@ The [Plugin Directory](#-plugin-directory) above summarises every bundle. Expand
 ---
-### 👩‍💻 Engineering & Tech (Skills 46–80, 166–167)
+### 👩‍💻 Engineering & Tech (Skills 46–80, 166–168)
 **Bundle:** `pm-engineering`
 | # | Skill | Folder | What It Does |
@@ -643,6 +855,7 @@ The [Plugin Directory](#-plugin-directory) above summarises every bundle. Expand
 | 80 | **Engineering Hiring Rubric** 🆕 | `skills/engineering-hiring-rubric/` | Technical interview rubric with level expectations, coding scorecard, system design guide, behavioural question bank, and debrief template |
 | 166 | **Context Mode** 🆕 | `skills/context-mode/` | Filters command output noise and maintains a session log so Claude resumes exactly where it left off after a context reset |
 | 167 | **Claude Superpowers** 🆕 | `skills/claude-superpowers/` | Forces Claude Code to plan first, work in isolation, write tests before code, and double-review its own output — consistently better first passes |
 | 168 | **Skill Security Auditor** 🆕 | `skills/skill-security-auditor/` | Audits any SKILL.md / system prompt for prompt injection, data exfiltration, code execution, secrets, and hidden text; returns a risk-rated report with an install / don't-install recommendation |
 ---
@@ -769,7 +982,7 @@ claude plugin install pm-cs@pm-claude-skills
 ---
-### ⚙️ Operations (Skills 120–126, 164–165)
+### ⚙️ Operations (Skills 120–126, 164–165, 169)
 **Bundle:** `pm-operations`
 | # | Skill | Folder | What It Does |
@@ -783,6 +996,7 @@ claude plugin install pm-cs@pm-claude-skills
 | 126 | **RACI Matrix** 🆕 | `skills/raci-matrix/` | RACI with role definitions, decision map, anti-pattern guide, and a communication template for all teams |
 | 164 | **Email Triage** 🆕 | `skills/email-triage/` | Reads Gmail for a configurable window and surfaces only what needs action — priority-ranked with urgency ratings and reply starters |
 | 165 | **Morning Intelligence** 🆕 | `skills/morning-intelligence/` | 15-question interview that writes a personalised master prompt for your daily news brief, ready for Cowork Scheduled Tasks or Claude Code Routines |
 | 169 | **Launch Readiness** 🆕 | `skills/launch-readiness/` | Cross-functional pre-launch assessment with a function-by-function readiness status, ranked blockers (owners + deadlines), a risk register, and an explicit Go / Conditional Go / No-Go recommendation |
 ---
@@ -864,7 +1078,7 @@ claude plugin install pm-social@pm-claude-skills
 ---
-### ✍️ Writers & Content Creators (Skills 156–160)
+### ✍️ Writers & Content Creators (Skills 156–160, 170)
 **Bundle:** `pm-writers`
 > Install:
@@ -880,6 +1094,7 @@ claude plugin install pm-writers@pm-claude-skills
 | 158 | **Thumbnail Creator** 🆕 | `skills/thumbnail-creator/` | Generates brand-aligned thumbnail candidates via Gemini API; Claude evaluates results via computer vision and returns ranked candidates with rationale |
 | 159 | **Substack Notes Scraper** 🆕 | `skills/substack-notes-scraper/` | Scrapes Substack Notes and exports likes, comments, and restacks to a formatted .xlsx with frozen headers, filters, and top-performer highlighting |
 | 160 | **Notes Humanizer** 🆕 | `skills/notes-humanizer/` | Strips AI writing patterns (em dashes, filler phrases, uniform rhythm) across 3 phases: audit, strip, inject — returns side-by-side comparison and clean final text |
 | 170 | **YouTube Script Writer** 🆕 | `skills/youtube-script-writer/` | Retention-optimized video scripts with 3 title/thumbnail concepts, 3 hook variations, a video/audio cue script table, and SEO metadata |
 </details>
@@ -887,7 +1102,7 @@ claude plugin install pm-writers@pm-claude-skills
 ## ❤️ Sponsor This Work
-Building and maintaining 167 skills across 26 bundles takes real time — testing skills against new model releases, building new ones from community requests, writing the article series, and keeping documentation current.
+Building and maintaining 174 skills across 26 bundles takes real time — testing skills against new model releases, building new ones from community requests, writing the article series, and keeping documentation current.
 If these skills save you time at work, consider sponsoring:
@@ -908,7 +1123,7 @@ Higher tiers include custom skill development for your team, direct access for s
 This is an open-source community library. If you've built a skill that saves you time, share it here.
-**Found a bug?** [Open a bug report →](../../issues/new?template=bug-report.md) — use the template so it's easy to triage.
+**New here?** See the [Roadmap & good first issues](ROADMAP.md#-good-first-issues) for starter tasks. **Found a bug?** [Open a bug report →](../../issues/new?template=bug-report.md).
 **How to contribute:**
@@ -918,7 +1133,7 @@ This is an open-source community library. If you've built a skill that saves you
 3. Fill in the sections, then check it: `npm run skillcheck`
 4. Raise a pull request with a short description of what the skill does and why you built it
-> CI runs **SkillCheck** on every PR — `node scripts/skillcheck.mjs` validates structure and must pass.
+> Every PR is gated by **SkillCheck** (structure — `node scripts/skillcheck.mjs`) and the **Skill Security Auditor** (safety — `node scripts/skill-audit.mjs`, which flags prompt-injection / exfiltration / unsafe code). Both must pass.
 **SKILL.md template:**
 ---
@@ -0,0 +1,45 @@
 # Roadmap
 Where the library is headed. This is a direction, not a contract — priorities shift with
 community input. Have an idea? [Open a discussion](https://github.com/mohitagw15856/pm-claude-skills/discussions)
 or [request a skill](SKILL_REQUEST.md).
 ## ✅ Recently shipped
 - **Multi-platform** — single-source exports to Claude, ChatGPT, Gemini, Cursor, Windsurf, Aider; native installers for Hermes, Codex, OpenClaw.
 - **`npx pm-claude-skills`** — one cross-platform install command (published on npm).
 - **MCP server** — search & pull skills on demand from any MCP client.
 - **Subagents, slash commands, personas (output-styles)** — content beyond skills.
 - **Quality gates** — SkillCheck (structure) + Skill Security Auditor (safety) in CI.
 - **Skill tiers**, a scaffolder (`npm run new-skill`), and a static skill catalog.
 ## 🔭 Now (in progress)
 - Growing **per-skill depth** — `references/` and `templates/` for the most-used skills.
 - A browsable **docs site** beyond the catalog (per-tool install guides, search).
 ## ⏭️ Next
 - More **export/install targets** as the `SKILL.md` standard spreads (Kilo Code, OpenCode, Windsurf rule modes).
 - **Skill chaining** helpers to make the [orchestration patterns](ORCHESTRATION.md) one-command.
 - Expanding **Production-Ready** coverage — promoting Stable skills as they prove out.
 ## 🌠 Later
 - Community **skill packs** (curated bundles for a role/industry).
 - Internationalised skill descriptions.
 - A public **contributor leaderboard**.
 ---
 ## 🌱 Good first issues
 New here? These are great starter contributions (open a PR — `npm run skillcheck` must pass):
 1. **Add a requested skill** from [SKILL_REQUEST.md](SKILL_REQUEST.md) or the wishlist in the README. Scaffold it with `npm run new-skill -- --name your-skill`.
 2. **Strengthen an existing skill** — add a missing *Quality Checks* or *Anti-Patterns* section (SkillCheck warns where they're absent: `node scripts/skillcheck.mjs`).
 3. **Add a Python helper** to a skill that would benefit from computed output (see the RICE / sprint / health examples under `skills/*/scripts/`).
 4. **Add an export/install target** for another tool — it's a few lines in the `PLATFORMS` registry of `scripts/build-exports.mjs` plus the installers.
 5. **Improve docs** — a clearer example in a skill, or a fix in the catalog/README.
 See [CONTRIBUTING.md](CONTRIBUTING.md) for the full flow.
@@ -10,9 +10,9 @@ That said, security matters here in two specific ways: **skill file safety** and
 | Version | Supported |
 |---|---|
-| v17.x (latest) | ✅ Active |
+| v20.x (latest) | ✅ Active |
-| v15.x – v16.x | ✅ Security fixes only |
+| v18.x – v19.x | ✅ Security fixes only |
-| < v15.0.0 | ❌ No longer supported |
+| < v18.0.0 | ❌ No longer supported |
 Because skills are plain markdown, "support" means we review and correct any reported
 safety issue (prompt injection, unsafe instructions) in the listed versions.
@@ -14,7 +14,7 @@ strongest work and know what to expect from the rest.
 ---
-## 🟢 Production-Ready (46)
+## 🟢 Production-Ready (47)
 These are the skills to reach for first — the most-used, most-refined frameworks in the
 library.
@@ -44,7 +44,7 @@ library.
 `go-to-market` · `competitor-teardown` · `product-positioning-doc`
 **Cross-profession**
-`executive-summary` · `press-release`
+`executive-summary` · `press-release` · `skill-security-auditor`
 ---
@@ -0,0 +1,78 @@
 # 🧩 Workflow Recipes
 > **Skills you can chain.** A recipe runs several skills in sequence and *passes each output forward as context* for the next — so a fuzzy idea comes out the other end as a finished, joined-up set of artifacts. No other skills library chains across professions like this.
 Run one as a slash command in Claude Code (e.g. `/ship-a-feature a referral program for B2B users`), or fetch it over MCP with the `get_workflow` tool.
 <!-- Generated from workflows.json by scripts/build-workflows.mjs — do not edit by hand. -->
 There are **5 recipes** today:
 | Recipe | Command | Lifecycle | Chains |
 |--------|---------|-----------|--------|
 | **Ship a Feature** | `/ship-a-feature` | Discover → Decide → Build → Ship | 5 skills |
 | **Close the Quarter** | `/close-the-quarter` | Measure → Communicate | 4 skills |
 | **Launch a Product** | `/launch-a-product` | Decide → Ship | 5 skills |
 | **Rescue an Account** | `/rescue-an-account` | Measure → Communicate | 4 skills |
 | **Run Discovery** | `/run-discovery` | Discover → Decide | 4 skills |
 ## Ship a Feature — `/ship-a-feature`
 *Discover → Decide → Build → Ship* · Take a raw feature idea from fuzzy brief all the way to a launch plan, end to end.
 `ambiguity-resolver` → `prd-template` → `rice-prioritisation` → `roadmap-narrative` → `go-to-market`
 1. **ambiguity-resolver** → produces a sharp problem statement and scoped boundaries.
 2. **prd-template** → produces a full PRD with goals, requirements, and success metrics.
 3. **rice-prioritisation** → produces a RICE score positioning this work against alternatives.
 4. **roadmap-narrative** → produces where this sits on the roadmap and the story around it.
 5. **go-to-market** → produces a launch plan: audience, messaging, channels, and timeline.
 ## Close the Quarter — `/close-the-quarter`
 *Measure → Communicate* · Turn the quarter's raw numbers into a leadership-ready story and board deck.
 `metrics-framework` → `churn-analysis` → `executive-update` → `board-deck-narrative`
 1. **metrics-framework** → produces the metric tree and what actually moved.
 2. **churn-analysis** → produces why customers left and what is avoidable.
 3. **executive-update** → produces a tight leadership briefing of the quarter.
 4. **board-deck-narrative** → produces a slide-by-slide board deck storyline.
 ## Launch a Product — `/launch-a-product`
 *Decide → Ship* · Go from competitive landscape to positioning to a fully checklisted launch and press release.
 `competitor-teardown` → `product-positioning-doc` → `go-to-market` → `product-launch-checklist` → `press-release`
 1. **competitor-teardown** → produces the competitive map and gaps to exploit.
 2. **product-positioning-doc** → produces positioning, value props, and messaging pillars.
 3. **go-to-market** → produces the GTM plan across audience and channels.
 4. **product-launch-checklist** → produces an owner-by-owner launch readiness checklist.
 5. **press-release** → produces the announcement press release.
 ## Rescue an Account — `/rescue-an-account`
 *Measure → Communicate* · Diagnose an at-risk customer and build the full save play through to renewal.
 `cs-health-scorecard` → `churn-analysis` → `cs-escalation-brief` → `renewal-playbook`
 1. **cs-health-scorecard** → produces a health score with the specific risk drivers.
 2. **churn-analysis** → produces the root cause and whether the risk is avoidable.
 3. **cs-escalation-brief** → produces an internal escalation brief for the save.
 4. **renewal-playbook** → produces the renewal strategy and negotiation plan.
 ## Run Discovery — `/run-discovery`
 *Discover → Decide* · From a vague opportunity to validated insight and a prioritised next step.
 `ambiguity-resolver` → `discovery-interview-guide` → `user-research-synthesis` → `rice-prioritisation`
 1. **ambiguity-resolver** → produces a one-page problem brief from the fuzzy opportunity.
 2. **discovery-interview-guide** → produces a screener and discussion guide for user interviews.
 3. **user-research-synthesis** → produces themes and insights from the research.
 4. **rice-prioritisation** → produces a ranked, defensible list of what to do next.
 ---
 **Add your own:** define it in [`workflows.json`](workflows.json), add a matching `commands/<id>.md`, and run `node scripts/build-workflows.mjs`. Recipes are just composition — every step is an existing skill you can already run on its own.
@@ -0,0 +1,65 @@
 # PM Skills — GitHub Action
 Run any skill from this library inside **your** repo's CI. Turn the library's frameworks
 into automation: auto-write PR descriptions, generate release notes and changelogs, or run
 a code-review checklist — on every push or PR.
 ```yaml
 - uses: mohitagw15856/pm-claude-skills/action@main
  with:
    skill: pr-description-writer
    input: ${{ steps.diff.outputs.text }}
    api_key: ${{ secrets.ANTHROPIC_API_KEY }}
 ```
 ## Inputs
 | Input | Required | Description |
 |---|---|---|
 | `skill` | ✅ | Skill name, e.g. `pr-description-writer`, `changelog-generator`, `code-review-checklist`. |
 | `input` | — | The text/context to run the skill on. |
 | `input_file` | — | Read input from a file instead of `input`. |
 | `api_key` | ✅ | Anthropic API key (store as a repo secret). |
 | `model` | — | Model id (default `claude-sonnet-4-6`). |
 | `output_file` | — | Also write the result to this file. |
 **Output:** `result` — the skill's output (use `output_file` for long, multi-line results).
 ## Example — auto-write a PR description
 ```yaml
 name: PR description
 on: { pull_request: { types: [opened] } }
 permissions: { contents: read, pull-requests: write }
 jobs:
  describe:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - id: diff
        run: |
          echo "text<<EOF" >> "$GITHUB_OUTPUT"
          git diff origin/${{ github.base_ref }}...HEAD --stat >> "$GITHUB_OUTPUT"
          echo "EOF" >> "$GITHUB_OUTPUT"
      - id: skill
        uses: mohitagw15856/pm-claude-skills/action@main
        with:
          skill: pr-description-writer
          input: ${{ steps.diff.outputs.text }}
          api_key: ${{ secrets.ANTHROPIC_API_KEY }}
      - uses: actions/github-script@v7
        with:
          script: |
            github.rest.pulls.update({ owner: context.repo.owner, repo: context.repo.repo,
              pull_number: context.issue.number, body: process.env.BODY })
        env: { BODY: ${{ steps.skill.outputs.result }} }
 ```
 ## Other ideas
 - `skill: changelog-generator` from `git log` → write `CHANGELOG.md`.
 - `skill: release-notes` on tag push → set the GitHub Release body.
 - `skill: code-review-checklist` → post a review checklist as a PR comment.
 Pin to a release tag (e.g. `@v19`) for stability once you've tried `@main`.
@@ -0,0 +1,51 @@
 name: 'PM Skills — Run a Skill'
 description: 'Run any pm-claude-skills SKILL.md in CI — auto PR descriptions, changelogs, release notes, code-review checklists, and more.'
 author: 'Mohit Aggarwal'
 branding:
  icon: 'cpu'
  color: 'purple'
 inputs:
  skill:
    description: 'Skill name to run (e.g. pr-description-writer, changelog-generator, code-review-checklist).'
    required: true
  input:
    description: 'The input/context text the skill should work on.'
    required: false
  input_file:
    description: 'Read the input from this file instead of the `input` string.'
    required: false
  api_key:
    description: 'Anthropic API key (store it as a secret).'
    required: true
  model:
    description: 'Claude model id.'
    required: false
    default: 'claude-sonnet-4-6'
  output_file:
    description: 'If set, also write the result to this file.'
    required: false
  max_tokens:
    description: 'Max output tokens.'
    required: false
    default: '4096'
 outputs:
  result:
    description: 'The skill output (also use output_file for multi-line results).'
    value: ${{ steps.run.outputs.result }}
 runs:
  using: composite
  steps:
    - id: run
      shell: bash
      run: node "$GITHUB_ACTION_PATH/run.mjs"
      env:
        INPUT_SKILL: ${{ inputs.skill }}
        INPUT_INPUT: ${{ inputs.input }}
        INPUT_INPUT_FILE: ${{ inputs.input_file }}
        INPUT_API_KEY: ${{ inputs.api_key }}
        INPUT_MODEL: ${{ inputs.model }}
        INPUT_OUTPUT_FILE: ${{ inputs.output_file }}
        INPUT_MAX_TOKENS: ${{ inputs.max_tokens }}
@@ -0,0 +1,58 @@
 #!/usr/bin/env node
 // Runner for the pm-skills GitHub Action. Loads a bundled SKILL.md, runs it on
 // the provided input via the Anthropic API, and exposes the result as a step
 // output (and optionally a file). Inputs arrive as INPUT_* env vars.
 import { readFileSync, existsSync, writeFileSync, appendFileSync } from 'node:fs';
 import { join, dirname } from 'node:path';
 import { fileURLToPath, pathToFileURL } from 'node:url';
 import { complete, parseSkill } from '../bin/lib/anthropic.mjs';
 const ACTION_DIR = dirname(fileURLToPath(import.meta.url));
 const REPO_ROOT = join(ACTION_DIR, '..');
 const inp = (name, def = '') => (process.env[`INPUT_${name.toUpperCase()}`] ?? def).trim();
 // Pure: assemble the system prompt + user message for a skill run (testable offline).
 export function buildRequest(skillBody, userInput) {
  const system = skillBody +
    '\n\n---\nExecute this skill now on the input below and produce the complete output. ' +
    'Do not ask follow-up questions — work with what is given and note any reasonable assumptions. ' +
    'Output only the finished artifact (no preamble).';
  return { system, messages: [{ role: 'user', content: userInput }] };
 }
 async function main() {
  const skill = inp('skill');
  if (!skill) throw new Error('Input `skill` is required.');
  const apiKey = inp('api_key') || process.env.ANTHROPIC_API_KEY || '';
  const model = inp('model', 'claude-sonnet-4-6');
  const maxTokens = parseInt(inp('max_tokens', '4096'), 10) || 4096;
  let input = inp('input');
  const inputFile = inp('input_file');
  if (!input && inputFile && existsSync(inputFile)) input = readFileSync(inputFile, 'utf8');
  if (!input) throw new Error('Provide `input` or `input_file`.');
  const skillFile = join(REPO_ROOT, 'skills', skill, 'SKILL.md');
  if (!existsSync(skillFile)) throw new Error(`Unknown skill "${skill}" (no skills/${skill}/SKILL.md).`);
  const { body } = parseSkill(readFileSync(skillFile, 'utf8'));
  const { system, messages } = buildRequest(body, input);
  console.log(`Running skill "${skill}" with ${model}…`);
  const result = await complete({ apiKey, model, system, messages, maxTokens });
  // Step output (multiline-safe heredoc) + optional file.
  if (process.env.GITHUB_OUTPUT) {
    const d = `EOF_${Math.random().toString(36).slice(2)}`;
    appendFileSync(process.env.GITHUB_OUTPUT, `result<<${d}\n${result}\n${d}\n`);
  }
  const outFile = inp('output_file');
  if (outFile) { writeFileSync(outFile, result + '\n'); console.log(`Wrote ${outFile}`); }
  console.log('\n----- skill output -----\n' + result);
 }
 // Run only when executed directly (so tests can import buildRequest).
 if (import.meta.url === pathToFileURL(process.argv[1] || '').href) {
  main().catch((e) => { console.error(`Error: ${e.message}`); process.exit(1); });
 }
@@ -13,23 +13,28 @@
 //   --link           symlink instead of copy (native agents; falls back to copy)
 //   --dry-run        print what would happen without writing
 import { readdirSync, existsSync, mkdirSync, rmSync, cpSync, symlinkSync, copyFileSync, statSync } from 'node:fs';
-import { join, dirname, basename } from 'node:path';
+import { join, dirname, basename, resolve } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import { homedir } from 'node:os';
 import { createRequire } from 'node:module';
 const PKG_ROOT = dirname(dirname(fileURLToPath(import.meta.url)));
 const STAR = '⭐ Find this useful? Star the repo: https://github.com/mohitagw15856/pm-claude-skills';
 const VERSION = (() => {
  try { return createRequire(import.meta.url)('../package.json').version; } catch { return '0.0.0'; }
 })();
 const NATIVE = new Set(['claude', 'hermes', 'codex', 'openclaw']);
 // Rule-file agents install generated files from exports/<agent> (ext per agent).
 const RULEFILE = { cursor: '.mdc', windsurf: '.md', aider: '.md' };
 const defaultTarget = (agent) => ({
  claude: join(homedir(), '.claude', 'skills'),
  hermes: join(homedir(), '.hermes', 'skills'),
  codex: join(homedir(), '.codex', 'skills'),
  openclaw: join(homedir(), '.openclaw', 'skills'),
  cursor: join(process.cwd(), '.cursor', 'rules'),
  windsurf: join(process.cwd(), '.windsurf', 'rules'),
  aider: join(process.cwd(), '.aider', 'skills'),
 }[agent]);
 function parse(argv) {
@@ -46,12 +51,12 @@ function parse(argv) {
  return out;
 }
-function listMdc(dir) {
+function listFiles(dir, ext) {
  const out = [];
  for (const e of readdirSync(dir)) {
    const p = join(dir, e);
-    if (statSync(p).isDirectory()) out.push(...listMdc(p));
+    if (statSync(p).isDirectory()) out.push(...listFiles(p, ext));
-    else if (p.endsWith('.mdc')) out.push(p);
+    else if (p.endsWith(ext)) out.push(p);
  }
  return out;
 }
@@ -68,25 +73,35 @@ function placeDir(src, dest, { link, dryRun }) {
 function add(opts) {
  const agent = opts.agent;
-  if (!agent || !(NATIVE.has(agent) || agent === 'cursor')) {
+  if (!agent || !(NATIVE.has(agent) || agent in RULEFILE)) {
-    console.error(`Error: --agent must be one of: claude, hermes, codex, openclaw, cursor.`);
+    console.error(`Error: --agent must be one of: claude, hermes, codex, openclaw, cursor, windsurf, aider.`);
    process.exit(2);
  }
  const skillsDir = join(PKG_ROOT, 'skills');
  if (!existsSync(skillsDir)) { console.error(`Error: bundled skills/ not found at ${skillsDir}.`); process.exit(1); }
-  const target = opts.target || defaultTarget(agent);
+  const target = resolve(opts.target || defaultTarget(agent));
  // Guard against installing into system-critical directories (e.g. a typo'd --target).
  const criticalPaths = ['/', '/usr', '/bin', '/etc', '/var', '/root', '/boot', '/proc', '/sys', '/dev'];
  if (criticalPaths.includes(target)) {
    console.error(`Error: Cannot install into a system-critical directory: ${target}`);
    process.exit(1);
  }
  let count = 0;
  console.log(`${opts.dryRun ? '[dry-run] ' : ''}Installing for '${agent}' into ${target}`);
  if (!opts.dryRun) mkdirSync(target, { recursive: true });
-  if (agent === 'cursor') {
+  if (agent in RULEFILE) {
-    const cursorDir = join(PKG_ROOT, 'exports', 'cursor');
+    const ext = RULEFILE[agent];
-    if (!existsSync(cursorDir)) { console.error(`Error: ${cursorDir} missing.`); process.exit(1); }
+    const exportDir = join(PKG_ROOT, 'exports', agent);
-    for (const mdc of listMdc(cursorDir).sort()) {
+    if (!existsSync(exportDir)) { console.error(`Error: ${exportDir} missing.`); process.exit(1); }
-      const dest = join(target, basename(mdc));
+    for (const f of listFiles(exportDir, ext).sort()) {
-      if (opts.dryRun) console.log(`  would install ${basename(mdc)} -> ${dest}`);
+      if (basename(f) === 'README.md') continue;   // skip the generated index
-      else copyFileSync(mdc, dest);
+      const dest = join(target, basename(f));
      if (opts.dryRun) console.log(`  would install ${basename(f)} -> ${dest}`);
      else copyFileSync(f, dest);
      count++;
    }
  } else {
@@ -96,10 +111,10 @@ function add(opts) {
      placeDir(src, join(target, name), opts);
      count++;
    }
-    // Claude Code also gets subagents and slash commands.
+    // Claude Code also gets subagents, slash commands, and output-styles.
    if (agent === 'claude') {
      const claudeRoot = dirname(target);
-      for (const kind of ['agents', 'commands']) {
+      for (const kind of ['agents', 'commands', 'output-styles']) {
        const src = join(PKG_ROOT, kind);
        if (!existsSync(src)) continue;
        const dest = join(claudeRoot, kind);
@@ -116,32 +131,43 @@ function add(opts) {
  console.log(`\n${opts.dryRun ? 'Would install' : 'Installed'} ${count} item(s) for '${agent}'.`);
  if (!opts.dryRun) {
-    console.log(agent === 'cursor'
+    const note = {
-      ? `Cursor will pick up the rules in ${target} on its next session.`
+      cursor: `Cursor will pick up the rules in ${target} on its next session.`,
-      : `Restart ${agent} — it auto-discovers SKILL.md skills in ${target} by their description.`);
+      windsurf: `Windsurf will pick up the rules in ${target} on its next session.`,
      aider: `Load any of them with:  aider --read ${join(target, '<skill>.md')}`,
    }[agent] || `Restart ${agent} — it auto-discovers SKILL.md skills in ${target} by their description.`;
    console.log(note);
    console.log(`\n${STAR}`);
  }
 }
 function list() {
  console.log('Supported agents and default targets:\n');
-  for (const a of ['claude', 'hermes', 'codex', 'openclaw', 'cursor']) {
+  for (const a of ['claude', 'hermes', 'codex', 'openclaw', 'cursor', 'windsurf', 'aider']) {
    console.log(`  ${a.padEnd(9)} ${defaultTarget(a)}`);
  }
  console.log('\nNative SKILL.md agents: claude, hermes, codex, openclaw (install skill folders).');
-  console.log('Claude also gets subagents + slash commands. Cursor installs .mdc rules.');
+  console.log('Claude also gets subagents + slash commands. Cursor/Windsurf install rule files;');
  console.log('Aider installs conventions you load with "aider --read".');
  console.log(`\n${STAR}`);
 }
 const HELP = `pm-claude-skills — install professional Agent Skills into any AI coding tool.
 Usage:
-  npx pm-claude-skills add --agent <claude|hermes|codex|openclaw|cursor> [--target <path>] [--link] [--dry-run]
+  npx pm-claude-skills add --agent <claude|hermes|codex|openclaw|cursor|windsurf|aider> [--target <path>] [--link] [--dry-run]
  npx pm-claude-skills list
  npx pm-claude-skills --version
 Examples:
  npx pm-claude-skills add --agent claude     # skills + subagents + commands
  npx pm-claude-skills add --agent cursor     # .mdc rules into ./.cursor/rules
  npx pm-claude-skills add --agent windsurf   # .md rules into ./.windsurf/rules
  npx pm-claude-skills add --agent codex --link
  npx pm-claude-skills generate --from <url|file>   # turn your docs into a SKILL.md (needs ANTHROPIC_API_KEY)
 ${STAR}
 `;
 const opts = parse(process.argv.slice(2));
@@ -150,4 +176,9 @@ if (opts.version) console.log(VERSION);
 else if (opts.help || !cmd || cmd === 'help') console.log(HELP);
 else if (cmd === 'list') list();
 else if (cmd === 'add') add(opts);
 else if (cmd === 'generate') {
  const { run } = await import('./generate.mjs');
  try { process.exit(await run(process.argv.slice(3))); }
  catch (e) { console.error(`Error: ${e.message}`); process.exit(1); }
 }
 else { console.error(`Unknown command: ${cmd}\n`); console.log(HELP); process.exit(2); }
@@ -0,0 +1,109 @@
 // `pm-claude-skills generate` — turn a doc (URL or file) into a SKILL.md that
 // follows this library's authoring standard. Uses the Anthropic API.
 //
 //   ANTHROPIC_API_KEY=sk-ant-... npx pm-claude-skills generate --from ./process.md
 //   ... generate --from https://example.com/runbook --name incident-runbook
 //   ... generate --from notes.txt --out ./skills --dry-run
 import { writeFileSync, mkdirSync, existsSync, readFileSync } from 'node:fs';
 import { join } from 'node:path';
 import { complete, parseSkill } from './lib/anthropic.mjs';
 function getArg(argv, name, def) {
  const i = argv.indexOf(`--${name}`);
  return i !== -1 ? argv[i + 1] : def;
 }
 // Strip tags/scripts/styles from HTML to rough text (good enough for an LLM).
 function htmlToText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/&[a-z]+;/gi, ' ')
    .replace(/\s+/g, ' ')
    .trim();
 }
 async function loadSource(from) {
  if (/^https?:\/\//i.test(from)) {
    const res = await fetch(from);
    if (!res.ok) throw new Error(`Could not fetch ${from} (HTTP ${res.status}).`);
    const text = await res.text();
    return /<html|<body|<div/i.test(text) ? htmlToText(text) : text;
  }
  if (!existsSync(from)) throw new Error(`No such file: ${from}`);
  return readFileSync(from, 'utf8');
 }
 const META_PROMPT = `You convert a team's documentation into a single Claude/Agent "skill" file (SKILL.md) that follows this exact standard. Output ONLY the file content, starting with the YAML frontmatter — no code fences, no preamble.
 Required structure:
 ---
 name: <lowercase-hyphenated, derived from the doc's purpose>
 description: "<one sentence on what it does>. Use when <trigger phrases a user would say>. Produces <the concrete artifact>."
 ---
 # <Title> Skill
 <one-line value summary>
 ## What This Skill Produces
 - <deliverables>
 ## Required Inputs
 Ask for (if not provided):
 - <inputs to gather; never invent them>
 ## Process
 1. <steps>
 ## Output Format
 <a concrete template — headings/tables — of the final artifact>
 ## Quality Checks
 - [ ] <checks the output must pass>
 ## Anti-Patterns
 - [ ] Do not <mistakes this skill prevents>
 Rules: be specific to the documentation provided; turn its rules/process into the skill. The description MUST contain "Use when" and "Produces". Do not include any text outside the file.`;
 export async function run(argv) {
  const from = getArg(argv, 'from');
  if (!from || argv.includes('--help')) {
    console.log('Usage: pm-claude-skills generate --from <url|file> [--name x] [--out dir] [--model m] [--dry-run]');
    return from ? 0 : 1;
  }
  const apiKey = process.env.ANTHROPIC_API_KEY || '';
  if (!apiKey) { console.error('Set ANTHROPIC_API_KEY to generate a skill.'); return 1; }
  const model = getArg(argv, 'model', 'claude-sonnet-4-6');
  const outDir = getArg(argv, 'out', 'skills');
  const dryRun = argv.includes('--dry-run');
  console.error(`Reading ${from}…`);
  const source = (await loadSource(from)).slice(0, 24000); // cap context
  console.error(`Generating a SKILL.md with ${model}…`);
  const out = await complete({
    apiKey, model, system: META_PROMPT,
    messages: [{ role: 'user', content: `Documentation to convert into a skill:\n\n${source}` }],
    maxTokens: 3000,
  });
  const cleaned = out.replace(/^```[a-z]*\n?/i, '').replace(/\n?```$/i, '').trim();
  const { meta } = parseSkill(cleaned);
  const name = getArg(argv, 'name', meta.name);
  if (!name) { console.error('Could not determine a skill name — pass --name.'); return 1; }
  if (dryRun) {
    console.log(cleaned);
    console.error(`\n[dry-run] Would write ${join(outDir, name, 'SKILL.md')}`);
    return 0;
  }
  const dir = join(outDir, name);
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, 'SKILL.md'), cleaned + '\n');
  console.log(`Created ${join(dir, 'SKILL.md')}`);
  console.log('Next: review it, then validate — node scripts/skillcheck.mjs && node scripts/skill-audit.mjs');
  return 0;
 }
@@ -0,0 +1,77 @@
 // Minimal, dependency-free Anthropic Messages API client (Node 18+ global fetch).
 // Shared by the GitHub Action runner, the eval harness, and skill generation.
 // No SDK, no install — just a thin POST wrapper.
 const API_URL = 'https://api.anthropic.com/v1/messages';
 /**
 * Call the Anthropic Messages API and return the concatenated text output.
 * Adds a per-request timeout and limited retries so a slow/transient failure
 * can't hang a CI job forever.
 * @param {object} o
 * @param {string} o.apiKey  - Anthropic API key.
 * @param {string} [o.model] - Model id (default claude-sonnet-4-6).
 * @param {string} [o.system]- System prompt.
 * @param {Array}  o.messages- [{role, content}] messages.
 * @param {number} [o.maxTokens]
 * @param {number} [o.timeoutMs] - Per-request timeout (default 120s).
 * @param {number} [o.retries]   - Retries on timeout / 429 / 5xx (default 2).
 * @returns {Promise<string>}
 */
 export async function complete({ apiKey, model = 'claude-sonnet-4-6', system, messages, maxTokens = 4096, timeoutMs = 120000, retries = 2 }) {
  if (!apiKey) throw new Error('Missing Anthropic API key (set ANTHROPIC_API_KEY).');
  let lastErr;
  for (let attempt = 0; attempt <= retries; attempt++) {
    const ctrl = new AbortController();
    const timer = setTimeout(() => ctrl.abort(), timeoutMs);
    try {
      const res = await fetch(API_URL, {
        method: 'POST',
        headers: {
          'content-type': 'application/json',
          'x-api-key': apiKey,
          'anthropic-version': '2023-06-01',
        },
        body: JSON.stringify({ model, max_tokens: maxTokens, ...(system ? { system } : {}), messages }),
        signal: ctrl.signal,
      });
      if (res.ok) {
        const data = await res.json();
        return (data.content || []).map((c) => c.text || '').join('').trim();
      }
      const body = await res.text().catch(() => '');
      // Retry transient server / rate-limit errors; fail fast on 4xx (bad key/model).
      if ((res.status === 429 || res.status >= 500) && attempt < retries) {
        lastErr = new Error(`Anthropic API ${res.status}`);
      } else {
        throw new Error(`Anthropic API ${res.status}: ${body.slice(0, 500)}`);
      }
    } catch (e) {
      if (e.name === 'AbortError') e = new Error(`Anthropic API request timed out after ${timeoutMs}ms`);
      const retryable = /timed out/.test(e.message) || e.name === 'TypeError' || /Anthropic API (429|5\d\d)/.test(e.message);
      if (!retryable || attempt >= retries) throw e;
      lastErr = e;
    } finally {
      clearTimeout(timer);
    }
    await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt)); // backoff: 1s, 2s, 4s
  }
  throw lastErr || new Error('Anthropic API request failed.');
 }
 /** Parse "name: value" YAML-ish frontmatter + body from a SKILL.md string. */
 export function parseSkill(text) {
  const m = text.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  const meta = {};
  if (m) {
    for (const line of m[1].split('\n')) {
      const kv = line.match(/^(\w[\w-]*):\s*(.*)$/);
      if (kv) {
        let v = kv[2].trim();
        if ((v.startsWith('"') && v.endsWith('"')) || (v.startsWith("'") && v.endsWith("'"))) v = v.slice(1, -1);
        meta[kv[1]] = v;
      }
    }
  }
  return { meta, body: m ? m[2].trim() : text.trim() };
 }
@@ -11,6 +11,18 @@ Claude Code **slash commands** that run a skill on whatever you pass them.
 | `/retro` | Structured sprint retrospective | retro-analysis |
 | `/exec-summary` | Crisp executive summary | executive-summary |
 ## 🧩 Workflow recipes — chained commands
 These run **several skills in sequence**, passing each output forward as context. Full detail in [WORKFLOWS.md](../WORKFLOWS.md).
 | Command | Chains | Lifecycle |
 |---|---|---|
 | `/ship-a-feature` | ambiguity-resolver → prd-template → rice-prioritisation → roadmap-narrative → go-to-market | Discover → Ship |
 | `/close-the-quarter` | metrics-framework → churn-analysis → executive-update → board-deck-narrative | Measure → Communicate |
 | `/launch-a-product` | competitor-teardown → product-positioning-doc → go-to-market → product-launch-checklist → press-release | Decide → Ship |
 | `/rescue-an-account` | cs-health-scorecard → churn-analysis → cs-escalation-brief → renewal-playbook | Measure → Communicate |
 | `/run-discovery` | ambiguity-resolver → discovery-interview-guide → user-research-synthesis → rice-prioritisation | Discover → Decide |
 ## Install
 ```bash
@@ -0,0 +1,17 @@
 ---
 description: Workflow recipe — turn the quarter's raw numbers into a leadership story and board deck by chaining 4 skills.
 argument-hint: [the quarter's metrics, wins, and misses]
 ---
 Run the **Close the Quarter** workflow recipe for: $ARGUMENTS
 This is a *chain* of skills. Run each stage in order and **carry every stage's output forward as context** for the next. Open with a one-line plan of the 4 stages, then ask once for any essential missing inputs (the key numbers, the audience, the period). Don't re-ask between stages.
 Run each stage under a clear `## Stage N — <name>` heading:
 1. **Map what moved** — apply the `metrics-framework` skill to organise the numbers into a metric tree and identify what actually changed and why.
 2. **Explain retention** — apply the `churn-analysis` skill to separate avoidable from unavoidable churn and surface the headline retention story.
 3. **Brief leadership** — apply the `executive-update` skill, using the findings above, to write a tight leadership briefing for the period.
 4. **Build the board story** — apply the `board-deck-narrative` skill to turn that briefing into a slide-by-slide board deck storyline.
 Do not invent numbers — work only with what is given and flag any gaps. End with a 4-bullet **"What you now have"** recap.
@@ -0,0 +1,18 @@
 ---
 description: Workflow recipe — go from competitive landscape to a fully checklisted launch and press release by chaining 5 skills.
 argument-hint: [the product and who it's for]
 ---
 Run the **Launch a Product** workflow recipe for: $ARGUMENTS
 This is a *chain* of skills. Run each stage in order and **carry every stage's output forward as context** for the next. Open with a one-line plan of the 5 stages, then ask once for any essential missing inputs (the product, target customer, key competitors, launch date). Don't re-ask between stages.
 Run each stage under a clear `## Stage N — <name>` heading:
 1. **Map the field** — apply the `competitor-teardown` skill to map competitors and find the positioning gap to exploit.
 2. **Position it** — apply the `product-positioning-doc` skill, using that gap, to define positioning, value props, and messaging pillars.
 3. **Plan the GTM** — apply the `go-to-market` skill to turn the messaging into an audience-and-channel launch plan.
 4. **Make it launch-ready** — apply the `product-launch-checklist` skill to produce an owner-by-owner readiness checklist.
 5. **Announce it** — apply the `press-release` skill to draft the announcement, consistent with the positioning above.
 Do not invent facts, dates, or competitor claims — flag assumptions. End with a 5-bullet **"What you now have"** recap.
@@ -0,0 +1,17 @@
 ---
 description: Workflow recipe — diagnose an at-risk customer and build the full save play through to renewal by chaining 4 skills.
 argument-hint: [the account and what's going wrong]
 ---
 Run the **Rescue an Account** workflow recipe for: $ARGUMENTS
 This is a *chain* of skills. Run each stage in order and **carry every stage's output forward as context** for the next. Open with a one-line plan of the 4 stages, then ask once for any essential missing inputs (ARR, renewal date, usage signals, the trigger). Don't re-ask between stages.
 Run each stage under a clear `## Stage N — <name>` heading:
 1. **Score the risk** — apply the `cs-health-scorecard` skill to produce a health score and the specific risk drivers.
 2. **Find the cause** — apply the `churn-analysis` skill to identify the root cause and whether the risk is avoidable.
 3. **Escalate it** — apply the `cs-escalation-brief` skill to write the internal brief and resolution plan.
 4. **Plan the renewal** — apply the `renewal-playbook` skill to build the renewal strategy and negotiation plan.
 Do not invent account facts — work only with what is given. End with a 4-bullet **"What you now have"** recap.
@@ -0,0 +1,17 @@
 ---
 description: Workflow recipe — go from a vague opportunity to validated insight and a prioritised next step by chaining 4 skills.
 argument-hint: [the opportunity or fuzzy question]
 ---
 Run the **Run Discovery** workflow recipe for: $ARGUMENTS
 This is a *chain* of skills. Run each stage in order and **carry every stage's output forward as context** for the next. Open with a one-line plan of the 4 stages, then ask once for any essential missing inputs (who the users are, what's prompting this, any constraints). Don't re-ask between stages.
 Run each stage under a clear `## Stage N — <name>` heading:
 1. **Frame it** — apply the `ambiguity-resolver` skill to turn the fuzzy opportunity into a one-page problem brief with a scoped question.
 2. **Plan the research** — apply the `discovery-interview-guide` skill to build a screener and discussion guide for user interviews.
 3. **Synthesise** — apply the `user-research-synthesis` skill to turn findings into themes and validated insights. (If no real findings exist yet, produce a synthesis template to fill in after interviews.)
 4. **Decide what's next** — apply the `rice-prioritisation` skill to rank the resulting opportunities into a defensible next step.
 Do not invent research findings — if interviews haven't happened, say so and produce the structure to capture them. End with a 4-bullet **"What you now have"** recap.
@@ -0,0 +1,18 @@
 ---
 description: Workflow recipe — take a feature idea from fuzzy brief to launch plan by chaining 5 skills.
 argument-hint: [the feature idea or problem]
 ---
 Run the **Ship a Feature** workflow recipe for: $ARGUMENTS
 This is a *chain* of skills. Run each stage in order and **carry every stage's output forward as context** for the next — that shared context is the whole point. Open with a one-line plan of the 5 stages, then ask once for any essential missing inputs (target user, the problem, a rough success metric). Don't re-ask between stages.
 Run each stage under a clear `## Stage N — <name>` heading:
 1. **Frame the problem** — apply the `ambiguity-resolver` skill to turn the raw idea into a sharp problem statement and scoped boundaries.
 2. **Spec it** — apply the `prd-template` skill, using the framed problem, to produce a PRD with goals/non-goals, requirements, and success metrics.
 3. **Prioritise it** — apply the `rice-prioritisation` skill to score this work (reach, impact, confidence, effort) so its priority is defensible.
 4. **Place it on the roadmap** — apply the `roadmap-narrative` skill to position the work and tell the story around it.
 5. **Plan the launch** — apply the `go-to-market` skill to produce audience, messaging, channels, and a launch timeline.
 Do not invent metrics, dates, or facts — note assumptions instead. After the last stage, end with a 5-bullet **"What you now have"** recap linking each artifact to the stage that produced it.
@@ -0,0 +1,145 @@
 # Launch kit — ready-to-post copy
 Distribution is the single biggest lever for stars (the repo is already strong). Below
 is plug-and-play copy for each channel. Lead with the **demo GIF** and the **live
 Playground** — they're your best assets. Post mid-week, morning US time.
 Key links:
 - Repo: https://github.com/mohitagw15856/pm-claude-skills
 - Live Playground: https://mohitagw15856.github.io/pm-claude-skills/
 - Demo GIF: `web/docs-assets/playground-demo.gif`
 ---
 ## 1. Hacker News — Show HN
 **Title** (keep it factual, no hype — HN punishes marketing speak):
 > Show HN: 174 open-source AI "skills" that make Claude/ChatGPT produce real work
 **Body:**
 > I'm a PM, and I kept hitting the same wall: ask any AI for a PRD, an exec update, or
 > a launch plan and you get generic filler you have to rewrite. The model doesn't know
 > what "good" looks like for professional work.
 >
 > So I built a library of structured `SKILL.md` files — each one teaches the AI the
 > actual structure, rigour, and judgement a senior professional uses. There are 174 now,
 > spanning PM, engineering, design, data, legal, finance, HR, sales, marketing and more.
 >
 > They run natively in Claude Code, and ship as copy-paste prompts for ChatGPT and Gemini.
 > There's a browser Playground where you can run any skill with your own API key — no
 > install: [link]
 >
 > It's MIT-licensed. The skills are small and meant to be forked and adapted. Feedback
 > and PRs welcome — especially on which skills are weakest.
 *Reply fast to every comment for the first 2 hours; that drives ranking.*
 ---
 ## 2. Product Hunt
 **Name:** PM Skills — 174 AI agent skills for every profession
 **Tagline:** Turn any AI into a senior PM, engineer, or analyst
 **Description:**
 > 174 open-source skills that teach Claude, ChatGPT, and Gemini to produce
 > professional-grade work — PRDs, exec updates, roadmaps, postmortems, contracts and
 > more. One source, every AI tool. Try any skill in the browser, no install.
 **First comment (maker):**
 > Hey PH 👋 I'm Mohit. I built this because AI gives you *plausible* professional docs
 > that are 80% there and 100% generic. Each skill encodes how a senior pro actually
 > structures the work, so the first draft is shippable. Free, MIT, and there's a live
 > Playground so you can try one in 30 seconds: [link]. Would love to know which skill
 > you'd want next.
 ---
 ## 3. X / Twitter thread
 **1/**
 > Ask AI for a PRD and you get generic filler you rewrite anyway.
 >
 > So I built 174 open-source "skills" that teach Claude/ChatGPT/Gemini how senior pros
 > actually work.
 >
 > Free. MIT. Try any in your browser 👇
 > [attach demo GIF]
 **2/**
 > Each skill is a structured SKILL.md — the real framework behind a PRD, exec update,
 > RICE prioritisation, competitor teardown, postmortem, etc.
 >
 > Not a prompt. A repeatable recipe for "good."
 **3/**
 > Covers 18 professions: PM, engineering, design, data, legal, finance, HR, sales,
 > marketing, research, ops…
 >
 > One library, the whole arc of professional work:
 > Discover → Decide → Build → Ship → Measure → Communicate.
 **4/**
 > Runs natively in Claude Code (`/plugin marketplace add mohitagw15856/pm-claude-skills`),
 > and exports as copy-paste prompts for ChatGPT + Gemini.
 **5/**
 > Try one right now — no install, runs in the browser with your own key:
 > [Playground link]
 >
 > ⭐ If it's useful, star the repo so others find it: [repo link]
 ---
 ## 4. Reddit (r/ClaudeAI, r/ChatGPTCoding, r/ProductManagement)
 **Title:** I open-sourced 174 AI "skills" that make Claude/ChatGPT produce real professional work (PRDs, exec updates, etc.)
 **Body:**
 > Sharing a free project I've been building. Ask AI for professional docs and you get
 > something generic you have to redo. These are 174 structured `SKILL.md` files that
 > teach the model the actual structure and judgement behind the work — PRDs, roadmaps,
 > postmortems, competitor teardowns, contracts, and more across 18 professions.
 >
 > Runs in Claude Code, and there's a browser Playground to try any skill instantly with
 > your own key (no install). MIT-licensed, PRs welcome.
 >
 > Repo: [link] · Playground: [link]
 >
 > What skill would you want that isn't there yet?
 *Read each sub's self-promo rules first; lead with value, not the ask.*
 ---
 ## 5. LinkedIn
 > AI gives you professional docs that are 80% there and 100% generic.
 >
 > I spent months encoding how senior pros *actually* do the work — PRDs, exec updates,
 > prioritisation, launch plans — into 174 open-source AI skills. They make Claude,
 > ChatGPT, and Gemini produce a first draft you can ship, not redo.
 >
 > Free, MIT-licensed, and there's a browser playground to try any of them in 30 seconds.
 >
 > Link in comments 👇 What would you want it to write for you?
 *(Put the link in the first comment — LinkedIn suppresses posts with outbound links in the body.)*
 ---
 ## 6. Skill of the week — automated
 A scheduled GitHub Action (`.github/workflows/skill-of-the-week.yml`) picks a featured skill every Monday, rotating deterministically through the production-tier skills, and composes ready-to-post X + LinkedIn copy. The text always appears in the Action's **job summary** so you can copy-paste it.
 **To fully automate posting:** add a repo secret `POST_WEBHOOK_URL` pointing at a Zapier / Make / Buffer / Slack incoming webhook. The script POSTs `{ text, linkedin, skill, link }` to it each week — wire that webhook to publish to X/LinkedIn.
 Run it any time locally: `npm run skill-of-the-week`.
 ## 7. Ongoing distribution checklist
 - [ ] Submit to the [skills.sh](https://skills.sh) / Claude skills directories and any "awesome-claude" lists
 - [ ] Add a `topics` set on the GitHub repo (claude, claude-code, ai-agents, prompt-engineering, productivity)
 - [ ] Pin a "⭐ Star if useful" issue / discussion
 - [ ] Ask the first 20 users directly for a star + feedback
 - [ ] Repost the demo GIF monthly with a different featured skill
 - [ ] Write one deep-dive blog post per flagship skill (SEO long tail)
@@ -0,0 +1,50 @@
 # Skill Evals
 An LLM-as-judge harness that scores skill output quality across models — so claims like
 "production-ready" are backed by numbers, not vibes. Results render as a public
 [Skill Leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html).
 ## What it measures
 For each [case](cases.json), a model runs the skill, then a **judge model** scores the
 output 1–5 on four dimensions:
 - **structure** — follows a clear, expected structure
 - **completeness** — covers what the task needs
 - **usefulness** — specific and actually useful, not generic
 - **grounding** — stays grounded in the input, no invented facts
 ## Run it
 Needs an Anthropic API key (this calls the API and costs tokens):
 ```bash
 ANTHROPIC_API_KEY=sk-ant-... node evals/run-evals.mjs
 #   --models claude-opus-4-8,claude-sonnet-4-6,claude-haiku-4-5-20251001
 #   --judge  claude-opus-4-8
 node scripts/build-leaderboard.mjs       # render web/leaderboard.html
 ```
 `run-evals.mjs` writes `evals/results.json`; the leaderboard builder prefers it and falls
 back to `results.example.json` (clearly labelled) so the page renders before you run real evals.
 ### No local key? Run it in CI
 1. Add an `ANTHROPIC_API_KEY` repo secret.
 2. Enable **Settings → Actions → General → Workflow permissions → "Allow GitHub Actions to
   create and approve pull requests"** (so the workflow can open its results PR — `main`
   requires PRs).
 3. **Actions → "Update Skill Leaderboard" → Run workflow.** It runs the evals and opens a
   PR with `evals/results.json`. **Merge that PR** and the Pages deploy re-renders the
   public leaderboard with real numbers — no laptop required.
 ## Add a case
 Append to [`cases.json`](cases.json): `{ "skill": "<name>", "input": "<a realistic prompt>" }`.
 Keep inputs short but representative of how the skill is actually used.
 ## Honesty notes
 - Scores are an LLM judge's opinion, not ground truth — treat them as a comparative signal.
 - The judge sees the skill's stated purpose and the output, not the model name (reduces bias).
 - Re-run after model upgrades; numbers drift.
@@ -0,0 +1,65 @@
 {
  "_comment": "Eval cases: a representative input per skill. Run with: node evals/run-evals.mjs",
  "cases": [
    {
      "skill": "rice-prioritisation",
      "input": "Rank these for next quarter:\n1. Onboarding redesign — reach ~5000 users/qtr, big activation impact, ~3 person-months.\n2. Dark mode — ~8000 users want it, low impact, ~1 person-month.\n3. SSO for enterprise — ~400 accounts, high deal impact, ~4 person-months, low confidence."
    },
    {
      "skill": "prd-template",
      "input": "Feature: in-app referral program so existing users invite colleagues and both get a credit. Target: activated B2B users. Goal: grow signups 15% in Q3."
    },
    {
      "skill": "cs-health-scorecard",
      "input": "Account: Acme Corp, enterprise, ARR $120k, renewal in 90 days. DAU/MAU 18%, 2 open P2 tickets, CSAT 7, exec sponsor left last month, seats 80/100 used, payments on time."
    },
    {
      "skill": "executive-summary",
      "input": "Summarise: our Q2 retention dropped from 82% to 76% driven by a new onboarding flow that confused mobile users; we shipped a fix in week 10 and retention recovered to 80%; we recommend a full mobile onboarding rework next quarter."
    },
    {
      "skill": "competitive-analysis",
      "input": "Analyse our position vs Notion and Coda for a lightweight team wiki aimed at small startups. We're cheaper and faster to set up but have fewer integrations."
    },
    {
      "skill": "sprint-planning",
      "input": "Team of 5, 2-week sprint, average velocity 30 points, one engineer out 3 days. Backlog: checkout redesign (8), payment retries (5), analytics events (3), bug bash (3), API rate limiting (5)."
    },
    {
      "skill": "roadmap-narrative",
      "input": "H2 roadmap for a B2B analytics product. Themes: self-serve onboarding, an integrations marketplace, and enterprise SSO/audit logs. Audience: the exec team and key customers. We want the story, not a feature list."
    },
    {
      "skill": "okr-builder",
      "input": "Company objective: become the default analytics tool for startups. For the product team, next quarter. We care about activation, retention, and word-of-mouth growth."
    },
    {
      "skill": "go-to-market",
      "input": "Launching an integrations marketplace for our analytics product. Target: existing mid-market customers and their ops teams. Goal: 30% of accounts install at least one integration within 60 days."
    },
    {
      "skill": "churn-analysis",
      "input": "SMB SaaS, $49/mo. Monthly logo churn rose from 3% to 5% over two quarters. Most cancellations happen in month 2-3. Top stated reasons: 'too hard to set up' and 'didn't see value'. Annual plans churn far less than monthly."
    },
    {
      "skill": "stakeholder-update",
      "input": "Weekly update for sales, support, and exec stakeholders on the checkout revamp. Status: 10% rollout live, conversion +4%, one payments edge case under investigation, full launch gated on a Legal PCI review due Tuesday."
    },
    {
      "skill": "user-story-writer",
      "input": "Feature: let users export a dashboard to PDF and schedule a recurring email of it. Users are analysts and their managers. Keep stories small and testable with clear acceptance criteria."
    },
    {
      "skill": "incident-postmortem",
      "input": "Checkout was down 42 minutes after a deploy set a wrong env var on the payments service; 5xx spiked, ~1,200 failed checkouts. Detected by alert in 6 min, fixed by rollback. Blameless postmortem with timeline and action items."
    },
    {
      "skill": "ab-test-planner",
      "input": "Test whether moving the signup CTA above the fold on the pricing page increases free-trial starts. Current trial-start rate 8%, ~20k weekly visitors. We want to detect a 10% relative lift."
    },
    {
      "skill": "metrics-framework",
      "input": "Define the metrics framework for a B2B analytics product: the north star, input metrics across acquisition/activation/retention/revenue, and guardrails. Stage: early growth, ~500 paying accounts."
    }
  ]
 }
@@ -0,0 +1,22 @@
 {
  "_comment": "EXAMPLE data so the leaderboard renders before you run real evals. Replace by running: ANTHROPIC_API_KEY=... node evals/run-evals.mjs",
  "example": true,
  "generatedAt": "2026-06-18T00:00:00.000Z",
  "judge": "claude-opus-4-8",
  "models": ["claude-sonnet-4-6", "claude-haiku-4-5-20251001"],
  "dimensions": ["structure", "completeness", "usefulness", "grounding"],
  "results": [
    { "skill": "rice-prioritisation", "model": "claude-sonnet-4-6", "scores": {"structure":5,"completeness":5,"usefulness":5,"grounding":4}, "overall": 4.75 },
    { "skill": "rice-prioritisation", "model": "claude-haiku-4-5-20251001", "scores": {"structure":5,"completeness":4,"usefulness":4,"grounding":4}, "overall": 4.25 },
    { "skill": "prd-template", "model": "claude-sonnet-4-6", "scores": {"structure":5,"completeness":4,"usefulness":5,"grounding":4}, "overall": 4.5 },
    { "skill": "prd-template", "model": "claude-haiku-4-5-20251001", "scores": {"structure":4,"completeness":4,"usefulness":4,"grounding":4}, "overall": 4.0 },
    { "skill": "cs-health-scorecard", "model": "claude-sonnet-4-6", "scores": {"structure":5,"completeness":5,"usefulness":5,"grounding":5}, "overall": 5.0 },
    { "skill": "cs-health-scorecard", "model": "claude-haiku-4-5-20251001", "scores": {"structure":5,"completeness":4,"usefulness":4,"grounding":4}, "overall": 4.25 },
    { "skill": "executive-summary", "model": "claude-sonnet-4-6", "scores": {"structure":5,"completeness":5,"usefulness":4,"grounding":5}, "overall": 4.75 },
    { "skill": "executive-summary", "model": "claude-haiku-4-5-20251001", "scores": {"structure":5,"completeness":4,"usefulness":4,"grounding":5}, "overall": 4.5 },
    { "skill": "competitive-analysis", "model": "claude-sonnet-4-6", "scores": {"structure":4,"completeness":4,"usefulness":5,"grounding":4}, "overall": 4.25 },
    { "skill": "competitive-analysis", "model": "claude-haiku-4-5-20251001", "scores": {"structure":4,"completeness":4,"usefulness":4,"grounding":4}, "overall": 4.0 },
    { "skill": "sprint-planning", "model": "claude-sonnet-4-6", "scores": {"structure":5,"completeness":5,"usefulness":5,"grounding":5}, "overall": 5.0 },
    { "skill": "sprint-planning", "model": "claude-haiku-4-5-20251001", "scores": {"structure":5,"completeness":4,"usefulness":4,"grounding":5}, "overall": 4.5 }
  ]
 }
@@ -0,0 +1,148 @@
 {
  "generatedAt": "2026-06-18T20:35:19.929Z",
  "judge": "claude-opus-4-8",
  "models": [
    "claude-sonnet-4-6",
    "claude-haiku-4-5-20251001"
  ],
  "dimensions": [
    "structure",
    "completeness",
    "usefulness",
    "grounding"
  ],
  "results": [
    {
      "skill": "rice-prioritisation",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    },
    {
      "skill": "rice-prioritisation",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 5
      },
      "overall": 5
    },
    {
      "skill": "prd-template",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 5
      },
      "overall": 5
    },
    {
      "skill": "prd-template",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    },
    {
      "skill": "cs-health-scorecard",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 5
      },
      "overall": 5
    },
    {
      "skill": "cs-health-scorecard",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    },
    {
      "skill": "executive-summary",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    },
    {
      "skill": "executive-summary",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 4,
        "grounding": 3
      },
      "overall": 4.25
    },
    {
      "skill": "competitive-analysis",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 4,
        "usefulness": 5,
        "grounding": 5
      },
      "overall": 4.75
    },
    {
      "skill": "competitive-analysis",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 4,
        "usefulness": 5,
        "grounding": 3
      },
      "overall": 4.25
    },
    {
      "skill": "sprint-planning",
      "model": "claude-sonnet-4-6",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    },
    {
      "skill": "sprint-planning",
      "model": "claude-haiku-4-5-20251001",
      "scores": {
        "structure": 5,
        "completeness": 5,
        "usefulness": 5,
        "grounding": 4
      },
      "overall": 4.75
    }
  ]
 }
@@ -0,0 +1,113 @@
 #!/usr/bin/env node
 // Skill eval harness. For each case × model: run the skill, then score the output
 // with an LLM judge on a fixed rubric. Writes evals/results.json — feed it to
 // scripts/build-leaderboard.mjs to render web/leaderboard.html.
 //
 // Requires an Anthropic API key (this calls the API and costs tokens).
 //
 // Usage:
 //   ANTHROPIC_API_KEY=sk-ant-... node evals/run-evals.mjs
 //   ... node evals/run-evals.mjs --models claude-opus-4-8,claude-sonnet-4-6,claude-haiku-4-5-20251001
 //   ... node evals/run-evals.mjs --judge claude-opus-4-8 --cases evals/cases.json
 import { readFileSync, writeFileSync, existsSync } from 'node:fs';
 import { join, dirname } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import { complete, parseSkill } from '../bin/lib/anthropic.mjs';
 const __dirname = dirname(fileURLToPath(import.meta.url));
 const root = join(__dirname, '..');
 function arg(name, def) {
  const i = process.argv.indexOf(`--${name}`);
  return i !== -1 ? process.argv[i + 1] : def;
 }
 const apiKey = process.env.ANTHROPIC_API_KEY || '';
 const models = arg('models', 'claude-sonnet-4-6,claude-haiku-4-5-20251001').split(',').map((s) => s.trim());
 const judge = arg('judge', 'claude-opus-4-8');
 const casesPath = arg('cases', join(__dirname, 'cases.json'));
 const outPath = arg('out', join(__dirname, 'results.json'));
 const DIMENSIONS = ['structure', 'completeness', 'usefulness', 'grounding'];
 function runPrompt(skillBody) {
  return skillBody + '\n\n---\nExecute this skill now on the input. Output only the finished artifact.';
 }
 function judgePrompt(description, output) {
  return `You are a strict evaluator of a professional work artifact.
 The artifact was produced by a skill whose job is:
 "${description}"
 Score the artifact below from 1 (poor) to 5 (excellent) on each dimension:
 - structure: follows a clear, expected structure for this kind of output
 - completeness: covers what the task needs, nothing important missing
 - usefulness: actually useful to a professional, specific not generic
 - grounding: stays grounded in the given input, no invented facts/metrics
 Return ONLY a JSON object, no prose: {"structure":N,"completeness":N,"usefulness":N,"grounding":N}
 --- ARTIFACT ---
 ${output}`;
 }
 function parseScores(text) {
  const m = text.match(/\{[\s\S]*\}/);
  if (!m) throw new Error('judge did not return JSON');
  const j = JSON.parse(m[0]);
  const s = {};
  for (const d of DIMENSIONS) s[d] = Math.max(1, Math.min(5, Number(j[d]) || 0));
  return s;
 }
 // Run an async worker over `items` with at most `limit` in flight.
 async function pool(items, limit, worker) {
  const out = [];
  let i = 0;
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (i < items.length) {
      const idx = i++;
      out[idx] = await worker(items[idx]);
    }
  }));
  return out;
 }
 async function scoreTask({ c, body, description, model }) {
  try {
    const output = await complete({ apiKey, model, system: runPrompt(body), messages: [{ role: 'user', content: c.input }], maxTokens: 3000 });
    const judged = await complete({ apiKey, model: judge, messages: [{ role: 'user', content: judgePrompt(description, output) }], maxTokens: 200 });
    const scores = parseScores(judged);
    const overall = DIMENSIONS.reduce((a, d) => a + scores[d], 0) / DIMENSIONS.length;
    process.stderr.write(`✓ ${c.skill} on ${model} — ${overall.toFixed(2)}/5\n`);
    return { skill: c.skill, model, scores, overall: Math.round(overall * 100) / 100 };
  } catch (e) {
    process.stderr.write(`✗ ${c.skill} on ${model} — FAILED (${e.message})\n`);
    return null;
  }
 }
 async function main() {
  if (!apiKey) { console.error('Set ANTHROPIC_API_KEY to run evals.'); process.exit(1); }
  const concurrency = parseInt(arg('concurrency', '4'), 10) || 4;
  const { cases } = JSON.parse(readFileSync(casesPath, 'utf8'));
  // Build the full (case × model) task list.
  const tasks = [];
  for (const c of cases) {
    const skillFile = join(root, 'skills', c.skill, 'SKILL.md');
    if (!existsSync(skillFile)) { console.error(`skip ${c.skill}: no SKILL.md`); continue; }
    const { meta, body } = parseSkill(readFileSync(skillFile, 'utf8'));
    for (const model of models) tasks.push({ c, body, description: meta.description || c.skill, model });
  }
  process.stderr.write(`Scoring ${tasks.length} runs (concurrency ${concurrency})…\n`);
  const results = (await pool(tasks, concurrency, scoreTask)).filter(Boolean);
  const out = { generatedAt: new Date().toISOString(), judge, models, dimensions: DIMENSIONS, results };
  writeFileSync(outPath, JSON.stringify(out, null, 2));
  console.log(`\nWrote ${outPath} — ${results.length}/${tasks.length} scored runs. Build the page: node scripts/build-leaderboard.mjs`);
 }
 main();
@@ -0,0 +1,120 @@
 ---
 skill: ab-test-planner
 title: Ab Test Planner
 input: "Test whether moving the signup CTA above the fold on the pricing page increases free-trial starts. Current trial-start rate 8%, ~20k weekly visitors. We want to detect a 10% relative lift."
 source: generated by claude-sonnet-4-6
 ---
 # A/B Test Plan — Pricing Page CTA Position Test — 2025
 ---
 ## Hypothesis
 > "We believe that **moving the signup CTA above the fold on the pricing page** will cause **free-trial starts** to **increase by ~10% relative (from 8.0% to 8.8%)** for **all visitors to the pricing page**, because users who arrive with purchase intent shouldn't have to scroll to find the primary action — reducing friction at the moment of peak motivation."
 ---
 ## Variants
 | | Experience |
 |---|---|
 | **Control (A)** | Current pricing page — signup CTA positioned below the fold (existing layout) |
 | **Treatment (B)** | Pricing page with signup CTA repositioned above the fold, visible without scrolling on desktop and mobile — no other changes to copy, color, or surrounding elements |
 > ⚠️ **Isolation note:** Change only the CTA position. Do not alter button copy, color, pricing display, or plan layout in the same variant. If those elements are changed simultaneously, you cannot attribute results to position alone.
 ---
 ## Metrics
 **Primary Metric:**
 Free-trial start rate — defined as unique visitors who complete trial account creation divided by unique pricing page visitors, measured server-side at account creation confirmation.
 **Guardrail Metrics:**
 1. **Trial-to-paid conversion rate** — a position change that inflates low-intent sign-ups could depress downstream conversion. Must not degrade by more than 5%.
 2. **Checkout abandonment rate** — ensures the CTA change doesn't create confusion or false-start behavior in the signup flow.
 > Do not use clicks on the CTA as the primary metric. A click is not a trial start. Optimize for the outcome, not the behavior.
 ---
 ## Targeting & Traffic
 | Parameter | Value |
 |---|---|
 | **Target Segment** | All pricing page visitors (new + returning) |
 | **Traffic Split** | 50% Control / 50% Treatment |
 | **Exclusions** | Existing paid subscribers (already converted — would dilute signal); any users currently in another pricing-page test |
 | **Device scope** | Desktop and mobile (verify "above the fold" is correctly implemented per viewport — this is a common implementation error) |
 ---
 ## Sample Size & Duration
 **Inputs:**
 - Baseline trial-start rate: **8.0%**
 - MDE: **10% relative lift → detect a move from 8.0% → 8.8%**
 - Statistical power: **80%**
 - Significance level: **95% (p < 0.05)**
 **Required sample per variant: ~38,000 visitors**
 **Total sample required: ~76,000 visitors**
 **Duration calculation:**
 ```
 Weekly visitors:     20,000
 50% exposed:         10,000 per variant per week
 Weeks required:      38,000 ÷ 10,000 = 3.8 weeks → round up to 4 weeks
 ```
 **Recommended test duration: 4 weeks**
 This captures two full weekday/weekend cycles and accounts for weekly traffic seasonality. Four weeks is also the outer boundary for novelty effects — do not extend beyond this. If you haven't reached significance at 4 weeks, the effect is likely smaller than your MDE and a different approach is warranted.
 > 🔖 **Tool recommendation:** Validate this sample size at [Evan Miller's A/B test calculator](https://www.evanmiller.org/ab-testing/sample-size.html) using: Baseline = 8%, Absolute MDE = 0.8pp, Power = 80%, Alpha = 0.05.
 ---
 ## Launch & Monitoring Plan
 **Pre-launch checklist:**
 - [ ] QA variant on Chrome, Safari, Firefox — desktop and mobile viewports
 - [ ] Confirm "above the fold" renders correctly at 1280px, 1440px, 375px (iPhone), 390px breakpoints
 - [ ] Verify tracking fires correctly on trial creation (not on CTA click)
 - [ ] Confirm no other pricing page tests are running
 - [ ] Baseline metric snapshot taken 2 weeks prior for sanity check
 **During the test:**
 - Do not look at results until Week 4. Set a calendar reminder and lock the dashboard if your tool allows.
 - Peeking at results before the planned endpoint and stopping early when you see significance is the single most common cause of false positives in A/B testing — it can inflate your false positive rate from 5% to over 25%.
 - The only valid reason to check results early is to monitor guardrail metrics for a rollback trigger (below).
 **Rollback trigger:**
 > If trial-to-paid conversion rate in the Treatment group degrades by **more than 5% relative** at any monitoring checkpoint (check at end of Week 2 only), pause the test, investigate cohort quality, and escalate before proceeding.
 ---
 ## Results Interpretation Guide
 Once the test reaches the planned 4-week endpoint and full sample size:
 | Outcome | Condition | Action |
 |---|---|---|
 | ✅ **Ship it** | Treatment shows ≥10% relative lift on trial-start rate at 95% confidence **AND** guardrail metrics are stable (within ±5%) | Deploy to 100% of traffic |
 | 🔄 **Iterate** | Treatment shows positive direction but does not reach significance | Do not ship. The effect may be real but smaller than your MDE. Consider redesigning the CTA element itself (copy, contrast, size) rather than just repositioning |
 | ❌ **Reject** | No lift or negative direction at significance | Position is not the problem. Investigate pricing clarity, plan structure, or trust signals instead |
 | ⚠️ **Inconclusive** | Test ended before reaching sample size, or results are within noise | Do not ship. Do not call it a win. Do not call it a loss. Rerun with a revised plan |
 > **On "almost significant" results:** A p-value of 0.07 is not a win. It is not "trending positive." It means you do not have sufficient evidence. Shipping on p=0.07 results in roughly a 1-in-14 chance you're deploying a change that does nothing or causes harm.
 ---
 ## If This Test Doesn't Move the Needle
 A null result on CTA position is genuinely useful signal. It tells you intent isn't the bottleneck — users aren't failing to find the button. Investigate:
 1. **Pricing clarity** — are users confused about plan differences or what's included in the trial?
 2. **Trust signals** — is there sufficient social proof near the decision point (testimonials, logos, review counts)?
 3. **Trial friction** — how many fields does signup require? What happens immediately after clicking the CTA?
 4. **Scroll depth analytics** — if >70% of users already scroll past the current CTA location, position was never the problem.
@@ -0,0 +1,175 @@
 ---
 skill: churn-analysis
 title: Churn Analysis
 input: "SMB SaaS, $49/mo. Monthly logo churn rose from 3% to 5% over two quarters. Most cancellations happen in month 2-3. Top stated reasons: 'too hard to set up' and 'didn't see value'. Annual plans churn far less than monthly."
 source: generated by claude-sonnet-4-6
 ---
 # Churn Analysis: SMB SaaS — $49/mo Product
 **Period:** Last Two Quarters (Q-2 → Q-1)
 **Prepared by:** AI Analyst | **Date:** Today
 ---
 ## Headline Numbers
 | Metric | Value |
 |---|---|
 | Customers at start of period | Not provided — see data gaps |
 | Customers churned | Not provided — see data gaps |
 | **Customer churn rate (current)** | **5% monthly** |
 | **Customer churn rate (prior)** | **3% monthly** |
 | **Change** | **+2pp — a 67% relative increase** |
 | ARR/MRR lost to churn | Not provided — see data gaps |
 | **Revenue churn rate (gross)** | Estimated ~5% monthly (at $49/mo flat pricing) |
 | ARR from expansions | Not provided |
 | **Net revenue retention (NRR)** | Cannot calculate — insufficient data |
 **Benchmark context:**
 - Monthly logo churn of 5% annualises to approximately **46% customer churn per year** — this is significantly above the SMB SaaS benchmark of 10–20% annualised churn
 - At 3% monthly (prior baseline), annualised churn was ~31% — already elevated, meaning this is a deterioration from an already-problematic baseline, not a fall from a healthy position
 - The 2pp rise in two quarters is the more urgent signal: something changed, or a scaling problem that existed at lower volume has become acute
 ---
 ## Churn Breakdown by Category
 *Full categorisation requires exit data at account level. Based on stated reasons, the following classification applies:*
 | Category | Est. % of churn | ARR lost | Basis for estimate |
 |---|---|---|---|
 | Voluntary — avoidable | ~80–85% | Not provided | 'Too hard to set up' and 'didn't see value' are both addressable |
 | Voluntary — unavoidable | ~10–15% | Not provided | Budget cuts, shutdowns — typical SMB baseline |
 | Involuntary | ~5–10% | Not provided | Payment failures typical at $49/mo monthly billing |
 | **Total** | **100%** | **—** | |
 **Avoidable churn as % of total churn:** Estimated **~80–85%** — the dominant problem here is a product and onboarding problem, not an external one. This is the number we can move.
 ---
 ## Churn Reasons — Avoidable Churn Only
 *Based on stated exit reasons. Frequency counts and ARR weighting not provided — ranked by stated prominence.*
 | Reason | Est. % of avoidable churn | ARR lost | Representative signal |
 |---|---|---|---|
 | Setup friction / poor onboarding | ~50% | Not provided | "Too hard to set up" |
 | Failure to reach value milestone | ~35% | Not provided | "Didn't see value" |
 | Other / unknown | ~15% | Not provided | No data provided |
 **Note on reason overlap:** These two reasons are closely related and likely represent the same underlying failure experienced at different moments. "Too hard to set up" is the friction experienced in weeks 1–2. "Didn't see value" is the outcome of that friction by week 4–8. They are not two separate problems — they are the early and late expression of a single broken activation journey. Treating them as separate intervention targets may be less effective than fixing the end-to-end onboarding flow as a unified workstream.
 **Theme synthesis:** Virtually all avoidable churn traces to a single root cause — **customers are not reaching their first meaningful value moment before they give up or before their motivation to persist expires.** Setup is too complex relative to customer sophistication and patience at the $49/mo price point, and there is no evident mechanism recovering customers who stall. The product is losing the race between customer effort and customer reward.
 ---
 ## Churn by Segment
 ### By Billing Cadence — Key Finding
 | Segment | Churn rate | vs. Overall | Notes |
 |---|---|---|---|
 | Monthly billing | ~5% monthly | Baseline | High-risk — low switching cost, no commitment |
 | Annual billing | Significantly lower | Meaningfully below | Customers have pre-committed; also likely self-select as more serious buyers |
 **This is the most important segmentation finding in the data provided.** Annual plan customers churning far less than monthly customers tells you two things simultaneously:
 1. **Commitment creates retention** — customers who have paid upfront have stronger motivation to push through setup friction and reach value. The product can deliver value; monthly customers are abandoning before they get there.
 2. **Annual customers may be a different buyer** — they are likely more deliberate, better-resourced, or more experienced with SaaS tools. The monthly cohort may be lower-intent or less sophisticated, requiring more support to activate.
 This distinction should drive your intervention strategy: you can either move customers onto annual plans earlier (reducing churn mechanically), or you can use what annual customers do differently in their first 60 days as the model for improving monthly customer activation.
 ### By Tier
 | Tier | Churn rate | Notes |
 |---|---|---|
 | SMB (single tier) | 5% monthly | All customers appear to be in one tier at $49/mo flat |
 *Further segmentation by industry, use case, or acquisition channel not provided — see data gaps.*
 ### By Cohort (Acquisition Year)
 | Cohort | Notes |
 |---|---|
 | Not provided | Cohort analysis would reveal whether churn is concentrated in recently acquired customers (suggesting a recent product or GTM change) or is broad-based |
 **Key pattern:** The only segmentation available is billing cadence, and it is decisive. Monthly churn is the problem; annual retention is the benchmark to learn from.
 ---
 ## Timing Analysis
 - **Average contract length before churn:** Not provided — but cancellation data points strongly to early-stage loss
 - **Highest-risk moment:** **Month 2–3** — explicitly confirmed by the data provided. This is after the initial sign-up energy has faded but before the customer has built a habit or dependency on the product
 - **Churn timing distribution:**
 | When churn occurred | % of churned accounts | Interpretation |
 |---|---|---|
 | Month 1 | Low (implied) | Some immediate regret churn; likely small |
 | **Month 2–3** | **Dominant** | Primary failure window — activation breakdown |
 | Month 4–6 | Moderate (implied) | Customers who activated but found limited depth |
 | Month 6+ | Lower (implied) | Retained customers have crossed dependency threshold |
 **Why month 2–3 is the critical window:**
 At $49/mo, customers have typically exhausted their initial curiosity by the end of month 1. Month 2 is when they make the first active decision about whether the product is worth continuing. If they haven't hit a clear value moment by day 30–45, the cancellation decision is already forming — even if they haven't acted on it yet. Month 3 is when passive non-use converts to active cancellation. There is likely a **30–45 day intervention window** between when customers go quiet and when they cancel — this is recoverable territory if detected early.
 ---
 ## Early Warning Signals
 *Inferred from churn timing and stated reasons. Requires product usage data to validate.*
 | Signal | Est. lead time before churn | How to detect |
 |---|---|---|
 | Setup not completed within 7 days of signup | ~6–8 weeks | Onboarding completion event in product analytics (e.g. Mixpanel, Amplitude, Segment) |
 | Core use case action not performed in week 1 (e.g. first project created, first file imported, first report run) | ~5–7 weeks | Product event tracking — "activation event" not triggered |
 | Login frequency drops to <2 sessions in week 3–4 | ~3–5 weeks | Session frequency alert in CRM or product analytics |
 | No login for 10+ consecutive days in months 1–2 | ~2–4 weeks | Automated dormancy flag |
 | Monthly customer who has not engaged with any help content or onboarding resources | ~4–6 weeks | Help center / in-app guide completion tracking |
 | Customer submits a support ticket citing confusion (vs. a bug) | ~2–3 weeks | Support ticket tagging by category |
 **Critical gap:** If there is currently no product usage instrumentation feeding into a CRM or alerting system, none of these signals can be acted on at scale. Instrumentation is a prerequisite for proactive retention at the SMB level where 1:1 CSM coverage is not economically viable.
 ---
 ## Intervention Recommendations
 Ranked by estimated impact × feasibility.
 | Priority | Intervention | Addresses | Est. impact | Effort | Owner |
 |---|---|---|---|---|---|
 | **1** | **Redesign onboarding to deliver a single, fast, undeniable value moment within the first session** — identify the one action that correlates with long-term retention in annual customers and make it impossible to miss for new signups | Setup friction + value gap | High — directly targets 80%+ of avoidable churn | Medium | Product + Design |
 | **2** | **Implement automated intervention sequence triggered at dormancy signals** — day 7 no-setup email, day 14 low-usage outreach, day 21 offer of a live 20-minute onboarding call or recorded walkthrough | Setup friction + value gap | High — recovers customers in the warning window before they decide to cancel | Low–Medium | Growth / CRM |
 | **3** | **Introduce an annual plan conversion prompt at signup and at day 14** — customers who commit annually churn far less; earlier conversion reduces exposure to monthly churn window | Billing cadence risk | Medium — shifts customers out of high-churn billing tier | Low | Product / Pricing |
 | **4** | **Offer a free or assisted setup session for new monthly customers** — a 20-minute screenshare or async Loom walkthrough tailored to the customer's stated use case reduces setup abandonment at the highest-risk point | Setup friction | Medium — addresses 'too hard to set up' directly | Low–Medium | CS / Support |
 | **5** | **Tag and analyse support tickets and cancellation reasons by category** — build a structured exit survey with a 3-question maximum and incentivise completion (e.g. one free month for feedback) | Data quality | Enables future analysis — indirect retention impact | Low | CS / Product Ops |
 | **6** | **Audit what annual customers do differently in days 1–30** — compare activation event completion, feature usage, and support engagement between annual and monthly cohorts to find the behaviours that predict retention | Segment learning | High — turns existing retention success into a repeatable playbook | Low | Analytics |
 **Priority call:** **Intervention 1 — onboarding redesign to deliver a fast first value moment — is the single highest-impact action available.** Everything else is downstream of it. If customers cannot reach value in session one or two, no email sequence or pricing prompt will save them. The fastest path to this intervention is Intervention 6: use your annual customers as the control group to identify what the successful activation journey actually looks like, then rebuild onboarding to replicate it for monthly customers. This can be scoped and started within two weeks with no engineering required for the research phase.
 ---
 ## What We Don't Know (Data Gaps)
 - **Absolute customer and ARR numbers are missing** — churn rate percentages without base numbers make it impossible to calculate ARR impact or prioritise by revenue at risk
 - **Exit survey response rate is unknown** — "too hard to set up" and "didn't see value" may reflect only the subset of customers who completed an exit survey; silent churners (who cancel without giving a reason) may have different or additional reasons
 - **No product usage data provided** — cannot confirm which specific setup steps cause abandonment, what the activation event is, or whether early warning signals are detectable in the current instrumentation
 - **No cohort data** — cannot determine whether the rise from 3% to 5% churn is concentrated in a specific acquisition cohort (suggesting a GTM or onboarding change) or is broad-based (suggesting a product or market change)
 - **No acquisition channel data** — customers acquired through different channels (paid search, content, referral) often have materially different activation and retention rates; channel mix shift could partly explain the churn increase
 - **No competitive exit data** — unknown whether any churned customers cited a specific competitor, which would change the intervention from "fix onboarding" to "fix onboarding AND address competitive gaps"
 - **Annual vs. monthly churn rate not quantified** — the direction is confirmed (annual far less) but the magnitude is unknown; quantifying this gap would sharpen the ROI case for annual conversion interventions
 - **What changed two quarters ago is unknown** — a 67% relative increase in churn over two quarters is large enough to suggest a specific cause: pricing change, onboarding flow change, support model change, or GTM change bringing in lower-fit customers. This should be investigated before assuming the answer is purely an onboarding problem
 ---
 ## Anti-Pattern Checks
 - [x] Avoidable and unavoidable churn separated — interventions target avoidable churn only; the ~10–15% unavoidable (shutdowns, budget cuts) is not inflating the recommended intervention scope
 - [x] Churn rate calculated correctly — 5% represents customers churned in the month divided by customers at start of month, not end-of-month denominator
 - [x] Churn reasons are customer-reported ("too hard to set up," "didn't see value") — not internally assumed
 - [x] Segment analysis identifies the key differentiator (annual vs. monthly), not just averages
 - [x] Early warning signals are specific and detectable (login frequency, activation event completion) — not generic
 - [x] All six interventions link directly to a named churn reason — no orphaned recommendations
 - [⚠️] NRR cannot be calculated without expansion data — flagged explicitly in headline numbers; do not interpret gross churn rate as a complete picture of revenue health
 - [⚠️] Exit survey representativeness is uncertain — flagged in data gaps; treat reason rankings as directional, not definitive, until response rate and silent churner data are available
@@ -0,0 +1,245 @@
 ---
 skill: competitive-analysis
 title: Competitive Analysis
 input: "Analyse our position vs Notion and Coda for a lightweight team wiki aimed at small startups. We're cheaper and faster to set up but have fewer integrations."
 source: generated by claude-sonnet-4-6
 ---
 # Competitive Analysis: Lightweight Team Wiki vs. Notion & Coda
 **Audience:** Product team / leadership | **Focus:** Full landscape with strategic emphasis on positioning and roadmap implications
 ---
 > ⚠️ **Data Transparency Notice**
 > Competitor feature and pricing details below are based on publicly available information as of mid-2025. Specific figures are flagged as **[Public source]** or **[Assumption — verify]**. Pricing tiers change frequently; confirm against current competitor pricing pages before sharing with sales.
 ---
 ## 1. Executive Summary
 ### Market Position
 You occupy a defensible but time-limited niche: the fastest path to a working team wiki for early-stage startups who find Notion overwhelming and Coda overkill. Your advantages are real — lower price, faster time-to-value — but neither moat is durable on its own. Notion has been aggressively simplifying onboarding; Coda's free tier is generous. The window to establish switching costs through habits and content depth is now.
 ### Key Findings
 1. **Setup speed is your strongest differentiator today** — but it's a first-session advantage, not a long-term one. Once a team is embedded in any tool, switching costs equalize.
 2. **Notion targets everyone, which means it targets no one well.** Its complexity is a genuine pain point for 2–10 person teams; your focus is a real positioning advantage, not just marketing.
 3. **Coda's power-user ceiling is high** — teams that want database logic and automation will outgrow you faster than they outgrow Notion. This shapes which startups you should (and shouldn't) target.
 4. **The integration gap is a real risk, but a selective one.** Missing Slack and Google Drive is a blocker. Missing Salesforce is not. Prioritizing 4–6 high-frequency integrations closes 80% of the objection.
 5. **Pricing alone is not a strategy.** At the early-startup price point, $5–8/user/month differences rarely decide deals — perceived trust, ease, and peer recommendation do.
 ### Strategic Implications
 - Double down on sub-10-person teams as the primary ICP; resist pressure to chase enterprise features that erode your simplicity advantage
 - Close the integration gap on the 4–6 tools every early-stage startup uses (Slack, Google Drive/Notion import, GitHub, Linear)
 - Build a "team memory" narrative — position around *knowledge that sticks* rather than *features you get*
 ---
 ## 2. Competitor Profiles
 ### Notion
 | | |
 |---|---|
 | **Founded / Size** | 2016 · ~800 employees · $10B valuation (2021) **[Public source]** |
 | **Funding** | $343M raised **[Public source]** |
 | **Target Customer** | Broad: individuals, startups, mid-market, enterprise; strong in tech and creative industries |
 | **Core Value Proposition** | "One tool for everything" — notes, wikis, databases, projects, AI in a single flexible workspace |
 **Strengths**
 - Brand dominance; most startups have already tried it
 - Template ecosystem is enormous, reducing setup friction
 - Notion AI adds genuine value for knowledge retrieval and drafting
 - Deep integration library (500+ via Zapier, 50+ native)
 - Strong free tier sustains individual and small team adoption
 **Weaknesses**
 - Flexibility creates decision fatigue — blank canvas paralysis is a known onboarding failure mode
 - Performance on large databases is inconsistent **[Assumption — based on user reports; verify against current product]**
 - Pricing scales steeply: $16/user/month (Plus) to $18/user/month (Business) **[Public source — verify current]**
 - "Too powerful" is a real user complaint among non-technical team members
 **Recent Activity**
 - Notion AI launched across all tiers (2023–2024); now a core part of positioning **[Public source]**
 - Increased focus on enterprise sales motion; potentially drifting away from SMB-first messaging **[Assumption]**
 ---
 ### Coda
 | | |
 |---|---|
 | **Founded / Size** | 2017 · ~300 employees · $1.4B valuation **[Public source]** |
 | **Funding** | $400M+ raised **[Public source]** |
 | **Target Customer** | Operations-heavy teams, product teams, startups that want no-code automation; skews technical |
 | **Core Value Proposition** | "The doc that works like an app" — documents with database logic, Packs (integrations), and automation built in |
 **Strengths**
 - Packs ecosystem enables deep integrations with 600+ tools **[Public source]**
 - Powerful formula and automation engine rivals lightweight no-code tools
 - Generous free tier (limited doc size, unlimited makers)
 - Strong community and template ecosystem
 - Excellent for teams that want interconnected data, not just pages
 **Weaknesses**
 - Steeper learning curve than both you and Notion — formula syntax intimidates non-technical users
 - Overkill for teams that just need structured documentation
 - Less brand recognition than Notion among first-time startup founders **[Assumption]**
 - Mobile experience has historically lagged **[Assumption — verify against current app]**
 **Recent Activity**
 - Coda AI launched with document summarization and action items **[Public source]**
 - Continued investment in Packs marketplace and enterprise security features **[Public source]**
 ---
 ## 3. Feature Comparison Matrix
 > Legend: ✅ Full (production-ready) · ⚠️ Limited/Beta · ❌ None · **?** = Unverified — confirm before sharing with sales
 | Feature | You | Notion | Coda |
 |---|---|---|---|
 | **Core wiki / page structure** | ✅ Full | ✅ Full | ✅ Full |
 | **Setup time to working wiki** | ✅ <10 mins | ⚠️ 30–60 mins | ⚠️ 45–90 mins |
 | **Nested pages** | ✅ Full | ✅ Full | ✅ Full |
 | **Rich text / embeds** | ✅ Full | ✅ Full | ✅ Full |
 | **Database / table views** | ❌ None (assumed) | ✅ Full | ✅ Full (superior) |
 | **Task / project management** | ❌ None (assumed) | ⚠️ Limited | ✅ Full |
 | **Inline formulas / automation** | ❌ None (assumed) | ⚠️ Limited | ✅ Full |
 | **AI writing / search assistant** | **?** | ✅ Full | ✅ Full |
 | **Slack integration** | **?** | ✅ Full | ✅ Full |
 | **Google Drive integration** | **?** | ✅ Full | ✅ Full |
 | **GitHub integration** | **?** | ⚠️ Via Zapier | ✅ Full (Pack) |
 | **Import from Notion/Confluence** | **?** | ✅ (Confluence) | ⚠️ Limited |
 | **Permissions / access control** | **?** | ✅ Full | ✅ Full |
 | **Public sharing / docs** | **?** | ✅ Full | ✅ Full |
 | **Mobile app** | **?** | ✅ Full | ⚠️ Improving |
 | **Offline access** | **?** | ⚠️ Limited | ⚠️ Limited |
 | **Version history** | **?** | ✅ Full | ✅ Full |
 | **SSO / enterprise auth** | **?** | ✅ (Enterprise) | ✅ (Enterprise) |
 | **Free tier** | **?** | ✅ Full | ✅ Full |
 > ⚠️ **Note:** All "You" cells marked **?** require internal verification before using this matrix with customers or investors. Rows marked ❌/⚠️ for your product are assumptions based on your description; correct where inaccurate.
 **Quality notes on feature parity:**
 - Notion's database feature and Coda's formula engine are not just "more features" — they represent a different product category (structured data tool vs. wiki). This is a positioning asset for you: if a team needs those, they were never your customer.
 - Notion's setup time disadvantage is real but partially offset by its template library — a well-chosen template cuts setup significantly **[Assumption]**.
 ---
 ## 4. Pricing Comparison
 > All prices in USD per user/month, billed annually. **Verify before sharing externally.**
 | Plan | You | Notion | Coda |
 |---|---|---|---|
 | **Free** | **?** | ✅ Unlimited pages, limited collab | ✅ Limited doc size |
 | **Starter / Plus** | **?** | $12/user | $10/user |
 | **Pro / Business** | **?** | $18/user | $30/user |
 | **Enterprise** | **?** | Custom | Custom |
 **Pricing dynamics to note:**
 - Notion and Coda both have free tiers capable enough that early-stage startups (1–5 people) often don't pay at all. Competing on "we're cheaper" is most meaningful at the 5–25 person paid tier.
 - If your pricing is meaningfully below $10/user, you win on cost for budget-conscious early-stage teams. Below $6/user starts to create a quality-perception risk — some CTOs will question sustainability.
 - Consider a flat-rate team plan (e.g., $49/month for up to 10 users) — this removes per-seat friction at the moment when a founding team is deciding which tool to standardize on.
 ---
 ## 5. Market Positioning Map
 ```
                        COMPREHENSIVE
                (databases, automation, apps)
                             │
                             │
              Coda ●         │
                             │
                             │
  ──────────────────────────────────────────────
  ENTERPRISE / COMPLEX                 STARTUP / SIMPLE
  (IT-managed, SOC2,                   (self-serve, fast,
   large teams)                         opinionated)
                             │
                    Notion ● │
              (tries to span │  ● YOU
               both axes)    │ (focused here)
                             │
                        SIMPLE / FOCUSED
                   (wiki-first, easy to learn)
 ```
 **How to read this:**
 - Coda occupies the "powerful tool for ops-minded teams" space; it's not competing for the same first-time wiki user you are
 - Notion spans the entire map by design, which means it's your most direct competitor but also the one most vulnerable to a "we're simpler and built for you" message
 - You have a clear, uncrowded position in the bottom-right — but only if you resist feature creep toward the middle
 **Whitespace Opportunities**
 - **Async-first remote startups** (2–15 people, no dedicated ops hire) who want documentation that works without a system administrator
 - **Non-technical founding teams** (e.g., DTC, services, media) who find Notion's flexibility alienating
 - **Series A teams standardizing tooling** who want to graduate from a shared Google Doc chaos but aren't ready for Confluence
 ---
 ## 6. Win/Loss Analysis
 ### Why You Win
 | Scenario | Why You Win |
 |---|---|
 | Founder setting up wiki in a weekend | Fastest time-to-value; no configuration required |
 | Non-technical team (ops, marketing, content) | Less intimidating; opinionated structure reduces decision fatigue |
 | Budget-sensitive pre-seed / seed team | Lower cost or flat-rate pricing removes per-seat anxiety |
 | Team burned by Notion complexity | "We tried Notion, it became a mess" is a real and common story |
 | Teams that just need documentation | No feature bloat; the tool does one thing well |
 ### Why You Lose
 | Scenario | Why You Lose |
 |---|---|
 | Team already uses Notion (individual accounts exist) | Switching cost + familiarity; they'll try to make it work first |
 | Team needs a Slack bot or GitHub integration on day one | Hard blocker; integration gap creates immediate friction |
 | Technical co-founder evaluates tools | May perceive feature gap as long-term risk; chooses "more capable" tool to avoid migrating later |
 | Team wants databases or roadmap tracking alongside wiki | Notion or Coda solve two problems; you solve one |
 | Startup plans to scale to 50+ in 12 months | Future-proofing concern: "will this grow with us?" |
 **Key insight:** Your most common loss pattern is likely not "they compared features and chose Notion" — it's "they already had Notion accounts and never evaluated alternatives." Awareness and trial generation matter as much as product quality at this stage.
 ---
 ## 7. Strategic Recommendations
 ### Immediate Actions (0–3 months)
 | Priority | Action | Rationale |
 |---|---|---|
 | 🔴 High | **Audit and close the top 4 integration gaps** — build or finalize Slack, Google Drive, GitHub, and Linear integrations | These four cover ~80% of integration objections from your ICP. A missing Slack integration is a hard "no" for many teams. |
 | 🔴 High | **Create a "Switch from Notion" landing page and import flow** | The most qualified prospects are Notion users experiencing frustration. Make the migration path zero-effort. |
 | 🟡 Medium | **Define and publish your ICP explicitly** — "Built for teams of 2–25" | Counterintuitively, narrowing your stated audience increases conversion from that audience and sets customer expectations correctly. |
 | 🟡 Medium | **Validate your setup-time claim with a measured benchmark** | "Faster to set up" is your primary claim — make it specific and defensible (e.g., "Working wiki in 8 minutes, proven across 500 teams"). |
 ### Medium-term (3–12 months)
 | Priority | Action | Rationale |
 |---|---|---|
 | 🔴 High | **Build a durable switching cost before teams hit 15–20 people** | At that size, teams re-evaluate tools. If your product has become the muscle memory and content repository, switching cost is real. Consider features that reward *depth of use*: linked pages, team templates, search quality. |
 | 🟡 Medium | **Consider a flat-rate team plan at $39–59/month for ≤10 users** | Removes per-seat math at the critical early adoption moment; makes budgeting easy for founders without a finance function. |
 | 🟡 Medium | **Invest in social proof specifically from recognizable early-stage startups** | Founders trust founders. Three logos from YC/Techstars companies convert better than 300 generic testimonials. |
 | 🟢 Lower | **Explore AI-assisted wiki features** — auto-tagging, stale page detection, smart search | Both Notion and Coda have moved here. A lightweight, wiki-specific AI feature (e.g., "This page hasn't been updated in 90 days") differentiates without requiring a full AI platform. |
 ### What Not to Build (Strategic Restraint)
 - ❌ **Databases and formula engines** — this chases Coda and Notion into their home territory; you will lose and blur your positioning
 - ❌ **Project management / task tracking** — same risk; Linear, Asana, and Height exist; integrate with them instead
 - ❌ **Enterprise SSO / compliance features** in the next 6 months — your ICP doesn't need it yet; don't let one enterprise inquiry pull your roadmap
 ---
 ## Appendix: Confidence Levels
 | Section | Confidence | Notes |
 |---|---|---|
 | Notion pricing | High | Publicly listed; verify current figures |
 | Coda pricing | High | Publicly listed; verify current figures |
 | Notion/Coda feature claims | Medium–High | Based on public documentation and product pages |
 | Your feature capabilities | Low | Based solely on your description; requires internal validation |
 | Win/loss patterns | Medium | Inferred from ICP and market dynamics; validate with 5–10 customer interviews |
 | Integration gap severity | Medium | Ranked by general startup tool usage data; validate against your own churn/loss notes |
@@ -0,0 +1,41 @@
 ---
 skill: competitor-teardown
 title: Competitor Teardown
 input: "We're a lightweight team wiki for small startups. Compare us against Notion and Coda. We're cheaper and faster to set up but have fewer integrations."
 source: representative example (hand-written)
 ---
 # Competitor Teardown — Lightweight Team Wiki
 ## Positioning map (setup speed × depth)
 ```
 depth/power
   high │  Coda ●            Notion ●
        │
        │              ● (us)
   low  │___________________________
        slow ───── setup speed ───── fast
 ```
 We own the **fast-setup, low-overhead** corner. Notion and Coda compete on depth and power, at the cost of setup time and learning curve.
 ## Feature comparison
 | Dimension | Us | Notion | Coda |
 |-----------|:--:|:------:|:----:|
 | Time to first useful page | 🟢 < 2 min | 🟡 ~15 min | 🔴 ~30 min |
 | Price (5-person team) | 🟢 $$ | 🟡 $$$ | 🟡 $$$ |
 | Integrations | 🔴 few | 🟢 many | 🟢 many |
 | Databases / automation | 🔴 basic | 🟢 strong | 🟢 strongest |
 | Learning curve | 🟢 minimal | 🟡 moderate | 🔴 steep |
 ## Where we win / lose
 - **Win:** teams that want a wiki *today* without a setup project — early startups, non-technical founders.
 - **Lose:** teams that have outgrown simple docs and want one tool to run ops (they'll pick Notion/Coda).
 ## Messaging gap to exploit
 Notion's own positioning ("one tool for your whole team") is also its weakness — it's *a lot*. Lean into **"a team wiki you'll actually finish setting up."** Speed and simplicity, not feature parity.
 ## Strategic recommendations
 1. **Don't chase integration parity** — it's their moat, not your battlefield. Add only the 3–4 integrations startups ask for most (Slack, GitHub, Google Drive).
 2. **Make migration off us painless** — removing lock-in fear lowers the bar to try.
 3. **Target the "Notion is too much" segment** directly in messaging and SEO.
@@ -0,0 +1,199 @@
 ---
 skill: cs-health-scorecard
 title: Cs Health Scorecard
 input: "Account: Acme Corp, enterprise, ARR $120k, renewal in 90 days. DAU/MAU 18%, 2 open P2 tickets, CSAT 7, exec sponsor left last month, seats 80/100 used, payments on time."
 source: generated by claude-sonnet-4-6
 ---
 ```python
 # Scoring rationale before running the framework:
 #
 # Product Adoption – DAU/MAU 18% (below 25% benchmark), no feature breadth data
 #   but 80% seat utilisation is a positive signal → Score: 3/5
 #
 # Engagement – Exec sponsor VACANT (left last month), no QBR data provided,
 #   champion unknown → Score: 2/5
 #
 # Outcomes – No stated goals or ROI evidence provided; gap in data treated as
 #   a risk signal at 90 days to renewal → Score: 2/5
 #
 # Support Health – 2 open P2s, CSAT 7 (below 8 benchmark), no escalations → Score: 3/5
 #
 # Commercial – 80% seat utilisation (healthy), payments on time, no expansion
 #   or contraction signals stated → Score: 4/5
 scores    = {"adoption": 3, "engagement": 2, "outcomes": 2, "support": 3, "commercial": 4}
 weights   = {"adoption": 0.30, "engagement": 0.20, "outcomes": 0.20, "support": 0.15, "commercial": 0.15}
 weighted  = {k: scores[k] * weights[k] * 20 for k in scores}  # scale 1-5 → 0-100
 total     = sum(weighted.values())
 rag       = "Green" if total >= 80 else "Amber" if total >= 60 else "Red"
 for k in scores:
    print(f"{k.capitalize():12s}  score={scores[k]}  weight={int(weights[k]*100)}%  weighted={weighted[k]:.1f}")
 print(f"\nTotal: {total:.1f}/100  →  {rag}")
 ```
 ```
 Adoption      score=3  weight=30%  weighted=18.0
 Engagement    score=2  weight=20%  weighted=8.0
 Outcomes      score=2  weight=20%  weighted=8.0
 Support       score=3  weight=15%  weighted=9.0
 Commercial    score=4  weight=15%  weighted=12.0
 Total: 55.0/100  →  Red
 ```
 ---
 # Customer Health Scorecard: Acme Corp
 **CSM:** [Assign / confirm owner] | **Tier:** Enterprise
 **ARR:** $120,000 | **Renewal date:** [90 days from today] | **Days to renewal:** 90
 **Overall health:** 🔴 Red — 55/100
 **Last updated:** [Today's date]
 > **⚠ Escalation required.** Three compounding risk factors — vacant executive sponsor, sub-benchmark product adoption, and below-threshold CSAT — collide at a 90-day renewal window. This account needs an immediate save motion, not routine CSM cadence.
 ---
 ## Health Score Summary
 | Dimension | Score (1–5) | Weight | Weighted Score | Trend |
 |---|---|---|---|---|
 | Product Adoption | 3 | 30% | 18.0 | ↓ |
 | Engagement | 2 | 20% | 8.0 | ↓ |
 | Outcomes | 2 | 20% | 8.0 | → |
 | Support Health | 3 | 15% | 9.0 | → |
 | Commercial | 4 | 15% | 12.0 | → |
 | **Total** | — | 100% | **55.0/100** | |
 > **Scoring note:** Outcomes and Engagement are the two heaviest drag dimensions (combined 40% weight). Both score 2/5. Even a move from 2→3 in each would add 8 points and lift the account to Amber. That is the fastest lever available before renewal.
 ---
 ## Dimension Detail
 ### Product Adoption — 3/5
 - **DAU/MAU ratio:** 18% — **below the 25% healthy benchmark by 7 percentage points**
 - **Key features adopted:** Not specified — data required from product analytics (action: pull from platform dashboard this week)
 - **Features not adopted:** Unknown — feature gap analysis needed to target re-engagement
 - **Power users identified:** Unknown — identifying 3–5 internal champions beyond the departed exec is critical
 - **Seat utilisation:** 80/100 (80%) — this is a genuine positive; the user base exists, they are simply not logging in frequently enough
 - **Assessment:** Seat utilisation is healthy, which tells us the customer bought into the vision and deployed the product broadly. However, 18% DAU/MAU means users are not forming daily habits. This is a depth-of-adoption problem, not a breadth problem — the fix is targeted enablement and use-case reinforcement, not a sales motion.
 ---
 ### Engagement — 2/5
 - **Last QBR:** Not provided — assumed overdue or undocumented
 - **Next QBR:** Not scheduled — must be booked immediately given renewal timeline
 - **Executive sponsor:** 🔴 **VACANT** — departed last month; no replacement identified
 - **Champion:** Name and role unknown — this is an open, critical risk
 - **Assessment:** The executive sponsor vacancy is the single highest-risk item on this scorecard. Without an internal champion with authority, there is no one to defend the renewal in procurement or leadership discussions. At 90 days out, every week without a new sponsor is a week the competitor's account executive can fill that vacuum. Mapping the new org chart and securing an introductory call with the incoming stakeholder is the top priority this week, full stop.
 ---
 ### Outcomes — 2/5
 - **Customer's stated goals:** Not provided in intake — pulling from onboarding notes or last QBR summary is an immediate action
 - **Progress against goals:** Unknown / unverifiable without goal data
 - **Evidence of value:** No metric or quote available — this is a significant gap at 90 days to renewal; if the customer cannot articulate what they have achieved, they cannot justify the renewal internally
 - **Assessment:** A renewal without a documented ROI story is a renewal that depends on inertia, not value. At the $120k ARR level, the customer's procurement team will ask "what did we get for this?" If the CSM cannot answer that question with data, the renewal is at risk regardless of relationship quality. Building the value narrative — even retrospectively — must happen in the next 30 days.
 ---
 ### Support Health — 3/5
 - **Open tickets:** 2 (P1: 0 | P2: 2 | P3: 0)
 - **CSAT:** 7.0 — **below the 8.0 healthy benchmark**
 - **Unresolved escalations:** None confirmed — however, 2 open P2s with CSAT at 7 suggests unresolved friction
 - **Ticket trend (last 90 days):** Unknown — trend data not provided; support team should pull 90-day volume report
 - **Assessment:** No P1s and no escalations are genuine positives. However, a CSAT of 7 paired with 2 lingering P2s suggests the customer feels their issues are handled adequately but not excellently. At renewal, this translates to "the product works but the experience could be better" — enough to invite a competitive evaluation. Resolving both P2s before the renewal conversation closes a potential objection and moves CSAT in the right direction.
 ---
 ### Commercial — 4/5
 - **Seats licensed:** 100 | **Seats active:** 80 (80% utilisation)
 - **Payment history:** On time — no commercial red flags
 - **Expansion signals:** None stated — 20 unused seats represent a latent expansion opportunity if adoption deepens
 - **Downgrade or contraction signals:** None stated — but absence of positive engagement signals should be read cautiously at this stage
 - **Assessment:** The commercial dimension is the strongest on this scorecard. On-time payments and 80% seat utilisation mean the customer is not quietly winding down usage before a cancellation. The 20 unused seats are a two-sided signal: a risk (they could argue they are overpaying) and an opportunity (if adoption improves, those seats become an upsell story). Frame them as an opportunity, not a liability.
 ---
 ## Top Risks
 | Risk | Severity | Mitigation |
 |---|---|---|
 | Executive sponsor departed last month with no replacement identified | **High** | CSM + AE to map new org chart within 5 business days; request warm introduction via existing champion or champion's manager; prepare a "new stakeholder" briefing pack |
 | No documented ROI or outcomes story exists at 90 days to renewal | **High** | Pull usage data, build retrospective value narrative using seat utilisation and any available workflow metrics; present at emergency QBR within 30 days |
 | DAU/MAU at 18% — users not forming product habits | **High** | Identify the 3 most-used workflows among active users; run targeted enablement session for disengaged seats; surface underused high-value features |
 | Champion identity unknown — single point of failure if key contact also leaves | **Medium** | Map all active power users from login data; identify 2–3 candidates to cultivate as internal advocates before renewal |
 | CSAT at 7 and 2 open P2 tickets — friction unresolved entering renewal window | **Medium** | Escalate both P2s internally for priority resolution within 14 days; CSM to personally follow up with ticket owner post-resolution to confirm satisfaction |
 | No QBR on record / scheduled — customer lacks forum to articulate concerns or receive value | **Medium** | Schedule emergency QBR within 3 weeks; do not wait for a standard cadence given 90-day window |
 ---
 ## Recommended Actions
 **Immediate (this week):**
 1. **Map the new org chart** — identify who replaced the departed executive sponsor, and whether any VP or Director has assumed ownership of the Acme Corp tool stack — **Owner: CSM + AE | Deadline: Day 5**
 2. **Request warm introduction** to the new stakeholder via the strongest remaining internal contact (identify from login/ticket data) — **Owner: CSM | Deadline: Day 5**
 3. **Pull product analytics** — DAU/MAU breakdown by team, feature adoption heatmap, list of users who have not logged in for 30+ days — **Owner: CSM (via Customer Success Ops or product admin) | Deadline: Day 3**
 4. **Escalate both P2 tickets** internally to ensure resolution within 14 days — send a personal update to the ticket owner at Acme confirming priority — **Owner: CSM + Support Lead | Deadline: Day 2**
 **This month:**
 1. **Schedule and run an emergency QBR** with the new executive stakeholder (or interim) — agenda: retrospective value review, open issues, roadmap alignment, renewal terms preview — **Owner: CSM | Deadline: Day 30**
 2. **Build the ROI / value narrative** — compile a one-page "Year in Review" doc using seat utilisation, resolved tickets, workflow adoption, and any customer-provided metrics from onboarding goals — **Owner: CSM | Deadline: Day 21**
 3. **Run a targeted enablement session** for the 20 inactive seats — focus on the 2–3 features most correlated with high DAU/MAU in comparable accounts — **Owner: CSM + Solutions Engineer | Deadline: Day 28**
 4. **Identify and begin cultivating 2–3 power users** as internal champions — get them on a reference call or community program if available — **Owner: CSM | Deadline: Day 30**
 **Before renewal:**
 1. **Deliver formal renewal business case** to the new executive sponsor — tie directly to documented outcomes, include a 12-month success plan to address remaining gaps — **Owner: AE + CSM | Deadline: Day 60**
 2. **Confirm commercial terms** — given 20 unused seats, be prepared for a right-sizing conversation; decide internally whether to hold, offer a ramp, or pre-empt with a multi-year incentive — **Owner: AE + VP Sales | Deadline: Day 60**
 3. **Close both P2 tickets and confirm CSAT recovery** — resurvey the ticket owner post-resolution; target CSAT ≥8 before renewal conversation — **Owner: CSM + Support | Deadline: Day 45**
 4. **Get verbal commitment from executive sponsor** before sending renewal paperwork — do not let renewal land cold in an inbox — **Owner: AE | Deadline: Day 75**
 ---
 ## Renewal Forecast
 | Scenario | Probability | ARR at Risk |
 |---|---|---|
 | Full renewal at current ARR | 40% | $0 |
 | Renewal with contraction (right-sizing unused seats) | 35% | ~$24,000 (20 unused seats at blended rate) |
 | Churn | 25% | $120,000 |
 > **Calibration note:** 25% churn probability at an enterprise account is elevated. The combination of vacant executive sponsor + no documented ROI + below-benchmark adoption, all within a 90-day window, places this firmly in "needs a structured save motion" territory. If the new stakeholder meeting does not occur within 14 days, revise churn probability upward to 35–40%.
 **Recommended renewal play:** 🛑 **Save** — execute the sponsor recovery and value narrative actions before attempting any commercial conversation. A renewal discussion without an engaged executive sponsor and a clear ROI story will fail or result in a downgrade. Stabilise the relationship first; price conversation second.
 ---
 ## Quality Checks
 - [x] Score is based on data, not gut feel — each dimension has specific evidence (DAU/MAU %, seat count, CSAT score, sponsor departure, payment record)
 - [x] Risks are specific — e.g. "Executive sponsor departed last month with no replacement identified" not "low engagement"
 - [x] All actions have named owner roles and deadlines in calendar days
 - [x] Renewal probability is calibrated to account reality, not wishful thinking
 - [x] Trend arrows reflect directional change: Engagement and Adoption are ↓ due to sponsor departure and sub-benchmark usage; others held at → pending trend data
 - [x] Seat utilisation (commercial positive) explicitly separated from DAU/MAU (adoption risk) — not conflated
 ---
 ## Data Gaps — Fill Before Next Review
 The following data was not provided in the intake and **must be sourced within 7 days** to complete a fully evidenced scorecard:
 | Gap | Where to Get It | Owner |
 |---|---|---|
 | Champion name and role | CRM account record / ticket history | CSM |
 | Last QBR date and notes | CRM / Google Drive / Confluence | CSM |
 | Feature adoption breakdown | Product analytics platform | CS Ops |
 | Customer's stated goals / success metrics | Onboarding docs / original SOW | CSM |
 | P2 ticket age and details | Support platform (Zendesk / Intercom) | Support Lead |
 | 90-day ticket volume trend | Support platform reporting | Support Lead |
 | Name of incoming exec sponsor | Account mapping / AE intel | AE + CSM |
@@ -0,0 +1,57 @@
 ---
 skill: executive-summary
 title: Executive Summary
 input: "Summarise: our Q2 retention dropped from 82% to 76% driven by a new onboarding flow that confused mobile users; we shipped a fix in week 10 and retention recovered to 80%; we recommend a full mobile onboarding rework next quarter."
 source: generated by claude-sonnet-4-6
 ---
 # Customer Retention Recovery — Q2 Update
 **Executive Summary**
 *Prepared for: Leadership Team | Q2 Close*
 ---
 **Bottom line up front:**
 A flawed onboarding update caused retention to drop 6 points in Q2, from 82% to 76%. A fix shipped in Week 10 has partially recovered retention to 80%. We are recommending a full mobile onboarding rework next quarter to close the remaining gap and prevent recurrence.
 ---
 **Background:**
 Retention had been stable at 82%. A new onboarding flow deployed in Q2 created friction specifically for mobile users, triggering the drop. The underlying cause has been patched but not fully resolved.
 ---
 **Key findings:**
 - **Retention fell 6 points** to 76% — the primary driver was the new onboarding flow, not churn fundamentals
 - **Mobile users were disproportionately affected**, indicating a gap in mobile QA or pre-launch testing
 - **The Week 10 patch recovered 4 of 6 lost points** (80%), confirming the diagnosis was correct — but the fix was partial
 - **2 points of retention remain unrecovered**, representing ongoing revenue and growth risk if left unaddressed
 ---
 **Options considered:**
 | Option | Benefit | Risk | Recommendation |
 |---|---|---|---|
 | Full mobile onboarding rework | Closes remaining gap; prevents recurrence | Requires Q3 resource allocation | **Recommended** |
 | Incremental patches only | Lower cost; faster to ship | Likely to leave structural issue unresolved | Not recommended |
 | No action | None | Further erosion if mobile UX issues compound | Not recommended |
 ---
 **Recommendation:**
 Approve a full mobile onboarding rework in Q3. The patch has demonstrated that onboarding is the lever — a complete rework should recover the remaining 2-point gap and restore retention to at or above the 82% baseline.
 ---
 **Immediate next steps:**
 - **Product lead** to scope mobile onboarding rework — by end of Week 1, Q3
 - **QA lead** to implement mobile-specific pre-launch testing protocol — before next onboarding release
 - **Analytics** to instrument onboarding funnel for mobile vs. desktop tracking — ongoing
 ---
 **Risks of inaction:**
 The 2-point retention gap is currently stable but not recovered. Without structural fixes, any further onboarding changes risk reopening the full 6-point loss. At scale, each retention point typically represents material recurring revenue — leaving this unresolved accepts that risk indefinitely.
 **Full detail:** Available on request from the Product team.
@@ -0,0 +1,26 @@
 ---
 skill: executive-update
 title: Executive Update
 input: "Audience: CEO. Period: Q2. Notes: shipped onboarding redesign to all new accounts; activation up sharply; backend hiring two roles behind plan, billing revamp at risk for Q3. Metrics: weekly active accounts, activation rate, NRR."
 source: representative example (hand-written)
 ---
 # Executive Update — Q2, for the CEO
 **Headline:** Activation is up 14 points after the onboarding redesign; we're on track for the Q2 target with one staffing risk to flag for Q3.
 ## Key metrics
 - **Weekly active accounts:** 12,400 (+18% QoQ)
 - **Activation rate:** 61% (+14 pts) — best quarter on record
 - **Net revenue retention:** 112% (flat)
 ## Progress
 The redesigned onboarding flow is live for 100% of new accounts. Early cohorts reach value faster — median time-to-value is 2.1 days, down from 4.6.
 ## Risk & decision needed
 - **Risk:** Backend hiring is two roles behind plan, putting the Q3 billing revamp at risk.
 - **Decision:** Approve contractor budget (~$40k) to hold the Q3 date, or accept a 4-week slip.
 ## Next steps
 - Roll the activation experiment into the core product (next sprint).
 - Bring a Q3 staffing plan to the next leadership review.
@@ -0,0 +1,189 @@
 ---
 skill: go-to-market
 title: Go To Market
 input: "Launching an integrations marketplace for our analytics product. Target: existing mid-market customers and their ops teams. Goal: 30% of accounts install at least one integration within 60 days."
 source: generated by claude-sonnet-4-6
 ---
 I'll work with the information provided and make reasonable inferences for a mid-market analytics product with an integrations marketplace. Where I've made assumptions, I'll note them so you can swap in your specifics.
 ---
 # GTM Asset Pack — Integrations Marketplace Launch
 ---
 ## 1. Positioning Statement
 > For **operations teams at mid-market companies** who **struggle to get their data where it needs to be — moving it manually between tools or waiting on engineering to build one-off connections**, **[Product Name] Integrations Marketplace** is an **enterprise integration hub** that **lets ops teams connect their analytics to the tools they already use, in minutes, without writing a line of code**. Unlike **building custom integrations through your IT or data engineering queue**, our product **offers a growing library of pre-built, maintained connectors that ops teams can install and configure themselves**.
 **Tagline:**
 > *Your analytics, inside every tool your team already uses.*
 *(≤10 words: "Your analytics, inside every tool you already use.")*
 ---
 ## 2. Messaging Pillars
 ---
 **Pillar 1: Self-Serve Setup, No Engineering Required**
 Ops teams can connect their analytics to critical business tools without opening a single ticket.
 **Proof Points:**
 - Average integration install time is under 15 minutes from search to live data flow [ADD PROOF POINT — replace with actual median install time from beta]
 - Authentication, field mapping, and scheduling are handled through a guided UI — no code, no credentials handed to engineering [ADD PROOF POINT — confirm no-code scope from product team]
 - [ADD PROOF POINT — beta customer quote on self-serve setup experience]
 **Example in copy:**
 > "Stop waiting on engineering. Connect your analytics to Salesforce, Slack, or your data warehouse in minutes — no tickets, no code, no back-and-forth."
 ---
 **Pillar 2: A Pre-Built Library That Grows With Your Stack**
 Every integration in the marketplace is built, tested, and maintained by [Company Name] — so customers aren't inheriting a fragile DIY connector.
 **Proof Points:**
 - Marketplace launches with [N] integrations across CRM, BI, communication, and data warehouse categories [ADD PROOF POINT — confirm launch catalog count]
 - All connectors are versioned and updated automatically when upstream APIs change, eliminating break/fix maintenance for customers [ADD PROOF POINT — confirm maintenance SLA or update policy]
 - New integrations are added on a [monthly/quarterly] roadmap, prioritized by customer vote [ADD PROOF POINT — confirm cadence and voting mechanism]
 **Example in copy:**
 > "Choose from a growing library of pre-built, maintained connectors — and vote for what gets built next."
 ---
 **Pillar 3: Data Where Decisions Actually Happen**
 Analytics is only valuable when it reaches the people and tools making decisions — not when it stays inside a dashboard nobody opens.
 **Proof Points:**
 - 74% of business decisions are made in operational tools like CRMs, spreadsheets, and messaging apps — not inside analytics platforms *(Forrester, 2023 — confirm citation or replace)*
 - Customers using [Product Name] with connected downstream tools report [X]% higher dashboard engagement [ADD PROOF POINT — pull from product usage or NPS data]
 - [ADD PROOF POINT — customer example showing outcome improvement after integrating analytics into a workflow tool]
 **Example in copy:**
 > "Your team makes decisions in Salesforce, Slack, and Sheets — now your analytics can meet them there."
 ---
 **Pillar 4: Built for Mid-Market Ops — Not Enterprise IT Projects**
 The marketplace is scoped for the way mid-market ops teams actually work: fast, scrappy, and without a dedicated integration team.
 **Proof Points:**
 - No professional services engagement required to activate integrations — all configuration is self-documented in-product [ADD PROOF POINT — confirm no PS dependency]
 - Permissions and access controls let ops admins manage integrations without exposing sensitive credentials to individual users [ADD PROOF POINT — confirm RBAC scope from product/security team]
 - Priced as part of existing subscription — no separate integration tax or per-connector fees [ADD PROOF POINT — confirm pricing model before launch]
 **Example in copy:**
 > "Designed for the ops team of 3, not the IT department of 300 — no consultants, no contracts, no extra line items."
 ---
 **Pillar 5: One Connected System, Auditable and Secure**
 Every data flow through the marketplace is logged, permissioned, and auditable — so ops teams move fast without creating compliance risk.
 **Proof Points:**
 - All integration activity is captured in a centralized audit log accessible to admins [ADD PROOF POINT — confirm audit log feature scope]
 - OAuth 2.0 authentication standard used across all connectors — no stored plaintext credentials [ADD PROOF POINT — confirm with security team]
 - [ADD PROOF POINT — SOC 2 / compliance certification reference if applicable]
 **Example in copy:**
 > "Every connection is logged and permissioned — so your team can move fast without your security team losing sleep."
 ---
 ## 3. Feature & Functionality List
 | Feature / Functionality | Buyer Benefit |
 |---|---|
 | Pre-built connector library (CRM, BI, data warehouse, messaging) | Eliminates weeks of scoping and engineering effort to connect analytics to existing tools |
 | Guided, no-code install wizard | Enables any ops team member to set up a live integration in minutes — no technical background required |
 | Automated connector maintenance and API version updates | Removes the ongoing cost and risk of maintaining homegrown integrations that break when upstream tools change |
 | Field mapping and transformation UI | Gives ops teams control over exactly what data flows where, without writing transformation scripts |
 | Role-based access control for integrations | Ensures only authorized team members can install or modify connections, reducing accidental data exposure |
 | Centralized integration audit log | Provides admins with a full history of all data flows for compliance reviews, troubleshooting, and security audits |
 | Scheduled and event-triggered sync options | Delivers data to connected tools on the schedule that matches business workflows — not just on-demand exports |
 | In-marketplace user voting for new connectors | Gives ops teams direct influence over the integration roadmap, ensuring future connectors reflect actual tool adoption |
 | Integration health monitoring and alerting | Notifies ops admins proactively when a connection fails — reducing silent data gaps that corrupt downstream reporting |
 | Single-pane integration management dashboard | Saves time by showing all active connections, sync status, and error states in one place without switching between tools |
 ---
 ## 4. Use Cases
 ---
 **Use Case 1: Revenue Operations Manager — Syncing Analytics Insights to CRM**
 - **Who:** Revenue Operations Manager, mid-market B2B SaaS company (~200–1,000 employees)
 - **Situation:** The RevOps manager needs account health scores and product usage metrics to appear directly in Salesforce so AEs can act on churn risk without leaving their CRM.
 - **Before:** RevOps exports a CSV from the analytics platform weekly, manually uploads it to Salesforce, and maps fields by hand. The process takes 3–4 hours every Friday and the data is stale by Tuesday. AEs either ignore the separate dashboard or ask RevOps for one-off pulls.
 - **With [Product Name] Marketplace:** RevOps installs the Salesforce connector in 12 minutes, maps product usage fields to custom Salesforce objects through a point-and-click UI, and sets a nightly sync schedule.
 - **Outcome:** AEs see live account health scores on every Salesforce account record. RevOps reclaims 3+ hours per week. Churn escalations are triggered automatically based on score thresholds — no manual monitoring required.
 ---
 **Use Case 2: Business Operations Analyst — Automating Reporting Into BI Tools**
 - **Who:** Business Operations Analyst, mid-market operations or finance team
 - **Situation:** The analyst maintains a Tableau or Looker dashboard for executive reporting, which requires pulling fresh data from the analytics platform each week before the Monday leadership meeting.
 - **Before:** The analyst manually exports data, reformats it in Excel to match the BI tool's expected schema, and re-uploads. Any change to the analytics platform's output breaks the downstream dashboard and causes a scramble before the Monday call.
 - **With [Product Name] Marketplace:** The analyst connects the analytics platform directly to their BI tool via the marketplace connector. Field mappings are saved and versioned. Data refreshes automatically every Sunday night.
 - **Outcome:** The Monday dashboard is always current with zero manual prep. Schema changes in the analytics platform surface as mapping alerts — not silent breakage. The analyst redirects 4+ hours per week from data plumbing to actual analysis.
 ---
 **Use Case 3: Operations Team Lead — Pushing Alerts Into Slack for Real-Time Response**
 - **Who:** Operations Team Lead managing a customer success or support function
 - **Situation:** The ops lead wants their team to act immediately when a key metric crosses a threshold (e.g., a customer's usage drops below a retention risk threshold) — but the team lives in Slack, not in the analytics dashboard.
 - **Before:** The team lead sets up email alerts inside the analytics platform, but they get buried. Alternatively, they assign a team member to check the dashboard daily — an inconsistent process that misses evenings and weekends.
 - **With [Product Name] Marketplace:** The ops lead installs the Slack connector, creates a channel for operational alerts, and maps specific metric thresholds to automated Slack notifications in under 20 minutes.
 - **Outcome:** The team receives real-time alerts in the channel they monitor all day. Response time to at-risk customer signals drops from 24–48 hours to under 2 hours. No dashboard babysitting required.
 ---
 **Use Case 4: IT/Systems Admin — Managing and Auditing All Integrations Across the Org**
 - **Who:** IT Systems Administrator or Internal Tools Admin responsible for data governance
 - **Situation:** Multiple ops team members have started installing integrations independently. The admin needs to see what's connected, who authorized it, where data is flowing, and whether any connections are failing — without asking each team.
 - **Before:** No centralized view exists. The admin discovers integrations by asking around or receiving a complaint when a downstream report breaks. There's no audit trail for compliance reviews.
 - **With [Product Name] Marketplace:** The admin accesses the Integration Management Dashboard, sees all active connectors, owner, install date, sync frequency, and last-run status. They review the audit log filtered by date range for a quarterly compliance review in minutes.
 - **Outcome:** The admin can complete a full integration audit in under 30 minutes. Failed connections are caught proactively via health alerts before they affect business reporting. Compliance documentation is exportable directly from the dashboard.
 ---
 **Use Case 5: VP of Operations — Driving the 60-Day Integration Adoption Goal**
 - **Who:** VP of Operations or Chief of Staff sponsoring the internal rollout of the integrations marketplace
 - **Situation:** Following the marketplace launch announcement, the VP needs to drive adoption across their team without creating a large change management project or requiring engineering involvement.
 - **Before:** Rolling out new technical tooling at this company typically requires an IT project kickoff, a training session, documentation written by someone technical, and weeks of back-and-forth. Adoption still lags because the tools feel built for engineers.
 - **With [Product Name] Marketplace:** The VP shares the marketplace URL with their ops team leads with a one-page use case guide (this document). Each team lead self-identifies their highest-value integration use case, installs it independently, and reports back within a week. The VP tracks adoption via the admin dashboard.
 - **Outcome:** First integrations are live within the first week post-launch with no engineering involvement. The VP achieves measurable tool connectivity across the ops function within the 60-day window — with a clear audit trail to share upward as a business impact story.
 ---
 ## Flagged Items Before Launch
 The following placeholders require your team to insert real data before this asset pack goes external:
 | # | Placeholder | Owner | Priority |
 |---|---|---|---|
 | 1 | Median integration install time (beta data) | Product / Analytics | 🔴 High — used in Pillar 1 and Use Cases |
 | 2 | Launch catalog integration count | Product | 🔴 High — core credibility claim |
 | 3 | Connector maintenance/update SLA | Engineering | 🔴 High — Pillar 2 differentiator |
 | 4 | Dashboard engagement lift data (connected vs. unconnected) | Data / Analytics | 🟡 Medium — Pillar 3 proof |
 | 5 | Forrester citation confirmation | Marketing | 🟡 Medium — Pillar 3 stat |
 | 6 | Pricing model confirmation (included vs. add-on) | Product / Finance | 🔴 High — Pillar 4 claim |
 | 7 | SOC 2 / compliance certification status | Security / Legal | 🟡 Medium — Pillar 5 trust signal |
 | 8 | Beta customer quote on self-serve experience | Customer Success | 🟡 Medium — social proof across all pillars |
 ---
 **Not for:** Enterprise accounts requiring custom connector development, professional services scoping, or dedicated integration support. This marketplace is designed for self-serve activation — accounts needing heavily customized data pipelines should be routed to the solutions engineering team.
@@ -0,0 +1,191 @@
 ---
 skill: incident-postmortem
 title: Incident Postmortem
 input: "Checkout was down 42 minutes after a deploy set a wrong env var on the payments service; 5xx spiked, ~1,200 failed checkouts. Detected by alert in 6 min, fixed by rollback. Blameless postmortem with timeline and action items."
 source: generated by claude-sonnet-4-6
 ---
 # Incident Postmortem: Payments Service Misconfiguration Causes Checkout Outage
 **Incident ID:** INC-[FILL IN]
 **Severity:** P1
 **Date:** [FILL IN DATE]
 **Duration:** [HH:MM] → [HH:MM] UTC — 42 minutes total
 **Status:** Resolved
 **Author:** [FILL IN]
 **Last updated:** [FILL IN DATE]
 ---
 ## Executive Summary
 Following a routine deployment to the payments service, a misconfigured environment variable caused the service to enter a fault state, resulting in 5xx errors across all checkout flows. Approximately 1,200 checkout attempts failed during the 42-minute window between the deployment and full service restoration. The issue was detected by automated alerting within 6 minutes of the deployment completing. Service was fully restored by rolling back the deployment to the prior stable version. This postmortem documents the contributing factors that allowed an invalid configuration to reach production and identifies specific changes to prevent recurrence.
 ---
 ## Impact
 | Dimension | Details |
 |---|---|
 | **Users affected** | ~1,200 failed checkout attempts (all users attempting checkout during the window) |
 | **Services degraded** | Payments service (complete); Checkout flow (complete) |
 | **Business impact** | ~1,200 lost or abandoned transactions; potential revenue impact TBD; SLA breach assessment pending |
 | **Duration** | 42 minutes from deployment completion to full service restoration |
 ---
 ## Timeline
 > Time-zero is the moment the deployment completed and began serving traffic. Adjust all UTC times to match actuals.
 ```
 [T+00:00] — Deployment of payments service vX.Y.Z completed and began serving 100% of traffic.
            Deployment included an environment variable change as part of a configuration update.
 [T+00:00 – T+06:00] — 5xx error rate begins climbing immediately post-deploy.
                       No automated gate halted deployment progression; change rolled out fully.
 [T+06:00] — Automated alert fires on elevated 5xx error rate threshold for the payments service.
             On-call engineer receives page.
 [T+06:00 – T+10:00] — On-call engineer begins investigation. Checks service health dashboards,
                        confirms 5xx errors are isolated to the payments service.
                        Checkout team notified. Incident declared P1.
 [T+10:00 – T+18:00] — Engineers examine recent changes. Deployment log reviewed.
                        Hypothesis formed: misconfigured environment variable introduced in deploy
                        is causing the payments service to fail on initialization or request handling.
                        Environment variable values compared against expected configuration.
                        Misconfiguration confirmed.
 [T+18:00] — Decision made to roll back to prior stable deployment (vX.Y.(Z-1))
             rather than attempt a forward-fix configuration patch, to minimize
             additional risk and time to recovery.
 [T+18:00 – T+28:00] — Rollback initiated and executed. Deployment pipeline runs rollback procedure.
 [T+28:00] — Rollback completes. Payments service begins serving traffic from prior stable version.
 [T+28:00 – T+36:00] — Engineers monitor error rate. 5xx rate returns to baseline.
                        Checkout success rate confirmed restored. No residual errors observed.
 [T+36:00] — All-clear confirmed. Incident declared resolved.
             Post-incident monitoring period begins.
 [T+42:00] — Incident formally closed after sustained clean metrics window.
             Postmortem process initiated.
 ```
 > **Key interval:** 12 minutes elapsed between alert firing (T+06) and root cause identification (T+18).
 > **Key interval:** 22 minutes elapsed between alert firing (T+06) and full service restoration (T+28).
 ---
 ## Root Cause
 **Primary root cause:** A deployment to the payments service introduced an incorrect environment variable value that caused the service to fail on payment processing requests, producing 5xx errors for all users attempting checkout.
 **Contributing factors:**
 - **No configuration validation step in the deployment pipeline.** The environment variable value was not validated against an expected schema, type, or allowlist before the deployment was promoted to production. An invalid value was able to pass through all pipeline stages without triggering a failure or warning.
 - **No canary or staged rollout was in place for this deployment.** The change was applied to 100% of payments service traffic simultaneously. A phased rollout would have limited the blast radius and potentially allowed automated gates to catch the error rate spike before full promotion.
 - **Deployment health check did not include an end-to-end checkout probe.** Post-deployment smoke tests validated that the payments service process was running and returning 200s on its health endpoint, but did not execute a synthetic checkout transaction. The service appeared healthy to automated gates while failing all real traffic.
 - **The environment variable change was bundled with other deployment changes.** Because the configuration change was not isolated, identifying the specific cause required additional investigation time that would not have been necessary with a dedicated configuration-only change.
 - **No configuration diff was surfaced in the deployment approval review.** The deployment approval process showed code changes but did not explicitly surface environment variable changes for reviewer attention.
 **Why did our existing safeguards not prevent this?**
 The deployment pipeline correctly executed the deployment and confirmed the service was running. However, the pipeline's definition of "healthy" — a passing health check endpoint — did not reflect the service's ability to process real payment transactions. The misconfigured environment variable had no effect on service startup or health endpoint response, which meant every automated gate the deployment passed through returned a false positive. No human reviewer was alerted to the environment variable change specifically, as it was not called out distinctly in the deployment review interface. Collectively, the safeguards in place validated process execution rather than functional correctness, creating a gap that allowed a configuration-level fault to reach full production traffic undetected.
 ---
 ## Detection
 - **How was it first detected?** Automated alert on elevated payments service 5xx error rate.
 - **Time from incident start to detection:** 6 minutes.
 - **Should we have detected this faster?** Partially. The 6-minute detection time reflects a reasonable alert latency given current thresholds. However, a synthetic checkout probe running continuously post-deploy would likely have reduced detection to under 2 minutes and could have triggered an automatic rollback without requiring human escalation.
 ---
 ## Resolution
 **What fixed it?** The payments service was rolled back to the previously stable deployment version, restoring the correct environment variable values and returning the service to a known-good configuration state.
 **Why did this work?** The prior deployment version carried the correct environment variable values as part of its packaged configuration. Rolling back replaced the misconfigured runtime environment with the validated prior state, immediately restoring correct payment processing behavior.
 **Was there a temporary mitigation before full resolution?** No. The rollback was identified as the fastest and lowest-risk path to recovery and was executed directly. No partial mitigation (such as traffic shifting or feature flagging) was applied.
 ---
 ## Action Items
 | # | Type | Action | Owner | Due Date | Priority |
 |---|---|---|---|---|---|
 | 1 | **Prevent recurrence** | Add a configuration validation step to the deployment pipeline that validates all environment variables against a typed schema and allowlist before deployment is promoted to any environment. Deployment must fail if validation does not pass. | Platform / DevOps | [DATE + 2 weeks] | P1 |
 | 2 | **Prevent recurrence** | Require environment variable changes to be submitted and reviewed as isolated configuration-only deployments, separate from code changes, with an explicit diff displayed in the approval interface. | Platform / Engineering Leads | [DATE + 3 weeks] | P1 |
 | 3 | **Prevent recurrence** | Implement canary / staged rollout for all payments service deployments, with automated promotion gates requiring <0.1% error rate on the canary slice before proceeding to full rollout. | Platform / Payments Team | [DATE + 4 weeks] | P1 |
 | 4 | **Improve detection** | Add a synthetic end-to-end checkout probe that executes a test transaction against the payments service every 60 seconds. Alert must fire within 2 minutes of probe failure. Probe to run as a post-deploy gate and continuously in production. | Observability / Payments Team | [DATE + 2 weeks] | P1 |
 | 5 | **Improve detection** | Lower the 5xx alert threshold for the payments service specifically, and evaluate adding a deployment-triggered alert window that applies a tighter threshold for the 15 minutes immediately following any payments service deploy. | Observability Team | [DATE + 1 week] | P2 |
 | 6 | **Improve response** | Write and publish a runbook for "payments service elevated 5xx post-deploy" that explicitly documents: confirm scope, check deployment log, diff environment variables, escalation path, and rollback procedure with commands. | Payments On-Call Lead | [DATE + 1 week] | P2 |
 | 7 | **Improve response** | Conduct a tabletop exercise using this incident scenario with the on-call rotation to validate the new runbook and confirm rollback procedure can be executed within 10 minutes of alert fire. | Engineering Manager | [DATE + 6 weeks] | P3 |
 > **P1 action items (1–4) must be closed before this incident is marked fully resolved.**
 ---
 ## What Went Well
 - **Automated alerting worked.** The 5xx alert fired within 6 minutes of the deployment completing — without this alert, detection would have depended on customer reports, which typically arrive with significantly more delay.
 - **Root cause was identified quickly once investigation began.** Engineers correctly narrowed the scope to the deployment and identified the specific misconfigured environment variable within approximately 12 minutes of the alert firing, which is a strong diagnostic outcome under pressure.
 - **The rollback decision was made quickly and without hesitation.** Once the misconfiguration was confirmed, the team did not attempt a risky forward-fix in a degraded state. The decision to roll back was made cleanly and executed promptly, which kept resolution time well under an hour.
 - **Blast radius was contained to checkout.** The fault in the payments service did not cascade to degrade other services. Existing service boundaries and circuit behavior limited the failure domain.
 - **Cross-team communication was prompt.** The checkout team was notified within minutes of the alert, preventing them from pursuing a parallel investigation on their own service and keeping responders coordinated.
 ---
 ## Lessons Learned
 1. **Health checks that pass can still mean a service is broken.** A service returning 200 on `/health` is not the same as a service correctly processing real requests. Deployment gates should validate functional behavior — not just process liveness — especially for services in critical transaction paths.
 2. **Configuration changes carry the same risk as code changes and should be treated accordingly.** Environment variable changes are operationally equivalent in impact to code changes. Bundling them with code changes obscures their risk and reduces the visibility reviewers have into what is actually changing.
 3. **The blast radius of a misconfiguration is determined before the incident starts.** The choice not to use staged rollouts made a single bad configuration value a 100% user impact event instead of a limited one. Rollout strategy is a risk management decision that should be mandatory for high-criticality services.
 4. **Tight alert thresholds and fast synthetic probes disproportionately reduce customer impact.** The difference between a 6-minute and a 2-minute detection time, at the transaction volumes involved, likely represents hundreds of failed checkouts. Investment in detection speed on critical paths has a direct and quantifiable impact.
 5. **Runbooks should be written before the incident that needs them.** The response in this incident relied on experienced engineers reasoning through the problem in real time. A pre-existing runbook for post-deploy payment failures would have reduced investigation time and made the on-call rotation more resilient to less experienced responders being paged.
 ---
 ## Communication Log
 *Populate with actuals. Template entries below:*
 | Time (UTC) | Channel | Message Summary |
 |---|---|---|
 | [T+08:00] | Status page | "We are investigating reports of errors during checkout. Payments may be affected." |
 | [T+20:00] | Status page | "We have identified the cause and are implementing a fix. Estimated resolution within 15 minutes." |
 | [T+30:00] | Status page | "Checkout has been restored. We are monitoring to confirm stability." |
 | [T+45:00] | Status page | "This incident is resolved. A postmortem will be published at [link]." |
 | [DATE] | Customer support | Template response issued to users who contacted support during the window — offer of [compensation policy if applicable]. |
 ---
 ## Quality Checks
 - [x] Timeline has no blame-focused language
 - [x] Root cause is specific — names the misconfigured environment variable and the deployment event, not "human error"
 - [x] Root cause answers "why did this happen?" — identifies system gaps (no config validation, no canary, inadequate smoke test) not just the symptom
 - [x] Contributing factors explain the systemic gaps that enabled the misconfiguration to reach production
 - [x] Every action item has an owner category and due date placeholder — assign named owners before publishing
 - [x] "What went well" section reflects genuine observations about response behavior
 - [x] No action item contains vague language — each names a specific, testable, ownable change
 - [x] Executive summary is readable by non-technical leadership
@@ -0,0 +1,28 @@
 ---
 skill: meeting-notes
 title: Meeting Notes
 input: "Transcript of a 30-min product/eng sync: debated whether to delay the checkout launch over a payments edge case; agreed to ship behind a flag to 10% on Thursday; Priya to write the rollback runbook; Sam worried about PCI scope; decided to ask Legal by Tuesday."
 source: representative example (hand-written)
 ---
 # Product / Eng Sync — Checkout Launch
 **Date:** (meeting date) · **Attendees:** Product, Eng, (Priya, Sam) · **Purpose:** Decide checkout launch timing
 ## Decisions
 - ✅ **Ship checkout behind a feature flag to 10% of traffic on Thursday** — not a full launch, to contain the payments edge case.
 - ✅ **Confirm PCI scope with Legal before expanding past 10%.**
 ## Action items
 | Owner | Action | Due |
 |-------|--------|-----|
 | Priya | Write the rollback runbook for the checkout flag | Before Thursday |
 | Sam | Document the payments edge case + PCI scope question for Legal | Mon |
 | (Lead) | Get Legal's read on PCI scope | Tue |
 ## Open questions / risks
 - **PCI scope (Sam):** does the new checkout path widen our PCI compliance surface? Blocks expansion beyond 10%.
 - **Edge case:** the payments failure path isn't fully characterised — the 10% rollout is partly to observe it in production.
 ## Not decided
 - Full-launch date — gated on the 10% results and Legal's answer.
@@ -0,0 +1,233 @@
 ---
 skill: metrics-framework
 title: Metrics Framework
 input: "Define the metrics framework for a B2B analytics product: the north star, input metrics across acquisition/activation/retention/revenue, and guardrails. Stage: early growth, ~500 paying accounts."
 source: generated by claude-sonnet-4-6
 ---
 # Metrics Framework: B2B Analytics Product
 **Context:** Early growth stage, ~500 paying accounts, B2B SaaS analytics product.
 **Framework chosen:** North Star + Metric Tree
 **Why:** At 500 paying accounts you have enough signal to measure real customer value, but not enough scale to optimise channels independently. A North Star + Metric Tree keeps the whole company pointed at one outcome while giving each team (sales, product, CS) clear leading indicators to own. AARRR is too funnel-centric for a product where retention and expansion are the primary growth engine; OKRs alone lack the causal structure you need to diagnose what's breaking.
 ---
 ## 1. North Star Metric
 **Weekly Active Accounts Reaching an Insight Action**
 **Definition:** The number of paying accounts (not individual users) where at least one user, in a given week, completes an *insight action* — defined as any of: sharing a dashboard externally, exporting a report to a downstream tool, creating a saved alert, or presenting a chart in a meeting (tracked via our embed/share event). These are moments where the product has produced something the customer acts on, not just views.
 **Why this is the right North Star:**
 An analytics product's core promise is not data access — it is decisions made faster or better. Tracking accounts that reach an insight action captures the moment the product delivers that promise: someone took the output and did something with it. This is harder to game than DAU or logins (which reward habit without value), and it correlates strongly with renewal because customers who regularly export, share, or alert on data have built the product into a workflow they won't easily abandon. At 500 accounts, growing this number means both activating dormant accounts and deepening usage in active ones.
 **Formula:**
 ```
 Weekly Active Accounts (Insight Actions) =
  COUNT DISTINCT(account_id WHERE insight_action_event >= 1 in rolling 7 days)
 ```
 **Current baseline:** [ADD BASELINE]
 **Target:** [ADD TARGET]
 ---
 ## 2. Metric Tree
 ```
 Weekly Active Accounts Reaching an Insight Action
 ├── ACQUISITION: New Pipeline Into the Funnel
 │   ├── Qualified new accounts entering trial / week
 │   └── Trial-to-paid conversion rate (%)
 │
 ├── ACTIVATION: Getting New Accounts to First Value
 │   ├── % of new accounts reaching first insight action within 14 days
 │   ├── Time to first insight action (median days, cohorted by signup week)
 │   └── Onboarding completion rate (% completing setup checklist)
 │
 ├── RETENTION: Keeping Active Accounts Active
 │   ├── Month-1 account retention rate (% of activated accounts still active 30 days later)
 │   ├── 90-day account churn rate (rolling)
 │   └── Insight action frequency per account (median weekly actions, active accounts only)
 │
 └── REVENUE: Monetising and Expanding Value
    ├── Net Revenue Retention (NRR) — rolling 12-month
    ├── Expansion MRR from seat / tier upgrades ($/month)
    └── Average Contract Value (ACV) by acquisition cohort
 ```
 ---
 ### L2 Metric Definitions
 #### ACQUISITION
 **Qualified new accounts entering trial / week**
 - **Definition:** Count of new accounts starting a free trial or pilot that meet your ICP criteria (e.g. company size, industry, data source used) — not raw signups, which include noise.
 - **Why it matters:** Feeds the top of the funnel. Low-quality signups inflate conversion work without adding to the North Star; tracking *qualified* accounts ties acquisition to accounts likely to reach an insight action.
 - **Leading or lagging?** Leading — predicts future activated and paying accounts.
 - **How to measure:** CRM (HubSpot/Salesforce) trial start events filtered by ICP qualification score or SDR-confirmed fit tag.
 ---
 **Trial-to-paid conversion rate (%)**
 - **Definition:** `Accounts converting to paid / Accounts starting trial` in a given cohort month. Measure at 30 and 60 days post-trial start.
 - **Why it matters:** Tells you whether the product is convincing people to pay. A rising North Star with a falling conversion rate means you're working harder to fill a leaking bucket.
 - **Leading or lagging?** Lagging (outcome of acquisition + activation quality), but leads revenue.
 - **How to measure:** Billing system (Stripe/Chargebee) paid event joined to trial start date in your data warehouse.
 ---
 #### ACTIVATION
 **% of new accounts reaching first insight action within 14 days**
 - **Definition:** `Accounts with ≥1 insight action event in days 0–14 / All accounts that started onboarding in the same cohort`. 14 days is the activation window because analytics products typically require data connection, dashboard build, and a first "aha" — this rarely compresses below a week but should not take three.
 - **Why it matters:** This is the single strongest predictor of whether an account will still be active at day 90. If an account doesn't reach an insight action in the first two weeks, churn probability spikes sharply. Fix this metric and you fix retention downstream.
 - **Leading or lagging?** Leading — strongly predictive of 90-day retention and NRR.
 - **How to measure:** Product event log: `insight_action` event with `account_id`, joined to account `created_at`. Compute cohort weekly.
 ---
 **Time to first insight action — median days (cohorted by signup week)**
 - **Definition:** Median number of calendar days between account `created_at` and first `insight_action` event, measured per signup cohort. Track P50 and P90 to catch tail distributions.
 - **Why it matters:** Even if 60% of accounts activate, a median time of 11 days vs. 4 days reflects a materially different onboarding experience. Faster time-to-value compresses the sales cycle's "prove it" phase and reduces early churn.
 - **Leading or lagging?** Leading.
 - **How to measure:** Same event log as above. `DATE_DIFF(first_insight_action_at, account_created_at)`. Track weekly cohort medians in a time-series chart.
 ---
 **Onboarding completion rate (%)**
 - **Definition:** `Accounts completing all steps of the onboarding checklist / Accounts that started onboarding`. Steps should include: data source connected, first dashboard created, first team member invited, first alert set. Define the checklist explicitly in-product — this metric is only meaningful if the checklist reflects the actual path to value.
 - **Why it matters:** Diagnostic metric. When time-to-first-insight deteriorates, check here first. A drop in checklist completion means an onboarding step is breaking; stagnant checklist completion with slow insight time means the checklist itself isn't driving the right behaviour.
 - **Leading or lagging?** Leading.
 - **How to measure:** In-product step-completion events, tracked per account. Build a funnel view in your analytics tool (yes, eat your own cooking).
 ---
 #### RETENTION
 **Month-1 account retention rate (%)**
 - **Definition:** `Accounts with ≥1 insight action in days 22–35 / Accounts with ≥1 insight action in days 1–14`. This is *activity-based* retention, not just "not cancelled" — an account that stops using the product before churning is already lost.
 - **Why it matters:** At 500 accounts, losing 10% in Month 1 is a structural problem that compounding makes catastrophic. This metric is the earliest possible read on whether activation led to a habit, not just curiosity.
 - **Leading or lagging?** Leading (predicts 90-day churn and NRR).
 - **How to measure:** Product event log with rolling cohort windows. Measure by account, not user.
 ---
 **90-day account churn rate (rolling)**
 - **Definition:** `Accounts that cancelled or went inactive (zero insight actions for 30+ days) in the trailing 90 days / Average paying accounts in that period`. Track both hard cancellations and soft churn (paid but dormant — these accounts are at high risk and represent recoverable revenue).
 - **Why it matters:** This is your core retention health signal. At early growth stage, a churn rate above ~2–3% monthly compresses your growth ceiling to the point where acquisition can't outrun it. Separate hard from soft churn — they have different interventions.
 - **Leading or lagging?** Lagging — it records what already happened.
 - **How to measure:** Billing system cancellation events + product event log for soft churn detection. Report monthly, trend quarterly.
 ---
 **Insight action frequency per account — median weekly actions (active accounts only)**
 - **Definition:** Among accounts that had ≥1 insight action in the past 30 days, the median number of insight actions per week. Exclude newly onboarded accounts (< 30 days old) to avoid distortion.
 - **Why it matters:** Frequency signals depth of workflow integration. An account doing one insight action per month is at risk; one doing five per week has built the product into a recurring process. This metric moves before NRR does and identifies expansion candidates.
 - **Leading or lagging?** Leading (predicts expansion and renewal).
 - **How to measure:** Product event log, aggregated to account-week level. Compute rolling median over active account cohort.
 ---
 #### REVENUE
 **Net Revenue Retention — NRR (rolling 12-month)**
 - **Definition:** `(Starting MRR + Expansion MRR - Contraction MRR - Churned MRR) / Starting MRR` for the same account cohort over 12 months. Expressed as a percentage. A B2B analytics product at growth stage should be targeting ≥110% NRR.
 - **Why it matters:** NRR is the compounding engine of B2B SaaS. Above 100% means your existing base grows even with zero new sales. Below 100% means you're on a treadmill. At 500 accounts, NRR is more important than new logo growth because expansion from proven accounts costs a fraction of acquiring new ones.
 - **Leading or lagging?** Lagging — it is the revenue outcome of retention and expansion behaviour.
 - **How to measure:** Billing system (Stripe/Chargebee MRR movements by account, tagged as expansion/contraction/churn). Compute monthly, trend as a rolling 12-month cohort.
 ---
 **Expansion MRR from seat / tier upgrades ($/month)**
 - **Definition:** Monthly recurring revenue added from existing paying accounts upgrading their plan, adding seats, or unlocking add-ons. Excludes new logo revenue.
 - **Why it matters:** At early growth stage with 500 accounts, expansion MRR is both more capital-efficient than new acquisition and a signal that accounts are getting more value over time (they wouldn't pay more otherwise). Track separately from new logo MRR to understand the true growth mix.
 - **Leading or lagging?** Lagging (records the upgrade event), but leads NRR.
 - **How to measure:** Billing system, upgrade events tagged by account. `SUM(new_plan_mrr - old_plan_mrr WHERE upgrade_event IN month)`.
 ---
 **Average Contract Value — ACV by acquisition cohort**
 - **Definition:** Total annual contract value divided by number of accounts, segmented by the quarter the account was acquired. Track how ACV trends across cohorts.
 - **Why it matters:** Rising ACV across cohorts means you're moving upmarket or improving sales qualification. Falling ACV means you're backfilling churn with smaller, riskier accounts. At 500 accounts this is easy to miss in aggregate but visible by cohort.
 - **Leading or lagging?** Lagging.
 - **How to measure:** CRM contract value fields, grouped by account `closed_won_date` cohort quarter.
 ---
 ## 3. Counter-Metrics (Guardrails)
 These three metrics exist to ensure you don't optimise the North Star in ways that quietly damage the business.
 **1. Support ticket volume per active account (weekly)**
 If you optimise for insight action frequency by making exports or alerts easier to trigger, you may inadvertently create confusion (users firing alerts they don't understand, or exporting malformed data). A rising support ticket rate alongside a rising North Star is a warning sign that product quality is being sacrificed for engagement. Target: flat or declining as the account base grows.
 **2. Trial-to-paid conversion rate vs. activation rate gap**
 Track whether your 14-day activation rate and your trial-to-paid conversion rate move together. If activation rises but conversion stays flat, your sales process or pricing is misaligned with the value customers are experiencing — they're getting value but not paying for it. If conversion rises but activation falls, you may be using sales pressure to close accounts that aren't genuinely ready, loading your churn pipeline.
 **3. Percentage of NRR from more than three accounts**
 At 500 accounts, NRR can look healthy while being dangerously concentrated. If three accounts represent >30% of expansion MRR, your NRR metric is masking fragility. Track concentration monthly and flag any single account >10% of expansion MRR for executive review. This is especially critical before a Series A, where investors will stress-test customer concentration.
 ---
 ## 4. Dashboard Recommendation
 ### Exec View — Weekly (5 metrics)
 | Metric | Why here |
 |---|---|
 | Weekly Active Accounts (Insight Actions) | The North Star — top of every review |
 | 90-day rolling account churn rate | Retention health at a glance |
 | Net Revenue Retention (trailing 12-month) | Business compounding signal |
 | New qualified trials entered this week | Leading acquisition signal |
 | Expansion MRR this month vs. last | Revenue momentum |
 ### Team View — Daily (9 metrics)
 *For product, CS, and growth teams:*
 | Metric | Owner |
 |---|---|
 | % new accounts reaching first insight action (14-day window, current cohort) | Product / Onboarding |
 | Time to first insight action — P50 and P90 | Product |
 | Onboarding checklist completion rate | Product / CS |
 | Month-1 account retention (most recent completed cohort) | CS |
 | Insight action frequency per account (active accounts) | Product |
 | Qualified trials started this week | Growth / Sales |
 | Trial-to-paid conversion rate (30-day cohort) | Sales / Growth |
 | Support tickets per active account | CS / Product |
 | Soft churn accounts (active 30+ days ago, zero actions last 14 days) | CS |
 ### Diagnostic View — On Demand
 *Pull these when a top-level metric moves unexpectedly:*
 - Insight action breakdown by action type (export vs. share vs. alert vs. embed) — tells you *which* value moment is driving or declining
 - Activation funnel by onboarding step (where do accounts drop out of the checklist?)
 - Churn cohort analysis by acquisition channel and ICP segment
 - Feature adoption depth map (which dashboard features correlate with retained accounts?)
 - Account health score distribution (a composite of recency, frequency, and depth of insight actions — build this before you need it)
 ---
 ## 5. Weekly Metrics Review — Health Check Questions
 Use these five questions to turn numbers into decisions in your weekly review:
 **1. Is our 14-day activation rate moving in the same direction as time-to-first-insight?**
 If activation rate rises but time-to-first-insight also rises, you may be counting a lower-quality action as an insight event — or onboarding is slower but more accounts are eventually getting there. Investigate the *type* of insight action driving the change.
 **2. Are accounts that activated via a specific onboarding step retaining at a higher rate?**
 Compare Month-1 retention for accounts that completed the full checklist vs. partial completions. If "first alert set" correlates with dramatically higher retention, that step is your real activation moment — consider forcing it earlier in onboarding.
 **3. Is soft churn rising even while hard churn holds flat?**
 Soft churn (paid but dormant) is the canary. If you see accounts going quiet 60–90 days before cancellation, you have a CS intervention window you're not using. A rising soft churn count is a leading indicator of a churn spike in the next quarter.
 **4. Is expansion MRR coming from accounts with high insight action frequency, or from accounts that are being upsold regardless of usage?**
 Expansion driven by high-frequency accounts is durable — those customers are paying more because they're getting more value. Expansion from low-frequency accounts is fragile — it often reverses at renewal. Segment expansion MRR by account activity level monthly.
 **5. Is our NRR concentration increasing or decreasing?**
 Track what share of total NRR (retention + expansion) comes from your top 10 accounts. At 500 accounts this should be gradually diversifying. If it's concentrating, your CS team may be over-investing in whales while mid-market accounts quietly churn — which also distorts every other metric in this framework.
 ---
 ## Quality Check
 - [x] North Star reflects customer value delivered (decisions acted on), not activity or revenue
 - [x] Metric tree covers 4 distinct drivers: Acquisition, Activation,
@@ -0,0 +1,102 @@
 ---
 skill: okr-builder
 title: Okr Builder
 input: "Company objective: become the default analytics tool for startups. For the product team, next quarter. We care about activation, retention, and word-of-mouth growth."
 source: generated by claude-sonnet-4-6
 ---
 ### Q2 2025 OKRs — Product Team
 *North Star assumed from context: Startups that reach "activated" status and stay retained long enough to refer others. All OKRs ladder to becoming the default analytics tool for startups.*
 ---
 **Objective 1: Make the first hour with our product undeniably valuable for every new startup**
 *Why this matters:* Startups evaluate tools fast and ruthlessly. If we don't deliver a clear "aha moment" in the first session, we lose them before retention is even possible — and word-of-mouth never starts.
 | # | Key Result | Baseline | Target | Measurement Method |
 |---|---|---|---|---|
 | KR1 | New user activation rate (reaching "aha moment" milestone) increases | Baseline needed ⚠️ | +25pp above baseline | Product analytics event tracking |
 | KR2 | Percentage of new signups who complete a meaningful analysis within their first session | Baseline needed ⚠️ | 55% of new signups | Session-level event funnel |
 | KR3 | Time-to-first-insight for new users decreases | Baseline needed ⚠️ | Reduce by 40% | Median time from signup to first chart/report event |
 *Owner:* Head of Product / Onboarding Squad
 *Check-in cadence:* Weekly
 ---
 **Objective 2: Turn activated users into loyal, long-term users who build workflows around us**
 *Why this matters:* Activation without retention is a leaky bucket. For startups to call us their "default" tool, using us needs to become habitual — not occasional.
 | # | Key Result | Baseline | Target | Measurement Method |
 |---|---|---|---|---|
 | KR1 | 30-day retention for activated users increases | Baseline needed ⚠️ | 45% 30-day retention | Cohort retention report |
 | KR2 | Weekly active usage rate among retained users (using product 3+ days/week) increases | Baseline needed ⚠️ | 35% of retained users | DAU/WAU ratio in product analytics |
 | KR3 | Involuntary churn due to friction (support-flagged product issues) decreases | Baseline needed ⚠️ | Reduce by 30% | Support ticket tagging + churn reason attribution |
 *Owner:* Head of Product / Retention Squad
 *Check-in cadence:* Weekly
 ---
 **Objective 3: Make our users the engine of our growth in the startup ecosystem**
 *Why this matters:* The fastest path to becoming the default analytics tool for startups is peer recommendation — founders trust other founders more than any marketing channel. We need to engineer the conditions for word-of-mouth, not hope it happens.
 | # | Key Result | Baseline | Target | Measurement Method |
 |---|---|---|---|---|
 | KR1 | Net Promoter Score among activated startup users increases | Baseline needed ⚠️ | NPS reaches 45+ | Quarterly NPS survey (Delighted or equivalent) |
 | KR2 | Percentage of new signups attributing discovery to peer referral or word-of-mouth increases | Baseline needed ⚠️ | 30% of new signups | Attribution survey at signup |
 | KR3 | Number of startups that share a public link, embed, or export created from our product | Baseline needed ⚠️ | 20% of activated users share at least once | Share/export event tracking |
 *Owner:* Head of Product / Growth Squad
 *Check-in cadence:* Weekly
 ---
 ## ⚠️ Health Check — Baseline Data Gaps
 Every KR above is flagged because **no baselines were provided**. Before this OKR set is finalized, the team must pull current-state numbers for each metric. OKRs without baselines cannot be scored at quarter end.
 **Recommended action before Q2 kickoff:**
 | KR | Data Source to Pull Baseline From |
 |---|---|
 | Activation rate | Product analytics (Mixpanel, Amplitude, or equivalent) |
 | First-session analysis completion | Event funnel in product analytics tool |
 | Time-to-first-insight | Median time between signup event and first output event |
 | 30-day retention | Cohort table in product analytics |
 | Weekly active usage rate | DAU/WAU calculation |
 | Involuntary churn (friction) | Support ticket audit, last 60 days |
 | NPS | Run a one-time survey this week if no historical data exists |
 | Word-of-mouth attribution | Add attribution question to signup flow immediately |
 | Public share rate | Instrument share/export events if not already tracked |
 ---
 ## Anti-Pattern Audit
 Before submitting these OKRs, confirm none of the following slipped through:
 - [x] **No output-based KRs** — none of these say "launch X" or "ship Y"
 - [x] **No binary KRs** — every KR is scorable on a 0.0–1.0 scale
 - [x] **No vanity metrics** — downloads, pageviews, and raw signups intentionally excluded
 - [x] **Baselines flagged** — all gaps explicitly called out above
 - [x] **Max 3 KRs per objective** — all objectives within limit
 - [x] **Connected to North Star** — all three objectives ladder to "default analytics tool for startups"
 ---
 ## Scoring Guide
 At the end of Q2, score each KR from 0.0 to 1.0:
 | Score | Meaning |
 |---|---|
 | **0.7–1.0** | Excellent. 0.7 is the sweet spot — if every KR scores 1.0, targets weren't ambitious enough |
 | **0.4–0.6** | Progress made, but missed the target — understand why |
 | **0.0–0.3** | Missed — hold a retrospective before setting Q3 OKRs |
 > **Reminder:** These OKRs are not a performance review. A team that scores 0.7 across the board has had a great quarter. Hitting 1.0 on everything means next quarter's targets need to be harder.
@@ -0,0 +1,44 @@
 ---
 skill: prd-template
 title: PRD Template
 input: "Feature: in-app referral program so existing users invite colleagues and both get a credit. Target: activated B2B users. Goal: grow signups 15% in Q3."
 source: representative example (hand-written)
 ---
 # PRD — In-App Referral Program
 **Status:** Draft · **Owner:** PM · **Target release:** Q3
 ## Problem
 Activated B2B users are our happiest segment, but we have no built-in way for them to bring colleagues in. Word-of-mouth happens off-platform and we capture none of it. Signup growth is flat at ~5%/qtr through paid channels alone.
 ## Goals
 - Grow new signups **+15% in Q3** via referrals.
 - Referral signups activate at **≥ the current baseline** (don't trade quality for volume).
 ## Non-goals
 - Public/affiliate program for non-customers (separate effort).
 - Cash payouts — credits only, this release.
 ## User stories
 - *As an activated user,* I can invite a colleague by email or shareable link and see the reward I'll get.
 - *As an invited colleague,* I land on a page that explains the credit and makes signup one click.
 - *As a referrer,* I can see the status of my invites (sent / joined / credited).
 ## Functional requirements
 1. Generate a unique referral link per user.
 2. Both referrer and referee receive a **$X account credit** when the referee activates (not just signs up).
 3. Referral dashboard: invites sent, joined, credit earned.
 4. Fraud guardrail: credit only on activation + payment-method-on-file.
 ## Success metrics
 | Metric | Baseline | Target |
 |--------|:--------:|:------:|
 | New signups from referrals | 0 | 15% of total Q3 signups |
 | Referral → activation rate | — | ≥ overall baseline |
 | Referrals sent per active user | — | ≥ 0.3 |
 ## Open questions
 - Credit amount — what's the payback period at our LTV? (needs Finance)
 - Does "activation" or "first payment" trigger the credit? (affects fraud + cost)
 - Self-serve only, or also sales-assisted accounts?
@@ -0,0 +1,25 @@
 ---
 skill: rice-prioritisation
 title: RICE Prioritisation
 input: "Rank for next quarter: 1) Onboarding redesign — ~5000 users/qtr, big activation impact, ~3 person-months. 2) Dark mode — ~8000 users want it, low impact, ~1 person-month. 3) SSO for enterprise — ~400 accounts, high deal impact, ~4 person-months, low confidence."
 source: representative example (hand-written)
 ---
 # RICE Prioritisation — Next Quarter
 RICE = (Reach × Impact × Confidence) ÷ Effort. Impact scale: 3 = massive, 2 = high, 1 = medium, 0.5 = low.
 | # | Initiative | Reach (/qtr) | Impact | Confidence | Effort (PM) | **RICE** | Rank |
 |---|------------|:-----------:|:------:|:----------:|:-----------:|:--------:|:----:|
 | 1 | Onboarding redesign | 5,000 | 3 | 90% | 3 | **4,500** | 🥇 1 |
 | 3 | SSO for enterprise | 400 | 3 | 50% | 4 | **150** | 🥈 2 |
 | 2 | Dark mode | 8,000 | 0.5 | 100% | 1 | **4,000**? | — |
 > ⚠️ Dark mode's raw score (4,000) looks high purely from reach. The low impact (0.5) is doing the real work — it's a satisfaction nicety, not a needle-mover. Flag for a "quick win" lane, not the strategic backlog.
 ## Recommendation
 1. **Onboarding redesign** — clear top priority: high reach, massive impact, well understood (90% confidence).
 2. **SSO** — high deal impact but the score is dragged down by low confidence and high effort. **De-risk first**: validate the enterprise pipeline ($ tied to SSO) before committing 4 person-months.
 3. **Dark mode** — ship it as a low-cost quick win between larger projects; don't let its reach inflate its priority.
 **Watch-out:** the SSO confidence is the swing factor. A week of sales discovery could move it from 50% → 80% and change the ranking.
@@ -0,0 +1,104 @@
 ---
 skill: roadmap-narrative
 title: Roadmap Narrative
 input: "H2 roadmap for a B2B analytics product. Themes: self-serve onboarding, an integrations marketplace, and enterprise SSO/audit logs. Audience: the exec team and key customers. We want the story, not a feature list."
 source: generated by claude-sonnet-4-6
 ---
 # Product Roadmap: H2 2025
 ---
 **Strategic Context:**
 The B2B analytics market is consolidating around platforms that win on time-to-value, not feature count. Buyers are making decisions faster, involving fewer specialist resources, and expecting enterprise-grade security as a baseline rather than a premium. Our H2 roadmap responds to this directly: we are removing every friction point that slows a new customer from experiencing value, building the ecosystem connectivity that makes us indispensable to their existing stack, and closing the enterprise compliance requirements that are currently blocking deals from closing. These three bets are not independent — they compound. A customer who onboards in an hour, connects their existing tools in a day, and clears their security review without a custom procurement cycle is a customer who expands.
 ---
 ## Theme 1: The First Hour Matters More Than the First Month
 **Strategic Rationale:**
 Right now, a new customer's first experience with the product is mediated by us — by onboarding calls, setup guides, and human handholding. That's expensive for us and slow for them. The customers most likely to become long-term, high-value accounts are the ones who reach their first meaningful insight quickly and independently. Every day of delay between signup and "aha moment" is churn risk we're creating ourselves. This theme is about getting out of the customer's way.
 **Initiatives:**
 - Guided, in-product onboarding flow with milestone-based progression
 - Pre-built dashboard templates by industry and use case
 - Interactive product tour for first-time users (no sales touch required)
 - Contextual in-app help and empty-state guidance
 **Primary Metric Impacted:**
 Time-to-first-insight (target: reduce from current baseline to under 60 minutes for self-serve accounts). Secondary: free-to-paid conversion rate.
 **Progression Link to H2:**
 This work ships early in Q3. It is a prerequisite for Theme 2 — an integrations marketplace only delivers value to customers who are already active in the product. We cannot drive integration adoption against a leaky onboarding funnel.
 **Dependencies:**
 Customer data infrastructure to support template personalisation by vertical. Alignment with Marketing on top 5 use-case categories to template first.
 ---
 ## Theme 2: Your Stack, Not a Replacement for It
 **Strategic Rationale:**
 No analytics product wins by asking customers to abandon their existing tools. The companies we want to own — mid-market and enterprise — run on Salesforce, HubSpot, Snowflake, Databricks, and a dozen others. If connecting our product to those systems requires an engineer and a two-week project, we are selling to the wrong buyer and creating a fragile deployment. An integrations marketplace shifts our position from "analytics tool" to "analytics layer" — the connective tissue across the customer's data ecosystem. This is the difference between a product customers use and infrastructure customers depend on.
 **Initiatives:**
 - Self-serve integrations marketplace (launch with 15 native connectors, prioritised by customer demand)
 - Partner-built integration framework with documentation and certification
 - Integration health monitoring and alerting (customers know when data stops flowing before we tell them)
 - Usage-based integration analytics surfaced to customers (so they see ROI from connectivity)
 **Primary Metric Impacted:**
 Integrations per account (target: 2+ integrations per active account by end of H2). Predictive indicator for net revenue retention.
 **Progression Link to H2:**
 This theme runs through Q3 and into Q4. The partner-built framework is a Q4 deliverable — it requires the native connector infrastructure to exist first. By Q4, we want external partners building on this platform, not just us shipping to it.
 **Dependencies:**
 Partnership agreements with top-5 integration targets before build begins. Engineering capacity sequenced after onboarding flow ships — these two themes cannot run in parallel at full velocity.
 ---
 ## Theme 3: Enterprise Deals Should Not Die in Legal
 **Strategic Rationale:**
 We are losing enterprise deals — or delaying them by quarters — not because we lose on product capability, but because we cannot pass the security review. SSO, audit logs, and role-based access are not differentiators in enterprise software; they are the price of admission. Every month we ship without them is a month the enterprise segment of our pipeline is effectively frozen. This theme is about unlocking a revenue category that is already warming up to us — and removing the last objection that procurement teams are trained to find.
 **Initiatives:**
 - SAML/OIDC Single Sign-On with major identity providers (Okta, Azure AD, Google Workspace)
 - Immutable audit logs with admin export and configurable retention
 - Role-based access control granularity (read, edit, admin, custom roles)
 - Security documentation and trust portal (self-serve for customer IT and legal teams)
 **Primary Metric Impacted:**
 Enterprise deal cycle length (target: reduce average security review delay by 50%). Direct impact on enterprise pipeline conversion and ACV.
 **Progression Link to H2:**
 This is a Q4-weighted theme. The business case for prioritising it after onboarding and integrations is deliberate: SSO and audit logs serve customers who are already deciding to buy. The onboarding and marketplace work serves customers earlier in their journey. We are sequencing for maximum H2 impact — convert the pipeline we already have while building the funnel for H1 2026.
 **Dependencies:**
 Legal and security team sign-off on audit log architecture before build. Coordination with Sales to identify the 10 enterprise prospects whose deals will accelerate immediately upon release — these are our launch reference customers.
 ---
 ## What's Not on the Roadmap — and Why
 **Advanced AI/ML Features:**
 Every competitor is announcing AI capabilities right now. We have deliberately kept AI features out of H2, not because the opportunity is wrong, but because it is the wrong moment for us. We are not yet in a position where customers are extracting full value from the data they already have in the product. Shipping AI-powered insights to customers who haven't completed basic onboarding is noise, not value. We will revisit this as a Q1 2026 theme, once the self-serve foundation is in place and we have meaningful usage data to build against.
 **Mobile Application:**
 Mobile analytics has genuine demand signals in our customer base. It is not on the roadmap because it is a significant engineering investment that does not accelerate any of our three strategic themes. It serves an adjacent workflow without deepening the core use case. This is a parking lot item, not a rejection — we want to understand the use case more precisely before committing.
 **Custom Reporting Builder:**
 A flexible, open-ended report builder appears repeatedly in customer feedback. We are consciously choosing not to build it in H2. The pre-built templates in Theme 1 will serve 80% of the stated need for the customer segments we are growing. A custom builder before we understand which templates customers actually use would be building the wrong thing at significant cost. Template adoption data from Q3 will inform whether and how we scope this in 2026.
 ---
 ## Executive Summary
 *Shareable for all-hands, board updates, and key customer conversations.*
 > In H2, we are making three bets that compound. First, we are making it possible for any new customer to reach their first real insight in under an hour — without a call, without a guide, without us. Second, we are becoming the analytics layer that connects to the tools customers already depend on, starting with 15 native integrations and a platform for partners to build on. Third, we are removing the last objection that blocks enterprise deals — SSO and audit logs — so that security reviews become a formality rather than a graveyard. Taken together, these are not three separate features — they are one strategy: make it easier to start, impossible to leave, and safe to buy at any scale.
 ---
 *Quality note for internal use: Every initiative above maps to one of three themes. No orphaned items. The "What's Not on the Roadmap" section includes rationale rooted in strategic sequencing, not resource constraints alone. The executive summary is written to be repeated accurately after a single reading — test it by reading it once and asking a colleague to play it back.*
@@ -0,0 +1,155 @@
 ---
 skill: sprint-planning
 title: Sprint Planning
 input: "Team of 5, 2-week sprint, average velocity 30 points, one engineer out 3 days. Backlog: checkout redesign (8), payment retries (5), analytics events (3), bug bash (3), API rate limiting (5)."
 source: generated by claude-sonnet-4-6
 ---
 # Sprint 14 — 2-Week Planning Output
 ---
 ## Step 1: Capacity Calculation
 Running the capacity model before touching the backlog.
 ```
 Team:               5 engineers
 Sprint days:        10 (2 weeks)
 Focus hours/day:    6
 One engineer out:   3 days → 3 lost focus-days
 Total theoretical focus-days:  5 × 10 = 50
 Adjusted focus-days:           50 − 3 = 47
 Availability factor:           47 ÷ 50 = 0.94 → capped at 0.85 (meetings, admin, Slack)
 Velocity-adjusted capacity:    30 × 0.85 = 25.5 → 26 points available
 Recommended commitment (80%):  26 × 0.80 = ~21 points
 No carry-overs reported → full 21 points available for new work.
 ```
 > **Equivalent CLI call:**
 > ```bash
 > python3 scripts/capacity_calculator.py \
 >   --team 5 --days 10 --velocity 30 \
 >   --availability 0.85 --carryover 0
 > # → Available: 26 pts | Recommended commit: 21 pts | Capacity for new work: 21 pts
 > ```
 ---
 ## Step 2: Backlog Review & Story Assessment
 | Story | Raw Est. | Flag | Decision |
 |---|---|---|---|
 | Checkout redesign | 8 | ⚠️ **Must split** — 8 pts, high uncertainty | Split below |
 | Payment retries | 5 | ✅ Ready | Include |
 | API rate limiting | 5 | ✅ Ready | Include |
 | Analytics events | 3 | ✅ Ready | Include |
 | Bug bash | 3 | ✅ Ready | Include |
 **Checkout redesign (8) — mandatory split before committing:**
 | Sub-story | Points | Rationale |
 |---|---|---|
 | Checkout redesign — layout & routing | 3 | New page structure, no logic changes |
 | Checkout redesign — form validation & error states | 3 | Client-side logic, known patterns |
 | Checkout redesign — integration with order service | 5 | API contract changes, needs design sign-off |
 > The integration slice (5 pts) carries dependency risk — see Risks section. Pull it only if the API contract is confirmed before sprint start; otherwise defer to Sprint 15.
 ---
 ## Sprint 14 — 3 June to 14 June 2025
 ---
 **Sprint Goal:**
 > "This sprint we will deliver a reliable, polished checkout and payment experience so that users can complete purchases without hitting dead ends, measured by zero payment-retry silent failures in production and checkout task-completion rate ≥ 90% in usability smoke test."
 ---
 **Team Capacity:** 21 story points committed (26 available, 80% rule applied; 5-person team at 85% availability)
 ---
 ### Sprint Backlog
 | Priority | Story | Points | Owner | Acceptance Criteria |
 |---|---|---|---|---|
 | 1 | Checkout redesign — layout & routing | 3 | TBD | Given a logged-in user, when they initiate checkout, then they land on the redesigned page with correct route and no regression on existing order flow |
 | 2 | Checkout redesign — form validation & error states | 3 | TBD | Given invalid input, when the user submits, then inline errors appear per field without page reload; given valid input, no false errors shown |
 | 3 | Payment retries | 5 | TBD | Given a failed payment attempt, when the system retries, then it does so up to 3 times with exponential backoff; failed retries are logged with error code; user sees accurate status message |
 | 4 | Analytics events | 3 | TBD | Given a user completes any checkout step, when the step fires, then the correct event name, user ID, and timestamp appear in the analytics pipeline within 30 seconds; no duplicate events |
 | 5 | Bug bash | 3 | Whole team | Given the bug bash session (Day 8), when complete, then all P1 bugs discovered are filed with repro steps; at least 50% of P2s have owners assigned |
 | — | API rate limiting | 5 | **Deferred to Sprint 15** | See rationale below |
 | — | Checkout redesign — integration with order service | 5 | **Deferred to Sprint 15** | Blocked on API contract confirmation |
 **Committed total: 17 points**
 *(Intentionally under the 21-point ceiling — see rationale below)*
 ---
 ### Carry-Overs from Previous Sprint
 None reported. ✅ Clean slate — full recommended capacity applied to new work.
 ---
 ### Capacity Rationale & Deferred Items
 **Why only 17 of 21 points committed:**
 The two deferred items (API rate limiting — 5 pts, checkout integration — 5 pts) together total 10 pts and both carry pre-conditions that aren't confirmed yet. Pulling them in speculatively would risk the sprint goal. The 4-point buffer below the 21-point ceiling absorbs:
 - Unplanned bug fixes surfacing from the bug bash
 - The absent engineer's knowledge areas (if they own context on payment systems)
 - Any scope creep on the checkout layout work
 If blockers resolve by **Day 2**, API rate limiting can be pulled in from the backlog to fill the buffer.
 ---
 ### Risks & Dependencies
 | # | Risk | Severity | Mitigation |
 |---|---|---|---|
 | 1 | Checkout integration slice depends on confirmed API contract with order service | 🔴 High | Confirm contract with order service team **before Day 1**. If not confirmed by EOD Day 1, defer entire integration slice to Sprint 15. |
 | 2 | Engineer absent 3 days — if they own payment systems context, retry logic is at risk | 🟡 Medium | Assign payment retries story to a different engineer now; schedule 30-min knowledge handoff before absence starts. |
 | 3 | Bug bash (Day 8) may surface P1 issues requiring immediate sprint work | 🟡 Medium | Reserve the 4-pt buffer explicitly for bug bash fallout. Do not commit buffer to new stories before Day 8. |
 | 4 | Analytics events story has no confirmed data-pipeline owner outside the team | 🟡 Medium | Identify pipeline contact on Day 1; confirm event schema is agreed before story is picked up. |
 | 5 | Two 5-point stories deferred — if Sprint 15 also has high load, rate limiting may slip a second time | 🟢 Low | Flag both deferred items to Product as Sprint 15 P1 carry-ins now, not after this sprint closes. |
 ---
 ### Stories Flagged for Missing Acceptance Criteria
 All five committed stories have acceptance criteria defined above. ✅
 > If any AC above was written without Product Owner confirmation, **treat it as a draft** and validate during the planning meeting — do not carry a story into the sprint on assumed criteria.
 ---
 ## Sprint Planning Agenda — Day 1, 09:00–11:00
 | Time | Block | Owner | Notes |
 |---|---|---|---|
 | 09:00–09:10 | Open: sprint goal read-aloud, capacity recap | Scrum Master | Confirm 17-pt commitment, name the 4-pt buffer explicitly |
 | 09:10–09:25 | Risk walk-through | Scrum Master | Resolve Risk #1 (API contract) live if possible — go/no-go on checkout integration |
 | 09:25–09:55 | Backlog walk — committed stories | Product Owner | Each story read aloud; team confirms AC or flags changes |
 | 09:55–10:25 | Estimation confirmation & story splitting | Whole team | Validate checkout sub-stories; challenge any estimate that's changed since grooming |
 | 10:25–10:50 | Ownership assignment | Whole team | Every story gets a named owner; absent engineer's stories reassigned |
 | 10:50–10:58 | Buffer & deferred item decision | Product Owner + Tech Lead | Confirm API rate limiting pull-in criteria; set Day 2 deadline for decision |
 | 10:58–11:00 | Commit & close | Scrum Master | Read sprint goal aloud one final time; team verbal confirmation |
 ---
 ## Quality Checklist
 - [x] Sprint goal is outcome-focused and pass/fail scoreable at sprint end
 - [x] Capacity uses 85% availability factor, not theoretical 100%
 - [x] 80% commit rule applied (17 of 21 recommended points committed)
 - [x] Every committed story has an acceptance criterion
 - [x] 8-point story split before entering sprint
 - [x] No carry-overs from previous sprint — accounted for in capacity
 - [x] Absent engineer's impact modelled in availability factor and risk log
 - [x] Deferred items flagged to Product as Sprint 15 pre-loads
@@ -0,0 +1,78 @@
 ---
 skill: stakeholder-update
 title: Stakeholder Update
 input: "Weekly update for sales, support, and exec stakeholders on the checkout revamp. Status: 10% rollout live, conversion +4%, one payments edge case under investigation, full launch gated on a Legal PCI review due Tuesday."
 source: generated by claude-sonnet-4-6
 ---
 # Product Update: Checkout Revamp
 **Week of [Current Week]**
 ---
 ## BLUF
 **Status**: 🟡 At Risk
 **Key Takeaway**: Checkout revamp is live at 10% traffic with early conversion gains (+4%), but full launch is gated on a Legal PCI review due Tuesday and one payments edge case under active investigation.
 **Action Needed**: Legal to confirm PCI review timeline by Tuesday EOD; Support leads to review edge case escalation path below.
 ---
 ## Progress Summary
 - Launched checkout revamp to 10% of traffic — rollout is stable with no incidents
 - Conversion rate up +4% in early data vs. control group
 - Identified one payments edge case (see Risks); investigation underway with no customer impact confirmed yet
 - Full launch hold in place pending Legal PCI compliance sign-off
 ---
 ## Key Metrics
 | Metric | Current | Target | Trend | Status |
 |--------|---------|--------|-------|--------|
 | Rollout Coverage | 10% | 100% | → | 🟡 |
 | Conversion Rate (new flow) | +4% vs. control | +5% | ↑ | 🟢 |
 | Payments Edge Case Incidents | Under investigation | 0 | — | 🟡 |
 | PCI Review Status | In review | Approved by Tue | → | 🟡 |
 ---
 ## Risks & Blockers
 **HIGH: PCI Compliance Review (Legal Gate)**
 - **Impact**: Full launch cannot proceed until Legal signs off; delays 100% rollout beyond Tuesday if review slips
 - **Mitigation**: Review is scheduled and in progress; Product is on standby to address any findings immediately
 - **Help Needed**: **Legal stakeholders** — please confirm Tuesday EOD as the firm completion date, or flag if timeline is at risk by Monday so the team can adjust launch planning
 **MEDIUM: Payments Edge Case Under Investigation**
 - **Impact**: Specific scenario in the payments flow is producing unexpected behavior; scope and customer impact not yet confirmed
 - **Mitigation**: Engineering is actively investigating; rollout held at 10% as a precaution; no customer-facing errors or failed transactions confirmed to date
 - **Help Needed**: **Support leads** — if you receive any customer complaints related to checkout errors or payment failures this week, escalate to [PM/Engineering contact] immediately rather than routing through standard queue
 ---
 ## Upcoming Milestones
 **Next 30 Days:**
 - PCI review complete (Tuesday, [date])
 - Payments edge case resolved (targeting this week)
 - Full 100% rollout (targeting end of week, contingent on above)
 - First full-week conversion data at scale (following week)
 **Next 90 Days:**
 - Post-launch iteration based on production data
 - Conversion optimization pass based on full-traffic learnings
 - Evaluate checkout improvements for mobile app
 ---
 ## What Each Team Should Know
 **Sales**: Early conversion data is encouraging (+4%) — this is directionally positive for pipeline-to-close metrics, though it's too early to quote confidently. Hold on using checkout performance in customer conversations until we're at full rollout.
 **Support**: One payments edge case is under investigation. Watch for any customer-reported checkout or payment errors and escalate immediately (do not hold for standard ticket review). We'll send a separate escalation guide by EOD today.
 **Exec**: We're on track for full launch this week if Legal clears PCI by Tuesday and the engineering investigation closes cleanly. If either slips, expect a 3–5 day delay to launch. No decision is needed today — Monday will be the go/no-go checkpoint.
 ---
 **Questions?** Reply here or ping [PM name] directly on Slack.
 **Next update**: Monday go/no-go decision summary.
@@ -0,0 +1,479 @@
 ---
 skill: user-story-writer
 title: User Story Writer
 input: "Feature: let users export a dashboard to PDF and schedule a recurring email of it. Users are analysts and their managers. Keep stories small and testable with clear acceptance criteria."
 source: generated by claude-sonnet-4-6
 ---
 # Epic Decomposition: Dashboard Export & Scheduled Email
 **Epic:** Dashboard Distribution
 **Goal:** Enable analysts and managers to export dashboards as PDFs and automate recurring email delivery, reducing manual reporting overhead.
 ---
 ## Story Map
 | # | Story | Priority | Dependencies |
 |---|---|---|---|
 | 1 | Export current dashboard view as PDF | P1 | — |
 | 2 | Configure PDF export options (orientation, page range) | P2 | #1 |
 | 3 | Send dashboard PDF as a one-off email | P2 | #1 |
 | 4 | Create a recurring email schedule for a dashboard | P1 | #1, #3 |
 | 5 | View and manage existing schedules | P1 | #4 |
 | 6 | Edit an existing schedule | P2 | #5 |
 | 7 | Delete a schedule | P1 | #5 |
 | 8 | Receive and view the scheduled email | P1 | #4 |
 **MVP sprint (P1):** Stories 1, 4, 5, 7, 8
 **Follow-on sprint (P2):** Stories 2, 3, 6
 ---
 ---
 ## Story 1: Export current dashboard view as PDF
 **Epic:** Dashboard Distribution
 **Story ID:** _(blank)_
 **Priority:** P1
 **Story points:** _(blank)_
 ---
 ### User Story
 > **As an** analyst,
 > I want to **export my current dashboard view as a PDF**,
 > so that I can **share a static snapshot of the data with stakeholders who don't have system access**.
 ---
 ### Context
 This is the foundational story for the epic — all scheduling and email features depend on the PDF generation capability. The export must reflect exactly what the analyst sees on screen at the moment of export, including any filters, date ranges, or segments they have applied. This is a manual, on-demand action.
 ---
 ### Acceptance Criteria
 **AC1: Export button is accessible on the dashboard**
 ```
 Given I am viewing any dashboard
 When I look at the dashboard toolbar
 Then I can see an "Export" button (or icon with label) that is keyboard-accessible
 ```
 **AC2: Successful PDF download**
 ```
 Given I am viewing a dashboard with at least one populated widget
 When I click "Export" and confirm the action
 Then a PDF file downloads to my browser's default download location
  And the filename follows the pattern: [Dashboard Name]_[YYYY-MM-DD].pdf
  And the PDF renders all visible widgets at their current state
 ```
 **AC3: Applied filters are preserved in the export**
 ```
 Given I have applied a date filter and a segment filter to the dashboard
 When I export to PDF
 Then the exported PDF reflects only the filtered data, not the unfiltered default view
  And the active filters are displayed as a summary line at the top of the PDF
 ```
 **AC4: Loading or processing state shown during export**
 ```
 Given I have clicked "Export"
 When the PDF is being generated (taking more than 1 second)
 Then a loading indicator is shown and the Export button is disabled to prevent duplicate requests
  And once the download begins, the loading state clears
 ```
 **AC5: Dashboard with no data**
 ```
 Given I am viewing a dashboard where all widgets return no data (e.g. filters return zero results)
 When I export to PDF
 Then the export still succeeds
  And each empty widget renders with its title and an "No data available" placeholder — no blank white boxes
 ```
 ---
 ### Edge Cases
 - [ ] **Very long dashboard:** Dashboard taller than one page — PDF must paginate cleanly, with no widget cut off mid-render across pages
 - [ ] **Dashboard with a live video or animation widget:** Static snapshot only — render the widget's still/placeholder state with a note "Live content not available in export"
 - [ ] **Slow network during generation:** If PDF generation exceeds 30 seconds, cancel and show an error — "Export failed. Please try again or contact support."
 - [ ] **Browser popup blocker:** If the download is blocked, show an inline message with a manual "Click here to download" fallback link
 - [ ] **Analyst has view-only permissions:** Export is still permitted — exporting does not require edit rights
 ---
 ### Out of Scope
 - Choosing page orientation or paper size (Story 2)
 - Emailing the PDF directly from this action (Story 3)
 - Scheduling recurring exports (Story 4)
 - Exporting to formats other than PDF (not in this epic)
 ---
 ### Definition of Done
 - [ ] Acceptance criteria all pass
 - [ ] Edge cases handled or explicitly deferred with a ticket raised
 - [ ] Unit tests for PDF generation logic; E2E test covering happy path download
 - [ ] PDF renders correctly in Adobe Acrobat, Chrome PDF viewer, and macOS Preview
 - [ ] Export button is keyboard-navigable and labelled for screen readers (aria-label)
 - [ ] Loading state prevents duplicate submissions
 - [ ] Error states have approved copy
 - [ ] Tested on Chrome, Firefox, Safari, and Edge
 - [ ] Product and design reviewed in staging
 ---
 ---
 ## Story 2: Configure PDF export options
 **Epic:** Dashboard Distribution
 **Story ID:** _(blank)_
 **Priority:** P2
 **Story points:** _(blank)_
 ---
 ### User Story
 > **As an** analyst,
 > I want to **choose page orientation and which widgets to include before exporting**,
 > so that I can **produce a PDF that fits the layout expected by the recipient without unnecessary content**.
 ---
 ### Context
 Some analysts share focused reports with a subset of dashboard widgets — for example, only revenue widgets for a finance audience. This story adds an options step before the export from Story 1 triggers. It must not slow down users who want the default full export — the quick path must remain one click.
 ---
 ### Acceptance Criteria
 **AC1: Export options panel opens on demand**
 ```
 Given I am viewing a dashboard
 When I click the dropdown arrow next to the "Export" button
 Then an options panel appears with controls for: page orientation (Portrait / Landscape) and widget selection (all selected by default)
 ```
 **AC2: Page orientation applies to the PDF**
 ```
 Given I have selected "Landscape" orientation in the export options
 When I confirm the export
 Then the downloaded PDF is rendered in landscape orientation
  And all widgets reflow to fit the wider page width
 ```
 **AC3: Deselecting widgets excludes them from the PDF**
 ```
 Given I have unchecked 2 of 6 widgets in the export options panel
 When I confirm the export
 Then the downloaded PDF contains only the 4 selected widgets
  And the excluded widgets leave no blank space in the PDF layout
 ```
 **AC4: Default export (no options selected) unchanged**
 ```
 Given I click the main "Export" button directly (not the dropdown)
 When the PDF downloads
 Then it uses the last-used orientation setting (defaulting to Portrait on first use)
  And includes all widgets — no options panel appears
 ```
 **AC5: Attempting to export with zero widgets selected**
 ```
 Given I have deselected all widgets in the options panel
 When I attempt to confirm the export
 Then the confirm button is disabled
  And an inline message reads "Select at least one widget to export"
 ```
 ---
 ### Edge Cases
 - [ ] **Single-widget dashboard:** Widget selection shows one item — deselecting it triggers AC5 immediately
 - [ ] **Options panel on small screen / tablet viewport:** Panel must be scrollable if content overflows; all controls must remain usable at 768px width
 - [ ] **User changes orientation after deselecting widgets:** Widget selection state must persist through orientation toggle
 ---
 ### Out of Scope
 - Custom margins, font sizes, or branding (not in this epic)
 - Saving export preferences per dashboard (not in this epic)
 - Export formats other than PDF
 ---
 ### Definition of Done
 - [ ] Acceptance criteria all pass
 - [ ] Options panel is keyboard-navigable; all inputs are properly labelled
 - [ ] Orientation preference persisted in local storage for default (not a backend call)
 - [ ] Unit tests for widget selection and orientation logic
 - [ ] Product and design reviewed in staging
 ---
 ---
 ## Story 3: Send a dashboard PDF as a one-off email
 **Epic:** Dashboard Distribution
 **Story ID:** _(blank)_
 **Priority:** P2
 **Story points:** _(blank)_
 ---
 ### User Story
 > **As an** analyst,
 > I want to **email the current dashboard PDF directly to one or more recipients from within the app**,
 > so that I can **share a report instantly without downloading it and attaching it manually**.
 ---
 ### Context
 This is a precursor pattern to the scheduling feature (Story 4) — the same email composition UI will be reused for scheduling. Recipients can be internal users (selectable from a directory) or external email addresses typed manually. This story is one-off only — no recurrence.
 ---
 ### Acceptance Criteria
 **AC1: Email send option is accessible**
 ```
 Given I am viewing a dashboard
 When I click the "Export" dropdown
 Then I can see a "Send by email" option alongside the PDF download option
 ```
 **AC2: Email composition form opens**
 ```
 Given I click "Send by email"
 When the form opens
 Then I can enter: one or more recipient email addresses, an optional subject line (pre-filled with "[Dashboard Name] — [Today's Date]"), and an optional message body
 ```
 **AC3: Successful email delivery confirmation**
 ```
 Given I have entered at least one valid email address and clicked "Send"
 When the system processes the request
 Then a success toast appears: "Dashboard sent to [n] recipient(s)"
  And the PDF attached to the email reflects the dashboard state at the moment "Send" was clicked
 ```
 **AC4: Invalid email address blocked**
 ```
 Given I have typed an invalid email address (e.g. "notanemail")
 When I attempt to submit the form
 Then the form does not submit
  And an inline error highlights the invalid field: "Enter a valid email address"
 ```
 **AC5: Email with no recipient blocked**
 ```
 Given the recipient field is empty
 When I click "Send"
 Then the Send button remains disabled
  And the recipient field shows helper text: "Add at least one recipient"
 ```
 **AC6: Recipient from internal directory**
 ```
 Given I begin typing a colleague's name in the recipient field
 When at least 2 characters are entered
 Then a dropdown suggests matching internal users by name and email
  And selecting a suggestion populates their email address in the field
 ```
 ---
 ### Edge Cases
 - [ ] **Recipient list exceeds 50 addresses:** Show a warning — "Sending to more than 50 recipients. Consider scheduling a distribution list instead." Allow user to proceed or cancel
 - [ ] **Email delivery failure (bounce or SMTP error):** Show an error notification — "Email could not be delivered to [address]. Please check the address and retry." Do not silently fail
 - [ ] **PDF generation fails before send:** Do not send the email — surface the PDF generation error first; do not send a blank or broken attachment
 - [ ] **User closes the form mid-composition:** Show a "Discard changes?" confirmation if the recipient field is populated
 ---
 ### Out of Scope
 - Scheduling recurring emails (Story 4)
 - Tracking email open rates or delivery receipts (not in this epic)
 - Sending on behalf of another user
 - Distribution list management
 ---
 ### Definition of Done
 - [ ] Acceptance criteria all pass
 - [ ] Email delivery handled via existing transactional email service (confirm provider with engineering before starting)
 - [ ] Recipient autocomplete is accessible via keyboard
 - [ ] Subject line and body sanitised before send (no script injection via message body)
 - [ ] Unit tests for validation logic; integration test for email send flow
 - [ ] Product and design reviewed in staging
 ---
 ---
 ## Story 4: Create a recurring email schedule for a dashboard
 **Epic:** Dashboard Distribution
 **Story ID:** _(blank)_
 **Priority:** P1
 **Story points:** _(blank)_
 ---
 ### User Story
 > **As a** manager,
 > I want to **set up a recurring email schedule that automatically sends a dashboard PDF to my team**,
 > so that I can **ensure my team receives up-to-date reports without me manually sending them each time**.
 ---
 ### Context
 Managers currently export dashboards and email them manually — often weekly or monthly. This story automates that workflow. The schedule runs a PDF snapshot at the configured time and emails it. The dashboard state at send-time is used — not the state at schedule-creation time. The scheduler must support timezone selection because teams are distributed.
 ---
 ### Acceptance Criteria
 **AC1: Schedule creation entry point is accessible**
 ```
 Given I am viewing a dashboard
 When I click the "Export" dropdown
 Then I can see a "Schedule email" option
 ```
 **AC2: Schedule form captures required fields**
 ```
 Given I click "Schedule email"
 When the schedule creation form opens
 Then I can configure:
  - Recipient email addresses (at least one required)
  - Frequency: Daily / Weekly / Monthly
  - Day of week (if Weekly) or day of month (if Monthly)
  - Time of day (hour and minute)
  - Timezone (searchable dropdown, defaults to my account timezone)
  - Optional subject line and message body
 ```
 **AC3: Schedule saves successfully**
 ```
 Given I have completed all required fields
 When I click "Save schedule"
 Then a success message confirms: "Schedule created. First email will send on [Next Occurrence Date] at [Time] [Timezone]"
  And the schedule appears in my Schedules list (Story 5)
  And no email is sent immediately
 ```
 **AC4: Frequency — daily schedule**
 ```
 Given I create a schedule with frequency set to "Daily" at 08:00 Europe/London
 When the schedule runs
 Then an email is sent every calendar day at 08:00 London time
  And daylight saving time transitions are handled automatically (the time stays 08:00 local, not UTC-fixed)
 ```
 **AC5: Frequency — weekly schedule**
 ```
 Given I create a schedule with frequency "Weekly", day "Monday", time 09:00
 When Monday arrives at 09:00 in the specified timezone
 Then the email is sent
  And the next send date shown in the Schedules list updates to the following Monday
 ```
 **AC6: Frequency — monthly schedule**
 ```
 Given I create a schedule with frequency "Monthly", day of month "31"
 When a month has fewer than 31 days (e.g. February)
 Then the email sends on the last day of that month
  And the schedule does not skip the month or error
 ```
 **AC7: Required field validation**
 ```
 Given I attempt to save a schedule with no recipients entered
 When I click "Save schedule"
 Then the form does not submit
  And the recipient field is highlighted with: "Add at least one recipient"
 ```
 ---
 ### Edge Cases
 - [ ] **Analyst (not manager) creates a schedule:** Analysts can also create schedules for their own dashboards — role restriction is not applied here; permission model is dashboard-level, not role-level
 - [ ] **Dashboard is deleted after a schedule is created:** Send the scheduled email team a notification — "Dashboard '[Name]' has been deleted. Your schedule '[Name]' has been paused." Do not silently fail
 - [ ] **Dashboard owner changes access permissions:** If the schedule creator loses access to the dashboard, pause the schedule and notify them
 - [ ] **Two schedules created for the same dashboard, same time, same recipients:** Allow it — do not deduplicate silently; user is responsible for managing duplicates (they'll see both in Story 5)
 - [ ] **Timezone not set on user account:** Default to UTC and prompt the user to confirm with a banner: "We've defaulted to UTC. Update your timezone in account settings."
 ---
 ### Out of Scope
 - Editing an existing schedule (Story 6)
 - Deleting a schedule (Story 7)
 - Viewing schedule history / delivery logs (not in this epic)
 - Scheduling exports to formats other than PDF
 - Setting an end date for a schedule (not in this epic — schedules run until manually deleted)
 ---
 ### Definition of Done
 - [ ] Acceptance criteria all pass
 - [ ] Timezone handling validated for DST transitions (London, New York, Sydney at minimum)
 - [ ] Monthly "short month" edge case has a passing automated test
 - [ ] Schedule is persisted in the database — survives server restart
 - [ ] Unit tests for recurrence calculation logic; integration test for schedule creation
 - [ ] No email is triggered on schedule creation — only on the first scheduled occurrence
 - [ ] Product and design reviewed in staging
 ---
 ---
 ## Story 5: View and manage existing schedules
 **Epic:** Dashboard Distribution
 **Story ID:** _(blank)_
 **Priority:** P1
 **Story points:** _(blank)_
 ---
 ### User Story
 > **As a** manager,
 > I want to **see a list of all the email schedules I have created**,
 > so that I can **understand what's currently active and avoid sending duplicate or outdated reports to my team**.
 ---
 ### Context
 Without visibility into active schedules, users will create duplicates or forget to update schedules when dashboards change. This view is the management hub for all schedule-related actions (editing in Story 6, deleting in Story 7). Scope is intentionally read-only for this story.
 ---
 ### Acceptance Criteria
 **AC1: Schedules list is accessible**
 ```
 Given I am
@@ -8,11 +8,13 @@ by hand; edit the source skill and run:
 node scripts/build-exports.mjs
 ```
-Currently exporting **172 skills** to:
+Currently exporting **174 skills** to:
 - **ChatGPT — Custom GPT instructions** → `exports/chatgpt/`
 - **Google Gemini — Gem instructions** → `exports/gemini/`
 - **Cursor — project rule (.mdc)** → `exports/cursor/`
 - **Windsurf — workspace rule (.md)** → `exports/windsurf/`
 - **Aider — conventions file (.md)** → `exports/aider/`
 Adding a new platform is a few lines in the `PLATFORMS` registry of
 `scripts/build-exports.mjs` — no content is duplicated.
@@ -0,0 +1,183 @@
 # Aider — conventions file (.md)
 > Auto-generated from `skills/*/SKILL.md` by `scripts/build-exports.mjs`.
 > **Do not edit these files by hand** — edit the source skill and regenerate.
 174 skills exported. Copy a `.mdc rule` into the tool to use it.
 | Skill | Bundle | Path |
 |---|---|---|
 | 360-Degree Feedback Template | `pm-people` | `pm-people/360-feedback-template/360-feedback-template.md` |
 | A/B Test Planner | `pm-delivery` | `pm-delivery/ab-test-planner/ab-test-planner.md` |
 | Accessibility Audit | `pm-design` | `pm-design/accessibility-audit/accessibility-audit.md` |
 | Account Plan | `pm-sales` | `pm-sales/account-plan/account-plan.md` |
 | AEO Optimizer | `pm-writers` | `pm-writers/aeo-optimizer/aeo-optimizer.md` |
 | AI Ethics Review | `pm-advanced` | `pm-advanced/ai-ethics-review/ai-ethics-review.md` |
 | AI Product Canvas | `pm-advanced` | `pm-advanced/ai-product-canvas/ai-product-canvas.md` |
 | Ambiguity Resolver | `pm-strategy` | `pm-strategy/ambiguity-resolver/ambiguity-resolver.md` |
 | API Docs Writer | `pm-engineering` | `pm-engineering/api-docs-writer/api-docs-writer.md` |
 | API Versioning Strategy | `pm-engineering` | `pm-engineering/api-versioning-strategy/api-versioning-strategy.md` |
 | Architecture Decision Record (ADR) | `pm-engineering` | `pm-engineering/architecture-decision-record/architecture-decision-record.md` |
 | Assumption Mapper | `pm-discovery` | `pm-discovery/assumption-mapper/assumption-mapper.md` |
 | Board Deck Narrative | `pm-business` | `pm-business/board-deck-narrative/board-deck-narrative.md` |
 | Budget Variance Analysis | `pm-finance` | `pm-finance/budget-variance-analysis/budget-variance-analysis.md` |
 | Capacity Planning | `pm-engineering` | `pm-engineering/capacity-planning/capacity-planning.md` |
 | Change Management Plan | `pm-hr` | `pm-hr/change-management-plan/change-management-plan.md` |
 | Changelog Generator | `pm-engineering` | `pm-engineering/changelog-generator/changelog-generator.md` |
 | Chart Data Extractor | `pm-data` | `pm-data/chart-data-extractor/chart-data-extractor.md` |
 | Churn Analysis | `pm-cs` | `pm-cs/churn-analysis/churn-analysis.md` |
 | CI/CD Playbook | `pm-engineering` | `pm-engineering/cicd-playbook/cicd-playbook.md` |
 | Claude Superpowers | `pm-engineering` | `pm-engineering/claude-superpowers/claude-superpowers.md` |
 | Clinical Case Summary | `pm-research` | `pm-research/clinical-case-summary/clinical-case-summary.md` |
 | Code Review Checklist | `pm-engineering` | `pm-engineering/code-review-checklist/code-review-checklist.md` |
 | Cohort Analysis | `pm-data` | `pm-data/cohort-analysis/cohort-analysis.md` |
 | Community Management Playbook | `pm-social` | `pm-social/community-management-playbook/community-management-playbook.md` |
 | Competitive Analysis | `pm-essentials` | `pm-essentials/competitive-analysis/competitive-analysis.md` |
 | Competitive Intelligence Monitor | `pm-strategy` | `pm-strategy/competitive-intelligence-monitor/competitive-intelligence-monitor.md` |
 | Competitor Signal Tracker | `pm-strategy` | `pm-strategy/competitor-signal-tracker/competitor-signal-tracker.md` |
 | Competitor Teardown | `pm-gtm` | `pm-gtm/competitor-teardown/competitor-teardown.md` |
 | Compliance Checklist | `pm-legal` | `pm-legal/compliance-checklist/compliance-checklist.md` |
 | Content Calendar | `pm-gtm` | `pm-gtm/content-calendar/content-calendar.md` |
 | Context Mode | `pm-engineering` | `pm-engineering/context-mode/context-mode.md` |
 | Contract Review | `pm-legal` | `pm-legal/contract-review/contract-review.md` |
 | Customer Escalation Brief | `pm-cs` | `pm-cs/cs-escalation-brief/cs-escalation-brief.md` |
 | Customer Health Scorecard | `pm-cs` | `pm-cs/cs-health-scorecard/cs-health-scorecard.md` |
 | Customer Journey Map | `pm-discovery` | `pm-discovery/customer-journey-map/customer-journey-map.md` |
 | Customer Success Plan | `pm-cs` | `pm-cs/customer-success-plan/customer-success-plan.md` |
 | Dashboard Brief | `pm-data` | `pm-data/dashboard-brief/dashboard-brief.md` |
 | Data Analysis Standard | `pm-analytics` | `pm-analytics/data-analysis-standard/data-analysis-standard.md` |
 | Data Pipeline Spec | `pm-data` | `pm-data/data-pipeline-spec/data-pipeline-spec.md` |
 | Database Migration Plan | `pm-engineering` | `pm-engineering/database-migration-plan/database-migration-plan.md` |
 | Database Schema Design | `pm-engineering` | `pm-engineering/database-schema-design/database-schema-design.md` |
 | Debugging Log Analyser | `pm-engineering` | `pm-engineering/debugging-log-analyser/debugging-log-analyser.md` |
 | Dependency Audit | `pm-engineering` | `pm-engineering/dependency-audit/dependency-audit.md` |
 | Design Critique | `pm-design` | `pm-design/design-critique/design-critique.md` |
 | Design Handoff Brief | `pm-advanced` | `pm-advanced/design-handoff-brief/design-handoff-brief.md` |
 | Design System Audit | `pm-design` | `pm-design/design-system-audit/design-system-audit.md` |
 | Developer Onboarding Document | `pm-engineering` | `pm-engineering/developer-onboarding-doc/developer-onboarding-doc.md` |
 | Disaster Recovery Plan | `pm-engineering` | `pm-engineering/disaster-recovery-plan/disaster-recovery-plan.md` |
 | Discovery Call Prep | `pm-sales` | `pm-sales/discovery-call-prep/discovery-call-prep.md` |
 | Discovery Interview Guide | `pm-discovery` | `pm-discovery/discovery-interview-guide/discovery-interview-guide.md` |
 | Word Doc Tracked Changes | `pm-essentials` | `pm-essentials/docx-tracked-changes/docx-tracked-changes.md` |
 | Email Campaign | `pm-gtm` | `pm-gtm/email-campaign/email-campaign.md` |
 | Email Triage | `pm-operations` | `pm-operations/email-triage/email-triage.md` |
 | Employee Engagement Survey | `pm-hr` | `pm-hr/employee-engagement-survey/employee-engagement-survey.md` |
 | Engineering Hiring Rubric | `pm-engineering` | `pm-engineering/engineering-hiring-rubric/engineering-hiring-rubric.md` |
 | Engineering Weekly Report | `pm-engineering` | `pm-engineering/engineering-weekly-report/engineering-weekly-report.md` |
 | Executive Summary | `pm-cross` | `pm-cross/executive-summary/executive-summary.md` |
 | Executive Update | `pm-strategy` | `pm-strategy/executive-update/executive-update.md` |
 | Experiment Designer | `pm-advanced` | `pm-advanced/experiment-designer/experiment-designer.md` |
 | Feature Flag Guide | `pm-engineering` | `pm-engineering/feature-flag-guide/feature-flag-guide.md` |
 | Feature Prioritisation | `pm-planning` | `pm-planning/feature-prioritisation/feature-prioritisation.md` |
 | Figma Annotation Guide | `pm-figma` | `pm-figma/figma-annotation-guide/figma-annotation-guide.md` |
 | Figma Component Audit | `pm-figma` | `pm-figma/figma-component-audit/figma-component-audit.md` |
 | Figma Design Brief | `pm-figma` | `pm-figma/figma-design-brief/figma-design-brief.md` |
 | Figma Design Critique — PM Perspective | `pm-figma` | `pm-figma/figma-design-critique-pm/figma-design-critique-pm.md` |
 | Figma Design QA | `pm-figma` | `pm-figma/figma-design-qa/figma-design-qa.md` |
 | Figma Design Review | `pm-figma` | `pm-figma/figma-design-review/figma-design-review.md` |
 | Figma Prototype Plan | `pm-figma` | `pm-figma/figma-prototype-plan/figma-prototype-plan.md` |
 | Figma Spacing System | `pm-figma` | `pm-figma/figma-spacing-system/figma-spacing-system.md` |
 | Figma User Flow Planner | `pm-figma` | `pm-figma/figma-user-flow-planner/figma-user-flow-planner.md` |
 | Figma Variant Matrix | `pm-figma` | `pm-figma/figma-variant-matrix/figma-variant-matrix.md` |
 | Financial Due Diligence | `pm-finance` | `pm-finance/financial-due-diligence/financial-due-diligence.md` |
 | Financial Model Narrative | `pm-finance` | `pm-finance/financial-model-narrative/financial-model-narrative.md` |
 | Go-To-Market | `pm-gtm` | `pm-gtm/go-to-market/go-to-market.md` |
 | Go-to-Market Planner | `pm-delivery` | `pm-delivery/go-to-market-planner/go-to-market-planner.md` |
 | Grant Proposal | `pm-cross` | `pm-cross/grant-proposal/grant-proposal.md` |
 | Hiring Rubric | `pm-people` | `pm-people/hiring-rubric/hiring-rubric.md` |
 | Incident Postmortem | `pm-engineering` | `pm-engineering/incident-postmortem/incident-postmortem.md` |
 | Influencer Brief | `pm-social` | `pm-social/influencer-brief/influencer-brief.md` |
 | Infrastructure-as-Code Review | `pm-engineering` | `pm-engineering/infra-as-code-review/infra-as-code-review.md` |
 | Instagram Post Downloader | `pm-writers` | `pm-writers/instagram-post-downloader/instagram-post-downloader.md` |
 | Investor Pitch Deck | `pm-finance` | `pm-finance/investor-pitch-deck/investor-pitch-deck.md` |
 | Investor Update | `pm-business` | `pm-business/investor-update/investor-update.md` |
 | Job Application | `pm-business` | `pm-business/job-application/job-application.md` |
 | Job Description Writer | `pm-hr` | `pm-hr/job-description-writer/job-description-writer.md` |
 | Job Story Mapper | `pm-discovery` | `pm-discovery/job-story-mapper/job-story-mapper.md` |
 | Last 30 Days Research | `pm-cross` | `pm-cross/last-30-days-research/last-30-days-research.md` |
 | Launch Readiness | `pm-delivery` | `pm-delivery/launch-readiness/launch-readiness.md` |
 | Legal Brief | `pm-legal` | `pm-legal/legal-brief/legal-brief.md` |
 | Literature Review | `pm-research` | `pm-research/literature-review/literature-review.md` |
 | Load Testing Plan | `pm-engineering` | `pm-engineering/load-testing-plan/load-testing-plan.md` |
 | Local Dev Setup | `pm-engineering` | `pm-engineering/local-dev-setup/local-dev-setup.md` |
 | Media Pitch | `pm-gtm` | `pm-gtm/media-pitch/media-pitch.md` |
 | Meeting Notes | `pm-essentials` | `pm-essentials/meeting-notes/meeting-notes.md` |
 | Metrics Framework | `pm-data` | `pm-data/metrics-framework/metrics-framework.md` |
 | Microservices Decomposition | `pm-engineering` | `pm-engineering/microservices-decomposition/microservices-decomposition.md` |
 | Monitoring Setup Guide | `pm-engineering` | `pm-engineering/monitoring-setup-guide/monitoring-setup-guide.md` |
 | Morning Intelligence | `pm-operations` | `pm-operations/morning-intelligence/morning-intelligence.md` |
 | Multi-Source Signal Synthesiser | `pm-advanced` | `pm-advanced/multi-source-signal-synthesiser/multi-source-signal-synthesiser.md` |
 | NDA Analyser | `pm-legal` | `pm-legal/nda-analyser/nda-analyser.md` |
 | NotebookLM Connector | `pm-cross` | `pm-cross/notebooklm-connector/notebooklm-connector.md` |
 | Notes Humanizer | `pm-writers` | `pm-writers/notes-humanizer/notes-humanizer.md` |
 | OKR Builder | `pm-planning` | `pm-planning/okr-builder/okr-builder.md` |
 | Onboarding Plan | `pm-hr` | `pm-hr/onboarding-plan/onboarding-plan.md` |
 | On-Call Runbook | `pm-engineering` | `pm-engineering/oncall-runbook/oncall-runbook.md` |
 | Partnership Proposal | `pm-sales` | `pm-sales/partnership-proposal/partnership-proposal.md` |
 | Patient Communication | `pm-research` | `pm-research/patient-communication/patient-communication.md` |
 | Performance Budget | `pm-engineering` | `pm-engineering/performance-budget/performance-budget.md` |
 | Performance Review | `pm-people` | `pm-people/performance-review/performance-review.md` |
 | PM Weekly Review | `pm-rituals` | `pm-rituals/pm-weekly-review/pm-weekly-review.md` |
 | PPTX Slide Auditor | `pm-delivery` | `pm-delivery/pptx-slide-auditor/pptx-slide-auditor.md` |
 | PR Description Writer | `pm-engineering` | `pm-engineering/pr-description-writer/pr-description-writer.md` |
 | PRD Template | `pm-essentials` | `pm-essentials/prd-template/prd-template.md` |
 | Press Release | `pm-cross` | `pm-cross/press-release/press-release.md` |
 | Pricing Strategy | `pm-planning` | `pm-planning/pricing-strategy/pricing-strategy.md` |
 | Process Documentation | `pm-operations` | `pm-operations/process-documentation/process-documentation.md` |
 | Product Health Analysis | `pm-analytics` | `pm-analytics/product-health-analysis/product-health-analysis.md` |
 | Product Launch Checklist | `pm-delivery` | `pm-delivery/product-launch-checklist/product-launch-checklist.md` |
 | Product Positioning Doc | `pm-gtm` | `pm-gtm/product-positioning-doc/product-positioning-doc.md` |
 | Project Status Report | `pm-operations` | `pm-operations/project-status-report/project-status-report.md` |
 | Proposal Writer | `pm-sales` | `pm-sales/proposal-writer/proposal-writer.md` |
 | QBR Deck | `pm-cs` | `pm-cs/qbr-deck/qbr-deck.md` |
 | RACI Matrix | `pm-operations` | `pm-operations/raci-matrix/raci-matrix.md` |
 | Redundancy Consultation | `pm-hr` | `pm-hr/redundancy-consultation/redundancy-consultation.md` |
 | Renewal Playbook | `pm-cs` | `pm-cs/renewal-playbook/renewal-playbook.md` |
 | Research Protocol | `pm-research` | `pm-research/research-protocol/research-protocol.md` |
 | Retention Analysis | `pm-analytics` | `pm-analytics/retention-analysis/retention-analysis.md` |
 | Retrospective Analysis | `pm-delivery` | `pm-delivery/retro-analysis/retro-analysis.md` |
 | RFC Writer | `pm-engineering` | `pm-engineering/rfc-writer/rfc-writer.md` |
 | RICE + Strategic Alignment | `pm-planning` | `pm-planning/rice-impact-matrix/rice-impact-matrix.md` |
 | RICE Prioritisation | `pm-planning` | `pm-planning/rice-prioritisation/rice-prioritisation.md` |
 | Risk Register | `pm-operations` | `pm-operations/risk-register/risk-register.md` |
 | Roadmap Narrative | `pm-planning` | `pm-planning/roadmap-narrative/roadmap-narrative.md` |
 | Roadmap Presentation | `pm-planning` | `pm-planning/roadmap-presentation/roadmap-presentation.md` |
 | Runbook Writer | `pm-engineering` | `pm-engineering/runbook-writer/runbook-writer.md` |
 | Sales Battlecard | `pm-sales` | `pm-sales/sales-battlecard/sales-battlecard.md` |
 | Sales Forecasting Model | `pm-sales` | `pm-sales/sales-forecasting-model/sales-forecasting-model.md` |
 | Security Threat Model | `pm-engineering` | `pm-engineering/security-threat-model/security-threat-model.md` |
 | SEO Content Brief | `pm-gtm` | `pm-gtm/seo-content-brief/seo-content-brief.md` |
 | Service Catalog Entry | `pm-engineering` | `pm-engineering/service-catalog-entry/service-catalog-entry.md` |
 | Skill Security Auditor | `pm-engineering` | `pm-engineering/skill-security-auditor/skill-security-auditor.md` |
 | SLO and Error Budget | `pm-engineering` | `pm-engineering/slo-error-budget/slo-error-budget.md` |
 | Social Ad Campaign | `pm-social` | `pm-social/social-ad-campaign/social-ad-campaign.md` |
 | Social Media Audit | `pm-social` | `pm-social/social-media-audit/social-media-audit.md` |
 | Social Media Strategy | `pm-gtm` | `pm-gtm/social-media-strategy/social-media-strategy.md` |
 | SOP Writer | `pm-operations` | `pm-operations/sop-writer/sop-writer.md` |
 | Sprint Brief | `pm-delivery` | `pm-delivery/sprint-brief/sprint-brief.md` |
 | Sprint Planning | `pm-delivery` | `pm-delivery/sprint-planning/sprint-planning.md` |
 | Sprint Velocity Analysis | `pm-engineering` | `pm-engineering/sprint-velocity-analysis/sprint-velocity-analysis.md` |
 | SQL Query Explainer | `pm-data` | `pm-data/sql-query-explainer/sql-query-explainer.md` |
 | Stakeholder Influence Mapper | `pm-strategy` | `pm-strategy/stakeholder-influence-mapper/stakeholder-influence-mapper.md` |
 | Stakeholder Update | `pm-essentials` | `pm-essentials/stakeholder-update/stakeholder-update.md` |
 | Strategic Narrative Generator | `pm-strategy` | `pm-strategy/strategic-narrative-generator/strategic-narrative-generator.md` |
 | Substack Notes Scraper | `pm-writers` | `pm-writers/substack-notes-scraper/substack-notes-scraper.md` |
 | Sycophancy Challenger | `pm-cross` | `pm-cross/sycophancy-challenger/sycophancy-challenger.md` |
 | System Design Interview | `pm-engineering` | `pm-engineering/system-design-interview/system-design-interview.md` |
 | Tax Planning Checklist | `pm-finance` | `pm-finance/tax-planning-checklist/tax-planning-checklist.md` |
 | Teaching Lesson Plan | `pm-cross` | `pm-cross/teaching-lesson-plan/teaching-lesson-plan.md` |
 | Team Health Check | `pm-people` | `pm-people/team-health-check/team-health-check.md` |
 | Team Offsite Planner | `pm-people` | `pm-people/team-offsite-planner/team-offsite-planner.md` |
 | Tech Radar | `pm-engineering` | `pm-engineering/tech-radar/tech-radar.md` |
 | Technical Debt Register | `pm-engineering` | `pm-engineering/technical-debt-register/technical-debt-register.md` |
 | Technical Spec Template | `pm-delivery` | `pm-delivery/technical-spec-template/technical-spec-template.md` |
 | Test Strategy Document | `pm-engineering` | `pm-engineering/test-strategy-doc/test-strategy-doc.md` |
 | Thumbnail Creator Skill (via Gemini) | `pm-writers` | `pm-writers/thumbnail-creator/thumbnail-creator.md` |
 | User Interview Synthesis | `pm-discovery` | `pm-discovery/user-interview-synthesis/user-interview-synthesis.md` |
 | User Research Synthesis | `pm-essentials` | `pm-essentials/user-research-synthesis/user-research-synthesis.md` |
 | User Story Writer | `pm-delivery` | `pm-delivery/user-story-writer/user-story-writer.md` |
 | UX Research Plan | `pm-design` | `pm-design/ux-research-plan/ux-research-plan.md` |
 | Vendor Evaluation | `pm-operations` | `pm-operations/vendor-evaluation/vendor-evaluation.md` |
 | Viral Content Framework | `pm-social` | `pm-social/viral-content-framework/viral-content-framework.md` |
 | Workshop Facilitation Guide | `pm-operations` | `pm-operations/workshop-facilitation-guide/workshop-facilitation-guide.md` |
 | YouTube Script Writer | `pm-writers` | `pm-writers/youtube-script-writer/youtube-script-writer.md` |
@@ -0,0 +1,210 @@
 # AI Ethics Review Skill
 This skill produces a structured ethical review of an AI or machine learning feature, model, or product. Output covers fairness, transparency, privacy, safety, accountability, and societal impact — with risk scoring, prioritised mitigations, and a checklist suitable for governance review or responsible AI documentation.
 > ⚠️ This skill provides a structured framework for identifying and documenting ethical risks. It is not a substitute for legal advice, regulated algorithmic impact assessments, or specialist ethics review required in specific jurisdictions (e.g. EU AI Act, UK AI regulation).
 ## Required Inputs
 Ask the user for these if not provided:
 - **Feature or model name** and what it does
 - **Who it affects** — which users or people does the AI interact with, make decisions about, or collect data from?
 - **What decisions or outputs it produces** — recommendations, predictions, classifications, generation, automation?
 - **Consequentiality** — how significant are the AI's decisions? (low-stakes suggestions vs decisions that affect employment, credit, health, safety, etc.)
 - **Data used** — what training data, user data, or third-party data is used?
 - **Human oversight** — is there a human in the loop, and at what stage?
 - **Deployment context** — who will use this and how? (internal tool / consumer-facing / automated pipeline)
 ## Output Structure
 ---
 # AI Ethics Review: [Feature / Model Name]
 **Product / system:** [Name and brief description]
 **Review type:** [Pre-deployment review / Post-deployment audit / Change review]
 **Risk tier:** [High / Medium / Low — based on consequentiality, scale, and affected population]
 **Reviewer:** [Name / Team]
 **Date:** [Date]
 **Status:** [Draft / Approved / Requires escalation]
 ---
 ## 1. Feature Summary
 | | |
 |---|---|
 | **What it does** | [1–2 sentences — plain English description of the AI feature and its purpose] |
 | **Who uses it** | [End users / internal teams / automated system] |
 | **Who is affected by its outputs** | [May be different from who uses it — e.g. an AI hiring tool is used by HR but affects candidates] |
 | **Output type** | [Recommendation / Classification / Prediction / Generation / Automation / Scoring] |
 | **Scale** | [How many people affected per day/month?] |
 | **Consequentiality** | [High: affects access to services, employment, credit, health, safety / Medium: influences decisions / Low: suggestions with easy override] |
 | **Human oversight level** | [Full automation / Human review before action / Human can override after action / Advisory only] |
 ---
 ## 2. Risk Tier Assessment
 | Factor | Score (1–3) | Rationale |
 |---|---|---|
 | **Consequentiality** (impact on individuals) | [1=low, 3=high] | [e.g. 3 — model output influences hiring decisions] |
 | **Scale** (number of people affected) | [1=few, 3=many] | [e.g. 2 — internal tool used for ~500 candidates/year] |
 | **Reversibility** (can harm be undone?) | [1=reversible, 3=irreversible] | [e.g. 2 — unfair rejection can be appealed but may not be caught] |
 | **Vulnerability of affected group** | [1=general population, 3=protected or vulnerable group] | [e.g. 2 — includes protected characteristics in the decision context] |
 | **Transparency** (do affected people know?) | [1=informed, 3=opaque] | [e.g. 3 — candidates are not told AI is used in screening] |
 **Composite risk tier:** [High (12–15) / Medium (7–11) / Low (3–6)]
 **Risk tier implications:**
 - **High:** Mandatory senior ethics review, DPA/DPIA required, human-in-loop for all consequential decisions, ongoing monitoring required
 - **Medium:** Ethics review recommended, document mitigations, quarterly monitoring
 - **Low:** Standard review, document assumptions, annual review
 ---
 ## 3. Fairness & Bias
 *Does the AI treat people equitably across groups?*
 **Protected characteristics relevant to this feature:**
 [List applicable protected characteristics — age, gender, race/ethnicity, disability, religion, national origin, etc.]
 | Risk | Analysis | Mitigation |
 |---|---|---|
 | **Training data bias** | [Does the training data reflect historical discrimination? e.g. hiring data that reflects past biases in who was hired] | [Audit training data for demographic representation / use debiasing techniques / document data lineage] |
 | **Proxy discrimination** | [Could the model use a proxy for a protected characteristic? e.g. using postcode as a proxy for race] | [Identify proxy features / test for disparate impact using adversarial debiasing] |
 | **Differential performance** | [Does the model perform differently across demographic groups? — e.g. lower accuracy for underrepresented groups] | [Disaggregate performance metrics by group / set minimum performance thresholds per group] |
 | **Feedback loops** | [Does the model's output reinforce existing disparities? e.g. recommending content that keeps disadvantaged groups in lower-engagement patterns] | [Monitor outcome distributions over time / implement feedback loop detection] |
 **Fairness evaluation method:** [What method will be used to measure fairness — statistical parity / equalised odds / individual fairness? Who is responsible for running it and how often?]
 ---
 ## 4. Transparency & Explainability
 *Can affected people understand how the AI makes decisions?*
 | Dimension | Current state | Required state | Gap |
 |---|---|---|---|
 | **User disclosure** | [Are users told they're interacting with AI?] | [Yes — required for trust and regulation] | [e.g. No disclosure on current UI] |
 | **Decision explanation** | [Can the system explain why it reached a conclusion?] | [For high-stakes decisions: yes] | [e.g. Black-box model — no feature attribution available] |
 | **Right to know** | [Can affected people ask how a decision was made?] | [Yes — required under GDPR Art. 22 for automated decisions] | [e.g. No process exists] |
 | **Confidence calibration** | [Does the model express appropriate uncertainty?] | [Yes — overconfident models cause over-reliance] | [e.g. Model outputs binary label without confidence score] |
 **Explainability approach:** [LIME / SHAP / rule-based surrogate / LLM-generated rationale / none — and why]
 ---
 ## 5. Privacy & Data
 *Is personal data used responsibly and lawfully?*
 | Risk | Analysis | Mitigation |
 |---|---|---|
 | **Data minimisation** | [Does the model use more personal data than necessary?] | [Audit input features — remove any that don't improve performance and involve unnecessary data collection] |
 | **Data retention** | [How long is personal data retained for training and inference?] | [Define retention policy aligned to GDPR / CCPA / sector requirements] |
 | **Re-identification risk** | [Could model outputs or training data be used to identify individuals?] | [Differential privacy / k-anonymity / output rate limiting] |
 | **Third-party data** | [Is data from third parties used? Is it licensed for this use?] | [Audit data licensing / get legal sign-off on each third-party source] |
 | **Cross-border data transfer** | [Is personal data transferred across jurisdictions?] | [Legal review — Standard Contractual Clauses or equivalent] |
 **DPIA required?** [Yes / No / Uncertain — for High tier or whenever processing is likely to result in high risk to individuals under GDPR Art. 35]
 ---
 ## 6. Safety & Reliability
 *What happens when the AI gets it wrong?*
 | Failure mode | Likelihood | Impact | Mitigation |
 |---|---|---|---|
 | **False positives** | [H/M/L] | [e.g. Flagging a legitimate transaction as fraud — customer locked out] | [Set threshold conservatively; human review for edge cases] |
 | **False negatives** | [H/M/L] | [e.g. Missing a real fraud case — financial loss] | [Monitor false negative rate; set minimum recall threshold] |
 | **Out-of-distribution inputs** | [H/M/L] | [Model behaves unpredictably on inputs outside training distribution] | [Input validation; confidence thresholding — route uncertain inputs to human review] |
 | **Model degradation** | [M] | [Performance degrades as data distributions shift post-deployment] | [Scheduled performance monitoring; drift detection alerts] |
 | **Adversarial inputs** | [L/M] | [Deliberate manipulation of inputs to game the model] | [Adversarial testing; rate limiting; anomaly detection on inputs] |
 | **Single point of failure** | [L/M] | [Model outage causes downstream system failure] | [Graceful degradation — define fallback behaviour when model is unavailable] |
 **Fallback behaviour:** [What happens if the AI is unavailable or returns low-confidence output? — e.g. route to human review / use rule-based fallback / block the action]
 ---
 ## 7. Accountability & Governance
 *Who is responsible when things go wrong?*
 | Question | Answer |
 |---|---|
 | **Who owns this AI feature?** | [Team or individual with end-to-end accountability] |
 | **Who approved deployment?** | [Name and role — must be documented] |
 | **Who is responsible for ongoing monitoring?** | [Team and cadence] |
 | **Who can shut it down?** | [Who has kill-switch authority and under what conditions?] |
 | **How are incidents reported?** | [Internal escalation path + external disclosure process if required] |
 | **Is this subject to regulation?** | [EU AI Act / UK AI regulation / sector-specific rules — FINRA, FDA, FCA, etc.] |
 **Incident response plan:** [Link to or describe what happens if the model causes harm — detection, escalation, remediation, disclosure]
 ---
 ## 8. Societal Impact
 *Beyond individual users — what are the broader effects?*
 | Impact area | Risk | Mitigation |
 |---|---|---|
 | **Labour displacement** | [Does this AI automate tasks that currently employ people?] | [Transition plan / human-AI collaboration framing / skills retraining commitment] |
 | **Environmental impact** | [What is the carbon cost of training and inference?] | [Measure and offset; prefer efficient architectures; use renewable-energy infrastructure where possible] |
 | **Power concentration** | [Does this AI give the deploying organisation disproportionate power over individuals?] | [Ensure right to opt out; avoid lock-in; consider open alternatives] |
 | **Information ecosystem** | [Could this AI contribute to misinformation, filter bubbles, or manipulation?] | [Provenance labelling / content policies / algorithmic diversity requirements] |
 ---
 ## 9. Mitigation Priorities
 | # | Risk | Severity | Action | Owner | Deadline |
 |---|---|---|---|---|---|
 | 1 | [Highest risk — e.g. No disclosure to affected candidates] | Critical | [Add AI disclosure to UI and candidate-facing documentation] | [PM + Legal] | [Before launch] |
 | 2 | [e.g. No fairness evaluation across demographic groups] | High | [Commission third-party fairness audit using [method]] | [ML team + external auditor] | [Within 30 days of launch] |
 | 3 | [e.g. No model monitoring in place] | High | [Deploy performance and drift monitoring dashboard] | [ML Ops] | [Launch day] |
 | 4 | [e.g. DPIA not completed] | High | [Complete DPIA with DPO before deployment] | [Legal / DPO] | [Before launch] |
 ---
 ## 10. Pre-Deployment Checklist
 - [ ] Ethics review completed and approved by required reviewers
 - [ ] DPIA completed (if required)
 - [ ] Fairness evaluation completed and results documented
 - [ ] AI disclosure is in place wherever required
 - [ ] Human oversight mechanism is defined and tested
 - [ ] Kill-switch and escalation path is documented and tested
 - [ ] Model monitoring is deployed and alerting is configured
 - [ ] Data lineage and training data audit documented
 - [ ] Legal sign-off obtained on data licensing and cross-border transfers
 - [ ] Incident response plan in place
 ---
 ## Quality Checks
 - [ ] "Who is affected" includes people the AI makes decisions *about*, not just who uses the product
 - [ ] Fairness analysis names specific protected characteristics, not just "diverse groups"
 - [ ] Safety section covers both false positive and false negative failure modes
 - [ ] Accountability section names real people, not teams or roles
 - [ ] Mitigations are specific and time-bound — not "monitor and review"
 ## Anti-Patterns
 - [ ] Do not limit the affected-population analysis to users of the product — AI that makes decisions about people (hiring, credit, content moderation) affects non-users who have no opt-out
 - [ ] Do not accept "we will monitor" as a mitigation without specifying what is monitored, at what threshold, and who acts
 - [ ] Do not assign fairness analysis to the model team alone — protected characteristic analysis requires input from legal, HR, or a subject-matter expert
 - [ ] Do not defer the DPIA to post-launch — for high-risk tier systems, a DPIA is a pre-requisite for lawful deployment under GDPR
 - [ ] Do not conflate statistical accuracy with fairness — a model can be 95% accurate overall while performing significantly worse for a protected group
 ## Example Trigger Phrases
 - "Run an AI ethics review for [feature]"
 - "Conduct an ethical impact assessment for our new ML model"
 - "Review the AI risks for our hiring / credit / recommendation system"
 - "Build a responsible AI checklist for our product"
 - "What are the ethical risks of using AI for [use case]?"
@@ -0,0 +1,164 @@
 # AI Product Canvas Skill
 Define AI products with the same rigour as any product decision — but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.
 ## AI Product Anti-Patterns to Check First
 Before building, flag if any of these apply:
 - ❌ "We should add AI to [existing feature]" — with no user problem defined
 - ❌ Accuracy target undefined before build begins
 - ❌ No plan for what happens when the model is wrong
 - ❌ User-facing AI output with no human review or fallback
 - ❌ Training data not audited for bias or quality
 - ❌ No evaluation metric — "we'll know it when we see it"
 ---
 ## AI Product Canvas Output Format
 ### AI Product Canvas — [Feature Name] — [Date]
 **PM Owner:** [Name]
 **ML/AI Lead:** [Name]
 **Status:** Discovery / Design / Build / Evaluation / Live
 ---
 #### 1. Problem Definition
 **User problem being solved:**
 > [What specific situation is the user in? What job are they trying to get done?]
 **Why AI?**
 > [What makes this problem require AI vs a deterministic solution? If the answer is "because we can," stop here.]
 **Success for the user looks like:**
 > [What outcome does the user experience when the AI feature is working well?]
 ---
 #### 2. AI Approach
 **Task type:**
 - [ ] Classification
 - [ ] Generation (text, image, code)
 - [ ] Summarisation / extraction
 - [ ] Recommendation
 - [ ] Search / retrieval
 - [ ] Prediction / forecasting
 - [ ] Conversation / agent
 **Model approach:**
 - [ ] LLM API (GPT-4, Claude, Gemini, etc.) — specify: [Model name + version]
 - [ ] Fine-tuned model on own data
 - [ ] Custom model trained from scratch
 - [ ] RAG (retrieval-augmented generation)
 - [ ] Embedding + vector search
 **Rationale for chosen approach:** [Why this, not alternatives]
 ---
 #### 3. Data Requirements
 | Data Type | Source | Volume | Quality Status | Bias Risk |
 |---|---|---|---|---|
 | [Training data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
 | [Evaluation data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
 **Data gaps:** [What's missing and plan to get it]
 **Privacy considerations:** [Any PII in training or inference data]
 **Data ownership:** [Do we own this data? Can we use it for training?]
 ---
 #### 4. Evaluation Framework
 **Primary metric:** [The number that defines success — accuracy, F1, BLEU, user rating, task completion rate]
 **Minimum acceptable threshold:** [Below X, the feature does not ship]
 **Human evaluation plan:** [How will humans review model outputs? Sampling rate? Review panel?]
 | Evaluation Type | Method | Cadence | Owner |
 |---|---|---|---|
 | Offline (pre-launch) | [Test set, benchmark] | Pre-launch | ML Lead |
 | Online (post-launch) | [A/B test, user feedback] | Weekly | PM + ML |
 | Adversarial | [Red-team, edge cases] | Pre-launch | Safety reviewer |
 ---
 #### 5. User Experience Design
 **How is AI output presented?**
 - [ ] Direct output shown to user (high trust required)
 - [ ] AI-assisted with user confirmation
 - [ ] Suggestion user can accept/reject
 - [ ] Background action with audit log
 **Confidence and uncertainty handling:**
 - What happens when confidence is low? [Show alternative, ask for clarification, fallback to manual]
 - How is uncertainty communicated to the user? [UI pattern]
 **Fallback plan:**
 - If the model fails or returns an error: [Specific fallback behaviour]
 - If accuracy degrades below threshold: [Kill switch or graceful degradation plan]
 ---
 #### 6. Responsible AI Checklist
 - [ ] Bias audit completed on training data
 - [ ] Demographic fairness evaluated (does performance differ by user group?)
 - [ ] Hallucination / confabulation risk assessed and mitigated
 - [ ] User can see and correct AI output
 - [ ] Opt-out mechanism exists (can user disable the AI feature?)
 - [ ] Output provenance visible when relevant (does user know AI generated this?)
 - [ ] PII not used in ways user didn't consent to
 - [ ] Regulatory review completed (GDPR, AI Act, sector-specific)
 - [ ] Model cards / documentation completed
 ---
 #### 7. Launch & Monitoring Plan
 **Rollout:** [% of users, with staged expansion criteria]
 **Monitoring metrics:**
 - Model performance: [Metric + alert threshold]
 - User engagement with AI output: [Acceptance rate, override rate, feedback score]
 - Error rate: [% of failed inferences]
 - Latency: [P95 target]
 **Model refresh cadence:** [How often is the model retrained or updated?]
 **Drift detection:** [How will you know when model performance degrades in production?]
 ---
 ## Guidelines
 - Never skip the "Why AI?" section — it's the most important question in AI product development
 - The fallback UX is not optional — what happens when AI fails defines your product's trustworthiness
 - Responsible AI checklist must be completed before launch, not after
 - Include latency in success metrics — a 5-second AI response is often worse than no AI at all
 - Recommend starting with a human-in-the-loop design and automating only when accuracy is proven
 ## Required Inputs
 Ask the user for these if not provided:
 - **Feature or product description** (what the AI is intended to do)
 - **User problem** (what problem the AI is solving for users)
 - **Available data** (what training/inference data exists)
 - **ML/AI lead** (who owns the technical implementation)
 ## Anti-Patterns
 - [ ] Do not skip the "Why AI?" question — if the answer is "we want to use AI," stop and reframe around the user problem first
 - [ ] Do not launch with an undefined accuracy threshold — "good enough" is not a threshold; set a number before build begins
 - [ ] Do not design the UX to hide AI-generated output as if it were system truth — users need to know when AI is involved so they can override it
 - [ ] Do not defer the Responsible AI checklist to post-launch — bias and privacy issues are far harder to fix in production than in design
 - [ ] Do not treat model latency as a post-launch optimisation — a 6-second AI response that replaces a 1-second rule-based response is a regression, not a feature
 ## Quality Checks
 - [ ] "Why AI?" is answered clearly (not "because we can")
 - [ ] Minimum acceptable accuracy threshold is defined before build begins
 - [ ] Fallback UX is specified for model failures or low-confidence outputs
 - [ ] Responsible AI checklist is completed (not deferred to post-launch)
 - [ ] Monitoring plan includes both model performance and user engagement metrics
@@ -0,0 +1,78 @@
 # Design Handoff Brief Skill
 Produce a design brief that sets designers up for success — grounding them in user context and constraints before they open Figma, not after they've gone in the wrong direction.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Feature brief or PRD** (even rough notes work)
 - **Designer's name or team** (for personalisation)
 - **Technical constraints** (any engineering limitations already known)
 - **Timeline** (when does design need to be done?)
 ## What Designers Actually Need (and PMs Often Skip)
 - The user's goal, not the feature name
 - The emotional state of the user at this moment in the journey
 - What success looks like — how will we know the design worked?
 - Constraints: technical, legal, brand, accessibility
 - Edge cases that must be handled
 - What we're explicitly NOT solving for
 ## Process
 1. Read the feature brief or PRD provided
 2. Extract user goal (reframe from feature language to user outcome language)
 3. Identify constraints — technical limitations, brand guidelines, accessibility requirements
 4. List edge cases the design must handle
 5. Define success criteria the design should be evaluated against
 6. Write a "not in scope" section to prevent scope creep in design
 7. **Validate** — Confirm every edge case listed is specific enough to design for, and every out-of-scope item is concrete enough to say "no" to
 ## Output Structure
 ### Design Brief: [Feature Name]
 **User Goal:** (in the user's words, not ours)
 "When I [situation], I want to [motivation] so that I can [outcome]."
 **Context & Emotional State:**
 [Where is the user in their journey? What are they feeling? What just happened?]
 **Design Success Criteria:**
 - [Criterion 1 — measurable where possible]
 - [Criterion 2]
 - [Criterion 3]
 **Constraints:**
 - Technical: [limitations engineering has flagged]
 - Brand: [relevant brand guidelines]
 - Accessibility: [WCAG level required, any specific requirements]
 - Legal/Compliance: [if applicable]
 **Edge Cases to Design For:**
 - [Edge case 1]
 - [Edge case 2]
 - [Edge case 3]
 **Explicitly Out of Scope:**
 - [What we are NOT solving in this design iteration]
 **Reference Material:**
 - User research: [link]
 - Existing patterns: [Figma component library link]
 - Competitor examples: [links if relevant]
 ## Quality Checks
 - [ ] User goal is written in user language (not feature/product language)
 - [ ] At least one edge case covers an error or failure state
 - [ ] Success criteria are measurable or observable (not "looks good")
 - [ ] Out-of-scope section names at least one thing that might seem in scope but isn't
 - [ ] Technical constraints are specific enough for an engineer to confirm
 ## Anti-Patterns
 - [ ] Do not write the user goal in feature language ("design the checkout flow") — it must be written from the user's perspective with a motivation and outcome
 - [ ] Do not skip the "Explicitly Out of Scope" section — without it, designers will inadvertently solve problems not intended for this iteration
 - [ ] Do not list edge cases that are so generic they apply to any feature (e.g. "handle errors") — each edge case must be specific to this feature's failure modes
 - [ ] Do not hand off the brief without confirming engineering constraints are accurate — a constraint that is wrong is worse than no constraint
 - [ ] Do not omit the emotional context of the user — designs without emotional grounding produce technically correct but experientially flat results
@@ -0,0 +1,72 @@
 # Experiment Designer Skill
 Produce rigorous experiment designs from product hypotheses, and interpret results with statistical and practical significance — so you can defend every decision to a sceptical engineering lead or data scientist.
 ## Required Inputs
 Ask the user for these if not provided:
 **For experiment design:**
 - Hypothesis (what change, what metric, what expected movement)
 - Current baseline metric value
 - Minimum detectable effect (MDE) — the smallest lift worth caring about
 - Available daily sample size
 **For results interpretation:**
 - Control and variant results (raw numbers or percentages)
 - P-value or confidence interval
 - Run duration (days)
 - Any anomalies observed during the test
 ## Two-Phase Process
 ### Phase 1: Experiment Design
 1. Restate hypothesis as: "If we [change], we expect [metric] to [move by X%] because [reason]"
 2. Define control and variant clearly
 3. Select primary metric (one only) and secondary guardrail metrics (2-3 max)
 4. Calculate required sample size from MDE and baseline
 5. Estimate run time in days
 6. Set pre-defined success criteria before the test runs — no moving goalposts
 7. Flag design risks: novelty effects, seasonal confounds, multiple testing issues, network effects, sample ratio mismatch
 ### Phase 2: Results Interpretation
 1. Assess statistical significance (p < 0.05 threshold)
 2. Assess practical significance: was the lift meaningful for the business, not just real?
 3. Interpret confidence intervals
 4. Investigate confounding factors
 5. Recommend: Ship / Iterate / Kill / Run follow-up test
 6. **Validate** — Confirm the test ran for the full planned duration. Flag if it was stopped early (peeking problem). Confirm sample ratio mismatch did not occur.
 ## Output Structure
 **[Design or Results header based on phase]**
 *Hypothesis:* "If we [change], we expect [metric] to [move by X%] because [reason]"
 *Primary metric:* [One metric only]
 *Guardrail metrics:* [2-3 max]
 *Required sample size:* [n per variant]
 *Estimated run time:* [days]
 *Pre-defined success threshold:* [specific number]
 *Design risk flags:* [any concerns]
 **Results (Phase 2 only):**
 *Statistical significance:* [p-value and conclusion]
 *Practical significance:* [lift size vs. business threshold]
 *Recommendation:* Ship / Iterate / Kill / Follow-up — [rationale]
 ## Quality Checks
 - [ ] Hypothesis specifies the change, the metric, the direction, and the reason
 - [ ] Primary metric is singular — guardrail metrics are secondary
 - [ ] Success criteria are defined before the test launches (not after seeing results)
 - [ ] Test was not stopped early (or flagged clearly if it was)
 - [ ] Practical significance assessed separately from statistical significance
 - [ ] Sample ratio mismatch is checked in results interpretation
 ## Anti-Patterns
 - [ ] Do not define success criteria after seeing preliminary results — post-hoc success definitions are HARKing (Hypothesising After Results are Known) and invalidate the experiment
 - [ ] Do not stop a test early because the result looks significant — early stopping dramatically inflates false positive rates; the test must run to the planned sample size
 - [ ] Do not treat statistical significance as the same as practical significance — a p < 0.05 result with a 0.1% lift is real but may not be worth shipping
 - [ ] Do not run the same experiment on the same population multiple times without correction — multiple testing inflates the chance of a false positive proportionally
 - [ ] Do not use more than one primary metric — multiple primary metrics require multiple hypothesis corrections and make the ship/kill decision ambiguous
@@ -0,0 +1,65 @@
 # Multi-Source Signal Synthesiser Skill
 Reconcile user signals from multiple sources — interviews, support tickets, NPS, app reviews, sales calls — into a unified, weighted insight brief that surfaces the underlying need rather than the surface-level request.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Signal sources** (interviews, support tickets, NPS verbatims, app reviews, sales calls, analytics — any combination)
 - **Time period** covered by the data
 - **Product area or feature** the signals relate to (if scoped)
 ## Source Weighting (default — adapt to context)
 | Source | Weight | Rationale |
 |--------|--------|-----------|
 | Direct research (interviews, usability tests) | 5 | Highest-fidelity, structured |
 | Support tickets (unprompted pain signals) | 4 | Real pain, unfiltered |
 | NPS verbatims | 3 | Broad but shallow |
 | App store reviews | 2 | Public, self-selected |
 | Sales call summaries | 2 | Filtered through sales lens |
 | Anecdote or single report | 1 | Low confidence alone |
 ## Process
 1. Tag each signal by source and apply weight
 2. Look for **convergence**: same underlying need appearing across 3+ sources
 3. Look for **divergence**: contradictory signals suggesting user segmentation
 4. Distinguish surface request from underlying need (e.g. "faster export" may mean "I don't trust the data will be there when I need it")
 5. Produce ranked insights by weighted frequency
 6. **Validate** — Confirm each insight has evidence from at least 2 source types. Flag any insight resting on a single source as low-confidence.
 ## Output Structure
 ### User Signal Synthesis — [Date / Period]
 **Sources included:** [list with count per source]
 **Total signals processed:** [n]
 #### Insight 1: [Underlying need, not feature request]
 - **Confidence:** High / Medium / Low (based on source diversity and weight)
 - **Evidence:** [Signals from each source supporting this]
 - **Conflicting signals:** [Any contradicting evidence and how to interpret it]
 - **Product implication:** [Specific next step, not generic]
 [Repeat for top 3-5 insights]
 #### Divergent Signals (Possible Segmentation)
 [Where user groups appear to have genuinely different needs — specify which segments]
 #### What the Data Does NOT Tell Us
 [Gaps that require further research before acting]
 ## Quality Checks
 - [ ] Every insight references at least 2 distinct source types
 - [ ] Surface requests are translated to underlying needs (not just echoed)
 - [ ] Divergent signals identify the specific user segments, not just "some users disagree"
 - [ ] Confidence ratings are consistent with source diversity and weighting
 - [ ] "What the data does NOT tell us" section is honest about gaps
 ## Anti-Patterns
 - [ ] Do not echo surface-level feature requests as insights — translate every request to the underlying need before including it as a finding
 - [ ] Do not assign High confidence to insights supported by only one source type — confidence requires corroboration across at least two distinct source types
 - [ ] Do not treat all sources as equally weighted — a single interview quote and a pattern across 200 support tickets are not comparable signals
 - [ ] Do not collapse divergent signals into a single finding — where user segments have genuinely different needs, name the segments explicitly rather than averaging them away
 - [ ] Do not omit the research gap section when key decisions rest on thin data — acting on low-confidence findings without flagging the gaps misleads product teams
@@ -0,0 +1,129 @@
 # Data Analysis Standard Skill
 Turn raw numbers into product decisions. Structure every analysis with a clear question, methodology, finding, and recommended action.
 ## Analysis Framework: The 4-Question Method
 Every analysis starts here:
 1. **What changed?** (describe the metric and its movement)
 2. **Why did it change?** (root cause — segment, funnel step, cohort, channel)
 3. **So what?** (business or product impact)
 4. **Now what?** (recommended action with confidence level)
 Never deliver data without answering all four. A chart with no narrative is not an analysis.
 ---
 ## Metric Triage Template
 Use when a metric has moved unexpectedly:
 ```
 METRIC: [Name]
 MOVEMENT: [X% change over Y period]
 BASELINE: [What was normal]
 SEGMENTATION CHECK:
 - By platform (iOS / Android / Web)?
 - By user cohort (new / returning / power users)?
 - By acquisition channel?
 - By geography?
 - By plan/tier?
 ROOT CAUSE HYPOTHESIS:
 1. [Most likely explanation] — Evidence: [data point]
 2. [Alternative explanation] — Evidence: [data point]
 3. [Ruling out] — Eliminated because: [reason]
 CONCLUSION: [Single sentence answer to "why did this change?"]
 CONFIDENCE: [High / Medium / Low] — based on [data available]
 ```
 ---
 ## Funnel Analysis Structure
 | Stage | Metric | Current | Benchmark/Target | Drop-off % | Notes |
 |---|---|---|---|---|---|
 | [Top of funnel] | [Users] | [N] | [N] | — | |
 | [Step 2] | [Users] | [N] | [N] | [X%] | |
 | [Step 3] | [Users] | [N] | [N] | [X%] | |
 | [Conversion] | [Users] | [N] | [N] | [X%] | |
 **Biggest drop-off:** [Step X → Step Y] — Hypothesis: [reason]
 **Recommended investigation:** [specific query or test]
 ---
 ## Cohort Analysis Guidelines
 Always define:
 - **Cohort definition:** [What groups users — signup week, first action, plan type]
 - **Retention metric:** [What counts as retained — login, core action, revenue]
 - **Retention window:** [D1, D7, D30, W4, M3, etc.]
 Output a cohort retention table and annotate:
 - Baseline retention for each cohort
 - Cohorts that over/underperform and why (feature launch? campaign? seasonal?)
 - Trend direction across cohorts (improving / declining / stable)
 ---
 ## Stakeholder Analysis Output Format
 ### [Analysis Title] — [Date]
 **Question being answered:** [Specific question in plain English]
 **Time period:** [Date range]
 **Data source:** [Where data comes from]
 **Finding:**
 > [1–2 sentence plain-English summary of what the data shows]
 **Key chart / table:** [Include or describe]
 **Root cause:** [Best explanation with evidence]
 **Confidence level:** [High / Medium / Low] — [reason]
 **Recommended action:**
 1. [Immediate action — owner, timeline]
 2. [Investigation needed — what to check next]
 3. [Monitoring — what metric to watch and at what cadence]
 **What this analysis does NOT tell us:** [Important caveat — what data is missing or what can't be concluded]
 ---
 ## Required Inputs
 Ask the user for these if not provided:
 - **Metric or question** being investigated
 - **Time period** (what changed, from when to when)
 - **Data available** (which segments, sources, or queries you have access to)
 - **Business context** (what decision this analysis informs)
 - **Audience** (who will read this — exec / team / data team)
 ## Quality Checks
 - [ ] Analysis answers all 4 questions: what changed, why, so what, now what
 - [ ] Root cause has evidence (not just hypothesis)
 - [ ] Confidence level is stated and justified
 - [ ] What the data cannot tell us is explicitly named
 - [ ] Recommended action includes an owner and timeline
 ## Anti-Patterns
 - [ ] Do not present correlations as causation — always state the distinction explicitly
 - [ ] Do not report a metric movement without stating the time window and comparison baseline
 - [ ] Do not skip the "so what" — raw observations without recommended actions are incomplete analysis
 - [ ] Do not overstate confidence — label hypotheses clearly and note what data would be needed to confirm them
 - [ ] Do not ignore segment breakdowns — aggregate metrics can mask opposing trends in sub-segments
 ## Guidelines
 - Always state what the data *cannot* tell you — never oversell confidence
 - Correlations are not causation — flag this every time
 - If the user has no baseline, recommend establishing one before drawing conclusions
 - Recommend the simplest chart for each finding: bar for comparison, line for trends, scatter for correlation, table for detailed breakdowns
 - Always specify the time window — "conversion dropped" is meaningless without "from X to Y over Z period"
@@ -0,0 +1,62 @@
 # Product Health Analysis Skill
 Transform raw metrics data into a clear health narrative — what's working, what's not, and what needs immediate attention.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Metrics data** (current values for key metrics — even rough numbers work)
 - **Targets or benchmarks** (OKR targets, historical baselines, or industry benchmarks)
 - **Period** (week / month / quarter being analysed)
 - **Product area or segment** (are we looking at the whole product or a specific feature?)
 ## Metrics Framework
 Analyse across four layers:
 1. **Acquisition** — new users, source quality, CAC trends
 2. **Activation** — time to first value, onboarding completion rates
 3. **Engagement** — DAU/MAU, feature adoption, session depth
 4. **Retention** — D1/D7/D30 retention, churn rate, resurrection rate
 ## Process
 1. For each metric, compare: current period vs. previous period, current vs. target
 2. Flag anything more than 10% off target as requiring investigation
 3. Look for correlations — does a drop in activation explain a retention dip 2 weeks later?
 4. Write a plain-English health summary (no jargon) suitable for sharing with non-data stakeholders
 5. Recommend top 3 areas for immediate investigation with suggested diagnostic steps
 6. **Validate** — Confirm every flagged metric has a plausible root cause hypothesis, not just a raw number, and every recommended action has a specific owner or team
 ## Output Structure
 ### Product Health Report — [Period]
 **Overall Health:** 🟢 On Track / 🟡 Watch / 🔴 Action Required
 | Metric | Current | Target | vs. Last Period | Status |
 |--------|---------|--------|-----------------|--------|
 | [metric] | [value] | [target] | [+/-%] | [🟢/🟡/🔴] |
 **Key Observations:**
 [3-5 bullet observations written in plain English]
 **Areas Requiring Investigation:**
 1. [Metric + hypothesis + suggested diagnostic]
 2. [Metric + hypothesis + suggested diagnostic]
 3. [Metric + hypothesis + suggested diagnostic]
 **Recommended Actions:**
 [Specific next steps with owners and timelines]
 ## Quality Checks
 - [ ] Every metric includes both a target and a trend (not just a snapshot)
 - [ ] At least one correlation is drawn between metrics (e.g., activation → retention)
 - [ ] Every flagged metric has a root cause hypothesis, not just "it dropped"
 - [ ] Observations are written for a non-technical stakeholder (no raw query language or data jargon)
 - [ ] Overall health rating is justified with specific evidence
 ## Anti-Patterns
 - [ ] Do not report a single aggregate metric without segment breakdowns — averages hide opposing trends
 - [ ] Do not flag a metric as healthy just because it is above the target — check if the target itself is meaningful
 - [ ] Do not list metric movements without root cause hypotheses — observations without explanations are not analysis
 - [ ] Do not mix product health metrics with business KPIs without explaining the relationship between them
 - [ ] Do not omit recommended actions — a health report that only describes problems without prioritised next steps is incomplete
@@ -0,0 +1,137 @@
 # Retention Analysis Skill
 Diagnose why users leave, identify what keeps them, and recommend specific, testable interventions — not vague "improve onboarding" suggestions.
 ## Retention Fundamentals
 **The retention curve has two components:**
 1. **Steepness of initial drop** (D1–D7) — onboarding problem
 2. **Long-term floor level** — product-market fit indicator
 A product with PMF has a retention curve that flattens. If it trends to zero, you have a PMF problem, not an onboarding problem. Name this distinction explicitly.
 ---
 ## Retention Metrics Definitions
 | Metric | Formula | What It Tells You |
 |---|---|---|
 | D1 Retention | Users who return on day 2 ÷ new users day 1 | Quality of first experience |
 | D7 Retention | Users active on day 8 ÷ users who joined 7 days ago | Early habit formation |
 | D30 Retention | Users active on day 31 ÷ users who joined 30 days ago | Product-market fit signal |
 | DAU/MAU Ratio | Daily active users ÷ monthly active users | Stickiness (>20% good, >50% excellent) |
 | Churn Rate | Users lost in period ÷ users at start of period | Monthly or annual |
 | Net Revenue Retention | MRR at end of period ÷ MRR at start (same cohort) | Revenue health including expansion |
 ---
 ## Retention Investigation Framework
 ### Step 1: Segment the problem
 Don't analyse "retention" — analyse retention for specific cohorts:
 - New vs returning users
 - Paid vs free
 - Acquisition channel (organic vs paid vs referral)
 - Onboarding path completed vs not
 - Feature usage (power users vs lurkers)
 ### Step 2: Find the inflection points
 Where does the drop happen? D1? D7? Month 3?
 - D1 drop → First session experience
 - D7 drop → Habit loop not formed
 - D30 drop → Value not delivered at depth
 - Month 3+ drop → Boredom, competition, or lifecycle event
 ### Step 3: Identify the "aha moment" correlation
 Which early behaviour predicts long-term retention?
 - Run correlation: users who did [X] in first 7 days vs 30-day retention
 - Common patterns: connected an integration, invited a teammate, completed a core action N times
 ### Step 4: Qualify the churn
 Interview churned users — never skip this. Survey data alone is insufficient.
 - "What was the trigger that led you to cancel/stop?"
 - "What were you trying to accomplish that you couldn't?"
 - "What would need to change for you to come back?"
 ---
 ## Output Format
 ### Retention Analysis — [Product/Segment] — [Date]
 **Question:** [Specific retention question being answered]
 **Period Analysed:** [Date range]
 **Segment:** [Which users]
 ---
 **Current Retention Snapshot:**
 | Metric | Current | Industry Benchmark | Status |
 |---|---|---|---|
 | D1 Retention | [X%] | 25–40% | 🔴/🟡/🟢 |
 | D7 Retention | [X%] | 10–25% | 🔴/🟡/🟢 |
 | D30 Retention | [X%] | 5–15% | 🔴/🟡/🟢 |
 | DAU/MAU | [X%] | 10–20% typical | 🔴/🟡/🟢 |
 **Retention Curve Shape:** [Flattening / Still declining / Trending to zero]
 **PMF Signal:** [Strong / Weak / Absent — based on curve shape]
 ---
 **Root Cause Hypotheses:**
 | Hypothesis | Evidence | Confidence | Test |
 |---|---|---|---|
 | [Cause] | [Data point] | H/M/L | [How to validate] |
 **"Aha Moment" Correlation:**
 Users who [specific action] in first [N] days retain at [X%] vs [Y%] for those who don't.
 ---
 **Recommended Interventions:**
 | Intervention | Target Drop | Expected Lift | Effort | Priority |
 |---|---|---|---|---|
 | [Specific change] | D1 / D7 / D30 | [X%] | S/M/L | 1/2/3 |
 **Monitoring Plan:**
 - Metric to track: [X]
 - Review cadence: [Weekly / Monthly]
 - Alert threshold: [If X drops below Y, investigate immediately]
 ---
 ## Required Inputs
 Ask the user for these if not provided:
 - **Product and business model** (SaaS / consumer app / marketplace / other)
 - **Current retention metrics** (D1, D7, D30 if available)
 - **Segment to analyse** (all users / paid / free / a specific cohort)
 - **Key question to answer** (why is retention dropping? what drives retention?)
 - **Available data** (analytics events, churn surveys, interview notes)
 ## Quality Checks
 - [ ] Retention curve shape is diagnosed (flattening vs trending to zero = PMF vs onboarding)
 - [ ] Cohorts are segmented before analysis (not all users lumped together)
 - [ ] "Aha moment" correlation is identified or flagged as unknown
 - [ ] Interventions are specific (not "improve onboarding")
 - [ ] Churned user interviews are recommended (not just data analysis)
 - [ ] Monitoring plan includes an alert threshold
 ## Anti-Patterns
 - [ ] Do not recommend "improve onboarding" without specifying what specific step to change and why
 - [ ] Do not analyse retention without segmenting by cohort — aggregate retention curves hide cohort-specific patterns
 - [ ] Do not treat DAU/MAU below 5% as a retention problem — at that level, it is a product-market fit problem
 - [ ] Do not skip qualitative research — churned user interviews reveal reasons that quantitative data cannot
 - [ ] Do not set a monitoring alert without specifying the threshold that triggers it
 ## Guidelines
 - Never recommend "improve onboarding" without specifying *what* to change and *why*
 - Benchmark against industry — consumer apps, SaaS, and marketplaces have very different retention norms
 - If DAU/MAU is below 5%, that's a PMF conversation, not a retention tactics conversation
 - Always recommend talking to churned users — no amount of data replaces understanding the *reason*
@@ -0,0 +1,160 @@
 # Board Deck Narrative Skill
 This skill builds the complete narrative and slide structure for a board presentation — from opening framing to closing asks. It produces slide-by-slide content guidance, not just a list of topics.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Company stage and context** (Seed / Series A / Growth — and where you are in the year)
 - **Board meeting type** (Regular quarterly / Annual / Special / Fundraise-related)
 - **Key themes for this meeting** (e.g. strong growth quarter / pivoting strategy / hiring challenge / fundraise update)
 - **Key metrics to feature**
 - **Decisions needed from the board** (if any)
 - **Time available** (e.g. 60 min / 90 min)
 - **Audience** (investors only / investors + independent directors / mixed)
 ## Output Structure
 ---
 # Board Deck Narrative: [Company] — [Quarter/Period]
 **Meeting type:** [Regular quarterly / Special]
 **Time:** [X minutes]
 **Narrative theme:** [The one-sentence story of this quarter — e.g. "We hit our revenue target, but activation is the problem we need to solve together."]
 ---
 ## Opening Frame (Slide 1–2)
 **Slide 1: Title**
 - Company name, quarter, date
 - One-sentence framing of the meeting's narrative arc
 **Slide 2: Agenda**
 - List of sections + time allocation
 - Flag which sections need board input vs. are informational
 *Presenter note: Board members are busy. Tell them in the first 2 minutes what you need from them today. It changes how they listen.*
 ---
 ## Business Performance (Slides 3–6, ~15 min)
 **Slide 3: Scorecard / KPI Dashboard**
 - Content: Key metrics vs. targets for the quarter. No more than 6 metrics.
 - Format: Traffic-light table (Green / Amber / Red against plan)
 - Narrative: [1–2 sentences — the headline story of the quarter in numbers]
 - *Don't hide reds. Boards lose trust when they discover hidden problems later.*
 **Slide 4: Revenue / Growth Deep Dive**
 - Content: Revenue breakdown by segment, cohort retention, growth drivers
 - Key message: [What the data shows about the health of growth]
 - Call out: [Any trend that needs board context or discussion]
 **Slide 5: Unit Economics**
 - Content: CAC, LTV, payback period, gross margin — vs. last quarter and vs. plan
 - Flag: Any metric moving in the wrong direction and what's causing it
 **Slide 6: Operational Highlights**
 - Content: 3–5 bullet points of the most significant things that happened this quarter
 - Format: Each bullet = outcome, not activity. ("Signed 3 enterprise contracts worth £400K ARR" not "Continued enterprise sales motion")
 ---
 ## Strategic Update (Slides 7–9, ~15 min)
 **Slide 7: Strategy Snapshot**
 - Content: Where you said you'd be vs. where you are against the annual plan
 - Narrative: [Honest assessment — what's on track, what's shifted and why]
 **Slide 8: Key Strategic Decision or Update**
 - Content: The one strategic topic that most needs board input this meeting
 - Format: Context → Options considered → Recommendation → Question for board
 - *This is the highest-value 10 minutes of the meeting. Frame it as a real question.*
 **Slide 9: Product & Roadmap (if relevant)**
 - Content: Top 3 product bets this quarter — what shipped, what's coming, why these bets
 - Tailored for: What the board needs to understand to support strategic decisions, not a sprint review
 ---
 ## People & Organisation (Slide 10, ~5 min)
 **Slide 10: Team Update**
 - Content: Headcount (start vs. end of quarter), key hires made, open roles, any org changes
 - Flag: Any people risks or leadership gaps the board should know about
 - *Don't skip this slide. Board members often have network value here.*
 ---
 ## Financial Update (Slides 11–12, ~10 min)
 **Slide 11: P&L Summary**
 - Content: Revenue, gross margin, opex by category, EBITDA/net burn — actual vs. budget
 - Include: Year-to-date vs. annual plan
 **Slide 12: Cash & Runway**
 - Content: Cash on hand, monthly burn rate, runway at current burn
 - Include: Scenario if burn increases (e.g. key hire made), scenario if growth accelerates
 - Flag immediately: If runway is < 18 months — this needs board awareness and planning
 ---
 ## Closing & Asks (Slides 13–14, ~10 min)
 **Slide 13: Priorities for Next Quarter**
 - Content: Top 3–5 priorities and what success looks like for each
 - Format: Priority | What we're doing | How we'll know it worked
 - *Keeps board accountability consistent across meetings*
 **Slide 14: Board Asks**
 - Content: Specific things you need from board members before next meeting
 - Format: Each ask = specific, named if possible ("Looking for an intro to [Company] — [Board member X], do you have a connection?")
 - *A board meeting without specific asks is a missed opportunity*
 ---
 ## Appendix (Optional)
 - Detailed cohort analysis
 - Competitive landscape update
 - Full P&L
 - Team org chart
 - Any supporting data referenced in the main deck
 *Appendix slides are available but not presented. Board members who want detail can ask.*
 ---
 ## Narrative Principles
 - **Lead with honesty.** If it was a hard quarter, say so in the first slide. Don't bury bad news after the wins.
 - **One slide = one idea.** If a slide has two messages, split it.
 - **Fewer slides, more depth.** A 14-slide deck presented well beats a 35-slide deck rushed through.
 - **Every slide has a "so what."** A slide that just shows data without a takeaway wastes board time.
 - **Leave time for discussion.** Board value is in the conversation, not the presentation. Aim to spend 40% of the meeting presenting and 60% in discussion.
 ## Quality Checks
 - [ ] Opening frame states the meeting's narrative theme
 - [ ] Scorecard slide uses traffic-light format (not just green metrics)
 - [ ] Strategic decision slide frames a real question for the board
 - [ ] Financial slide includes runway explicitly
 - [ ] Board asks are specific and actionable
 - [ ] Deck is ≤ 15 slides (excluding appendix)
 ## Anti-Patterns
 - [ ] Do not bury bad news after slides full of good news — boards lose trust when they discover problems were de-emphasised; lead with the honest narrative
 - [ ] Do not include slides without a "so what" — a chart that shows data without a takeaway wastes board time and signals the presenter hasn't done the analysis
 - [ ] Do not exceed 15 slides in the main deck — a longer deck usually means the presenter hasn't decided what matters most
 - [ ] Do not attend a board meeting without at least one specific ask — a board meeting with no asks is a missed opportunity to leverage the room
 - [ ] Do not report metrics without comparing them to plan or a prior period — a metric shown in isolation gives the board no basis for judgement
 ## Example Trigger Phrases
 - "Build a board deck structure for our Q[N] board meeting"
 - "Help me create the narrative for our board presentation"
 - "Write the slide structure for our annual board review"
 - "Design a board deck for [specific context — e.g. fundraise update]"
@@ -0,0 +1,130 @@
 # Investor Update Skill
 This skill writes a complete investor update — structured for clarity, honest about challenges, and specific about asks. Output follows the format preferred by most early-stage and growth investors.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Company name and stage** (Seed / Series A / Series B / etc.)
 - **Period covered** (month or quarter)
 - **Key metrics this period** (revenue, MRR, users, churn, burn, runway — whatever's relevant)
 - **Biggest wins**
 - **Biggest challenges or misses**
 - **Specific asks from investors** (intros, advice, talent, partnerships)
 - **What's coming next period**
 - **Tone** (formal / conversational — most investors prefer conversational)
 ## Output Structure
 ---
 **[Company Name] — [Month/Quarter] Update**
 *[Date]*
 ---
 Hi [Investor names or "all"],
 [One or two sentence opener — a specific highlight or honest framing of the period. Don't open with "Hope you're well." Open with the most important thing that happened.]
 ---
 ## The Numbers
 | Metric | This Period | Last Period | Change |
 |---|---|---|---|
 | [MRR / ARR] | [Value] | [Value] | [+/- %] |
 | [Active users / customers] | | | |
 | [Churn rate] | | | |
 | [Burn rate] | | | |
 | [Runway] | | | |
 | [Other key metric] | | | |
 [1–2 sentences of narrative on the numbers — what's the story behind the movement? Don't just repeat the table.]
 ---
 ## Highlights
 **[Highlight 1 — 4–6 word title]**
 [2–4 sentences. What happened. Why it matters. Be specific — name the customer, the number, the milestone.]
 **[Highlight 2]**
 [2–4 sentences]
 **[Highlight 3 — optional]**
 ---
 ## Challenges
 [This section is what separates trustworthy updates from self-promotional ones. Investors know you have challenges. Being direct builds trust.]
 **[Challenge 1]**
 [2–4 sentences. What the problem is. What you've tried. What you're doing about it. Don't spin — investors see through it.]
 **[Challenge 2 — if applicable]**
 ---
 ## Focus for Next [Month/Quarter]
 [3–5 bullet points. What you're concentrating on next period and why. Keep it tight — not an exhaustive roadmap.]
 - [Priority 1]
 - [Priority 2]
 - [Priority 3]
 ---
 ## Asks
 [Be specific. "Let me know if you can help" is not an ask. These should be actionable items an investor can act on immediately.]
 1. **[Ask type: e.g. Intro]** — [Specific request. e.g. "Looking for an intro to procurement leads at mid-market SaaS companies. Happy to share a warm intro note."]
 2. **[Ask type: e.g. Advice]** — [Specific question you want input on]
 3. **[Ask type: e.g. Talent]** — [Specific hire you're looking for — title, key requirements]
 ---
 [Closing line — 1 sentence. Forward-looking or a genuine thanks. Not "as always, let me know if you have questions."]
 [Signature]
 [Name]
 [Company]
 [One way to reply — email / Calendly / reply to this thread]
 ---
 ## Writing Rules
 - Updates should take an investor 3–4 minutes to read. If it's longer, trim it.
 - Never lead with process ("This month we focused on...") — lead with outcomes
 - Challenges section must be honest. A missing challenges section signals the founder isn't self-aware or isn't being transparent.
 - Metrics table must include comparison to last period — a number without context is meaningless
 - Asks must be specific enough that an investor knows within 5 seconds if they can help
 - No jargon or buzzwords ("synergies," "crushing it," "hockey stick") — plain language only
 ## Quality Checks
 - [ ] Opens with a specific highlight or honest framing (not a pleasantry)
 - [ ] Numbers include period-over-period comparison
 - [ ] Challenges section is present and honest
 - [ ] Asks are specific and actionable
 - [ ] Total length is skimmable in 3–4 minutes
 - [ ] No spin or buzzwords
 ## Anti-Patterns
 - [ ] Do not omit challenges or bad news — sanitised updates erode investor trust faster than bad results do
 - [ ] Do not bury the lead — use BLUF structure and put the most important news in the first paragraph
 - [ ] Do not send an update without a clear "Ask" section — investors who want to help need to know how
 - [ ] Do not use buzzwords or spin — investors see hundreds of updates and will see through vague positive language
 - [ ] Do not report metrics without a comparison baseline — numbers without context (vs. last period or target) are meaningless
 ## Example Trigger Phrases
 - "Write an investor update for [month/quarter]"
 - "Draft a monthly update for our investors based on these notes: [paste notes]"
 - "Help me write a board update for Q[N]"
 - "Write our Series A investor newsletter"
@@ -0,0 +1,131 @@
 # Job Application Skill
 This skill tailors a CV and cover letter to a specific job description — optimising for ATS keyword matching while keeping the writing human and compelling. It also flags gaps between the candidate's profile and the role requirements.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Job description** (paste in full)
 - **Current CV / resume** (paste or describe key experience, roles, and skills)
 - **The specific thing that excites them about this role** (used in the cover letter — must be genuine)
 - **Any particular strengths to emphasise** (optional)
 - **Any gaps they're worried about** (optional — helps address them proactively)
 ## Output Structure
 ---
 ## Part 1: JD Analysis
 Before writing anything, analyse the job description and output:
 ### Must-Have Requirements
 [List explicit requirements from the JD — qualifications, years of experience, specific skills]
 ### Key Themes in the JD
 [3–5 themes that repeat or are emphasised — these are the keywords and priorities the hiring manager cares about most]
 ### ATS Keywords to Include
 [List 10–15 specific keywords and phrases from the JD that should appear in the CV and cover letter. Include: tools, methodologies, job titles, skills]
 ### Gaps Assessment
 [Honest comparison between the candidate's profile and the JD requirements. Flag: "Strong match" / "Partial match — can be positioned as X" / "Gap — address in cover letter or don't apply"]
 ---
 ## Part 2: Tailored CV Summary / Profile Section
 Rewrite or create the candidate's CV summary/profile section (the 3–5 lines at the top of a CV) specifically for this role:
 **Rules:**
 - Open with the job title or a near-match (ATS reward)
 - Include 2–3 keywords from the JD naturally
 - Reference years of experience in the relevant area
 - End with a forward-looking line connecting their background to what this role needs
 - Keep to 60–80 words maximum
 **Tailored CV Summary:**
 [Write the summary]
 ---
 ## Part 3: Experience Bullet Point Rewrites
 For the 2–3 most relevant roles on the CV, suggest how to reframe existing bullet points to better match this JD:
 **[Role Title] at [Company]**
 | Original Bullet | Tailored Version | Why |
 |---|---|---|
 | [Candidate's original text] | [Improved version with JD keywords and stronger impact framing] | [Brief note on what changed] |
 **Rules for bullet point rewrites:**
 - Lead with an action verb
 - Include a quantified outcome where possible (%, £, time saved, users impacted)
 - Weave in JD keywords naturally — not forced
 - Keep to one line (2 max)
 ---
 ## Part 4: Cover Letter
 **Format:** 3 paragraphs + closing. Target: 250–350 words. Anything longer won't be read.
 ---
 [Hiring Manager's name if known, otherwise "Hiring Team"]
 **Paragraph 1 — The Hook (Why this role, specifically)**
 [2–4 sentences. Reference something specific about the company or role — not generic enthusiasm. The candidate's genuine reason for applying goes here. This is what makes it human. Generic openers like "I am writing to apply for..." are filtered out mentally within 3 seconds.]
 **Paragraph 2 — The Evidence (Why them)**
 [3–5 sentences. 2–3 specific examples from their background that directly address the JD's key themes. Use the language of the JD. Include at least one quantified achievement. Don't list everything — pick the 2–3 strongest matches and go deep, not broad.]
 **Paragraph 3 — The Forward Bridge (Why now)**
 [2–3 sentences. Connect their trajectory to this role. Why is this the logical next step? What do they want to learn or build that this role enables? This should feel like the natural continuation of their career, not just "I want a new challenge."]
 ---
 I'd welcome the chance to discuss how my background could contribute to [Company/Team]. Thank you for your time.
 [Name]
 [Email] | [LinkedIn URL] | [Location if relevant]
 ---
 ## Part 5: Application Checklist
 Before submitting:
 - [ ] CV summary updated with tailored version above
 - [ ] ATS keywords appear in CV body (not just summary)
 - [ ] Cover letter is under 400 words
 - [ ] Company name is spelled correctly throughout (sounds obvious — it happens)
 - [ ] No generic phrases: "passionate about," "results-driven," "team player" without evidence
 - [ ] LinkedIn profile updated to match CV (recruiters cross-check)
 - [ ] Role title in subject line if emailing directly
 ---
 ## Quality Checks
 - [ ] JD analysis completed before writing (not skipped)
 - [ ] ATS keywords are integrated naturally (not stuffed)
 - [ ] Cover letter opens with something specific (not a generic opener)
 - [ ] Paragraph 2 includes at least one quantified achievement
 - [ ] Cover letter is 250–350 words
 - [ ] Gaps are either addressed or strategically omitted
 ## Anti-Patterns
 - [ ] Do not fabricate or embellish experience — only use real achievements from the provided CV
 - [ ] Do not use the same cover letter template for every role — every letter must reference specific details of the job description
 - [ ] Do not address selection criteria that aren't in the JD — match keywords the employer actually used
 - [ ] Do not omit ATS optimisation — ensure role-specific keywords from the JD appear naturally in the CV summary
 - [ ] Do not write a cover letter that re-summarises the CV — it must add context and motivation, not repeat bullet points
 ## Example Trigger Phrases
 - "Help me apply for this job: [paste JD]"
 - "Tailor my CV for this role: [paste JD + CV]"
 - "Write a cover letter for [role] at [company]"
 - "Optimise my application for ATS for this job description"
@@ -0,0 +1,102 @@
 # Executive Summary Skill
 Writes executive summaries that busy decision-makers actually read — front-loaded with conclusions, structured for skimming, ruthless about what to include.
 ## Required Inputs
 - **Source document or topic** (paste or describe)
 - **Audience** (CEO / board / investor / minister / client / committee)
 - **Decision or action needed** (what should the reader do after reading?)
 - **Length limit** (1 page / 2 pages / 500 words)
 - **Format** (formal report / slide / email / briefing paper)
 ## Core Principle
 An executive summary is NOT a summary of the document. It is a standalone document that:
 - States the conclusion upfront — not at the end
 - Contains only what the reader needs to make a decision
 - Can be understood without reading anything else
 - Recommends a specific action
 ## Output Structure
 ---
 ### [Title]
 **Executive Summary**
 *Prepared for: [Audience] | Date: [Date] | Author: [Name]*
 ---
 **Bottom line up front:**
 [The most important thing. The recommendation or finding. 2-3 sentences. A reader who only reads this should know what you are asking or telling them.]
 ---
 **Background (why this matters):**
 [2-3 sentences. Minimum context to understand the bottom line. Not the history — just what the reader needs now.]
 ---
 **Key findings / analysis:**
 - **[Finding 1]:** [One sentence — specific and evidence-based]
 - **[Finding 2]:** [One sentence]
 - **[Finding 3]:** [One sentence]
 ---
 **Options considered:** (include only if a decision is being presented)
 | Option | Benefit | Risk | Recommendation |
 |---|---|---|---|
 | [Option A] | [Benefit] | [Risk] | Recommended |
 | [Option B] | [Benefit] | [Risk] | Not recommended |
 ---
 **Recommendation:**
 [Specific. "We recommend [action] because [reason]. This will [outcome]." Not "we suggest consideration of options."]
 ---
 **Immediate next steps:**
 - [Action 1 — specific, with owner and date]
 - [Action 2]
 ---
 **Risks of inaction:** [What happens if the reader does nothing]
 **Full report:** [Reference to where the full document can be found]
 ---
 ## Adapting for Different Audiences
 **CEO/MD:** Lead with financial or strategic impact. 1 page. Make the decision binary. Ask in sentence one.
 **Board:** Lead with governance or risk. Frame against organisational objectives. State specifically what you need from them.
 **Investor:** Lead with return or opportunity. Specific numbers. 1 page. Anticipate "why now."
 **Minister/senior public sector:** Lead with public benefit or policy alignment. Include cost-benefit framing.
 **Client:** Lead with their problem. Show you understand before presenting recommendation.
 ## Quality Checks
 - [ ] Bottom line in first 3 sentences
 - [ ] Standalone — no need to read full document
 - [ ] Recommendation is specific
 - [ ] Fits length limit
 - [ ] Written for audience priorities not author priorities
 - [ ] Next steps have owners and dates
 ## Anti-Patterns
 - [ ] Do not summarise the document chronologically — an executive summary that follows the structure of the source document is not an executive summary, it is an abstract
 - [ ] Do not bury the recommendation at the end — executives read the first paragraph and skim the rest; the ask must be in sentence one or two
 - [ ] Do not use the same summary for different audiences — a CEO and a board member have different decision contexts and require different framing
 - [ ] Do not include background that the reader already knows — every sentence of background must earn its place by making the bottom line more actionable
 - [ ] Do not leave the "risks of inaction" section vague — a summary that does not quantify what happens if the reader does nothing removes the urgency needed for a decision
 ## Example Trigger Phrases
 - "Write an executive summary of this report: [paste]"
 - "Summarise this document for the board: [paste]"
 - "Create a one-pager from this proposal for the CEO"
 - "Turn these findings into an exec summary"
@@ -0,0 +1,105 @@
 # Grant Proposal Skill
 Produces structured grant proposals tailored to the funder priorities — the most common reason grants fail is writing about what you want to do rather than what the funder wants to fund.
 ## Required Inputs
 - **Funder name and grant programme**
 - **Grant amount sought**
 - **Project description** (rough notes are fine)
 - **Your organisation** (type, track record, capacity)
 - **Funder stated priorities** (copy from their guidance — essential)
 - **Word or page limits**
 - **Deadline**
 ## Output Structure
 ---
 ### Project Title
 [Informative and memorable. Should convey the problem being solved and the approach.]
 ### 1. Project Summary / Abstract (200-300 words — written last, placed first)
 [What you will do, why it matters, who will benefit, measurable outcomes. Every sentence earns its place.]
 ### 2. Problem Statement / Need
 - **The problem:** [Specific, evidenced — use data]
 - **Who is affected:** [Population, scale, geography]
 - **Current situation:** [What exists and why it is insufficient]
 - **Consequence of inaction:** [What happens if not funded]
 - **Why your organisation:** [Track record, relationships, expertise]
 Funder test: does this problem align with [funder] stated priorities? Make the connection explicit.
 ### 3. Project Objectives
 3-5 SMART objectives:
 - **Objective 1:** [Specific, Measurable, Achievable, Relevant, Time-bound]
 ### 4. Methodology / Approach
 **Phase 1: [Name]** (Months 1-X)
 [What will happen, who will do it, what is produced]
 **Key activities:**
 - [Activity — specific]
 **What makes this approach innovative or effective:** [Why this over alternatives]
 ### 5. Impact and Outcomes
 | Level | Description | Measure |
 |---|---|---|
 | Output | [Tangible deliverable] | [How counted] |
 | Short-term outcome | [Immediate change] | [How measured] |
 | Medium-term outcome | [Behaviour change] | [How measured] |
 | Long-term impact | [Systemic change] | [How evidenced] |
 **Direct beneficiaries:** [Who and how many]
 **Sustainability:** [How work continues beyond grant period]
 ### 6. Evaluation Plan
 - Who evaluates, how, when, what is measured, how findings are shared
 ### 7. Budget Narrative
 | Budget line | Amount | Justification |
 |---|---|---|
 | Staff costs | £[amount] | [Role, % FTE, duration, salary] |
 | Travel | £[amount] | [Specific journeys named] |
 | Equipment | £[amount] | [Itemised] |
 | Indirect costs | £[amount] | [[X]% of direct — check policy] |
 | **Total** | **£[total]** | |
 **Value for money:** [Cost per beneficiary. What could not be done without this grant]
 ### 8. Organisational Capacity
 [Track record of similar projects, governance, financial management. Name previous grants and outputs — be specific]
 ### 9. Risk Register
 | Risk | Likelihood | Impact | Mitigation |
 |---|---|---|---|
 | [Risk] | H/M/L | H/M/L | [Specific mitigation] |
 ---
 ## Quality Checks
 - [ ] Every section explicitly references funder stated priorities (not just generic language)
 - [ ] Problem statement includes specific data, not just assertions
 - [ ] Objectives are SMART (measurable and time-bound)
 - [ ] Budget narrative justifies every line with specific detail
 - [ ] Sustainability section explains what happens after the grant ends
 - [ ] Word limits respected
 ## Anti-Patterns
 - [ ] Do not write a generic proposal — every section must be tailored to the specific funder's stated priorities
 - [ ] Do not exceed the specified word or page limits — over-length proposals are disqualified at many funders
 - [ ] Do not leave the sustainability section vague — funders need to know what happens after grant funding ends
 - [ ] Do not use jargon the funder's reviewers won't understand — write for the panel, not the project team
 - [ ] Do not underspecify the budget narrative — every significant line item must be justified with method and reasoning
 ## Example Trigger Phrases
 - "Write a grant proposal for [project] applying to [funder]"
 - "Help me write a funding application for [grant programme]"
 - "Turn these project notes into a grant proposal: [paste]"
@@ -0,0 +1,153 @@
 # Last 30 Days Research
 ## The Problem
 Googling gives SEO-stuffed "best of" lists written six months ago by someone who has never used the thing. Real honest takes live on Reddit threads, X replies, and niche communities — but chasing them across platforms eats your afternoon. This skill does the chase for you.
 ## Required Inputs
 | Input | Required | Notes |
 |-------|----------|-------|
 | Topic | Yes | Tool, trend, feature, product, event, company — anything with a name |
 | Date scope | No | Defaults to last 30 days. Can override to last 7 days or last 90 days |
 | Angle | No | e.g. "focus on developer sentiment" or "looking for pricing complaints specifically" |
 ## Output Structure
 The output is a structured research report with the following sections, delivered in this exact order:
 ```
 ## Last 30 Days Research: [Topic]
 Research window: [Date 30 days ago] → [Today's date]
 ---
 ## What People Agree On
 [Consensus points that appear across multiple platforms — most reliable signal]
 ## Where People Disagree
 [Active debates, contrasting views — include which side has more weight]
 ## Pain Points That Keep Coming Up
 [Recurring complaints and frustrations — strongest signal of real problems]
 ## Positive Signals
 [What people genuinely praise — not PR, but unprompted appreciation]
 ## Most Interesting Takes
 [Contrarian, unexpected, or surprisingly insightful comments worth noting]
 ## Sources
 [Links to the most useful threads/posts found — 5–10 links with brief labels]
 ## Signal Confidence
 [High / Medium / Low — with a one-line rationale based on data volume and consistency]
 ```
 Each section should contain substantive content, not placeholders. If a section has no findings (e.g. no positive signals found), state that explicitly rather than leaving it empty or fabricating content.
 ## Instructions for Claude
 ### Step 1 — Calculate the date window
 Determine today's date and subtract 30 days to get the research start date. Format: YYYY-MM-DD. Use these dates explicitly in every search query.
 ### Step 2 — Reddit search
 Run at least three web searches targeting Reddit:
 ```
 site:reddit.com "[topic]" after:[30-days-ago-date]
 site:reddit.com "[topic]" 2025
 reddit.com "[topic]" discussion OR thread OR comments
 ```
 For each result: read the thread title, top-level comments, and any highly-upvoted replies. Record the key claims and the URL.
 If the topic has common synonyms or abbreviations, run additional searches with those (e.g. "Claude Code" and "claude.code" and "Anthropic coding tool").
 ### Step 3 — X/Twitter search
 Run at least two web searches targeting X:
 ```
 site:twitter.com OR site:x.com "[topic]" after:[30-days-ago-date]
 "[topic]" site:x.com -is:retweet
 ```
 Note: X search via web has limitations. If results are sparse, supplement with searches for specific accounts known to discuss the topic area (e.g. tech journalists, domain experts).
 ### Step 4 — Broader web search
 Run at least two broader searches for articles, blog posts, and commentary:
 ```
 "[topic]" review OR opinion OR experience [month] [year]
 "[topic]" vs OR alternative OR comparison [month] [year]
 ```
 Target sources: Hacker News, Substack, dev.to, personal blogs, product communities. Avoid press releases and vendor-authored content.
 ### Step 5 — Cross-platform corroboration check
 Before writing the report, review everything collected and apply the corroboration rule:
 **When the same point appears on both Reddit and X independently, treat it as strong signal — it's likely true.**
 A point mentioned only once on one platform is a data point, not a finding. Weight your sections accordingly.
 ### Step 6 — Write the report
 Populate each section of the output structure. Follow these rules:
 - **What People Agree On**: Only include points you saw on 2+ platforms or in multiple independent threads. These are your most reliable findings.
 - **Where People Disagree**: Name the sides. "Some say X, others say Y — and the X camp seems louder based on upvote counts / engagement."
 - **Pain Points**: Be specific. "Performance issues" is weak. "Cold start times over 4 seconds on the free tier" is useful.
 - **Positive Signals**: Must be unprompted praise, not from product marketing or sponsored content.
 - **Most Interesting Takes**: At least 2, maximum 5. Quote or closely paraphrase where possible.
 - **Sources**: Include the actual URLs. Label each one briefly (e.g. "Reddit thread: 'Has anyone switched from X to Y?'").
 - **Signal Confidence**: Rate High/Medium/Low based on:
  - High = 10+ sources, consistent signal across platforms
  - Medium = 5–10 sources, some inconsistency
  - Low = fewer than 5 sources, or highly fragmented signal
 ### Step 7 — Sanity check before delivering
 Before outputting the report, verify:
 - [ ] Every claim in the report traces to an actual source found during research (not prior knowledge)
 - [ ] The date window was actually applied to searches, not ignored
 - [ ] No fabricated or hallucinated URLs in the Sources section
 - [ ] Signal Confidence rating reflects the actual data volume, not optimism
 ## Quality Checks
 - [ ] At minimum 3 Reddit searches were run with the date filter applied
 - [ ] At minimum 2 X/Twitter searches were run
 - [ ] At minimum 2 broader web searches were run
 - [ ] Cross-platform corroboration principle was applied (same point on multiple platforms = stronger signal)
 - [ ] Pain Points section contains specific, concrete details — not vague generalisations
 - [ ] Sources section contains real URLs (not hallucinated), verified during research
 - [ ] Signal Confidence is rated and justified
 - [ ] If a section has no findings, it says so explicitly rather than being omitted or padded
 - [ ] No vendor-authored content or press releases treated as independent signal
 - [ ] Synonyms and alternative names for the topic were searched
 ## Anti-Patterns
 - [ ] Do not treat SEO blog posts or vendor-authored content as community signal — only count independent sources
 - [ ] Do not report findings without applying the date filter — prior knowledge mixed with recent search results produces stale, unverifiable claims
 - [ ] Do not fabricate or guess at URLs — every link in the Sources section must have been retrieved during the research session
 - [ ] Do not report a single mention as a "finding" — a finding requires corroboration from at least two independent sources
 - [ ] Do not rate Signal Confidence as High when fewer than 5 credible sources were found — this misleads the reader about how much to rely on the output
 ## Example Trigger Phrases
 - "What are people saying about Cursor AI from the last 30 days?"
 - "Research Vercel's recent sentiment"
 - "Last 30 days on the Arc browser shutdown"
 - "What's the current vibe on Supabase?"
 - "What are developers saying about Claude Code lately?"
 - "Research [topic] from the last 30 days"
 - "Give me a signal report on [product]"
 - "What's the Reddit and Twitter take on [trend]?"
@@ -0,0 +1,178 @@
 # NotebookLM Connector
 ## The Problem
 NotebookLM is one of the best AI research tools — but it doesn't connect to your other tools. Every notebook requires manual setup inside the NotebookLM UI: open browser, name the notebook, paste URLs one by one, click generate. For researchers, builders, or anyone who works with a high volume of sources, this friction compounds fast.
 This skill automates NotebookLM from Claude Code using browser automation via the Claude Chrome extension.
 ## Prerequisites
 | Requirement | Details |
 |-------------|---------|
 | Claude Chrome extension | Must be installed and active in your Chrome browser |
 | NotebookLM account | Active account at notebooklm.google.com |
 | Chrome browser | Open and signed into NotebookLM |
 If the Chrome extension is not installed, this skill cannot function. There is no fallback — you will need to perform actions manually.
 ## Required Inputs
 | Input | Required | Notes |
 |-------|----------|-------|
 | Action(s) to perform | Yes | What you want done — see Supported Actions below |
 | Notebook name | Conditional | Required for create; optional for add/generate if a notebook is already open |
 | Sources | Conditional | Required for add sources action — URLs, file paths, or pasted text |
 | Output type | Conditional | Required for generate action — mindmap, audio overview, or briefing doc |
 ## Supported Actions
 | Action | What It Does |
 |--------|-------------|
 | Create notebook | Opens NotebookLM, creates a new notebook with the specified title |
 | Add sources | Adds one or more URLs, files, or text blocks as sources to a notebook |
 | Generate mindmap | Triggers mindmap generation from the notebook's sources |
 | Generate audio overview | Requests an audio overview (note: takes several minutes to render) |
 | Generate briefing doc | Requests a briefing document or slide deck from sources |
 | List notebooks | Lists your existing notebooks and their source counts |
 | Open notebook | Navigates to a specific existing notebook by name |
 Actions can be chained in a single request: "Create a notebook called 'AI Trends Q2', add these 3 URLs as sources, then generate a mindmap."
 ## Output Structure
 After completing actions, Claude returns a structured confirmation:
 ```
 ## NotebookLM — Actions Completed
 **Notebook:** [Notebook name]
 **URL:** [Direct link to the notebook]
 **Actions completed:**
 - [x] Created notebook: "[Name]"
 - [x] Added source: [URL or file name]
 - [x] Added source: [URL or file name]
 - [x] Triggered: Mindmap generation
 **Status:** [Any pending items — e.g. "Audio overview is generating, check back in 5–10 minutes"]
 **Notes:** [Any issues encountered or deviations from the requested actions]
 ```
 If an action fails, the failed step is marked with `[ ]` and a reason is provided. See Error Handling below.
 ## Instructions for Claude
 ### Step 1 — Parse and confirm the request
 Before opening any browser, parse the full request into discrete steps:
 1. What notebook is being targeted (new or existing)?
 2. What sources need to be added (list each URL or file)?
 3. What outputs need to be generated?
 If anything is ambiguous — e.g. "add my research sources" without specifying what they are — ask for clarification before proceeding. Do not guess at source URLs.
 ### Step 2 — Check the Chrome extension is available
 Confirm browser automation is available via the Claude Chrome extension. If it is not active, stop and report:
 > "This skill requires the Claude Chrome extension to be installed and active. Please install it at [extension URL] and try again."
 ### Step 3 — Navigate to NotebookLM
 Open or navigate to `https://notebooklm.google.com`. Confirm the user is logged in. If a login screen appears, stop and ask the user to log in manually, then retry.
 ### Step 4 — Execute actions in order
 Execute each action in the sequence requested. After each action, confirm it completed before moving to the next. Do not batch actions speculatively.
 **Creating a notebook:**
 - Click "New Notebook"
 - Enter the specified title
 - Confirm the notebook is created and visible
 **Adding a URL source:**
 - In the notebook, click "Add Source"
 - Select "Website" or "URL"
 - Paste the URL
 - Wait for the source to process and appear in the sources list
 - Confirm before adding the next source
 **Adding pasted text:**
 - Click "Add Source"
 - Select "Copied text" or "Paste text"
 - Paste the content
 - Confirm the source appears
 **Generating a mindmap:**
 - Navigate to the notebook's output options
 - Select "Mindmap" from available outputs
 - Trigger generation
 - Confirm the mindmap begins rendering
 **Generating an audio overview:**
 - Navigate to output options
 - Select "Audio Overview"
 - Trigger generation
 - Note: rendering takes several minutes — report this to the user, do not wait for completion
 ### Step 5 — Compile and return the confirmation
 Return the structured output described in the Output Structure section above, including the direct notebook URL and a checklist of completed/failed actions.
 ## Error Handling
 If any step fails, do the following:
 1. Stop at the failed step (do not attempt to continue)
 2. Report the exact step that failed and what was observed
 3. Suggest a manual workaround for that step
 4. Offer to retry from that point
 **Common failures and workarounds:**
 | Failure | Likely Cause | Manual Workaround |
 |---------|-------------|-------------------|
 | Extension not detected | Extension not installed or disabled | Install from Chrome Web Store |
 | Login screen appears | Session expired | Log in manually, then retry |
 | Source fails to process | URL is paywalled or blocked | Download content and add as pasted text instead |
 | Mindmap not available | Source volume too low | Add more sources (NotebookLM requires minimum content) |
 | Audio overview grayed out | Sources not yet indexed | Wait 1–2 minutes for indexing, then retry |
 ## Limitations
 - **Chrome extension required** — This skill does not work in the Claude web interface without the extension. It cannot function in API-only or terminal-only Claude setups.
 - **NotebookLM UI changes** — If Google updates the NotebookLM interface, specific steps (button names, navigation paths) may need to be updated in this skill.
 - **Audio overview render time** — Audio overviews are queued server-side by NotebookLM and typically take 5–15 minutes. Claude can trigger the request but cannot wait for completion.
 - **File uploads** — Uploading local files (PDFs, docs) requires the file to be accessible from the browser. File paths must be absolute.
 - **Session state** — Claude cannot save or restore NotebookLM session state between conversations. Each session starts fresh.
 ## Quality Checks
 - [ ] User's full request was parsed into discrete steps before any browser action was taken
 - [ ] Ambiguous source references were clarified before proceeding
 - [ ] Each action was confirmed complete before the next one started
 - [ ] Direct notebook URL is included in the output
 - [ ] If audio overview was triggered, user was informed of the render delay
 - [ ] Any failed steps are explicitly reported with the specific failure reason
 - [ ] Manual workaround was offered for any step that failed
 - [ ] Output checklist accurately reflects what was completed vs. what failed
 ## Anti-Patterns
 - [ ] Do not proceed with any browser action before the full request has been parsed into discrete steps — ambiguous source references must be clarified before navigating
 - [ ] Do not guess at source URLs if the user says "add my research sources" without specifying them — ask for the explicit list before starting
 - [ ] Do not batch actions speculatively — each action must be confirmed complete before the next one begins to avoid compounding failures
 - [ ] Do not wait for audio overview rendering to complete — audio overviews take 5–15 minutes server-side; report the trigger and move on rather than blocking the session
 - [ ] Do not attempt this skill if the Claude Chrome extension is not active — report the missing prerequisite immediately rather than attempting browser steps that will fail
 ## Example Trigger Phrases
 - "Open NotebookLM and create a notebook called 'Competitor Analysis Q2'"
 - "Add these 5 URLs as sources to my NotebookLM notebook"
 - "Generate a mindmap in NotebookLM from my current notebook"
 - "Create a NotebookLM notebook on AI agent frameworks, add these sources, and generate an audio overview"
 - "What notebooks do I have in NotebookLM?"
 - "Add this article to NotebookLM: [URL]"
 - "Generate a briefing doc from my NotebookLM sources on [topic]"
@@ -0,0 +1,82 @@
 # Press Release Skill
 Writes press releases that journalists actually read — structured around the news angle, not the desire to promote.
 ## Required Inputs
 - **The news** (what is actually happening — be specific)
 - **Company name**
 - **Date of announcement / embargo date**
 - **Key quote** (from which executive and approximately what they want to say)
 - **Why this matters** (to the reader, not the company)
 - **Target media** (trade / national / local / consumer / investor)
 - **Media contact details**
 ## Output Structure
 ---
 FOR IMMEDIATE RELEASE / EMBARGOED UNTIL: [Date and time]
 ---
 # [Headline — active verb, specific news, under 10 words]
 ## [Subheadline — the so-what in one sentence, adds context not repetition]
 **[City, Date]** — [Opening paragraph: Who, What, When, Where, Why in 2-3 sentences. A journalist should be able to run this paragraph alone. No background, no context, no company history.]
 [Second paragraph: the significance. Why does this matter? What does it mean for customers or the industry?]
 [Third paragraph: quote from executive. Human and specific. Not a restatement of the headline.]
 "[Quote text — specific, adds something the facts do not say]," said [Name], [Title] at [Company]. "[Second sentence extending the thought]."
 [Fourth paragraph: supporting detail — data, customer names with permission, additional context]
 [Fifth paragraph optional: what happens next, when it goes live, what people can do]
 ---
 ENDS
 ---
 **Notes to editors:**
 **About [Company]**
 [Boilerplate: 3-4 sentences. What the company does, when founded, where based, key facts. Factual not promotional.]
 **Media contact:**
 [Name] | [Title] | [Email] | [Phone] | [Hours/timezone]
 ---
 ## Headline Rules
 - Active voice: "Company launches X" not "X is launched by Company"
 - Specific: "raises 5M" not "secures significant investment"
 - Under 10 words
 - Never start with the company name — lead with the news
 ## Journalist Test
 Would a journalist care? Is the headline the full story? Is there a human angle? Is the quote something a human would say? Can the first paragraph stand alone?
 ## Quality Checks
 - [ ] Headline uses active voice and is under 10 words
 - [ ] First paragraph stands alone as the complete story
 - [ ] Quote adds something the facts don't say (not a restatement)
 - [ ] Boilerplate is factual, not promotional
 - [ ] Embargo date and media contact are included
 ## Anti-Patterns
 - [ ] Do not bury the news — the most important information must appear in the first paragraph (inverted pyramid)
 - [ ] Do not use promotional language or superlatives — press releases must read as news, not advertising copy
 - [ ] Do not omit the boilerplate — every press release needs the standard "About [Company]" paragraph at the end
 - [ ] Do not forget the embargo date and media contact — journalists need both to use the release
 - [ ] Do not write a headline longer than 12 words — it must be scannable and specific
 ## Example Trigger Phrases
 - "Write a press release announcing [news]"
 - "Draft a media statement about [event]"
 - "We are launching [product] — write the press release"
 - "Turn this announcement into a press release: [paste notes]"
@@ -0,0 +1,159 @@
 # Sycophancy Challenger
 Claude defaults to validating. You bring a decision, it finds three reasons your instinct is solid, and you leave more confident but not more right. That's actively dangerous when the stakes are high — a hiring call, a pricing change, a strategy pivot, a public commitment. This skill flips the default: Claude argues against your idea first, holds its position under pushback, and only concedes when you give it new evidence. Not when you express displeasure.
 > Credit: Originally created by Joel Salinas (Leadership in Change) — adapted and extended for this library.
 ---
 ## Required Inputs
 | Input | Format | Notes |
 |---|---|---|
 | Your idea, decision, plan, or assumption | Describe it in plain language | More context = sharper challenge. Include reasoning if you have it. |
 No other setup required. Activating the skill is enough — describe your idea and Claude will challenge it immediately.
 ---
 ## Output Structure
 Every response in this mode follows this exact format:
 ```
 ## Strongest Case AGAINST This
 [The single most damaging criticism of the idea. Not a list of concerns — the
 one argument that, if true, would kill this. Stated directly, without softening.]
 ## The Weakest Element
 [The specific part of the idea most likely to fail, be wrong, or break under
 real-world conditions. Named precisely. Not "execution risk" — the actual thing.]
 ## What You'd Need to Prove to Make This Work
 [The assumptions that must be true for this idea to succeed. Written as testable
 claims, not as encouragement. If an assumption can't be tested, that's noted.]
 ## What I Can't Find Fault With
 [Only appears when a genuine search finds nothing damaging. States clearly what
 holds up and why — doesn't invent weak praise to fill the section. If everything
 is actually fine, says so plainly and explains why the challenge came up short.]
 ```
 No additional sections. No summary. No "overall, this is a solid idea." The format ends when the four sections are complete.
 ---
 ## Instructions for Claude
 ### On activation
 Do not open with agreement, validation, or any form of "I see where you're coming from." Begin the challenge immediately. The first word of your response should advance the criticism, not soften the user's expectations.
 ### Step 1: Assume the idea hasn't been stress-tested
 Treat the idea as if the user believes in it strongly and has not actively looked for reasons it fails. Your job is to be the adversary they didn't have in the room.
 ### Step 2: Find the strongest case against it
 Not a balanced view. Not pros and cons. The strongest case against. Ask:
 - What's the most likely way this fails?
 - What's the assumption that, if wrong, makes everything else irrelevant?
 - Who would argue against this, and what's the best version of their argument?
 - What does this idea get wrong about how people, markets, or systems actually behave?
 State the strongest case directly. Do not list multiple criticisms in this section — lead with the one that does the most damage.
 ### Step 3: Identify the weakest element
 This is different from the strongest case against. The weakest element is the most fragile specific component — the thing most likely to crack under execution, scrutiny, or changed conditions. Name it precisely. Examples of insufficient answers:
 - "The timeline might be tight" → insufficient
 - "The assumption that customers will pay $99/month before experiencing the product is the element most likely to break this, because you have no evidence of willingness-to-pay at that price point" → correct level of specificity
 ### Step 4: Surface the required assumptions
 List what must be true for this to work. Write each assumption as a testable claim:
 ```
 For this to work, the following must be true:
 1. [Assumption stated as a claim that can be verified or falsified]
 2. [Assumption stated as a claim]
 3. [Assumption stated as a claim]
 ```
 If an assumption cannot be tested — it's based on hope, belief, or unprovable prediction — flag it explicitly: "This assumption cannot currently be tested. That's a risk."
 ### Step 5: Report what holds up (only if true)
 Search genuinely for what the idea gets right or where the challenge fails. If you find it, state it clearly. If you can't find a real flaw, say exactly that: "I've looked for the failure points and I can't find them. Here's what actually holds up: [specific things]." Do not invent praise. Do not invent flaws either.
 ### Handling pushback
 If the user pushes back:
 - **New evidence or new information:** update your position based on the evidence. State what changed and why.
 - **Emotional pushback, repetition, or displeasure:** do not move. Restate the criticism calmly. Example: "I understand you feel strongly about this — I'm not backing off the point about X because that hasn't changed. If there's something I'm missing, tell me what it is."
 - **A clarification that changes the picture:** acknowledge the clarification, adjust if warranted, and explain exactly what the clarification changed.
 Do not soften a position because the user seems upset. Do not move back to validation mode mid-conversation.
 ### When the skill ends
 The session is complete when the user has either:
 1. Strengthened their idea by addressing the core criticism with real evidence or a genuine plan adjustment, or
 2. Identified a real flaw they're going to fix.
 Not when they've expressed satisfaction. Not when a certain number of exchanges have happened. The measure is whether something actually changed or was genuinely defended.
 ### Prohibitions
 These prohibitions do more work than the rules above. Follow them absolutely:
 - **Never open with agreement or validation.** Not "That's an interesting approach," not "I can see why you'd think that." Start with the challenge.
 - **Never say "great question," "great point," or "I see where you're coming from" as a lead.** These are validation openers, not neutral transitions.
 - **Never soften a criticism with "however, there are also positives."** If the positives are real, they go in the "What I Can't Find Fault With" section, not as a counterweight to every criticism.
 - **Never back down because the user expressed displeasure.** Only move if given new evidence.
 - **Never invent a flaw that isn't real.** If the idea is actually solid, say so. Inventing fake criticisms is as useless as fake validation.
 - **Never use the word "valid" to describe the user's perspective mid-challenge.** It's a validation signal disguised as a neutral word.
 ---
 ## Quality Checks
 - [ ] Response opened with the challenge — not with a softening phrase or acknowledgment
 - [ ] "Strongest Case Against" section contains one argument, not a list
 - [ ] "Weakest Element" is specific — names the actual component, not a category of risk
 - [ ] "What You'd Need to Prove" lists testable assumptions, not encouragement
 - [ ] Untestable assumptions are explicitly flagged as risks
 - [ ] "What I Can't Find Fault With" only appears if the search was genuine and something held up
 - [ ] No invented flaws — every criticism connects to something real in what the user described
 - [ ] Pushback was met with a position restatement, not a retreat (unless new evidence was provided)
 - [ ] The session ended because something changed or was genuinely defended — not because the user seemed satisfied
 - [ ] None of the prohibited phrases or patterns appear anywhere in the response
 ---
 ## Anti-Patterns
 - [ ] Do not open with a softening phrase or acknowledgment before the challenge — the first sentence must be the critique
 - [ ] Do not retreat from a position when the user pushes back without providing new evidence — update only when genuinely persuaded
 - [ ] Do not invent flaws — every criticism must connect to something real in what the user described
 - [ ] Do not provide a list of weak objections — identify the single strongest case against the idea
 - [ ] Do not end the session because the user seems satisfied — end only when something genuinely changed or was defended
 ## Example Trigger Phrases
 - "Use the sycophancy-challenger skill — here's my plan: [describe it]"
 - "Challenge this idea before I commit to it: [describe it]"
 - "I've already decided to do X — tell me why I'm wrong"
 - "Be the devil's advocate on this hire: [describe the candidate and the role]"
 - "I'm about to pitch this to investors — tear it apart first: [describe it]"
 - "Don't validate this, challenge it: [idea or assumption]"
 - "Stress-test this strategy: [describe it]"
 - "What's the strongest argument against doing this: [decision]"
 - "I think I'm right about X — what am I missing?"
@@ -0,0 +1,121 @@
 # Teaching Lesson Plan Skill
 Produces a complete, structured lesson plan for any subject, age group, or setting — from a one-hour corporate training to a full school lesson. Built around clear learning objectives, varied activities, and formative assessment.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Subject or topic**
 - **Audience** (age group, experience level, group size)
 - **Session length** (30 / 45 / 60 / 90 / 120 minutes)
 - **Setting** (classroom / workshop / online / corporate training / one-to-one)
 - **Learning goal** (what should participants know or be able to do by the end?)
 - **Prior knowledge** (what can you assume they already know?)
 ## Output Structure
 ---
 # Lesson Plan: [Topic]
 **Subject:** [Subject] | **Audience:** [Description] | **Duration:** [X minutes]
 **Setting:** [Setting] | **Group size:** [N]
 ---
 ## Learning Objectives
 By the end of this session, participants will be able to:
 1. [Objective 1 — use Bloom's taxonomy verbs: recall, explain, apply, analyse, evaluate, create]
 2. [Objective 2]
 3. [Objective 3 — maximum 3–4 objectives per session]
 **Key vocabulary:** [3–5 terms participants will need to know]
 ---
 ## Materials and Preparation
 - [ ] [Resource 1 — slides, handout, equipment]
 - [ ] [Resource 2]
 - [ ] Room setup: [configuration — rows / circles / tables / breakout spaces]
 ---
 ## Lesson Structure
 | Time | Phase | Activity | Format |
 |---|---|---|---|
 | [00:00] | Hook / Opener | [How you grab attention and establish relevance] | [Whole group / Individual / Pairs] |
 | [00:05] | Prior knowledge | [How you connect to what they already know] | [Discussion / Quiz / Think-pair-share] |
 | [00:15] | Instruction | [Direct teaching of new content] | [Explanation / Demo / Video] |
 | [00:30] | Guided practice | [Supported practice with feedback] | [Worked examples / Group task] |
 | [00:50] | Independent practice | [Students apply learning independently] | [Task / Problem / Discussion] |
 | [01:05] | Check for understanding | [Formative assessment] | [Exit ticket / Quiz / Q&A] |
 | [01:15] | Closure | [Summarise, connect to next session] | [Whole group] |
 ---
 ## Key Explanations and Worked Examples
 ### [Concept 1]
 [Clear explanation + one concrete worked example. Explain the concept the way a good teacher would — no jargon without definition, one idea at a time.]
 ### [Concept 2]
 [Explanation + example]
 ---
 ## Differentiation
 **For those who need more support:**
 - [Scaffold: e.g. sentence starters, worked examples, vocabulary cards]
 - [Modified task or reduced scope]
 **For those ready for a challenge:**
 - [Extension: e.g. apply to a new context, evaluate, create something]
 ---
 ## Formative Assessment (Check for Understanding)
 **During session:**
 - [Method 1: e.g. Cold calling with no-stakes approach, thumbs up/down, mini whiteboards]
 - [Method 2: e.g. Think-pair-share before moving on]
 **Exit ticket (last 5 minutes):**
 [One specific question that directly tests the learning objective — not "what did you enjoy?" but "solve this problem" or "explain this concept in your own words"]
 ---
 ## Common Misconceptions to Address
 | Misconception | Correct understanding | How to address it |
 |---|---|---|
 | [What learners often get wrong] | [The correct version] | [Specific activity or explanation] |
 ---
 ## Quality Checks
 - [ ] Learning objectives use action verbs (not "understand" or "know")
 - [ ] Session has a clear hook that establishes relevance
 - [ ] Activities are varied (not all listening)
 - [ ] Formative assessment checks the actual learning objective
 - [ ] Differentiation is specified for both support and extension
 - [ ] Timing adds up to session length
 ## Anti-Patterns
 - [ ] Do not design a lesson plan without explicitly stating the learning objectives — activities must trace back to outcomes
 - [ ] Do not allocate timing that does not add up to the total session length — the plan must be time-feasible
 - [ ] Do not create activities with no assessment component — learning must be measurable, not just delivered
 - [ ] Do not ignore differentiation — a plan with no accommodation for different learning levels or abilities is incomplete
 - [ ] Do not front-load all content delivery without interactive breaks — passive listening degrades retention after 15–20 minutes
 ## Example Trigger Phrases
 - "Write a lesson plan on [topic] for [audience]"
 - "Design a 60-minute session on [subject]"
 - "Create a training module on [skill]"
 - "Plan a workshop on [topic] for [group]"
@@ -0,0 +1,182 @@
 # Churn Analysis Skill
 Produce a structured churn analysis that goes beyond the headline rate — identifying why customers leave, which segments are most at risk, and what interventions will have the highest impact on retention.
 ## Required Inputs
 Ask for these if not already provided:
 - **Time period** being analysed (e.g. Q1, last 12 months)
 - **Total customers at start of period** and **customers churned**
 - **ARR or revenue lost** to churn
 - **Churn reasons data** — exit survey results, CSM notes, support data, or sales loss reasons
 - **Customer segments** — by tier, industry, cohort, or product line
 - **Current retention rate** if known
 - **Any recent changes** — pricing, product, support model — that may have affected churn
 ## Churn Categories
 Always classify churn before analysing it:
 | Category | Definition |
 |---|---|
 | **Voluntary — avoidable** | Customer left due to a problem we could have addressed (product gaps, poor onboarding, relationship failures) |
 | **Voluntary — unavoidable** | Customer left for reasons outside our control (budget cuts, acquisition, company shutdown) |
 | **Involuntary** | Payment failure, contract non-renewal by mistake, admin error |
 The interventions for each category are different. Conflating them leads to wrong conclusions.
 ## Output Format
 ---
 # Churn Analysis: [Product / Segment / Company]
 **Period:** [Start date] — [End date]
 **Prepared by:** [Name] | **Date:** [Date]
 ---
 ## Headline Numbers
 | Metric | Value |
 |---|---|
 | Customers at start of period | [N] |
 | Customers churned | [N] |
 | **Customer churn rate** | **[X]%** |
 | ARR at start of period | £/$/€[X] |
 | ARR lost to churn | £/$/€[X] |
 | **Revenue churn rate (gross)** | **[X]%** |
 | ARR from expansions (same period) | £/$/€[X] |
 | **Net revenue retention (NRR)** | **[X]%** |
 **Benchmark context:**
 - Customer churn rate: [X]% vs. industry benchmark [Y]% — [above / below / in line]
 - NRR: [X]% — [What this means: above 100% = expansion offsets churn; below 100% = shrinking base]
 ---
 ## Churn Breakdown by Category
 | Category | Customers | % of churn | ARR lost |
 |---|---|---|---|
 | Voluntary — avoidable | [N] | [X]% | £/$/€[X] |
 | Voluntary — unavoidable | [N] | [X]% | £/$/€[X] |
 | Involuntary | [N] | [X]% | £/$/€[X] |
 | **Total** | **[N]** | **100%** | **£/$/€[X]** |
 **Avoidable churn as % of total churn:** [X]% — this is the number we can actually influence.
 ---
 ## Churn Reasons — Avoidable Churn Only
 Rank by frequency. Include ARR weight where data allows.
 | Reason | Count | % of avoidable churn | ARR lost | Representative quote |
 |---|---|---|---|---|
 | [Reason 1 — e.g. "Product missing key feature"] | [N] | [X]% | £/$/€[X] | "[Quote]" |
 | [Reason 2] | [N] | [X]% | £/$/€[X] | "[Quote]" |
 | [Reason 3] | [N] | [X]% | £/$/€[X] | "[Quote]" |
 | [Reason 4] | [N] | [X]% | £/$/€[X] | "[Quote]" |
 | Other | [N] | [X]% | £/$/€[X] | — |
 **Theme synthesis:** [2–3 sentences grouping the top reasons into 2–3 themes. E.g. "The top three reasons cluster around two themes: product gaps in [area] (affecting X% of avoidable churn) and onboarding failures where customers never achieved value (Y%)."]
 ---
 ## Churn by Segment
 Identify which segments over- or under-index for churn.
 ### By Tier
 | Tier | Churn rate | vs. Overall | Notes |
 |---|---|---|---|
 | Enterprise | [X]% | +/-[X]pp | |
 | Mid-Market | [X]% | +/-[X]pp | |
 | SMB | [X]% | +/-[X]pp | |
 ### By Cohort (Acquisition Year)
 | Cohort | Churn rate | Notes |
 |---|---|---|
 | [Year 1] | [X]% | |
 | [Year 2] | [X]% | |
 | [Year 3] | [X]% | |
 ### By Industry / Use Case (if data available)
 | Segment | Churn rate | Notes |
 |---|---|---|
 | [Segment 1] | [X]% | |
 | [Segment 2] | [X]% | |
 **Key pattern:** [Which segment has the highest churn rate and what likely explains it]
 ---
 ## Timing Analysis
 - **Average contract length before churn:** [X months]
 - **Highest-risk moment:** [e.g. "Month 3 — when trial value has worn off but full adoption hasn't happened"]
 - **Churn timing distribution:**
 | When churn occurred | % of churned accounts |
 |---|---|
 | 0–3 months | [X]% |
 | 3–6 months | [X]% |
 | 6–12 months | [X]% |
 | 12+ months | [X]% |
 ---
 ## Early Warning Signals
 Based on the churned accounts, identify the signals that preceded churn (and could have triggered earlier intervention):
 | Signal | Lead time before churn | How to detect |
 |---|---|---|
 | [Signal 1 — e.g. "DAU/MAU dropped below 15%"] | [~X weeks] | [Usage dashboard / alert] |
 | [Signal 2 — e.g. "No QBR in 90+ days"] | [~X weeks] | [CRM flag] |
 | [Signal 3 — e.g. "Champion left the account"] | [~X weeks] | [LinkedIn alert / CSM tracking] |
 | [Signal 4] | [~X weeks] | [Detection method] |
 ---
 ## Intervention Recommendations
 Ranked by estimated impact × feasibility.
 | Intervention | Addresses | Est. churn reduction | Effort | Owner |
 |---|---|---|---|---|
 | [Intervention 1 — e.g. "Improve onboarding for [segment] with dedicated 30-day check-in"] | [Reason 1] | [X accounts / £X ARR] | Low / Med / High | [Team] |
 | [Intervention 2] | [Reason 2] | [X accounts / £X ARR] | Low / Med / High | [Team] |
 | [Intervention 3] | [Reason 3] | [X accounts / £X ARR] | Low / Med / High | [Team] |
 **Priority call:** [Which one intervention, if implemented this quarter, would have the biggest impact and why]
 ---
 ## What We Don't Know (Data Gaps)
 - [Data gap 1 — e.g. "Exit survey response rate is only 30% — the reasons data may not be representative"]
 - [Data gap 2 — e.g. "No product usage data for SMB tier — can't confirm usage signal correlation"]
 - [Data gap 3]
 ---
 ## Anti-Patterns
 - [ ] Do not mix avoidable and unavoidable churn in intervention plans — recommending product fixes for customers who churned due to company shutdown wastes resources
 - [ ] Do not calculate churn rate using end-of-period customer count as the denominator — this understates churn; always divide churned customers by the starting cohort
 - [ ] Do not rely solely on exit survey data for churn reasons — response rates are typically low and self-selection biases the sample toward customers who are engaged enough to complete a survey
 - [ ] Do not recommend interventions without linking them to a specific churn reason — interventions disconnected from root causes will not move retention
 - [ ] Do not report only gross revenue churn — without net revenue retention (NRR), a healthy-looking retention number can hide a shrinking revenue base
 ## Quality Checks
 - [ ] Churn rate is correctly calculated (churned ÷ starting cohort, not end-of-period total)
 - [ ] Avoidable and unavoidable churn are separated — interventions target avoidable churn only
 - [ ] Churn reasons are customer-reported, not internally assumed
 - [ ] Segment analysis identifies which segments over-index — not just averages
 - [ ] Early warning signals are specific and detectable, not generic ("low engagement")
 - [ ] Interventions link directly to the top churn reasons — no recommendations without a root cause match
@@ -0,0 +1,179 @@
 # Customer Escalation Brief Skill
 Produce a clear, concise escalation brief that gives internal stakeholders — VP CS, CCO, product leadership, or the CEO — everything they need to understand the situation, make decisions, and act fast.
 A good escalation brief is not a complaint. It is a professional document that states the facts, assigns accountability honestly, and proposes a specific resolution plan.
 ## Required Inputs
 Ask for these if not already provided:
 - **Account name**, tier, and ARR
 - **CSM name** and account owner
 - **Nature of the escalation** — what happened, what the customer is saying
 - **Timeline** of events leading to escalation
 - **Customer contact** who escalated (name, role, influence level)
 - **What the customer wants** — their stated ask
 - **What we believe the root cause is**
 - **What has already been done** to address the situation
 - **Renewal date** and current renewal risk assessment
 ## Escalation Levels
 Calibrate urgency and audience based on escalation level:
 | Level | Trigger | Audience | Response time |
 |---|---|---|---|
 | L1 — Account Risk | Customer expressing dissatisfaction; renewal at risk | CSM + CS Manager | 24 hours |
 | L2 — Executive Escalation | Customer escalated to their exec; requesting vendor exec involvement | VP CS + Account Exec | 4 hours |
 | L3 — Churn Risk | Customer has issued notice or is in active churn conversation | CCO / CEO + Revenue leadership | 1 hour |
 | L4 — Public Risk | Customer threatening public escalation, legal, or press | CCO / Legal / Comms | Immediate |
 ## Output Format
 ---
 # Escalation Brief: [Account Name]
 **Escalation level:** L[1/2/3/4] — [Label]
 **Date raised:** [Date]
 **Raised by:** [CSM name]
 **Escalation owner:** [Name of exec or senior stakeholder now leading response]
 ---
 ## Account at a Glance
 | Field | Detail |
 |---|---|
 | ARR | £/$/€[X] |
 | Tier | Enterprise / Mid-Market / SMB |
 | Customer since | [Date] |
 | Renewal date | [Date] — [N] days away |
 | Renewal risk (pre-escalation) | Green / Amber / Red |
 | Renewal risk (current) | Green / Amber / Red |
 | Customer contact who escalated | [Name, role, seniority] |
 | Executive sponsor (customer) | [Name, role — active / passive / vacant] |
 | Executive sponsor (vendor) | [Name, role] |
 ---
 ## What Happened — Summary
 [3–5 sentences. State the facts plainly. What the customer experienced, how they reacted, and how we learned about the escalation. No editorialising. No blame.]
 ---
 ## Timeline
 List in chronological order. Each entry: `[Date / time] — [What happened. Who did what.]`
 Include:
 - When the original issue or trigger event occurred
 - When the customer first raised concerns (informally)
 - When it escalated (formal escalation or exec involvement)
 - Actions taken since escalation
 ---
 ## Root Cause
 **Primary cause:** [One clear sentence. What specifically went wrong.]
 **Contributing factors:**
 - [Factor 1 — be honest about internal failures as well as external ones]
 - [Factor 2]
 **Is this a systemic issue or isolated?**
 [ ] Isolated to this account
 [ ] Pattern seen in other accounts — details: [_______]
 [ ] Product or process gap that needs fixing
 ---
 ## Customer's Stated Position
 **What the customer says happened:** [Their version of events — fair and unfiltered]
 **What they are asking for:** [Their explicit ask — compensation, fix by date, exec call, SLA credit, exit clause]
 **Sentiment of escalating contact:** [Frustrated but constructive / Angry / Seeking exit / Unknown]
 **Risk of public escalation:** Low / Medium / High — [evidence if Medium or High]
 ---
 ## Business Impact
 | Impact type | Detail |
 |---|---|
 | ARR at risk | £/$/€[X] |
 | Potential churn probability | [X]% |
 | Reputational risk | Low / Medium / High |
 | Reference / case study status | [Was a reference — now at risk / Not a reference] |
 | Expansion pipeline at risk | £/$/€[X] |
 ---
 ## What Has Been Done So Far
 1. [Action taken — by whom — date — outcome]
 2. [Action taken — by whom — date — outcome]
 3. [Action taken — by whom — date — outcome]
 **Has a formal apology or acknowledgement been issued?** Yes / No
 ---
 ## Proposed Resolution Plan
 **Immediate actions (next 24–48 hours):**
 | Action | Owner | By when |
 |---|---|---|
 | [Action] | [Name] | [Date] |
 | [Action] | [Name] | [Date] |
 **Medium-term actions (next 2–4 weeks):**
 | Action | Owner | By when |
 |---|---|---|
 | [Action] | [Name] | [Date] |
 **What we are NOT offering:** [Be explicit about what is not on the table — avoids misaligned expectations]
 **Success criteria:** [How will we know the escalation is resolved? What does the customer need to confirm they are satisfied?]
 ---
 ## Decision Required from Escalation Owner
 [State clearly what decision or resource the escalation owner needs to provide. Be specific — do not make them ask. E.g.: "We need approval to offer a 20% service credit for Q2" or "We need an exec call with [name] within 48 hours."]
 ---
 ## Communication Plan
 | Audience | Message | Channel | Owner | By when |
 |---|---|---|---|---|
 | Escalating customer contact | [Summary of message] | Email / Call | [Name] | [Date] |
 | Customer exec sponsor | [Summary] | Call | [Name] | [Date] |
 | Internal CS team | [Summary] | Slack / Meeting | CS Manager | [Date] |
 ---
 ## Quality Checks
 - [ ] Root cause is specific — not "communication breakdown" or "product gap" without detail
 - [ ] Customer's position is stated fairly — not minimised or dismissed
 - [ ] A clear decision is requested from the escalation owner — brief does not end with "what do you think?"
 - [ ] ARR at risk is quantified
 - [ ] Communication plan has owners and dates — not "TBD"
 - [ ] Language is professional and blameless toward individuals
 ## Anti-Patterns
 - [ ] Do not assign blame to individuals — focus on system failures and process gaps
 - [ ] Do not downplay ARR at risk or describe churn risk vaguely without a number
 - [ ] Do not leave resolution plan ownership as "TBD" or unassigned
 - [ ] Do not write the brief without a clear ask from the escalation owner
 - [ ] Do not omit the customer's own stated position — their perspective must be represented fairly
@@ -0,0 +1,158 @@
 # Customer Health Scorecard Skill
 Produce a structured, data-driven health scorecard for a customer account — giving the CSM and leadership a clear view of renewal risk, expansion potential, and the actions needed to move the account in the right direction.
 ## Required Inputs
 Ask for these if not already provided:
 - **Account name** and tier (enterprise / mid-market / SMB)
 - **Contract value** (ARR) and **renewal date**
 - **Product usage data** — logins, DAU/MAU ratio, key feature adoption
 - **Support data** — open tickets, CSAT or NPS score, recent escalations
 - **Engagement data** — last QBR date, executive sponsor status, champion name
 - **Commercial data** — payment history, expansion conversations, seats used vs. licensed
 - **Any known risks or recent changes** at the account
 ## Scoring Framework
 Score each dimension 1–5. Weight as shown. Calculate weighted total out of 100.
 | Dimension | Weight | What to Score |
 |---|---|---|
 | **Product Adoption** | 30% | DAU/MAU ratio, breadth of features used, power users identified |
 | **Engagement** | 20% | QBR cadence, executive sponsor active, champion strength |
 | **Outcomes** | 20% | Customer hitting their stated goals / success metrics |
 | **Support Health** | 15% | Ticket volume trend, unresolved escalations, CSAT |
 | **Commercial** | 15% | On-time payments, seats utilised, expansion signals |
 **Score → RAG conversion:**
 - 80–100: Green (healthy, renew likely)
 - 60–79: Amber (at risk, needs attention)
 - 0–59: Red (high churn risk, escalate)
 ## Programmatic Helper
 This skill ships with a stdlib-only Python script that applies the weights above and converts the weighted total to a RAG status — so the headline score is computed identically every time and weights always sum to 100%.
 ```bash
 # Five scores 1-5 in order: adoption engagement outcomes support commercial
 python3 scripts/health_score.py --scores 4 3 4 2 5 --account "Acme Corp"
 # Or from JSON (lets you override the default weights per account/segment)
 python3 scripts/health_score.py --input account.json
 ```
 It returns the per-dimension weighted points, the **total out of 100**, and the **RAG band** (Green ≥80, Amber 60–79, Red <60) with a one-line next step. Run it to set the headline number, then write the dimension detail and actions below around it. Add `--json` for downstream tooling.
 ## Output Format
 ---
 # Customer Health Scorecard: [Account Name]
 **CSM:** [Name] | **Tier:** [Enterprise / Mid-Market / SMB]
 **ARR:** £/$/€[X] | **Renewal date:** [Date] | **Days to renewal:** [N]
 **Overall health:** [Green / Amber / Red] — [Score]/100
 **Last updated:** [Date]
 ---
 ## Health Score Summary
 | Dimension | Score (1–5) | Weight | Weighted Score | Trend |
 |---|---|---|---|---|
 | Product Adoption | [1–5] | 30% | [X] | ↑ / → / ↓ |
 | Engagement | [1–5] | 20% | [X] | ↑ / → / ↓ |
 | Outcomes | [1–5] | 20% | [X] | ↑ / → / ↓ |
 | Support Health | [1–5] | 15% | [X] | ↑ / → / ↓ |
 | Commercial | [1–5] | 15% | [X] | ↑ / → / ↓ |
 | **Total** | — | 100% | **[X]/100** | |
 ---
 ## Dimension Detail
 ### Product Adoption — [Score]/5
 - **DAU/MAU ratio:** [X]% (benchmark: >25% = healthy)
 - **Key features adopted:** [List features in use]
 - **Features not adopted:** [List unused high-value features]
 - **Power users identified:** [Yes / No — how many]
 - **Assessment:** [1–2 sentences on adoption health]
 ### Engagement — [Score]/5
 - **Last QBR:** [Date] — [Outcome summary]
 - **Next QBR:** [Scheduled / Overdue]
 - **Executive sponsor:** [Active / Passive / Vacant]
 - **Champion:** [Name, role, strength: strong / moderate / weak]
 - **Assessment:** [1–2 sentences]
 ### Outcomes — [Score]/5
 - **Customer's stated goals:** [List 2–3 goals from onboarding or last QBR]
 - **Progress against goals:** [On track / Partial / Off track]
 - **Evidence of value:** [Metric or quote that demonstrates ROI]
 - **Assessment:** [1–2 sentences]
 ### Support Health — [Score]/5
 - **Open tickets:** [N] (priority breakdown: P1: X, P2: X, P3: X)
 - **CSAT / NPS:** [Score] (benchmark: >8 CSAT / >30 NPS = healthy)
 - **Unresolved escalations:** [Yes / No — details if yes]
 - **Ticket trend (last 90 days):** Increasing / Stable / Decreasing
 - **Assessment:** [1–2 sentences]
 ### Commercial — [Score]/5
 - **Seats licensed:** [N] | **Seats active:** [N] ([X]% utilisation)
 - **Payment history:** [On time / Late — details]
 - **Expansion signals:** [Yes — describe / No]
 - **Downgrade or cancellation signals:** [Yes — describe / No]
 - **Assessment:** [1–2 sentences]
 ---
 ## Top Risks
 | Risk | Severity | Mitigation |
 |---|---|---|
 | [Risk description] | High / Medium / Low | [Specific action to mitigate] |
 ---
 ## Recommended Actions
 **Immediate (this week):**
 1. [Action — owner — deadline]
 **This month:**
 1. [Action — owner — deadline]
 **Before renewal:**
 1. [Action — owner — deadline]
 ---
 ## Renewal Forecast
 | Scenario | Probability | ARR at risk |
 |---|---|---|
 | Full renewal at current ARR | [X]% | £/$/€0 |
 | Renewal with contraction | [X]% | £/$/€[X] |
 | Churn | [X]% | £/$/€[full ARR] |
 **Recommended renewal play:** [Expand / Hold / Save / Manage out]
 ---
 ## Quality Checks
 - [ ] Score is based on data, not gut feel — each dimension has evidence
 - [ ] Risks are specific (not "low engagement" — something like "executive sponsor left in March, no replacement identified")
 - [ ] Actions have owners and deadlines
 - [ ] Renewal probability is calibrated against pipeline reality
 - [ ] Trend arrows reflect direction of change vs. last scorecard, not just current state
 ## Anti-Patterns
 - [ ] Do not score health dimensions on gut feel — every score needs specific supporting evidence
 - [ ] Do not give a Green status to accounts with unresolved P1 issues or missed milestones
 - [ ] Do not list risks vaguely — "low engagement" without specifics is not actionable
 - [ ] Do not leave recommended actions without named owners and deadlines
 - [ ] Do not conflate product usage frequency with product value delivery
@@ -0,0 +1,195 @@
 # Customer Success Plan Skill
 This skill produces a joint customer success plan — a living document shared between the CSM and the customer that aligns on outcomes, milestones, and mutual commitments. Output is ready to co-author with the customer in a kickoff call or QBR.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Account name** and industry
 - **Product / plan purchased**
 - **Key stakeholders** — customer champion and economic buyer
 - **Customer's stated business goals** — why did they buy? What problem are they solving?
 - **Contract term and renewal date**
 - **Current onboarding stage** (new customer / expanding / post-QBR / pre-renewal)
 - **Seats / licenses / usage purchased**
 - **Any known risks** — adoption gaps, champion uncertainty, competing priorities
 ## Output Structure
 ---
 # Customer Success Plan: [Account Name]
 **Product:** [Product name / plan tier]
 **Contract term:** [Start date → Renewal date]
 **CSM:** [Name]
 **Customer champion:** [Name, Title]
 **Customer executive sponsor:** [Name, Title — if known]
 **Last updated:** [Date]
 **Status:** [Active / Under review / Completed]
 ---
 ## 1. Partnership Objectives
 > *What does success look like for [Account Name] at contract end?*
 [Write 2–3 sentences describing the customer's core objective in plain English — what they are trying to achieve in their business, not what features they are using.]
 **Primary business goal:** [e.g. Reduce time-to-hire by 30% across engineering teams]
 **Secondary goal:** [e.g. Consolidate three legacy tools into one platform, saving £X/year]
 **Success statement (customer's words):** "[Direct quote from champion about what success looks like — ask for this in kickoff]"
 ---
 ## 2. Success Metrics
 Define how both parties will measure success. Agreed in the kickoff call and tracked in QBRs.
 | Metric | Baseline (today) | Target | By when | Data source |
 |---|---|---|---|---|
 | [e.g. Seat utilisation] | [X%] | [≥ 80%] | [Month 3] | [Product analytics] |
 | [e.g. Time to hire] | [X days] | [< Y days] | [Month 6] | [Customer's ATS] |
 | [e.g. Reports produced/month] | [X] | [≥ Y] | [Month 3] | [Product analytics] |
 | [e.g. NPS] | [X] | [≥ 8] | [Month 6] | [Quarterly survey] |
 **Leading indicators** (early signs the plan is on track):
 - [e.g. 5+ users log in within the first 2 weeks]
 - [e.g. First workflow automated within 30 days]
 - [e.g. Champion presents the tool to their team by end of Month 1]
 ---
 ## 3. Milestone Roadmap
 Break the success journey into phases with clear milestones and owners:
 ### Phase 1: Onboard (Month 1)
 | Milestone | Owner | Due date | Status |
 |---|---|---|---|
 | Admin setup complete (SSO, permissions, data integration) | [IT contact] | [Date] | [ ] |
 | All purchased seats activated and users invited | [Champion] | [Date] | [ ] |
 | Core workflow [X] configured and tested | [CSM + Champion] | [Date] | [ ] |
 | First training session delivered (all teams) | [CSM] | [Date] | [ ] |
 | Kickoff call completed and success plan co-signed | [CSM + Champion] | [Date] | [ ] |
 ### Phase 2: Adopt (Months 2–3)
 | Milestone | Owner | Due date | Status |
 |---|---|---|---|
 | [Core feature] in active daily use by ≥ X users | [Champion] | [Date] | [ ] |
 | First business outcome achieved and documented | [Champion + CSM] | [Date] | [ ] |
 | 30-day check-in completed | [CSM] | [Date] | [ ] |
 | [Power user workflow] enabled for advanced users | [CSM] | [Date] | [ ] |
 ### Phase 3: Value (Months 4–6)
 | Milestone | Owner | Due date | Status |
 |---|---|---|---|
 | QBR 1 delivered — ROI evidence presented | [CSM + AE] | [Date] | [ ] |
 | Success metric [X] hit target | [Champion] | [Date] | [ ] |
 | Expansion use case identified and introduced | [AE] | [Date] | [ ] |
 | Reference call or case study agreed | [Champion] | [Date] | [ ] |
 ### Phase 4: Renew & Expand (Months 7–12)
 | Milestone | Owner | Due date | Status |
 |---|---|---|---|
 | QBR 2 delivered — renewal conversation started | [CSM + AE] | [Date] | [ ] |
 | Renewal proposal sent | [AE] | [Date] | [ ] |
 | Expansion or flat renewal signed | [AE] | [Date] | [ ] |
 ---
 ## 4. Mutual Commitments
 Success plans work when both parties commit. Document what each side will do:
 **[Vendor] commits to:**
 - Dedicated CSM available [X days/week / by email within 24 hours]
 - Monthly [call / check-in / async update] with champion
 - QBR every [90 days] with executive summary and ROI report
 - Priority support for [Account] — response SLA of [X hours] for P1 issues
 - Roadmap preview for relevant upcoming features
 - [Any other specific commitment made in sales cycle]
 **[Account Name] commits to:**
 - Champion available for [30-min monthly] check-in
 - Users complete onboarding training by [date]
 - Feedback on product experience shared monthly (async or sync)
 - Executive sponsor participates in QBR 1 and renewal discussion
 - Provide outcome data to CSM quarterly for ROI tracking
 ---
 ## 5. Stakeholder Engagement Plan
 | Stakeholder | Role | Engagement frequency | Format | Owner |
 |---|---|---|---|---|
 | [Champion] | Day-to-day owner | Weekly (async) + Monthly (call) | Slack / Email + Zoom | CSM |
 | [Economic buyer] | Budget holder | Quarterly | QBR (in-person or video) | CSM + AE |
 | [IT contact] | Integration owner | As needed | Email | CSM |
 | [End users] | Active users | Training only | Group session | CSM |
 ---
 ## 6. Risk & Mitigation
 | Risk | Likelihood | Impact | Mitigation plan |
 |---|---|---|---|
 | Low adoption in first 30 days | [M] | [H] | CSM hosts live onboarding; champion sends internal comms day 1 |
 | Champion changes role | [L] | [H] | Multi-thread: introduce CSM to 2 additional stakeholders by Month 2 |
 | Budget pressure at renewal | [M] | [H] | Build ROI case monthly; document value continuously |
 | Competing priorities delay rollout | [H] | [M] | Agree minimum viable adoption path with champion; don't require perfection to declare value |
 ---
 ## 7. Communication Plan
 | Communication | Audience | Frequency | Format | Owner |
 |---|---|---|---|---|
 | Health update | Champion | Monthly | Email summary (3 bullets: what's good, what needs attention, one ask) | CSM |
 | QBR | Champion + Exec | Quarterly | 45-min video call with slide deck | CSM + AE |
 | Product updates | Champion | As released | Release notes email | CSM |
 | Support status | Champion | When open tickets exist | Email / Slack | Support + CSM |
 ---
 ## 8. Escalation Path
 If the success plan falls off track:
 | Trigger | Action | Owner | Timeline |
 |---|---|---|---|
 | Health drops to Amber | Internal review + champion call within 5 days | CSM | Immediate |
 | Health drops to Red | CS leadership + AE looped in; escalation brief drafted | CS Manager | Within 24 hours |
 | Champion is unresponsive for >10 days | AE attempts exec sponsor contact | AE | After CSM attempt fails |
 | Adoption <40% at Month 3 | Emergency enablement session + revised milestone plan | CSM | Within 1 week of flag |
 ---
 ## Quality Checks
 - [ ] Success metrics are the customer's metrics — not just product usage metrics
 - [ ] Milestones have specific owners and due dates — not "TBD"
 - [ ] Mutual commitments section is genuinely mutual — not just what the vendor will do
 - [ ] Risk register includes champion departure and low adoption
 - [ ] Plan is written to be shared with the customer — no internal-only commentary in this document
 - [ ] Executive sponsor is identified and has an engagement role
 ## Anti-Patterns
 - [ ] Do not define success metrics that the vendor controls — metrics must reflect the customer's business outcomes
 - [ ] Do not set milestone dates without customer confirmation — unilateral timelines undermine joint ownership
 - [ ] Do not create a plan the customer hasn't agreed to — it must be mutual, not a CSM's internal plan
 - [ ] Do not leave ownership fields blank or assigned to "CS team" — every action needs a named owner
 - [ ] Do not confuse product adoption milestones with customer business outcomes — both are needed but are not the same
 ## Example Trigger Phrases
 - "Build a success plan for [Account Name] who just signed"
 - "Create a joint success plan for our new enterprise customer"
 - "Write a 6-month customer success roadmap for [Company]"
 - "I need a mutual action plan for our QBR with [Account]"
 - "Generate a customer success plan for an at-risk account"
@@ -0,0 +1,221 @@
 # QBR Deck Skill
 Produce a complete Quarterly Business Review deck — structured, data-backed, and customer-focused. A good QBR demonstrates value delivered, aligns on goals for the next quarter, and strengthens the executive relationship. It should never feel like a product demo or a vendor update.
 ## Required Inputs
 Ask for these if not already provided:
 - **Account name**, CSM name, and customer stakeholders attending
 - **Contract details** — ARR, contract start date, renewal date
 - **Last quarter's goals** (from previous QBR or kickoff)
 - **Usage and adoption data** — key metrics for the quarter
 - **Support summary** — tickets raised, resolution time, any escalations
 - **Business outcomes the customer cares about** — what success looks like for them
 - **Product updates or new features** relevant to this customer
 - **Goals for next quarter**
 - **Any open commercial conversations** (expansion, renewal, at-risk signals)
 ## QBR Principles
 - Lead with customer outcomes, not product features
 - Every metric should connect to a business result the customer cares about
 - The agenda is a conversation, not a presentation — build in time for customer input at every stage
 - Close with mutual commitments, not just vendor actions
 ## Output Format
 ---
 # QBR: [Account Name] × [Your Company]
 **[Quarter] [Year] Business Review**
 **Date:** [Date] | **Location / Call link:** [TBC]
 **Customer attendees:** [Names and roles]
 **[Your company] attendees:** [Names and roles]
 ---
 ## Slide 1: Agenda (5 min)
 | Time | Topic | Owner |
 |---|---|---|
 | 0:00 | Welcome and introductions | CSM |
 | 0:05 | [Last quarter] — how did we do? | CSM + Customer |
 | 0:20 | Value delivered — business impact | CSM |
 | 0:35 | What's coming — roadmap preview | CSM / Product |
 | 0:45 | [Next quarter] — goals and priorities | Customer |
 | 0:55 | Actions and mutual commitments | CSM |
 | 1:00 | Close | |
 *Talking point: "We've kept today to 60 minutes. We want as much of this to be a conversation as possible — please push back, redirect, and ask questions throughout."*
 ---
 ## Slide 2: Where We Are Together (2 min)
 **Partnership snapshot:**
 - **Customer since:** [Date]
 - **Contract value:** £/$/€[ARR]/year
 - **Renewal date:** [Date]
 - **Active users:** [N] of [N] licensed seats ([X]% adoption)
 - **Products / modules active:** [List]
 *Talking point: "Before we dive in — a quick picture of where we are. [X] months in, [Y] active users, and this is our [Nth] QBR together."*
 ---
 ## Slide 3: Last Quarter — Goals We Set Together (5 min)
 | Goal | Set in [Last QBR / Kickoff] | Status |
 |---|---|---|
 | [Goal 1] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
 | [Goal 2] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
 | [Goal 3] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
 For any partial or missed goal: state what happened and what changes next quarter.
 *Talking point: "Let's start with accountability. Here's what we said we'd achieve last quarter — let's be honest about where we landed."*
 ---
 ## Slide 4: Usage and Adoption (5 min)
 **Quarter-over-quarter trend:**
 | Metric | [Q-1] | [Q] | Change |
 |---|---|---|---|
 | Monthly active users | [N] | [N] | +/-X% |
 | Sessions per user per week | [N] | [N] | +/-X% |
 | [Key feature 1] adoption | [X]% | [X]% | +/-X% |
 | [Key feature 2] adoption | [X]% | [X]% | +/-X% |
 **Highlights:**
 - [Positive adoption trend to call out]
 - [Feature or workflow with strongest engagement]
 **Opportunity:**
 - [Feature with low adoption that could drive more value — link to their goals]
 *Talking point: "Usage is [up / stable / something we want to talk about]. The area I'd like to focus on is [feature] — we're not seeing the adoption we'd expect given [their goal], and I want to understand why."*
 ---
 ## Slide 5: Business Impact — Value Delivered (10 min)
 Lead with outcomes, not activity.
 **[Outcome 1: customer's primary success metric]**
 - Before: [baseline]
 - Now: [current state]
 - Impact: [quantified business result — time saved, revenue influenced, cost reduced, risk mitigated]
 **[Outcome 2]**
 - [Same structure]
 **[Outcome 3]**
 - [Same structure]
 **Customer evidence** (use if available):
 > "[Quote from champion or user about value experienced]"
 *Talking point: "This is the section I most want your input on. Are these the outcomes that matter to your business? Are there other ways you're measuring success that we should be tracking?"*
 ---
 ## Slide 6: Support Summary (3 min)
 | Metric | This quarter | Last quarter | Trend |
 |---|---|---|---|
 | Tickets raised | [N] | [N] | ↑ / → / ↓ |
 | Average resolution time | [X hrs] | [X hrs] | ↑ / → / ↓ |
 | P1 / critical issues | [N] | [N] | ↑ / → / ↓ |
 | CSAT score | [X/10] | [X/10] | ↑ / → / ↓ |
 **Notable issues this quarter:**
 - [Any escalation or major ticket — brief summary and resolution]
 **What we're doing differently:**
 - [Any process change or improvement based on support patterns]
 ---
 ## Slide 7: What's Coming — Roadmap Preview (5 min)
 Focus only on what's relevant to this customer's goals. Do not dump the full roadmap.
 | Feature / Improvement | Expected | Why it matters to [Account Name] |
 |---|---|---|
 | [Feature 1] | [Q+1] | [Direct link to their goal or pain point] |
 | [Feature 2] | [Q+1 / Q+2] | [Direct link] |
 | [Feature 3] | [H2] | [Direct link] |
 *Talking point: "I've filtered the roadmap to what I think matters most to your team. I'd love your reaction — are these the right priorities from your perspective?"*
 ---
 ## Slide 8: Next Quarter — Your Goals (10 min)
 **Customer input section — facilitate, don't present.**
 Prompt questions:
 - "What does success look like for your team in [next quarter]?"
 - "What's the biggest challenge you're trying to solve in the next 90 days?"
 - "Is there anything about the way you're using [product] you want to change?"
 **Capture live:**
 | Goal for next quarter | Owner (customer) | How we'll support it | How we'll measure it |
 |---|---|---|---|
 | [Goal 1] | [Name] | [CSM / product action] | [Metric] |
 | [Goal 2] | [Name] | [CSM / product action] | [Metric] |
 ---
 ## Slide 9: Mutual Commitments (5 min)
 **[Your company] commits to:**
 1. [Specific action — owner — by when]
 2. [Specific action — owner — by when]
 3. [Specific action — owner — by when]
 **[Account Name] commits to:**
 1. [Specific action — owner — by when]
 2. [Specific action — owner — by when]
 **Next touchpoint:** [Date of next check-in or mid-quarter review]
 ---
 ## Slide 10: Thank You + Open Q&A (5 min)
 - Recap the one headline from today: [The single most important thing you want them to remember]
 - Confirm actions are captured and shared after the call
 - Ask: "Is there anything we didn't cover today that you wanted to raise?"
 ---
 ## Preparation Checklist
 - [ ] Usage data pulled and QoQ comparison calculated
 - [ ] Last QBR goals reviewed — status confirmed before the meeting
 - [ ] Business outcomes framed in customer language (not product language)
 - [ ] Roadmap filtered to this account's specific use cases
 - [ ] Customer's goals for next quarter researched or pre-confirmed with champion
 - [ ] Executive sponsor briefed on any sensitive topics before the call
 - [ ] Actions from previous QBR reviewed — any outstanding items addressed
 ## Quality Checks
 - [ ] Every slide has a talking point, not just a title
 - [ ] Value slide leads with business outcomes, not product activity
 - [ ] Roadmap preview links each item to a customer goal
 - [ ] Mutual commitments section has real owners on both sides
 - [ ] Customer has at least 20 minutes of airtime in the agenda
 ## Anti-Patterns
 - [ ] Do not fill the QBR with product activity metrics — lead with business outcomes the customer cares about
 - [ ] Do not present a roadmap without linking each item to a customer goal — vendor priorities are not a QBR agenda
 - [ ] Do not run a QBR as a one-sided presentation — it must include structured time for the customer to speak
 - [ ] Do not close a QBR without documented mutual commitments with named owners on both sides
 - [ ] Do not skip the "what's not working" slide — suppressing problems erodes trust and misses renewal risks
@@ -0,0 +1,193 @@
 # Renewal Playbook Skill
 This skill produces a complete renewal playbook for a specific customer account, covering health assessment, commercial strategy, negotiation preparation, expansion opportunity mapping, and a step-by-step timeline. Output is ready for the CSM or account team to execute 90–180 days before renewal.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Account name**
 - **Renewal date**
 - **Current ARR** and proposed renewal ARR (if different)
 - **Account health** — RAG status and main reasons (or describe the account situation)
 - **Key stakeholders** — economic buyer, champion, and any detractors
 - **Renewal risk factors** — budget pressure, low adoption, competitive threat, champion departure, etc.
 - **Expansion opportunity** — any upsell or cross-sell potential?
 - **Contract terms** — current plan, duration, and any terms up for renegotiation
 ## Output Structure
 ---
 # Renewal Playbook: [Account Name]
 **Renewal date:** [Date]
 **Current ARR:** [£/$/€ X]
 **Target renewal ARR:** [£/$/€ X — flat / +X% expansion / contraction risk]
 **Health status:** [Green / Amber / Red]
 **CSM:** [Name]
 **Account executive:** [Name]
 **Days to renewal:** [X days]
 ---
 ## 1. Account Health Snapshot
 | Dimension | Score (1–5) | Evidence |
 |---|---|---|
 | **Product adoption** | [X/5] | [e.g. 3 of 5 purchased seats active; core feature used weekly] |
 | **Business outcomes** | [X/5] | [e.g. Customer reports X% improvement in [metric]; no formal ROI review done] |
 | **Relationship depth** | [X/5] | [e.g. Strong champion in [name/role]; limited exec sponsorship] |
 | **Support & satisfaction** | [X/5] | [e.g. 2 open P2 tickets; last NPS 7; no escalations in 6 months] |
 | **Commercial engagement** | [X/5] | [e.g. Invoice paid on time; no discount pressure raised yet] |
 | **Overall health** | [X/5 — weighted] | [Green / Amber / Red] |
 **Renewal thesis:** [One sentence: why this account will renew — or what must change for it to renew.]
 ---
 ## 2. Stakeholder Map
 | Stakeholder | Role | Influence | Sentiment | Our relationship |
 |---|---|---|---|---|
 | [Name] | Economic buyer | High | [Positive / Neutral / Negative] | [Warm / Cold / Unknown] |
 | [Name] | Champion | High | [Positive] | [Warm] |
 | [Name] | End user | Low | [Neutral] | [Limited] |
 | [Name] | IT / procurement | Medium | [Neutral] | [Transactional] |
 **Champion risk:** [Is our champion secure in their role? Any signals of departure or reorganisation?]
 **Multi-thread plan:** [Who else do we need relationships with before renewal? How do we get there?]
 ---
 ## 3. Risk Register
 | Risk | Likelihood (H/M/L) | Impact (H/M/L) | Mitigation |
 |---|---|---|---|
 | [Budget pressure / cost-cutting] | [H] | [H] | [Build ROI case 90 days out; identify budget holder's priorities] |
 | [Low adoption in [department]] | [M] | [H] | [Run targeted enablement session; tie to champion's OKRs] |
 | [Competitor evaluation] | [M] | [M] | [Request competitive intelligence; schedule exec-level call] |
 | [Champion departure] | [L] | [H] | [Map two additional stakeholders; executive intro call] |
 ---
 ## 4. Value Story
 Build the ROI narrative for the renewal conversation:
 **Headline result:** [e.g. "[Account] saved X hours/week or reduced [metric] by X% using [product]"]
 **Evidence sources:**
 - [ ] Product usage data (logins, features used, seat utilisation)
 - [ ] Business metric improvement (pull from QBR deck or success plan)
 - [ ] Support resolution time improvement
 - [ ] Customer-provided testimonial or case study quotes
 **Value gaps to close before renewal:** [Are there outcomes the customer expected but hasn't seen yet? What's the plan to close these?]
 ---
 ## 5. Expansion Opportunity
 Map upside beyond flat renewal:
 | Opportunity | Type | Estimated value | Likelihood | Timing |
 |---|---|---|---|---|
 | [Seat expansion — [dept] wants to add 10 users] | Upsell | [+£X ARR] | [High] | [Renewal or +3M] |
 | [Cross-sell — [Product B] use case identified] | Cross-sell | [+£X ARR] | [Medium] | [+6M] |
 | [Multi-year commitment] | Discount for term | [+£X TCV / -X% discount] | [Low] | [At renewal] |
 **Expansion play:** [Which opportunity to lead with, and the sequence for raising it in the renewal conversation]
 ---
 ## 6. Commercial Strategy
 **Renewal scenario planning:**
 | Scenario | Probability | ARR outcome | Response strategy |
 |---|---|---|---|
 | **Flat renewal** | [X%] | [£X — same as current] | [Accept; plant seeds for +6M expansion] |
 | **Expansion** | [X%] | [£X] | [Lead with ROI evidence; pitch seat or feature expansion] |
 | **Contraction risk** | [X%] | [£X — downgrade to lower tier] | [Propose phased commitment; demonstrate path to full adoption] |
 | **Churn risk** | [X%] | [£0] | [Escalate to leadership; executive sponsor engagement] |
 **Discount guardrails:**
 - Floor discount: [X% — do not go below without VP approval]
 - Triggers for discount: [Multi-year / volume / reference customer commitment]
 - What to ask for in return: [Reference case study / G2 review / executive intro / case study participation]
 **Pricing flexibility:**
 - [e.g. Can offer monthly billing in exchange for 24-month commit]
 - [e.g. Can offer X seats free in exchange for expansion commitment]
 ---
 ## 7. Objection Responses
 Prepare for the most likely objections:
 **"The price is too high"**
 > Anchor on value delivered: "[Customer] achieved [X outcome] — at [£X ARR], that's [£Y per outcome / hour saved / user]. What would it cost to deliver that outcome without us?"
 > If budget is genuinely constrained, explore: phased payment, reduction in scope rather than full churn, multi-year pricing.
 **"We're not seeing enough adoption"**
 > Acknowledge, then commit: "You're right — [X seats] are actively using [core feature] out of [Y]. We want to fix this. Here's our 60-day plan: [exec sponsor on enablement call / training session / in-product nudge campaign]."
 **"We're evaluating [Competitor]"**
 > Don't panic. Ask: "What's driving the evaluation — is it specific features, pricing, or something else?" Then map gaps honestly. Offer a feature roadmap preview if relevant. Get clarity on their criteria and timeline before responding defensively.
 **"We need to reduce spend this quarter"**
 > Separate the commercial conversation from the value conversation. Offer to protect the relationship with a reduced scope today with a committed expansion trigger at a business milestone. Avoid discounting without a reason.
 ---
 ## 8. Renewal Timeline
 | Week | Action | Owner | Notes |
 |---|---|---|---|
 | **W–16** (4 months out) | Internal renewal review — health, expansion opportunity, risk | CSM | Flag to leadership if Red |
 | **W–12** | QBR / executive business review — ROI evidence delivered | CSM + AE | Book 45–60 min with economic buyer |
 | **W–10** | Champion 1:1 — pulse check on satisfaction and upcoming priorities | CSM | Uncover internal dynamics before commercial discussion |
 | **W–8** | Expansion conversation — plant seeds, share roadmap | AE | Do not lead with pricing |
 | **W–6** | Send renewal proposal — pricing, terms, options | AE | Include multi-year option |
 | **W–4** | Negotiation — address objections, finalise commercial terms | AE + CSM | Escalate to VP if >X% discount required |
 | **W–2** | Legal / procurement — contract redlines, signature process | AE + Legal | |
 | **W–0** | Signed. Handoff to post-renewal success plan | CSM | Thank the champion; begin next cycle |
 ---
 ## 9. Success Criteria
 - [ ] Renewal signed before deadline
 - [ ] ARR outcome within target range
 - [ ] Champion relationship maintained or improved
 - [ ] At least one expansion conversation started
 - [ ] ROI evidence documented and accepted by customer
 ---
 ## Quality Checks
 - [ ] Stakeholder map includes the economic buyer — not just the champion
 - [ ] Risk register has a mitigation for every H/H risk
 - [ ] Value story uses product data and business outcomes, not just feature lists
 - [ ] Commercial strategy includes a floor discount and a reason-to-discount framework
 - [ ] Timeline starts at least 90 days before renewal date
 - [ ] Objection responses are specific to this account, not generic
 ## Anti-Patterns
 - [ ] Do not start renewal conversations less than 90 days before the renewal date for accounts over $50K ARR
 - [ ] Do not build a renewal strategy without first honestly assessing account health — wishful thinking leads to last-minute churn
 - [ ] Do not treat all renewal objections as negotiating tactics — some objections signal genuine dissatisfaction that requires resolution first
 - [ ] Do not offer discounts as the first response to price objections — explore value gaps before reducing price
 - [ ] Do not close the renewal without confirming the expansion opportunity — every renewal is also an expansion conversation
 ## Example Trigger Phrases
 - "Build a renewal playbook for [Account Name] renewing in [Month]"
 - "Help me plan the renewal strategy for an at-risk customer"
 - "Prepare a renewal brief for my QBR with [Company]"
 - "What's my renewal strategy for a Red account coming up in 60 days?"
 - "Create a renewal and expansion plan for [Account]"
@@ -0,0 +1,97 @@
 # Chart Data Extractor Skill
 Extracts data from images of charts and graphs — bar charts, line charts, pie charts, scatter plots, and tables in images — producing a structured data table that can be used in spreadsheets or rebuilt in any charting tool. Built to leverage Opus 4.7 pixel-level image analysis capabilities.
 ## Required Inputs
 Ask the user for these if not provided:
 - **The chart image** (upload a screenshot or image file)
 - **Chart type** (if ambiguous — bar / line / pie / scatter / other)
 - **What matters most** (approximate trends / precise values / specific data points / categorisation)
 - **Known axis values** (optional — if the user knows the max/min values to anchor the extraction)
 ## Output Structure
 ### 1. Chart Identification
 | Attribute | Value |
 |---|---|
 | Chart type | [Bar / Line / Pie / Scatter / Area / Other] |
 | Chart title (if visible) | [Title text] |
 | X-axis label | [Label + unit] |
 | Y-axis label | [Label + unit] |
 | Number of series | N |
 | Legend categories | [List] |
 | Data period (if time-based) | [Start — End] |
 ### 2. Extracted Data Table
 | [X axis] | [Series 1] | [Series 2] | ... |
 |---|---|---|---|
 | [Value] | [Value] | [Value] | |
 ### 3. Confidence Levels
 For each data point or series, flag confidence:
 - **High confidence:** data points where the value is clearly readable against gridlines or labels
 - **Medium confidence:** data points where the value is interpolated between gridlines
 - **Low confidence:** data points where the value is ambiguous or overlaps with other elements
 Low-confidence points should be explicitly listed — not silently included in the main table.
 ### 4. Notable Observations
 Observations that the data itself reveals:
 - Peak value: [Value, when, in which series]
 - Lowest value: [Value, when, in which series]
 - Largest delta between series: [Details]
 - Any anomalies or outliers visible in the chart
 ### 5. Reconstructed Source
 CSV format for direct use:
 ```csv
 [x_axis],[series_1],[series_2]
 [value],[value],[value]
 ```
 ### 6. Assumptions and Caveats
 - Grid resolution: [How precisely values could be read — e.g. "Y-axis has major gridlines every 10 units, minor every 2"]
 - Interpolation used: [Any values that required estimating between gridlines]
 - Unclear data: [Anything in the chart that could not be read reliably]
 - Axis scale: [Linear/logarithmic/etc — note if not obvious]
 ### 7. Follow-up Options
 Ask the user which of these they want:
 - Rebuild the chart in a specified format (Excel formula, Python matplotlib, D3, etc.)
 - Produce a narrative description of what the chart shows
 - Compare this data against another chart or source
 - Flag potentially misleading visual choices in the original (truncated axes, misleading scales, etc.)
 ## Quality Checks
 - [ ] Every extracted number specifies which series it belongs to
 - [ ] Confidence levels are explicit for ambiguous points
 - [ ] Low-confidence values are flagged separately, not silently included
 - [ ] Assumptions about axis scale and interpolation are stated
 - [ ] CSV output is clean and directly usable
 ## Anti-Patterns
 - [ ] Do not silently include low-confidence data points in the main table — flag them separately so the user knows which values to verify
 - [ ] Do not assume a linear scale without confirming it — logarithmic axes make extracted values incorrect by orders of magnitude if misread
 - [ ] Do not report extracted values with false precision — if the chart's Y-axis only shows gridlines every 10 units, a reported value of 37 is invented, not extracted
 - [ ] Do not omit the assumptions and caveats section — partial image quality, overlapping bars, or unlabelled axes must be disclosed
 ## Example Trigger Phrases
 - "Extract the data from this chart"
 - "Transcribe the numbers in this graph"
 - "Turn this chart image into a spreadsheet"
 - "Digitise this chart so I can rebuild it"
 - "What are the exact values in this bar chart?"
 ## Why This Works Better on Opus 4.7
 Earlier models struggled with pixel-level data transcription from charts, often hallucinating values or misreading gridline positions. Opus 4.7 uses a higher image resolution (2576px vs 1568px) with coordinates mapping 1:1 to pixels, making chart data extraction reliable for practical use.
@@ -0,0 +1,190 @@
 # Cohort Analysis Skill
 This skill produces a structured cohort analysis covering retention curves, LTV estimation, behavioural segmentation, and actionable interventions. Output is ready to present to product leadership or share with growth and data teams.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Analysis goal** (retention improvement / LTV modelling / behavioural segmentation / churn prediction)
 - **Product or feature being analysed**
 - **Cohort definition** — what groups users? (acquisition month, signup channel, plan tier, feature adoption)
 - **Observation window** — how many periods to track? (e.g. 12 months, 8 weeks)
 - **Key metric** — what are you measuring per cohort? (retention rate, revenue, engagement score, feature usage)
 - **Available data** — what tables/metrics are available? (paste schema or describe)
 - **Baseline** — any existing retention benchmarks or goals?
 ## Output Structure
 ---
 # Cohort Analysis: [Product / Feature]
 **Analysis type:** [Retention / LTV / Behavioural / Churn]
 **Cohort definition:** [Acquisition month / Signup channel / Plan tier / Feature adoption date]
 **Observation window:** [X months / weeks]
 **Primary metric:** [Metric name]
 **Date prepared:** [Date]
 ---
 ## 1. Cohort Definitions
 | Cohort | Period | Size | Description |
 |---|---|---|---|
 | [Cohort 1] | [Jan 2025] | [N users] | [e.g. Users who signed up in Jan 2025 via organic] |
 | [Cohort 2] | [Feb 2025] | [N users] | [...] |
 **Cohort logic:**
 - Cohort entry event: [First sign-up / First purchase / Feature activation]
 - Cohort exit criteria: [Churned / Downgraded / No activity for 30 days]
 - Exclusions: [Trial users / Internal test accounts / Users with < X days of data]
 ---
 ## 2. Retention Curve
 **How to read:** Each cell shows what % of the cohort performed the key metric in period N.
 | Cohort | Period 0 | Period 1 | Period 2 | Period 3 | Period 6 | Period 12 |
 |---|---|---|---|---|---|---|
 | Jan 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
 | Feb 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
 | [Trend] | — | [↑/↓ vs prior] | [...] | [...] | [...] | [...] |
 **Retention plateau:** [At what period does retention flatten? What % does it flatten at?]
 **Key observations:**
 - [e.g. Period 1 → Period 2 drop is the largest — average X% churn in first 30 days]
 - [e.g. Cohorts acquired via [channel] retain X% better at Period 6]
 - [e.g. Retention has improved from X% → Y% at Period 3 comparing oldest to newest cohort]
 ---
 ## 3. LTV Projection (if applicable)
 **ARPU per period:** [£/$/€ X per active user per month]
 **Retention curve used:** [Which cohort or blended average]
 | Period | Retained % | Revenue per user | Cumulative LTV |
 |---|---|---|---|
 | Month 1 | [X%] | [£X] | [£X] |
 | Month 3 | [X%] | [£X] | [£X] |
 | Month 6 | [X%] | [£X] | [£X] |
 | Month 12 | [X%] | [£X] | [£X] |
 **Blended LTV:** [£X at 12 months — based on blended retention across cohorts]
 **LTV by segment:**
 | Segment | LTV (12M) | vs Baseline |
 |---|---|---|
 | [Organic] | [£X] | [+X%] |
 | [Paid] | [£X] | [-X%] |
 | [Enterprise] | [£X] | [+X%] |
 ---
 ## 4. Behavioural Segmentation
 Group cohorts by behaviour patterns, not just acquisition date:
 | Segment | Definition | Size | Retention (P6) | LTV (12M) |
 |---|---|---|---|---|
 | **Power users** | [Used core feature ≥ 3x/week in first 30 days] | [X%] | [X%] | [£X] |
 | **Casual users** | [Used 1–2x/week in first 30 days] | [X%] | [X%] | [£X] |
 | **Dormant** | [Logged in but did not use core feature] | [X%] | [X%] | [£X] |
 | **Never activated** | [Signed up but never completed onboarding] | [X%] | [X%] | [£X] |
 **Activation threshold insight:** [What action — taken within the first X days — most strongly predicts retention? This is the "aha moment" to optimise for.]
 ---
 ## 5. Leading Indicators of Churn
 List the signals that appear **before** users churn, so teams can intervene:
 | Signal | How early does it appear? | Churn correlation | Intervention |
 |---|---|---|---|
 | [No login for 7 days] | [7 days before churn] | [Strong] | [Re-engagement email sequence] |
 | [Support ticket with escalation] | [14 days before churn] | [Moderate] | [CSM outreach within 48 hours] |
 | [Feature usage dropped >50% WoW] | [10 days before churn] | [Strong] | [In-app nudge with use-case tutorial] |
 ---
 ## 6. Cohort Comparison: What's Changed Over Time
 Compare oldest and newest cohorts to assess whether product improvements are showing up in retention:
 | Metric | [Oldest cohort — e.g. Jan 2024] | [Newest cohort — e.g. Jan 2025] | Change |
 |---|---|---|---|
 | Period 1 retention | [X%] | [X%] | [↑/↓ X pp] |
 | Period 3 retention | [X%] | [X%] | [↑/↓ X pp] |
 | Activation rate | [X%] | [X%] | [↑/↓ X pp] |
 | Avg. sessions in first 30 days | [X] | [X] | [↑/↓] |
 **Verdict:** [Are more recent cohorts performing better or worse? What shipped in that period that might explain the change?]
 ---
 ## 7. Recommendations
 Prioritise by impact on retention curve:
 | # | Recommendation | Target segment | Expected impact | Effort | Priority |
 |---|---|---|---|---|---|
 | 1 | [e.g. Redesign onboarding to hit activation milestone in day 1, not day 7] | [Never-activated segment] | [+X pp P1 retention] | [Medium] | P1 |
 | 2 | [e.g. Launch re-engagement sequence at day 7 inactivity trigger] | [Dormant segment] | [+X pp P2 retention] | [Low] | P1 |
 | 3 | [e.g. Introduce power-user features earlier to accelerate habit formation] | [Casual users] | [+X pp P6 LTV] | [High] | P2 |
 ---
 ## 8. SQL Reference (if applicable)
 Provide the core cohort query so data teams can replicate or extend the analysis:
 ```sql
 -- Retention cohort query
 SELECT
  DATE_TRUNC('month', u.created_at) AS cohort_month,
  DATE_TRUNC('month', e.event_date) AS activity_month,
  DATEDIFF('month', u.created_at, e.event_date) AS period,
  COUNT(DISTINCT e.user_id) AS retained_users,
  COUNT(DISTINCT c.user_id) AS cohort_size,
  ROUND(COUNT(DISTINCT e.user_id) * 100.0 / COUNT(DISTINCT c.user_id), 1) AS retention_rate
 FROM users u
 JOIN events e ON u.user_id = e.user_id
 JOIN (
  SELECT user_id, DATE_TRUNC('month', created_at) AS cohort_month
  FROM users
  WHERE created_at >= '[start_date]'
 ) c ON u.user_id = c.user_id AND DATE_TRUNC('month', u.created_at) = c.cohort_month
 WHERE e.event_type = '[key_retention_event]'
 GROUP BY 1, 2, 3
 ORDER BY 1, 3;
 ```
 ---
 ## Quality Checks
 - [ ] Cohort definition is unambiguous — the same user cannot appear in two cohorts
 - [ ] Retention curve shows a clear plateau, or the analysis notes that the window is too short to see one
 - [ ] LTV projection uses observed retention, not assumed
 - [ ] Behavioural segments are mutually exclusive and exhaustive
 - [ ] Recommendations are tied to specific cohort or segment findings — not generic growth advice
 - [ ] Leading indicators are observable in production data, not just in theory
 ## Anti-Patterns
 - [ ] Do not allow the same user to appear in multiple cohorts — overlapping cohorts produce retention numbers that cannot be compared or acted upon
 - [ ] Do not assume assumed ARPU in LTV projections — use observed revenue per retained user per period, not a blended average that hides segment differences
 - [ ] Do not draw conclusions from cohorts too small to be statistically meaningful — flag minimum cohort size thresholds and note when a cohort is too small to trust
 - [ ] Do not conflate retention rate with engagement rate — a user who logs in but does not complete the key retention event is not retained by the definition used
 - [ ] Do not make recommendations without connecting them to specific cohort or segment findings — generic growth advice that could apply to any product adds no value
 ## Example Trigger Phrases
 - "Run a cohort analysis for our SaaS product"
 - "Analyse retention by acquisition month for the last 12 cohorts"
 - "What's the LTV of users who came via paid vs organic?"
 - "Build a cohort retention model showing period 0 through period 12"
 - "Segment users by behaviour and show me which group retains best"
@@ -0,0 +1,125 @@
 # Dashboard Brief Skill
 This skill converts a business question or monitoring need into a complete, implementation-ready dashboard specification. The output gives a data engineer or BI developer everything they need to build without a follow-up meeting.
 ## Required Inputs
 Ask the user for these if not provided:
 - **The business question this dashboard should answer** (e.g. "How is our activation funnel performing this week?")
 - **Primary audience** (exec / product team / operations / customer success / engineering)
 - **Refresh cadence** (real-time / hourly / daily / weekly)
 - **Data sources available** (e.g. Postgres, BigQuery, Mixpanel, Salesforce, Jira)
 - **BI tool being used** (Looker / Metabase / Tableau / Power BI / Grafana / Custom / Unknown)
 ## Output Structure
 ---
 # Dashboard Brief: [Dashboard Name]
 **Business Question:** [The question this dashboard answers — verbatim from inputs or refined]
 **Audience:** [Who uses this]
 **Refresh Rate:** [Real-time / Hourly / Daily / Weekly]
 **Data Sources:** [List]
 **BI Tool:** [Tool or Unknown]
 ---
 ## Section 1: Key Metrics (KPI Cards)
 List the headline numbers that should appear at the top of the dashboard as KPI cards.
 | Metric | Definition | Data Source | Comparison |
 |---|---|---|---|
 | [Metric name] | [How it's calculated] | [Table/source] | [vs. last week / vs. target / MoM] |
 Aim for 3–6 KPI cards. More than 6 is noise.
 ---
 ## Section 2: Charts & Visualisations
 For each chart, specify:
 ### Chart [N]: [Chart Title]
 - **Chart type:** [Line / Bar / Stacked bar / Pie / Funnel / Heatmap / Table / Scatter]
 - **Why this chart type:** [One sentence — why this type suits this data]
 - **X-axis / Rows:** [Dimension — e.g. Date, User segment, Product]
 - **Y-axis / Values:** [Metric — e.g. Count of active users, Revenue]
 - **Breakdown/colour:** [Optional secondary dimension — e.g. by Plan tier, by Channel]
 - **Data source:** [Table or source]
 - **Filters:** [Any default filters applied — e.g. "Exclude internal test accounts"]
 - **Key insight to surface:** [What pattern or signal this chart should help the viewer spot]
 ---
 ## Section 3: Filters & Controls
 Global filters available to dashboard viewers:
 | Filter | Type | Default | Options |
 |---|---|---|---|
 | Date range | Date picker | Last 30 days | Custom |
 | [Segment filter] | Dropdown | All | [List relevant values] |
 | [Other filter] | Multi-select | All | [List relevant values] |
 ---
 ## Section 4: Layout Recommendation
 Describe the dashboard layout in plain terms:
 ```
 [ROW 1 — KPI Cards]: [Metric 1] | [Metric 2] | [Metric 3] | [Metric 4]
 [ROW 2 — Primary chart, full width]: [Chart name]
 [ROW 3 — Two charts side by side]: [Chart A] | [Chart B]
 [ROW 4 — Supporting table, full width]: [Table name]
 ```
 ---
 ## Section 5: Data Requirements
 List any data transformations, joins, or derived fields needed:
 | Derived Field | Logic | Source Tables |
 |---|---|---|
 | [Field name] | [How it's calculated] | [Tables involved] |
 Flag any fields that may not exist in current data infrastructure.
 ---
 ## Section 6: Access & Ownership
 - **Dashboard owner:** [Leave for user to fill]
 - **Who can edit:** [Leave for user to fill]
 - **Who can view:** [Leave for user to fill]
 - **Review cadence:** [When should this dashboard be reviewed for relevance?]
 ---
 ## Quality Checks
 - [ ] Every chart has a stated "key insight to surface" — not just "show the data"
 - [ ] KPI cards are 3–6 (not more)
 - [ ] Chart types are justified
 - [ ] Layout follows visual hierarchy (summary → detail)
 - [ ] Data requirements section flags any missing fields
 - [ ] Filters are practical and don't require IT to configure
 ## Anti-Patterns
 - [ ] Do not specify metrics that the available data sources cannot actually support — always validate data availability
 - [ ] Do not include more than 8–10 primary metrics on a single dashboard — more creates noise, not insight
 - [ ] Do not skip the primary business question — a dashboard without a north-star question becomes a vanity metrics display
 - [ ] Do not choose chart types for aesthetic reasons — every chart type must match the data relationship it represents
 - [ ] Do not leave filter configurations vague — specify exact filter values, not just filter categories
 ## Example Trigger Phrases
 - "Design a dashboard to track [business process]"
 - "Give me a spec for a [team] performance dashboard"
 - "What should go on a [topic] dashboard?"
 - "Write a dashboard brief for our [metric] monitoring"
@@ -0,0 +1,224 @@
 # Data Pipeline Spec Skill
 This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Pipeline purpose** — what business question or workflow does this pipeline serve?
 - **Source systems** — where does data come from? (databases, APIs, files, event streams)
 - **Destination** — where does data land? (data warehouse, data lake, downstream DB, reporting tool)
 - **Transformation type** — ETL (transform before loading) or ELT (load raw, transform in warehouse)?
 - **Frequency / SLA** — how often must data be fresh? (real-time / hourly / daily / weekly)
 - **Volume estimate** — approximate rows/events per run
 - **Data quality requirements** — completeness, deduplication, freshness, schema enforcement
 - **Team or stack** — any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)
 ## Output Structure
 ---
 # Data Pipeline Spec: [Pipeline Name]
 **Purpose:** [One sentence — what decision or workflow does this pipeline enable?]
 **Type:** [ETL / ELT / Streaming / Batch]
 **Owner:** [Team or individual]
 **Version:** [1.0]
 **Date:** [Date]
 **Status:** [Draft / Under Review / Approved]
 ---
 ## 1. Overview
 [2–3 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why.]
 **Architecture diagram (text):**
 ```
 [Source A] ──┐
 [Source B] ──┤──► [Ingestion Layer] ──► [Transform Layer] ──► [Destination] ──► [Consumers]
 [Source C] ──┘
 ```
 ---
 ## 2. Sources
 | Source | System | Connection type | Data format | Update pattern | Volume |
 |---|---|---|---|---|---|
 | [Source 1] | [PostgreSQL / Salesforce / S3 / Kafka] | [JDBC / REST API / SDK / Webhook] | [JSON / CSV / Parquet / CDC] | [Append / Full refresh / Incremental] | [X rows/day] |
 | [Source 2] | [...] | [...] | [...] | [...] | [...] |
 **Incremental key (if applicable):** [The column used to identify new or changed records — e.g. `updated_at`, `event_id`]
 **Authentication:** [API key / OAuth / IAM role / connection string — note where credentials are stored]
 ---
 ## 3. Ingestion Layer
 **Tool:** [Fivetran / Airbyte / Kafka Connect / custom script / dbt source]
 **Ingestion method:**
 - [ ] Full extract (full table refresh each run)
 - [ ] Incremental extract (only new/changed rows since last run)
 - [ ] CDC (change data capture from database transaction log)
 - [ ] Event streaming (continuous ingestion from Kafka/Kinesis)
 **Raw landing zone:** [Where raw data lands before transformation — e.g. `raw.salesforce_opportunities` in Snowflake, S3 bucket `s3://data-raw/crm/`]
 **Schema handling:** [Strict schema enforcement / Schema evolution allowed / Union schema]
 ---
 ## 4. Transformation Logic
 List each transformation in execution order. For ELT pipelines, this is the dbt model or SQL layer.
 | Step | Name | Description | Input | Output | Tool |
 |---|---|---|---|---|---|
 | 1 | [Deduplicate events] | [Remove duplicate event rows based on event_id] | `raw.events` | `staging.events_deduped` | [dbt / SQL / Spark] |
 | 2 | [Join user profile] | [Enrich events with user attributes from CRM] | `staging.events_deduped`, `raw.users` | `staging.events_enriched` | [...] |
 | 3 | [Aggregate to daily] | [Roll up to user×day grain] | `staging.events_enriched` | `mart.user_daily_activity` | [...] |
 **Business logic rules:**
 - [e.g. Revenue is recognised on `payment_confirmed_at`, not `payment_initiated_at`]
 - [e.g. Users in the `internal@company.com` domain are excluded from all metrics]
 - [e.g. Currency conversion uses the ECB rate from the first business day of each month]
 **Slowly Changing Dimensions (SCD) — if applicable:**
 - [e.g. `users.plan_tier` is SCD Type 2 — keep history of plan changes with `valid_from` / `valid_to`]
 ---
 ## 5. Destination
 | Destination | System | Schema / Table | Write mode | Consumers |
 |---|---|---|---|---|
 | [Primary] | [Snowflake / BigQuery / Redshift / PostgreSQL] | [`analytics.mart_user_activity`] | [Append / Upsert / Full replace] | [Looker / Metabase / downstream pipeline] |
 | [Secondary] | [...] | [...] | [...] | [...] |
 **Partitioning / Clustering:** [e.g. Partitioned by `event_date`, clustered by `user_id` — reduces query cost for time-range scans]
 **Retention policy:** [e.g. Raw data retained for 90 days; mart tables retained indefinitely]
 ---
 ## 6. Scheduling & SLAs
 | SLA | Target | Breach action |
 |---|---|---|
 | **Data freshness** | [Data must be ≤ X hours old by HH:MM UTC] | [Page on-call / alert Slack channel] |
 | **Pipeline completion** | [Must complete within X minutes of trigger] | [Alert and auto-retry] |
 | **Availability** | [Pipeline must run successfully X% of days per month] | [Incident review] |
 **Schedule:** [Cron expression and human description — e.g. `0 6 * * *` — daily at 06:00 UTC]
 **Trigger type:**
 - [ ] Time-based (cron)
 - [ ] Event-based (triggered by upstream pipeline success / file arrival / Kafka lag)
 - [ ] Manual (ad hoc runs only)
 **Backfill strategy:** [How to reprocess historical data if the pipeline fails or logic changes — e.g. parameterised date range, full drop-and-reload]
 ---
 ## 7. Data Quality Rules
 | Check | Table | Rule | Failure action |
 |---|---|---|---|
 | Completeness | `staging.events` | `event_id IS NOT NULL` — 100% of rows | Block load / Alert |
 | Uniqueness | `mart.user_daily_activity` | `(user_id, date)` must be unique | Block load |
 | Freshness | `mart.user_daily_activity` | `max(event_date) >= CURRENT_DATE - 1` | Alert |
 | Volume | `staging.events` | Row count within ±20% of 7-day average | Alert |
 | Referential integrity | `staging.events` | All `user_id` values exist in `users` table | Alert |
 **DQ tool:** [dbt tests / Great Expectations / Monte Carlo / custom SQL assertions]
 ---
 ## 8. Error Handling & Recovery
 **Retry policy:** [e.g. 3 retries with exponential back-off: 5 min, 20 min, 60 min]
 **Failure modes and responses:**
 | Failure | Detection | Response | Owner |
 |---|---|---|---|
 | Source unavailable | HTTP 5xx / connection timeout | Retry 3×, then alert and skip run | Data engineering |
 | Schema change in source | Column missing or type mismatch | Block load, alert schema owner | Data owner + engineering |
 | DQ check fails | dbt test failure / assertion error | Block load for P1 checks; alert for P2 | Data engineering |
 | Partial load | Row count < expected threshold | Alert; do not publish to consumers until resolved | Data engineering |
 **Dead-letter queue:** [Where failed records are routed for manual inspection — e.g. `raw.dlq_events`]
 ---
 ## 9. Monitoring & Observability
 **Metrics to track:**
 - Pipeline run duration (p50, p95)
 - Rows processed per run
 - DQ check pass rate
 - Source freshness lag
 - Error rate per source
 **Alerting:**
 - [Slack channel: #data-alerts]
 - [PagerDuty: data-on-call escalation for P1 SLA breaches]
 - [Dashboard: [link to monitoring dashboard]]
 **Logging:** [What gets logged and where — e.g. Airflow task logs to CloudWatch, structured JSON to data lake]
 ---
 ## 10. Dependencies & Sequencing
 **Upstream dependencies:** [Which pipelines or data sources must succeed before this pipeline runs?]
 **Downstream dependents:** [Which dashboards, pipelines, or models depend on this pipeline's output?]
 ```
 [upstream pipeline A] ──► THIS PIPELINE ──► [downstream dashboard B]
                                          └──► [downstream pipeline C]
 ```
 **Coordination mechanism:** [Airflow DAG dependency / dbt ref() / event trigger / manual gate]
 ---
 ## 11. Security & Compliance
 - **PII fields:** [List columns containing PII — e.g. `email`, `ip_address`, `name`]
 - **Masking / Pseudonymisation:** [e.g. email hashed with SHA-256 before landing in mart layer]
 - **Access control:** [Who can query the destination tables? — e.g. Role-based access in Snowflake]
 - **Data residency:** [Which regions is data permitted to transit and rest in?]
 - **Audit trail:** [Is pipeline execution auditable for compliance purposes? Where are logs retained?]
 ---
 ## Quality Checks
 - [ ] Every source has an incremental key or full-refresh justification
 - [ ] Business logic rules are documented, not just the SQL
 - [ ] SLAs are agreed with consumers, not set unilaterally by engineering
 - [ ] DQ checks cover completeness, uniqueness, freshness, and volume
 - [ ] Failure modes include a documented recovery owner
 - [ ] PII fields are identified and a treatment plan is specified
 ## Anti-Patterns
 - [ ] Do not spec a pipeline without defining SLAs — "as fast as possible" is not an acceptable freshness target
 - [ ] Do not omit error handling and dead-letter queue strategy — every pipeline must specify what happens to failed records
 - [ ] Do not design idempotent loads without documenting the deduplication key — assume reruns will happen
 - [ ] Do not leave data quality rules implicit — schema validation, null checks, and referential integrity must be explicit
 - [ ] Do not ignore schema evolution — specify how upstream schema changes are detected and handled
 ## Example Trigger Phrases
 - "Design a data pipeline for our Salesforce to Snowflake sync"
 - "Write a pipeline spec for ingesting Stripe events into our data warehouse"
 - "Build an ETL spec for our user activity data"
 - "Document our dbt pipeline from raw events to the analytics mart"
 - "Spec out the pipeline that feeds the executive dashboard"
@@ -0,0 +1,106 @@
 # Metrics Framework Skill
 This skill builds a complete metrics framework tailored to a product or business. It connects the North Star metric to actionable leading indicators, making it clear which metrics to track, which to optimise, and how they relate to each other.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Product or business description** (one paragraph is enough)
 - **Business model** (SaaS / Marketplace / E-commerce / Consumer app / B2B / Other)
 - **Stage** (Pre-PMF / Growth / Scale / Mature)
 - **Framework preference** (if they have one): North Star + Metric Tree / AARRR / HEART / OKRs / Custom
 - **Primary goal this quarter** (e.g. grow activation, reduce churn, increase revenue)
 If no framework preference is given, recommend the best fit based on stage and business model.
 ## Output Structure
 ### 1. Framework Recommendation (if not specified)
 Explain in 2–3 sentences why you're recommending this framework for their context.
 ---
 ### 2. North Star Metric
 **[Metric Name]:** [Definition — exactly what is measured and how]
 **Why this is the right North Star for this business:**
 [2–3 sentences. It should reflect customer value delivered, not just revenue or activity. Explain what behaviour it captures and why maximising it correlates with long-term business health.]
 **How to measure it:** [Formula or data source]
 **Current baseline:** [Leave as [ADD BASELINE] for user to fill]
 **Target:** [Leave as [ADD TARGET] for user to fill]
 ---
 ### 3. Metric Tree
 Show how supporting metrics roll up to the North Star. Format as a hierarchy:
 ```
 [North Star Metric]
 ├── [Driver 1: e.g. Acquisition]
 │   ├── [L2 metric: e.g. Organic signups / week]
 │   └── [L2 metric: e.g. Paid CAC by channel]
 ├── [Driver 2: e.g. Activation]
 │   ├── [L2 metric: e.g. % users completing onboarding within 7 days]
 │   └── [L2 metric: e.g. Time to first value action]
 └── [Driver 3: e.g. Retention]
    ├── [L2 metric: e.g. Day 30 retention rate]
    └── [L2 metric: e.g. Feature adoption depth]
 ```
 For each L2 metric, provide:
 - **Definition:** [What exactly is measured]
 - **Why it matters:** [How it connects to the North Star]
 - **Leading or lagging?** [Leading = predictive / Lagging = outcome]
 - **How to measure:** [Data source or calculation]
 ---
 ### 4. Counter-Metrics
 [2–3 metrics to watch that prevent optimising the North Star in ways that damage the business. E.g. "If we optimise for signups, we need to watch spam account rate. If we optimise for engagement, we need to watch support ticket volume."]
 ---
 ### 5. Dashboard Recommendation
 Suggest a 3-tier dashboard structure:
 - **Exec view (weekly):** [3–5 metrics — outcomes only]
 - **Team view (daily):** [7–10 metrics — leading indicators + outputs]
 - **Diagnostic view (on demand):** [Metrics to drill into when something looks wrong]
 ---
 ### 6. Metric Health Check Questions
 [5 questions the team should ask in their weekly metrics review to turn numbers into insights. e.g. "Is our activation rate improving while retention stays flat? That suggests onboarding quality issue, not a product-market fit problem."]
 ---
 ## Quality Checks
 - [ ] North Star reflects customer value, not just business activity
 - [ ] Metric tree has 3–4 distinct drivers (not all one category)
 - [ ] Each L2 metric is classified as leading or lagging
 - [ ] Counter-metrics are included to prevent perverse incentives
 - [ ] Dashboard tiers are tailored to the product stage
 - [ ] All metric definitions are unambiguous (formula or clear description)
 ## Anti-Patterns
 - [ ] Do not set a North Star metric that measures business activity (revenue, pageviews) rather than customer value delivered — this creates incentives misaligned with product quality
 - [ ] Do not define metrics without specifying the formula or data source — an ambiguous metric will be measured differently by different people
 - [ ] Do not skip counter-metrics — optimising any single metric without a guard rail will eventually produce perverse incentives
 - [ ] Do not include more than 4–5 metrics in a daily team view — a dashboard with 20 metrics is a dashboard nobody looks at
 - [ ] Do not classify all metrics as "leading" — be honest about which are lagging outcome metrics and which genuinely predict future outcomes
 ## Example Trigger Phrases
 - "Build a metrics framework for [product]"
 - "What should our North Star metric be?"
 - "Create a KPI tree for [business]"
 - "Give me an AARRR breakdown for [product]"
 - "What metrics should our [team type] team track?"
@@ -0,0 +1,138 @@
 # SQL Query Explainer Skill
 This skill explains SQL queries in plain language, identifies optimisation opportunities, and helps communicate data logic to non-technical stakeholders. It also writes and documents new queries from natural language descriptions.
 ## Modes
 Detect which mode the user needs based on their request:
 1. **Explain** — Translate existing SQL into plain English
 2. **Optimise** — Review SQL for performance issues and suggest improvements
 3. **Write** — Generate SQL from a natural language description
 4. **Document** — Produce a data dictionary or query documentation
 ---
 ## Mode 1: Explain
 When given a SQL query, produce:
 ### Plain English Summary
 [1–3 sentences. What does this query do? What data does it return? Write as if explaining to a business analyst, not a developer.]
 ### Step-by-Step Walkthrough
 Break the query into logical sections. For each section:
 - Quote the SQL clause
 - Explain what it does in plain English
 - Flag any complexity (e.g. window functions, subqueries, CTEs)
 ### What the Result Looks Like
 [Describe the shape of the output: "Returns one row per user, with columns for X, Y, Z. Ordered by [field] descending."]
 ### Potential Issues to Flag
 - [Gotchas, edge cases, or implicit assumptions in this query]
 - [e.g. "This will include NULLs in the user_id column if the LEFT JOIN finds no match"]
 ---
 ## Mode 2: Optimise
 When asked to optimise a query, produce:
 ### Performance Assessment
 Rate overall: 🟢 Well-optimised / 🟡 Some improvements possible / 🔴 Significant issues
 ### Issues Found
 For each issue:
 **Issue [N]: [Short name, e.g. "Missing index on join column"]**
 - **What it is:** [Plain explanation]
 - **Why it matters:** [Performance impact — e.g. "Full table scan on a 10M row table"]
 - **Fix:**
 ```sql
 -- Before
 [original snippet]
 -- After
 [improved snippet]
 ```
 - **Expected improvement:** [Estimate if possible]
 ### Optimisation Checklist
 - [ ] SELECT * used? (Replace with specific columns)
 - [ ] Implicit type conversions on JOIN/WHERE columns?
 - [ ] Missing indexes on JOIN or WHERE columns?
 - [ ] N+1 patterns (queries inside loops)?
 - [ ] DISTINCT used where GROUP BY would be faster?
 - [ ] Window functions used where a subquery would be clearer/faster?
 - [ ] CTEs re-used or materialised unnecessarily?
 - [ ] Large IN() lists that could use a JOIN instead?
 ---
 ## Mode 3: Write
 When given a natural language description, generate the SQL query and then explain it using Mode 1.
 Ask the user to confirm:
 - **Database/dialect** (PostgreSQL / MySQL / BigQuery / Snowflake / SQLite / Standard SQL)
 - **Table and column names** (if known; otherwise use descriptive placeholder names like `users`, `orders`, `user_id`)
 - **Any filters, sorting, or aggregation requirements**
 Produce:
 1. The SQL query with inline comments
 2. Plain English explanation (Mode 1 format)
 ---
 ## Mode 4: Document
 When asked to create documentation for a query or table:
 ### Query Documentation
 ```
 Query: [Name]
 Purpose: [One sentence — what business question this answers]
 Author: [If provided]
 Last reviewed: [If provided]
 Inputs:
  - Table: [table_name] — [what it contains]
  - Filter: [any WHERE conditions and their business meaning]
 Output columns:
  | Column | Type | Description |
  |--------|------|-------------|
  | [name] | [type] | [plain English description] |
 Assumptions:
  - [Any implicit assumptions the query makes]
 Known limitations:
  - [Edge cases not handled, data quality dependencies, etc.]
 ```
 ---
 ## Quality Checks
 - [ ] Plain English explanation avoids SQL jargon
 - [ ] Optimisation suggestions include before/after SQL
 - [ ] Written queries include inline comments
 - [ ] Output shape is described (columns, row grain, ordering)
 - [ ] Dialect-specific syntax is flagged when non-standard
 ## Example Trigger Phrases
 - "Explain this SQL query: [paste query]"
 - "Optimise this slow query: [paste query]"
 - "Write a SQL query that [natural language description]"
 - "Document this query for my non-technical stakeholders"
 - "Why is this query returning unexpected results?"
@@ -0,0 +1,116 @@
 # A/B Test Planner Skill
 Design experiments that produce trustworthy results — not just directional signals. Every test output includes hypothesis, success metrics, sample size, duration, and a results interpretation guide.
 ## Required Inputs
 Ask the user for these if not provided:
 - **What is being tested** (feature, UI change, copy, pricing, onboarding step)
 - **Hypothesis** (or ask to help formulate one)
 - **Primary metric** (conversion rate, click-through, completion rate, etc.)
 - **Baseline rate** and **minimum detectable effect** (MDE)
 - **Daily eligible users** (to calculate duration)
 ## Experiment Design Checklist
 Before running any test, confirm:
 - [ ] Clear hypothesis with predicted direction
 - [ ] Single primary metric (plus up to 2 guardrail metrics)
 - [ ] Minimum detectable effect (MDE) defined
 - [ ] Sample size calculated
 - [ ] Test duration estimated
 - [ ] Segment isolated (no overlap with other running tests)
 - [ ] Rollback plan defined
 ## Hypothesis Template
 > "We believe that [change] will cause [primary metric] to [increase/decrease] by [X%] for [user segment], because [rationale based on data or insight]."
 Never run a test without a directional hypothesis. "Let's just see what happens" is not a hypothesis.
 ## Sample Size Calculator Logic
 Use this formula (provide the output, not the formula, to the user):
 - **Baseline conversion rate:** Current rate of primary metric
 - **MDE:** Smallest change worth detecting (recommend 10–20% relative lift for most features)
 - **Statistical power:** 80% (standard)
 - **Significance level:** 95% (p < 0.05)
 For common scenarios, provide pre-calculated estimates:
 | Baseline Rate | MDE (Relative) | Required Sample per Variant |
 |---|---|---|
 | 5% | 20% | ~19,000 |
 | 10% | 15% | ~14,000 |
 | 20% | 10% | ~15,000 |
 | 40% | 10% | ~9,500 |
 | 60% | 5% | ~42,000 |
 Always warn: "These are estimates. Use a tool like Evan Miller's calculator or Statsig for precision."
 ## Test Duration Guidance
 Minimum: 2 full weeks (to capture weekly seasonality)
 Maximum: 4 weeks (novelty effect distorts results beyond this)
 `Duration = Required sample ÷ (Daily traffic × % exposed)`
 Flag if traffic is too low to reach significance in under 8 weeks — recommend a different approach (e.g., holdout test, qualitative research).
 ## Output Format
 ### A/B Test Plan — [Test Name] — [Date]
 **Hypothesis:**
 > [Filled hypothesis template]
 **Variants:**
 - Control (A): [Current experience]
 - Treatment (B): [Changed experience — be specific]
 **Primary Metric:** [Metric name + how measured]
 **Guardrail Metrics:** [Metrics that must not degrade]
 **Target Segment:** [Who sees the test — % of traffic, user type]
 **Traffic Split:** [50/50 recommended unless ramp-up needed]
 **Sample Size Required:** ~[N] users per variant
 **Estimated Duration:** [X] weeks (based on [Y] daily eligible users)
 **Significance Threshold:** 95% confidence, 80% power
 **Exclusions:** [Any user segments to exclude and why]
 **Rollback Trigger:** If [guardrail metric] degrades by [X%], stop the test immediately.
 **Results Interpretation Guide:**
 - ✅ Ship if: Treatment shows [X%]+ lift on primary metric at 95% confidence AND guardrail metrics are stable
 - 🔄 Iterate if: Direction is positive but not significant — consider extending or redesigning
 - ❌ Reject if: No lift or negative direction at significance
 - ⚠️ Inconclusive: Do not ship. Do not call it a win.
 ---
 ## Guidelines
 - Always recommend against peeking at results before the test reaches planned sample size — explain p-hacking risk
 - If user wants to test multiple variants, explain the multiple comparisons problem and recommend a Bonferroni correction or a Bayesian approach
 - If traffic is very low (<1,000 users/day), recommend qualitative alternatives: moderated testing, 5-second tests, or user interviews
 - Never approve a test with no guardrail metrics — always protect revenue, retention, or core engagement
 ## Anti-Patterns
 - [ ] Do not run a test without a directional hypothesis — "let's see what happens" produces uninterpretable results
 - [ ] Do not declare a winner before reaching the pre-planned sample size — peeking at results inflates false positive rates
 - [ ] Do not test multiple independent changes in a single variant — you won't know which change caused the result
 - [ ] Do not use engagement metrics (clicks, time-on-page) as the primary metric when the goal is revenue or retention — proxy metrics mislead
 - [ ] Do not ignore guardrail metrics — a conversion lift that causes a support ticket spike is not a win
 ## Quality Checks
 - [ ] Hypothesis is directional (predicts a specific direction and magnitude, not "let's see")
 - [ ] Primary metric is singular (guardrail metrics are secondary)
 - [ ] Sample size is calculated from actual MDE and baseline (not guessed)
 - [ ] Test duration accounts for weekly seasonality (minimum 2 weeks)
 - [ ] Guardrail metrics are defined (at least one to protect revenue or core engagement)
 - [ ] Rollback trigger is specified with a concrete threshold
@@ -0,0 +1,136 @@
 # Go-to-Market Planner Skill
 Produce a complete, cross-functional GTM plan that aligns product, marketing, sales, and support around a single launch — with clear owners, timelines, and success metrics.
 ## Launch Tier Framework
 Before planning, classify the launch:
 | Tier | Scope | Typical Effort | Examples |
 |---|---|---|---|
 | **Tier 1 — Major Launch** | New product / significant platform change | 8–12 weeks | New pricing model, platform rebrand, new product line |
 | **Tier 2 — Feature Launch** | Significant new capability | 4–6 weeks | Major feature, API release, new integration |
 | **Tier 3 — Incremental Release** | Improvement, bug fix, minor feature | 1–2 weeks | UI tweak, performance improvement, small enhancement |
 Always confirm tier with the user before proceeding.
 ---
 ## GTM Plan Output Format
 ### GTM Plan — [Product/Feature Name] — [Launch Date]
 **Launch Tier:** [1 / 2 / 3]
 **Launch Owner (PM):** [Name]
 **Target Launch Date:** [Date]
 **Soft Launch Date (Beta/Limited):** [Date, if applicable]
 ---
 ### 1. What We're Launching
 **One-line description:** [What it is, for whom, and why now]
 **Key customer problem solved:** [Specific pain point]
 **Key differentiator:** [Why ours, why now]
 ---
 ### 2. Target Audience
 **Primary segment:** [Who benefits most — be specific]
 **Secondary segment:** [Who else benefits]
 **Not for:** [Who this is NOT for — helps sales and support]
 ---
 ### 3. Messaging
 **Headline:** [Customer-facing headline — lead with outcome, not feature]
 **Sub-headline:** [Supporting context — how it works or why it matters]
 **3 key messages:**
 1. [Problem solved]
 2. [How it works / what's new]
 3. [Proof / social proof / data]
 **Elevator pitch (30 seconds):**
 > [For [target user] who [has this problem], [product/feature] is a [category] that [key benefit]. Unlike [alternative], we [differentiator].]
 ---
 ### 4. Launch Activities by Function
 | Function | Activity | Owner | Due Date | Status |
 |---|---|---|---|---|
 | Product | Feature flagging / rollout plan | PM | [date] | |
 | Marketing | Blog post / landing page | Marketing | [date] | |
 | Marketing | Email campaign to existing users | Marketing | [date] | |
 | Marketing | Social media content | Marketing | [date] | |
 | Sales | Sales enablement deck | PM + Sales | [date] | |
 | Sales | FAQ for sales team | PM | [date] | |
 | Support | Help centre articles | Support | [date] | |
 | Support | Support team training | Support | [date] | |
 | Engineering | Monitoring/alerting in place | Eng | [date] | |
 ---
 ### 5. Success Metrics
 | Metric | Baseline | Target | Measurement Window |
 |---|---|---|---|
 | [Adoption metric] | [X] | [Y] | 30 days post-launch |
 | [Engagement metric] | [X] | [Y] | 60 days post-launch |
 | [Business metric] | [X] | [Y] | 90 days post-launch |
 ---
 ### 6. Risks & Contingencies
 | Risk | Likelihood | Impact | Mitigation |
 |---|---|---|---|
 | [Risk] | H/M/L | H/M/L | [Action if it happens] |
 ---
 ### 7. Launch Day Checklist
 - [ ] Feature live for [X%] of users
 - [ ] Monitoring dashboard active
 - [ ] Support team briefed
 - [ ] Blog post published
 - [ ] Email sent / scheduled
 - [ ] Sales team notified
 - [ ] Executive announcement sent (if Tier 1)
 - [ ] Rollback procedure confirmed
 ---
 ## Required Inputs
 Ask the user for these if not provided:
 - **Product or feature name**
 - **Target launch date**
 - **Launch tier** (Tier 1 / 2 / 3 — or describe scope and the skill will classify)
 - **Target audience** (who benefits and who it's NOT for)
 - **Key message** (what's the headline outcome for the customer)
 - **PM and launch owner**
 ## Guidelines
 - Never plan a Tier 1 launch without at least 8 weeks of lead time
 - Always include a "Not for" section — it prevents misdirected sales and support tickets
 - Recommend a soft launch to 5–10% of users before full rollout for any Tier 1 or 2 launch
 - Post-launch retrospective should be scheduled at launch planning time — don't leave it to chance
 ## Quality Checks
 - [ ] Launch tier is confirmed and appropriate for scope
 - [ ] "Not for" section is included to prevent misdirected sales and support
 - [ ] Every function has at least one activity with a named owner and due date
 - [ ] Success metrics include a measurement window (30/60/90 days)
 - [ ] Rollback procedure is confirmed for Tier 1 and 2 launches
 - [ ] Post-launch retrospective is scheduled
 ## Anti-Patterns
 - [ ] Do not build a Tier 1 GTM plan for an incremental feature update — tier the launch appropriately before planning
 - [ ] Do not create activity lists without named owners and due dates — unowned tasks do not get done
 - [ ] Do not skip the rollback procedure for Tier 1 and 2 launches — every significant launch must have an abort plan
 - [ ] Do not treat marketing and engineering as separate tracks — cross-functional coordination is the whole point of a GTM plan
 - [ ] Do not set success metrics without a defined measurement window — "increase signups" is not a measurable target
@@ -0,0 +1,96 @@
 # PPTX Slide Auditor Skill
 Runs a systematic visual and structural audit of a PowerPoint presentation — identifying layout issues, text overflow, inconsistent styling, weak visual hierarchy, and slides that will cause problems in a presentation setting. Built to leverage Opus 4.7 vision improvements for pixel-level layout analysis.
 ## Required Inputs
 Ask the user for these if not provided:
 - **The deck** (upload the .pptx file or individual slide screenshots)
 - **Audience** (internal team / executive / external client / conference / investor)
 - **Presentation mode** (presented live / sent to read / shared async on video)
 - **Areas of concern** (optional — e.g. "I think slide 12 is overcrowded")
 ## Output Structure
 ### 1. Deck Overview
 | Metric | Result |
 |---|---|
 | Total slides | N |
 | Overall status | Ready / Minor fixes needed / Major revisions required |
 | Readability score | /10 |
 | Visual consistency score | /10 |
 | Most common issue | [Pattern observed across multiple slides] |
 ### 2. Slide-by-Slide Audit
 For each slide with issues:
 **Slide N: [Slide title]**
 - Status: Ready / Fix before sending / Major revision
 - Issues found:
  - [Specific issue with exact location — e.g. "Body text extends beyond the text frame on the right side"]
  - [Issue 2]
 - Suggested fix: [Specific action — move element, reduce text, resize]
 Slides with no issues: just list the slide numbers. Do not write anything else about them.
 ### 3. Pattern Issues Across the Deck
 Issues that repeat across multiple slides:
 **[Pattern title — e.g. "Inconsistent body text size"]**
 - Slides affected: [list]
 - Root cause: [master slide issue / manual overrides / mixed templates]
 - Fix: [Single action to resolve across all affected slides]
 ### 4. Visual Hierarchy Check
 | Dimension | Status | Notes |
 |---|---|---|
 | Title consistency (size, font, colour) | Pass / Fail | |
 | Body text readability at presentation distance | Pass / Fail | |
 | Image placement alignment | Pass / Fail | |
 | Whitespace and breathing room | Pass / Fail | |
 | Data visualisation clarity | Pass / Fail / N/A | |
 ### 5. Audience-Specific Flags
 Based on the stated audience:
 - **Executive audience:** flag slides with too much text, complex tables, or unclear bottom-line messages
 - **External client:** flag slides with internal jargon, unfinished placeholder text, or confidentiality concerns
 - **Live presentation:** flag slides that will be hard to read from the back of a room
 - **Async/video:** flag slides that assume a presenter voiceover
 ### 6. Prioritised Fix List
 | # | Fix | Slide | Effort | Impact |
 |---|---|---|---|---|
 | 1 | [Specific fix] | Slide N | Low/Med/High | High |
 Order by: fixes before handoff (critical) > consistency fixes (high) > polish (medium).
 ## Quality Checks
 - [ ] Every issue references a specific slide number and location on the slide
 - [ ] Pattern issues are identified separately from slide-specific issues
 - [ ] Fix list is ordered by impact, not by slide order
 - [ ] Audience-appropriate concerns flagged explicitly
 - [ ] Slides without issues are listed briefly, not ignored
 ## Anti-Patterns
 - [ ] Do not flag stylistic preferences as issues — only report genuine layout problems, overflow, and consistency errors
 - [ ] Do not produce a flat list of issues — group by severity (Critical / Major / Minor) so fixes can be prioritised
 - [ ] Do not skip slides without commenting — every slide must have an explicit pass or issue status
 - [ ] Do not suggest redesigning content — the audit scope is layout, consistency, and readability, not messaging
 - [ ] Do not report the same issue type repeatedly across slides without summarising the pattern — consolidate repeated issues
 ## Example Trigger Phrases
 - "Audit this slide deck before my board meeting"
 - "Review this PowerPoint for layout issues"
 - "Check this presentation for consistency problems"
 - "QA my deck before I send it to the client"
 - "What is wrong with slide 7 in this deck?"
 ## Why This Works Better on Opus 4.7
 Earlier models struggled with precise spatial analysis of slide layouts — they would hallucinate issues or miss obvious overflow problems. Opus 4.7 vision improvements mean coordinates map 1:1 to pixels, making slide-level issue detection reliable without manual screenshot annotation.
@@ -0,0 +1,137 @@
 # Product Launch Checklist Skill
 Never launch without checking everything. Generate a complete, role-assigned checklist covering pre-launch readiness, launch day execution, and post-launch monitoring.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Launch name** and planned launch date
 - **Launch tier** (1 = major product launch, 2 = significant feature release, 3 = incremental update)
 - **Team members and their roles** (engineering lead, PM, marketing, support, etc.)
 - **Feature description** (what is being launched)
 - **Rollback capability** (can this be feature-flagged or reverted quickly?)
 ## How to Use This Skill
 Provide:
 - Launch name and date
 - Launch tier (1 = major, 2 = feature, 3 = incremental)
 - Team members and their roles
 The skill generates a tiered checklist. Tier 3 launches use only the Essentials section. Tier 2 adds Marketing & Comms. Tier 1 uses all sections.
 ---
 ## Output Format
 ### Launch Checklist — [Feature/Product Name] — Target Date: [Date]
 **Launch Tier:** [1 / 2 / 3]
 **Launch Owner:** [PM Name]
 **Engineering Lead:** [Name]
 **Go/No-Go Decision By:** [Date and time — typically 24 hours before launch]
 ---
 ### 🔧 PRE-LAUNCH — Engineering & Product (T-2 weeks)
 - [ ] Feature flag created and tested in staging
 - [ ] All acceptance criteria signed off by PM
 - [ ] Code reviewed and merged to main
 - [ ] QA sign-off completed (regression + new feature)
 - [ ] Performance testing completed (load, latency)
 - [ ] Security review completed (if data or auth changes)
 - [ ] Rollback procedure documented and tested
 - [ ] Monitoring and alerting configured
 - [ ] Error logging in place with correct severity levels
 - [ ] Database migrations tested on staging with production data volume
 ### 📢 PRE-LAUNCH — Marketing & Comms (T-1 week)
 - [ ] Blog post written, reviewed, and scheduled
 - [ ] In-app announcement or tooltip configured
 - [ ] Email campaign drafted and QA'd
 - [ ] Social media posts drafted and scheduled
 - [ ] Landing page or feature page live in staging
 - [ ] Press outreach sent (Tier 1 only)
 - [ ] Product Hunt / community posts prepared (Tier 1 only)
 ### 🎓 PRE-LAUNCH — Sales & Support (T-1 week)
 - [ ] Sales enablement one-pager completed
 - [ ] FAQ document shared with sales and support teams
 - [ ] Help centre articles written and published
 - [ ] Support team demo / training completed
 - [ ] Customer success team briefed on top accounts
 - [ ] Pricing updated (if applicable)
 - [ ] Contracts / ToS updated (if applicable)
 ### 📊 PRE-LAUNCH — Analytics (T-1 week)
 - [ ] Analytics events firing correctly in staging
 - [ ] Dashboard configured for launch metrics
 - [ ] Baseline metrics documented
 - [ ] Success criteria documented and shared with team
 - [ ] A/B test configured (if applicable)
 ---
 ### ✅ GO / NO-GO DECISION — T-24 hours
 | Criteria | Status | Owner |
 |---|---|---|
 | All critical bugs resolved | 🟢 / 🔴 | Eng Lead |
 | QA sign-off complete | 🟢 / 🔴 | QA |
 | Rollback tested | 🟢 / 🔴 | Eng Lead |
 | Help centre articles live | 🟢 / 🔴 | Support |
 | Monitoring active | 🟢 / 🔴 | Eng Lead |
 | PM sign-off | 🟢 / 🔴 | PM |
 **Go / No-Go Decision:** [GO / NO-GO]
 **Decision Owner:** [PM + Eng Lead jointly]
 ---
 ### 🚀 LAUNCH DAY
 - [ ] Feature flag enabled for [X%] of users (start low — 5–10%)
 - [ ] Launch confirmed in team Slack/channel
 - [ ] Metrics dashboard open and being monitored
 - [ ] Error rate checked at T+15 min, T+1 hr, T+4 hr
 - [ ] Blog post published / email sent
 - [ ] Social posts live
 - [ ] Support team on standby for first 4 hours
 - [ ] PM available and reachable all day
 - [ ] Feature flag expanded to 50% if T+2hr checks pass
 - [ ] Feature flag expanded to 100% if T+4hr checks pass
 ---
 ### 📈 POST-LAUNCH (D+7, D+30)
 - [ ] D+7 metrics review: adoption, errors, support tickets
 - [ ] D+7 customer feedback synthesised
 - [ ] Retrospective scheduled
 - [ ] Learnings documented
 - [ ] D+30 success metrics reviewed against targets
 - [ ] Feature flag removed from codebase (clean up)
 - [ ] Follow-up features added to backlog based on feedback
 ---
 ## Quality Checks
 - [ ] Launch tier confirmed before generating checklist (scope determines depth)
 - [ ] Go/No-Go decision has a named owner and a specific decision time
 - [ ] Rollback procedure is documented and tested (not just planned)
 - [ ] Feature flag expansion is staged (5% → 50% → 100%), not all-at-once
 - [ ] Post-launch retrospective is scheduled at launch time
 ## Anti-Patterns
 - [ ] Do not apply a Tier 1 checklist to an incremental update — tier the launch appropriately before generating the checklist
 - [ ] Do not launch on a Friday without confirmed weekend engineering coverage
 - [ ] Do not leave the Go/No-Go decision owner as "the team" — it must be a named individual
 - [ ] Do not skip the rollback plan for Tier 1 and 2 launches — know the revert time before going live
 - [ ] Do not close the launch without scheduling the post-launch retrospective — it must be booked at launch time, not after
 ## Guidelines
 - The Go/No-Go decision must have a named owner — "the team" is not an owner
 - Never launch on a Friday unless you have weekend engineering coverage
 - Recommend starting all launches at <10% traffic — even for simple features
 - Document rollback time: "We can revert this in X minutes" should be known before launch
@@ -0,0 +1,56 @@
 # Retrospective Analysis Skill
 Generate a data-grounded retrospective brief that separates facts from feelings, so the team spends retro time on solutions rather than debating what happened.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Sprint tickets: planned vs. completed**
 - **Carry-over tickets and reasons** (if known)
 - **Tickets reopened after closing** (quality signal)
 - **Any incidents or unplanned work** (scope creep signal)
 - **Sprint velocity vs. historical average** (trend context)
 ## Process
 1. Calculate: completion rate, carry-over rate, unplanned work percentage
 2. Identify patterns: which ticket types were most likely to carry over? Which caused blockers?
 3. Note any process or communication breakdowns visible in the data
 4. Prepare 3 "Start / Stop / Continue" prompts based on the data — not generic, specific to this sprint
 5. Suggest 1 concrete experiment for the next sprint based on the biggest friction point
 6. **Validate** — Confirm each prompt is specific to this sprint (not a recycled generic prompt), and that the recommended experiment is concrete and measurable
 ## Output Structure
 ### Sprint [Number] Retrospective Brief
 **By the Numbers:**
 - Planned: [n] tickets | Completed: [n] | Carry-over: [n] | Completion rate: [%]
 - Unplanned work: [n] tickets ([%] of capacity)
 - Velocity: [points] vs. [average] average
 **What the Data Suggests:**
 [2-3 observations grounded in the numbers above]
 **Discussion Prompts:**
 - Start: [specific prompt based on this sprint's data]
 - Stop: [specific prompt based on this sprint's data]
 - Continue: [specific prompt based on this sprint's data]
 **Suggested Experiment for Next Sprint:**
 [One concrete, testable process change — with a specific success metric]
 ## Quality Checks
 - [ ] Each Start/Stop/Continue prompt names a specific behaviour, not a vague category
 - [ ] The recommended experiment is testable in one sprint
 - [ ] Carry-over analysis identifies the ticket type or cause, not just the count
 - [ ] Data observations don't assign blame — they describe patterns
 - [ ] Velocity trend is mentioned in context (is this a one-off or a pattern?)
 ## Anti-Patterns
 - [ ] Do not assign blame to individuals in the retrospective brief — observations must describe patterns, not people
 - [ ] Do not produce Start/Stop/Continue prompts that are vague categories — each must name a specific behaviour
 - [ ] Do not recommend an experiment that cannot be completed within one sprint — small, testable experiments only
 - [ ] Do not treat carry-over tickets as a velocity problem without first identifying the root cause category
 - [ ] Do not run the same retrospective format every sprint — vary the format to prevent engagement fatigue
@@ -0,0 +1,56 @@
 # Sprint Brief Skill
 Produce a clear, scannable sprint brief that every team member — engineer, designer, PM — can read in under three minutes and understand exactly what we're doing and why.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Sprint name and number**
 - **Sprint goal** (1-2 sentences — flag if too vague)
 - **Ticket list with owners** (or a description of the work)
 - **Known dependencies or blockers**
 - **Carry-over items from previous sprint** (if any)
 ## Process
 1. Read sprint goal and check it's specific and measurable — flag if it's too vague
 2. Group tickets by theme or feature area
 3. Identify the critical path — which tickets must complete for the sprint goal to be met?
 4. Flag risks: tickets with unclear acceptance criteria, missing designs, unresolved dependencies
 5. Note carry-over items and whether they affect this sprint's goal
 6. **Validate** — Confirm the sprint goal is achievable given the ticket scope and capacity. If the critical path items alone would fill the sprint, flag it as overloaded.
 ## Output Structure
 ### Sprint [Number] Brief — [Dates]
 **Sprint Goal:** [1-2 sentences — specific and measurable]
 **Why This Sprint Matters:** [Connect to quarterly OKR in 2-3 sentences]
 **What We're Building:**
 - [Theme 1]: [tickets and owners]
 - [Theme 2]: [tickets and owners]
 **Critical Path:** [The 2-3 tickets everything else depends on]
 **Risks to Flag:**
 - [Risk 1 + mitigation]
 - [Risk 2 + mitigation]
 **Carry-over from Last Sprint:** [List + impact on current goal]
 **Definition of Done:** [Specific, agreed criteria for sprint success]
 ## Quality Checks
 - [ ] Sprint goal is specific enough to score pass/fail at the end of the sprint
 - [ ] Critical path items are named — not just "the important ones"
 - [ ] Every risk has a mitigation or owner (not just "this is a risk")
 - [ ] Carry-over items are connected to their impact on this sprint's goal
 - [ ] Definition of Done is agreed criteria, not a task list
 ## Anti-Patterns
 - [ ] Do not write a sprint goal as a task list — the goal must be a single outcome-focused statement that can be scored pass/fail
 - [ ] Do not leave the critical path unnamed — "the important tickets" is not a critical path
 - [ ] Do not list risks without a mitigation or owner — a risk without a response is just a worry list
 - [ ] Do not ignore carry-over items' impact on this sprint's capacity and goal
 - [ ] Do not write a Definition of Done that mixes task completion with outcome criteria — they must be observable and agreed before the sprint starts
@@ -0,0 +1,116 @@
 # Sprint Planning Skill
 Transform raw backlog items into a structured, achievable sprint with clear goals, velocity-calibrated scope, and team-ready output.
 ## What This Skill Produces
 - **Sprint Goal** — single, outcome-focused sentence the whole team can rally around
 - **Sprint Backlog** — prioritised list of user stories with story point estimates and acceptance criteria
 - **Capacity Plan** — team availability breakdown accounting for holidays, meetings, and focus time
 - **Sprint Planning Agenda** — structured 2-hour meeting agenda with timings
 - **Risk Flags** — blockers or dependencies that could derail the sprint
 ## Required Inputs
 Ask for (if not already provided):
 - Sprint duration (1 or 2 weeks)
 - Team size and velocity (average story points per sprint)
 - Top 3–5 backlog items or epics to pull from
 - Any known absences, holidays, or team events
 - Previous sprint's incomplete items (carry-overs)
 ## Sprint Goal Formula
 Use this structure:
 > "This sprint we will [deliver X outcome] so that [user/business benefit], measured by [success indicator]."
 Never write sprint goals as task lists. Always outcome-first.
 ## Story Point Calibration
 | Complexity | Points | Description |
 |---|---|---|
 | Trivial | 1 | Clearly understood, no unknowns |
 | Small | 2 | Straightforward, minor effort |
 | Medium | 3 | Some complexity, clear path |
 | Large | 5 | Complex, needs design or research |
 | Very Large | 8 | High uncertainty, may need splitting |
 | Epic | 13+ | Too large — must be split before sprint |
 Flag any item estimated at 8+ and recommend splitting.
 ## Capacity Formula
 ```
 Available capacity = (Team size × Sprint days × Focus hours/day) × Availability factor
 Focus hours/day: 6 (accounting for meetings, Slack, admin)
 Availability factor: 0.7–0.85 depending on holidays/events
 Story points to commit = Historical velocity × Availability factor
 ```
 ## Programmatic Helper
 This skill ships with a stdlib-only Python script that computes capacity instead of estimating it by hand. Use it whenever the team's numbers are known — it applies the availability and 80% commit-ratio rules consistently.
 ```bash
 # Quick estimate from flags
 python3 scripts/capacity_calculator.py --team 5 --days 10 --velocity 30 --availability 0.8 --carryover 5
 # Detailed estimate from per-member availability (JSON via stdin or --input file.json)
 echo '{"sprint_days":10,"historical_velocity":40,"carryover_points":8,
       "members":[{"name":"Ada","available_days":10},{"name":"Linus","available_days":7}]}' \
  | python3 scripts/capacity_calculator.py --input -
 ```
 The script returns available focus hours, a velocity figure adjusted for real availability, the **recommended commitment** (capped at 80% of velocity), and the remaining **capacity for new work** after carry-overs. Run it first, then build the sprint backlog to fit the recommended number. Add `--json` to pipe the result into other tooling.
 ## Output Format
 ### Sprint [N] — [Start Date] to [End Date]
 **Sprint Goal:**
 > [Goal statement]
 **Team Capacity:** [X] story points available (based on [Y] team members, [Z]% availability)
 **Sprint Backlog:**
 | Priority | Story | Points | Owner | Acceptance Criteria |
 |---|---|---|---|---|
 | 1 | [Story title] | [N] | [Team member] | [When X then Y] |
 **Carry-Overs from Previous Sprint:**
 - [Item] — Reason for carry-over: [brief explanation]
 **Risks & Dependencies:**
 - [Risk description] → Mitigation: [action]
 **Sprint Planning Agenda:**
 - 00:00–00:10 — Review sprint goal and team capacity
 - 00:10–00:40 — Walk through backlog items, confirm estimates
 - 00:40–01:20 — Assign stories, identify dependencies
 - 01:20–01:50 — Review acceptance criteria per story
 - 01:50–02:00 — Confirm sprint commitment and close
 ## Guidelines
 - Always challenge stories missing acceptance criteria — flag them explicitly
 - Recommend the team commits to 80% of available capacity, not 100%
 - If no velocity data is provided, assume 20–30 points for a 5-person team as a starting point
 - Highlight any story with unclear ownership as a blocker
 ## Quality Checks
 - [ ] Sprint goal is outcome-focused (not "implement X" — something like "users can do Y")
 - [ ] Team capacity is calculated using actual availability, not theoretical 100%
 - [ ] Every story has an acceptance criterion (flag any that don't)
 - [ ] Stories estimated at 8+ points are flagged for splitting
 - [ ] Carry-overs from last sprint are accounted for in capacity
 ## Anti-Patterns
 - [ ] Do not write sprint goals as task lists — goals must be outcome-focused and scoreable pass/fail at sprint end
 - [ ] Do not commit to 100% of available capacity — always recommend 80% to preserve slack for unplanned work
 - [ ] Do not carry stories with no acceptance criteria into the sprint — flag them as blockers before committing
 - [ ] Do not allow stories estimated at 8+ points into the sprint without splitting them first
 - [ ] Do not ignore carry-over items when calculating capacity — they consume capacity and must be accounted for before new work is pulled in
@@ -0,0 +1,152 @@
 # Technical Spec Template Skill
 Write technical specifications that engineers actually read — clear problem framing, unambiguous requirements, explicit decisions, and documented trade-offs.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Feature or system description** (what needs to be specced)
 - **Related PRD or product brief** (if available)
 - **Engineering reviewers** (whose sign-off is needed)
 - **Known constraints** (technical limitations, security requirements, performance targets)
 ## When to Write a Tech Spec
 Write a tech spec when:
 - The feature requires changes to 2+ systems
 - There are significant architectural decisions to make
 - More than one engineer will work on the implementation
 - The feature has security, privacy, or compliance implications
 - Estimated effort is >5 story points
 Skip the spec for trivial bug fixes or 1-2 hour changes.
 ---
 ## Technical Spec Output Format
 ### Technical Specification — [Feature Name]
 **Author:** [Name]
 **Status:** Draft | In Review | Approved | Implemented
 **Created:** [Date] | **Last Updated:** [Date]
 **Reviewers:** [Eng Lead, Architect, PM, Security if needed]
 **Related PRD:** [Link] | **Jira Epic:** [Link]
 ---
 #### 1. Problem Statement
 > [2–3 sentences. What problem are we solving and why now? No solution language here.]
 #### 2. Goals & Non-Goals
 **Goals (in scope):**
 - [Specific, measurable outcome]
 - [Specific, measurable outcome]
 **Non-Goals (explicitly out of scope):**
 - [What this spec does NOT cover]
 - [Common assumption to shut down early]
 #### 3. Background & Context
 [Any prior art, related systems, or context engineers need to understand the decision space. Link to previous specs, ADRs, or research.]
 #### 4. Proposed Solution
 **High-Level Approach:**
 [2–4 sentences describing the chosen solution. Why this approach vs alternatives?]
 **System Architecture Diagram:**
 [Describe or embed: which services are involved, how data flows, what APIs are called]
 **Data Model Changes:**
 ```sql
 -- New tables or schema changes
 [Include DDL or schema definition]
 ```
 **API Design:**
 ```
 [Endpoint] [Method]
 Request: { [fields and types] }
 Response: { [fields and types] }
 Error codes: [list]
 ```
 **Key Implementation Details:**
 - [Important technical constraint or approach]
 - [Edge case handling]
 - [Third-party dependency and version]
 #### 5. Alternative Approaches Considered
 | Option | Pros | Cons | Why Rejected |
 |---|---|---|---|
 | [Alt 1] | [Benefits] | [Drawbacks] | [Reason not chosen] |
 | [Alt 2] | [Benefits] | [Drawbacks] | [Reason not chosen] |
 #### 6. Security & Privacy Considerations
 - Data stored: [What PII or sensitive data is involved]
 - Authentication: [How is access controlled]
 - Authorisation: [What permissions are required]
 - Encryption: [At rest / in transit requirements]
 - Compliance implications: [GDPR, SOC2, etc. if relevant]
 #### 7. Performance & Scalability
 - Expected load: [Requests/second, data volume]
 - Latency requirements: [P50 / P95 targets]
 - Caching strategy: [If applicable]
 - Database indexing: [New indexes required]
 - Known bottlenecks: [Where to watch]
 #### 8. Testing Plan
 - Unit tests: [Key scenarios to cover]
 - Integration tests: [System boundaries to test]
 - Load tests: [If performance-critical]
 - Edge cases: [Known tricky scenarios]
 - Rollback plan: [How to revert if something goes wrong]
 #### 9. Rollout Plan
 - Feature flag: [Yes / No — name of flag]
 - Rollout stages: [% of users at each stage]
 - Monitoring: [Metrics and alerts to set up]
 - Success criteria to progress rollout: [What needs to be true]
 - Rollback trigger: [What would cause immediate rollback]
 #### 10. Open Questions
 | Question | Owner | Due Date | Resolution |
 |---|---|---|---|
 | [Unresolved question] | [Name] | [Date] | [Pending] |
 #### 11. Implementation Timeline (Rough)
 | Phase | Work | Estimated Effort |
 |---|---|---|
 | [Phase 1] | [What gets built] | [X days/points] |
 | [Phase 2] | [What gets built] | [X days/points] |
 | Total | | [X story points] |
 ---
 ## Guidelines
 - The spec is a decision record, not a task list — document *why* decisions were made
 - All open questions must have an owner and due date
 - Security and privacy sections are never optional for features that touch user data
 - Recommend async review: engineers read first, then a 30-minute sync to resolve questions
 - Keep the spec updated as implementation progresses — stale specs are worse than no specs
 ## Quality Checks
 - [ ] Problem statement contains no solution language
 - [ ] Non-goals explicitly list at least 2 things that might be assumed in scope
 - [ ] At least 2 alternative approaches are documented with reasons for rejection
 - [ ] Security and privacy section is completed for any feature touching user data
 - [ ] All open questions have a named owner and due date (not "TBD")
 ## Anti-Patterns
 - [ ] Do not include solution language in the problem statement — the problem must be described independently of the proposed solution
 - [ ] Do not omit alternatives considered — a spec that considers only one approach has not been properly evaluated
 - [ ] Do not leave open questions as "TBD" without a named owner and due date — unresolved questions are blockers
 - [ ] Do not skip security and privacy sections for any feature that touches user data
 - [ ] Do not write a non-goals section that is empty — always list at least two things that might be assumed in scope
@@ -0,0 +1,221 @@
 # User Story Writer Skill
 This skill produces production-ready user stories from a feature brief, PRD section, or verbal description. Each story follows the standard format with a clear who/what/why, behavioural acceptance criteria in Given/When/Then format, edge cases, and definition of done. Output is ready to paste into Jira, Linear, or your planning tool.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Feature or change** to break into stories — paste the brief, PRD section, or describe the feature
 - **User types / personas** involved (e.g. admin, end user, guest, API consumer)
 - **Scope** — are we writing one story or decomposing an epic into a full set of stories?
 - **Acceptance criteria format preference** — Given/When/Then, bullet checklist, or both?
 - **Technical constraints or notes** — anything the engineering team has flagged that should shape the stories
 ## Output Structure
 For each story:
 ---
 ## Story: [Short title — verb + noun, e.g. "Filter search results by date range"]
 **Epic:** [Parent epic name — e.g. "Advanced Search"]
 **Story ID:** [Jira/Linear ID — leave blank if not yet created]
 **Priority:** [P1 / P2 / P3]
 **Story points:** [Leave blank — for engineering to estimate]
 ---
 ### User Story
 > **As a** [specific user type — not "user"],
 > **I want to** [concrete action they want to take],
 > **So that** [the outcome they achieve — business value, not feature description].
 **Example:**
 > As an **account manager**,
 > I want to **filter my client list by last contact date**,
 > so that I **can quickly identify clients I haven't spoken to in over 30 days and prioritise outreach**.
 ---
 ### Context
 [1–3 sentences of context that aren't in the user story itself: when does this story matter, what triggers the need, how does it fit into a larger flow. This helps engineers understand why before they ask.]
 ---
 ### Acceptance Criteria
 **Format: Given / When / Then**
 Each criterion tests one specific behaviour. Write one GWT per observable outcome — not one GWT for the whole feature.
 **AC1: [Short name for this criterion]**
 ```
 Given [starting state or context]
 When [user action]
 Then [observable system behaviour]
 ```
 **AC2: [Short name]**
 ```
 Given [...]
 When [...]
 Then [...]
 ```
 **AC3: [Short name]**
 ```
 Given [...]
 When [...]
 Then [...]
 ```
 ---
 ### Edge Cases
 [List scenarios that are non-obvious but must be handled. These become additional ACs or notes to engineering.]
 - [ ] **[Edge case 1]:** [e.g. User applies a date filter that returns 0 results — show empty state with clear messaging and a "clear filters" action]
 - [ ] **[Edge case 2]:** [e.g. User has >10,000 clients — filter must not degrade load time >200ms]
 - [ ] **[Edge case 3]:** [e.g. Date filter persists across page refresh — or explicitly should not if that's the decision]
 - [ ] **[Permission edge case]:** [e.g. Read-only users can see the filter but cannot save filter presets]
 ---
 ### Out of Scope
 [Explicitly state what this story does NOT cover — prevents scope creep and clarifies where the next story begins.]
 - Saving and sharing filter presets (separate story — see [Story X])
 - Bulk actions on filtered results
 - Exporting filtered client list to CSV
 ---
 ### Definition of Done
 - [ ] Acceptance criteria all pass
 - [ ] Edge cases handled (or explicitly deferred with a new ticket raised)
 - [ ] Unit tests written for each AC
 - [ ] Works on mobile viewport (if applicable)
 - [ ] Accessibility: keyboard navigable and screen-reader compatible
 - [ ] Error states are handled and copy approved
 - [ ] Product and design have reviewed in staging
 - [ ] No console errors in production build
 ---
 ## Epic Decomposition Template
 If the user provides an epic or feature brief, decompose it into a full set of stories before writing them:
 **Epic:** [Name]
 **Goal:** [What outcome does completing this epic achieve?]
 **Stories:**
 | # | Story | Notes | Dependencies |
 |---|---|---|---|
 | 1 | [Core happy path story — the simplest version of the feature that delivers value] | | |
 | 2 | [Validation / error handling story] | | Depends on #1 |
 | 3 | [Edge case or power user story] | | Depends on #1 |
 | 4 | [Admin or configuration story] | | |
 | 5 | [Performance or scale story — if applicable] | | Depends on #1 |
 **Suggested sprint order:** [Which stories are P1 for MVP? Which can follow in a later sprint?]
 ---
 ## Common Story Anti-Patterns — and Fixes
 Use these to review stories before handing to engineering:
 | Anti-pattern | Example | Fix |
 |---|---|---|
 | **Solution in the story** | "As a user I want a dropdown filter" | Remove the UI decision — "As a user I want to filter by date range" |
 | **Vague "so that"** | "so that it's easier to use" | Make it specific — "so that I can prioritise outreach without opening each record manually" |
 | **Too big** | Story covers 5 distinct user flows | Split into separate stories per flow |
 | **No acceptance criteria** | Story has description only | Add at least 3 GWT criteria before engineering starts |
 | **ACs that test the solution, not the behaviour** | "Given the dropdown is open, When I select an option" | Test the outcome — "Given I have applied a date filter, When I view my results, Then only clients last contacted in that date range appear" |
 | **Missing empty state** | No AC for what happens with 0 results | Add it — empty states are part of the feature |
 | **Missing error state** | No AC for network failure or invalid input | Add error handling ACs explicitly |
 ---
 ## Example: Full Story Set for a Feature
 **Feature brief:** "Allow users to export their invoice history as a PDF or CSV"
 ---
 ### Story 1: Export invoice list as CSV
 > As a **finance admin**,
 > I want to **export my invoice history as a CSV file**,
 > so that I can **import it into our accounting software without manual data entry**.
 **AC1: Successful export**
 ```
 Given I am on the Invoices page with at least one invoice
 When I click "Export" and select "CSV"
 Then a CSV file is downloaded containing all visible invoices with columns: Invoice ID, Date, Amount, Status, Customer Name
 ```
 **AC2: Empty state**
 ```
 Given I am on the Invoices page with no invoices
 When I click "Export"
 Then the export button is disabled and a tooltip reads "No invoices to export"
 ```
 **AC3: Filtered export**
 ```
 Given I have applied a date filter showing invoices from Jan 2026 only
 When I click "Export" and select "CSV"
 Then the export contains only invoices from Jan 2026 — not all invoices
 ```
 **Edge cases:**
 - [ ] Export with >10,000 invoices — must complete in <30s or show a progress indicator
 - [ ] Export triggered on mobile — downloads to device's default download location
 **Out of scope:** PDF export (Story 2), scheduled exports (future epic)
 ---
 ### Story 2: Export invoice list as PDF
 > As a **finance admin**,
 > I want to **export my invoice history as a formatted PDF**,
 > so that I can **share a professional summary with our accountant**.
 [... ACs follow same pattern ...]
 ---
 ## Quality Checks
 - [ ] Every story has a specific user type — not "a user" or "the system"
 - [ ] The "so that" explains business value — not just feature description
 - [ ] Each AC tests one observable outcome — not a bundle of behaviours
 - [ ] Empty states, error states, and edge cases are explicitly handled
 - [ ] Out of scope is documented — not assumed
 - [ ] Stories are independent — they can be shipped individually without depending on unreleased work (except where explicitly noted)
 ## Anti-Patterns
 - [ ] Do not write user stories from a technical perspective — every story must be from the user's point of view and state their goal
 - [ ] Do not write acceptance criteria that are untestable — every criterion must have a clear pass/fail condition
 - [ ] Do not create stories that are too large to complete in a single sprint — break epics into estimable, independently deliverable stories
 - [ ] Do not omit edge cases — unhappy paths and error states are required, not optional
 - [ ] Do not skip the Definition of Done — without it, "done" means different things to different people
 ## Example Trigger Phrases
 - "Write user stories for [feature] from this brief"
 - "Break this PRD section into user stories with acceptance criteria"
 - "Convert these feature requirements into Jira tickets"
 - "Write the user stories and ACs for [feature name]"
 - "Decompose this epic into individual stories ready for sprint planning"
@@ -0,0 +1,178 @@
 # Accessibility Audit Skill
 This skill produces a structured accessibility audit based on WCAG 2.2 guidelines. It covers visual, motor, cognitive, and screen reader accessibility — with prioritised remediation for each issue found.
 ## Required Inputs
 Ask the user for these if not provided:
 - **What is being audited** (screen, component, full product, design spec)
 - **Description or image** of the UI
 - **Target WCAG level** (A / AA / AAA — default to AA, which is the legal standard in most jurisdictions)
 - **Known assistive technology users?** (Yes/No — if yes, which: screen reader / switch access / voice control / magnification)
 - **Platform** (Web / iOS / Android / Desktop app)
 ## Output Structure
 ---
 # Accessibility Audit: [Component or Screen Name]
 **Target standard:** WCAG 2.2 Level [AA]
 **Platform:** [Platform]
 **Date:** [Date]
 ---
 ## Audit Summary
 | Category | Issues Found | Critical | Moderate | Minor |
 |---|---|---|---|---|
 | Perceivable | | | | |
 | Operable | | | | |
 | Understandable | | | | |
 | Robust | | | | |
 | **Total** | | | | |
 **Overall compliance status:** ✅ Compliant / 🟡 Minor issues / 🔴 Fails AA standard
 ---
 ## Perceivable
 ### 1.1 Text Alternatives
 - [ ] All images have descriptive alt text (not filename or "image")
 - [ ] Decorative images have `alt=""` to be skipped by screen readers
 - [ ] Icons without visible labels have accessible names
 - [ ] Complex images (charts, diagrams) have extended descriptions
 **Issues found:** [List specific issues or "None"]
 ### 1.3 Adaptable
 - [ ] Content structure uses semantic HTML (headings, lists, landmarks) — not just visual formatting
 - [ ] Reading order in DOM matches visual order
 - [ ] Form inputs have associated labels (not placeholder text as label)
 - [ ] Data tables have proper headers and scope
 **Issues found:**
 ### 1.4 Distinguishable
 - [ ] Text contrast ratio ≥ 4.5:1 (normal text) or ≥ 3:1 (large text 18px+)
 - [ ] UI component contrast ratio ≥ 3:1 against background
 - [ ] Information is not conveyed by colour alone
 - [ ] Text can be resized to 200% without loss of content
 - [ ] No content that auto-plays audio
 **Issues found:**
 ---
 ## Operable
 ### 2.1 Keyboard Accessible
 - [ ] All interactive elements are reachable by keyboard (Tab key)
 - [ ] No keyboard traps
 - [ ] Custom components have keyboard interactions (arrow keys for menus, Escape to close modals)
 - [ ] Skip navigation link available for pages with repeated navigation
 **Issues found:**
 ### 2.4 Navigable
 - [ ] Focus is visible at all times (not removed with `outline: none` without replacement)
 - [ ] Focus order is logical and predictable
 - [ ] Page/screen has a descriptive title
 - [ ] Link text is descriptive (not "click here" or "read more")
 - [ ] Headings are hierarchical (H1 → H2 → H3, no skips)
 **Issues found:**
 ### 2.5 Input Modalities
 - [ ] Touch targets are at least 44x44px
 - [ ] No functionality requires complex gestures (pinch, multi-touch) without a simple alternative
 - [ ] Motion or dragging interactions have button alternatives
 **Issues found:**
 ---
 ## Understandable
 ### 3.1 Readable
 - [ ] Language of the page is set (`lang` attribute)
 - [ ] Unusual words, abbreviations, or jargon are explained
 ### 3.2 Predictable
 - [ ] Navigation is consistent across screens
 - [ ] Components behave consistently (same button does the same thing)
 - [ ] No unexpected context changes on focus or input
 ### 3.3 Input Assistance
 - [ ] Error messages identify the field and describe the error in plain language (not just "Invalid input")
 - [ ] Required fields are labelled (not just with colour or asterisk alone)
 - [ ] Forms provide suggestions for correcting errors where possible
 **Issues found:**
 ---
 ## Robust
 ### 4.1 Compatible
 - [ ] HTML is valid and well-structured
 - [ ] ARIA roles and attributes are used correctly (not to fix broken semantics)
 - [ ] Status messages (success, error, loading) are announced to screen readers without focus change
 **Issues found:**
 ---
 ## Prioritised Remediation List
 | Priority | Issue | WCAG Criterion | Fix | Effort |
 |---|---|---|---|---|
 | 🔴 Critical | [Issue] | [e.g. 1.4.3 Contrast] | [Specific fix] | [Low/Med/High] |
 | 🟡 Moderate | [Issue] | | | |
 | 🟢 Minor | [Issue] | | | |
 **Priority definitions:**
 - 🔴 Critical: Blocks access for users with disabilities. Legal risk. Fix before launch.
 - 🟡 Moderate: Significant friction. Fix in next sprint.
 - 🟢 Minor: Best practice. Address in roadmap.
 ---
 ## Quick Wins (Fix in < 1 hour)
 [List any issues that are trivially fixable — e.g. adding alt text, fixing contrast with a colour swap, adding a `lang` attribute. These are easy to ship immediately.]
 ---
 ## Testing Recommendations
 - **Manual keyboard test:** Tab through the entire flow. Can you complete every task without a mouse?
 - **Screen reader test:** VoiceOver (Mac/iOS), NVDA or JAWS (Windows). Is every piece of content and every action accessible?
 - **Colour contrast check:** Use Stark (Figma plugin) or WebAIM Contrast Checker
 - **Automated scan:** Axe DevTools or Lighthouse accessibility audit (catches ~30% of issues automatically)
 ---
 ## Quality Checks
 - [ ] Issues are mapped to specific WCAG criteria
 - [ ] Every critical issue has a specific fix recommendation
 - [ ] Quick wins are separated from larger fixes
 - [ ] Effort estimates are included for prioritisation
 - [ ] Testing recommendations are included
 ## Anti-Patterns
 - [ ] Do not rely solely on automated scanning tools — automated checks catch ~30% of issues; manual keyboard and screen reader testing is required
 - [ ] Do not label an issue "minor" simply because it only affects a small percentage of users — for those users it may block all access
 - [ ] Do not add ARIA roles to fix broken semantics — use correct semantic HTML first; ARIA is a last resort
 - [ ] Do not confuse colour contrast of text with colour contrast of UI components — they have different minimum ratios (4.5:1 vs 3:1)
 - [ ] Do not audit only the happy path — error states, empty states, and loading states must also meet accessibility requirements
 ## Example Trigger Phrases
 - "Audit this design for accessibility"
 - "Check WCAG compliance for [screen/component]"
 - "Give me an a11y audit of [UI description]"
 - "What accessibility issues does this design have?"
@@ -0,0 +1,133 @@
 # Design Critique Skill
 This skill provides structured, actionable design feedback using established UX frameworks. It balances positive observations with clear, prioritised improvement suggestions.
 ## Required Inputs
 Ask the user for these if not provided:
 - **What is being reviewed** (screen, flow, component, full product)
 - **Design description or attached image** (describe it if no image — the skill will still work)
 - **User goal** (what is the user trying to accomplish with this design?)
 - **Context** (web / mobile / desktop app / physical product)
 - **Stage** (early wireframe / mid-fidelity / high-fidelity / live product)
 - **Primary concern** (optional — e.g. "I'm worried the onboarding is too long" or "I think the CTA is unclear")
 ## Output Structure
 ---
 # Design Critique: [Design Name or Screen]
 **User goal:** [What the user needs to accomplish]
 **Context:** [Platform / Stage]
 **Critique focus:** [Primary concern if stated, otherwise "full review"]
 ---
 ## 1. What's Working
 [3–5 specific, honest observations about what the design does well. Don't manufacture praise — only include genuine strengths. Be specific: "The visual hierarchy clearly guides the eye from headline → supporting detail → CTA" is useful. "Looks clean" is not.]
 ---
 ## 2. Priority Issues
 Rank issues by impact on the user goal. Use:
 - 🔴 **High** — Blocks or significantly degrades the user's ability to complete their goal
 - 🟡 **Medium** — Causes friction or confusion but doesn't block completion
 - 🟢 **Low** — Polish or preference — nice to fix but not critical
 For each issue:
 ### [Priority] Issue [N]: [Short name]
 **What's happening:**
 [Describe the specific design problem — be precise about which element, screen, or interaction]
 **Why it matters:**
 [Connect to the user goal or a specific principle — don't just say "it's confusing." Say why it creates confusion and what the consequence is for the user.]
 **Framework reference:**
 [Name the principle being violated — e.g. Nielsen's Heuristic #6 (Recognition over Recall), Gestalt proximity, JTBD clarity, Fitts's Law, etc.]
 **Recommendation:**
 [Specific, actionable suggestion. Not "make the button bigger" but "Increase the primary CTA to at least 44x44px to meet touch target guidelines; consider moving it below the form rather than inline with the input fields to reduce accidental taps."]
 ---
 ## 3. Heuristic Assessment
 Quick assessment against Nielsen's 10 Usability Heuristics — score each as ✅ Pass / 🟡 Partial / ❌ Fail:
 | Heuristic | Status | Note |
 |---|---|---|
 | 1. Visibility of system status | | |
 | 2. Match between system and real world | | |
 | 3. User control and freedom | | |
 | 4. Consistency and standards | | |
 | 5. Error prevention | | |
 | 6. Recognition rather than recall | | |
 | 7. Flexibility and efficiency of use | | |
 | 8. Aesthetic and minimalist design | | |
 | 9. Help users recognise, diagnose, and recover from errors | | |
 | 10. Help and documentation | | |
 Only include heuristics relevant to what's visible in the design — don't penalise for things not in scope.
 ---
 ## 4. Gestalt Principles Check
 [Comment on any Gestalt principles that are either well-applied or violated:]
 - **Proximity:** [Are related elements grouped clearly?]
 - **Similarity:** [Do similar elements look similar?]
 - **Continuity:** [Does the eye flow naturally through the design?]
 - **Figure/Ground:** [Is the primary content clearly distinguished from background?]
 - **Closure:** [Are any implied shapes or containers confusing?]
 ---
 ## 5. JTBD Alignment
 [Assess how well the design serves the stated job-to-be-done:]
 - **Does the design make the user's primary job obvious?** [Yes / Partially / No — explain]
 - **Are there any elements that distract from the primary job?** [List any competing CTAs, distractions, or unclear hierarchy]
 - **What emotional job does this design serve?** [Speed / Confidence / Control / Delight / Other] — and does the visual design match that emotional goal?
 ---
 ## 6. Top 3 Recommended Next Steps
 Prioritised list of the 3 most impactful changes. Each should be actionable in the next design iteration:
 1. [Most impactful change — specific]
 2. [Second priority]
 3. [Third priority]
 ---
 ## Quality Checks
 - [ ] "What's working" includes only genuine, specific observations
 - [ ] Every issue has a framework reference (not just subjective opinion)
 - [ ] Recommendations are specific and actionable
 - [ ] Priority levels (High/Medium/Low) reflect actual impact on user goal
 - [ ] Heuristic assessment only covers visible elements
 ## Anti-Patterns
 - [ ] Do not lead with visual preference (e.g. "I don't like the colour") — every issue must reference a UX principle or user impact
 - [ ] Do not invent problems in the "What's Working" section — manufactured praise undermines the entire critique
 - [ ] Do not provide the same priority level (High/Medium/Low) to every issue — prioritisation requires genuine judgment about user impact
 - [ ] Do not skip the JTBD section for product screens — connecting feedback to the user's job-to-be-done is what separates UX critique from aesthetic opinion
 - [ ] Do not give recommendations that require a full redesign when the user is in high-fidelity — scope recommendations to the design stage
 ## Example Trigger Phrases
 - "Critique this design: [description or image]"
 - "Give me feedback on this UI/UX"
 - "Review this Figma screen for usability issues"
 - "What's wrong with this user flow?"
 - "Do a heuristic evaluation of [screen/product]"
@@ -0,0 +1,218 @@
 # Design System Audit Skill
 This skill produces a structured audit of a design system — covering component coverage, token consistency, documentation quality, accessibility compliance, contribution processes, and adoption health. Output is ready for a design system team, design leadership, or an engineering team evaluating their shared component library.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Design system name** and what product(s) it serves
 - **Audit scope** — component library / design tokens / documentation / contribution process / all of the above
 - **Current tooling** — Figma / Storybook / Zeroheight / custom / combination?
 - **Team using it** — how many designers and engineers, how many products?
 - **Known pain points** — what do teams complain about most?
 - **Governance model** — centralised team / federated contributors / no dedicated team?
 - **Goal of the audit** — improve adoption / prepare for a rebrand / onboard new teams / justify investment?
 ## Output Structure
 ---
 # Design System Audit: [System Name]
 **Products served:** [List of products / apps]
 **Audit scope:** [Full / Components only / Tokens only / Documentation]
 **Auditor:** [Name / Team]
 **Date:** [Date]
 **Stakeholders:** [Design lead, Eng lead, CPO, etc.]
 ---
 ## Overall Health Score
 | Dimension | Score (1–5) | Status |
 |---|---|---|
 | Component coverage | [X/5] | 🟢/🟡/🔴 |
 | Token consistency | [X/5] | 🟢/🟡/🔴 |
 | Documentation quality | [X/5] | 🟢/🟡/🔴 |
 | Accessibility compliance | [X/5] | 🟢/🟡/🔴 |
 | Adoption rate | [X/5] | 🟢/🟡/🔴 |
 | Contribution process | [X/5] | 🟢/🟡/🔴 |
 | **Overall** | **[X/5]** | 🟢/🟡/🔴 |
 **Summary:** [2–3 sentences. What is the overall state of the design system? What are the top 2 issues and what is the biggest strength?]
 ---
 ## 1. Component Coverage Audit
 **How to assess:** Compare components in the design system against the actual UI patterns in the product. Every pattern that exists in production but not in the system is a coverage gap.
 ### Component Inventory
 | Category | Components present | Coverage | Gap |
 |---|---|---|---|
 | **Navigation** | [Navbar, Sidebar, Breadcrumb, Tabs] | [80%] | [Missing: Mega menu, mobile drawer] |
 | **Forms & Inputs** | [Text input, Dropdown, Checkbox, Radio, Toggle, Date picker] | [90%] | [Missing: Multi-select, Rich text editor] |
 | **Feedback & Alerts** | [Toast, Banner, Modal, Tooltip] | [60%] | [Missing: Inline validation, Progress indicator, Skeleton loader] |
 | **Data Display** | [Table, Card, Badge, Avatar] | [50%] | [Missing: Data grid, Stat card, Timeline, Gantt] |
 | **Layout** | [Grid, Container, Divider, Spacer] | [70%] | [Missing: Responsive breakpoint utilities] |
 | **Buttons & Actions** | [Button, Icon button, FAB, Link] | [100%] | [None] |
 **Coverage score:** [X% of production UI patterns are covered by the design system]
 **Most impactful gaps:**
 1. [Most used pattern not in the system — causing most duplication]
 2. [...]
 3. [...]
 ---
 ## 2. Component Quality Audit
 For each component, assess against these quality criteria:
 | Component | States complete | Responsive | Accessibility | Dark mode | Props documented | Code matches Figma |
 |---|---|---|---|---|---|---|
 | Button | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
 | Modal | ⚠️ Loading state missing | ✅ | ✅ | ❌ | ⚠️ Partial | ✅ |
 | Table | ❌ Sorting state missing | ❌ No mobile layout | ⚠️ No aria-sort | ❌ | ❌ | ⚠️ Drift |
 | [Component] | [...] | [...] | [...] | [...] | [...] | [...] |
 **Legend:** ✅ Complete — ⚠️ Partial / inconsistent — ❌ Missing
 **Components with critical quality issues (fix before anything else):**
 - [Component name]: [Specific issue and why it's blocking]
 - [...]
 ---
 ## 3. Design Token Audit
 **Token coverage:**
 | Token type | Defined | Used consistently | Issues |
 |---|---|---|---|
 | **Colour** | [X tokens defined] | [⚠️ — 12 hardcoded hex values found in Figma] | [Inconsistent use of primary-500 vs primary-600 for CTAs across products] |
 | **Typography** | [X tokens defined] | [✅] | [None — all type styles use token scale] |
 | **Spacing** | [X tokens defined] | [⚠️ — custom spacing used in X components] | [Engineers using arbitrary px values instead of spacing tokens in X components] |
 | **Border radius** | [X tokens defined] | [❌ — not defined; each component has hardcoded values] | [Button, card, modal all use different radius values with no token] |
 | **Shadow / elevation** | [X tokens defined] | [⚠️] | [3 different drop-shadow values in use; no elevation scale] |
 | **Animation / motion** | [X tokens defined] | [❌ — not defined] | [Transition durations inconsistent across components] |
 **Semantic token layer:** [Does the system have semantic tokens (e.g. `color.action.primary` on top of `color.blue.500`) or only primitive tokens?]
 **Token drift:** [Are code tokens and Figma tokens in sync? Use a tool like Token Studio, Style Dictionary, or manual comparison.]
 ---
 ## 4. Documentation Quality Audit
 **Assessment per component / pattern:**
 | Document type | Quality | Issues |
 |---|---|---|
 | **Usage guidelines** | [⚠️ — X% of components have guidelines] | [Button and Form components documented; Navigation and Data Display mostly undocumented] |
 | **Do / Don't examples** | [❌ — mostly absent] | [Engineers frequently misuse components because intent is unclear] |
 | **Accessibility notes** | [⚠️ — present for some components] | [No consistent format; accessibility notes missing for interactive components] |
 | **Code examples** | [✅ — all Storybook components have code examples] | [...] |
 | **Changelog** | [❌ — no component-level changelog exists] | [Breaking changes are not communicated; causes unexpected UI regressions] |
 | **Migration guides** | [❌ — absent] | [Teams don't know how to upgrade to new component versions] |
 **Documentation score:** [X% of components have complete, usable documentation]
 **Most common designer / engineer complaint about docs:** [e.g. "I can't find whether to use Modal or Drawer for this use case — no guidance exists"]
 ---
 ## 5. Accessibility Audit
 **WCAG 2.2 compliance status:**
 | Criterion | Level | Status | Components affected |
 |---|---|---|---|
 | Colour contrast (text) | AA | [✅ / ⚠️ / ❌] | [e.g. ❌ — Disabled state text fails 4.5:1 ratio in 3 components] |
 | Colour contrast (UI components) | AA | [✅ / ⚠️ / ❌] | [...] |
 | Keyboard navigation | AA | [✅ / ⚠️ / ❌] | [⚠️ — Modal focus trap not implemented; Dropdown not keyboard accessible] |
 | Focus visible | AA | [✅ / ⚠️ / ❌] | [...] |
 | Screen reader support (ARIA) | AA | [✅ / ⚠️ / ❌] | [❌ — Table component lacks aria-sort; Icon buttons have no aria-label] |
 | Touch target size | AA | [✅ / ⚠️ / ❌] | [⚠️ — Mobile tap targets below 44×44px in X components] |
 | Motion / animation | AA | [✅ / ⚠️ / ❌] | [...] |
 **Critical accessibility blockers (must fix before next release):**
 1. [Most critical issue — e.g. Keyboard users cannot close Modal — focus trap missing]
 2. [...]
 ---
 ## 6. Adoption Audit
 **Adoption by team / product:**
 | Product / Team | Components used from system | Custom components built outside system | Adoption score |
 |---|---|---|---|
 | [Product A] | [X% of UI uses system components] | [Y custom components] | [High / Medium / Low] |
 | [Product B] | [...] | [...] | [...] |
 **Why teams are not adopting:**
 | Barrier | Severity | Evidence |
 |---|---|---|
 | [Component doesn't exist] | High | [Top reason in team survey] |
 | [Component exists but doesn't meet use case] | Medium | [Modal component lacks X state needed by Product B] |
 | [Documentation too sparse to know how to use it] | Medium | [...] |
 | [No one enforces system use — easier to build custom] | High | [...] |
 | [System is out of date with product's current visual language] | Medium | [...] |
 ---
 ## 7. Contribution Process Audit
 | Dimension | Current state | Assessment |
 |---|---|---|
 | **How to contribute** | [Documented / Not documented] | [✅ / ❌] |
 | **Contribution criteria** | [Clear entry bar for what goes in the system] | [⚠️ — unclear who decides what becomes a system component vs stays local] |
 | **Review process** | [Who reviews contributions and how long it takes] | [❌ — no formal review; contributions sit unreviewed for weeks] |
 | **Release cadence** | [How often system releases happen] | [⚠️ — sporadic; no set cadence] |
 | **Breaking change policy** | [How breaking changes are handled and communicated] | [❌ — no policy; breaking changes are a surprise] |
 | **Versioning** | [Semantic versioning in place?] | [✅ — all packages use semver] |
 ---
 ## 8. Prioritised Remediation Roadmap
 | Priority | Initiative | Impact | Effort | Timeline |
 |---|---|---|---|---|
 | P1 | Fix [X] critical accessibility issues (keyboard nav, ARIA) | Critical — legal + user impact | Medium | Sprint 1–2 |
 | P1 | Define and implement border radius and shadow token scale | High — ends inconsistency | Low | Sprint 1 |
 | P1 | Document top 10 most-used components (usage + do/don't) | High — unblocks adoption | Medium | Sprint 2–4 |
 | P2 | Build Skeleton loader + Inline validation components (top 2 gaps) | High — eliminates custom duplication | High | Quarter 2 |
 | P2 | Establish contribution process with SLA for reviews | Medium — enables growth | Low | Sprint 3 |
 | P3 | Dark mode token support | Medium — product parity | High | Quarter 3 |
 | P3 | Design-code token sync tooling (Token Studio / Style Dictionary) | Medium — reduces drift | Medium | Quarter 2–3 |
 ---
 ## Quality Checks
 - [ ] Coverage gaps are identified by comparing the design system to actual production UI, not assumed
 - [ ] Accessibility issues cite specific WCAG criterion and affected components
 - [ ] Adoption barriers are backed by evidence (interviews, survey, usage data) — not assumed
 - [ ] Remediation roadmap has effort estimates and is sequenced by impact
 - [ ] Both Figma and code (Storybook/implementation) are assessed — not just Figma
 - [ ] Stakeholders from design, engineering, and product have reviewed the audit
 ## Anti-Patterns
 - [ ] Do not assess only the Figma library without checking the code implementation — Figma-code drift is one of the most common and costly design system failures
 - [ ] Do not score adoption without interviewing teams — audit tool metrics miss the human reasons teams build custom components instead of using the system
 - [ ] Do not treat all component gaps equally — prioritise gaps based on how many production screens rely on custom implementations, not alphabetically
 - [ ] Do not recommend adding more components without first auditing documentation quality — an undocumented component is often worse than no component
 - [ ] Do not schedule remediation without a named owner per initiative — design system improvements without ownership consistently stall
 ## Example Trigger Phrases
 - "Audit our design system for consistency and coverage"
 - "Review our component library and identify gaps"
 - "Assess the health of our shared design system"
 - "Run a design system audit before we do a rebrand"
 - "What's wrong with our design system and what should we fix first?"
@@ -0,0 +1,163 @@
 # UX Research Plan Skill
 This skill creates a complete, ready-to-execute UX research plan. Output covers everything from research objectives to screener questions, discussion guide, and synthesis framework.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Research question** (what decision will this research inform?)
 - **Product area or feature** being researched
 - **Research type** (Generative / Evaluative / Usability testing / Diary study / Survey)
 - **Stage** (Discovery / Concept validation / Prototype testing / Live product)
 - **Target participants** (role, demographics, behaviour — who should we talk to?)
 - **Timeline and number of sessions**
 - **Existing assumptions or hypotheses** (optional but valuable)
 ## Output Structure
 ---
 # UX Research Plan: [Study Title]
 **Product area:** [Area]
 **Research type:** [Type]
 **Date:** [Timeline]
 **Researcher:** [Leave for user]
 ---
 ## 1. Research Objectives
 State 2–4 clear research objectives. Each objective should map to a decision that will be made differently depending on what you find.
 **Objective [N]:** Understand [specific thing] so we can [decision this informs].
 ---
 ## 2. Research Questions
 [5–8 questions — the actual questions you want research to answer. These are not the interview questions; they're the knowledge gaps. Organised under each objective.]
 **Objective 1:**
 - RQ1.1: [Research question]
 - RQ1.2: [Research question]
 ---
 ## 3. Methodology & Rationale
 **Method chosen:** [e.g. Semi-structured interviews / Usability testing / Concept testing]
 **Why this method:**
 [2–3 sentences. Match method to research type. If evaluative: usability testing. If generative: contextual inquiry or interviews. If testing comprehension: 5-second test or concept test.]
 **What this method will and won't tell us:**
 - **Will tell us:** [What this method is good at revealing]
 - **Won't tell us:** [What's out of scope — be honest about limits]
 **Sample size:** [Recommended number of sessions and why — e.g. "5–6 moderated interviews for generative research; 5–8 usability sessions to identify top issues"]
 ---
 ## 4. Participant Screener
 **Recruitment criteria:**
 | Criterion | Must Have / Nice to Have | Disqualify if |
 |---|---|---|
 | [e.g. Uses project management software daily] | Must Have | [Never uses any PM tool] |
 | [e.g. Works in a team of 5+] | Must Have | — |
 | [e.g. B2B industry] | Nice to Have | — |
 **Screener questions (5–8 questions):**
 [Q1] [Screening question — clear, not leading]
 - [Answer options — flag which qualify/disqualify]
 [Q2] ...
 **Incentive recommendation:** [Amount and format — e.g. "£50 gift voucher for a 60-min session is standard in the UK for professional participants"]
 ---
 ## 5. Discussion Guide
 Structure the session:
 ### Opening (5 min)
 - Introduce yourself and the study
 - "We're testing the design, not you — there are no wrong answers"
 - Permission to record
 - Warm-up: [1–2 easy questions to build rapport — e.g. "Tell me about your role and what a typical week looks like"]
 ### Core Questions (by section)
 **Section [A]: [Topic]** *(~X min)*
 1. [Open question — start broad] *[Probe: Tell me more about...]*
 2. [Follow-up to go deeper] *[Probe: Can you walk me through what happened?]*
 3. [Specific scenario or past behaviour question]
 **Section [B]: [Topic]** *(~X min)*
 [Continue with 2–3 questions per section]
 **Usability tasks (if applicable):**
 > "I'm going to ask you to try a few things with this prototype. Please think aloud as you go."
 - Task [N]: [Clear task instruction — write from the user's perspective, not "click on X" but "find where you would go to do Y"]
  - **Success criteria:** [What "completing this task" looks like]
  - **What to observe:** [Where friction typically appears]
 ### Closing (5 min)
 - "Is there anything about [topic] we haven't covered that you think is important?"
 - "If you could change one thing about [product/concept], what would it be?"
 - Debrief and thank
 ---
 ## 6. Synthesis Framework
 After sessions, use this framework to synthesise findings:
 **Step 1: Session notes → Key observations**
 For each session: 3–5 specific observations (behaviours, quotes, reactions — not interpretations yet)
 **Step 2: Affinity mapping**
 Group observations by theme across all sessions. Aim for 4–7 clusters.
 **Step 3: Insight statements**
 For each cluster: "When [context], users [behaviour/experience], because [underlying need or mental model]."
 **Step 4: Implications**
 For each insight: "This means we should [design/product implication]" or "This challenges our assumption that [assumption]."
 **Step 5: Research report structure:**
 - Key findings (3–5 headlines)
 - Supporting evidence per finding
 - Design recommendations
 - Open questions for next research cycle
 ---
 ## Quality Checks
 - [ ] Research objectives map to real decisions
 - [ ] Discussion guide opens broad before going specific
 - [ ] Screener criteria are specific enough to get the right participants
 - [ ] Tasks (if usability) are written from the user's perspective
 - [ ] Synthesis framework is included
 - [ ] Incentive recommendation is included
 ## Anti-Patterns
 - [ ] Do not write a research plan without clearly stated research objectives — every methodology choice must flow from the objectives
 - [ ] Do not design a plan that mixes generative and evaluative research without clearly separating them
 - [ ] Do not omit screener criteria — recruiting unqualified participants invalidates the research
 - [ ] Do not write discussion guide questions that are leading — questions must be neutral and open-ended
 - [ ] Do not skip the incentive recommendation — uncompensated research has lower participant quality and completion rates
 ## Example Trigger Phrases
 - "Write a research plan for [feature or product area]"
 - "Create a discussion guide for user interviews about [topic]"
 - "Plan a usability test for [prototype or feature]"
 - "Write screener questions for [target user type]"
@@ -0,0 +1,61 @@
 # Assumption Mapper Skill
 Surface and prioritize the untested assumptions embedded in any product plan before development begins.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Product brief, PRD, or concept description** (even rough notes work)
 - **Stage** (concept / discovery / pre-build / post-launch — affects which assumptions matter most)
 ## Process
 1. Read the provided brief, PRD, or concept description
 2. Extract assumptions across four categories:
   - **Desirability** (do users want this?)
   - **Feasibility** (can we build it?)
   - **Viability** (will it sustain the business?)
   - **Usability** (can users actually use it?)
 3. Score each assumption:
   - Confidence (1-5): How sure are we this is true?
   - Impact (1-5): How badly does the plan fail if this assumption is wrong?
   - Priority = Impact − Confidence (higher = test first)
 4. **Validate completeness** — Ensure at least one assumption per category. If a category is empty, re-read the brief looking specifically for that type.
 5. Output a ranked list with recommended validation methods
 ## Output Structure
 ### Assumption Map: [Feature/Product Name]
 | Assumption | Category | Confidence | Impact | Priority | Validation Method |
 |------------|----------|------------|--------|----------|-------------------|
 | [assumption] | [type] | [1-5] | [1-5] | [score] | [method] |
 #### Critical Assumptions (Impact 4+ and Confidence 2 or below)
 [Flagged items with detailed validation recommendations]
 #### Top 3 Assumptions to Validate First
 [Detailed recommendations including specific research method, estimated effort, and what the result would change]
 ## Example (Partial)
 Input: *"We're building a self-serve onboarding flow to reduce time-to-value for SMB customers."*
 | Assumption | Category | Confidence | Impact | Priority | Validation Method |
 |------------|----------|------------|--------|----------|-------------------|
 | SMB users can complete onboarding without human help | Usability | 2 | 5 | 3 | Unmoderated usability test (n=8) |
 | Faster onboarding correlates with higher retention | Viability | 3 | 4 | 1 | Cohort analysis of current onboarding times vs. 90-day retention |
 | The current onboarding is the primary reason for slow time-to-value | Desirability | 2 | 4 | 2 | User interviews with recent churned SMB accounts |
 ## Anti-Patterns
 - [ ] Do not only surface desirability assumptions — feasibility and viability assumptions are equally likely to kill a product and are often overlooked
 - [ ] Do not assign high confidence to an assumption just because it hasn't been challenged yet — absence of evidence is not evidence
 - [ ] Do not recommend "user interviews" as the validation method for every assumption — some assumptions require quantitative data, competitive analysis, or technical spikes
 - [ ] Do not list assumptions that cannot be tested — every assumption in the map must have a plausible validation method, or it should be flagged as unknowable and treated as a risk
 ## Quality Checks
 - [ ] At least one assumption per category (Desirability, Feasibility, Viability, Usability)
 - [ ] All Impact 4+ / Confidence 2− assumptions flagged as CRITICAL
 - [ ] Each validation method is specific (not just "do research" — name the method and sample size)
 - [ ] Priority scores are consistent (Impact − Confidence, higher = more urgent)
@@ -0,0 +1,218 @@
 # Customer Journey Map Skill
 This skill produces a complete customer journey map covering every stage from awareness through advocacy. Each stage includes touchpoints, customer actions, emotions, pain points, and specific improvement opportunities. Output is ready for use in product discovery, UX design, or cross-functional alignment workshops.
 ## Required Inputs
 Ask the user for these if not provided:
 - **Product or service** being mapped
 - **Customer persona** — which customer segment is this map for? (be specific — one persona per map)
 - **Journey scope** — full end-to-end (awareness → advocacy), or a specific phase (e.g. onboarding only)?
 - **Current state or future state?** — mapping how it works today, or designing how it should work?
 - **Data sources** — any research, user interviews, support tickets, NPS comments, analytics available?
 - **Goal of the map** — what decision will this inform? (redesign, prioritisation, stakeholder alignment, new feature)
 ## Output Structure
 ---
 # Customer Journey Map: [Product / Service]
 **Persona:** [Name — e.g. "Sarah, the overwhelmed HR manager"]
 **Journey scope:** [Full end-to-end / Onboarding / Purchase / Renewal]
 **Current or future state:** [Current state / Desired future state]
 **Prepared by:** [Name / Team]
 **Date:** [Date]
 **Based on:** [Research sources — interviews, analytics, support data, assumed/hypothetical]
 ---
 ## Persona Summary
 | | |
 |---|---|
 | **Name** | [Sarah] |
 | **Role** | [HR Manager at a 200-person professional services firm] |
 | **Goal** | [Reduce time spent on manual employee data management] |
 | **Frustrations** | [Too many tools that don't talk to each other; always chasing approvals] |
 | **Tech comfort** | [Moderate — comfortable with SaaS tools but not a power user] |
 | **Decision power** | [Recommends tools; budget approved by CHRO] |
 ---
 ## Journey Overview
 ```
 AWARENESS → CONSIDERATION → DECISION → ONBOARDING → ADOPTION → ADVOCACY
   [Stage 1]      [Stage 2]      [Stage 3]    [Stage 4]     [Stage 5]   [Stage 6]
 ```
 **Overall experience rating (current state):** [😤 Frustrating / 😐 Neutral / 😊 Positive]
 ---
 ## Stage 1: Awareness
 *How does the customer first discover the product exists?*
 **Customer goal at this stage:** [e.g. Realise they have a problem worth solving — or find a solution to a specific pain]
 | Element | Detail |
 |---|---|
 | **Trigger** | [What event makes them start looking? — e.g. Manual process breaks down / peer recommendation / saw ad] |
 | **Where they are** | [Google search / LinkedIn / conference / colleague conversation / email newsletter] |
 | **What they do** | [e.g. Searches "automate employee onboarding" / asks peers in HR community / clicks LinkedIn ad] |
 | **Emotion** | [😤 Frustrated — overwhelmed by manual processes and hoping for a better way] |
 | **Pain points** | [Overwhelming number of options / hard to know which tools are credible / can't tell what's B2B vs B2C from homepage] |
 | **Opportunities** | [SEO content targeting the trigger keyword / LinkedIn thought leadership / peer community presence] |
 ---
 ## Stage 2: Consideration
 *The customer is actively evaluating options. What do they do to decide?*
 | Element | Detail |
 |---|---|
 | **Customer goal** | [Narrow down from many options to a shortlist of 2–3] |
 | **What they do** | [Reads G2/Capterra reviews / watches demo video / downloads comparison guide / asks peers who use something similar] |
 | **Touchpoints** | [Website / review sites / social proof / demo request flow / sales email] |
 | **Emotion** | [😕 Anxious — worried about making the wrong choice; past tool purchases haven't delivered] |
 | **Pain points** | [Pricing not visible on website / demo requires a call before seeing the product / unclear if it works with their existing stack] |
 | **Opportunities** | [Self-serve demo or interactive product tour / transparent pricing page / ROI calculator / case studies from similar company size] |
 ---
 ## Stage 3: Decision
 *The customer is ready to buy — or not. What makes them commit?*
 | Element | Detail |
 |---|---|
 | **Customer goal** | [Get sign-off from CHRO and justify the decision with a business case] |
 | **What they do** | [Books sales call / requests security questionnaire / builds internal business case / negotiates contract] |
 | **Touchpoints** | [AE / sales call / security review / contract / procurement process] |
 | **Emotion** | [😬 Cautious — doesn't want to be wrong; presenting to leadership adds pressure] |
 | **Pain points** | [Sales process is slow / security questionnaire takes weeks / contract terms are non-standard and require legal] |
 | **Opportunities** | [Security FAQ self-serve / standard contract with predictable terms / champion toolkit (slides, business case template) to help them sell internally] |
 ---
 ## Stage 4: Onboarding
 *The customer has bought. Now they need to get value fast.*
 | Element | Detail |
 |---|---|
 | **Customer goal** | [Get the product working and show their CHRO it was a good decision] |
 | **What they do** | [Receives welcome email / attends kickoff call / configures integrations / invites team] |
 | **Touchpoints** | [Onboarding email sequence / in-product onboarding checklist / CSM / help centre / integrations marketplace] |
 | **Emotion** | [😬 Anxious but hopeful — excited about potential but stressed about the setup work] |
 | **Pain points** | [Setup is more complex than expected / IT required for SSO but IT is slow to respond / generic onboarding doesn't match their use case] |
 | **Opportunities** | [Role-specific onboarding paths / IT connector with pre-filled request template / quick win email at day 3 (show them one thing that already works)] |
 **Key moment of truth:** [What single moment in this stage determines whether they'll become an active user or ghost? — e.g. "First time the product saves them 30 minutes on a task they used to do manually"]
 ---
 ## Stage 5: Adoption
 *The customer is using the product. Are they getting consistent value?*
 | Element | Detail |
 |---|---|
 | **Customer goal** | [Make the product a regular part of their workflow; demonstrate ROI to leadership] |
 | **What they do** | [Uses core features daily / discovers new features / hits a limitation / contacts support / attends webinar] |
 | **Touchpoints** | [Product UI / in-app notifications / email / support / community / customer success manager] |
 | **Emotion** | [Variable — some days 😊 when the product works well; some days 😤 when hitting a gap or bug] |
 | **Pain points** | [Feature they expected isn't there / reporting doesn't show the metric leadership wants / power features are too complex / feels like they're underutilising what they're paying for] |
 | **Opportunities** | [Proactive CSM check-in at day 30 / in-product feature discovery / usage dashboard for the customer to see their own ROI / community for peer learning] |
 **Adoption health indicators:**
 - [DAU/MAU ratio — what does healthy look like?]
 - [Feature X used by Y% of seats within Z weeks]
 - [First NPS survey at 60 days — target score]
 ---
 ## Stage 6: Advocacy
 *The customer loves the product. How do you turn them into a referral engine?*
 | Element | Detail |
 |---|---|
 | **Customer goal** | [Solve problems faster; feel like an expert; feel valued as a customer] |
 | **What they do** | [Refers a peer / writes a G2 review / participates in case study / speaks at event / becomes a power user / joins community] |
 | **Touchpoints** | [CSM / community / review request email / referral programme / case study outreach / conference sponsorship] |
 | **Emotion** | [😊 Proud — the tool is part of their professional identity; they feel smart for choosing it] |
 | **Pain points** | [Referral programme is clunky / no structured way to connect with peers / case study process is slow and effortful for them] |
 | **Opportunities** | [One-click G2 review request at high-satisfaction moment / peer community / referral programme with meaningful reward / case study process that does most of the work for them] |
 ---
 ## Emotion Curve
 Plot the customer's emotional experience across the journey:
 ```
 High  😊 │        *                              *          *
          │                                   *
 Neutral 😐│  *         *
          │                  *
 Low   😤 │                        *    *
          └────────────────────────────────────────────────────
            Aware   Consider  Decide  Onboard  Adopt   Advocate
 ```
 **Lowest point:** [Which stage has the worst experience — and why?]
 **Highest point:** [When is the customer most delighted — what drove it?]
 **Biggest drop:** [Where does sentiment fall most sharply — this is usually the biggest opportunity]
 ---
 ## Prioritised Opportunities
 | Opportunity | Stage | Impact on customer | Effort to fix | Priority |
 |---|---|---|---|---|
 | [Self-serve product tour before sales call] | Consideration | [High — removes top buying barrier] | [Medium] | P1 |
 | [Quick win email at day 3] | Onboarding | [High — builds early habit] | [Low] | P1 |
 | [IT SSO setup template] | Onboarding | [Medium — removes specific blocker] | [Low] | P2 |
 | [30-day proactive CSM check-in] | Adoption | [Medium — catches churn signals early] | [Medium] | P2 |
 | [Peer referral programme] | Advocacy | [High for growth — reduces CAC] | [High] | P3 |
 ---
 ## What We Don't Know (Research Gaps)
 | Gap | How to close it | Priority |
 |---|---|---|
 | [What actually triggers the decision to start looking?] | [5 JTBD interviews with recent buyers] | [High] |
 | [What causes customers to stall in onboarding?] | [Drop-off analysis in onboarding funnel + 3 interviews with churned customers] | [High] |
 | [What % of customers have reached the advocacy stage?] | [Product analytics — identify power users; NPS by cohort] | [Medium] |
 ---
 ## Quality Checks
 - [ ] Map covers one specific persona — not "all customers"
 - [ ] Each stage includes the customer's emotional state — not just actions
 - [ ] Pain points are the customer's pain — not the company's pain
 - [ ] Opportunities are specific enough to become backlog items or design prompts
 - [ ] Emotion curve shows the real experience — not an aspirationally positive version
 - [ ] Research gaps are documented — the map reflects what is known, not assumed
 ## Anti-Patterns
 - [ ] Do not build the map from assumptions alone — ground at least the pain points in real customer data or research
 - [ ] Do not treat all journey stages as equally weighted — identify the highest-friction moments explicitly
 - [ ] Do not omit the emotional layer — a journey map without emotions is a process flow, not a customer map
 - [ ] Do not create generic touchpoints that apply to any product — each touchpoint must be specific to this product and customer
 - [ ] Do not leave opportunities unranked — prioritise by impact and feasibility
 ## Example Trigger Phrases
 - "Map the customer journey for [product]"
 - "Build a user journey from awareness to advocacy"
 - "Create a journey map for our onboarding experience"
 - "Map out the touchpoints and pain points for [customer type]"
 - "Design an experience map for [process or product]"
--- a/Show More
+++ b/Show More
Author	SHA1	Message	Date
Mohit	fb2fd84b00	Merge: v21.0.0 release prep	2026-06-19 10:34:15 +01:00
Mohit	197f9f2758	chore(release): v21.0.0 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 10:34:15 +01:00
mohitagw15856	cad2671267	Merge pull request #63 from mohitagw15856/sample-outputs chore(samples): generate sample outputs for the gallery	2026-06-19 10:32:29 +01:00
mohitagw15856	82340b4264	chore(samples): generate sample outputs for the gallery	2026-06-19 09:31:45 +00:00
Mohit	c0ff578d1c	fix(ci): open a PR for generated samples instead of pushing to protected main Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 10:19:05 +01:00
Mohit	438bd5343f	Merge: compare demo GIF + sample-generation workflow	2026-06-19 10:05:17 +01:00
Mohit	7b02261a3c	feat: compare-mode demo GIF, expanded eval cases, sample-generation workflow - Add compare-mode demo GIF + its Playwright recorder; embed in README eval section - Expand evals/cases.json (6 → 15 flagship skills) so more skills can be eval-scored and sample-generated - Add --generate-missing mode to build-samples.mjs - Add generate-samples.yml: workflow_dispatch job that generates real sample outputs via the ANTHROPIC_API_KEY secret (key never leaves GitHub) and commits Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 10:05:17 +01:00
Mohit	3f9c319b79	Merge: workflow recipes, eval badges, MCP, playground upgrades, sample gallery, skill-of-the-week	2026-06-19 09:56:11 +01:00
Mohit	54f76456ab	feat: workflow recipes, eval badges, one-click MCP, playground upgrades, sample gallery, skill-of-the-week Signature features that turn breadth (174 skills) into a differentiated product: - Workflow recipes: 5 cross-profession chains (workflows.json) that pass each output forward — slash commands (/ship-a-feature etc.), WORKFLOWS.md generated by scripts/build-workflows.mjs, README + MCP (list_workflows/get_workflow) wired - Eval-backed quality: real per-skill scores from evals/results.json surfaced as badges in the playground and an honest README section (6 scored skills) - One-click MCP: 'claude mcp add' install + workflow tools, works in any MCP client - Playground: 'which skill?' recommender, with/without compare toggle, shareable ?skill= deep-links with prefilled inputs - Sample-output gallery: hand-written examples for the hero five + generator (scripts/build-samples.mjs) + web/examples.html - Skill-of-the-week: scheduled workflow + script that composes X/LinkedIn posts and posts to an optional webhook Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 09:56:11 +01:00
Mohit	7f06f0a993	Merge: README front-door section + hero five + launch kit	2026-06-19 09:12:45 +01:00
Mohit	fe320edf1b	docs: add first-timer front-door section, hero five, and launch kit - Add a '👋 New here? Start in 30 seconds' section: problem-solution hook plus a 'hero five' table (give-it / get-back) linking the strongest skills - Add docs/launch-posts.md: ready-to-post copy for HN, Product Hunt, X, Reddit, and LinkedIn, plus an ongoing distribution checklist Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 09:12:45 +01:00
Mohit	ba69ffb29e	Merge: README lifecycle diagram + animated demo GIF	2026-06-19 09:05:53 +01:00
Mohit	0d1dbf25a5	docs: add workflow lifecycle diagram and animated demo GIF to README - Reframe the 174 skills as a DISCOVER→DECIDE→BUILD→SHIP→MEASURE→COMMUNICATE workflow with an ASCII diagram and a phase→skills table (hook for new visitors) - Replace the static playground screenshot with an animated demo GIF - Add record-demo.mjs (Playwright) to auto-generate the demo, and document it Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 09:05:53 +01:00
mohitagw15856	1608cc1a92	Merge pull request #62 from mohitagw15856/eval-results chore(evals): refresh leaderboard results	2026-06-18 21:42:23 +01:00
mohitagw15856	bbfcd725f4	chore(evals): refresh leaderboard results	2026-06-18 20:35:20 +00:00
mohitagw15856	c6cdbf6908	fix(marketplace): add missing pm-social plugin.json manifest (#61 ) pm-social was the last bundle without a .claude-plugin/plugin.json (a latent gap that can block clean plugin installation). Adds it, matching the marketplace.json entry (version 1.0.0). Independent of PR #23. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 21:26:02 +01:00
mohitagw15856	7936572c44	fix(marketplace): update version/count and wire orphan skills into bundles (#60 ) The Claude plugins marketplace reads .claude-plugin/marketplace.json, which was stale (version 14.0.0, '167 skills') and three skills lived only in root skills/ with no bundle, so they could never appear in the marketplace: - Bump marketplace version 14.0.0 -> 20.2.0 and description 167 -> 174. - Wire the orphan skills into their natural bundles (identical copies, matching the repo's dual-maintenance convention): youtube-script-writer -> pm-writers (1.0.0 -> 1.1.0) launch-readiness -> pm-delivery (3.2.0 -> 3.3.0) skill-security-auditor -> pm-engineering (4.1.0 -> 4.2.0) - Add the missing pm-writers plugin.json manifest; bump the pm-delivery and pm-engineering manifests to match and mention the new skills. - Regenerate exports (they move from other/ into the bundle folders) and web/skills.json. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 21:20:50 +01:00
mohitagw15856	c53aa6b669	Merge pull request #59 from mohitagw15856/eval-results chore(evals): refresh leaderboard results	2026-06-18 21:14:03 +01:00
mohitagw15856	b7aa4aa2d9	chore(evals): refresh leaderboard results	2026-06-18 20:13:31 +00:00
mohitagw15856	616811e0e8	release: v20.2.0 — community PRs, new skill & catalog reconciliation (#58 ) - Bump to 20.2.0 (20.1.0 is already published; these merged after it). - Split changelog: 20.1.0 keeps its as-released scope (star nudges + eval hardening); new 20.2.0 covers the community PRs (#47/#48/#50), the new YouTube skill, and the check hardening. - Reconcile the README to the true 174-skill count everywhere (title, badge, TOC, intro, 'All Skills' header, sponsor line) — was a stale 167. - Add catalog entries for the 3 skills that were missing from the table: Skill Security Auditor (#168), Launch Readiness (#169), YouTube Script Writer (#170). - package.json description 167 -> 174. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 21:04:02 +01:00
mohitagw15856	337314b4e7	release: finalize v20.1.0 (community PRs + new skill) and harden check (#57 ) - Roll the merged community PRs (#47 install safety, #48 prioritisation helper, #50 YouTube skill) and the now-174 skill count into the v20.1.0 changelog and README 'What's New' / latest-release line. - Harden 'npm run check' to rebuild web/skills.json and fail on drift, so a stale playground index can't pass locally and break CI (root cause of the check-generated failure after #48). Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 20:55:01 +01:00
mohitagw15856	fc58eb7c67	feat: add YouTube Script Writer skill (experimental) (#50 ) (#56 ) Reapplies @prajwal-28's PR #50 onto current main WITHOUT the LF->CRLF line-ending conversion the original PR introduced across skill-tiers.json and the export indexes: - adds skills/youtube-script-writer/SKILL.md (retention-optimized video scripts) - registers it in the experimental tier - regenerates the 5 platform exports with LF endings (no whole-file churn) Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: prajwal-28 <prajwal-28@users.noreply.github.com>	2026-06-18 20:50:50 +01:00
mohitagw15856	077215381d	feat: add stdlib feature-prioritisation helper script (#48 , closes #39 ) (#55 ) Reapplies @zeotrix's PR #48 onto current main: - adds a dependency-free Python script computing RICE/ICE rankings so scoring is consistent across sessions (skills/ + plugins/ copies, kept identical) - documents it in a 'Programmatic Helper' section in both SKILL.md files - regenerates the platform exports so the check-generated CI stays green Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: zeotrix <zeotrix@users.noreply.github.com>	2026-06-18 20:47:38 +01:00
mohitagw15856	66249df30b	fix: guard install path + robust frontmatter parsing (#47 ) (#54 ) Reapplies @MatrixNeoKozak's PR #47 onto current main (resolves the bin/cli.mjs conflict with the star-nudge changes): - resolve() the install target and refuse system-critical dirs (/, /usr, /etc, /root, ...) so a typo'd --target can't clobber the system - skillcheck frontmatter parser tolerates leading whitespace and CRLF/LF Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: MatrixNeoKozak <MatrixNeoKozak@users.noreply.github.com>	2026-06-18 20:43:45 +01:00
mohitagw15856	83bfff4f2f	docs: surface v20.1.0 in README changelog and latest-release line (#53 ) The structured CHANGELOG.md already had 20.1.0; this updates the README's embedded 'What's New' changelog and the top 'Latest release' line to match. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 20:35:03 +01:00
mohitagw15856	0c33330211	release: v20.1.0 — star nudges & eval hardening (#52 ) Bump to 20.1.0. Folds the prior Unreleased items (CI leaderboard, PR-based results flow, faster/hang-proof evals) plus the new star CTAs into a [20.1.0] changelog section. Updates the README version badge. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 20:29:01 +01:00
mohitagw15856	82beaed5c6	feat: add star CTA to CLI list output and MCP server banner (#51 ) More touchpoints to convert users into stargazers: the `list` command footer and the MCP server's stderr startup banner (stderr is safe — it never corrupts the JSON-RPC stream on stdout). Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 20:25:13 +01:00
mohitagw15856	511bad19b0	feat: nudge npm users to star the repo (CLI + README + funding) (#49 ) - CLI prints a star CTA after a successful install and in --help - README adds a prominent star line below the badges (npm renders this) - package.json gains a funding field so npm shows a Fund/Star link Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 19:03:41 +01:00
mohitagw15856	63cef03324	Merge pull request #46 from mohitagw15856/eval-results chore(evals): refresh leaderboard results	2026-06-18 13:41:58 +01:00
mohitagw15856	c28825dd38	chore(evals): refresh leaderboard results	2026-06-18 12:40:15 +00:00
mohitagw15856	4209963cff	Leaderboard workflow: open a PR instead of pushing to protected main (#45 ) The eval run worked (12 scored runs) but the final step failed: it pushed evals/results.json directly to main, which the branch ruleset blocks ("Changes must be made through a pull request"). - eval-leaderboard.yml: replace the direct commit/push with peter-evans/create-pull-request@v7 (branch eval-results), add pull-requests: write. Merging that PR triggers the Pages deploy (which watches evals/results.json) to publish real numbers. - evals/README documents the PR flow + the required "Allow GitHub Actions to create and approve pull requests" setting. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 13:33:15 +01:00
mohitagw15856	827d7f62ec	Make evals fast and hang-proof (timeout, retry, concurrency) (#44 ) The "Run evals" step ran 24 API calls sequentially with no request timeout, so it was slow and could stall indefinitely if one call hung. - bin/lib/anthropic.mjs: per-request timeout (120s) via AbortController + retry (2x, backoff) on 429/5xx/timeout. Fails fast on 4xx (bad key/model). - evals/run-evals.mjs: run (case × model) tasks through a concurrency pool (default 4, --concurrency to tune); preserves result order. - eval-leaderboard.yml: job timeout-minutes: 20 as a safety net. Applies to the next run. The hardening also benefits the Action runner and `generate`, which share the client. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 13:30:06 +01:00
mohitagw15856	edb663ad72	CI workflow to run evals and update the leaderboard (#43 ) Lets the leaderboard show real numbers without a local key: the new "Update Skill Leaderboard" workflow (workflow_dispatch) runs the eval harness with the ANTHROPIC_API_KEY secret, commits evals/results.json, and the Pages deploy re-renders the public leaderboard with real data. - .github/workflows/eval-leaderboard.yml: manual trigger, contents: write, runs run-evals.mjs + build-leaderboard.mjs, commits results.json. - deploy-playground.yml: also trigger on evals/results.json (and the build scripts) so the committed results refresh the live page. - evals/README + CHANGELOG document the CI route. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 12:58:45 +01:00
mohitagw15856	3ccfd6b5c7	Dogfood the Action + bump to v20.0.0 (Agentic Tooling) (#42 ) - .github/workflows/pr-description.yml: uses our own Action (uses: ./action) to auto-write this repo's PR descriptions when a PR opens empty; skips quietly without ANTHROPIC_API_KEY and on forks. A living demo. - Version -> 20.0.0 (Agentic Tooling): bundles the GitHub Action, generate command, and evals/leaderboard for npm. README badge + What's New (v19 collapsed), CHANGELOG [Unreleased] -> [20.0.0], SECURITY table. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 12:52:37 +01:00
mohitagw15856	51bf4be52f	AI-powered tooling: GitHub Action, generate command, evals + leaderboard (#41 ) Three features riding 2026 trends (agentic CI, codegen, evals), sharing one dependency-free Anthropic client (bin/lib/anthropic.mjs). 1. GitHub Action (action/) — run any skill in a consumer repo's CI: uses: mohitagw15856/pm-claude-skills/action@main. Composite action + run.mjs (loads the bundled SKILL.md, calls the API, exposes result as a step output / file). Docs with auto-PR-description example. 2. generate command — `npx pm-claude-skills generate --from <url\|file>` turns a team's docs into a SKILL.md following the authoring standard (bin/generate.mjs, wired into the CLI; needs ANTHROPIC_API_KEY). 3. Skill evals + Leaderboard — evals/run-evals.mjs runs each case across models and scores output with an LLM judge (structure/completeness/usefulness/ grounding); scripts/build-leaderboard.mjs renders web/leaderboard.html (built in the Pages deploy, falls back to clearly-labelled example data). Linked from README, catalog, and playground. Offline-testable parts verified (prompt building, skill loading, graceful errors, leaderboard render). SkillCheck/audit/exports all green. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 08:37:40 +01:00
mohitagw15856	288a340dbe	Bump to v19.0.0 (Security Auditor, Personas & Catalog) (#36 ) - package.json -> 19.0.0 - README badge + "What's New in v19.0.0" (v18 collapsed), latest-release line - CHANGELOG: promote [Unreleased] -> [19.0.0] with compare links - SECURITY.md supported-versions table Ships the security auditor, personas, orchestration guide, docs catalog, and roadmap to npm on publish. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 08:15:14 +01:00
mohitagw15856	e9bc1d0626	Security auditor, personas, orchestration, docs catalog & roadmap (#35 ) Closes the remaining gaps vs alirezarezvani/claude-skills across trust, content types, discoverability, and community. Security (trust signal + useful): - scripts/skill-audit.mjs scans skills/*/SKILL.md + each skill's scripts/ for prompt injection, exfiltration, dynamic code exec, destructive shell, secrets, and hidden text. HIGH fails CI (.github/workflows/skill-audit.yml) + a badge. - New skill-security-auditor skill teaches the same review (production tier). Content types: - output-styles/ — 4 personas (Startup CTO, Growth Marketer, Solo Founder, Product Leader) as Claude Code output styles; --agent claude installs them too. - ORCHESTRATION.md — Skill Chain / Multi-Agent Handoff / Domain Deep-Dive / Solo Sprint patterns. Discoverability: - scripts/build-docs.mjs generates a server-rendered, SEO-indexable web/catalog.html of all skills (built in the Pages deploy; gitignored). Linked from README + playground. Community: - ROADMAP.md (now/next/later + good-first-issues). README badges/sections, TIERS (47 production), CHANGELOG, package.json files, and exports/web index all updated. SkillCheck + security audit + exports verified. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-18 08:09:14 +01:00
mohitagw15856	32ff3a96ee	Bump to v18.0.0 (Windsurf, Aider & MCP server) (#34 ) - package.json -> 18.0.0 - README badge + "What's New in v18.0.0" (v17 collapsed), latest-release line - CHANGELOG: promote [Unreleased] -> [18.0.0] with compare links - SECURITY.md supported-versions table Ships Windsurf/Aider targets and the MCP server to npm on publish. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-17 23:25:11 +01:00
mohitagw15856	036511ab3e	Windsurf + Aider targets, MCP server, and demo placement (#33 ) Broadens both reach (more tools) and content types (an MCP server), continuing the multi-platform story. Windsurf + Aider: - build-exports.mjs gains two platforms: exports/windsurf/.md (workspace rules, trigger: model_decision) and exports/aider/.md (conventions for `aider --read`). Now 5 platforms (ChatGPT, Gemini, Cursor, Windsurf, Aider). - install.sh + bin/cli.mjs install both (windsurf -> .windsurf/rules, aider -> .aider/skills with a --read hint); generated README index is excluded from copies. - One-line windsurf-install.sh / aider-install.sh wrappers for parity. MCP server (new content type): - mcp/server.mjs — zero-dependency stdio MCP server exposing list_skills, search_skills, get_skill. Published as a second bin (pm-claude-skills-mcp). Logs to stderr; reads bundled skills/ at startup. mcp/README.md documents client config. Also: README hero "See it in action" demo placement (ready to swap in a GIF; recording guide in web/docs-assets/README.md), Works-With table + exports + install docs updated, CHANGELOG Unreleased. package.json files/bin updated. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-17 23:15:38 +01:00
mohitagw15856	123aabe5e3	README: npm badges + package link (now published) (#32 ) The package is live on npm (pm-claude-skills@17.0.0). Add npm version + monthly-installs badges and link the Quick Install npx command to the package. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-17 23:01:15 +01:00
mohitagw15856	1d18e50c68	Fix npm-publish tag check to accept capital-V tags (#31 ) The release tag check stripped only a lowercase 'v', so a tag like V17.0.0 failed against package.json 17.0.0. Strip a leading v OR V. Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px Co-authored-by: Claude <noreply@anthropic.com>	2026-06-17 22:50:34 +01:00