Compare commits

69 Commits

Author SHA1 Message Date
Mohit 5721cd3a49 fix(web): propagate mid-stream API errors and raise max_tokens
- Streaming loop swallowed errors: a mid-stream error event (e.g.
  overloaded_error) was thrown inside the same try/catch used to skip
  unparseable SSE lines, so it was silently ignored and the run reported
  "Done." with truncated output. Separate JSON parsing from event handling
  so real errors surface to the user.
- Raise max_tokens 4096 -> 8192 to avoid truncating long skill outputs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:07:18 +01:00
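The fix described in 5721cd3a49 (parse first, then handle the event outside the parsing try/catch) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `StreamEvent`, `parseSseLine`, `handleEvent`, and `consumeLines` are hypothetical names, and the event shape is an assumption that models only the fields used here.

```typescript
// Hypothetical event shape (an assumption); only the fields used below are modelled.
interface StreamEvent {
  type: string;
  error?: { type: string; message: string };
  delta?: { text?: string };
}

// Step 1: parsing only. A line that is not valid JSON is skipped, never thrown.
function parseSseLine(line: string): StreamEvent | null {
  if (!line.startsWith("data:")) return null;
  const payload = line.slice(5).trim();
  if (payload === "") return null;
  try {
    return JSON.parse(payload) as StreamEvent;
  } catch {
    return null; // unparseable SSE line: safe to ignore
  }
}

// Step 2: event handling. Runs outside the parsing try/catch, so an
// error event propagates to the caller instead of being swallowed.
function handleEvent(event: StreamEvent, out: string[]): void {
  if (event.type === "error") {
    const kind = event.error?.type ?? "unknown_error";
    const message = event.error?.message ?? "no message";
    throw new Error(`${kind}: ${message}`);
  }
  if (event.type === "content_block_delta" && event.delta?.text) {
    out.push(event.delta.text);
  }
}

// Collects streamed text from already-split SSE lines.
function consumeLines(lines: string[]): string {
  const out: string[] = [];
  for (const line of lines) {
    const event = parseSseLine(line);
    if (event === null) continue;
    handleEvent(event, out);
  }
  return out.join("");
}
```

With this split, a malformed line is still skipped quietly, while a mid-stream error event such as `overloaded_error` stops the run with a visible error rather than a truncated "Done.".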
Mohit 735df19a9b docs(readme): add live GitHub Pages link to Skill Playground section
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:02:31 +01:00
Mohit f956b4c329 ci: auto-deploy Skill Playground to GitHub Pages
On push to main, rebuild web/skills.json from the SKILL.md files and publish
web/ to GitHub Pages, so the live site always reflects the current skill
library. Manual runs supported via workflow_dispatch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:01:09 +01:00
Mohit 2e58766814 feat(web): add Skill Playground — browser UI to run any skill with your own key
A zero-backend static web app to run any of the 172 skills directly in the
browser using the user's own Claude API key (stored only in localStorage,
sent straight to api.anthropic.com).

- build-skills.mjs: generates skills.json from skills/*/SKILL.md, parsing
  frontmatter, the Required Inputs section (-> form fields), and a one-line
  summary for each skill tile.
- Tile gallery with bundle tag, title, and one-line description; search +
  bundle filter; click a tile to open an auto-generated input form.
- Streams output via the Anthropic Messages API (direct browser access),
  with copy/download, model picker, and Show/Hide key toggle.
- Product Notes logo in the header.
- README: add Skill Playground section + screenshot, a table of contents,
  and collapse the long changelog and full skills list into <details> blocks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 11:58:59 +01:00
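The skills.json generation step described in 2e58766814 (frontmatter plus the Required Inputs section turned into form fields) might look roughly like the sketch below. It is an assumption-laden illustration rather than the actual build-skills.mjs: `SkillEntry` and `parseSkill` are hypothetical names, the frontmatter is assumed to be flat `key: value` pairs, and inputs are assumed to be bullet items under a `## Required Inputs` heading.

```typescript
// Hypothetical output shape for one skill tile (an assumption).
interface SkillEntry {
  name: string;
  description: string;
  inputs: string[]; // one form field per Required Inputs bullet
}

function parseSkill(markdown: string): SkillEntry {
  const meta: Record<string, string> = {};
  let body = markdown;

  // Frontmatter: text between a leading '---' line and the next '---' line.
  const fm = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (fm) {
    body = markdown.slice(fm[0].length);
    for (const line of fm[1].split(/\r?\n/)) {
      const idx = line.indexOf(":");
      if (idx === -1) continue;
      const key = line.slice(0, idx).trim();
      // Strip one pair of surrounding quotes, if present.
      const value = line.slice(idx + 1).trim().replace(/^["']|["']$/g, "");
      meta[key] = value;
    }
  }

  // Required Inputs: collect bullets until the next level-2 heading.
  const inputs: string[] = [];
  let inSection = false;
  for (const line of body.split(/\r?\n/)) {
    if (/^##\s+/.test(line)) {
      inSection = /^##\s+Required Inputs\s*$/i.test(line);
      continue;
    }
    const bullet = /^\s*[-*]\s+(.*)$/.exec(line);
    if (inSection && bullet) inputs.push(bullet[1].trim());
  }

  return {
    name: meta["name"] ?? "",
    description: meta["description"] ?? "",
    inputs,
  };
}
```

A real build script would also need to handle multiline YAML values, which this sketch deliberately ignores.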
mohitagw15856 bd7d5afce1 Merge pull request #19 from mohitagw15856/claude/admiring-cori-murZN
Update read me
2026-06-08 14:45:21 +01:00
Mohit 7f9331f5b4 docs(readme): add Plugin Directory section with descriptions for all 25 plugins
https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 13:42:10 +00:00
mohitagw15856 5d4d007aeb Merge pull request #18 from mohitagw15856/claude/admiring-cori-murZN
Quality improvements of skills - Anti-Patterns section, Description verb-when-produces, Required Inputs section, Quality Checks binary format, Frontmatter YAML
2026-06-08 14:07:56 +01:00
Mohit affae033fe fix(plugins): sync all 171 plugin SKILL.md files with fixed skills/ versions
Propagates Anti-Patterns sections, description rewrites, Required Inputs
additions, and Quality Checks format fixes from skills/ to matching plugin
SKILL.md copies.

https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 13:06:21 +00:00
Mohit fb85a1cb55 fix(skills): add Anti-Patterns and fix descriptions for remaining skills (batch 3)
Processed 27 skills: teaching-lesson-plan through feature-prioritisation and all figma skills.
Added Anti-Patterns sections to all 27 skills.
Added Quality Checks section to financial-due-diligence (was missing entirely).
Converted user-research-synthesis Quality Standards to binary checkbox format.
Rewrote descriptions for figma-design-critique-pm, figma-design-qa, figma-design-review,
team-health-check, and user-interview-synthesis.

https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 13:01:36 +00:00
Mohit f170eed437 fix(skills): add Anti-Patterns and fix descriptions for remaining skills (batch 2)
Processed 24 skills: pr-description-writer through tax-planning-checklist.
Added Anti-Patterns sections to all 24 skills.
Added Required Inputs section to product-launch-checklist.
Rewrote descriptions for retro-analysis, substack-notes-scraper, and sycophancy-challenger.

https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 12:57:05 +00:00
Mohit a33b4f7003 fix(skills): add Anti-Patterns and fix descriptions for remaining skills (batch 1)
Processed 29 skills: content-calendar through pptx-slide-auditor.
Added Anti-Patterns sections to all 29 skills.
Rewrote descriptions for instagram-post-downloader and job-application.

https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 12:53:18 +00:00
Mohit 74f3ef79ad fix(skills): add Anti-Patterns and fixes for partial batch 2 skills
https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 12:47:41 +00:00
Mohit 4ff88bdbb1 fix(skills): add Anti-Patterns sections, fix descriptions, quality checks, and required inputs
- Add Anti-Patterns section (3-5 binary checkboxes) to all modified skills
- Fix Quality Checks to use binary checkbox format where needed
- Rewrite descriptions to verb-when-produces format where needed
- Add Required Inputs sections to skills missing them
- Fix email-triage frontmatter YAML quoting

https://claude.ai/code/session_01MuGKn3a3Gbqoe8uM5Lmuqt
2026-06-08 10:20:50 +00:00
mohitagw15856 44f69a541f Merge pull request #16 from mohitagw15856/claude/lucid-sagan-YnJQS
Claude/lucid sagan yn jqs
2026-05-27 23:37:26 +01:00
Mohit Aggarwal 20eda05cc6 feat: v14.0.0 — 12 community-inspired skills, pm-writers profession, extend pm-cross/operations/engineering
New profession: Writers & Content Creators (pm-writers bundle, skills 156–160)
- instagram-post-downloader: Downloads Instagram images/carousels as high-res files + PDF stitch
- aeo-optimizer: Restructures articles for AI citation (AEO) — question H2s, answer capsules, trust signal audit
- thumbnail-creator: Generates brand-aligned thumbnail candidates via Gemini API with computer vision eval
- substack-notes-scraper: Scrapes Substack Notes engagement data to formatted .xlsx
- notes-humanizer: Strips AI writing patterns across 3 phases; injects genuine human signals

Extended pm-cross (+3 skills, skills 161–163):
- sycophancy-challenger: Argues against your idea first, holds position under pushback
- last-30-days-research: Multi-platform research (Reddit, X, web) with signal confidence scoring
- notebooklm-connector: Automates NotebookLM from Claude Code via Chrome extension

Extended pm-operations (+2 skills, skills 164–165):
- email-triage: Reads Gmail and surfaces only actionable emails with priority + reply starters
- morning-intelligence: 15-question interview → personalised master news brief prompt

Extended pm-engineering (+2 skills, skills 166–167):
- context-mode: Output filtering + session log for long Claude Code sessions
- claude-superpowers: Plan→Isolate→Test→Double-review framework for Claude Code

Updated: marketplace.json v14.0.0 (167 skills, 18 professions, 26 bundles)
Updated: README.md — title, badges, What's New, All 167 Skills table, install list

Credits: skills inspired by Frank & Diana Dovgopol, Gencay (LearnAIwithMe), Karen Spinner (Wondering About AI), Orel (TheIndiepreneur), Joel Salinas (Leadership in Change), Ilia Karelin (Prosper), Ashwin Francis (Cash&Cache), Nate Herk

https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 12:29:45 +00:00
Mohit Aggarwal 6bb25a8c13 feat: add aeo-optimizer, context-mode, claude-superpowers skills
https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 09:32:40 +00:00
Mohit Aggarwal 5f12fcff50 feat: add email triage community skill
https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 09:31:16 +00:00
Mohit Aggarwal 84abb1583d feat: add 3 more community skills (partial batch 2/3) — sycophancy challenger, notebooklm connector, morning intelligence
https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 09:31:02 +00:00
Mohit Aggarwal 2c92636980 feat: add 4 community skills (partial batch 1/3) — instagram downloader, substack scraper, notes humanizer, last-30-days research
https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 09:30:04 +00:00
mohitagw15856 dc579c7512 Merge pull request #15 from mohitagw15856/claude/lucid-sagan-YnJQS
feat: add Social Media profession — 5 new skills, pm-social bundle, v…
2026-05-27 08:27:06 +01:00
Mohit Aggarwal d213ccde1c feat: add Social Media profession — 5 new skills, pm-social bundle, v13.0.0
New profession: Social Media (Skills 151–155)

Skills added:
- social-media-audit: Scored platform audit with competitive benchmarking and prioritised action plan
- influencer-brief: Complete creator partnership brief with deliverables, approval workflow, and commercial terms
- community-management-playbook: Response frameworks, moderation rules, escalation tiers, and DM templates
- social-ad-campaign: Full-funnel paid social plan with ad copy for every format and A/B testing plan
- viral-content-framework: 6 hook formulas, 5 content structures, platform playbooks, and content testing system

Changes:
- Added plugins/pm-social/ bundle with all 5 skills
- Updated .claude-plugin/marketplace.json to v13.0.0 (155 skills, 17 professions, 24 bundles)
- Updated README.md: title, badges, description, What's New section, All Skills table, plugin bundle list

https://claude.ai/code/session_01E4bTUWxx4Zo5rsFpad5X5B
2026-05-27 07:24:57 +00:00
mohitagw15856 ae6ea4d53e feat: v12.0.0 — 150-skill milestone, 15 new skills across 10 bundles
Adds 15 new skills reaching the 150-skill milestone:

Data & Analytics (pm-data):
- cohort-analysis: retention curves, LTV projection, behavioural segmentation, SQL reference queries
- data-pipeline-spec: ETL/ELT design with SLAs, DQ rules, error handling, compliance

Customer Success (pm-cs):
- renewal-playbook: health snapshot, value story, commercial scenarios, objection responses, 16-week timeline
- customer-success-plan: joint success plan with milestones, mutual commitments, escalation path

People & Leadership (pm-people):
- 360-feedback-template: survey instrument + narrative report with strengths and development themes
- team-health-check: Spotify-model assessment across 7 dimensions with facilitation guide

Operations (pm-operations):
- risk-register: L×I scoring, RAG heat map, mitigation and contingency plans
- raci-matrix: role definitions, decision map, anti-pattern guide, communication template

Marketing & GTM (pm-gtm):
- social-media-strategy: audience profile, content pillars, KPIs, 4-week starter calendar
- product-positioning-doc: April Dunford-style positioning, messaging hierarchy, persona messaging

Discovery (pm-discovery):
- customer-journey-map: stage-by-stage journey with touchpoints, emotions, and prioritised opportunities

Delivery (pm-delivery):
- user-story-writer: Given/When/Then ACs, edge cases, definition of done, epic decomposition

Advanced (pm-advanced):
- ai-ethics-review: fairness, bias, transparency, privacy, safety, accountability, societal impact

Sales (pm-sales):
- partnership-proposal: mutual value, commercial model, joint GTM plan, governance

Design (pm-design):
- design-system-audit: component coverage, token consistency, WCAG, adoption, remediation roadmap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 21:58:13 +01:00
mohitagw15856 94e53d38a8 Merge pull request #14 from mohitagw15856/claude/add-engineering-skills-IfBhz
quality: improve 10 v7.0.0-era engineering skills
2026-05-20 13:29:33 +01:00
mohitagw15856 01c10eb625 Content quality improvements to remaining 5 engineering skills
Completes the quality pass across all 10 skills:
- incident-postmortem: fix opening paragraph (blameless framing emphasis),
  add root cause circular check + action item specificity quality checks
- pr-description-writer: add title format quality check, fix
  risk-appropriate reviewer guidance quality check
- system-design-interview: rewrite architecture diagram instruction
  (system-specific not generic template), fix capacity estimates to show
  arithmetic, add trade-offs non-empty check
- api-docs-writer: add API Version + Rate Limits inputs, clarify output
  format options, add error codes completeness check, fix code examples check
- architecture-decision-record: add ADR Number + Team Context inputs,
  fix Implementation Notes + Review Date guidance, fix quality checks for
  context specificity and rejected option reasoning

Both skills/ and plugins/pm-engineering/skills/ copies updated.

https://claude.ai/code/session_01C3HwChrccJd145vJ6Z7ajF
2026-05-20 12:06:26 +00:00
mohitagw15856 49137bd1b6 Content quality improvements to 7 engineering skills (partial batch)
Applies reviewer-feedback-driven improvements across 7 skills:
- code-review-checklist: add Section 1 header, optional diff input, precise
  review time estimate, stronger quality checks
- debugging-log-analyser: improve Context input, add Frequency input,
  add Section 1 Error Classification header, stronger quality checks
- changelog-generator: add Previous Version Behaviour + Scope inputs,
  clarify Formatting Rules are skill-internal, stronger quality checks
- pr-description-writer: add Target Branch + Linked Issue inputs, fix
  Screenshots omission instruction, stronger quality checks
- test-strategy-doc: split Existing Coverage from Tech Stack, add
  Deployment Cadence input, fix Performance Tests conditional,
  stronger quality checks
- runbook-writer: add Monitoring Tools + Key Environment Details inputs,
  fix Last Updated placeholder, stronger quality checks
- incident-postmortem: add Responders + Customer Communications inputs

Both skills/ and plugins/pm-engineering/skills/ copies updated.

https://claude.ai/code/session_01C3HwChrccJd145vJ6Z7ajF
2026-05-20 12:06:26 +00:00
mohitagw15856 929fa3ad7f Restore trigger phrases as ## Usage Examples across 10 engineering skills
Renamed ## Example Trigger Phrases → ## Usage Examples to make the section
clearly human-facing documentation rather than a system instruction.
Restores content that was removed in the previous quality pass.

Skills updated (both skills/ and plugins/pm-engineering/skills/):
code-review-checklist, debugging-log-analyser, changelog-generator,
pr-description-writer, system-design-interview, test-strategy-doc,
runbook-writer, incident-postmortem, api-docs-writer,
architecture-decision-record

https://claude.ai/code/session_01C3HwChrccJd145vJ6Z7ajF
2026-05-20 12:06:26 +00:00
mohitagw15856 e366a77cf0 Quality-improve 10 v7.0.0-era engineering skills
Applies three consistent fixes across the v7.0.0 batch:
- Rename `## Output Structure` → `## Output Format` for consistency
- Wrap output template in `---` document separators (code-review-checklist,
  debugging-log-analyser needed full structural upgrade; remaining 8 already
  had the wrapper)
- Remove `## Example Trigger Phrases` section from all 10 skills

Skills updated: code-review-checklist, debugging-log-analyser,
changelog-generator, pr-description-writer, system-design-interview,
test-strategy-doc, runbook-writer, incident-postmortem, api-docs-writer,
architecture-decision-record

Both `skills/` and `plugins/pm-engineering/skills/` copies synced.

https://claude.ai/code/session_01C3HwChrccJd145vJ6Z7ajF
2026-05-20 12:06:11 +00:00
mohitagw15856 bf65c16222 Merge pull request #12 from mohitagw15856/claude/add-engineering-skills-IfBhz
Add 21 engineering skills — complete the 500-star milestone
2026-05-20 08:32:18 +01:00
Claude beecb1cb31 Add 21 engineering skills — complete the 500-star milestone
pm-engineering grows from 14 to 35 skills (v4.0.0), completing the full
25-skill promise made at the 500-star milestone. The library grows from
114 to 135 total skills.

New skills added (21):
- security-threat-model: STRIDE-based threat model with trust boundaries, per-component threat enumeration, risk scores, and mitigations
- performance-budget: Performance budgets for Core Web Vitals and backend latency SLOs with CI enforcement
- database-schema-design: Schema documentation with ER diagram, DDL definitions, index strategy, and access pattern analysis
- database-migration-plan: Zero-downtime expand-contract migration plan with per-step rollback and data validation queries
- technical-debt-register: Debt inventory with impact scoring, effort estimates, and quarterly resolution roadmap
- rfc-writer: Engineering RFC covering problem, proposed solution, alternatives-with-rejection-reasons, and rollout plan
- capacity-planning: Traffic forecasts, resource requirements by tier, scaling strategy, and infrastructure roadmap
- load-testing-plan: Load test plan with baseline/stress/spike/soak scenarios, k6/Locust skeleton, and CI gates
- disaster-recovery-plan: DR plan with RPO/RTO targets, per-scenario runbooks, game day testing, and communication templates
- feature-flag-guide: Feature flag lifecycle — taxonomy, rollout strategy, monitoring requirements, cleanup policy, governance
- dependency-audit: CVE vulnerabilities, license compliance, outdated packages, and 30-day remediation plan
- service-catalog-entry: Microservice catalog entry with SLAs, API contract, data classification, and runbook links
- monitoring-setup-guide: Four golden signals, alert rules spec, log schema, tracing setup, dashboard layout spec
- local-dev-setup: Local development guide — prerequisites, env vars, Docker deps, test commands, 5 failure fixes
- api-versioning-strategy: Versioning scheme, lifecycle policy, breaking change classification table, deprecation process
- infra-as-code-review: IaC review for Terraform/CloudFormation/Pulumi with severity-classified findings
- engineering-weekly-report: Consistent weekly status — shipped/blocked, metrics, decisions, risks, next week
- tech-radar: ThoughtWorks-format radar with Adopt/Trial/Assess/Hold, blip rationales, maintenance process
- sprint-velocity-analysis: Velocity trends, completion patterns, improvement recommendations, capacity forecast
- microservices-decomposition: Domain-driven service boundaries, communication patterns, data ownership, migration plan
- engineering-hiring-rubric: Technical interview rubric with level expectations, coding/system design scorecards, debrief guide

Also:
- plugin.json bumped to v4.0.0 with all 35 skills listed
- marketplace.json updated to v11.0.0, library count 135
- README updated: skill count, all section numbers, engineering table expanded, star milestone marked complete

https://claude.ai/code/session_01C3HwChrccJd145vJ6Z7ajF
2026-05-20 07:28:51 +00:00
mohitagw15856 8caa9c29b9 Add new plugins for Customer Success and Engineering 2026-05-17 15:45:45 +05:30
mohitagw15856 af29d30631 rebrand: PM = Professional, not just Product Management
Reposition the library without changing the repo name or URLs.
Adds 'PM stands for Professional' tagline to README header and
marketplace.json description to reflect the library now covering
16 professions beyond product management.
2026-05-17 11:14:40 +01:00
mohitagw15856 bfdbec17a3 feat: v10.0.0 — 8 new skills across Customer Success and Engineering (500-star milestone)
Two star milestones shipped together:

Customer Success bundle (pm-cs) — 250-star milestone:
- cs-health-scorecard: weighted RAG health score across 5 dimensions with renewal forecast
- qbr-deck: slide-by-slide QBR structure with value narrative and mutual commitments
- cs-escalation-brief: 4-level escalation framework with root cause, impact, and decision required
- churn-analysis: voluntary/unavoidable churn split, early warning signals, prioritised interventions

Engineering expansion (pm-engineering) — 500-star milestone:
- cicd-playbook: full pipeline playbook from build through post-deploy checks and rollback
- slo-error-budget: SLI definitions, burn rate alerts, and error budget policy
- developer-onboarding-doc: first-week guide covering architecture, setup, testing, and contacts
- oncall-runbook: per-alert response procedures, escalation matrix, and handoff template

Also:
- Added pm-cs plugin to marketplace.json
- Updated pm-engineering plugin.json to v3.0.0 (14 skills)
- Updated marketplace.json to v10.0.0 (114 skills, 23 bundles, 16 professions)
- README updated with new CS section, corrected skill numbering (106 → 114)
- Added bug report link to Contributing section
- Star milestones updated to show 250 and 500 as unlocked
2026-05-17 10:55:58 +01:00
mohitagw15856 48fd4dd6ad Update README with new plugin installation commands
Added additional plugin installation commands for various professions.
2026-05-08 12:40:20 +05:30
mohitagw15856 ad92de9637 Add Part 16 to the skills library section 2026-05-08 03:11:28 +05:30
mohitagw15856 450dbde74d Bump version to 9.0.0 and update description
Updated version and description to reflect new features.
2026-05-08 03:08:07 +05:30
mohitagw15856 af23bcc170 Update README.md 2026-05-08 03:06:32 +05:30
mohitagw15856 59c4510055 feat: v9.0.0 — three new agent templates (Discovery, Stakeholder Comms, Launch)
This release adds three new agent templates to the library, bringing the total to four.

New templates:
- PM Discovery Agent: synthesises customer interviews from Notion or Google Drive,
  identifies cross-interview themes, scores assumption confidence, generates follow-up questions
- PM Stakeholder Comms Agent: detects audience type (executive/investor/stakeholder/board),
  pulls activity from Linear/Jira/Drive, drafts in audience-appropriate format
- PM Launch Agent: end-to-end launch coordination with channel-specific content,
  calendar, success metrics, and launch checklist

Each template follows the established pattern: README, AGENT.md, orchestrate.sh,
2 subagents, connectors with example configs, examples, smoke test.

Total file count: 37 new files across 3 templates.

Updated README to position library as 4-template collection.
Bumped marketplace.json from v8.0.0 to v9.0.0.
2026-05-07 22:30:34 +01:00
mohitagw15856 9274b3d378 Add Part 15 to skills list in README 2026-05-06 15:22:39 +01:00
mohitagw15856 a0ed6e52a5 Update version badge from 7.0.0 to 8.0.0 2026-05-06 09:20:03 +01:00
mohitagw15856 84eefcabd6 fix: move templates contributing guide to templates/CONTRIBUTING.md 2026-05-05 23:31:59 +01:00
mohitagw15856 7df025ffaa Bump version to 8.0.0 and update description
Updated version and description to reflect new features and coverage.
2026-05-06 03:57:39 +05:30
mohitagw15856 e5377ca61a feat: v8.0.0 — first agent template (PM Sprint Agent) following Anthropic's agent template architecture
- Added templates/pm-sprint-agent/ directory with full agent template
  - AGENT.md system prompt with explicit step-by-step workflow
  - 2 subagents: capacity-analyst and risk-scorer
  - 2 connectors: linear and jira (with example configs)
  - Symlinked skills from main library: sprint-planning, sprint-brief
  - orchestrate.sh end-to-end workflow script
  - examples/ folder with input and output examples
  - tests/ folder with smoke test
- Updated README to position skills as building blocks for agent templates
- Added Anthropic agent templates announcement reference (May 5, 2026)
- Bumped marketplace.json to v8.0.0
- Listed 7 candidate agent templates this library supports

This is the first agent template in the library. More to follow.
2026-05-05 23:26:08 +01:00
mohitagw15856 bd38a36468 Revise README with new skills and sponsor details
Updated README to include new skills and sponsorship information.
2026-04-27 00:54:51 +05:30
mohitagw15856 c1d47fa1ae update path 2026-04-23 15:36:09 +01:00
mohitagw15856 48be8596d9 Merge pull request #6 from mohitagw15856/feat/v7-engineering-skills
feat: v7.0.0 — 6 new engineering skills, star milestone tracker, SKILL_REQUEST.md
2026-04-23 15:24:27 +01:00
mohitagw15856 c0544fb76a feat: v7.0.0 — 6 new engineering skills, badges, milestone tracker, SKILL_REQUEST.md
New skills added to pm-engineering bundle (now 10 skills total):
- debugging-log-analyser: stack trace → structured root cause diagnosis + fix
- pr-description-writer: diff/commits → reviewer-ready PR description
- system-design-interview: full system design with capacity, components, trade-offs
- changelog-generator: git log → polished Keep a Changelog entry
- test-strategy-doc: spec/PRD → complete test strategy with P0/P1 test cases
- runbook-writer: operational runbooks with exact commands, rollback, and escalation

README updates:
- 5 shields.io badges (stars, skill count, version, install, license)
- "See It in Action" demo section
- pm-engineering added to Quick Install list
- Star Milestone Tracker (100/250/500/1000 stars roadmap)
- Engineering table extended from 4 to 10 skills (41–50)
- Article 14 link resolved from remote merge

Config updates:
- marketplace.json: v6.0.0 → v7.0.0, "106 skills"
- pm-engineering plugin.json: v1.0.0 → v2.0.0

New file: SKILL_REQUEST.md — community skill voting board

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 15:21:43 +01:00
mohitagw15856 ce35e8c5c0 Update link in README for article series 2026-04-22 13:03:21 +05:30
mohitagw15856 a7ee053aac Merge pull request #5 from mohitagw15856/mohitagw15856-patch-1
Update Part 14 link in README.md
2026-04-21 14:15:24 +01:00
mohitagw15856 5b3eb3ea53 Update Part 14 link in README.md 2026-04-21 14:14:26 +01:00
mohitagw15856 44d211b934 fix: update marketplace.json to v6.0.0 with 100 skills
Bumps top-level version from 5.2.0 → 6.0.0, updates description to
reflect 100 skills, and syncs 6 plugin entries (pm-gtm, pm-finance,
pm-hr, pm-sales, pm-operations, pm-cross) to version 1.1.0 with
updated descriptions including the 7 new skills.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 21:05:31 +01:00
mohitagw15856 35364c7512 fix: update plugin.json for 6 bundles with new skills and version bumps
- pm-gtm v1.1.0: added seo-content-brief, media-pitch
- pm-finance v1.1.0: added tax-planning-checklist
- pm-hr v1.1.0: added change-management-plan
- pm-sales v1.1.0: added sales-forecasting-model
- pm-operations v1.1.0: added workshop-facilitation-guide
- pm-cross v1.1.0: added teaching-lesson-plan

Updated descriptions and keywords in all 6 plugin.json files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 21:02:14 +01:00
mohitagw15856 513e1d3ce7 fix: sync all skill updates and new skills into plugin bundles
- Synced 97 existing skill SKILL.md files from skills/ to their plugin bundle copies
- Added 7 new skills to plugin bundles:
  - seo-content-brief, media-pitch -> pm-gtm
  - tax-planning-checklist -> pm-finance
  - change-management-plan -> pm-hr
  - sales-forecasting-model -> pm-sales
  - workshop-facilitation-guide -> pm-operations
  - teaching-lesson-plan -> pm-cross

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 21:00:08 +01:00
mohitagw15856 d7f6c2cd05 Update README.md 2026-04-21 01:26:58 +05:30
mohitagw15856 844e97f81f Delete MEDIUM_ARTICLE_DRAFT.md 2026-04-21 01:25:34 +05:30
mohitagw15856 b6e0cbc31b merge: incorporate remote README updates (article links for parts 10-11)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 20:54:59 +01:00
mohitagw15856 f3b9d008fe feat: 100 skills milestone — 7 new skills + quality improvements across all 93
New skills added:
- teaching-lesson-plan: structured lesson plans for any subject/audience/setting
- seo-content-brief: complete SEO briefs with intent, competitor gaps, and outline
- media-pitch: story-first journalist pitches with angle development framework
- change-management-plan: stakeholder analysis, comms strategy, adoption metrics
- workshop-facilitation-guide: activity instructions, decision protocols, facilitator moves
- sales-forecasting-model: pipeline model, scenario analysis, assumption log
- tax-planning-checklist: year-end tax planning across income, pension, CGT, reliefs

Quality improvements across all 93 existing skills:
- Standardised description format: "Verb the thing. Use when X. Produces Y."
- Added Required Inputs section to all skills missing it (prompts for missing info)
- Added Quality Checks section to all skills missing it (specific, not generic)
- Fixed broken multiline YAML descriptions
- Removed non-standard frontmatter keys (tool_integration, metadata blocks)

README updated to v6.0.0 with 100-skill count, new skill tables, and article series

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 20:52:31 +01:00
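The standardised description format from f3b9d008fe ("Verb the thing. Use when X. Produces Y.") lends itself to a simple lint. The sketch below is only one possible check, written under assumptions: `checkDescription` is a hypothetical helper, not part of the repository, and it tests for the literal phrases "Use when " and "Produces " rather than doing any grammatical analysis.

```typescript
// Returns a list of problems; an empty list means the description
// matches the "Verb the thing. Use when X. Produces Y." shape.
function checkDescription(description: string): string[] {
  const problems: string[] = [];
  const text = description.trim();
  const useAt = text.indexOf("Use when ");
  const producesAt = text.indexOf("Produces ");

  if (useAt === -1) problems.push('missing "Use when ..." sentence');
  if (producesAt === -1) problems.push('missing "Produces ..." sentence');
  if (useAt !== -1 && producesAt !== -1 && producesAt < useAt) {
    problems.push('"Produces" should follow "Use when"');
  }
  // Something (the leading action sentence) must come before "Use when".
  if (text === "" || useAt === 0) problems.push("missing leading action sentence");
  if (!text.endsWith(".")) problems.push("description should end with a period");
  return problems;
}
```

For example, `checkDescription("Draft a risk register. Use when planning a project. Produces a scored table.")` reports no problems, while a bare phrase with none of the three sentences reports several.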
mohitagw15856 93c5ab7d71 Update README.md 2026-04-17 16:47:06 +05:30
mohitagw15856 34b0f780e6 feat: Opus 4.7 release — 3 new vision/document skills, 3 updated skills (v5.2.0, 93 skills) 2026-04-17 12:07:27 +01:00
mohitagw15856 7dbf5a47a3 Rename setup-marketplace.sh to scripts/setup-marketplace.sh 2026-04-13 09:09:06 +01:00
mohitagw15856 f9d075ce3d Rename add-plugin-json.sh to scripts/add-plugin-json.sh 2026-04-13 09:08:48 +01:00
mohitagw15856 81bc090869 Delete pm-figma-example.md 2026-04-13 09:08:09 +01:00
mohitagw15856 ff46498e46 Delete setup-pm-figma.sh 2026-04-13 09:07:17 +01:00
mohitagw15856 31c45072ec Delete setup-80-skills.sh 2026-04-13 09:07:06 +01:00
mohitagw15856 d650957c6a Delete create-plugin-jsons.sh 2026-04-13 09:06:55 +01:00
mohitagw15856 4dac8817cf Delete create-plugin-json-pm-figma.sh 2026-04-13 09:06:39 +01:00
mohitagw15856 6deaa51bf6 Delete update-marketplace_new.sh 2026-04-13 09:05:58 +01:00
mohitagw15856 06243650b9 Delete update-marketplace.sh 2026-04-13 09:05:45 +01:00
mohitagw15856 69a319688f fix: update marketplace.json to v5.1.0 — 22 plugins including pm-figma 2026-04-08 20:07:01 +01:00
mohitagw15856 254e389593 fix: add missing pm-figma plugin.json 2026-04-08 20:00:52 +01:00
445 changed files with 51688 additions and 5396 deletions
+108 -20
@@ -1,8 +1,8 @@
 {
 "$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
 "name": "pm-claude-skills",
-"version": "4.0.0",
-"description": "53 Claude Skills across 6 professions — product management, marketing, engineering, data, design, and leadership. Save 10-15 hours per week.",
+"version": "14.0.0",
+"description": "PM stands for Professional, not just Product Management. 167 Claude Skills + 4 agent templates across 26 bundles covering 18 professions — engineering, customer success, legal, finance, HR, sales, design, Figma, marketing, social media, writers, and more. Built by a PM, used by everyone. Building blocks for the Anthropic agent template architecture.",
 "owner": {
 "name": "Mohit Aggarwal",
 "email": "mohit15856@gmail.com"
@@ -10,16 +10,16 @@
 "plugins": [
 {
 "name": "pm-essentials",
-"description": "Core PM skills: PRD Template, Meeting Notes, Stakeholder Update, User Research Synthesis, Competitive Analysis. The 5 skills every PM needs first.",
-"version": "3.0.0",
+"description": "Core PM skills: PRD Template, Meeting Notes, Stakeholder Update, User Research Synthesis, Competitive Analysis, Word Doc Tracked Changes. The essentials every PM needs first.",
+"version": "3.1.0",
 "category": "productivity",
 "source": "./plugins/pm-essentials",
 "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
 },
 {
 "name": "pm-discovery",
-"description": "Discovery & research skills: Discovery Interview Guide, Job Story Mapper, User Interview Synthesis, Assumption Mapper. Structure user research from screener to synthesis.",
-"version": "3.0.0",
+"description": "Discovery & research skills: Discovery Interview Guide, Job Story Mapper, User Interview Synthesis, Assumption Mapper, Customer Journey Map. Structure user research from screener to synthesis — including end-to-end journey mapping with touchpoints, emotions, and prioritised opportunities.",
+"version": "3.1.0",
 "category": "productivity",
 "source": "./plugins/pm-discovery",
 "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -34,8 +34,8 @@
 },
 {
 "name": "pm-delivery",
-"description": "Sprint & delivery skills: Sprint Planning, Technical Spec Template, A/B Test Planner, Go-to-Market Planner, Product Launch Checklist, Sprint Brief, Retro Analysis.",
-"version": "3.0.0",
+"description": "Sprint & delivery skills: Sprint Planning, Technical Spec, A/B Test Planner, Go-to-Market Planner, Launch Checklist, Sprint Brief, Retro Analysis, PPTX Slide Auditor, User Story Writer. Write production-ready user stories with Given/When/Then acceptance criteria, edge cases, and definition of done.",
+"version": "3.2.0",
 "category": "productivity",
 "source": "./plugins/pm-delivery",
 "homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -58,8 +58,8 @@
},
{
"name": "pm-advanced",
"description": "Advanced PM skills: AI Product Canvas, Multi-Source Signal Synthesiser, Experiment Designer, Design Handoff Brief, Stakeholder Update. For senior PMs working on complex products.",
"version": "3.0.0",
"description": "Advanced PM skills: AI Product Canvas, Multi-Source Signal Synthesiser, Experiment Designer, Design Handoff Brief, AI Ethics Review. For senior PMs working on complex products — including a structured ethical review framework for AI/ML features covering fairness, transparency, privacy, safety, and accountability.",
"version": "3.1.0",
"category": "productivity",
"source": "./plugins/pm-advanced",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -74,40 +74,48 @@
},
{
"name": "pm-gtm",
"description": "Marketing & GTM skills: Go-To-Market Planner, Content Calendar, Competitor Teardown, Email Campaign. Build positioning statements, messaging pillars, feature lists, use cases, and launch campaigns.",
"version": "1.0.0",
"description": "Marketing & GTM skills: Go-To-Market Planner, Content Calendar, Competitor Teardown, Email Campaign, SEO Content Brief, Media Pitch, Social Media Strategy, Product Positioning Doc. Build positioning docs, messaging frameworks, content pillars, social strategies with KPIs, launch campaigns, and journalist pitches.",
"version": "1.2.0",
"category": "productivity",
"source": "./plugins/pm-gtm",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-engineering",
"description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record. Structured outputs for engineering teams and technical PMs.",
"version": "1.0.0",
"description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record, Debugging Log Analyser, PR Description Writer, System Design Interview, Changelog Generator, Test Strategy Doc, Runbook Writer, CI/CD Playbook, SLO & Error Budget, Developer Onboarding Doc, On-Call Runbook, Security Threat Model, Performance Budget, Database Schema Design, Database Migration Plan, Technical Debt Register, RFC Writer, Capacity Planning, Load Testing Plan, Disaster Recovery Plan, Feature Flag Guide, Dependency Audit, Service Catalog Entry, Monitoring Setup Guide, Local Dev Setup, API Versioning Strategy, Infra-as-Code Review, Engineering Weekly Report, Tech Radar, Sprint Velocity Analysis, Microservices Decomposition, Engineering Hiring Rubric, Context Mode, Claude Superpowers. 37 structured skills for engineering teams, SREs, technical PMs, and Claude Code power users.",
"version": "4.1.0",
"category": "productivity",
"source": "./plugins/pm-engineering",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-cs",
"description": "Customer Success skills: Customer Health Scorecard, QBR Deck, Escalation Brief, Churn Analysis, Renewal Playbook, Customer Success Plan. Score health, build QBRs, write escalation briefs, plan renewals with commercial strategy and objection responses, and build joint success plans with milestones and mutual commitments.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-cs",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-data",
"description": "Data & analytics skills: Metrics Framework, SQL Query Explainer, Dashboard Brief. Build North Star metric trees, explain and optimise SQL, and spec dashboards from business questions.",
"version": "1.0.0",
"description": "Data & analytics skills: Metrics Framework, SQL Query Explainer, Dashboard Brief, Chart Data Extractor, Cohort Analysis, Data Pipeline Spec. Build metric trees, explain SQL, spec dashboards, run cohort retention analysis with LTV modelling, and design ETL/ELT pipeline specifications with SLAs and data quality rules.",
"version": "1.2.0",
"category": "productivity",
"source": "./plugins/pm-data",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-people",
"description": "Leadership & people skills: Performance Review, Hiring Rubric, Team Offsite Planner. Write structured reviews, build interview scorecards, and plan offsites from goals to minute-by-minute agenda.",
"version": "1.0.0",
"description": "Leadership & people skills: Performance Review, Hiring Rubric, Team Offsite Planner, 360-Degree Feedback Template, Team Health Check. Write reviews, build scorecards, run Spotify-model team health assessments, and design 360 feedback surveys with structured narrative reports.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-people",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-design",
"description": "Design & UX skills: UX Research Plan, Design Critique, Accessibility Audit. Create research plans with discussion guides, critique designs using JTBD and Gestalt principles, audit for WCAG 2.2 compliance.",
"version": "1.0.0",
"description": "Design & UX skills: UX Research Plan, Design Critique, Accessibility Audit, Design System Audit. Create research plans, critique designs using JTBD and Gestalt principles, audit for WCAG 2.2 compliance, and audit design systems for component coverage, token consistency, and adoption health.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-design",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
@@ -119,6 +127,86 @@
"category": "productivity",
"source": "./plugins/pm-business",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-legal",
"description": "Legal skills: Contract Review, NDA Analyser, Legal Brief, Compliance Checklist. Flag risks in contracts and NDAs, draft legal memos in IRAC format, and generate GDPR, SOC 2, FCA and other compliance checklists.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-legal",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-finance",
"description": "Finance skills: Financial Model Narrative, Budget Variance Analysis, Investor Pitch Deck, Financial Due Diligence, Tax Planning Checklist. Turn numbers into board-ready narratives, explain variances, structure pitch decks, generate DD checklists, and review year-end tax planning opportunities.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-finance",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-hr",
"description": "HR skills: Job Description Writer, Onboarding Plan, Employee Engagement Survey, Redundancy Consultation, Change Management Plan. Write inclusive JDs, build 30/60/90-day plans, design engagement surveys, structure redundancy processes, and manage organisational change with stakeholder analysis and adoption metrics.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-hr",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-sales",
"description": "Sales skills: Sales Battlecard, Discovery Call Prep, Proposal Writer, Account Plan, Sales Forecasting Model, Partnership Proposal. Build battlecards, prepare calls, write proposals, create account plans, build forecasts, and structure B2B partnership proposals with mutual value, commercial terms, and joint GTM plans.",
"version": "1.2.0",
"category": "productivity",
"source": "./plugins/pm-sales",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-operations",
"description": "Operations skills: Process Documentation, SOP Writer, Vendor Evaluation, Project Status Report, Workshop Facilitation Guide, Risk Register, RACI Matrix, Email Triage, Morning Intelligence. Document workflows, write SOPs, build risk registers, define RACI matrices, triage your inbox to only what needs action, and auto-generate a personalised morning news brief.",
"version": "1.3.0",
"category": "productivity",
"source": "./plugins/pm-operations",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-research",
"description": "Research and healthcare skills: Clinical Case Summary, Research Protocol, Patient Communication, Literature Review. Write SBAR handovers, design research protocols, draft accessible patient communications, and structure literature reviews.",
"version": "1.0.0",
"category": "productivity",
"source": "./plugins/pm-research",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-cross",
"description": "Cross-profession skills: Press Release, Grant Proposal, Executive Summary, Teaching Lesson Plan, Sycophancy Challenger, Last 30 Days Research, NotebookLM Connector. Get genuine push-back on your ideas (not validation), gather multi-platform research from the last 30 days, and automate NotebookLM from Claude.",
"version": "1.2.0",
"category": "productivity",
"source": "./plugins/pm-cross",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-figma",
"description": "Figma skills for PMs and designers: Component Audit, Design Brief, Annotation Guide, Design Review, User Flow Planner, Variant Matrix, Spacing System, Prototype Plan, Design QA, PM Design Critique. Work smarter across the full Figma design lifecycle.",
"version": "1.1.0",
"category": "productivity",
"source": "./plugins/pm-figma",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-social",
"description": "Social Media skills: Social Media Audit, Influencer Brief, Community Management Playbook, Social Ad Campaign, Viral Content Framework. Score your social presence, brief influencer partnerships, manage communities at scale, plan paid social campaigns with full ad copy, and build a repeatable system for shareable content.",
"version": "1.0.0",
"category": "productivity",
"source": "./plugins/pm-social",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
},
{
"name": "pm-writers",
"description": "Writers & Content Creators skills: Instagram Post Downloader, AEO Optimizer, Thumbnail Creator, Substack Notes Scraper, Notes Humanizer. Download Instagram carousels as PDFs, restructure articles for AI citation, generate thumbnail candidates via Gemini, export Substack Notes analytics to Excel, and strip AI writing patterns from any text.",
"version": "1.0.0",
"category": "productivity",
"source": "./plugins/pm-writers",
"homepage": "https://github.com/mohitagw15856/pm-claude-skills"
}
]
}
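The plugin entries above all carry the same six string fields, with `source` pointing at `./plugins/<name>`. A minimal sketch of a consistency check built on that observation is shown here. `validatePlugin`, `REQUIRED_FIELDS`, and the strict MAJOR.MINOR.PATCH pattern are illustrative assumptions, not part of any official marketplace schema.

```javascript
// Hypothetical validator: the required-field list mirrors the entries shown
// in this manifest, not an official marketplace schema.
const REQUIRED_FIELDS = ["name", "description", "version", "category", "source", "homepage"];
const SEMVER = /^\d+\.\d+\.\d+$/;

// Returns a list of human-readable problems; an empty list means the entry
// matches the conventions used by the other plugins.
function validatePlugin(entry) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof entry[field] !== "string" || entry[field].length === 0) {
      problems.push(`missing or empty "${field}"`);
    }
  }
  if (typeof entry.version === "string" && !SEMVER.test(entry.version)) {
    problems.push(`version "${entry.version}" is not MAJOR.MINOR.PATCH`);
  }
  if (typeof entry.name === "string" && entry.source !== `./plugins/${entry.name}`) {
    problems.push(`source should be "./plugins/${entry.name}"`);
  }
  return problems;
}
```

Running a check like this over every element of `plugins` before a release could catch a forgotten version bump or a mistyped `source` path.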
@@ -0,0 +1,58 @@
name: Deploy Skill Playground

# Rebuilds web/skills.json from the SKILL.md files and publishes web/ to
# GitHub Pages. Runs on every push to main that touches skills or the web app,
# so the live site always reflects the current skill library.
on:
  push:
    branches: [main]
    paths:
      - 'skills/**'
      - 'web/**'
      - '.github/workflows/deploy-playground.yml'
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

# Allow one concurrent deployment; a newer run cancels any in-progress run in
# the `pages` group.
concurrency:
  group: pages
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Rebuild skills.json from SKILL.md files
        run: node web/build-skills.mjs
      - name: Configure Pages
        uses: actions/configure-pages@v5
      - name: Upload web/ as Pages artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: web
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
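The build step in this workflow runs `web/build-skills.mjs`, which the commit message describes as parsing each SKILL.md's frontmatter and Required Inputs section. The script itself is not shown in this diff, so the following is only a rough sketch of that kind of parsing. It assumes a `---` frontmatter block of simple `key: value` lines and a bullet list under a `## Required Inputs` heading; `parseSkill` and its return shape are hypothetical.

```javascript
// Sketch under stated assumptions; this is NOT the repository's build-skills.mjs.
// Assumes: frontmatter delimited by `---` with plain `key: value` lines, and
// inputs listed as `-` or `*` bullets under a "## Required Inputs" heading.
function parseSkill(markdown) {
  const meta = {};
  const frontmatter = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (frontmatter) {
    for (const line of frontmatter[1].split("\n")) {
      const idx = line.indexOf(":");
      if (idx > 0) {
        meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
      }
    }
  }

  // Walk the document line by line, collecting bullets only while inside
  // the Required Inputs section; any other H2 heading ends the section.
  const inputs = [];
  let inSection = false;
  for (const line of markdown.split("\n")) {
    if (line.startsWith("## ")) {
      inSection = line.trim() === "## Required Inputs";
      continue;
    }
    const bullet = line.match(/^[-*]\s+(.+)$/);
    if (inSection && bullet) inputs.push(bullet[1].trim());
  }
  return { ...meta, inputs };
}
```

Each entry in `inputs` could then map to one form field in the playground, which is one plausible way the auto-generated form gets its shape.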
@@ -1,8 +1,29 @@
# 🧠 Claude Skills Library — 90 Skills for Every Profession
# 🧠 PM Claude Skills — 167 Skills for Every Profession
> **Save 8–10 hours per week across 14 professions. Install in 2 minutes.**
[![Stars](https://img.shields.io/github/stars/mohitagw15856/pm-claude-skills?style=social)](https://github.com/mohitagw15856/pm-claude-skills/stargazers)
[![Skills](https://img.shields.io/badge/skills-167-blue)](https://github.com/mohitagw15856/pm-claude-skills)
[![Version](https://img.shields.io/badge/version-14.0.0-brightgreen)](https://github.com/mohitagw15856/pm-claude-skills/releases)
[![Install](https://img.shields.io/badge/Install%20in%20Claude%20Code-2%20minutes-orange)](https://github.com/mohitagw15856/pm-claude-skills#-quick-install-2-minutes)
[![License](https://img.shields.io/badge/license-MIT-lightgrey)](LICENSE)
[![Sponsor](https://img.shields.io/badge/sponsor-❤️-ff69b4)](https://github.com/sponsors/mohitagw15856)
A community-built library of Claude Skills covering product management, marketing, engineering, data, design, Figma, leadership, legal, finance, HR, sales, operations, research, and more. Each skill is a structured SKILL.md file that teaches Claude how to produce professional-grade outputs for your specific workflows.
> **PM stands for Professional, not just Product Management.**
> 167 Claude Skills + 4 agent templates across 26 bundles covering 18 professions. Built by a PM, used by everyone.
A community-built library of Claude Skills for professionals across every field — product management, engineering, customer success, marketing, social media, writers, design, legal, finance, HR, sales, operations, research, and more. Each skill is a structured SKILL.md file that teaches Claude how to produce professional-grade outputs for your specific workflows.
**🆕 Latest release (v14.0.0):** 12 new community-inspired skills across 4 bundles — a brand new Writers & Content Creators profession (Instagram downloader, AEO optimizer, thumbnail creator, Substack scraper, notes humanizer), plus decision-making, productivity, and Claude Code power tools.
---
## Contents
- [🚀 Quick Install](#-quick-install-2-minutes)
- [🌐 Skill Playground — try any skill in your browser](#-skill-playground--try-any-skill-in-your-browser)
- [📦 Plugin Directory](#-plugin-directory)
- [🤖 Building Blocks for Agent Templates](#-building-blocks-for-agent-templates)
- [🗂️ All 167 Skills](#-all-167-skills)
- [📋 Changelog](#-changelog)
- [🤝 Contributing](#-contributing--add-your-skill)
---
@@ -10,21 +31,40 @@ A community-built library of Claude Skills covering product management, marketin
In Claude Code, run:
/plugin marketplace add https://github.com/mohitagw15856/pm-claude-skills
/plugin marketplace add mohitagw15856/pm-claude-skills
Or install by profession:
claude plugin install pm-essentials@pm-claude-skills # Core PM
claude plugin install pm-essentials@pm-claude-skills # Core PM + Word tracked changes
claude plugin install pm-delivery@pm-claude-skills # Delivery + PowerPoint auditor
claude plugin install pm-engineering@pm-claude-skills # Engineering (37 skills) 🆕
claude plugin install pm-cs@pm-claude-skills # Customer Success 🆕
claude plugin install pm-data@pm-claude-skills # Data + chart data extractor
claude plugin install pm-legal@pm-claude-skills # Legal
claude plugin install pm-finance@pm-claude-skills # Finance
claude plugin install pm-hr@pm-claude-skills # HR
claude plugin install pm-sales@pm-claude-skills # Sales
claude plugin install pm-operations@pm-claude-skills # Operations
claude plugin install pm-research@pm-claude-skills # Research & Healthcare
claude plugin install pm-cross@pm-claude-skills # Cross-profession
claude plugin install pm-figma@pm-claude-skills # Figma
claude plugin install pm-social@pm-claude-skills # Social Media 🆕
claude plugin install pm-writers@pm-claude-skills # Writers & Content Creators 🆕
Or clone and symlink for auto-updates:
@@ -32,6 +72,304 @@ git clone https://github.com/mohitagw15856/pm-claude-skills.git ~/pm-claude-skil
mkdir -p ~/.claude/skills
ln -s ~/pm-claude-skills/skills/* ~/.claude/skills/
---
## 🌐 Skill Playground — Try Any Skill in Your Browser
**▶ Live: [mohitagw15856.github.io/pm-claude-skills](https://mohitagw15856.github.io/pm-claude-skills/)**
Don't want to install anything yet? Run any of these skills from a **zero-backend web app** using **your own Claude API key**. Pick a skill, fill in the auto-generated form, and Claude streams the result. Your key is stored only in your browser (`localStorage`) and sent directly to the Anthropic API — nothing touches a server we own.
![Skill Playground — pick a skill, fill the form, run it with your own Claude key](web/docs-assets/playground.png)
**Run it locally:**
```bash
git clone https://github.com/mohitagw15856/pm-claude-skills.git
cd pm-claude-skills
node web/build-skills.mjs # generate the skill index (skills.json)
cd web && python3 -m http.server 8000 # serve over HTTP (not file://)
# open http://localhost:8000 and paste a key from console.anthropic.com
```
It's fully static — deploy the `web/` folder to GitHub Pages, Netlify, or Vercel with no environment variables. Full details in [`web/README.md`](web/README.md).
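To make the "sent directly to the Anthropic API" claim concrete, here is a hedged sketch of how a browser client might build the streaming request and handle one server-sent-event line. The endpoint, the `x-api-key`, `anthropic-version`, and `anthropic-dangerous-direct-browser-access` headers, and the `content_block_delta` and `error` event types come from the public Messages API; the function names, argument shapes, and the 8192 default (taken from the commit message) are assumptions, and the playground's real code may differ.

```javascript
// Sketch only: buildMessagesRequest and handleSseLine are hypothetical names.
function buildMessagesRequest({ apiKey, model, system, userText, maxTokens = 8192 }) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey, // read from localStorage by the caller
        "anthropic-version": "2023-06-01",
        // Opt-in header required for calls made straight from a browser.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,
        max_tokens: maxTokens,
        stream: true,
        system, // the skill's instructions
        messages: [{ role: "user", content: userText }],
      }),
    },
  };
}

// Handle one SSE line. JSON parsing sits in its own try/catch so a malformed
// line is skipped, while a genuine `error` event still throws to the caller.
function handleSseLine(line, onText) {
  if (!line.startsWith("data:")) return;
  let event;
  try {
    event = JSON.parse(line.slice(5).trim());
  } catch {
    return; // unparseable line: ignore it
  }
  if (event.type === "error") {
    throw new Error(event.error?.message ?? "stream error");
  }
  if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
    onText(event.delta.text);
  }
}
```

A caller would pass `url` and `init` to `fetch`, read the response body line by line, and feed each line to `handleSseLine`. Keeping the parse step separate from the event handling is what lets a mid-stream `overloaded_error` reach the user instead of being swallowed.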
---
## 📦 Plugin Directory
Not sure which plugin to install? Here's what each one covers:
| Plugin | Skills | Best for |
|---|---|---|
| **pm-essentials** | competitive-analysis, meeting-notes, prd-template, stakeholder-update, user-research-synthesis, docx-tracked-changes | The core PM toolkit — start here if you're new. Covers the documents you write every week: PRDs, stakeholder updates, meeting notes, and competitive analysis. |
| **pm-advanced** | ai-ethics-review, ai-product-canvas, experiment-designer, design-handoff-brief, multi-source-signal-synthesiser | For PMs working on AI products or running sophisticated experiments. Covers ethical review of AI features, AI-native product canvases, and experiment design. |
| **pm-analytics** | data-analysis-standard, product-health-analysis, retention-analysis | Turn raw data into PM-ready narratives. Use when you need to frame an analysis, explain health metrics to leadership, or diagnose retention drop-offs. |
| **pm-business** | board-deck-narrative, investor-update, job-application | For PMs operating at the business layer — writing board narratives, investor updates, or crafting a standout job application. |
| **pm-cross** | executive-summary, grant-proposal, last-30-days-research, notebooklm-connector, press-release, sycophancy-challenger, teaching-lesson-plan | Cross-profession utility skills that work outside a single domain — writing executive summaries, press releases, running research, and challenging sycophantic AI output. |
| **pm-cs** | churn-analysis, cs-escalation-brief, cs-health-scorecard, customer-success-plan, qbr-deck, renewal-playbook | For PMs or CSMs responsible for retention. Covers churn diagnosis, escalation briefs, QBR decks, health scorecards, and renewal plays. |
| **pm-data** | chart-data-extractor, cohort-analysis, dashboard-brief, data-pipeline-spec, metrics-framework, sql-query-explainer | Data-heavy work: extracting insights from charts, building metrics frameworks, explaining SQL queries, designing dashboards, and speccing data pipelines. |
| **pm-delivery** | ab-test-planner, go-to-market-planner, pptx-slide-auditor, product-launch-checklist, retro-analysis, sprint-brief, sprint-planning, technical-spec-template, user-story-writer | Everything you need to ship: sprint planning, user stories, launch checklists, A/B test design, retros, and PowerPoint auditing. The most-used plugin for day-to-day delivery. |
| **pm-design** | accessibility-audit, design-critique, design-system-audit, ux-research-plan | For PMs who work closely with design. Covers accessibility audits, structured design critiques, design system reviews, and UX research planning. |
| **pm-discovery** | assumption-mapper, customer-journey-map, discovery-interview-guide, job-story-mapper, user-interview-synthesis | The discovery toolkit: map assumptions, build journey maps, write interview guides, synthesise user interviews, and reframe features as job stories. |
| **pm-engineering** | 37 skills across API docs, architecture, CI/CD, incident response, security, observability, and more | The largest plugin — built for PMs embedded in engineering teams. Covers technical specs, runbooks, on-call processes, architecture decisions, and engineering hiring. |
| **pm-figma** | figma-annotation-guide, figma-component-audit, figma-design-brief, figma-design-critique-pm, figma-design-qa, figma-design-review, figma-prototype-plan, figma-spacing-system, figma-user-flow-planner, figma-variant-matrix | Purpose-built for Figma workflows. Covers design QA, component audits, spacing systems, user flow planning, variant matrices, and design briefs — all from a PM perspective. |
| **pm-finance** | budget-variance-analysis, financial-due-diligence, financial-model-narrative, investor-pitch-deck, tax-planning-checklist | For PMs who touch financials — explaining budget variances, building investor pitch decks, narrating financial models, and running due diligence reviews. |
| **pm-gtm** | competitor-teardown, content-calendar, email-campaign, go-to-market, media-pitch, product-positioning-doc, seo-content-brief, social-media-strategy | The go-to-market toolkit: positioning docs, competitor teardowns, GTM plans, content calendars, email campaigns, and SEO briefs. Best for PMs who own launch and demand. |
| **pm-hr** | change-management-plan, employee-engagement-survey, job-description-writer, onboarding-plan, redundancy-consultation | People operations skills — writing job descriptions, managing change, designing onboarding, running engagement surveys, and handling redundancy consultations. |
| **pm-legal** | compliance-checklist, contract-review, legal-brief, nda-analyser | For PMs navigating legal and compliance work: reviewing NDAs, summarising contracts, creating compliance checklists, and preparing legal briefs. |
| **pm-operations** | email-triage, morning-intelligence, process-documentation, project-status-report, raci-matrix, risk-register, sop-writer, vendor-evaluation, workshop-facilitation-guide | Operational efficiency skills — managing your inbox, running status reports, documenting processes, evaluating vendors, writing SOPs, and facilitating workshops. |
| **pm-people** | 360-feedback-template, hiring-rubric, performance-review, team-health-check, team-offsite-planner | For people managers and team leads: writing performance reviews, running 360 feedback, designing hiring rubrics, checking team health, and planning offsites. |
| **pm-planning** | feature-prioritisation, okr-builder, pricing-strategy, rice-impact-matrix, rice-prioritisation, roadmap-narrative, roadmap-presentation | Strategic planning from roadmaps to OKRs — prioritising features with RICE, writing roadmap narratives, setting pricing, building OKRs, and presenting strategy to stakeholders. |
| **pm-research** | clinical-case-summary, literature-review, patient-communication, research-protocol | For PMs in healthcare and research settings. Covers clinical case summaries, literature reviews, research protocols, and patient-facing communication. |
| **pm-rituals** | pm-weekly-review | A single powerful skill for the PM weekly review ritual — reflecting on progress, blockers, and priorities in a structured, consistent format. |
| **pm-sales** | account-plan, discovery-call-prep, partnership-proposal, proposal-writer, sales-battlecard, sales-forecasting-model | For PMs who work alongside sales — writing battlecards, preparing for discovery calls, building account plans, crafting partnership proposals, and forecasting. |
| **pm-social** | community-management-playbook, influencer-brief, social-ad-campaign, social-media-audit, viral-content-framework | Social media and community skills: running ad campaigns, briefing influencers, auditing social presence, building community playbooks, and designing viral content. |
| **pm-strategy** | ambiguity-resolver, competitive-intelligence-monitor, competitor-signal-tracker, executive-update, stakeholder-influence-mapper, strategic-narrative-generator | Senior PM and strategic work — resolving ambiguity, tracking competitive signals, mapping stakeholder influence, writing executive updates, and building strategic narratives. |
| **pm-writers** | aeo-optimizer, instagram-post-downloader, notes-humanizer, substack-notes-scraper, thumbnail-creator | For content creators and writers using Claude: optimising for AI search engines, humanising notes, scraping research from Substack, and generating thumbnail concepts. |
---
## 🎬 See It in Action
**Debugging Log Analyser** — paste a stack trace or error log, get a structured root cause diagnosis with probable cause, affected code path, a specific fix, and next debugging steps.
**PR Description Writer** — share your diff or commit list, get a reviewer-friendly PR description with summary, changes made, testing steps, and reviewer notes.
**Sprint Planning Skill** — paste your sprint goals and backlog items, get a complete structured sprint plan with capacity, commitments, risks, and a day-one kickoff agenda.
> 📹 Drop a demo in [Discussions](../../discussions) and we'll feature it here.
---
## 🤖 Building Blocks for Agent Templates
On May 5, 2026, Anthropic [released their first agent templates](https://www.anthropic.com/news/finance-agents) — pre-packaged Claude agents that combine **skills, connectors, and subagents** into ready-to-run workflows for financial services.
This library is the largest open-source collection of professional skills available — covering 17 professions beyond financial services. **The 167 skills here are the building blocks for agent templates outside of finance.**
### What is an agent template?
An agent template packages three things into one runnable workflow:
| Component | What it is | Example from this library |
|---|---|---|
| **Skills** | Markdown files that teach Claude how to produce structured professional outputs | `sprint-planning`, `contract-review`, `investor-update` |
| **Connectors** | Governed access to your team's data sources | Linear, Jira, Slack, Google Drive, Notion |
| **Subagents** | Focused Claude models for sub-tasks within the larger workflow | Capacity analyst, risk scorer, comparables selector |
A skill alone gives Claude a structured output format. An agent template gives Claude a complete workflow — pulling data, running specialised analysis, producing the output, and routing it where it needs to go.
### How to use this library to build your own agent template
Pick a recurring workflow on your team. Identify which existing skills cover the structured outputs that workflow needs. Add the connectors that let Claude reach the data. Add subagents for the analytical sub-tasks. That's the template.
Examples of agent templates this library supports:
| Template | Skills used | Connectors needed | Subagents |
|---|---|---|---|
| **PM Sprint Agent** | sprint-planning, sprint-brief, retro-analysis, project-status-report | Linear or Jira, Slack | Capacity analyst, risk scorer |
| **Legal Contract Review Agent** | contract-review, nda-analyser, compliance-checklist | Google Drive or SharePoint | Clause-by-clause risk scorer |
| **PM Discovery Agent** | discovery-interview-guide, user-interview-synthesis, assumption-mapper | Granola or Otter, Notion | Theme synthesiser |
| **Sales Pursuit Agent** | sales-battlecard, discovery-call-prep, proposal-writer, account-plan | Salesforce or HubSpot, Gong | Competitive intel analyst |
| **HR Onboarding Agent** | onboarding-plan, job-description-writer, change-management-plan | Workday or BambooHR, Slack | First-week scheduler |
| **Finance Board Pack Agent** | investor-update, board-deck-narrative, financial-model-narrative | NetSuite or Xero, Google Drive | KPI variance analyst |
| **Marketing Launch Agent** | go-to-market, content-calendar, email-campaign, media-pitch | HubSpot, Notion | Channel strategist |
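One way to picture a row of this table is as a small manifest plus a check that the skills it names are actually installed. The object shape and the `missingSkills` helper below are hypothetical illustrations, not the format used by the templates in this repository; skill names follow the directory names listed in the Plugin Directory.

```javascript
// Hypothetical manifest shape; the real templates under ./templates may differ.
const pmSprintAgent = {
  name: "PM Sprint Agent",
  skills: ["sprint-planning", "sprint-brief", "retro-analysis", "project-status-report"],
  connectors: ["Linear or Jira", "Slack"],
  subagents: ["Capacity analyst", "Risk scorer"],
};

// Return the skills a template needs that are not installed locally.
function missingSkills(template, installedSkills) {
  const installed = new Set(installedSkills);
  return template.skills.filter((skill) => !installed.has(skill));
}

// With only two of the four skills installed, the other two are reported.
const gaps = missingSkills(pmSprintAgent, ["sprint-planning", "sprint-brief"]);
console.log(gaps); // [ 'retro-analysis', 'project-status-report' ]
```

Listing skills, connectors, and subagents as plain data keeps the template easy to review before any orchestration code is written.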
### Available agent templates
The pm-claude-skills library now includes four working agent templates, each built from existing skills in this library combined with subagents and connectors. All four follow the architecture Anthropic introduced for [financial services agent templates](https://www.anthropic.com/news/finance-agents) on May 5, 2026.
| Template | What it does | Skills used | Connectors | Time saved |
|---|---|---|---|---|
| **[PM Sprint Agent](./templates/pm-sprint-agent/)** | End-to-end sprint planning — pulls backlog, calculates capacity, drafts plan, scores risks | sprint-planning, sprint-brief | Linear, Jira | 90 min → 90 sec |
| **[PM Discovery Agent](./templates/pm-discovery-agent/)** | Customer discovery synthesis — reads interview notes, finds themes, scores assumption confidence | user-interview-synthesis, job-story-mapper | Notion, Google Drive | 1 day → 5 min |
| **[PM Stakeholder Comms Agent](./templates/pm-stakeholder-comms-agent/)** | Audience-tailored stakeholder updates — exec, investor, cross-functional, or board | executive-update, investor-update, stakeholder-update, board-deck-narrative | Linear, Jira, Google Drive | 90 min → 1 min |
| **[PM Launch Agent](./templates/pm-launch-agent/)** | End-to-end launch coordination — content for every channel, calendar, metrics, checklist | go-to-market, content-calendar, media-pitch, email-campaign, launch-checklist | Notion (optional) | 4-6 hours → 3 min |
Each template includes:
- Working orchestration script
- Two or more focused subagents
- Connector configurations with documented setup
- Working examples (input + output)
- Smoke test for verifying installations
### How to install a template
All templates are part of the main library — installing the marketplace gives you all four.
/plugin marketplace add mohitagw15856/pm-claude-skills
Then navigate to the template you want and follow its README:
cd templates/pm-sprint-agent # or pm-discovery-agent, etc.
cat README.md # full setup instructions
### Building your own template
If you want to build a template for a workflow not covered above — Legal Contract Review, Sales Pursuit, Finance Board Pack, HR Onboarding, Marketing Campaign — see the [template contribution guide](./templates/CONTRIBUTING.md).
The pattern is consistent: pick a multi-step workflow, identify which existing skills cover the structured outputs, add connectors for data access, and define subagents for specialised analysis. The four templates above are reference implementations.
The PM Sprint Agent, for example, combines four skills, two connectors, and two subagents into a single workflow that handles end-to-end sprint planning.
Documentation, working orchestration script, and example outputs are included in the template folder.
More templates will follow. If you want to contribute one, see the [template contribution guide](./templates/CONTRIBUTING.md).
---
## 📋 Changelog
<details>
<summary><strong>Release history — v6.0.0 → v14.0.0</strong> (click to expand)</summary>
### 🆕 What's New in v14.0.0 — Writers & Content Creators + 7 Community Skills
**12 new community-inspired skills across 4 bundles:**
### New profession: ✍️ Writers & Content Creators (`pm-writers`)
| Skill | What It Does |
|---|---|
| **Instagram Post Downloader** 🆕 | Downloads Instagram images and carousels as high-res files; stitches carousel slides into a single PDF |
| **AEO Optimizer** 🆕 | Restructures articles for AI citation: rewrites H2s as questions, adds 50–80 word answer capsules, audits paragraph length and trust signals |
| **Thumbnail Creator** 🆕 | Generates brand-aligned thumbnail candidates via Gemini API from article copy; Claude evaluates results via computer vision |
| **Substack Notes Scraper** 🆕 | Scrapes Substack Notes engagement data (likes, comments, restacks) and exports a formatted .xlsx with filters and conditional formatting |
| **Notes Humanizer** 🆕 | Strips AI writing patterns (em dashes, filler phrases, uniform rhythm) and injects genuine human signals — opinion, varied rhythm, specific detail |
### Extended: `pm-cross` (+3 skills)
| Skill | What It Does |
|---|---|
| **Sycophancy Challenger** 🆕 | Flips Claude's default — argues the strongest case *against* your idea first, holds its position under pushback, and only backs down with new evidence |
| **Last 30 Days Research** 🆕 | Searches Reddit, X, and the web for the last 30 days on any topic and returns a structured report: consensus, disagreements, pain points, and signal confidence |
| **NotebookLM Connector** 🆕 | Automates NotebookLM from Claude Code via Chrome extension — create notebooks, add sources, generate mindmaps and audio overviews |
### Extended: `pm-operations` (+2 skills)
| Skill | What It Does |
|---|---|
| **Email Triage** 🆕 | Reads Gmail for a configurable window, filters out receipts/notifications, and surfaces only what needs a reply or decision — with priority, urgency, and a reply starter |
| **Morning Intelligence** 🆕 | 15-question interview that writes a personalised master prompt for your morning news brief, ready to drop into a Cowork Scheduled Task or Claude Code Routine |
### Extended: `pm-engineering` (+2 skills — for Claude Code users)
| Skill | What It Does |
|---|---|
| **Context Mode** 🆕 | Solves Claude Code context bloat and memory loss — filters raw command output and maintains a session log so Claude resumes exactly where it left off after a reset |
| **Claude Superpowers** 🆕 | Forces Claude Code to plan before coding, work in isolation, write tests first, and review its own work twice — from 60% first pass to 80%+ |
The library now includes **167 skills** across **18 professions** + 4 working agent templates.
---
### 🆕 What's New in v13.0.0 — Social Media Profession
**5 new skills — a complete Social Media profession bundle:**
| Skill | Bundle | What It Does |
|---|---|---|
| **Social Media Audit** 🆕 | pm-social | Scored audit across all platforms — profile completeness, content performance, competitive benchmarking, and a prioritised action plan |
| **Influencer Brief** 🆕 | pm-social | Complete creator partnership brief with deliverables, creative guidelines, approval workflow, commercial terms, and campaign measurement |
| **Community Management Playbook** 🆕 | pm-social | Response frameworks, moderation rules, escalation tiers, DM templates, tone-of-voice guidance, and community health metrics |
| **Social Ad Campaign** 🆕 | pm-social | Full-funnel paid social campaign plan with audience targeting, ad set architecture, copy for every format (video, static, carousel, lead gen), budget allocation, and A/B testing plan |
| **Viral Content Framework** 🆕 | pm-social | Psychology of sharing, 6 proven hook formulas, 5 content structures, platform-specific playbooks for LinkedIn/TikTok/Instagram/X/YouTube, and a repeatable content testing system |
The library now includes **155 skills** across **17 professions** + 4 working agent templates.
Install the new bundle:
`claude plugin install pm-social@pm-claude-skills`
---
### 🆕 What's New in v12.0.0 — 150 Skills Milestone
**15 new skills across 10 bundles:**
| Skill | Bundle | What It Does |
|---|---|---|
| **Cohort Analysis** 🆕 | pm-data | Retention curves, LTV projection, behavioural segmentation, and churn leading indicators — with SQL reference queries |
| **Data Pipeline Spec** 🆕 | pm-data | ETL/ELT pipeline design with sources, transforms, SLAs, DQ rules, error handling, and security/compliance notes |
| **Renewal Playbook** 🆕 | pm-cs | Renewal brief with health snapshot, stakeholder map, value story, commercial scenarios, objection responses, and a 16-week timeline |
| **Customer Success Plan** 🆕 | pm-cs | Joint success plan with business goals, success metrics, milestone roadmap, mutual commitments, and escalation path |
| **360-Degree Feedback Template** 🆕 | pm-people | Either a complete survey instrument with GWT acceptance criteria, or a structured narrative feedback report with themes and development actions |
| **Team Health Check** 🆕 | pm-people | Spotify-model health assessment across 7 dimensions — delivery, safety, morale, speed, purpose, and collaboration — with facilitation guide |
| **Risk Register** 🆕 | pm-operations | L×I risk scoring, RAG heat map, top-risk executive summary, and per-risk mitigation and contingency plans |
| **RACI Matrix** 🆕 | pm-operations | Complete RACI with role definitions, decision map, anti-pattern guide, and a communication template for all involved teams |
| **Social Media Strategy** 🆕 | pm-gtm | Audience profile, platform rationale, content pillars, posting cadence, tone of voice, KPIs, and a 4-week starter calendar |
| **Product Positioning Doc** 🆕 | pm-gtm | April Dunford-style positioning doc with category, target customer, competitive alternatives, differentiation, proof points, messaging hierarchy, and persona messaging |
| **Customer Journey Map** 🆕 | pm-discovery | Stage-by-stage journey from awareness to advocacy with touchpoints, emotions, pain points, an emotion curve, and prioritised opportunities |
| **User Story Writer** 🆕 | pm-delivery | Production-ready user stories with Given/When/Then ACs, edge cases, out-of-scope, definition of done, and epic decomposition |
| **AI Ethics Review** 🆕 | pm-advanced | Structured ethical review covering fairness, bias, transparency, privacy, safety, accountability, and societal impact — with risk tier and pre-deployment checklist |
| **Partnership Proposal** 🆕 | pm-sales | B2B partnership proposal with mutual value, commercial model, joint GTM plan, governance, and risks |
| **Design System Audit** 🆕 | pm-design | Component coverage audit, token consistency, documentation quality, WCAG 2.2 accessibility, adoption barriers, and a remediation roadmap |
The library now includes **150 skills** across **16 professions** + 4 working agent templates.
---
### 🆕 What's New in v10.0.0
**Two star milestones unlocked — 8 new skills shipped:**
**Customer Success bundle (250 ⭐ milestone):**
| Skill | Bundle | What It Does |
|---|---|---|
| **Customer Health Scorecard** 🆕 | pm-cs | Weighted health score across adoption, engagement, outcomes, support, and commercial — with RAG status and renewal forecast |
| **QBR Deck** 🆕 | pm-cs | Slide-by-slide quarterly business review structure with talking points, value narrative, and mutual commitments |
| **Escalation Brief** 🆕 | pm-cs | Structured escalation brief for at-risk accounts — root cause, business impact, resolution plan, and decision required |
| **Churn Analysis** 🆕 | pm-cs | Churn rate breakdown by category and segment, early warning signals, and prioritised interventions |
**Engineering expansion (500 ⭐ milestone):**
| Skill | Bundle | What It Does |
|---|---|---|
| **CI/CD Playbook** 🆕 | pm-engineering | Complete pipeline playbook covering every stage, rollback procedures, secrets management, and on-call responsibilities |
| **SLO & Error Budget** 🆕 | pm-engineering | SLI definitions, SLO targets, error budget calculation, burn rate alerts, and error budget policy |
| **Developer Onboarding Doc** 🆕 | pm-engineering | Everything a new engineer needs in their first week — architecture, local setup, testing, deployment, and key contacts |
| **On-Call Runbook** 🆕 | pm-engineering | Per-alert response procedures, escalation matrix, diagnostic cheat sheet, and handoff template |
The library now includes **114 skills** across **16 professions** + 4 working agent templates.
| Skill | Bundle | What It Does |
|---|---|---|
| **Debugging Log Analyser** 🆕 | pm-engineering | Parse stack traces and error logs into a structured root cause diagnosis with a specific fix |
| **PR Description Writer** 🆕 | pm-engineering | Write reviewer-friendly PR descriptions from a diff, commit list, or change summary |
| **System Design Interview** 🆕 | pm-engineering | Structure complete system design answers with capacity estimates, component deep-dives, and trade-offs |
| **Changelog Generator** 🆕 | pm-engineering | Convert git commits into a polished, user-facing changelog following Keep a Changelog format |
| **Test Strategy Doc** 🆕 | pm-engineering | Write a complete test strategy with risk assessment, test types, coverage targets, and P0/P1 test cases |
| **Runbook Writer** 🆕 | pm-engineering | Write operational runbooks for deployments, incidents, and maintenance with exact commands and rollback steps |
The `pm-engineering` bundle now has **10 skills** — the most complete engineering toolkit in the library.
**Read the full story:** [Part 14 — I Rebuilt All 93 Skills and Added 7 More: What 100 Skills Taught Me About What Makes a Great Skill](https://medium.com/product-powerhouse/a-pull-request-made-me-rebuild-all-93-of-my-claude-skills-then-i-added-7-more-16d5fe3e7f85)
---
### 📖 v6.0.0 — 100 Skills Milestone
**7 skills added:**
| Skill | Bundle | What It Does |
|---|---|---|
| **Teaching Lesson Plan** | pm-cross | Structured lesson plans for any subject, audience, or setting — with objectives, activities, and formative assessment |
| **SEO Content Brief** | pm-gtm | Complete SEO briefs with search intent analysis, competitor gaps, content outline, and on-page requirements |
| **Media Pitch** | pm-gtm | Story-first journalist pitches with angle development framework and pitch rules |
| **Change Management Plan** | pm-hr | Full change plan covering stakeholder analysis, communication strategy, training, and adoption metrics |
| **Workshop Facilitation Guide** | pm-operations | Complete facilitation guides with activity instructions, decision protocols, and facilitator moves |
| **Sales Forecasting Model** | pm-sales | Pipeline-based forecast with stage model, scenario analysis, assumption log, and activity sanity check |
| **Tax Planning Checklist** | pm-finance | Year-end tax planning review framework across income, pension, CGT, business reliefs, and ISAs |
</details>
---
@@ -50,212 +388,367 @@ This repo was built alongside a published article series. Read the full story:
| Part 7 | 33 Claude Skills for PMs Are Now in the Claude Code Marketplace | [Read →](https://medium.com/product-powerhouse/33-claude-skills-for-pms-are-now-in-the-claude-code-marketplace-heres-how-to-install-them-7968ab6bb1e1) |
| Part 8 | I Added 20 New Claude Skills Beyond Product Management | [Read →](https://medium.com/product-powerhouse/i-built-20-new-claude-skills-for-every-profession-heres-the-full-library-50278e00bf72) |
| Part 9 | 80 Claude Skills for Every Profession — Lawyers, Doctors, Finance, HR, Sales and More | [Read →](https://medium.com/@mohit15856/80-claude-skills-for-every-profession-lawyers-doctors-finance-hr-sales-and-more-3dfde9ec0033) |
| Part 10 | A Day in the Life With 80 Claude Skills — What Actually Gets Triggered | [Read →](https://medium.com/@mohit15856/80-claude-skills-for-every-profession-lawyers-doctors-finance-hr-sales-and-more-3dfde9ec0033)|
| Part 11 | 10 Figma Claude Skills for PMs and Designers — The Complete Figma Toolkit | *Latest — Link TBC* |
| Part 10 | A Day in the Life With 80 Claude Skills | [Read →](https://medium.com/@mohit15856/a-day-in-the-life-with-80-claude-skills-what-actually-gets-triggered-7caf9f5c159e) |
| Part 11 | 10 Figma Claude Skills for PMs and Designers | [Read →](https://medium.com/@mohit15856/10-figma-claude-skills-for-pms-and-designers-the-complete-figma-toolkit-784441d07a78)|
| Part 12 | I Built the Same Skills Library for ChatGPT — Here's What's Different | [Read →](https://medium.com/product-powerhouse/i-built-the-same-skills-library-for-chatgpt-heres-what-s-different-a9305f9c20b9) |
| Part 13 | I Re-Tested My 90 Claude Skills on Opus 4.7 — Here's What Got Better | [Read →](https://medium.com/all-about-claude/i-re-tested-my-90-claude-skills-on-opus-4-7-heres-what-actually-got-better-dd4b9369329e)|
| Part 14 | I Rebuilt All 93 Skills and Added 7 More: What 100 Skills Taught Me About What Makes a Great Skill | [Read →](https://medium.com/product-powerhouse/a-pull-request-made-me-rebuild-all-93-of-my-claude-skills-then-i-added-7-more-16d5fe3e7f85) |
| Part 15 | I'm a Product Manager. I Just Shipped 6 Engineering Skills to My Open-Source Claude Library. | [Read →](https://medium.com/product-powerhouse/im-a-product-manager-i-just-shipped-6-engineering-skills-to-my-open-source-claude-library-8745aaa2ecf9) |
| Part 16 | Anthropic Just Released 10 Agent Templates. Here's the First One I Built Using My 106 Skills. | [Read →](https://medium.com/product-powerhouse/anthropic-just-released-10-agent-templates-heres-the-first-one-i-built-using-my-106-skills-a6708f9bd3ea) |
---
## 🗂️ All 90 Skills
## 🗂️ All 167 Skills
### 🛠️ Product Management (Skills 1–33)
The [Plugin Directory](#-plugin-directory) above summarises every bundle. Expand below for the full per-skill breakdown with folder paths.
<details>
<summary><strong>Browse all 167 skills by profession</strong> (click to expand)</summary>
### 🛠️ Product Management (Skills 1–37)
**Bundles:** `pm-essentials` · `pm-discovery` · `pm-planning` · `pm-delivery` · `pm-analytics` · `pm-strategy` · `pm-advanced` · `pm-rituals`
> The original toolkit covering the full PM lifecycle — discovery, prioritisation, delivery, strategy, stakeholder comms, and weekly rituals.
> The original toolkit covering the full PM lifecycle — discovery, prioritisation, delivery, strategy, stakeholder comms, and weekly rituals. Now includes Word tracked changes and PowerPoint slide auditing.
| # | Bundle | Skills Included |
|---|---|---|
| 1–5 | **pm-essentials** | PRD Template, Meeting Notes, Stakeholder Update, User Research Synthesis, Competitive Analysis |
| 6–9 | **pm-discovery** | Discovery Interview Guide, Job Story Mapper, User Interview Synthesis, Assumption Mapper |
| 10–15 | **pm-planning** | OKR Builder, Feature Prioritisation (RICE/MoSCoW/Kano/ICE), Roadmap Presentation, Pricing Strategy |
| 16–22 | **pm-delivery** | Sprint Planning, Technical Spec, A/B Test Planner, Go-to-Market Planner, Launch Checklist, Sprint Brief, Retro |
| 23–25 | **pm-analytics** | Data Analysis Standard, Retention Analysis, Product Health Analysis |
| 26–31 | **pm-strategy** | Competitor Signal Tracker, Competitive Intelligence Monitor, Stakeholder Influence Mapper, Strategic Narrative, Executive Update, Ambiguity Resolver |
| 32–33 | **pm-advanced** | AI Product Canvas, Multi-Source Signal Synthesiser, Experiment Designer, Design Handoff Brief |
| 1–6 | **pm-essentials** | PRD Template, Meeting Notes, Stakeholder Update, User Research Synthesis, Competitive Analysis, **Word Doc Tracked Changes** |
| 7–11 | **pm-discovery** | Discovery Interview Guide, Job Story Mapper, User Interview Synthesis, Assumption Mapper, **Customer Journey Map** 🆕 |
| 12–17 | **pm-planning** | OKR Builder, Feature Prioritisation (RICE/MoSCoW/Kano/ICE), Roadmap Presentation, Pricing Strategy, RICE Impact Matrix, Roadmap Narrative |
| 18–26 | **pm-delivery** | Sprint Planning, Technical Spec, A/B Test Planner, Go-to-Market Planner, Launch Checklist, Sprint Brief, Retro, PPTX Slide Auditor, **User Story Writer** 🆕 |
| 27–29 | **pm-analytics** | Data Analysis Standard, Retention Analysis, Product Health Analysis |
| 30–35 | **pm-strategy** | Competitor Signal Tracker, Competitive Intelligence Monitor, Stakeholder Influence Mapper, Strategic Narrative, Executive Update, Ambiguity Resolver |
| 36–37 | **pm-advanced** | AI Product Canvas, Multi-Source Signal Synthesiser, Experiment Designer, Design Handoff Brief, **AI Ethics Review** 🆕 |
> See [Part 7 article](https://medium.com/product-powerhouse/33-claude-skills-for-pms-are-now-in-the-claude-code-marketplace-heres-how-to-install-them-7968ab6bb1e1) for full PM skills detail.
---
### 📣 Marketing & GTM (Skills 34–37)
### 📣 Marketing & GTM (Skills 38–45)
**Bundle:** `pm-gtm`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 34 | **Go-To-Market** | `skills/go-to-market/` | Positioning statements, messaging pillars, feature/benefit mapping, role-specific use cases |
| 35 | **Content Calendar** | `skills/content-calendar/` | Multi-channel content calendars with opening hooks, formats, and repurposing map |
| 36 | **Competitor Teardown** | `skills/competitor-teardown/` | Full competitive analysis: positioning map, feature comparison, messaging gaps, SWOT, recommendations |
| 37 | **Email Campaign** | `skills/email-campaign/` | Sequenced email campaigns with subject lines, preview text, body copy, and CTAs |
| 38 | **Go-To-Market** | `skills/go-to-market/` | Positioning statements, messaging pillars, feature/benefit mapping, role-specific use cases |
| 39 | **Content Calendar** | `skills/content-calendar/` | Multi-channel content calendars with opening hooks, formats, and repurposing map |
| 40 | **Competitor Teardown** | `skills/competitor-teardown/` | Full competitive analysis: positioning map, feature comparison, messaging gaps, SWOT, recommendations |
| 41 | **Email Campaign** | `skills/email-campaign/` | Sequenced email campaigns with subject lines, preview text, body copy, and CTAs |
| 42 | **SEO Content Brief** | `skills/seo-content-brief/` | Complete SEO briefs with search intent, competitor gap analysis, content outline, and on-page requirements |
| 43 | **Media Pitch** | `skills/media-pitch/` | Story-first journalist pitches with angle development framework and pitch writing rules |
| 44 | **Social Media Strategy** 🆕 | `skills/social-media-strategy/` | Audience profile, platform rationale, content pillars, posting cadence, KPIs, and a 4-week starter calendar |
| 45 | **Product Positioning Doc** 🆕 | `skills/product-positioning-doc/` | April Dunford-style positioning with category, differentiation, proof points, messaging hierarchy, and persona messaging |
---
### 👩‍💻 Engineering & Tech (Skills 38–41)
### 👩‍💻 Engineering & Tech (Skills 46–80, 166–167)
**Bundle:** `pm-engineering`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 38 | **Code Review Checklist** | `skills/code-review-checklist/` | Tailored PR review checklists by language, type, and risk level |
| 39 | **Incident Postmortem** | `skills/incident-postmortem/` | Blameless postmortems with timeline, RCA, impact, and action items |
| 40 | **API Docs Writer** | `skills/api-docs-writer/` | Developer-facing API docs: endpoints, parameters, response schemas, code examples |
| 41 | **Architecture Decision Record** | `skills/architecture-decision-record/` | ADRs with context, options considered, decision, consequences, and risks |
| 46 | **Code Review Checklist** | `skills/code-review-checklist/` | Tailored PR review checklists by language, type, and risk level |
| 47 | **Incident Postmortem** | `skills/incident-postmortem/` | Blameless postmortems with timeline, RCA, impact, and action items |
| 48 | **API Docs Writer** | `skills/api-docs-writer/` | Developer-facing API docs: endpoints, parameters, response schemas, code examples |
| 49 | **Architecture Decision Record** | `skills/architecture-decision-record/` | ADRs with context, options considered, decision, consequences, and risks |
| 50 | **Debugging Log Analyser** | `skills/debugging-log-analyser/` | Parse stack traces and error logs into a structured root cause diagnosis with a specific fix |
| 51 | **PR Description Writer** | `skills/pr-description-writer/` | Write reviewer-friendly PR descriptions from a diff, commit list, or change summary |
| 52 | **System Design Interview** | `skills/system-design-interview/` | Structure complete system design answers with capacity estimates, component deep-dives, and trade-offs |
| 53 | **Changelog Generator** | `skills/changelog-generator/` | Convert git commits into a polished, user-facing changelog following Keep a Changelog format |
| 54 | **Test Strategy Doc** | `skills/test-strategy-doc/` | Write a complete test strategy with risk assessment, test types, coverage targets, and P0/P1 test cases |
| 55 | **Runbook Writer** | `skills/runbook-writer/` | Write operational runbooks for deployments, incidents, and maintenance with exact commands and rollback steps |
| 56 | **CI/CD Playbook** | `skills/cicd-playbook/` | Complete pipeline playbook covering every stage, rollback procedures, secrets management, and on-call responsibilities |
| 57 | **SLO & Error Budget** | `skills/slo-error-budget/` | SLI definitions, SLO targets, error budget calculation, burn rate alerts, and error budget policy |
| 58 | **Developer Onboarding Doc** | `skills/developer-onboarding-doc/` | Everything a new engineer needs in their first week — architecture, local setup, testing, deployment, and key contacts |
| 59 | **On-Call Runbook** | `skills/oncall-runbook/` | Per-alert response procedures, escalation matrix, diagnostic cheat sheet, and handoff template |
| 60 | **Security Threat Model** 🆕 | `skills/security-threat-model/` | STRIDE-based threat model with asset register, trust boundaries, per-component threat enumeration, risk scores, and mitigations |
| 61 | **Performance Budget** 🆕 | `skills/performance-budget/` | Performance budgets for Core Web Vitals and backend latency SLOs with CI enforcement and breach response policy |
| 62 | **Database Schema Design** 🆕 | `skills/database-schema-design/` | Database schema documentation with ER diagram, DDL definitions, index strategy, and access pattern analysis |
| 63 | **Database Migration Plan** 🆕 | `skills/database-migration-plan/` | Safe zero-downtime migration plan using expand-contract pattern with per-step rollback and data validation queries |
| 64 | **Technical Debt Register** 🆕 | `skills/technical-debt-register/` | Debt inventory with business impact scoring, effort estimates, priority matrix, and quarterly resolution roadmap |
| 65 | **RFC Writer** 🆕 | `skills/rfc-writer/` | Engineering Request for Comments covering problem, proposed solution, alternatives-with-rejection-reasons, and rollout plan |
| 66 | **Capacity Planning** 🆕 | `skills/capacity-planning/` | Traffic forecasts, resource requirements per tier, scaling strategy, cost projections, and infrastructure action roadmap |
| 67 | **Load Testing Plan** 🆕 | `skills/load-testing-plan/` | Load test plan with scenario definitions (baseline/stress/spike/soak), k6/Locust skeleton, thresholds, and CI gates |
| 68 | **Disaster Recovery Plan** 🆕 | `skills/disaster-recovery-plan/` | DR plan with RPO/RTO targets, per-scenario runbooks, backup procedures, game day testing, and communication templates |
| 69 | **Feature Flag Guide** 🆕 | `skills/feature-flag-guide/` | Feature flag lifecycle playbook — taxonomy, rollout strategy, monitoring requirements, cleanup policy, and governance |
| 70 | **Dependency Audit** 🆕 | `skills/dependency-audit/` | Dependency audit for CVE vulnerabilities, license compliance, outdated packages, and 30-day remediation plan |
| 71 | **Service Catalog Entry** 🆕 | `skills/service-catalog-entry/` | Microservice catalog entry with ownership, SLAs, API contract, data classification, and operational runbook links |
| 72 | **Monitoring Setup Guide** 🆕 | `skills/monitoring-setup-guide/` | Four golden signals applied to a service, alert rules spec, structured log schema, tracing setup, and dashboard layout |
| 73 | **Local Dev Setup** 🆕 | `skills/local-dev-setup/` | Local development setup guide — prerequisites, env vars, dependencies, test commands, and 5 common failure fixes |
| 74 | **API Versioning Strategy** 🆕 | `skills/api-versioning-strategy/` | API versioning scheme, lifecycle policy, breaking change classification table, deprecation process, and migration guide template |
| 75 | **Infra-as-Code Review** 🆕 | `skills/infra-as-code-review/` | IaC review for Terraform/CloudFormation/Pulumi — security, naming, state, cost, and drift risk with severity-classified findings |
| 76 | **Engineering Weekly Report** 🆕 | `skills/engineering-weekly-report/` | Weekly engineering status in a consistent format — shipped/in-progress/blocked, metrics, decisions, risks, and next week |
| 77 | **Tech Radar** 🆕 | `skills/tech-radar/` | ThoughtWorks-format technology radar with Adopt/Trial/Assess/Hold quadrants, per-blip rationale, and maintenance process |
| 78 | **Sprint Velocity Analysis** 🆕 | `skills/sprint-velocity-analysis/` | Velocity trend analysis, completion rate patterns, blocker frequency, improvement recommendations, and capacity forecast |
| 79 | **Microservices Decomposition** 🆕 | `skills/microservices-decomposition/` | Domain-driven service boundary design with bounded context map, communication patterns, data ownership, and strangler fig migration plan |
| 80 | **Engineering Hiring Rubric** 🆕 | `skills/engineering-hiring-rubric/` | Technical interview rubric with level expectations, coding scorecard, system design guide, behavioural question bank, and debrief template |
| 166 | **Context Mode** 🆕 | `skills/context-mode/` | Filters command output noise and maintains a session log so Claude resumes exactly where it left off after a context reset |
| 167 | **Claude Superpowers** 🆕 | `skills/claude-superpowers/` | Forces Claude Code to plan first, work in isolation, write tests before code, and double-review its own output — consistently better first passes |
---
### 📊 Data & Analytics (Skills 42–44)
### 🤝 Customer Success (Skills 76–81)
**Bundle:** `pm-cs`
> Install:
> `claude plugin install pm-cs@pm-claude-skills`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 76 | **Customer Health Scorecard** | `skills/cs-health-scorecard/` | Weighted health score across adoption, engagement, outcomes, support, and commercial — RAG status and renewal forecast |
| 77 | **QBR Deck** | `skills/qbr-deck/` | Slide-by-slide quarterly business review with talking points, value narrative, and mutual commitments |
| 78 | **Escalation Brief** | `skills/cs-escalation-brief/` | Structured brief for at-risk accounts — root cause, business impact, resolution plan, and decision required |
| 79 | **Churn Analysis** | `skills/churn-analysis/` | Churn breakdown by category and segment, early warning signals, and prioritised interventions |
| 80 | **Renewal Playbook** 🆕 | `skills/renewal-playbook/` | Renewal brief with health snapshot, value story, commercial scenarios, objection responses, and 16-week execution timeline |
| 81 | **Customer Success Plan** 🆕 | `skills/customer-success-plan/` | Joint success plan with business goals, success metrics, milestone roadmap, mutual commitments, and escalation path |
---
### 📊 Data & Analytics (Skills 82–87)
**Bundle:** `pm-data`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 42 | **Metrics Framework** | `skills/metrics-framework/` | North Star + metric tree, dashboard tiers, counter-metrics |
| 43 | **SQL Query Explainer** | `skills/sql-query-explainer/` | Explain, optimise, write, and document SQL in plain English |
| 44 | **Dashboard Brief** | `skills/dashboard-brief/` | Complete dashboard spec: KPIs, charts, filters, layout, data requirements |
| 82 | **Metrics Framework** | `skills/metrics-framework/` | North Star + metric tree, dashboard tiers, counter-metrics |
| 83 | **SQL Query Explainer** | `skills/sql-query-explainer/` | Explain, optimise, write, and document SQL in plain English |
| 84 | **Dashboard Brief** | `skills/dashboard-brief/` | Complete dashboard spec: KPIs, charts, filters, layout, data requirements |
| 85 | **Chart Data Extractor** | `skills/chart-data-extractor/` | Extract pixel-level data from chart images into structured data tables |
| 86 | **Cohort Analysis** 🆕 | `skills/cohort-analysis/` | Retention curves, LTV projection, behavioural segmentation, churn leading indicators, and SQL reference queries |
| 87 | **Data Pipeline Spec** 🆕 | `skills/data-pipeline-spec/` | ETL/ELT pipeline design with sources, transforms, SLAs, DQ rules, error handling, and compliance notes |
---
### 🧑‍💼 Leadership & People (Skills 45–47)
### 🧑‍💼 Leadership & People (Skills 88–92)
**Bundle:** `pm-people`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 45 | **Performance Review** | `skills/performance-review/` | Structured reviews from bullet-point notes — self, manager, peer, and upward |
| 46 | **Hiring Rubric** | `skills/hiring-rubric/` | Interview scorecards with competencies, behavioural questions, and panel guide |
| 47 | **Team Offsite Planner** | `skills/team-offsite-planner/` | Full offsite agenda, session facilitation notes, and logistics checklist |
| 88 | **Performance Review** | `skills/performance-review/` | Structured reviews from bullet-point notes — self, manager, peer, and upward |
| 89 | **Hiring Rubric** | `skills/hiring-rubric/` | Interview scorecards with competencies, behavioural questions, and panel guide |
| 90 | **Team Offsite Planner** | `skills/team-offsite-planner/` | Full offsite agenda, session facilitation notes, and logistics checklist |
| 91 | **360-Degree Feedback Template** 🆕 | `skills/360-feedback-template/` | Survey instrument with GWT-anchored questions, or a structured narrative report with strengths and development themes |
| 92 | **Team Health Check** 🆕 | `skills/team-health-check/` | Spotify-model assessment across 7 dimensions — delivery, safety, morale, speed, purpose, and collaboration |
---
### 🎨 Design & UX (Skills 48–50)
### 🎨 Design & UX (Skills 93–96)
**Bundle:** `pm-design`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 48 | **UX Research Plan** | `skills/ux-research-plan/` | Research plans with screener, discussion guide, and synthesis framework |
| 49 | **Design Critique** | `skills/design-critique/` | Structured feedback using JTBD, Gestalt principles, and Nielsen's heuristics |
| 50 | **Accessibility Audit** | `skills/accessibility-audit/` | WCAG 2.2 audit with prioritised remediation and quick wins |
| 93 | **UX Research Plan** | `skills/ux-research-plan/` | Research plans with screener, discussion guide, and synthesis framework |
| 94 | **Design Critique** | `skills/design-critique/` | Structured feedback using JTBD, Gestalt principles, and Nielsen's heuristics |
| 95 | **Accessibility Audit** | `skills/accessibility-audit/` | WCAG 2.2 audit with prioritised remediation and quick wins |
| 96 | **Design System Audit** 🆕 | `skills/design-system-audit/` | Audit component coverage, token consistency, documentation quality, WCAG compliance, adoption barriers, and remediation roadmap |
---
### 🏢 Business & Strategy (Skills 51–53)
### 🏢 Business & Strategy (Skills 97–99)
**Bundle:** `pm-business`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 51 | **Investor Update** | `skills/investor-update/` | Monthly/quarterly investor updates: metrics, highlights, challenges, and asks |
| 52 | **Board Deck Narrative** | `skills/board-deck-narrative/` | Slide-by-slide board presentation structure with narrative beats and talking points |
| 53 | **Job Application** | `skills/job-application/` | Tailored CV summary, ATS keyword optimisation, and cover letter for any JD |
| 97 | **Investor Update** | `skills/investor-update/` | Monthly/quarterly investor updates: metrics, highlights, challenges, and asks |
| 98 | **Board Deck Narrative** | `skills/board-deck-narrative/` | Slide-by-slide board presentation structure with narrative beats and talking points |
| 99 | **Job Application** | `skills/job-application/` | Tailored CV summary, ATS keyword optimisation, and cover letter for any JD |
---
### ⚖️ Legal (Skills 54–57)
### ⚖️ Legal (Skills 100–103)
**Bundle:** `pm-legal`
> ⚠️ All legal skills include a disclaimer. Not a substitute for qualified legal advice.
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 54 | **Contract Review** | `skills/contract-review/` | Structured review with key terms, flagged clauses, risk rating, and plain English summary |
| 55 | **NDA Analyser** | `skills/nda-analyser/` | Clause-by-clause NDA analysis with risk flags and negotiation checklist |
| 56 | **Legal Brief** | `skills/legal-brief/` | Legal memos and argument outlines in IRAC format (Issue, Rule, Application, Conclusion) |
| 57 | **Compliance Checklist** | `skills/compliance-checklist/` | GDPR, SOC 2, ISO 27001, FCA, HIPAA compliance checklists with prioritised gap analysis |
| 100 | **Contract Review** | `skills/contract-review/` | Structured review with key terms, flagged clauses, risk rating, and plain English summary |
| 101 | **NDA Analyser** | `skills/nda-analyser/` | Clause-by-clause NDA analysis with risk flags and negotiation checklist |
| 102 | **Legal Brief** | `skills/legal-brief/` | Legal memos and argument outlines in IRAC format (Issue, Rule, Application, Conclusion) |
| 103 | **Compliance Checklist** | `skills/compliance-checklist/` | GDPR, SOC 2, ISO 27001, FCA, HIPAA compliance checklists with prioritised gap analysis |
---
### 💰 Finance (Skills 58–61)
### 💰 Finance (Skills 104–108)
**Bundle:** `pm-finance`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 58 | **Financial Model Narrative** | `skills/financial-model-narrative/` | Turns P&L and model outputs into board-ready written narratives |
| 59 | **Budget Variance Analysis** | `skills/budget-variance-analysis/` | Variance table with root cause commentary and management summary |
| 60 | **Investor Pitch Deck** | `skills/investor-pitch-deck/` | Slide-by-slide pitch deck structure with what each slide must prove |
| 61 | **Financial Due Diligence** | `skills/financial-due-diligence/` | DD document request list, analytical questions, and red flags checklist |
| 104 | **Financial Model Narrative** | `skills/financial-model-narrative/` | Turns P&L and model outputs into board-ready written narratives |
| 105 | **Budget Variance Analysis** | `skills/budget-variance-analysis/` | Variance table with root cause commentary and management summary |
| 106 | **Investor Pitch Deck** | `skills/investor-pitch-deck/` | Slide-by-slide pitch deck structure with what each slide must prove |
| 107 | **Financial Due Diligence** | `skills/financial-due-diligence/` | DD document request list, analytical questions, and red flags checklist |
| 108 | **Tax Planning Checklist** | `skills/tax-planning-checklist/` | Year-end tax planning framework across income, pension, CGT, business reliefs, and ISAs |
---
### 👥 HR (Skills 62–65)
### 👥 HR (Skills 109–113)
**Bundle:** `pm-hr`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 62 | **Job Description Writer** | `skills/job-description-writer/` | Inclusive, structured JDs with built-in language review and salary range nudge |
| 63 | **Onboarding Plan** | `skills/onboarding-plan/` | 30/60/90-day plans with week-by-week structure, milestones, and manager checklist |
| 64 | **Employee Engagement Survey** | `skills/employee-engagement-survey/` | Survey design + results analysis mode with eNPS and action planning template |
| 65 | **Redundancy Consultation** | `skills/redundancy-consultation/` | Process timeline, at-risk letter, consultation script, and confirmation letter — UK law |
| 109 | **Job Description Writer** | `skills/job-description-writer/` | Inclusive, structured JDs with built-in language review and salary range nudge |
| 110 | **Onboarding Plan** | `skills/onboarding-plan/` | 30/60/90-day plans with week-by-week structure, milestones, and manager checklist |
| 111 | **Employee Engagement Survey** | `skills/employee-engagement-survey/` | Survey design + results analysis mode with eNPS and action planning template |
| 112 | **Redundancy Consultation** | `skills/redundancy-consultation/` | Process timeline, at-risk letter, consultation script, and confirmation letter — UK law |
| 113 | **Change Management Plan** | `skills/change-management-plan/` | Full change plan covering stakeholder analysis, communication strategy, training, and adoption metrics |
---
### 🤝 Sales (Skills 66–69)
### 🤝 Sales (Skills 114–119)
**Bundle:** `pm-sales`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 66 | **Sales Battlecard** | `skills/sales-battlecard/` | One-page competitive battlecard with objection responses and landmine questions |
| 67 | **Discovery Call Prep** | `skills/discovery-call-prep/` | Call brief with research summary, hypothesis, structured questions, and success criteria |
| 68 | **Proposal Writer** | `skills/proposal-writer/` | Commercial proposals structured around the prospect's problem, not the product |
| 69 | **Account Plan** | `skills/account-plan/` | Strategic account plan with relationship map, whitespace analysis, risks, and 90-day actions |
| 114 | **Sales Battlecard** | `skills/sales-battlecard/` | One-page competitive battlecard with objection responses and landmine questions |
| 115 | **Discovery Call Prep** | `skills/discovery-call-prep/` | Call brief with research summary, hypothesis, structured questions, and success criteria |
| 116 | **Proposal Writer** | `skills/proposal-writer/` | Commercial proposals structured around the prospect's problem, not the product |
| 117 | **Account Plan** | `skills/account-plan/` | Strategic account plan with relationship map, whitespace analysis, risks, and 90-day actions |
| 118 | **Sales Forecasting Model** | `skills/sales-forecasting-model/` | Pipeline-based forecast with stage model, scenario analysis, assumption log, and activity sanity check |
| 119 | **Partnership Proposal** 🆕 | `skills/partnership-proposal/` | B2B partnership proposal with mutual value, commercial model, joint GTM plan, governance, and risk register |
---
### ⚙️ Operations (Skills 70–73)
### ⚙️ Operations (Skills 120–126, 164–165)
**Bundle:** `pm-operations`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 70 | **Process Documentation** | `skills/process-documentation/` | Clear process docs with steps, roles, edge cases — followable by a new starter |
| 71 | **SOP Writer** | `skills/sop-writer/` | Formal, audit-ready SOPs with version control, quality checks, and non-conformance process |
| 72 | **Vendor Evaluation** | `skills/vendor-evaluation/` | Weighted vendor scorecard, RFP questions, reference check template, and recommendation |
| 73 | **Project Status Report** | `skills/project-status-report/` | RAG status reports with milestone progress, issues, risks, and decisions required |
| 120 | **Process Documentation** | `skills/process-documentation/` | Clear process docs with steps, roles, edge cases — followable by a new starter |
| 121 | **SOP Writer** | `skills/sop-writer/` | Formal, audit-ready SOPs with version control, quality checks, and non-conformance process |
| 122 | **Vendor Evaluation** | `skills/vendor-evaluation/` | Weighted vendor scorecard, RFP questions, reference check template, and recommendation |
| 123 | **Project Status Report** | `skills/project-status-report/` | RAG status reports with milestone progress, issues, risks, and decisions required |
| 124 | **Workshop Facilitation Guide** | `skills/workshop-facilitation-guide/` | Complete facilitation guides with activity instructions, decision protocols, and facilitator moves |
| 125 | **Risk Register** 🆕 | `skills/risk-register/` | L×I risk scoring, RAG heat map, top-risk executive summary, and per-risk mitigation and contingency plans |
| 126 | **RACI Matrix** 🆕 | `skills/raci-matrix/` | RACI with role definitions, decision map, anti-pattern guide, and a communication template for all teams |
| 164 | **Email Triage** 🆕 | `skills/email-triage/` | Reads Gmail for a configurable window and surfaces only what needs action — priority-ranked with urgency ratings and reply starters |
| 165 | **Morning Intelligence** 🆕 | `skills/morning-intelligence/` | 15-question interview that writes a personalised master prompt for your daily news brief, ready for Cowork Scheduled Tasks or Claude Code Routines |
---
### 🏥 Research & Healthcare (Skills 74–77)
### 🏥 Research & Healthcare (Skills 127–130)
**Bundle:** `pm-research`
> ⚠️ Healthcare skills are for documentation and educational purposes only. All clinical content must be reviewed by a qualified professional.
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 74 | **Clinical Case Summary** | `skills/clinical-case-summary/` | SBAR handovers, SOAP notes, and case reports for educational and documentation use |
| 75 | **Research Protocol** | `skills/research-protocol/` | Complete study protocols with objectives, methodology, ethics, and analysis plan |
| 76 | **Patient Communication** | `skills/patient-communication/` | Plain English patient letters, leaflets, and results communications at Grade 6 reading level |
| 77 | **Literature Review** | `skills/literature-review/` | Thematically organised literature reviews with synthesis, critical analysis, and gap identification |
| 127 | **Clinical Case Summary** | `skills/clinical-case-summary/` | SBAR handovers, SOAP notes, and case reports for educational and documentation use |
| 128 | **Research Protocol** | `skills/research-protocol/` | Complete study protocols with objectives, methodology, ethics, and analysis plan |
| 129 | **Patient Communication** | `skills/patient-communication/` | Plain English patient letters, leaflets, and results communications at Grade 6 reading level |
| 130 | **Literature Review** | `skills/literature-review/` | Thematically organised literature reviews with synthesis, critical analysis, and gap identification |
---
### 🌐 Cross-Profession (Skills 78–80)
### 🌐 Cross-Profession (Skills 131–134, 161–163)
**Bundle:** `pm-cross`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 78 | **Press Release** | `skills/press-release/` | Journalist-ready press releases with headline rules, boilerplate, and journalist test |
| 79 | **Grant Proposal** | `skills/grant-proposal/` | Complete grant applications aligned to funder priorities with budget narrative |
| 80 | **Executive Summary** | `skills/executive-summary/` | Decision-ready executive summaries with bottom line upfront, adapted for any audience |
| 131 | **Press Release** | `skills/press-release/` | Journalist-ready press releases with headline rules, boilerplate, and journalist test |
| 132 | **Grant Proposal** | `skills/grant-proposal/` | Complete grant applications aligned to funder priorities with budget narrative |
| 133 | **Executive Summary** | `skills/executive-summary/` | Decision-ready executive summaries with bottom line upfront, adapted for any audience |
| 134 | **Teaching Lesson Plan** | `skills/teaching-lesson-plan/` | Complete lesson plans for any subject, audience, or setting — with objectives, activities, and formative assessment |
| 161 | **Sycophancy Challenger** 🆕 | `skills/sycophancy-challenger/` | Argues the strongest case *against* your idea first — a genuine thinking partner that holds its position under pressure |
| 162 | **Last 30 Days Research** 🆕 | `skills/last-30-days-research/` | Searches Reddit, X, and the web for the last 30 days on any topic and returns consensus, disagreements, pain points, and signal confidence |
| 163 | **NotebookLM Connector** 🆕 | `skills/notebooklm-connector/` | Automates NotebookLM via Chrome extension — create notebooks, add sources, generate mindmaps and audio overviews from Claude Code |
---
### 🖼️ Figma (Skills 81–90)
### 🖼️ Figma (Skills 135–144)
**Bundle:** `pm-figma`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 81 | **Figma Component Audit** | `skills/figma-component-audit/` | Audit component library for naming issues, coverage gaps, and variant completeness |
| 82 | **Figma Design Brief** | `skills/figma-design-brief/` | Convert PRDs and feature requests into structured Figma design briefs |
| 83 | **Figma Annotation Guide** | `skills/figma-annotation-guide/` | Generate complete developer handoff annotations covering all states and edge cases |
| 84 | **Figma Design Review** | `skills/figma-design-review/` | PM design review against requirements with explicit approval status |
| 85 | **Figma User Flow Planner** | `skills/figma-user-flow-planner/` | Map all screens, states, and decision points before opening Figma |
| 86 | **Figma Variant Matrix** | `skills/figma-variant-matrix/` | Define all component variants, properties, and states before building |
| 87 | **Figma Spacing System** | `skills/figma-spacing-system/` | Design a complete spacing scale, grid, and token system |
| 88 | **Figma Prototype Plan** | `skills/figma-prototype-plan/` | Plan prototype scope, interactions, and test task scripts for user testing |
| 89 | **Figma Design QA** | `skills/figma-design-qa/` | Pre-handoff QA checklist covering file hygiene, states, accessibility, and handoff readiness |
| 90 | **Figma Design Critique (PM)** | `skills/figma-design-critique-pm/` | PM-perspective design critique focused on product outcomes, not aesthetics |
| 135 | **Figma Component Audit** | `skills/figma-component-audit/` | Audit component library for naming issues, coverage gaps, and variant completeness |
| 136 | **Figma Design Brief** | `skills/figma-design-brief/` | Convert PRDs and feature requests into structured Figma design briefs |
| 137 | **Figma Annotation Guide** | `skills/figma-annotation-guide/` | Generate complete developer handoff annotations covering all states and edge cases |
| 138 | **Figma Design Review** | `skills/figma-design-review/` | PM design review against requirements with explicit approval status |
| 139 | **Figma User Flow Planner** | `skills/figma-user-flow-planner/` | Map all screens, states, and decision points before opening Figma |
| 140 | **Figma Variant Matrix** | `skills/figma-variant-matrix/` | Define all component variants, properties, and states before building |
| 141 | **Figma Spacing System** | `skills/figma-spacing-system/` | Design a complete spacing scale, grid, and token system |
| 142 | **Figma Prototype Plan** | `skills/figma-prototype-plan/` | Plan prototype scope, interactions, and test task scripts for user testing |
| 143 | **Figma Design QA** | `skills/figma-design-qa/` | Pre-handoff QA checklist covering file hygiene, states, accessibility, and handoff readiness |
| 144 | **Figma Design Critique (PM)** | `skills/figma-design-critique-pm/` | PM-perspective design critique focused on product outcomes, not aesthetics |
claude plugin install pm-figma@pm-claude-skills
---
### 📅 PM Rituals (Skills 145–150)
**Bundle:** `pm-rituals`
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 145 | **PM Weekly Review** | `skills/pm-weekly-review/` | Weekly PM review and planning ritual — metrics, shipping progress, blockers, and next week's priorities |
---
### 📱 Social Media (Skills 151–155)
**Bundle:** `pm-social`
> Install:
```
claude plugin install pm-social@pm-claude-skills
```
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 151 | **Social Media Audit** 🆕 | `skills/social-media-audit/` | Scored audit across all platforms — profile completeness, content performance, competitive benchmarking, and a prioritised action plan with 30-day quick wins |
| 152 | **Influencer Brief** 🆕 | `skills/influencer-brief/` | Complete creator partnership brief with deliverables spec, creative guidelines, key messages, approval workflow, commercial terms, and campaign measurement |
| 153 | **Community Management Playbook** 🆕 | `skills/community-management-playbook/` | Response frameworks, moderation rules, escalation tiers, DM templates, tone-of-voice guidance, platform-specific notes, and community health metrics |
| 154 | **Social Ad Campaign** 🆕 | `skills/social-ad-campaign/` | Full-funnel paid social plan with audience targeting, ad set architecture, copy for every format (video, static, carousel, lead gen), budget allocation, bidding strategy, and A/B testing plan |
| 155 | **Viral Content Framework** 🆕 | `skills/viral-content-framework/` | Psychology of sharing, 6 proven hook formulas, 5 content structures, platform-specific playbooks for LinkedIn/TikTok/Instagram/X/YouTube, and a repeatable content testing system |
---
### ✍️ Writers & Content Creators (Skills 156–160)
**Bundle:** `pm-writers`
> Install:
```
claude plugin install pm-writers@pm-claude-skills
```
| # | Skill | Folder | What It Does |
|---|---|---|---|
| 156 | **Instagram Post Downloader** 🆕 | `skills/instagram-post-downloader/` | Downloads Instagram images and full carousels as high-res files; stitches carousel slides into a single PDF. Requires `*.cdninstagram.com` on the domain allowlist |
| 157 | **AEO Optimizer** 🆕 | `skills/aeo-optimizer/` | Restructures any article for AI citation — rewrites H2s as questions, adds 50–80 word answer capsules under each, audits paragraph length, and flags trust signals |
| 158 | **Thumbnail Creator** 🆕 | `skills/thumbnail-creator/` | Generates brand-aligned thumbnail candidates via Gemini API; Claude evaluates results via computer vision and returns ranked candidates with rationale |
| 159 | **Substack Notes Scraper** 🆕 | `skills/substack-notes-scraper/` | Scrapes Substack Notes and exports likes, comments, and restacks to a formatted .xlsx with frozen headers, filters, and top-performer highlighting |
| 160 | **Notes Humanizer** 🆕 | `skills/notes-humanizer/` | Strips AI writing patterns (em dashes, filler phrases, uniform rhythm) across 3 phases: audit, strip, inject — returns side-by-side comparison and clean final text |
</details>
---
## ❤️ Sponsor This Work
Building and maintaining 167 skills across 26 bundles takes real time — testing skills against new model releases, building new ones from community requests, writing the article series, and keeping documentation current.
If these skills save you time at work, consider sponsoring:
**[💖 Become a Sponsor →](https://github.com/sponsors/mohitagw15856)**
Sponsorships from $5/month (coffee tier) up to $500/month (sustaining sponsor with logo placement). Every sponsor directly funds:
- New skills based on community votes in [SKILL_REQUEST.md](SKILL_REQUEST.md)
- Updates to existing skills when new Claude models ship
- Continued free, ad-free Medium articles documenting what works
- Quality improvements across the library
Higher tiers include custom skill development for your team, direct access for support, and logo placement in this README. See the [sponsor page](https://github.com/sponsors/mohitagw15856) for full tier details.
---
## 🤝 Contributing — Add Your Skill
This is an open-source community library. If you've built a skill that saves you time, share it here.
**Found a bug?** [Open a bug report →](../../issues/new?template=bug-report.md) — use the template so it's easy to triage.
**How to contribute:**
1. Fork this repo
@@ -284,16 +777,13 @@ description: "One sentence. Use when [trigger condition]. Produces [output descr
| Skill | Profession | Use Case |
|---|---|---|
| `teaching-lesson-plan` | Education | Structured lesson plans from curriculum objectives |
| `seo-content-brief` | Marketing | Content briefs with keyword strategy and outline |
| `grant-report` | Non-profit | Funder progress reports against grant objectives |
| `architectural-spec` | Architecture | Project specifications and technical drawing briefs |
| `media-pitch` | Journalism | Story pitches to editors and commissioning briefs |
| `clinical-guideline-summary` | Healthcare | Plain English summaries of clinical guidelines |
| `tax-planning-checklist` | Finance | Year-end tax planning checklist by entity type |
| `sales-forecasting-model` | Sales | Structured pipeline forecasting and commentary |
| `pitch-deck-feedback` | Startup | Investor-perspective critique of a pitch deck |
| `board-minutes` | Governance | Formal board meeting minutes from discussion notes |
Have a skill idea? [Open an issue](../../issues) or raise it in [Discussions](../../discussions).
Have a skill idea? Add it to [SKILL_REQUEST.md](SKILL_REQUEST.md), [open an issue](../../issues), or raise it in [Discussions](../../discussions). Most-voted requests get built first.
**Contributors** get credited in this README and in the article series. 🙌
@@ -304,38 +794,74 @@ Have a skill idea? [Open an issue](../../issues) or raise it in [Discussions](..
Install the whole library or just the bundles you need:
# Install everything
/plugin marketplace add https://github.com/mohitagw15856/pm-claude-skills
/plugin marketplace add mohitagw15856/pm-claude-skills
# Install by profession
claude plugin install pm-essentials@pm-claude-skills
claude plugin install pm-discovery@pm-claude-skills
claude plugin install pm-planning@pm-claude-skills
claude plugin install pm-delivery@pm-claude-skills
claude plugin install pm-analytics@pm-claude-skills
claude plugin install pm-strategy@pm-claude-skills
claude plugin install pm-advanced@pm-claude-skills
claude plugin install pm-rituals@pm-claude-skills
claude plugin install pm-gtm@pm-claude-skills
claude plugin install pm-engineering@pm-claude-skills
claude plugin install pm-engineering@pm-claude-skills # Engineering (35 skills)
claude plugin install pm-cs@pm-claude-skills # Customer Success
claude plugin install pm-data@pm-claude-skills
claude plugin install pm-people@pm-claude-skills
claude plugin install pm-design@pm-claude-skills
claude plugin install pm-business@pm-claude-skills
claude plugin install pm-legal@pm-claude-skills
claude plugin install pm-finance@pm-claude-skills
claude plugin install pm-hr@pm-claude-skills
claude plugin install pm-sales@pm-claude-skills
claude plugin install pm-operations@pm-claude-skills
claude plugin install pm-research@pm-claude-skills
claude plugin install pm-cross@pm-claude-skills
claude plugin install pm-figma@pm-claude-skills
claude plugin install pm-social@pm-claude-skills # Social Media 🆕
claude plugin install pm-writers@pm-claude-skills # Writers & Content Creators 🆕
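
If only a handful of bundles are needed, the per-bundle commands above can be wrapped in a small loop. The following is a minimal sketch, not part of the repository: `install_bundles`, `MARKETPLACE`, and the `DRY_RUN` switch are hypothetical names introduced here for illustration. By default it only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the install command for a chosen set of bundles.
# Assumption: MARKETPLACE, DRY_RUN, and install_bundles are illustrative
# names, not something this repo ships.
MARKETPLACE="pm-claude-skills"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = actually run them

install_bundles() {
  local bundle
  for bundle in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      # Show exactly what would be executed for this bundle.
      echo "claude plugin install ${bundle}@${MARKETPLACE}"
    else
      claude plugin install "${bundle}@${MARKETPLACE}"
    fi
  done
}

# Example: the three bundles a small ops/legal team might want.
install_bundles pm-legal pm-finance pm-hr
```

Running it with `DRY_RUN=0` would invoke the same `claude plugin install <bundle>@pm-claude-skills` form listed above, one bundle at a time.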
---
## 🤖 Companion Repository — ChatGPT Custom GPTs
If you use ChatGPT instead of Claude Code, there's a companion repo with the same professional frameworks built as Custom GPT system prompts:
**[professional-gpt-library](https://github.com/mohitagw15856/professional-gpt-library)** — 10 starter GPTs across 8 professions, MIT licence.
Read the full breakdown: [Part 12 — I Built the Same Skills Library for ChatGPT](https://medium.com/product-powerhouse/i-built-the-same-skills-library-for-chatgpt-heres-what-s-different-a9305f9c20b9)
---
## 🛠️ Custom Skills for Your Team
The 90 skills in this library are built for general professional workflows. But the most powerful version of Claude Skills is one built specifically for *your* team — your templates, your terminology, your processes, your quality standards.
The 155 skills in this library are built for general professional workflows. But the most powerful version of Claude Skills is one built specifically for *your* team — your templates, your terminology, your processes, your quality standards.
**What custom skills look like in practice:**
@@ -365,10 +891,21 @@ Learn more: [Anthropic's Skills documentation](https://code.claude.com/docs/en/s
---
## ⭐ If this is useful
## ⭐ Star Milestones
Star the repo so others can find it. And if you build something with these skills — raise a PR, open a discussion, or tag me in your article.
Stars unlock the next wave of skills. Here's the roadmap:
| Milestone | Unlocks | Status |
|---|---|---|
| 100 ⭐ | 10 Figma skills + quality rebuild across all 93 skills | ✅ Shipped (v6.0.0) |
| 250 ⭐ | 10 Customer Success skills (health scorecard, QBR deck, escalation brief, churn analysis) | ✅ Unlocked — coming in next release |
| 500 ⭐ | 25 Engineering skills (CI/CD playbooks, SLO templates, onboarding docs, debugging patterns, threat models, capacity planning, DR plans, and more) | ✅ Shipped — pm-engineering now 35 skills (v11.0.0) |
| 1000 ⭐ | Full Startup Founder kit (fundraising memo, pitch critique, co-founder equity split) | 🔒 Locked |
**[⭐ Star this repo to unlock the next milestone →](https://github.com/mohitagw15856/pm-claude-skills)**
Want a specific skill built? [Vote or request in SKILL_REQUEST.md](SKILL_REQUEST.md).
---
*Built and maintained by [Mohit Aggarwal](https://medium.com/@mohit15856) | [Product Notes publication](https://medium.com/product-powerhouse)*
*Built and maintained by [Mohit Aggarwal](https://medium.com/@mohit15856) | [Product Notes publication](https://medium.com/product-powerhouse) | [💖 Sponsor my work](https://github.com/sponsors/mohitagw15856)*
+65
@@ -0,0 +1,65 @@
# Skill Requests — Community Voting Board
Have an idea for a skill? Add it here or upvote existing requests by leaving a 👍 reaction on the issue.
---
## How to Request a Skill
1. [Open an issue](https://github.com/mohitagw15856/pm-claude-skills/issues/new) with the label `skill-request`
2. Include:
- **Skill name** (what you'd call it)
- **Profession** (who uses this)
- **Trigger** (when would you invoke it — e.g. "when I need to write X")
- **Output** (what should Claude produce)
3. Drop a 👍 on existing requests you'd use — most-voted get built first
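
The four fields listed above can be assembled into an issue body before filing. This is a rough sketch under stated assumptions: `skill_request_body` is a hypothetical helper, the field layout is one plausible format rather than a required template, and the commented `gh issue create` line assumes the GitHub CLI is installed and authenticated.

```shell
#!/usr/bin/env bash
# Sketch: build the text of a skill-request issue from the four fields.
# Assumption: skill_request_body is an illustrative helper, not part of the repo.
skill_request_body() {
  # Arguments: 1 = skill name, 2 = profession, 3 = trigger, 4 = output
  printf '**Skill name:** %s\n**Profession:** %s\n**Trigger:** %s\n**Output:** %s\n' \
    "$1" "$2" "$3" "$4"
}

skill_request_body "board-minutes" "Governance" \
  "when I need to write up a board meeting" \
  "Formal board meeting minutes from discussion notes"

# With the GitHub CLI available, the body could then be filed, for example:
#   gh issue create --repo mohitagw15856/pm-claude-skills \
#     --label skill-request --title "Skill request: board-minutes" \
#     --body "$(skill_request_body board-minutes Governance '...' '...')"
```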
---
## Milestone Unlocks
Stars drive the roadmap. Here's what's queued:
| Milestone | Unlocks |
|---|---|
| ✅ 100 ⭐ | 10 Figma skills + quality rebuild across all skills (v6.0.0) |
| 🔒 250 ⭐ | 10 Customer Success skills (health scorecard, QBR deck, escalation brief, churn analysis) |
| 🔒 500 ⭐ | 25 more Engineering skills (CI/CD playbooks, debugging deep-dives, onboarding docs, SLO templates) |
| 🔒 1000 ⭐ | Full Startup Founder kit (fundraising memo, pitch critique, co-founder agreement framework) |
**[Star this repo →](https://github.com/mohitagw15856/pm-claude-skills)**
---
## Requested Skills (Open)
Add a request by opening an issue — these are current top asks from the community:
| Skill | Profession | Requested By | Votes |
|---|---|---|---|
| `customer-health-scorecard` | Customer Success | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `qbr-deck-writer` | Customer Success | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `escalation-brief` | Customer Success | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `fundraising-memo` | Startup / Founder | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `youtube-script-writer` | Content Creator | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `newsletter-issue-writer` | Content Creator | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `analytics-event-taxonomy` | Data / Analytics | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `kpi-tree-builder` | Data / Analytics | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `dissertation-chapter-planner` | Academic | [@mohitagw15856](https://github.com/mohitagw15856) | — |
| `board-minutes` | Governance | Community | — |
> **To vote:** React with 👍 on the linked issue. To add a new request, open an issue with the label `skill-request`.
---
## Recently Shipped
| Version | Skills Added |
|---|---|
| v7.0.0 | Debugging Log Analyser, PR Description Writer, System Design Interview, Changelog Generator, Test Strategy Doc, Runbook Writer |
| v6.0.0 | Teaching Lesson Plan, SEO Content Brief, Media Pitch, Change Management Plan, Workshop Facilitation Guide, Sales Forecasting Model, Tax Planning Checklist |
| v5.0.0 | 10 Figma skills |
---
*Maintained by [Mohit Aggarwal](https://github.com/mohitagw15856)*
-180
@@ -1,180 +0,0 @@
#!/bin/bash
# =============================================================================
# create-plugin-jsons.sh
# Run this from the ROOT of your pm-claude-skills repo.
# Creates .claude-plugin/plugin.json inside each of the 6 new plugin folders.
# Your skills/ subfolders are already in place — this just adds the missing
# plugin.json files.
# =============================================================================
set -e
REPO_ROOT="$(pwd)"
echo "================================================"
echo " pm-claude-skills — Creating plugin.json files"
echo " Running from: $REPO_ROOT"
echo "================================================"
echo ""
# Sanity check — make sure we're in the right place
if [ ! -d "$REPO_ROOT/pm-gtm" ] || [ ! -d "$REPO_ROOT/pm-engineering" ]; then
  echo "ERROR: Cannot find pm-gtm or pm-engineering folders."
  echo "Make sure you are running this from the ROOT of your pm-claude-skills repo."
  echo "Example: cd ~/pm-claude-skills && bash create-plugin-jsons.sh"
  exit 1
fi
# ---------------------------------------------------------
# BUNDLE 1: pm-gtm
# ---------------------------------------------------------
echo "Creating pm-gtm/.claude-plugin/plugin.json..."
mkdir -p pm-gtm/.claude-plugin
cat > pm-gtm/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-gtm",
"version": "1.0.0",
"description": "Marketing & GTM skills: Go-To-Market Planner, Content Calendar, Competitor Teardown, Email Campaign. Build positioning statements, messaging pillars, feature lists, use cases, and launch campaigns.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "marketing", "gtm", "positioning", "content-calendar", "competitor-analysis", "email-campaign"]
}
EOF
echo " ✓ pm-gtm/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# BUNDLE 2: pm-engineering
# ---------------------------------------------------------
echo "Creating pm-engineering/.claude-plugin/plugin.json..."
mkdir -p pm-engineering/.claude-plugin
cat > pm-engineering/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-engineering",
"version": "1.0.0",
"description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record. Structured outputs for engineering teams and technical PMs.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "engineering", "code-review", "incident-postmortem", "api-documentation", "adr", "architecture"]
}
EOF
echo " ✓ pm-engineering/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# BUNDLE 3: pm-data
# ---------------------------------------------------------
echo "Creating pm-data/.claude-plugin/plugin.json..."
mkdir -p pm-data/.claude-plugin
cat > pm-data/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-data",
"version": "1.0.0",
"description": "Data & analytics skills: Metrics Framework, SQL Query Explainer, Dashboard Brief. Build North Star metric trees, explain and optimise SQL, and spec dashboards from business questions.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "data", "analytics", "metrics", "north-star", "sql", "dashboard", "kpi"]
}
EOF
echo " ✓ pm-data/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# BUNDLE 4: pm-people
# ---------------------------------------------------------
echo "Creating pm-people/.claude-plugin/plugin.json..."
mkdir -p pm-people/.claude-plugin
cat > pm-people/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-people",
"version": "1.0.0",
"description": "Leadership & people skills: Performance Review, Hiring Rubric, Team Offsite Planner. Write structured reviews, build interview scorecards, and plan offsites from goals to minute-by-minute agenda.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "leadership", "management", "performance-review", "hiring", "interview", "offsite", "people"]
}
EOF
echo " ✓ pm-people/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# BUNDLE 5: pm-design
# ---------------------------------------------------------
echo "Creating pm-design/.claude-plugin/plugin.json..."
mkdir -p pm-design/.claude-plugin
cat > pm-design/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-design",
"version": "1.0.0",
"description": "Design & UX skills: UX Research Plan, Design Critique, Accessibility Audit. Create research plans with discussion guides, critique designs using JTBD and Gestalt principles, and audit for WCAG 2.2 compliance.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "design", "ux", "user-research", "accessibility", "wcag", "usability", "design-critique"]
}
EOF
echo " ✓ pm-design/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# BUNDLE 6: pm-business
# ---------------------------------------------------------
echo "Creating pm-business/.claude-plugin/plugin.json..."
mkdir -p pm-business/.claude-plugin
cat > pm-business/.claude-plugin/plugin.json << 'EOF'
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-business",
"version": "1.0.0",
"description": "Business & strategy skills: Investor Update, Board Deck Narrative, Job Application. Write investor updates investors actually read, structure board presentations, and tailor CVs and cover letters with ATS optimisation.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "business", "strategy", "investor-update", "board-deck", "startup", "career", "job-application"]
}
EOF
echo " ✓ pm-business/.claude-plugin/plugin.json created"
# ---------------------------------------------------------
# DONE
# ---------------------------------------------------------
echo ""
echo "================================================"
echo " All 6 plugin.json files created successfully!"
echo ""
echo " pm-gtm/.claude-plugin/plugin.json"
echo " pm-engineering/.claude-plugin/plugin.json"
echo " pm-data/.claude-plugin/plugin.json"
echo " pm-people/.claude-plugin/plugin.json"
echo " pm-design/.claude-plugin/plugin.json"
echo " pm-business/.claude-plugin/plugin.json"
echo ""
echo " Next steps:"
echo " 1. bash add-plugin-json.sh (update marketplace.json)"
echo " 2. git add ."
echo " 3. git commit -m 'feat: add 6 new plugin bundles (pm-gtm, pm-engineering, pm-data, pm-people, pm-design, pm-business)'"
echo " 4. git push origin main"
echo "================================================"
Binary file not shown.
@@ -0,0 +1,215 @@
---
name: ai-ethics-review
description: "Conduct a structured ethical review of an AI or ML feature, model, or product. Use when preparing to deploy an AI system, assessing algorithmic risk, auditing a model for bias, or producing a responsible AI impact assessment. Produces a structured ethics review covering fairness, transparency, privacy, safety, accountability, and societal impact with a risk tier score, pre-deployment checklist, and prioritised mitigations."
---
# AI Ethics Review Skill
This skill produces a structured ethical review of an AI or machine learning feature, model, or product. Output covers fairness, transparency, privacy, safety, accountability, and societal impact — with risk scoring, prioritised mitigations, and a checklist suitable for governance review or responsible AI documentation.
> ⚠️ This skill provides a structured framework for identifying and documenting ethical risks. It is not a substitute for legal advice, regulated algorithmic impact assessments, or specialist ethics review required in specific jurisdictions (e.g. EU AI Act, UK AI regulation).
## Required Inputs
Ask the user for these if not provided:
- **Feature or model name** and what it does
- **Who it affects** — which users or people does the AI interact with, make decisions about, or collect data from?
- **What decisions or outputs it produces** — recommendations, predictions, classifications, generation, automation?
- **Consequentiality** — how significant are the AI's decisions? (low-stakes suggestions vs decisions that affect employment, credit, health, safety, etc.)
- **Data used** — what training data, user data, or third-party data is used?
- **Human oversight** — is there a human in the loop, and at what stage?
- **Deployment context** — who will use this and how? (internal tool / consumer-facing / automated pipeline)
## Output Structure
---
# AI Ethics Review: [Feature / Model Name]
**Product / system:** [Name and brief description]
**Review type:** [Pre-deployment review / Post-deployment audit / Change review]
**Risk tier:** [High / Medium / Low — based on consequentiality, scale, and affected population]
**Reviewer:** [Name / Team]
**Date:** [Date]
**Status:** [Draft / Approved / Requires escalation]
---
## 1. Feature Summary
| | |
|---|---|
| **What it does** | [1-2 sentences: plain English description of the AI feature and its purpose] |
| **Who uses it** | [End users / internal teams / automated system] |
| **Who is affected by its outputs** | [May be different from who uses it — e.g. an AI hiring tool is used by HR but affects candidates] |
| **Output type** | [Recommendation / Classification / Prediction / Generation / Automation / Scoring] |
| **Scale** | [How many people affected per day/month?] |
| **Consequentiality** | [High: affects access to services, employment, credit, health, safety / Medium: influences decisions / Low: suggestions with easy override] |
| **Human oversight level** | [Full automation / Human review before action / Human can override after action / Advisory only] |
---
## 2. Risk Tier Assessment
| Factor | Score (1-3) | Rationale |
|---|---|---|
| **Consequentiality** (impact on individuals) | [1=low, 3=high] | [e.g. 3 — model output influences hiring decisions] |
| **Scale** (number of people affected) | [1=few, 3=many] | [e.g. 2 — internal tool used for ~500 candidates/year] |
| **Reversibility** (can harm be undone?) | [1=reversible, 3=irreversible] | [e.g. 2 — unfair rejection can be appealed but may not be caught] |
| **Vulnerability of affected group** | [1=general population, 3=protected or vulnerable group] | [e.g. 2 — includes protected characteristics in the decision context] |
| **Transparency** (do affected people know?) | [1=informed, 3=opaque] | [e.g. 3 — candidates are not told AI is used in screening] |
**Composite risk tier:** [High (12-15) / Medium (7-11) / Low (3-6)]
**Risk tier implications:**
- **High:** Mandatory senior ethics review, DPA/DPIA required, human-in-the-loop for all consequential decisions, ongoing monitoring required
- **Medium:** Ethics review recommended, document mitigations, quarterly monitoring
- **Low:** Standard review, document assumptions, annual review
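
The scoring table above can be sketched as a small function. This is only an illustration: it assumes the composite is a plain sum of the five factor scores and that the bands are 12 and above for High, 7 to 11 for Medium, and anything lower for Low. The factor keys and the function name are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical keys for the five factors in the Risk Tier Assessment table.
FACTORS = ("consequentiality", "scale", "reversibility", "vulnerability", "transparency")


def composite_risk_tier(scores: Dict[str, int]) -> Tuple[int, str]:
    """Sum the five 1-3 factor scores and map the total to a tier.

    Assumes an unweighted sum; the skill does not say whether factors are weighted.
    """
    missing = [name for name in FACTORS if name not in scores]
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    for name in FACTORS:
        if scores[name] not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3")

    total = sum(scores[name] for name in FACTORS)
    if total >= 12:
        tier = "High"
    elif total >= 7:
        tier = "Medium"
    else:
        tier = "Low"
    return total, tier


# The example scores from the table rationale column: 3, 2, 2, 2, 3.
example = {
    "consequentiality": 3,
    "scale": 2,
    "reversibility": 2,
    "vulnerability": 2,
    "transparency": 3,
}
print(composite_risk_tier(example))
```

With five factors each scored from 1 to 3, the lowest possible total is 5, so the Low band is only reachable at 5 or 6.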
---
## 3. Fairness & Bias
*Does the AI treat people equitably across groups?*
**Protected characteristics relevant to this feature:**
[List applicable protected characteristics — age, gender, race/ethnicity, disability, religion, national origin, etc.]
| Risk | Analysis | Mitigation |
|---|---|---|
| **Training data bias** | [Does the training data reflect historical discrimination? e.g. hiring data that reflects past biases in who was hired] | [Audit training data for demographic representation / use debiasing techniques / document data lineage] |
| **Proxy discrimination** | [Could the model use a proxy for a protected characteristic? e.g. using postcode as a proxy for race] | [Identify proxy features / test for disparate impact / mitigate with techniques such as adversarial debiasing] |
| **Differential performance** | [Does the model perform differently across demographic groups? — e.g. lower accuracy for underrepresented groups] | [Disaggregate performance metrics by group / set minimum performance thresholds per group] |
| **Feedback loops** | [Does the model's output reinforce existing disparities? e.g. recommending content that keeps disadvantaged groups in lower-engagement patterns] | [Monitor outcome distributions over time / implement feedback loop detection] |
**Fairness evaluation method:** [What method will be used to measure fairness — statistical parity / equalised odds / individual fairness? Who is responsible for running it and how often?]
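
As a rough illustration of two of the methods named above, the sketch below computes per-group selection rates (for statistical parity) and true/false positive rates (for equalised odds) from labelled predictions. The record layout, function names, and sample data are assumptions made for the example; a real evaluation would more likely use a dedicated fairness library and far more data.

```python
from collections import defaultdict


def rates_by_group(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 labels."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += y_pred
        if y_true == 1:
            c["pos"] += 1
            c["tp"] += y_pred
        else:
            c["neg"] += 1
            c["fp"] += y_pred

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # Share of the group that receives a positive prediction.
            "selection_rate": c["pred_pos"] / c["n"],
            # True positive rate and false positive rate; None if undefined.
            "tpr": c["tp"] / c["pos"] if c["pos"] else None,
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
        }
    return rates


def gap(rates, key):
    """Largest difference between any two groups for one rate."""
    values = [r[key] for r in rates.values() if r[key] is not None]
    return max(values) - min(values)


# Made-up data: (group, actual outcome, model prediction).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = rates_by_group(records)
print("statistical parity gap:", gap(rates, "selection_rate"))
print("equalised odds gaps (TPR, FPR):", gap(rates, "tpr"), gap(rates, "fpr"))
```

A statistical parity gap near zero means groups are selected at similar rates; equalised odds asks for both the TPR gap and the FPR gap to be small. What counts as "small enough" is a policy decision the review still has to record.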
---
## 4. Transparency & Explainability
*Can affected people understand how the AI makes decisions?*
| Dimension | Current state | Required state | Gap |
|---|---|---|---|
| **User disclosure** | [Are users told they're interacting with AI?] | [Yes — required for trust and regulation] | [e.g. No disclosure on current UI] |
| **Decision explanation** | [Can the system explain why it reached a conclusion?] | [For high-stakes decisions: yes] | [e.g. Black-box model — no feature attribution available] |
| **Right to know** | [Can affected people ask how a decision was made?] | [Yes — required under GDPR Art. 22 for automated decisions] | [e.g. No process exists] |
| **Confidence calibration** | [Does the model express appropriate uncertainty?] | [Yes — overconfident models cause over-reliance] | [e.g. Model outputs binary label without confidence score] |
**Explainability approach:** [LIME / SHAP / rule-based surrogate / LLM-generated rationale / none — and why]
---
## 5. Privacy & Data
*Is personal data used responsibly and lawfully?*
| Risk | Analysis | Mitigation |
|---|---|---|
| **Data minimisation** | [Does the model use more personal data than necessary?] | [Audit input features — remove any that don't improve performance and involve unnecessary data collection] |
| **Data retention** | [How long is personal data retained for training and inference?] | [Define retention policy aligned to GDPR / CCPA / sector requirements] |
| **Re-identification risk** | [Could model outputs or training data be used to identify individuals?] | [Differential privacy / k-anonymity / output rate limiting] |
| **Third-party data** | [Is data from third parties used? Is it licensed for this use?] | [Audit data licensing / get legal sign-off on each third-party source] |
| **Cross-border data transfer** | [Is personal data transferred across jurisdictions?] | [Legal review — Standard Contractual Clauses or equivalent] |
**DPIA required?** [Yes / No / Uncertain — for High tier or whenever processing is likely to result in high risk to individuals under GDPR Art. 35]
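
For the re-identification row in the table above, one simple check is k-anonymity: the size of the smallest group of records that share the same quasi-identifier values. The sketch below is a minimal version under assumed column names (`age_band`, `postcode_area`); which fields count as quasi-identifiers is a judgement the reviewer has to make.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values.

    rows: list of dicts; quasi_identifiers: list of keys to group on.
    Returns 0 for an empty dataset.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


# Made-up records; the column names are illustrative only.
rows = [
    {"age_band": "30-39", "postcode_area": "N1", "outcome": 1},
    {"age_band": "30-39", "postcode_area": "N1", "outcome": 0},
    {"age_band": "40-49", "postcode_area": "E2", "outcome": 1},
]
k = k_anonymity(rows, ["age_band", "postcode_area"])
print("k =", k)  # a value of 1 means at least one person is unique on these fields
```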
---
## 6. Safety & Reliability
*What happens when the AI gets it wrong?*
| Failure mode | Likelihood | Impact | Mitigation |
|---|---|---|---|
| **False positives** | [H/M/L] | [e.g. Flagging a legitimate transaction as fraud — customer locked out] | [Set threshold conservatively; human review for edge cases] |
| **False negatives** | [H/M/L] | [e.g. Missing a real fraud case — financial loss] | [Monitor false negative rate; set minimum recall threshold] |
| **Out-of-distribution inputs** | [H/M/L] | [Model behaves unpredictably on inputs outside training distribution] | [Input validation; confidence thresholding — route uncertain inputs to human review] |
| **Model degradation** | [M] | [Performance degrades as data distributions shift post-deployment] | [Scheduled performance monitoring; drift detection alerts] |
| **Adversarial inputs** | [L/M] | [Deliberate manipulation of inputs to game the model] | [Adversarial testing; rate limiting; anomaly detection on inputs] |
| **Single point of failure** | [L/M] | [Model outage causes downstream system failure] | [Graceful degradation — define fallback behaviour when model is unavailable] |
**Fallback behaviour:** [What happens if the AI is unavailable or returns low-confidence output? — e.g. route to human review / use rule-based fallback / block the action]
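
The confidence-thresholding and fallback ideas above can be sketched as a routing function. The threshold of 0.8, the action names, and the function name are all hypothetical; the right threshold depends on the measured error rates of the actual model.

```python
def route_prediction(label, confidence, threshold=0.8):
    """Decide what to do with one model output.

    `threshold` is an illustrative value, not one taken from the skill.
    """
    if label is None or confidence is None:
        # Model unavailable or returned nothing usable: take the fallback path.
        return {"action": "fallback", "label": None}
    if confidence < threshold:
        # Low confidence: hand the case to a human reviewer.
        return {"action": "human_review", "label": label}
    # Confident enough to act on automatically.
    return {"action": "auto", "label": label}


print(route_prediction("fraud", 0.95))
print(route_prediction("fraud", 0.55))
print(route_prediction(None, None))
```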
---
## 7. Accountability & Governance
*Who is responsible when things go wrong?*
| Question | Answer |
|---|---|
| **Who owns this AI feature?** | [Team or individual with end-to-end accountability] |
| **Who approved deployment?** | [Name and role — must be documented] |
| **Who is responsible for ongoing monitoring?** | [Team and cadence] |
| **Who can shut it down?** | [Who has kill-switch authority and under what conditions?] |
| **How are incidents reported?** | [Internal escalation path + external disclosure process if required] |
| **Is this subject to regulation?** | [EU AI Act / UK AI regulation / sector-specific rules — FINRA, FDA, FCA, etc.] |
**Incident response plan:** [Link to or describe what happens if the model causes harm — detection, escalation, remediation, disclosure]
---
## 8. Societal Impact
*Beyond individual users — what are the broader effects?*
| Impact area | Risk | Mitigation |
|---|---|---|
| **Labour displacement** | [Does this AI automate tasks that currently employ people?] | [Transition plan / human-AI collaboration framing / skills retraining commitment] |
| **Environmental impact** | [What is the carbon cost of training and inference?] | [Measure and offset; prefer efficient architectures; use renewable-energy infrastructure where possible] |
| **Power concentration** | [Does this AI give the deploying organisation disproportionate power over individuals?] | [Ensure right to opt out; avoid lock-in; consider open alternatives] |
| **Information ecosystem** | [Could this AI contribute to misinformation, filter bubbles, or manipulation?] | [Provenance labelling / content policies / algorithmic diversity requirements] |
---
## 9. Mitigation Priorities
| # | Risk | Severity | Action | Owner | Deadline |
|---|---|---|---|---|---|
| 1 | [Highest risk — e.g. No disclosure to affected candidates] | Critical | [Add AI disclosure to UI and candidate-facing documentation] | [PM + Legal] | [Before launch] |
| 2 | [e.g. No fairness evaluation across demographic groups] | High | [Commission third-party fairness audit using [method]] | [ML team + external auditor] | [Within 30 days of launch] |
| 3 | [e.g. No model monitoring in place] | High | [Deploy performance and drift monitoring dashboard] | [ML Ops] | [Launch day] |
| 4 | [e.g. DPIA not completed] | High | [Complete DPIA with DPO before deployment] | [Legal / DPO] | [Before launch] |
---
## 10. Pre-Deployment Checklist
- [ ] Ethics review completed and approved by required reviewers
- [ ] DPIA completed (if required)
- [ ] Fairness evaluation completed and results documented
- [ ] AI disclosure is in place wherever required
- [ ] Human oversight mechanism is defined and tested
- [ ] Kill-switch and escalation path are documented and tested
- [ ] Model monitoring is deployed and alerting is configured
- [ ] Data lineage and training data audit documented
- [ ] Legal sign-off obtained on data licensing and cross-border transfers
- [ ] Incident response plan in place
---
## Quality Checks
- [ ] "Who is affected" includes people the AI makes decisions *about*, not just who uses the product
- [ ] Fairness analysis names specific protected characteristics, not just "diverse groups"
- [ ] Safety section covers both false positive and false negative failure modes
- [ ] Accountability section names real people, not teams or roles
- [ ] Mitigations are specific and time-bound — not "monitor and review"
## Anti-Patterns
- [ ] Do not limit the affected-population analysis to users of the product — AI that makes decisions about people (hiring, credit, content moderation) affects non-users who have no opt-out
- [ ] Do not accept "we will monitor" as a mitigation without specifying what is monitored, at what threshold, and who acts
- [ ] Do not assign fairness analysis to the model team alone — protected characteristic analysis requires input from legal, HR, or a subject-matter expert
- [ ] Do not defer the DPIA to post-launch — for high-risk tier systems, a DPIA is a pre-requisite for lawful deployment under GDPR
- [ ] Do not conflate statistical accuracy with fairness — a model can be 95% accurate overall while performing significantly worse for a protected group
## Example Trigger Phrases
- "Run an AI ethics review for [feature]"
- "Conduct an ethical impact assessment for our new ML model"
- "Review the AI risks for our hiring / credit / recommendation system"
- "Build a responsible AI checklist for our product"
- "What are the ethical risks of using AI for [use case]?"
@@ -1,6 +1,6 @@
---
name: ai-product-canvas
description: Structures AI and ML product decisions including model selection, data requirements, evaluation frameworks, and responsible AI considerations. Use when building AI-powered features, evaluating LLM integrations, designing AI products, or assessing AI readiness. Triggers on "AI product", "LLM feature", "AI canvas", "build with AI", "AI integration", "AI-powered", "machine learning feature".
description: "Structure AI and ML product decisions with the rigour of any product decision. Use when building AI-powered features, evaluating LLM integrations, designing AI products, or assessing AI readiness. Produces a complete AI product canvas covering problem definition, model approach, data requirements, evaluation framework, UX design, responsible AI checklist, and launch monitoring plan."
---
# AI Product Canvas Skill
@@ -143,3 +143,27 @@ Before building, flag if any of these apply:
- Responsible AI checklist must be completed before launch, not after
- Include latency in success metrics — a 5-second AI response is often worse than no AI at all
- Recommend starting with a human-in-the-loop design and automating only when accuracy is proven
## Required Inputs
Ask the user for these if not provided:
- **Feature or product description** (what the AI is intended to do)
- **User problem** (what problem the AI is solving for users)
- **Available data** (what training/inference data exists)
- **ML/AI lead** (who owns the technical implementation)
## Anti-Patterns
- [ ] Do not skip the "Why AI?" question — if the answer is "we want to use AI," stop and reframe around the user problem first
- [ ] Do not launch with an undefined accuracy threshold — "good enough" is not a threshold; set a number before build begins
- [ ] Do not design the UX to hide AI-generated output as if it were system truth — users need to know when AI is involved so they can override it
- [ ] Do not defer the Responsible AI checklist to post-launch — bias and privacy issues are far harder to fix in production than in design
- [ ] Do not treat model latency as a post-launch optimisation — a 6-second AI response that replaces a 1-second rule-based response is a regression, not a feature
## Quality Checks
- [ ] "Why AI?" is answered clearly (not "because we can")
- [ ] Minimum acceptable accuracy threshold is defined before build begins
- [ ] Fallback UX is specified for model failures or low-confidence outputs
- [ ] Responsible AI checklist is completed (not deferred to post-launch)
- [ ] Monitoring plan includes both model performance and user engagement metrics
@@ -1,13 +1,20 @@
---
name: design-handoff-brief
description: Transform feature briefs into structured design briefs that give designers the context they need
tool_integration: Figma, Notion
description: "Transform feature briefs into structured design briefs that give designers the context they need before opening Figma. Use when asked to write a design brief, create a design handoff, brief a designer on a new feature, or translate a PRD into design requirements. Produces a brief with user goal, emotional context, success criteria, constraints, edge cases, and out-of-scope boundaries."
---
# Design Handoff Brief Skill
## Purpose
Produce a design brief that sets designers up for success — grounding them in user context and constraints before they open Figma, not after they've gone in the wrong direction.
## Required Inputs
Ask the user for these if not provided:
- **Feature brief or PRD** (even rough notes work)
- **Designer's name or team** (for personalisation)
- **Technical constraints** (any engineering limitations already known)
- **Timeline** (when does design need to be done?)
## What Designers Actually Need (and PMs Often Skip)
- The user's goal, not the feature name
- The emotional state of the user at this moment in the journey
@@ -23,8 +30,9 @@ Produce a design brief that sets designers up for success — grounding them in
4. List edge cases the design must handle
5. Define success criteria the design should be evaluated against
6. Write a "not in scope" section to prevent scope creep in design
7. **Validate** — Confirm every edge case listed is specific enough to design for, and every out-of-scope item is concrete enough to say "no" to
## Output Format
## Output Structure
### Design Brief: [Feature Name]
@@ -57,3 +65,19 @@ Produce a design brief that sets designers up for success — grounding them in
- User research: [link]
- Existing patterns: [Figma component library link]
- Competitor examples: [links if relevant]
## Quality Checks
- [ ] User goal is written in user language (not feature/product language)
- [ ] At least one edge case covers an error or failure state
- [ ] Success criteria are measurable or observable (not "looks good")
- [ ] Out-of-scope section names at least one thing that might seem in scope but isn't
- [ ] Technical constraints are specific enough for an engineer to confirm
## Anti-Patterns
- [ ] Do not write the user goal in feature language ("design the checkout flow") — it must be written from the user's perspective with a motivation and outcome
- [ ] Do not skip the "Explicitly Out of Scope" section — without it, designers will inadvertently solve problems not intended for this iteration
- [ ] Do not list edge cases that are so generic they apply to any feature (e.g. "handle errors") — each edge case must be specific to this feature's failure modes
- [ ] Do not hand off the brief without confirming engineering constraints are accurate — a constraint that is wrong is worse than no constraint
- [ ] Do not omit the emotional context of the user — designs without emotional grounding produce technically correct but experientially flat results
@@ -1,55 +1,77 @@
---
name: experiment-designer
description: Designs A/B tests from hypotheses and interprets experiment results
with statistical rigour. Use when user says "run an experiment", "design an A/B
test", "test this feature", "interpret these results", "was this experiment
successful", or "what sample size do I need".
metadata:
author: Mohit Aggarwal
version: 1.0.0
category: data-and-metrics
tags: [experimentation, data, analytics, ab-testing]
documentation: https://github.com/mohitagw15856/pm-claude-skills
description: "Design statistically rigorous A/B tests and interpret experiment results. Use when asked to design an experiment, run an A/B test, calculate sample size, interpret test results, or assess whether an experiment was successful. Produces a complete experiment design with hypothesis, sample size, run time, success criteria, and risk flags — or a results interpretation with ship/iterate/kill recommendation."
---
# Experiment Designer Skill
## Purpose
Produce rigorous experiment designs from product hypotheses, and interpret
results with statistical and practical significance — so you can defend every
decision to a sceptical engineering lead or data scientist.
Produce rigorous experiment designs from product hypotheses, and interpret results with statistical and practical significance — so you can defend every decision to a sceptical engineering lead or data scientist.
## Required Inputs
Ask the user for these if not provided:
**For experiment design:**
- Hypothesis (what change, what metric, what expected movement)
- Current baseline metric value
- Minimum detectable effect (MDE) — the smallest lift worth caring about
- Available daily sample size
**For results interpretation:**
- Control and variant results (raw numbers or percentages)
- P-value or confidence interval
- Run duration (days)
- Any anomalies observed during the test
## Two-Phase Process
### Phase 1: Experiment Design
**Required inputs:** hypothesis, primary metric, current baseline, minimum
detectable effect (MDE), available sample size per day.
**Output:**
- Hypothesis restated as: "If we [change], we expect [metric] to [move by X%]
because [reason]"
- Control and variant definitions
- Primary metric (one only)
- Secondary guardrail metrics (2-3 max)
- Required sample size (calculated from MDE and baseline)
- Estimated run time in days
- Pre-defined success criteria (before the test runs — no moving goalposts)
- Design risk flags: novelty effects, seasonal confounds, multiple testing issues,
network effects, sample ratio mismatch risks
1. Restate hypothesis as: "If we [change], we expect [metric] to [move by X%] because [reason]"
2. Define control and variant clearly
3. Select primary metric (one only) and secondary guardrail metrics (2-3 max)
4. Calculate required sample size from MDE and baseline
5. Estimate run time in days
6. Set pre-defined success criteria before the test runs — no moving goalposts
7. Flag design risks: novelty effects, seasonal confounds, multiple testing issues, network effects, sample ratio mismatch
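
Steps 4 and 5 above can be sketched with the standard two-proportion sample size formula. This is one common approach, not necessarily the one the skill's author had in mind: it assumes a conversion-rate metric, a two-sided test, an MDE expressed as a relative lift, and defaults of 5% significance and 80% power. The function names are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline: current conversion rate, e.g. 0.10
    relative_mde: smallest relative lift worth detecting, e.g. 0.10 for +10%
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def run_time_days(n_per_variant, daily_users, variants=2):
    """Days needed to reach the sample size if traffic is split across variants."""
    return math.ceil(n_per_variant * variants / daily_users)


n = sample_size_per_variant(baseline=0.10, relative_mde=0.10)
print("per variant:", n)
print("days at 2,000 users/day:", run_time_days(n, daily_users=2000))
```

Halving the MDE roughly quadruples the required sample, which is why the MDE should be agreed before the run time is estimated.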
### Phase 2: Results Interpretation
**Required inputs:** control results, variant results, p-value or raw numbers,
run duration, any anomalies observed.
1. Assess statistical significance (p < 0.05 threshold)
2. Assess practical significance: was the lift meaningful for the business, not just real?
3. Interpret confidence intervals
4. Investigate confounding factors
5. Recommend: Ship / Iterate / Kill / Run follow-up test
6. **Validate** — Confirm the test ran for the full planned duration. Flag if it was stopped early (peeking problem). Confirm sample ratio mismatch did not occur.
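
A minimal sketch of steps 1 and 6 for a conversion-rate test is shown below: a pooled two-proportion z-test for significance, and a chi-square check for sample ratio mismatch against an intended 50/50 split. The function names and example counts are made up, and the 0.001 alert level for the mismatch check is a commonly used convention rather than something the skill specifies.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for variant B versus control A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def srm_p_value(n_a, n_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the traffic split."""
    total = n_a + n_b
    exp_a = total * expected_share_a
    exp_b = total - exp_a
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


z, p = two_proportion_z_test(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")

srm = srm_p_value(n_a=10000, n_b=10000)
print(f"SRM p-value = {srm:.4f}, mismatch suspected: {srm < 0.001}")
```

If the mismatch check fails, the significance result should not be trusted until the cause of the uneven split is understood.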
**Output:**
- Statistical significance assessment (p < 0.05 threshold)
- Practical significance: was the lift meaningful for the business, not just real?
- Confidence interval interpretation
- Confounding factors to investigate
- Recommendation: Ship / Iterate / Kill / Run follow-up test
- If "Iterate": specific hypotheses to test next
## Output Structure
**[Design or Results header based on phase]**
*Hypothesis:* "If we [change], we expect [metric] to [move by X%] because [reason]"
*Primary metric:* [One metric only]
*Guardrail metrics:* [2-3 max]
*Required sample size:* [n per variant]
*Estimated run time:* [days]
*Pre-defined success threshold:* [specific number]
*Design risk flags:* [any concerns]
**Results (Phase 2 only):**
*Statistical significance:* [p-value and conclusion]
*Practical significance:* [lift size vs. business threshold]
*Recommendation:* Ship / Iterate / Kill / Follow-up — [rationale]
## Quality Checks
- Never interpret results from an underpowered test without flagging it
- Always distinguish statistical from practical significance
- Flag if test was stopped early (peeking problem)
- Note if sample ratio mismatch occurred
- [ ] Hypothesis specifies the change, the metric, the direction, and the reason
- [ ] Primary metric is singular — guardrail metrics are secondary
- [ ] Success criteria are defined before the test launches (not after seeing results)
- [ ] Test was not stopped early (or flagged clearly if it was)
- [ ] Practical significance assessed separately from statistical significance
- [ ] Sample ratio mismatch is checked in results interpretation
## Anti-Patterns
- [ ] Do not define success criteria after seeing preliminary results — post-hoc success definitions are HARKing (Hypothesising After Results are Known) and invalidate the experiment
- [ ] Do not stop a test early because the result looks significant — early stopping dramatically inflates false positive rates; the test must run to the planned sample size
- [ ] Do not treat statistical significance as the same as practical significance — a p < 0.05 result with a 0.1% lift is real but may not be worth shipping
- [ ] Do not run the same experiment on the same population multiple times without correction: every additional uncorrected test raises the overall chance of at least one false positive
- [ ] Do not use more than one primary metric — multiple primary metrics require multiple hypothesis corrections and make the ship/kill decision ambiguous
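
To make the multiple-testing points above concrete, the sketch below shows how the chance of at least one false positive grows with the number of tests, and the Bonferroni-adjusted threshold that compensates. It assumes the tests are independent, which repeated runs on the same population are not strictly, so treat the numbers as indicative only.

```python
def family_wise_error_rate(alpha, k):
    """Chance of at least one false positive across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


def bonferroni_alpha(alpha, k):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / k


for k in (1, 3, 5, 10):
    fwer = family_wise_error_rate(0.05, k)
    print(f"{k:>2} tests: family-wise error rate {fwer:.3f}, "
          f"Bonferroni per-test alpha {bonferroni_alpha(0.05, k):.4f}")
```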
@@ -1,62 +1,70 @@
---
name: multi-source-signal-synthesiser
description: Synthesises user signals from multiple research sources into a
unified insight brief, reconciling conflicting feedback. Use when user has data
from multiple sources, needs to "make sense of all this user data", "what are
users really telling us", "synthesise our research", or has conflicting feedback
from different channels.
metadata:
author: Mohit Aggarwal
version: 1.0.0
category: discovery
tags: [user-research, synthesis, discovery, insights]
documentation: https://github.com/mohitagw15856/pm-claude-skills
description: "Synthesises user signals from multiple research sources into a unified, weighted insight brief. Use when you have data from interviews, support tickets, NPS verbatims, app reviews, or sales calls and need to reconcile contradictions, surface the underlying need behind requests, or answer 'what are users really telling us'. Produces ranked insights with confidence ratings, source weighting rationale, divergent signal analysis by user segment, and a research gap identification section."
---
# Multi-Source Signal Synthesiser Skill
## Purpose
Reconcile user signals from multiple sources — interviews, support tickets, NPS,
app reviews, sales calls — into a unified, weighted insight brief that surfaces
the underlying need rather than the surface-level request.
Reconcile user signals from multiple sources — interviews, support tickets, NPS, app reviews, sales calls — into a unified, weighted insight brief that surfaces the underlying need rather than the surface-level request.
## Source Weighting (default — adapt to your context)
- Direct research (interviews, usability tests): weight 5
- Support tickets (unprompted pain signals): weight 4
- NPS verbatims: weight 3
- App store reviews: weight 2
- Sales call summaries (filtered through sales lens): weight 2
- Anecdote or single report: weight 1
## Required Inputs
Ask the user for these if not provided:
- **Signal sources** (interviews, support tickets, NPS verbatims, app reviews, sales calls, analytics — any combination)
- **Time period** covered by the data
- **Product area or feature** the signals relate to (if scoped)
## Source Weighting (default — adapt to context)
| Source | Weight | Rationale |
|--------|--------|-----------|
| Direct research (interviews, usability tests) | 5 | Highest-fidelity, structured |
| Support tickets (unprompted pain signals) | 4 | Real pain, unfiltered |
| NPS verbatims | 3 | Broad but shallow |
| App store reviews | 2 | Public, self-selected |
| Sales call summaries | 2 | Filtered through sales lens |
| Anecdote or single report | 1 | Low confidence alone |
## Process
1. Accept inputs from any combination of the source types above
2. Tag each signal by source and apply weight
3. Look for CONVERGENCE: same underlying need appearing across 3+ sources
4. Look for DIVERGENCE: contradictory signals suggesting user segmentation
5. Distinguish surface request from underlying need
(e.g. "faster export" may mean "I don't trust the data will be there when
I need it")
6. Produce ranked insights by weighted frequency
1. Tag each signal by source and apply weight
2. Look for **convergence**: same underlying need appearing across 3+ sources
3. Look for **divergence**: contradictory signals suggesting user segmentation
4. Distinguish surface request from underlying need (e.g. "faster export" may mean "I don't trust the data will be there when I need it")
5. Produce ranked insights by weighted frequency
6. **Validate** — Confirm each insight has evidence from at least 2 source types. Flag any insight resting on a single source as low-confidence.
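
The process above can be sketched as a short ranking routine. It uses the default weights from the source weighting table, treats "weighted frequency" as the sum of weights over all signals for a need, and applies the convergence (3+ source types) and validation (fewer than 2 source types is low-confidence) rules. The source-type keys, the input layout, and the sample signals are assumptions for illustration; tagging each signal with its underlying need still has to be done by a person or a model beforehand.

```python
from collections import defaultdict

# Default weights from the source weighting table; the keys are hypothetical labels.
SOURCE_WEIGHTS = {
    "interview": 5,
    "support_ticket": 4,
    "nps": 3,
    "app_review": 2,
    "sales_call": 2,
    "anecdote": 1,
}


def rank_insights(signals):
    """signals: iterable of (underlying_need, source_type) pairs."""
    scores = defaultdict(int)
    sources = defaultdict(set)
    for need, source in signals:
        scores[need] += SOURCE_WEIGHTS[source]
        sources[need].add(source)

    ranked = sorted(scores, key=lambda need: scores[need], reverse=True)
    return [
        {
            "need": need,
            "weighted_score": scores[need],
            "source_types": len(sources[need]),
            "convergent": len(sources[need]) >= 3,      # same need across 3+ sources
            "low_confidence": len(sources[need]) < 2,   # rests on a single source type
        }
        for need in ranked
    ]


# Made-up signals, already tagged with the underlying need.
signals = [
    ("trust in data availability", "interview"),
    ("trust in data availability", "support_ticket"),
    ("trust in data availability", "support_ticket"),
    ("trust in data availability", "nps"),
    ("simpler onboarding", "app_review"),
    ("simpler onboarding", "app_review"),
]
for insight in rank_insights(signals):
    print(insight)
```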
## Output Format
## Output Structure
### User Signal Synthesis — [Date / Period]
**Sources included:** [list]
**Sources included:** [list with count per source]
**Total signals processed:** [n]
#### Insight 1: [Underlying need, not feature request]
- **Confidence:** High / Medium / Low (based on source diversity and weight)
- **Evidence:** [Signals from each source supporting this]
- **Conflicting signals:** [Any contradicting evidence and how to interpret it]
- **Product implication:** [Specific, not generic]
- **Product implication:** [Specific next step, not generic]
[Repeat for top 3-5 insights]
#### Divergent Signals (Possible Segmentation)
[Where user groups appear to have genuinely different needs]
[Where user groups appear to have genuinely different needs — specify which segments]
#### What the Data Does NOT Tell Us
[Gaps that require further research before acting]
## OpenClaw Configuration
Connect to: Notion (research docs), support inbox, NPS tool, app review feed.
Schedule: weekly synthesis run, diff output showing new signals only.
## Quality Checks
- [ ] Every insight references at least 2 distinct source types
- [ ] Surface requests are translated to underlying needs (not just echoed)
- [ ] Divergent signals identify the specific user segments, not just "some users disagree"
- [ ] Confidence ratings are consistent with source diversity and weighting
- [ ] "What the data does NOT tell us" section is honest about gaps
## Anti-Patterns
- [ ] Do not echo surface-level feature requests as insights — translate every request to the underlying need before including it as a finding
- [ ] Do not assign High confidence to insights supported by only one source type — confidence requires corroboration across at least two distinct source types
- [ ] Do not treat all sources as equally weighted — a single interview quote and a pattern across 200 support tickets are not comparable signals
- [ ] Do not collapse divergent signals into a single finding — where user segments have genuinely different needs, name the segments explicitly rather than averaging them away
- [ ] Do not omit the research gap section when key decisions rest on thin data — acting on low-confidence findings without flagging the gaps misleads product teams
@@ -1,6 +1,6 @@
---
name: data-analysis-standard
description: Structures product data analysis, metric deep-dives, funnel analysis, and cohort studies. Use when asked to analyse product metrics, investigate a drop in conversion, build a dashboard spec, or explain data to stakeholders. Triggers on "analyse metrics", "funnel analysis", "cohort analysis", "data deep dive", "why did X drop".
description: "Structure a product data analysis, metric deep-dive, funnel analysis, or cohort study. Use when asked to analyse product metrics, investigate a drop in conversion, explain a data change to stakeholders, or find the root cause of a metric movement. Produces a structured analysis with question, root cause, confidence level, and recommended action."
---
# Data Analysis Standard Skill
@@ -100,6 +100,31 @@ Output a cohort retention table and annotate:
---
## Required Inputs
Ask the user for these if not provided:
- **Metric or question** being investigated
- **Time period** (what changed, from when to when)
- **Data available** (which segments, sources, or queries you have access to)
- **Business context** (what decision this analysis informs)
- **Audience** (who will read this — exec / team / data team)
## Quality Checks
- [ ] Analysis answers all 4 questions: what changed, why, so what, now what
- [ ] Root cause has evidence (not just hypothesis)
- [ ] Confidence level is stated and justified
- [ ] What the data cannot tell us is explicitly named
- [ ] Recommended action includes an owner and timeline
## Anti-Patterns
- [ ] Do not present correlations as causation — always state the distinction explicitly
- [ ] Do not report a metric movement without stating the time window and comparison baseline
- [ ] Do not skip the "so what" — raw observations without recommended actions are incomplete analysis
- [ ] Do not overstate confidence — label hypotheses clearly and note what data would be needed to confirm them
- [ ] Do not ignore segment breakdowns — aggregate metrics can mask opposing trends in sub-segments
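The last anti-pattern can be shown with a small worked example. The numbers below are made up purely for illustration, and the segment names `mobile` and `desktop` are hypothetical: both segments convert worse in the second period, yet the blended rate improves because traffic shifted toward the higher-converting segment.

```python
def rate(conversions, visits):
    """Conversion rate as a fraction."""
    return conversions / visits


def aggregate(period):
    """Blended conversion rate across all segments in a period."""
    conversions = sum(c for c, _ in period.values())
    visits = sum(v for _, v in period.values())
    return conversions / visits


# Illustrative data only: segment -> (conversions, visits)
before = {"mobile": (100, 1000), "desktop": (300, 1000)}
after = {"mobile": (45, 500), "desktop": (435, 1500)}

for segment in before:
    print(segment, rate(*before[segment]), rate(*after[segment]))
print("aggregate", aggregate(before), aggregate(after))
```

Here mobile falls from 10% to 9% and desktop from 30% to 29%, while the aggregate rises from 20% to 24%. A report that shows only the aggregate would call this an improvement.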
## Guidelines
- Always state what the data *cannot* tell you — never oversell confidence
@@ -1,13 +1,20 @@
---
name: product-health-analysis
description: Interpret product metrics against goals and surface actionable signals
tool_integration: Google Analytics, Mixpanel
description: "Interpret product metrics against goals and surface actionable signals. Use when asked to analyse product health, review key metrics, investigate a performance issue, produce a health report, or assess product-market fit signals. Produces a structured health report with RAG status, trend analysis, root cause hypotheses, and prioritised actions."
---
# Product Health Dashboard Skill
## Purpose
# Product Health Analysis Skill
Transform raw metrics data into a clear health narrative — what's working, what's not, and what needs immediate attention.
## Required Inputs
Ask the user for these if not provided:
- **Metrics data** (current values for key metrics — even rough numbers work)
- **Targets or benchmarks** (OKR targets, historical baselines, or industry benchmarks)
- **Period** (week / month / quarter being analysed)
- **Product area or segment** (are we looking at the whole product or a specific feature?)
## Metrics Framework
Analyse across four layers:
1. **Acquisition** — new users, source quality, CAC trends
@@ -21,8 +28,9 @@ Analyse across four layers:
3. Look for correlations — does a drop in activation explain a retention dip 2 weeks later?
4. Write a plain-English health summary (no jargon) suitable for sharing with non-data stakeholders
5. Recommend top 3 areas for immediate investigation with suggested diagnostic steps
6. **Validate** — Confirm every flagged metric has a plausible root cause hypothesis, not just a raw number, and every recommended action has a specific owner or team
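The correlation step above (does a drop in activation explain a retention dip two weeks later?) can be sketched as a lagged correlation. This is a hedged illustration: the function names and the weekly series layout are assumptions, and a high correlation only supports a hypothesis; it does not establish cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series with non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leading, lagging, lag):
    """Correlate leading[t] with lagging[t + lag] (lag in periods, e.g. weeks)."""
    if lag > 0:
        leading, lagging = leading[:-lag], lagging[lag:]
    return pearson(leading, lagging)
```

For weekly data, `lagged_correlation(activation, retention, 2)` compares each week's activation with retention two weeks later. With only a handful of periods the result is noisy, so it is best treated as a prompt for investigation rather than a finding.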
## Output Format
## Output Structure
### Product Health Report — [Period]
**Overall Health:** 🟢 On Track / 🟡 Watch / 🔴 Action Required
@@ -41,3 +49,19 @@ Analyse across four layers:
**Recommended Actions:**
[Specific next steps with owners and timelines]
## Quality Checks
- [ ] Every metric includes both a target and a trend (not just a snapshot)
- [ ] At least one correlation is drawn between metrics (e.g., activation → retention)
- [ ] Every flagged metric has a root cause hypothesis, not just "it dropped"
- [ ] Observations are written for a non-technical stakeholder (no raw query language or data jargon)
- [ ] Overall health rating is justified with specific evidence
## Anti-Patterns
- [ ] Do not report a single aggregate metric without segment breakdowns — averages hide opposing trends
- [ ] Do not flag a metric as healthy just because it is above the target — check if the target itself is meaningful
- [ ] Do not list metric movements without root cause hypotheses — observations without explanations are not analysis
- [ ] Do not mix product health metrics with business KPIs without explaining the relationship between them
- [ ] Do not omit recommended actions — a health report that only describes problems without prioritised next steps is incomplete
@@ -1,6 +1,6 @@
---
name: retention-analysis
description: Structures retention analysis, churn investigations, and engagement deep-dives for product teams. Use when asked to analyse user retention, investigate churn, measure DAU/MAU, or build a retention improvement plan. Triggers on "retention analysis", "churn", "DAU/MAU", "user retention", "why are users leaving".
description: "Structure a retention analysis, churn investigation, or engagement deep-dive for any product team. Use when asked to analyse user retention, investigate churn, measure DAU/MAU, or build a retention improvement plan. Produces a retention snapshot with root cause hypotheses, aha-moment correlation, and prioritised interventions."
---
# Retention Analysis Skill
@@ -108,6 +108,32 @@ Users who [specific action] in first [N] days retain at [X%] vs [Y%] for those w
---
## Required Inputs
Ask the user for these if not provided:
- **Product and business model** (SaaS / consumer app / marketplace / other)
- **Current retention metrics** (D1, D7, D30 if available)
- **Segment to analyse** (all users / paid / free / a specific cohort)
- **Key question to answer** (why is retention dropping? what drives retention?)
- **Available data** (analytics events, churn surveys, interview notes)
## Quality Checks
- [ ] Retention curve shape is diagnosed (flattening vs trending to zero = PMF vs onboarding)
- [ ] Cohorts are segmented before analysis (not all users lumped together)
- [ ] "Aha moment" correlation is identified or flagged as unknown
- [ ] Interventions are specific (not "improve onboarding")
- [ ] Churned user interviews are recommended (not just data analysis)
- [ ] Monitoring plan includes an alert threshold
## Anti-Patterns
- [ ] Do not recommend "improve onboarding" without specifying what specific step to change and why
- [ ] Do not analyse retention without segmenting by cohort — aggregate retention curves hide cohort-specific patterns
- [ ] Do not treat DAU/MAU below 5% as a retention problem — at that level, it is a product-market fit problem
- [ ] Do not skip qualitative research — churned user interviews reveal reasons that quantitative data cannot
- [ ] Do not set a monitoring alert without specifying the threshold that triggers it
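The DAU/MAU rule in the list above can be expressed as a simple check. This is a minimal sketch: the function names and return labels are hypothetical, and only the 5% threshold comes from the text.

```python
def stickiness(dau, mau):
    """DAU/MAU ratio as a fraction, or None when there are no monthly users."""
    if mau == 0:
        return None
    return dau / mau


def diagnose(dau, mau, pmf_threshold=0.05):
    """Classify the problem type using the 5% DAU/MAU rule from the list above."""
    ratio = stickiness(dau, mau)
    if ratio is None:
        return "no data"
    if ratio < pmf_threshold:
        return "product-market fit problem"
    return "retention analysis applies"
```

For example, 300 daily users against 10,000 monthly users gives a ratio of 3%, which falls on the product-market fit side of the threshold.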
## Guidelines
- Never recommend "improve onboarding" without specifying *what* to change and *why*
@@ -149,6 +149,14 @@ Ask the user for these if not provided:
- [ ] Board asks are specific and actionable
- [ ] Deck is ≤ 15 slides (excluding appendix)
## Anti-Patterns
- [ ] Do not bury bad news after slides full of good news — boards lose trust when they discover problems were de-emphasised; lead with the honest narrative
- [ ] Do not include slides without a "so what" — a chart that shows data without a takeaway wastes board time and signals the presenter hasn't done the analysis
- [ ] Do not exceed 15 slides in the main deck — a longer deck usually means the presenter hasn't decided what matters most
- [ ] Do not attend a board meeting without at least one specific ask — a board meeting with no asks is a missed opportunity to leverage the room
- [ ] Do not report metrics without comparing them to plan or a prior period — a metric shown in isolation gives the board no basis for judgement
## Example Trigger Phrases
- "Build a board deck structure for our Q[N] board meeting"
@@ -119,6 +119,14 @@ Hi [Investor names or "all"],
- [ ] Total length is skimmable in 3-4 minutes
- [ ] No spin or buzzwords
## Anti-Patterns
- [ ] Do not omit challenges or bad news — sanitised updates erode investor trust faster than bad results do
- [ ] Do not bury the lead — use BLUF structure and put the most important news in the first paragraph
- [ ] Do not send an update without a clear "Ask" section — investors who want to help need to know how
- [ ] Do not use buzzwords or spin — investors see hundreds of updates and will see through vague positive language
- [ ] Do not report metrics without a comparison baseline — numbers without context (vs. last period or target) are meaningless
## Example Trigger Phrases
- "Write an investor update for [month/quarter]"
@@ -1,6 +1,6 @@
---
name: job-application
description: "Tailor a CV and cover letter to a specific job description. Use when asked to write a cover letter, tailor a CV or resume, optimise for ATS, match a job description, or prepare a job application. Produces an ATS-optimised tailored CV summary and a personalised cover letter."
description: "Tailors a CV and cover letter to a specific job description. Use when asked to write a cover letter, tailor a CV or resume, optimise for ATS, match a job description, or prepare a job application. Produces an ATS-optimised tailored CV summary and a personalised cover letter aligned to the role's requirements."
---
# Job Application Skill
@@ -120,6 +120,14 @@ Before submitting:
- [ ] Cover letter is 250-350 words
- [ ] Gaps are either addressed or strategically omitted
## Anti-Patterns
- [ ] Do not fabricate or embellish experience — only use real achievements from the provided CV
- [ ] Do not use the same cover letter template for every role — every letter must reference specific details of the job description
- [ ] Do not address selection criteria that aren't in the JD — match keywords the employer actually used
- [ ] Do not omit ATS optimisation — ensure role-specific keywords from the JD appear naturally in the CV summary
- [ ] Do not write a cover letter that re-summarises the CV — it must add context and motivation, not repeat bullet points
## Example Trigger Phrases
- "Help me apply for this job: [paste JD]"
@@ -1,13 +1,13 @@
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-cross",
"version": "1.0.0",
"description": "Cross-profession skills: Press Release, Grant Proposal, Executive Summary. Write journalist-ready press releases, structure grant applications aligned to funder priorities, and produce decision-ready executive summaries for any audience.",
"version": "1.1.0",
"description": "Cross-profession skills: Press Release, Grant Proposal, Executive Summary, Teaching Lesson Plan. Write journalist-ready press releases, structure grant applications, produce decision-ready executive summaries, and design complete lesson plans for any subject, audience, or setting.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["communications", "press-release", "grant", "executive-summary", "briefing", "funding", "media"]
"keywords": ["communications", "press-release", "grant", "executive-summary", "briefing", "funding", "media", "education", "teaching", "lesson-plan", "training"]
}
@@ -84,12 +84,21 @@ An executive summary is NOT a summary of the document. It is a standalone docume
**Client:** Lead with their problem. Show you understand before presenting recommendation.
## Quality Checks
- Bottom line in first 3 sentences
- Standalone — no need to read full document
- Recommendation is specific
- Fits length limit
- Written for audience priorities not author priorities
- Next steps have owners and dates
- [ ] Bottom line in first 3 sentences
- [ ] Standalone — no need to read full document
- [ ] Recommendation is specific
- [ ] Fits length limit
- [ ] Written for audience priorities not author priorities
- [ ] Next steps have owners and dates
## Anti-Patterns
- [ ] Do not summarise the document chronologically — an executive summary that follows the structure of the source document is not an executive summary, it is an abstract
- [ ] Do not bury the recommendation at the end — executives read the first paragraph and skim the rest; the ask must be in sentence one or two
- [ ] Do not use the same summary for different audiences — a CEO and a board member have different decision contexts and require different framing
- [ ] Do not include background that the reader already knows — every sentence of background must earn its place by making the bottom line more actionable
- [ ] Do not leave the "risks of inaction" section vague — a summary that does not quantify what happens if the reader does nothing removes the urgency needed for a decision
## Example Trigger Phrases
- "Write an executive summary of this report: [paste]"
@@ -87,11 +87,22 @@ Funder test: does this problem align with [funder] stated priorities? Make the c
---
## Funder Alignment Check
- Every section explicitly references funder stated priorities
- Word limits respected
- Budget aligns with eligible costs policy
- Required attachments prepared
## Quality Checks
- [ ] Every section explicitly references funder stated priorities (not just generic language)
- [ ] Problem statement includes specific data, not just assertions
- [ ] Objectives are SMART (measurable and time-bound)
- [ ] Budget narrative justifies every line with specific detail
- [ ] Sustainability section explains what happens after the grant ends
- [ ] Word limits respected
## Anti-Patterns
- [ ] Do not write a generic proposal — every section must be tailored to the specific funder's stated priorities
- [ ] Do not exceed the specified word or page limits — over-length proposals are disqualified at many funders
- [ ] Do not leave the sustainability section vague — funders need to know what happens after grant funding ends
- [ ] Do not use jargon the funder's reviewers won't understand — write for the panel, not the project team
- [ ] Do not underspecify the budget narrative — every significant line item must be justified with method and reasoning
## Example Trigger Phrases
- "Write a grant proposal for [project] applying to [funder]"
@@ -0,0 +1,158 @@
---
name: last-30-days-research
description: "Searches Reddit, X/Twitter, and the broader web for recent opinions, sentiment, and signal on any topic. Use when you need to know what real people are saying about a tool, product, trend, or event in the past 30 days — cutting through SEO content to surface genuine community reaction. Produces a structured report with consensus findings, pain points, positive signals, contrarian takes, source links, and a signal confidence rating."
---
# Last 30 Days Research
## The Problem
Googling gives SEO-stuffed "best of" lists written six months ago by someone who has never used the thing. Real honest takes live on Reddit threads, X replies, and niche communities — but chasing them across platforms eats your afternoon. This skill does the chase for you.
## Required Inputs
| Input | Required | Notes |
|-------|----------|-------|
| Topic | Yes | Tool, trend, feature, product, event, company — anything with a name |
| Date scope | No | Defaults to last 30 days. Can override to last 7 days or last 90 days |
| Angle | No | e.g. "focus on developer sentiment" or "looking for pricing complaints specifically" |
## Output Structure
The output is a structured research report with the following sections, delivered in this exact order:
```
## Last 30 Days Research: [Topic]
Research window: [Date 30 days ago] → [Today's date]
---
## What People Agree On
[Consensus points that appear across multiple platforms — most reliable signal]
## Where People Disagree
[Active debates, contrasting views — include which side has more weight]
## Pain Points That Keep Coming Up
[Recurring complaints and frustrations — strongest signal of real problems]
## Positive Signals
[What people genuinely praise — not PR, but unprompted appreciation]
## Most Interesting Takes
[Contrarian, unexpected, or surprisingly insightful comments worth noting]
## Sources
[Links to the most useful threads/posts found — 5-10 links with brief labels]
## Signal Confidence
[High / Medium / Low — with a one-line rationale based on data volume and consistency]
```
Each section should contain substantive content, not placeholders. If a section has no findings (e.g. no positive signals found), state that explicitly rather than leaving it empty or fabricating content.
## Instructions for Claude
### Step 1 — Calculate the date window
Determine today's date and subtract 30 days to get the research start date. Format: YYYY-MM-DD. Use these dates explicitly in every search query.
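The date window can be computed as follows. This is a small sketch using Python's standard library; the function name `research_window` and its parameters are illustrative, not part of the skill.

```python
from datetime import date, timedelta


def research_window(today=None, days=30):
    """Return (start, end) of the research window as YYYY-MM-DD strings."""
    end = today or date.today()
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()


print(research_window(date(2026, 6, 9)))  # ('2026-05-10', '2026-06-09')
```

Passing `days=7` or `days=90` covers the optional date-scope overrides listed under Required Inputs.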
### Step 2 — Reddit search
Run at least three web searches targeting Reddit:
```
site:reddit.com "[topic]" after:[30-days-ago-date]
site:reddit.com "[topic]" [year]
reddit.com "[topic]" discussion OR thread OR comments
```
For each result: read the thread title, top-level comments, and any highly-upvoted replies. Record the key claims and the URL.
If the topic has common synonyms or abbreviations, run additional searches with those (e.g. "Claude Code" and "claude.code" and "Anthropic coding tool").
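Filling the Reddit templates for a topic and its synonyms might look like this. It is a sketch only: `reddit_queries` is a hypothetical helper, the year in the second template is assumed to come from the window's start date, and whether a given search engine honours `site:` or `after:` is not guaranteed.

```python
def reddit_queries(topic, start_date, synonyms=()):
    """Build the Reddit-targeted search strings for a topic and its synonyms.

    `start_date` is the window start in YYYY-MM-DD form.
    """
    year = start_date[:4]
    queries = []
    for term in (topic, *synonyms):
        queries.append(f'site:reddit.com "{term}" after:{start_date}')
        queries.append(f'site:reddit.com "{term}" {year}')
        queries.append(f'reddit.com "{term}" discussion OR thread OR comments')
    return queries
```

Each synonym adds three more queries, so a topic with two synonyms produces nine Reddit searches.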
### Step 3 — X/Twitter search
Run at least two web searches targeting X:
```
site:twitter.com OR site:x.com "[topic]" after:[30-days-ago-date]
"[topic]" site:x.com -is:retweet
```
Note: X search via web has limitations. If results are sparse, supplement with searches for specific accounts known to discuss the topic area (e.g. tech journalists, domain experts).
### Step 4 — Broader web search
Run at least two broader searches for articles, blog posts, and commentary:
```
"[topic]" review OR opinion OR experience [month] [year]
"[topic]" vs OR alternative OR comparison [month] [year]
```
Target sources: Hacker News, Substack, dev.to, personal blogs, product communities. Avoid press releases and vendor-authored content.
### Step 5 — Cross-platform corroboration check
Before writing the report, review everything collected and apply the corroboration rule:
**When the same point appears on both Reddit and X independently, treat it as strong signal — it's likely true.**
A point mentioned only once on one platform is a data point, not a finding. Weight your sections accordingly.
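The corroboration rule can be sketched as a grouping step. This is an illustration under a strong simplifying assumption: claims are matched by exact string, whereas real research needs judgement to decide when two posts make the same point. The name `corroborate` is hypothetical.

```python
from collections import defaultdict


def corroborate(observations):
    """Split claims into findings (2+ platforms) and single data points.

    `observations` is an iterable of (claim, platform) pairs.
    """
    platforms = defaultdict(set)
    for claim, platform in observations:
        platforms[claim].add(platform)
    findings = sorted(c for c, seen in platforms.items() if len(seen) >= 2)
    data_points = sorted(c for c, seen in platforms.items() if len(seen) < 2)
    return findings, data_points
```

Claims in the first list are candidates for "What People Agree On"; claims in the second list need more support before they are reported as findings.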
### Step 6 — Write the report
Populate each section of the output structure. Follow these rules:
- **What People Agree On**: Only include points you saw on 2+ platforms or in multiple independent threads. These are your most reliable findings.
- **Where People Disagree**: Name the sides. "Some say X, others say Y — and the X camp seems louder based on upvote counts / engagement."
- **Pain Points**: Be specific. "Performance issues" is weak. "Cold start times over 4 seconds on the free tier" is useful.
- **Positive Signals**: Must be unprompted praise, not from product marketing or sponsored content.
- **Most Interesting Takes**: At least 2, maximum 5. Quote or closely paraphrase where possible.
- **Sources**: Include the actual URLs. Label each one briefly (e.g. "Reddit thread: 'Has anyone switched from X to Y?'").
- **Signal Confidence**: Rate High/Medium/Low based on:
  - High = 10+ sources, consistent signal across platforms
  - Medium = 5-10 sources, some inconsistency
  - Low = fewer than 5 sources, or highly fragmented signal
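The confidence rubric above can be written as a small function. This is a sketch, not a definitive rule: the `consistency` labels are hypothetical, and treating 10+ sources with mixed signal as Medium is an assumption, since the rubric does not cover that case explicitly.

```python
def signal_confidence(source_count, consistency):
    """Map source volume and consistency to High / Medium / Low.

    `consistency` is one of "consistent", "mixed", or "fragmented"
    (hypothetical labels chosen for this sketch).
    """
    if source_count < 5 or consistency == "fragmented":
        return "Low"
    if source_count >= 10 and consistency == "consistent":
        return "High"
    return "Medium"
```

Checking the fragmented case first means that volume alone can never lift a highly inconsistent picture above Low.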
### Step 7 — Sanity check before delivering
Before outputting the report, verify:
- [ ] Every claim in the report traces to an actual source found during research (not prior knowledge)
- [ ] The date window was actually applied to searches, not ignored
- [ ] No fabricated or hallucinated URLs in the Sources section
- [ ] Signal Confidence rating reflects the actual data volume, not optimism
## Quality Checks
- [ ] At minimum 3 Reddit searches were run with the date filter applied
- [ ] At minimum 2 X/Twitter searches were run
- [ ] At minimum 2 broader web searches were run
- [ ] Cross-platform corroboration principle was applied (same point on multiple platforms = stronger signal)
- [ ] Pain Points section contains specific, concrete details — not vague generalisations
- [ ] Sources section contains real URLs (not hallucinated), verified during research
- [ ] Signal Confidence is rated and justified
- [ ] If a section has no findings, it says so explicitly rather than being omitted or padded
- [ ] No vendor-authored content or press releases treated as independent signal
- [ ] Synonyms and alternative names for the topic were searched
## Anti-Patterns
- [ ] Do not treat SEO blog posts or vendor-authored content as community signal — only count independent sources
- [ ] Do not report findings without applying the date filter — prior knowledge mixed with recent search results produces stale, unverifiable claims
- [ ] Do not fabricate or guess at URLs — every link in the Sources section must have been retrieved during the research session
- [ ] Do not report a single mention as a "finding" — a finding requires corroboration from at least two independent sources
- [ ] Do not rate Signal Confidence as High when fewer than 10 credible sources were found — the rubric reserves High for 10+ sources, and overrating misleads the reader about how much to rely on the output
## Example Trigger Phrases
- "What are people saying about Cursor AI from the last 30 days?"
- "Research Vercel's recent sentiment"
- "Last 30 days on the Arc browser shutdown"
- "What's the current vibe on Supabase?"
- "What are developers saying about Claude Code lately?"
- "Research [topic] from the last 30 days"
- "Give me a signal report on [product]"
- "What's the Reddit and Twitter take on [trend]?"
@@ -0,0 +1,183 @@
---
name: notebooklm-connector
description: "Automates NotebookLM from Claude Code using browser automation via the Claude Chrome extension — creating notebooks, adding sources, and triggering outputs without manual clicking. Use when you want to create a NotebookLM notebook, add URLs or documents as sources, or generate mindmaps, audio overviews, or briefing docs programmatically. Produces a confirmed checklist of completed actions and a direct link to the notebook."
---
# NotebookLM Connector
## The Problem
NotebookLM is one of the best AI research tools — but it doesn't connect to your other tools. Every notebook requires manual setup inside the NotebookLM UI: open browser, name the notebook, paste URLs one by one, click generate. For researchers, builders, or anyone who works with a high volume of sources, this friction compounds fast.
This skill automates NotebookLM from Claude Code using browser automation via the Claude Chrome extension.
## Prerequisites
| Requirement | Details |
|-------------|---------|
| Claude Chrome extension | Must be installed and active in your Chrome browser |
| NotebookLM account | Active account at notebooklm.google.com |
| Chrome browser | Open and signed into NotebookLM |
If the Chrome extension is not installed, this skill cannot function. There is no fallback — you will need to perform actions manually.
## Required Inputs
| Input | Required | Notes |
|-------|----------|-------|
| Action(s) to perform | Yes | What you want done — see Supported Actions below |
| Notebook name | Conditional | Required for create; optional for add/generate if a notebook is already open |
| Sources | Conditional | Required for add sources action — URLs, file paths, or pasted text |
| Output type | Conditional | Required for generate action — mindmap, audio overview, or briefing doc |
## Supported Actions
| Action | What It Does |
|--------|-------------|
| Create notebook | Opens NotebookLM, creates a new notebook with the specified title |
| Add sources | Adds one or more URLs, files, or text blocks as sources to a notebook |
| Generate mindmap | Triggers mindmap generation from the notebook's sources |
| Generate audio overview | Requests an audio overview (note: takes several minutes to render) |
| Generate briefing doc | Requests a briefing document or slide deck from sources |
| List notebooks | Lists your existing notebooks and their source counts |
| Open notebook | Navigates to a specific existing notebook by name |
Actions can be chained in a single request: "Create a notebook called 'AI Trends Q2', add these 3 URLs as sources, then generate a mindmap."
## Output Structure
After completing actions, Claude returns a structured confirmation:
```
## NotebookLM — Actions Completed
**Notebook:** [Notebook name]
**URL:** [Direct link to the notebook]
**Actions completed:**
- [x] Created notebook: "[Name]"
- [x] Added source: [URL or file name]
- [x] Added source: [URL or file name]
- [x] Triggered: Mindmap generation
**Status:** [Any pending items — e.g. "Audio overview is generating, check back in 5-10 minutes"]
**Notes:** [Any issues encountered or deviations from the requested actions]
```
If an action fails, the failed step is marked with `[ ]` and a reason is provided. See Error Handling below.
## Instructions for Claude
### Step 1 — Parse and confirm the request
Before opening any browser, parse the full request into discrete steps:
1. What notebook is being targeted (new or existing)?
2. What sources need to be added (list each URL or file)?
3. What outputs need to be generated?
If anything is ambiguous — e.g. "add my research sources" without specifying what they are — ask for clarification before proceeding. Do not guess at source URLs.
### Step 2 — Check the Chrome extension is available
Confirm browser automation is available via the Claude Chrome extension. If it is not active, stop and report:
> "This skill requires the Claude Chrome extension to be installed and active. Please install it at [extension URL] and try again."
### Step 3 — Navigate to NotebookLM
Open or navigate to `https://notebooklm.google.com`. Confirm the user is logged in. If a login screen appears, stop and ask the user to log in manually, then retry.
### Step 4 — Execute actions in order
Execute each action in the sequence requested. After each action, confirm it completed before moving to the next. Do not batch actions speculatively.
**Creating a notebook:**
- Click "New Notebook"
- Enter the specified title
- Confirm the notebook is created and visible
**Adding a URL source:**
- In the notebook, click "Add Source"
- Select "Website" or "URL"
- Paste the URL
- Wait for the source to process and appear in the sources list
- Confirm before adding the next source
**Adding pasted text:**
- Click "Add Source"
- Select "Copied text" or "Paste text"
- Paste the content
- Confirm the source appears
**Generating a mindmap:**
- Navigate to the notebook's output options
- Select "Mindmap" from available outputs
- Trigger generation
- Confirm the mindmap begins rendering
**Generating an audio overview:**
- Navigate to output options
- Select "Audio Overview"
- Trigger generation
- Note: rendering takes several minutes — report this to the user, do not wait for completion
### Step 5 — Compile and return the confirmation
Return the structured output described in the Output Structure section above, including the direct notebook URL and a checklist of completed/failed actions.
## Error Handling
If any step fails, do the following:
1. Stop at the failed step (do not attempt to continue)
2. Report the exact step that failed and what was observed
3. Suggest a manual workaround for that step
4. Offer to retry from that point
**Common failures and workarounds:**
| Failure | Likely Cause | Manual Workaround |
|---------|-------------|-------------------|
| Extension not detected | Extension not installed or disabled | Install from Chrome Web Store |
| Login screen appears | Session expired | Log in manually, then retry |
| Source fails to process | URL is paywalled or blocked | Download content and add as pasted text instead |
| Mindmap not available | Source volume too low | Add more sources (NotebookLM requires minimum content) |
| Audio overview grayed out | Sources not yet indexed | Wait 1-2 minutes for indexing, then retry |
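The stop-at-first-failure behaviour and the resulting checklist can be sketched as below. This is an assumption-laden illustration: each step is a hypothetical Python callable standing in for a browser action, and listing later steps as skipped is a choice made for this sketch rather than something the skill specifies.

```python
def run_actions(actions):
    """Run (label, callable) pairs in order and stop at the first failure.

    Each callable is a hypothetical stand-in for one browser step; it
    returns normally on success and raises an exception on failure.
    """
    lines = []
    failed = False
    for label, step in actions:
        if failed:
            # Nothing after the failed step is attempted.
            lines.append(f"- [ ] {label} (skipped after earlier failure)")
            continue
        try:
            step()
            lines.append(f"- [x] {label}")
        except Exception as exc:
            failed = True
            lines.append(f"- [ ] {label} (failed: {exc})")
    return lines
```

The returned lines follow the checklist format from the Output Structure section, so a failed run still yields an accurate record of what completed and where it stopped.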
## Limitations
- **Chrome extension required** — This skill does not work in the Claude web interface without the extension. It cannot function in API-only or terminal-only Claude setups.
- **NotebookLM UI changes** — If Google updates the NotebookLM interface, specific steps (button names, navigation paths) may need to be updated in this skill.
- **Audio overview render time** — Audio overviews are queued server-side by NotebookLM and typically take 5-15 minutes. Claude can trigger the request but cannot wait for completion.
- **File uploads** — Uploading local files (PDFs, docs) requires the file to be accessible from the browser. File paths must be absolute.
- **Session state** — Claude cannot save or restore NotebookLM session state between conversations. Each session starts fresh.
## Quality Checks
- [ ] User's full request was parsed into discrete steps before any browser action was taken
- [ ] Ambiguous source references were clarified before proceeding
- [ ] Each action was confirmed complete before the next one started
- [ ] Direct notebook URL is included in the output
- [ ] If audio overview was triggered, user was informed of the render delay
- [ ] Any failed steps are explicitly reported with the specific failure reason
- [ ] Manual workaround was offered for any step that failed
- [ ] Output checklist accurately reflects what was completed vs. what failed
## Anti-Patterns
- [ ] Do not proceed with any browser action before the full request has been parsed into discrete steps — ambiguous source references must be clarified before navigating
- [ ] Do not guess at source URLs if the user says "add my research sources" without specifying them — ask for the explicit list before starting
- [ ] Do not batch actions speculatively — each action must be confirmed complete before the next one begins to avoid compounding failures
- [ ] Do not wait for audio overview rendering to complete — audio overviews take 5-15 minutes server-side; report the trigger and move on rather than blocking the session
- [ ] Do not attempt this skill if the Claude Chrome extension is not active — report the missing prerequisite immediately rather than attempting browser steps that will fail
## Example Trigger Phrases
- "Open NotebookLM and create a notebook called 'Competitor Analysis Q2'"
- "Add these 5 URLs as sources to my NotebookLM notebook"
- "Generate a mindmap in NotebookLM from my current notebook"
- "Create a NotebookLM notebook on AI agent frameworks, add these sources, and generate an audio overview"
- "What notebooks do I have in NotebookLM?"
- "Add this article to NotebookLM: [URL]"
- "Generate a briefing doc from my NotebookLM sources on [topic]"
@@ -64,6 +64,22 @@ ENDS
## Journalist Test
Would a journalist care? Is the headline the full story? Is there a human angle? Is the quote something a human would say? Can the first paragraph stand alone?
## Quality Checks
- [ ] Headline uses active voice and is under 10 words
- [ ] First paragraph stands alone as the complete story
- [ ] Quote adds something the facts don't say (not a restatement)
- [ ] Boilerplate is factual, not promotional
- [ ] Embargo date and media contact are included
## Anti-Patterns
- [ ] Do not bury the news — the most important information must appear in the first paragraph (inverted pyramid)
- [ ] Do not use promotional language or superlatives — press releases must read as news, not advertising copy
- [ ] Do not omit the boilerplate — every press release needs the standard "About [Company]" paragraph at the end
- [ ] Do not forget the embargo date and media contact — journalists need both to use the release
- [ ] Do not write a headline longer than 12 words — it must be scannable and specific
## Example Trigger Phrases
- "Write a press release announcing [news]"
- "Draft a media statement about [event]"
@@ -0,0 +1,164 @@
---
name: sycophancy-challenger
description: "Flips Claude's default from validation to adversarial critique. Use before high-stakes decisions, plans, assumptions, or pitches you haven't stress-tested. Produces structured challenges, steelmanned counter-arguments, and the strongest case against your position — a genuine thinking partner, not a mirror."
---
# Sycophancy Challenger
Claude defaults to validating. You bring a decision, it finds three reasons your instinct is solid, and you leave more confident but not more right. That's actively dangerous when the stakes are high — a hiring call, a pricing change, a strategy pivot, a public commitment. This skill flips the default: Claude argues against your idea first, holds its position under pushback, and only concedes when you give it new evidence. Not when you express displeasure.
> Credit: Originally created by Joel Salinas (Leadership in Change) — adapted and extended for this library.
---
## Required Inputs
| Input | Format | Notes |
|---|---|---|
| Your idea, decision, plan, or assumption | Describe it in plain language | More context = sharper challenge. Include reasoning if you have it. |
No other setup required. Activating the skill is enough — describe your idea and Claude will challenge it immediately.
---
## Output Structure
Every response in this mode follows this exact format:
```
## Strongest Case AGAINST This
[The single most damaging criticism of the idea. Not a list of concerns — the
one argument that, if true, would kill this. Stated directly, without softening.]
## The Weakest Element
[The specific part of the idea most likely to fail, be wrong, or break under
real-world conditions. Named precisely. Not "execution risk" — the actual thing.]
## What You'd Need to Prove to Make This Work
[The assumptions that must be true for this idea to succeed. Written as testable
claims, not as encouragement. If an assumption can't be tested, that's noted.]
## What I Can't Find Fault With
[Only appears when a genuine search finds nothing damaging. States clearly what
holds up and why — doesn't invent weak praise to fill the section. If everything
is actually fine, says so plainly and explains why the challenge came up short.]
```
No additional sections. No summary. No "overall, this is a solid idea." The format ends when the four sections are complete.
---
## Instructions for Claude
### On activation
Do not open with agreement, validation, or any form of "I see where you're coming from." Begin the challenge immediately. The first word of your response should advance the criticism, not soften the user's expectations.
### Step 1: Assume the idea hasn't been stress-tested
Treat the idea as if the user believes in it strongly and has not actively looked for reasons it fails. Your job is to be the adversary they didn't have in the room.
### Step 2: Find the strongest case against it
Not a balanced view. Not pros and cons. The strongest case against. Ask:
- What's the most likely way this fails?
- What's the assumption that, if wrong, makes everything else irrelevant?
- Who would argue against this, and what's the best version of their argument?
- What does this idea get wrong about how people, markets, or systems actually behave?
State the strongest case directly. Do not list multiple criticisms in this section — lead with the one that does the most damage.
### Step 3: Identify the weakest element
This is different from the strongest case against. The weakest element is the most fragile specific component — the thing most likely to crack under execution, scrutiny, or changed conditions. Name it precisely. Examples of insufficient answers:
- "The timeline might be tight" → insufficient
- "The assumption that customers will pay $99/month before experiencing the product is the element most likely to break this, because you have no evidence of willingness-to-pay at that price point" → correct level of specificity
### Step 4: Surface the required assumptions
List what must be true for this to work. Write each assumption as a testable claim:
```
For this to work, the following must be true:
1. [Assumption stated as a claim that can be verified or falsified]
2. [Assumption stated as a claim]
3. [Assumption stated as a claim]
```
If an assumption cannot be tested — it's based on hope, belief, or unprovable prediction — flag it explicitly: "This assumption cannot currently be tested. That's a risk."
### Step 5: Report what holds up (only if true)
Search genuinely for what the idea gets right or where the challenge fails. If you find it, state it clearly. If you can't find a real flaw, say exactly that: "I've looked for the failure points and I can't find them. Here's what actually holds up: [specific things]." Do not invent praise. Do not invent flaws either.
### Handling pushback
If the user pushes back:
- **New evidence or new information:** update your position based on the evidence. State what changed and why.
- **Emotional pushback, repetition, or displeasure:** do not move. Restate the criticism calmly. Example: "I understand you feel strongly about this — I'm not backing off the point about X because that hasn't changed. If there's something I'm missing, tell me what it is."
- **A clarification that changes the picture:** acknowledge the clarification, adjust if warranted, and explain exactly what the clarification changed.
Do not soften a position because the user seems upset. Do not move back to validation mode mid-conversation.
### When the skill ends
The session is complete when the user has either:
1. Strengthened their idea by addressing the core criticism with real evidence or a genuine plan adjustment, or
2. Identified a real flaw they're going to fix.
Not when they've expressed satisfaction. Not when a certain number of exchanges have happened. The measure is whether something actually changed or was genuinely defended.
### Prohibitions
These prohibitions do more work than the rules above. Follow them absolutely:
- **Never open with agreement or validation.** Not "That's an interesting approach," not "I can see why you'd think that." Start with the challenge.
- **Never say "great question," "great point," or "I see where you're coming from" as a lead.** These are validation openers, not neutral transitions.
- **Never soften a criticism with "however, there are also positives."** If the positives are real, they go in the "What I Can't Find Fault With" section, not as a counterweight to every criticism.
- **Never back down because the user expressed displeasure.** Only move if given new evidence.
- **Never invent a flaw that isn't real.** If the idea is actually solid, say so. Inventing fake criticisms is as useless as fake validation.
- **Never use the word "valid" to describe the user's perspective mid-challenge.** It's a validation signal disguised as a neutral word.
---
## Quality Checks
- [ ] Response opened with the challenge — not with a softening phrase or acknowledgment
- [ ] "Strongest Case Against" section contains one argument, not a list
- [ ] "Weakest Element" is specific — names the actual component, not a category of risk
- [ ] "What You'd Need to Prove" lists testable assumptions, not encouragement
- [ ] Untestable assumptions are explicitly flagged as risks
- [ ] "What I Can't Find Fault With" only appears if the search was genuine and something held up
- [ ] No invented flaws — every criticism connects to something real in what the user described
- [ ] Pushback was met with a position restatement, not a retreat (unless new evidence was provided)
- [ ] The session ended because something changed or was genuinely defended — not because the user seemed satisfied
- [ ] None of the prohibited phrases or patterns appear anywhere in the response
---
## Anti-Patterns
- [ ] Do not open with a softening phrase or acknowledgment before the challenge — the first sentence must be the critique
- [ ] Do not retreat from a position when the user pushes back without providing new evidence — update only when genuinely persuaded
- [ ] Do not invent flaws — every criticism must connect to something real in what the user described
- [ ] Do not provide a list of weak objections — identify the single strongest case against the idea
- [ ] Do not end the session because the user seems satisfied — end only when something genuinely changed or was defended
## Example Trigger Phrases
- "Use the sycophancy-challenger skill — here's my plan: [describe it]"
- "Challenge this idea before I commit to it: [describe it]"
- "I've already decided to do X — tell me why I'm wrong"
- "Be the devil's advocate on this hire: [describe the candidate and the role]"
- "I'm about to pitch this to investors — tear it apart first: [describe it]"
- "Don't validate this, challenge it: [idea or assumption]"
- "Stress-test this strategy: [describe it]"
- "What's the strongest argument against doing this: [decision]"
- "I think I'm right about X — what am I missing?"
@@ -0,0 +1,126 @@
---
name: teaching-lesson-plan
description: "Design a structured lesson plan for any subject, audience, or format. Use when asked to write a lesson plan, course outline, teaching session, workshop curriculum, or training module. Produces a complete lesson plan with learning objectives, activities, timing, assessment, and differentiation guidance."
---
# Teaching Lesson Plan Skill
Produces a complete, structured lesson plan for any subject, age group, or setting — from a one-hour corporate training to a full school lesson. Built around clear learning objectives, varied activities, and formative assessment.
## Required Inputs
Ask the user for these if not provided:
- **Subject or topic**
- **Audience** (age group, experience level, group size)
- **Session length** (30 / 45 / 60 / 90 / 120 minutes)
- **Setting** (classroom / workshop / online / corporate training / one-to-one)
- **Learning goal** (what should participants know or be able to do by the end?)
- **Prior knowledge** (what can you assume they already know?)
## Output Structure
---
# Lesson Plan: [Topic]
**Subject:** [Subject] | **Audience:** [Description] | **Duration:** [X minutes]
**Setting:** [Setting] | **Group size:** [N]
---
## Learning Objectives
By the end of this session, participants will be able to:
1. [Objective 1 — use Bloom's taxonomy verbs: recall, explain, apply, analyse, evaluate, create]
2. [Objective 2]
3. [Objective 3 — maximum 3-4 objectives per session]
**Key vocabulary:** [3-5 terms participants will need to know]
---
## Materials and Preparation
- [ ] [Resource 1 — slides, handout, equipment]
- [ ] [Resource 2]
- [ ] Room setup: [configuration — rows / circles / tables / breakout spaces]
---
## Lesson Structure
| Time | Phase | Activity | Format |
|---|---|---|---|
| [00:00] | Hook / Opener | [How you grab attention and establish relevance] | [Whole group / Individual / Pairs] |
| [00:05] | Prior knowledge | [How you connect to what they already know] | [Discussion / Quiz / Think-pair-share] |
| [00:15] | Instruction | [Direct teaching of new content] | [Explanation / Demo / Video] |
| [00:30] | Guided practice | [Supported practice with feedback] | [Worked examples / Group task] |
| [00:50] | Independent practice | [Students apply learning independently] | [Task / Problem / Discussion] |
| [01:05] | Check for understanding | [Formative assessment] | [Exit ticket / Quiz / Q&A] |
| [01:15] | Closure | [Summarise, connect to next session] | [Whole group] |
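The start offsets in the table above can be checked mechanically against the session length. The following is a minimal Python sketch, not part of the skill itself: the function name `phase_durations` and the 90-minute session length are assumptions chosen for illustration.

```python
def phase_durations(starts, session_length):
    """Return per-phase durations in minutes from ordered start offsets.

    `starts` are minutes from the beginning of the session; the last phase
    is assumed to run until `session_length`.
    """
    if not starts or starts[0] != 0 or starts != sorted(starts):
        raise ValueError("phase starts must begin at 0 and be in ascending order")
    if starts[-1] >= session_length:
        raise ValueError("last phase starts at or after the end of the session")
    ends = starts[1:] + [session_length]
    return [end - start for start, end in zip(starts, ends)]


# Offsets from the template table: 00:00, 00:05, 00:15, 00:30, 00:50, 01:05, 01:15
template_starts = [0, 5, 15, 30, 50, 65, 75]
durations = phase_durations(template_starts, session_length=90)  # 90 is an assumed length

# The plan is time-feasible only if the phases fill the session exactly.
assert sum(durations) == 90
```

With these assumed inputs, the closure phase receives the final 15 minutes; a shorter session would need the earlier phases compressed rather than the closure dropped.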
---
## Key Explanations and Worked Examples
### [Concept 1]
[Clear explanation + one concrete worked example. Explain the concept the way a good teacher would — no jargon without definition, one idea at a time.]
### [Concept 2]
[Explanation + example]
---
## Differentiation
**For those who need more support:**
- [Scaffold: e.g. sentence starters, worked examples, vocabulary cards]
- [Modified task or reduced scope]
**For those ready for a challenge:**
- [Extension: e.g. apply to a new context, evaluate, create something]
---
## Formative Assessment (Check for Understanding)
**During session:**
- [Method 1: e.g. Cold calling with no-stakes approach, thumbs up/down, mini whiteboards]
- [Method 2: e.g. Think-pair-share before moving on]
**Exit ticket (last 5 minutes):**
[One specific question that directly tests the learning objective — not "what did you enjoy?" but "solve this problem" or "explain this concept in your own words"]
---
## Common Misconceptions to Address
| Misconception | Correct understanding | How to address it |
|---|---|---|
| [What learners often get wrong] | [The correct version] | [Specific activity or explanation] |
---
## Quality Checks
- [ ] Learning objectives use action verbs (not "understand" or "know")
- [ ] Session has a clear hook that establishes relevance
- [ ] Activities are varied (not all listening)
- [ ] Formative assessment checks the actual learning objective
- [ ] Differentiation is specified for both support and extension
- [ ] Timing adds up to session length
## Anti-Patterns
- [ ] Do not design a lesson plan without explicitly stating the learning objectives — activities must trace back to outcomes
- [ ] Do not allocate timing that does not add up to the total session length — the plan must be time-feasible
- [ ] Do not create activities with no assessment component — learning must be measurable, not just delivered
- [ ] Do not ignore differentiation — a plan with no accommodation for different learning levels or abilities is incomplete
- [ ] Do not front-load all content delivery without interactive breaks — passive listening degrades retention after 15-20 minutes
## Example Trigger Phrases
- "Write a lesson plan on [topic] for [audience]"
- "Design a 60-minute session on [subject]"
- "Create a training module on [skill]"
- "Plan a workshop on [topic] for [group]"
@@ -0,0 +1,13 @@
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-cs",
"version": "1.0.0",
"description": "Customer Success skills: Customer Health Scorecard, QBR Deck, Escalation Brief, Churn Analysis. Score account health with a weighted RAG framework, build structured QBR decks with value narratives, write crisp escalation briefs for at-risk accounts, and analyse churn by category and segment with prioritised interventions.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["customer-success", "account-management", "health-scorecard", "qbr", "quarterly-business-review", "churn", "retention", "escalation", "csm", "renewal"]
}
@@ -0,0 +1,187 @@
---
name: churn-analysis
description: "Produce a structured churn analysis that separates avoidable from unavoidable churn. Use when investigating why customers are leaving, identifying at-risk segments, calculating net revenue retention, or building a retention intervention plan. Produces a churn report with rate calculations, categorised reasons by avoidability, segment breakdown, timing analysis, early warning signals, and prioritised interventions ranked by estimated impact."
---
# Churn Analysis Skill
Produce a structured churn analysis that goes beyond the headline rate — identifying why customers leave, which segments are most at risk, and what interventions will have the highest impact on retention.
## Required Inputs
Ask for these if not already provided:
- **Time period** being analysed (e.g. Q1, last 12 months)
- **Total customers at start of period** and **customers churned**
- **ARR or revenue lost** to churn
- **Churn reasons data** — exit survey results, CSM notes, support data, or sales loss reasons
- **Customer segments** — by tier, industry, cohort, or product line
- **Current retention rate** if known
- **Any recent changes** — pricing, product, support model — that may have affected churn
## Churn Categories
Always classify churn before analysing it:
| Category | Definition |
|---|---|
| **Voluntary — avoidable** | Customer left due to a problem we could have addressed (product gaps, poor onboarding, relationship failures) |
| **Voluntary — unavoidable** | Customer left for reasons outside our control (budget cuts, acquisition, company shutdown) |
| **Involuntary** | Payment failure, contract non-renewal by mistake, admin error |
The interventions for each category are different. Conflating them leads to wrong conclusions.
## Output Format
---
# Churn Analysis: [Product / Segment / Company]
**Period:** [Start date] — [End date]
**Prepared by:** [Name] | **Date:** [Date]
---
## Headline Numbers
| Metric | Value |
|---|---|
| Customers at start of period | [N] |
| Customers churned | [N] |
| **Customer churn rate** | **[X]%** |
| ARR at start of period | £/$/€[X] |
| ARR lost to churn | £/$/€[X] |
| **Revenue churn rate (gross)** | **[X]%** |
| ARR from expansions (same period) | £/$/€[X] |
| **Net revenue retention (NRR)** | **[X]%** |
**Benchmark context:**
- Customer churn rate: [X]% vs. industry benchmark [Y]% — [above / below / in line]
- NRR: [X]% — [What this means: above 100% = expansion offsets churn; below 100% = shrinking base]
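The three headline rates can be computed from the raw inputs as follows. This is a minimal Python sketch under stated assumptions: the function name and the sample figures are hypothetical, and NRR is computed from churned and expansion ARR only, because the table above has no row for contraction (downgrades). If contraction data exists, it would also be subtracted in the numerator.

```python
def churn_metrics(customers_start, customers_churned,
                  arr_start, arr_churned, arr_expansion):
    """Return headline churn metrics as percentages, rounded to one decimal."""
    if customers_start <= 0 or arr_start <= 0:
        raise ValueError("starting customer count and starting ARR must be positive")
    return {
        # Denominator is the starting cohort, not the end-of-period count.
        "customer_churn_rate": round(100 * customers_churned / customers_start, 1),
        "gross_revenue_churn_rate": round(100 * arr_churned / arr_start, 1),
        # Assumption: no contraction data, so NRR = (start - churn + expansion) / start.
        "net_revenue_retention": round(
            100 * (arr_start - arr_churned + arr_expansion) / arr_start, 1
        ),
    }


# Hypothetical period: 200 customers and 1,000,000 ARR at the start.
metrics = churn_metrics(
    customers_start=200,
    customers_churned=16,
    arr_start=1_000_000,
    arr_churned=60_000,
    arr_expansion=90_000,
)
print(metrics)
```

In this made-up example, expansion exceeds churned ARR, so NRR lands above 100% even though 8% of customers left.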
---
## Churn Breakdown by Category
| Category | Customers | % of churn | ARR lost |
|---|---|---|---|
| Voluntary — avoidable | [N] | [X]% | £/$/€[X] |
| Voluntary — unavoidable | [N] | [X]% | £/$/€[X] |
| Involuntary | [N] | [X]% | £/$/€[X] |
| **Total** | **[N]** | **100%** | **£/$/€[X]** |
**Avoidable churn as % of total churn:** [X]% — this is the number we can actually influence.
---
## Churn Reasons — Avoidable Churn Only
Rank by frequency. Include ARR weight where data allows.
| Reason | Count | % of avoidable churn | ARR lost | Representative quote |
|---|---|---|---|---|
| [Reason 1 — e.g. "Product missing key feature"] | [N] | [X]% | £/$/€[X] | "[Quote]" |
| [Reason 2] | [N] | [X]% | £/$/€[X] | "[Quote]" |
| [Reason 3] | [N] | [X]% | £/$/€[X] | "[Quote]" |
| [Reason 4] | [N] | [X]% | £/$/€[X] | "[Quote]" |
| Other | [N] | [X]% | £/$/€[X] | — |
**Theme synthesis:** [2-3 sentences grouping the top reasons into 2-3 themes. E.g. "The top three reasons cluster around two themes: product gaps in [area] (affecting X% of avoidable churn) and onboarding failures where customers never achieved value (Y%)."]
---
## Churn by Segment
Identify which segments over- or under-index for churn.
### By Tier
| Tier | Churn rate | vs. Overall | Notes |
|---|---|---|---|
| Enterprise | [X]% | +/-[X]pp | |
| Mid-Market | [X]% | +/-[X]pp | |
| SMB | [X]% | +/-[X]pp | |
### By Cohort (Acquisition Year)
| Cohort | Churn rate | Notes |
|---|---|---|
| [Year 1] | [X]% | |
| [Year 2] | [X]% | |
| [Year 3] | [X]% | |
### By Industry / Use Case (if data available)
| Segment | Churn rate | Notes |
|---|---|---|
| [Segment 1] | [X]% | |
| [Segment 2] | [X]% | |
**Key pattern:** [Which segment has the highest churn rate and what likely explains it]
---
## Timing Analysis
- **Average contract length before churn:** [X months]
- **Highest-risk moment:** [e.g. "Month 3 — when trial value has worn off but full adoption hasn't happened"]
- **Churn timing distribution:**
| When churn occurred | % of churned accounts |
|---|---|
| 0-3 months | [X]% |
| 3-6 months | [X]% |
| 6-12 months | [X]% |
| 12+ months | [X]% |
---
## Early Warning Signals
Based on the churned accounts, identify the signals that preceded churn (and could have triggered earlier intervention):
| Signal | Lead time before churn | How to detect |
|---|---|---|
| [Signal 1 — e.g. "DAU/MAU dropped below 15%"] | [~X weeks] | [Usage dashboard / alert] |
| [Signal 2 — e.g. "No QBR in 90+ days"] | [~X weeks] | [CRM flag] |
| [Signal 3 — e.g. "Champion left the account"] | [~X weeks] | [LinkedIn alert / CSM tracking] |
| [Signal 4] | [~X weeks] | [Detection method] |
---
## Intervention Recommendations
Ranked by estimated impact × feasibility.
| Intervention | Addresses | Est. churn reduction | Effort | Owner |
|---|---|---|---|---|
| [Intervention 1 — e.g. "Improve onboarding for [segment] with dedicated 30-day check-in"] | [Reason 1] | [X accounts / £X ARR] | Low / Med / High | [Team] |
| [Intervention 2] | [Reason 2] | [X accounts / £X ARR] | Low / Med / High | [Team] |
| [Intervention 3] | [Reason 3] | [X accounts / £X ARR] | Low / Med / High | [Team] |
**Priority call:** [Which one intervention, if implemented this quarter, would have the biggest impact and why]
---
## What We Don't Know (Data Gaps)
- [Data gap 1 — e.g. "Exit survey response rate is only 30% — the reasons data may not be representative"]
- [Data gap 2 — e.g. "No product usage data for SMB tier — can't confirm usage signal correlation"]
- [Data gap 3]
---
## Anti-Patterns
- [ ] Do not mix avoidable and unavoidable churn in intervention plans — recommending product fixes for customers who churned due to company shutdown wastes resources
- [ ] Do not calculate churn rate using end-of-period customer count as the denominator — that count includes customers acquired during the period, so it understates churn whenever the base grew; always divide churned customers by the starting cohort
- [ ] Do not rely solely on exit survey data for churn reasons — response rates are typically low and self-selection biases the sample toward customers who are engaged enough to complete a survey
- [ ] Do not recommend interventions without linking them to a specific churn reason — interventions disconnected from root causes will not move retention
- [ ] Do not report only gross revenue churn — without net revenue retention (NRR), a healthy-looking retention number can hide a shrinking revenue base
## Quality Checks
- [ ] Churn rate is correctly calculated (churned ÷ starting cohort, not end-of-period total)
- [ ] Avoidable and unavoidable churn are separated — interventions target avoidable churn only
- [ ] Churn reasons are customer-reported, not internally assumed
- [ ] Segment analysis identifies which segments over-index — not just averages
- [ ] Early warning signals are specific and detectable, not generic ("low engagement")
- [ ] Interventions link directly to the top churn reasons — no recommendations without a root cause match
@@ -0,0 +1,184 @@
---
name: cs-escalation-brief
description: "Write a structured escalation brief for an at-risk customer account. Use when an account has escalated, when a customer is threatening churn, when a P1 customer issue needs executive attention, or when preparing an internal save play. Produces a crisp escalation brief with account context, timeline, root cause, business impact, and a clear resolution plan."
---
# Customer Escalation Brief Skill
Produce a clear, concise escalation brief that gives internal stakeholders — VP CS, CCO, product leadership, or the CEO — everything they need to understand the situation, make decisions, and act fast.
A good escalation brief is not a complaint. It is a professional document that states the facts, assigns accountability honestly, and proposes a specific resolution plan.
## Required Inputs
Ask for these if not already provided:
- **Account name**, tier, and ARR
- **CSM name** and account owner
- **Nature of the escalation** — what happened, what the customer is saying
- **Timeline** of events leading to escalation
- **Customer contact** who escalated (name, role, influence level)
- **What the customer wants** — their stated ask
- **What we believe the root cause is**
- **What has already been done** to address the situation
- **Renewal date** and current renewal risk assessment
## Escalation Levels
Calibrate urgency and audience based on escalation level:
| Level | Trigger | Audience | Response time |
|---|---|---|---|
| L1 — Account Risk | Customer expressing dissatisfaction; renewal at risk | CSM + CS Manager | 24 hours |
| L2 — Executive Escalation | Customer escalated to their exec; requesting vendor exec involvement | VP CS + Account Exec | 4 hours |
| L3 — Churn Risk | Customer has issued notice or is in active churn conversation | CCO / CEO + Revenue leadership | 1 hour |
| L4 — Public Risk | Customer threatening public escalation, legal, or press | CCO / Legal / Comms | Immediate |
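The response times in the table above translate directly into a respond-by deadline. The sketch below is illustrative only: the dictionary and function names are hypothetical, "Immediate" is modelled as a zero-length window, and the times are treated as plain elapsed hours with no business-hours calendar, which the source does not specify.

```python
from datetime import datetime, timedelta

# Response windows taken from the escalation-level table.
RESPONSE_WINDOWS = {
    "L1": timedelta(hours=24),
    "L2": timedelta(hours=4),
    "L3": timedelta(hours=1),
    "L4": timedelta(0),  # "Immediate", modelled here as no delay (assumption)
}


def respond_by(level, raised_at):
    """Return the latest acceptable first-response time for an escalation."""
    try:
        return raised_at + RESPONSE_WINDOWS[level]
    except KeyError:
        raise ValueError(f"unknown escalation level: {level!r}") from None


raised = datetime(2025, 1, 6, 9, 0)  # hypothetical timestamp
print(respond_by("L2", raised))
```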
## Output Format
---
# Escalation Brief: [Account Name]
**Escalation level:** L[1/2/3/4] — [Label]
**Date raised:** [Date]
**Raised by:** [CSM name]
**Escalation owner:** [Name of exec or senior stakeholder now leading response]
---
## Account at a Glance
| Field | Detail |
|---|---|
| ARR | £/$/€[X] |
| Tier | Enterprise / Mid-Market / SMB |
| Customer since | [Date] |
| Renewal date | [Date] — [N] days away |
| Renewal risk (pre-escalation) | Green / Amber / Red |
| Renewal risk (current) | Green / Amber / Red |
| Customer contact who escalated | [Name, role, seniority] |
| Executive sponsor (customer) | [Name, role — active / passive / vacant] |
| Executive sponsor (vendor) | [Name, role] |
---
## What Happened — Summary
[3-5 sentences. State the facts plainly. What the customer experienced, how they reacted, and how we learned about the escalation. No editorialising. No blame.]
---
## Timeline
List in chronological order. Each entry: `[Date / time] — [What happened. Who did what.]`
Include:
- When the original issue or trigger event occurred
- When the customer first raised concerns (informally)
- When it escalated (formal escalation or exec involvement)
- Actions taken since escalation
---
## Root Cause
**Primary cause:** [One clear sentence. What specifically went wrong.]
**Contributing factors:**
- [Factor 1 — be honest about internal failures as well as external ones]
- [Factor 2]
**Is this a systemic issue or isolated?**
[ ] Isolated to this account
[ ] Pattern seen in other accounts — details: [_______]
[ ] Product or process gap that needs fixing
---
## Customer's Stated Position
**What the customer says happened:** [Their version of events — fair and unfiltered]
**What they are asking for:** [Their explicit ask — compensation, fix by date, exec call, SLA credit, exit clause]
**Sentiment of escalating contact:** [Frustrated but constructive / Angry / Seeking exit / Unknown]
**Risk of public escalation:** Low / Medium / High — [evidence if Medium or High]
---
## Business Impact
| Impact type | Detail |
|---|---|
| ARR at risk | £/$/€[X] |
| Potential churn probability | [X]% |
| Reputational risk | Low / Medium / High |
| Reference / case study status | [Was a reference — now at risk / Not a reference] |
| Expansion pipeline at risk | £/$/€[X] |
---
## What Has Been Done So Far
1. [Action taken — by whom — date — outcome]
2. [Action taken — by whom — date — outcome]
3. [Action taken — by whom — date — outcome]
**Has a formal apology or acknowledgement been issued?** Yes / No
---
## Proposed Resolution Plan
**Immediate actions (next 24-48 hours):**
| Action | Owner | By when |
|---|---|---|
| [Action] | [Name] | [Date] |
| [Action] | [Name] | [Date] |
**Medium-term actions (next 2-4 weeks):**
| Action | Owner | By when |
|---|---|---|
| [Action] | [Name] | [Date] |
**What we are NOT offering:** [Be explicit about what is not on the table — avoids misaligned expectations]
**Success criteria:** [How will we know the escalation is resolved? What does the customer need to confirm they are satisfied?]
---
## Decision Required from Escalation Owner
[State clearly what decision or resource the escalation owner needs to provide. Be specific — do not make them ask. E.g.: "We need approval to offer a 20% service credit for Q2" or "We need an exec call with [name] within 48 hours."]
---
## Communication Plan
| Audience | Message | Channel | Owner | By when |
|---|---|---|---|---|
| Escalating customer contact | [Summary of message] | Email / Call | [Name] | [Date] |
| Customer exec sponsor | [Summary] | Call | [Name] | [Date] |
| Internal CS team | [Summary] | Slack / Meeting | CS Manager | [Date] |
---
## Quality Checks
- [ ] Root cause is specific — not "communication breakdown" or "product gap" without detail
- [ ] Customer's position is stated fairly — not minimised or dismissed
- [ ] A clear decision is requested from the escalation owner — brief does not end with "what do you think?"
- [ ] ARR at risk is quantified
- [ ] Communication plan has owners and dates — not "TBD"
- [ ] Language is professional and blameless toward individuals
## Anti-Patterns
- [ ] Do not assign blame to individuals — focus on system failures and process gaps
- [ ] Do not downplay ARR at risk or describe churn risk vaguely without a number
- [ ] Do not leave resolution plan ownership as "TBD" or unassigned
- [ ] Do not write the brief without a clear ask from the escalation owner
- [ ] Do not omit the customer's own stated position — their perspective must be represented fairly
@@ -0,0 +1,149 @@
---
name: cs-health-scorecard
description: "Build a customer health scorecard for a specific account. Use when asked to score account health, assess renewal risk, build a health dashboard, or evaluate an account's likelihood to renew or expand. Produces a structured health scorecard with a RAG status, dimension scores, key risks, and recommended actions."
---
# Customer Health Scorecard Skill
Produce a structured, data-driven health scorecard for a customer account — giving the CSM and leadership a clear view of renewal risk, expansion potential, and the actions needed to move the account in the right direction.
## Required Inputs
Ask for these if not already provided:
- **Account name** and tier (enterprise / mid-market / SMB)
- **Contract value** (ARR) and **renewal date**
- **Product usage data** — logins, DAU/MAU ratio, key feature adoption
- **Support data** — open tickets, CSAT or NPS score, recent escalations
- **Engagement data** — last QBR date, executive sponsor status, champion name
- **Commercial data** — payment history, expansion conversations, seats used vs. licensed
- **Any known risks or recent changes** at the account
## Scoring Framework
Score each dimension 1-5. Weight as shown. Calculate weighted total out of 100.
| Dimension | Weight | What to Score |
|---|---|---|
| **Product Adoption** | 30% | DAU/MAU ratio, breadth of features used, power users identified |
| **Engagement** | 20% | QBR cadence, executive sponsor active, champion strength |
| **Outcomes** | 20% | Customer hitting their stated goals / success metrics |
| **Support Health** | 15% | Ticket volume trend, unresolved escalations, CSAT |
| **Commercial** | 15% | On-time payments, seats utilised, expansion signals |
**Score → RAG conversion:**
- 80-100: Green (healthy, renew likely)
- 60-79: Amber (at risk, needs attention)
- 0-59: Red (high churn risk, escalate)
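The arithmetic behind the framework can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function names are hypothetical, and it assumes each 1-5 score is scaled linearly so that a dimension contributes score ÷ 5 of its weight. Totals that fall between bands (for example 79.5) are treated as belonging to the lower band.

```python
# Weights from the Scoring Framework table, in percentage points.
WEIGHTS = {
    "Product Adoption": 30,
    "Engagement": 20,
    "Outcomes": 20,
    "Support Health": 15,
    "Commercial": 15,
}


def weighted_total(scores):
    """Return the 0-100 health score for a dict of 1-5 dimension scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five dimensions")
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name} score must be between 1 and 5")
    # Assumed scaling: a score s contributes s / 5 of the dimension's weight.
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) / 5


def rag_status(total):
    """Convert a 0-100 total into the RAG bands listed above."""
    if total >= 80:
        return "Green"
    if total >= 60:
        return "Amber"
    return "Red"


# Hypothetical account used only to show the calculation.
example = {
    "Product Adoption": 4,
    "Engagement": 3,
    "Outcomes": 4,
    "Support Health": 5,
    "Commercial": 2,
}
total = weighted_total(example)
print(f"{total:.0f}/100 -> {rag_status(total)}")
```

Under this scaling the lowest possible total is 20 (a 1 in every dimension), which still falls in the Red band.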
## Output Format
---
# Customer Health Scorecard: [Account Name]
**CSM:** [Name] | **Tier:** [Enterprise / Mid-Market / SMB]
**ARR:** £/$/€[X] | **Renewal date:** [Date] | **Days to renewal:** [N]
**Overall health:** [Green / Amber / Red] — [Score]/100
**Last updated:** [Date]
---
## Health Score Summary
| Dimension | Score (1-5) | Weight | Weighted Score | Trend |
|---|---|---|---|---|
| Product Adoption | [1-5] | 30% | [X] | ↑ / → / ↓ |
| Engagement | [1-5] | 20% | [X] | ↑ / → / ↓ |
| Outcomes | [1-5] | 20% | [X] | ↑ / → / ↓ |
| Support Health | [1-5] | 15% | [X] | ↑ / → / ↓ |
| Commercial | [1-5] | 15% | [X] | ↑ / → / ↓ |
| **Total** | — | 100% | **[X]/100** | |
---
## Dimension Detail
### Product Adoption — [Score]/5
- **DAU/MAU ratio:** [X]% (benchmark: >25% = healthy)
- **Key features adopted:** [List features in use]
- **Features not adopted:** [List unused high-value features]
- **Power users identified:** [Yes / No — how many]
- **Assessment:** [1-2 sentences on adoption health]
### Engagement — [Score]/5
- **Last QBR:** [Date] — [Outcome summary]
- **Next QBR:** [Scheduled / Overdue]
- **Executive sponsor:** [Active / Passive / Vacant]
- **Champion:** [Name, role, strength: strong / moderate / weak]
- **Assessment:** [1-2 sentences]
### Outcomes — [Score]/5
- **Customer's stated goals:** [List 2-3 goals from onboarding or last QBR]
- **Progress against goals:** [On track / Partial / Off track]
- **Evidence of value:** [Metric or quote that demonstrates ROI]
- **Assessment:** [1-2 sentences]
### Support Health — [Score]/5
- **Open tickets:** [N] (priority breakdown: P1: X, P2: X, P3: X)
- **CSAT / NPS:** [Score] (benchmark: >8 CSAT / >30 NPS = healthy)
- **Unresolved escalations:** [Yes / No — details if yes]
- **Ticket trend (last 90 days):** Increasing / Stable / Decreasing
- **Assessment:** [1-2 sentences]
### Commercial — [Score]/5
- **Seats licensed:** [N] | **Seats active:** [N] ([X]% utilisation)
- **Payment history:** [On time / Late — details]
- **Expansion signals:** [Yes — describe / No]
- **Downgrade or cancellation signals:** [Yes — describe / No]
- **Assessment:** [1-2 sentences]
---
## Top Risks
| Risk | Severity | Mitigation |
|---|---|---|
| [Risk description] | High / Medium / Low | [Specific action to mitigate] |
---
## Recommended Actions
**Immediate (this week):**
1. [Action — owner — deadline]
**This month:**
1. [Action — owner — deadline]
**Before renewal:**
1. [Action — owner — deadline]
---
## Renewal Forecast
| Scenario | Probability | ARR at risk |
|---|---|---|
| Full renewal at current ARR | [X]% | £/$/€0 |
| Renewal with contraction | [X]% | £/$/€[X] |
| Churn | [X]% | £/$/€[full ARR] |
**Recommended renewal play:** [Expand / Hold / Save / Manage out]
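One way to sanity-check the Renewal Forecast table is to confirm that the scenario probabilities sum to 100% and to roll them up into a single probability-weighted ARR-at-risk figure. The sketch below is illustrative only: the function name and the sample figures are hypothetical, and the roll-up is an optional cross-check rather than a required part of the scorecard.

```python
def expected_arr_at_risk(scenarios):
    """Probability-weighted ARR at risk.

    scenarios: list of (name, probability_percent, arr_at_risk) tuples,
    one per row of the Renewal Forecast table.
    """
    total_probability = sum(probability for _, probability, _ in scenarios)
    if total_probability != 100:
        raise ValueError(
            f"scenario probabilities sum to {total_probability}%, expected 100%"
        )
    return sum(probability * arr for _, probability, arr in scenarios) / 100


# Hypothetical account with 120,000 ARR; all figures are made up.
forecast = [
    ("Full renewal at current ARR", 60, 0),
    ("Renewal with contraction", 25, 30_000),
    ("Churn", 15, 120_000),
]
print(expected_arr_at_risk(forecast))
```

The check that probabilities sum to 100% catches the most common slip in a three-scenario forecast, where each row is estimated in isolation.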
---
## Quality Checks
- [ ] Score is based on data, not gut feel — each dimension has evidence
- [ ] Risks are specific (not "low engagement" — something like "executive sponsor left in March, no replacement identified")
- [ ] Actions have owners and deadlines
- [ ] Renewal probability is calibrated against pipeline reality
- [ ] Trend arrows reflect direction of change vs. last scorecard, not just current state
## Anti-Patterns
- [ ] Do not score health dimensions on gut feel — every score needs specific supporting evidence
- [ ] Do not give a Green status to accounts with unresolved P1 issues or missed milestones
- [ ] Do not list risks vaguely — "low engagement" without specifics is not actionable
- [ ] Do not leave recommended actions without named owners and deadlines
- [ ] Do not conflate product usage frequency with product value delivery
@@ -0,0 +1,200 @@
---
name: customer-success-plan
description: "Build a joint customer success plan for a specific account. Use when asked to create a success plan, joint success plan, mutual action plan, or customer onboarding plan. Produces a structured success plan with business goals, milestones, success metrics, ownership, and a 90-180 day roadmap."
---
# Customer Success Plan Skill
This skill produces a joint customer success plan — a living document shared between the CSM and the customer that aligns on outcomes, milestones, and mutual commitments. Output is ready to co-author with the customer in a kickoff call or QBR.
## Required Inputs
Ask the user for these if not provided:
- **Account name** and industry
- **Product / plan purchased**
- **Key stakeholders** — customer champion and economic buyer
- **Customer's stated business goals** — why did they buy? What problem are they solving?
- **Contract term and renewal date**
- **Current onboarding stage** (new customer / expanding / post-QBR / pre-renewal)
- **Seats / licenses / usage purchased**
- **Any known risks** — adoption gaps, champion uncertainty, competing priorities
## Output Structure
---
# Customer Success Plan: [Account Name]
**Product:** [Product name / plan tier]
**Contract term:** [Start date → Renewal date]
**CSM:** [Name]
**Customer champion:** [Name, Title]
**Customer executive sponsor:** [Name, Title — if known]
**Last updated:** [Date]
**Status:** [Active / Under review / Completed]
---
## 1. Partnership Objectives
> *What does success look like for [Account Name] at contract end?*
[Write 2-3 sentences describing the customer's core objective in plain English: what they are trying to achieve in their business, not what features they are using.]
**Primary business goal:** [e.g. Reduce time-to-hire by 30% across engineering teams]
**Secondary goal:** [e.g. Consolidate three legacy tools into one platform, saving £X/year]
**Success statement (customer's words):** "[Direct quote from champion about what success looks like — ask for this in kickoff]"
---
## 2. Success Metrics
Define how both parties will measure success. Agreed in the kickoff call and tracked in QBRs.
| Metric | Baseline (today) | Target | By when | Data source |
|---|---|---|---|---|
| [e.g. Seat utilisation] | [X%] | [≥ 80%] | [Month 3] | [Product analytics] |
| [e.g. Time to hire] | [X days] | [< Y days] | [Month 6] | [Customer's ATS] |
| [e.g. Reports produced/month] | [X] | [≥ Y] | [Month 3] | [Product analytics] |
| [e.g. NPS] | [X] | [≥ 30] | [Month 6] | [Quarterly survey] |
**Leading indicators** (early signs the plan is on track):
- [e.g. 5+ users log in within the first 2 weeks]
- [e.g. First workflow automated within 30 days]
- [e.g. Champion presents the tool to their team by end of Month 1]
---
## 3. Milestone Roadmap
Break the success journey into phases with clear milestones and owners:
### Phase 1: Onboard (Month 1)
| Milestone | Owner | Due date | Status |
|---|---|---|---|
| Admin setup complete (SSO, permissions, data integration) | [IT contact] | [Date] | [ ] |
| All purchased seats activated and users invited | [Champion] | [Date] | [ ] |
| Core workflow [X] configured and tested | [CSM + Champion] | [Date] | [ ] |
| First training session delivered (all teams) | [CSM] | [Date] | [ ] |
| Kickoff call completed and success plan co-signed | [CSM + Champion] | [Date] | [ ] |
### Phase 2: Adopt (Months 2-3)
| Milestone | Owner | Due date | Status |
|---|---|---|---|
| [Core feature] in active daily use by ≥ X users | [Champion] | [Date] | [ ] |
| First business outcome achieved and documented | [Champion + CSM] | [Date] | [ ] |
| 30-day check-in completed | [CSM] | [Date] | [ ] |
| [Power user workflow] enabled for advanced users | [CSM] | [Date] | [ ] |
### Phase 3: Value (Months 4-6)
| Milestone | Owner | Due date | Status |
|---|---|---|---|
| QBR 1 delivered — ROI evidence presented | [CSM + AE] | [Date] | [ ] |
| Success metric [X] hit target | [Champion] | [Date] | [ ] |
| Expansion use case identified and introduced | [AE] | [Date] | [ ] |
| Reference call or case study agreed | [Champion] | [Date] | [ ] |
### Phase 4: Renew & Expand (Months 7-12)
| Milestone | Owner | Due date | Status |
|---|---|---|---|
| QBR 2 delivered — renewal conversation started | [CSM + AE] | [Date] | [ ] |
| Renewal proposal sent | [AE] | [Date] | [ ] |
| Expansion or flat renewal signed | [AE] | [Date] | [ ] |
---
## 4. Mutual Commitments
Success plans work when both parties commit. Document what each side will do:
**[Vendor] commits to:**
- Dedicated CSM available [X days/week / by email within 24 hours]
- Monthly [call / check-in / async update] with champion
- QBR every [90 days] with executive summary and ROI report
- Priority support for [Account] — response SLA of [X hours] for P1 issues
- Roadmap preview for relevant upcoming features
- [Any other specific commitment made in sales cycle]
**[Account Name] commits to:**
- Champion available for [30-min monthly] check-in
- Users complete onboarding training by [date]
- Feedback on product experience shared monthly (async or sync)
- Executive sponsor participates in QBR 1 and renewal discussion
- Provide outcome data to CSM quarterly for ROI tracking
---
## 5. Stakeholder Engagement Plan
| Stakeholder | Role | Engagement frequency | Format | Owner |
|---|---|---|---|---|
| [Champion] | Day-to-day owner | Weekly (async) + Monthly (call) | Slack / Email + Zoom | CSM |
| [Economic buyer] | Budget holder | Quarterly | QBR (in-person or video) | CSM + AE |
| [IT contact] | Integration owner | As needed | Email | CSM |
| [End users] | Active users | Training only | Group session | CSM |
---
## 6. Risk & Mitigation
| Risk | Likelihood | Impact | Mitigation plan |
|---|---|---|---|
| Low adoption in first 30 days | [M] | [H] | CSM hosts live onboarding; champion sends internal comms day 1 |
| Champion changes role | [L] | [H] | Multi-thread: introduce CSM to 2 additional stakeholders by Month 2 |
| Budget pressure at renewal | [M] | [H] | Build ROI case monthly; document value continuously |
| Competing priorities delay rollout | [H] | [M] | Agree minimum viable adoption path with champion; don't require perfection to declare value |
---
## 7. Communication Plan
| Communication | Audience | Frequency | Format | Owner |
|---|---|---|---|---|
| Health update | Champion | Monthly | Email summary (3 bullets: what's good, what needs attention, one ask) | CSM |
| QBR | Champion + Exec | Quarterly | 45-min video call with slide deck | CSM + AE |
| Product updates | Champion | As released | Release notes email | CSM |
| Support status | Champion | When open tickets exist | Email / Slack | Support + CSM |
---
## 8. Escalation Path
If the success plan falls off track:
| Trigger | Action | Owner | Timeline |
|---|---|---|---|
| Health drops to Amber | Internal review + champion call within 5 days | CSM | Immediate |
| Health drops to Red | CS leadership + AE looped in; escalation brief drafted | CS Manager | Within 24 hours |
| Champion is unresponsive for >10 days | AE attempts exec sponsor contact | AE | After CSM attempt fails |
| Adoption <40% at Month 3 | Emergency enablement session + revised milestone plan | CSM | Within 1 week of flag |
---
## Quality Checks
- [ ] Success metrics are the customer's metrics — not just product usage metrics
- [ ] Milestones have specific owners and due dates — not "TBD"
- [ ] Mutual commitments section is genuinely mutual — not just what the vendor will do
- [ ] Risk register includes champion departure and low adoption
- [ ] Plan is written to be shared with the customer — no internal-only commentary in this document
- [ ] Executive sponsor is identified and has an engagement role
## Anti-Patterns
- [ ] Do not define success metrics that the vendor controls — metrics must reflect the customer's business outcomes
- [ ] Do not set milestone dates without customer confirmation — unilateral timelines undermine joint ownership
- [ ] Do not create a plan the customer hasn't agreed to — it must be mutual, not a CSM's internal plan
- [ ] Do not leave ownership fields blank or assigned to "CS team" — every action needs a named owner
- [ ] Do not confuse product adoption milestones with customer business outcomes — both are needed but are not the same
## Example Trigger Phrases
- "Build a success plan for [Account Name] who just signed"
- "Create a joint success plan for our new enterprise customer"
- "Write a 6-month customer success roadmap for [Company]"
- "I need a mutual action plan for our QBR with [Account]"
- "Generate a customer success plan for an at-risk account"
@@ -0,0 +1,226 @@
---
name: qbr-deck
description: "Build a Quarterly Business Review (QBR) deck structure and narrative for a customer account. Use when asked to prepare a QBR, business review meeting, executive review, or quarterly check-in with a customer. Produces a slide-by-slide QBR structure with talking points, metrics review, value narrative, and mutual next steps."
---
# QBR Deck Skill
Produce a complete Quarterly Business Review deck — structured, data-backed, and customer-focused. A good QBR demonstrates value delivered, aligns on goals for the next quarter, and strengthens the executive relationship. It should never feel like a product demo or a vendor update.
## Required Inputs
Ask for these if not already provided:
- **Account name**, CSM name, and customer stakeholders attending
- **Contract details** — ARR, contract start date, renewal date
- **Last quarter's goals** (from previous QBR or kickoff)
- **Usage and adoption data** — key metrics for the quarter
- **Support summary** — tickets raised, resolution time, any escalations
- **Business outcomes the customer cares about** — what success looks like for them
- **Product updates or new features** relevant to this customer
- **Goals for next quarter**
- **Any open commercial conversations** (expansion, renewal, at-risk signals)
## QBR Principles
- Lead with customer outcomes, not product features
- Every metric should connect to a business result the customer cares about
- The agenda is a conversation, not a presentation — build in time for customer input at every stage
- Close with mutual commitments, not just vendor actions
## Output Format
---
# QBR: [Account Name] × [Your Company]
**[Quarter] [Year] Business Review**
**Date:** [Date] | **Location / Call link:** [TBC]
**Customer attendees:** [Names and roles]
**[Your company] attendees:** [Names and roles]
---
## Slide 1: Agenda (5 min)
| Time | Topic | Owner |
|---|---|---|
| 0:00 | Welcome and introductions | CSM |
| 0:05 | [Last quarter] — how did we do? | CSM + Customer |
| 0:20 | Value delivered — business impact | CSM |
| 0:35 | What's coming — roadmap preview | CSM / Product |
| 0:45 | [Next quarter] — goals and priorities | Customer |
| 0:55 | Actions and mutual commitments | CSM |
| 1:00 | Close | |
*Talking point: "We've kept today to 60 minutes. We want as much of this to be a conversation as possible — please push back, redirect, and ask questions throughout."*
---
## Slide 2: Where We Are Together (2 min)
**Partnership snapshot:**
- **Customer since:** [Date]
- **Contract value:** £/$/€[ARR]/year
- **Renewal date:** [Date]
- **Active users:** [N] of [N] licensed seats ([X]% adoption)
- **Products / modules active:** [List]
*Talking point: "Before we dive in — a quick picture of where we are. [X] months in, [Y] active users, and this is our [Nth] QBR together."*
---
## Slide 3: Last Quarter — Goals We Set Together (5 min)
| Goal | Set in [Last QBR / Kickoff] | Status |
|---|---|---|
| [Goal 1] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
| [Goal 2] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
| [Goal 3] | [What we committed to] | ✅ Achieved / ⚠️ Partial / ❌ Missed |
For any partial or missed goal: state what happened and what changes next quarter.
*Talking point: "Let's start with accountability. Here's what we said we'd achieve last quarter — let's be honest about where we landed."*
---
## Slide 4: Usage and Adoption (5 min)
**Quarter-over-quarter trend:**
| Metric | [Q-1] | [Q] | Change |
|---|---|---|---|
| Monthly active users | [N] | [N] | +/-X% |
| Sessions per user per week | [N] | [N] | +/-X% |
| [Key feature 1] adoption | [X]% | [X]% | +/-X% |
| [Key feature 2] adoption | [X]% | [X]% | +/-X% |
**Highlights:**
- [Positive adoption trend to call out]
- [Feature or workflow with strongest engagement]
**Opportunity:**
- [Feature with low adoption that could drive more value — link to their goals]
*Talking point: "Usage is [up / stable / something we want to talk about]. The area I'd like to focus on is [feature] — we're not seeing the adoption we'd expect given [their goal], and I want to understand why."*
---
## Slide 5: Business Impact — Value Delivered (10 min)
Lead with outcomes, not activity.
**[Outcome 1: customer's primary success metric]**
- Before: [baseline]
- Now: [current state]
- Impact: [quantified business result — time saved, revenue influenced, cost reduced, risk mitigated]
**[Outcome 2]**
- [Same structure]
**[Outcome 3]**
- [Same structure]
**Customer evidence** (use if available):
> "[Quote from champion or user about value experienced]"
*Talking point: "This is the section I most want your input on. Are these the outcomes that matter to your business? Are there other ways you're measuring success that we should be tracking?"*
---
## Slide 6: Support Summary (3 min)
| Metric | This quarter | Last quarter | Trend |
|---|---|---|---|
| Tickets raised | [N] | [N] | ↑ / → / ↓ |
| Average resolution time | [X hrs] | [X hrs] | ↑ / → / ↓ |
| P1 / critical issues | [N] | [N] | ↑ / → / ↓ |
| CSAT score | [X/10] | [X/10] | ↑ / → / ↓ |
**Notable issues this quarter:**
- [Any escalation or major ticket — brief summary and resolution]
**What we're doing differently:**
- [Any process change or improvement based on support patterns]
---
## Slide 7: What's Coming — Roadmap Preview (5 min)
Focus only on what's relevant to this customer's goals. Do not dump the full roadmap.
| Feature / Improvement | Expected | Why it matters to [Account Name] |
|---|---|---|
| [Feature 1] | [Q+1] | [Direct link to their goal or pain point] |
| [Feature 2] | [Q+1 / Q+2] | [Direct link] |
| [Feature 3] | [H2] | [Direct link] |
*Talking point: "I've filtered the roadmap to what I think matters most to your team. I'd love your reaction — are these the right priorities from your perspective?"*
---
## Slide 8: Next Quarter — Your Goals (10 min)
**Customer input section — facilitate, don't present.**
Prompt questions:
- "What does success look like for your team in [next quarter]?"
- "What's the biggest challenge you're trying to solve in the next 90 days?"
- "Is there anything about the way you're using [product] you want to change?"
**Capture live:**
| Goal for next quarter | Owner (customer) | How we'll support it | How we'll measure it |
|---|---|---|---|
| [Goal 1] | [Name] | [CSM / product action] | [Metric] |
| [Goal 2] | [Name] | [CSM / product action] | [Metric] |
---
## Slide 9: Mutual Commitments (5 min)
**[Your company] commits to:**
1. [Specific action — owner — by when]
2. [Specific action — owner — by when]
3. [Specific action — owner — by when]
**[Account Name] commits to:**
1. [Specific action — owner — by when]
2. [Specific action — owner — by when]
**Next touchpoint:** [Date of next check-in or mid-quarter review]
---
## Slide 10: Thank You + Open Q&A (5 min)
- Recap the one headline from today: [The single most important thing you want them to remember]
- Confirm actions are captured and shared after the call
- Ask: "Is there anything we didn't cover today that you wanted to raise?"
---
## Preparation Checklist
- [ ] Usage data pulled and QoQ comparison calculated
- [ ] Last QBR goals reviewed — status confirmed before the meeting
- [ ] Business outcomes framed in customer language (not product language)
- [ ] Roadmap filtered to this account's specific use cases
- [ ] Customer's goals for next quarter researched or pre-confirmed with champion
- [ ] Executive sponsor briefed on any sensitive topics before the call
- [ ] Actions from previous QBR reviewed — any outstanding items addressed
## Quality Checks
- [ ] Every slide has a talking point, not just a title
- [ ] Value slide leads with business outcomes, not product activity
- [ ] Roadmap preview links each item to a customer goal
- [ ] Mutual commitments section has real owners on both sides
- [ ] Customer has at least 20 minutes of airtime in the agenda
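The 20-minute airtime check can be verified against the Slide 1 agenda with a short calculation. The sketch below is an assumption-laden illustration: the function name is hypothetical, topic names are abbreviated, and any segment whose owner includes "Customer" is counted in full as customer airtime, which is a generous reading for the shared "CSM + Customer" slot.

```python
# (start_minute, topic, owner) rows taken from the Slide 1 agenda table.
AGENDA = [
    (0, "Welcome and introductions", "CSM"),
    (5, "Last quarter: how did we do?", "CSM + Customer"),
    (20, "Value delivered", "CSM"),
    (35, "Roadmap preview", "CSM / Product"),
    (45, "Next quarter goals and priorities", "Customer"),
    (55, "Actions and mutual commitments", "CSM"),
]
END_MINUTE = 60  # the "Close" row


def customer_airtime(agenda, end_minute):
    """Total minutes of segments that the customer owns or co-owns."""
    starts = [start for start, _, _ in agenda] + [end_minute]
    return sum(
        starts[index + 1] - start
        for index, (start, _, owner) in enumerate(agenda)
        if "Customer" in owner
    )


minutes = customer_airtime(AGENDA, END_MINUTE)
print(f"Customer airtime: {minutes} min, meets 20-minute check: {minutes >= 20}")
```

If shared segments were counted at half their length instead, the same agenda would fall below 20 minutes, so the counting rule is worth agreeing before relying on the check.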
## Anti-Patterns
- [ ] Do not fill the QBR with product activity metrics — lead with business outcomes the customer cares about
- [ ] Do not present a roadmap without linking each item to a customer goal — vendor priorities are not a QBR agenda
- [ ] Do not run a QBR as a one-sided presentation — it must include structured time for the customer to speak
- [ ] Do not close a QBR without documented mutual commitments with named owners on both sides
- [ ] Do not skip the "what's not working" slide — suppressing problems erodes trust and misses renewal risks
@@ -0,0 +1,198 @@
---
name: renewal-playbook
description: "Build a structured renewal playbook for a customer account. Use when asked to plan a renewal, structure a renewal negotiation, prepare for an expansion conversation, or build a renewal strategy for at-risk or healthy accounts. Produces a renewal brief with health assessment, negotiation strategy, objection responses, expansion levers, and a timeline."
---
# Renewal Playbook Skill
This skill produces a complete renewal playbook for a specific customer account, covering health assessment, commercial strategy, negotiation preparation, expansion opportunity mapping, and a step-by-step timeline. Output is ready for the CSM or account team to execute 90-180 days before renewal.
## Required Inputs
Ask the user for these if not provided:
- **Account name**
- **Renewal date**
- **Current ARR** and proposed renewal ARR (if different)
- **Account health** — RAG status and main reasons (or describe the account situation)
- **Key stakeholders** — economic buyer, champion, and any detractors
- **Renewal risk factors** — budget pressure, low adoption, competitive threat, champion departure, etc.
- **Expansion opportunity** — any upsell or cross-sell potential?
- **Contract terms** — current plan, duration, and any terms up for renegotiation
## Output Structure
---
# Renewal Playbook: [Account Name]
**Renewal date:** [Date]
**Current ARR:** [£/$/€ X]
**Target renewal ARR:** [£/$/€ X — flat / +X% expansion / contraction risk]
**Health status:** [Green / Amber / Red]
**CSM:** [Name]
**Account executive:** [Name]
**Days to renewal:** [X days]
---
## 1. Account Health Snapshot
| Dimension | Score (1-5) | Evidence |
|---|---|---|
| **Product adoption** | [X/5] | [e.g. 3 of 5 purchased seats active; core feature used weekly] |
| **Business outcomes** | [X/5] | [e.g. Customer reports X% improvement in [metric]; no formal ROI review done] |
| **Relationship depth** | [X/5] | [e.g. Strong champion in [name/role]; limited exec sponsorship] |
| **Support & satisfaction** | [X/5] | [e.g. 2 open P2 tickets; last NPS 7; no escalations in 6 months] |
| **Commercial engagement** | [X/5] | [e.g. Invoice paid on time; no discount pressure raised yet] |
| **Overall health** | [X/5 — weighted] | [Green / Amber / Red] |
**Renewal thesis:** [One sentence: why this account will renew — or what must change for it to renew.]
---
## 2. Stakeholder Map
| Stakeholder | Role | Influence | Sentiment | Our relationship |
|---|---|---|---|---|
| [Name] | Economic buyer | High | [Positive / Neutral / Negative] | [Warm / Cold / Unknown] |
| [Name] | Champion | High | [Positive] | [Warm] |
| [Name] | End user | Low | [Neutral] | [Limited] |
| [Name] | IT / procurement | Medium | [Neutral] | [Transactional] |
**Champion risk:** [Is our champion secure in their role? Any signals of departure or reorganisation?]
**Multi-thread plan:** [Who else do we need relationships with before renewal? How do we get there?]
---
## 3. Risk Register
| Risk | Likelihood (H/M/L) | Impact (H/M/L) | Mitigation |
|---|---|---|---|
| [Budget pressure / cost-cutting] | [H] | [H] | [Build ROI case 90 days out; identify budget holder's priorities] |
| [Low adoption in [department]] | [M] | [H] | [Run targeted enablement session; tie to champion's OKRs] |
| [Competitor evaluation] | [M] | [M] | [Request competitive intelligence; schedule exec-level call] |
| [Champion departure] | [L] | [H] | [Map two additional stakeholders; executive intro call] |
---
## 4. Value Story
Build the ROI narrative for the renewal conversation:
**Headline result:** [e.g. "[Account] saved X hours/week or reduced [metric] by X% using [product]"]
**Evidence sources:**
- [ ] Product usage data (logins, features used, seat utilisation)
- [ ] Business metric improvement (pull from QBR deck or success plan)
- [ ] Support resolution time improvement
- [ ] Customer-provided testimonial or case study quotes
**Value gaps to close before renewal:** [Are there outcomes the customer expected but hasn't seen yet? What's the plan to close these?]
---
## 5. Expansion Opportunity
Map upside beyond flat renewal:
| Opportunity | Type | Estimated value | Likelihood | Timing |
|---|---|---|---|---|
| [Seat expansion — [dept] wants to add 10 users] | Upsell | [+£X ARR] | [High] | [Renewal or +3M] |
| [Cross-sell — [Product B] use case identified] | Cross-sell | [+£X ARR] | [Medium] | [+6M] |
| [Multi-year commitment] | Discount for term | [+£X TCV / -X% discount] | [Low] | [At renewal] |
**Expansion play:** [Which opportunity to lead with, and the sequence for raising it in the renewal conversation]
---
## 6. Commercial Strategy
**Renewal scenario planning:**
| Scenario | Probability | ARR outcome | Response strategy |
|---|---|---|---|
| **Flat renewal** | [X%] | [£X — same as current] | [Accept; plant seeds for +6M expansion] |
| **Expansion** | [X%] | [£X] | [Lead with ROI evidence; pitch seat or feature expansion] |
| **Contraction risk** | [X%] | [£X — downgrade to lower tier] | [Propose phased commitment; demonstrate path to full adoption] |
| **Churn risk** | [X%] | [£0] | [Escalate to leadership; executive sponsor engagement] |
**Discount guardrails:**
- Floor discount: [X% — do not go below without VP approval]
- Triggers for discount: [Multi-year / volume / reference customer commitment]
- What to ask for in return: [Reference call / G2 review / executive intro / case study participation]
**Pricing flexibility:**
- [e.g. Can offer monthly billing in exchange for 24-month commit]
- [e.g. Can offer X seats free in exchange for expansion commitment]
---
## 7. Objection Responses
Prepare for the most likely objections:
**"The price is too high"**
> Anchor on value delivered: "[Customer] achieved [X outcome] — at [£X ARR], that's [£Y per outcome / hour saved / user]. What would it cost to deliver that outcome without us?"
> If budget is genuinely constrained, explore: phased payment, reduction in scope rather than full churn, multi-year pricing.
**"We're not seeing enough adoption"**
> Acknowledge, then commit: "You're right — [X seats] are actively using [core feature] out of [Y]. We want to fix this. Here's our 60-day plan: [exec sponsor on enablement call / training session / in-product nudge campaign]."
**"We're evaluating [Competitor]"**
> Don't panic. Ask: "What's driving the evaluation — is it specific features, pricing, or something else?" Then map gaps honestly. Offer a feature roadmap preview if relevant. Get clarity on their criteria and timeline before responding defensively.
**"We need to reduce spend this quarter"**
> Separate the commercial conversation from the value conversation. Offer to protect the relationship by reducing scope today, paired with a committed expansion trigger at a business milestone. Avoid discounting without a reason.
---
## 8. Renewal Timeline
| Week | Action | Owner | Notes |
|---|---|---|---|
| **W16** (4 months out) | Internal renewal review — health, expansion opportunity, risk | CSM | Flag to leadership if Red |
| **W12** | QBR / executive business review: ROI evidence delivered | CSM + AE | Book 45-60 min with economic buyer |
| **W10** | Champion 1:1 — pulse check on satisfaction and upcoming priorities | CSM | Uncover internal dynamics before commercial discussion |
| **W8** | Expansion conversation — plant seeds, share roadmap | AE | Do not lead with pricing |
| **W6** | Send renewal proposal — pricing, terms, options | AE | Include multi-year option |
| **W4** | Negotiation — address objections, finalise commercial terms | AE + CSM | Escalate to VP if >X% discount required |
| **W2** | Legal / procurement — contract redlines, signature process | AE + Legal | |
| **W0** | Signed. Handoff to post-renewal success plan | CSM | Thank the champion; begin next cycle |
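The week labels in the timeline count back from the renewal date, so they can be turned into calendar dates mechanically. The following is a minimal sketch under that assumption; the function name and the sample renewal date are hypothetical.

```python
from datetime import date, timedelta

# Weeks before renewal, matching the W16 ... W0 rows of the timeline.
WEEKS_OUT = [16, 12, 10, 8, 6, 4, 2, 0]


def timeline_dates(renewal_date):
    """Map each timeline label (e.g. "W12") to its calendar date."""
    return {
        f"W{weeks}": renewal_date - timedelta(weeks=weeks)
        for weeks in WEEKS_OUT
    }


# Hypothetical renewal date, used only to show the output shape.
schedule = timeline_dates(date(2026, 12, 1))
for label, due in schedule.items():
    print(label, due.isoformat())
```

Starting at W16 puts the first internal review 112 days before renewal, which satisfies the quality check that the timeline begins at least 90 days out.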
---
## 9. Success Criteria
- [ ] Renewal signed before deadline
- [ ] ARR outcome within target range
- [ ] Champion relationship maintained or improved
- [ ] At least one expansion conversation started
- [ ] ROI evidence documented and accepted by customer
---
## Quality Checks
- [ ] Stakeholder map includes the economic buyer — not just the champion
- [ ] Risk register has a mitigation for every H/H risk
- [ ] Value story uses product data and business outcomes, not just feature lists
- [ ] Commercial strategy includes a floor discount and a reason-to-discount framework
- [ ] Timeline starts at least 90 days before renewal date
- [ ] Objection responses are specific to this account, not generic
## Anti-Patterns
- [ ] Do not start renewal conversations less than 90 days before the renewal date for accounts over $50K ARR
- [ ] Do not build a renewal strategy without first honestly assessing account health — wishful thinking leads to last-minute churn
- [ ] Do not treat all renewal objections as negotiating tactics — some objections signal genuine dissatisfaction that requires resolution first
- [ ] Do not offer discounts as the first response to price objections — explore value gaps before reducing price
- [ ] Do not close the renewal without confirming the expansion opportunity — every renewal is also an expansion conversation
## Example Trigger Phrases
- "Build a renewal playbook for [Account Name] renewing in [Month]"
- "Help me plan the renewal strategy for an at-risk customer"
- "Prepare a renewal brief for my QBR with [Company]"
- "What's my renewal strategy for a Red account coming up in 60 days?"
- "Create a renewal and expansion plan for [Account]"
@@ -0,0 +1,102 @@
---
name: chart-data-extractor
description: "Extract pixel-level data from an image of a chart or graph and produce a structured data table. Use when asked to extract data from a chart image, transcribe numbers from a graph, digitise a chart, or turn a screenshot of data into a table. Produces a structured table with extracted values, confidence levels, and a reconstructed chart source. Best used with Claude Opus 4.7 or newer for reliable chart data extraction."
---
# Chart Data Extractor Skill
Extracts data from images of charts and graphs (bar charts, line charts, pie charts, scatter plots, and tables in images), producing a structured data table that can be used in spreadsheets or rebuilt in any charting tool. Built to leverage Opus 4.7's pixel-level image analysis capabilities.
## Required Inputs
Ask the user for these if not provided:
- **The chart image** (upload a screenshot or image file)
- **Chart type** (if ambiguous — bar / line / pie / scatter / other)
- **What matters most** (approximate trends / precise values / specific data points / categorisation)
- **Known axis values** (optional — if the user knows the max/min values to anchor the extraction)
## Output Structure
### 1. Chart Identification
| Attribute | Value |
|---|---|
| Chart type | [Bar / Line / Pie / Scatter / Area / Other] |
| Chart title (if visible) | [Title text] |
| X-axis label | [Label + unit] |
| Y-axis label | [Label + unit] |
| Number of series | N |
| Legend categories | [List] |
| Data period (if time-based) | [Start — End] |
### 2. Extracted Data Table
| [X axis] | [Series 1] | [Series 2] | ... |
|---|---|---|---|
| [Value] | [Value] | [Value] | |
### 3. Confidence Levels
For each data point or series, flag confidence:
- **High confidence:** data points where the value is clearly readable against gridlines or labels
- **Medium confidence:** data points where the value is interpolated between gridlines
- **Low confidence:** data points where the value is ambiguous or overlaps with other elements
Low-confidence points should be explicitly listed — not silently included in the main table.
### 4. Notable Observations
Observations that the data itself reveals:
- Peak value: [Value, when, in which series]
- Lowest value: [Value, when, in which series]
- Largest delta between series: [Details]
- Any anomalies or outliers visible in the chart
### 5. Reconstructed Source
CSV format for direct use:
```csv
[x_axis],[series_1],[series_2]
[value],[value],[value]
```
### 6. Assumptions and Caveats
- Grid resolution: [How precisely values could be read — e.g. "Y-axis has major gridlines every 10 units, minor every 2"]
- Interpolation used: [Any values that required estimating between gridlines]
- Unclear data: [Anything in the chart that could not be read reliably]
- Axis scale: [Linear/logarithmic/etc — note if not obvious]
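
The axis-scale caveat above can be made concrete with a minimal sketch of how a pixel position maps to a data value once two axis anchors are known. The function name `pixel_to_value` and the anchor coordinates are hypothetical and chosen for illustration only. The sketch assumes the anchors were read from labelled gridlines.

```python
import math

def pixel_to_value(pixel_y, anchor_a, anchor_b, scale="linear"):
    """Map a pixel row to a data value using two known axis anchors.

    Each anchor is a (pixel_y, axis_value) pair read from a labelled gridline.
    """
    (pa, va), (pb, vb) = anchor_a, anchor_b
    if pa == pb:
        raise ValueError("anchors must sit on different pixel rows")
    fraction = (pixel_y - pa) / (pb - pa)
    if scale == "linear":
        return va + fraction * (vb - va)
    if scale == "log":
        # Interpolate in log space; both anchor values must be positive.
        log_value = math.log10(va) + fraction * (math.log10(vb) - math.log10(va))
        return 10 ** log_value
    raise ValueError(f"unsupported scale: {scale}")

# Hypothetical anchors: gridline labelled 0 at pixel row 400, 100 at pixel row 100.
linear_value = pixel_to_value(250, (400, 0.0), (100, 100.0))
# Same pixel rows, but the axis runs from 1 to 100 on a logarithmic scale.
log_value = pixel_to_value(250, (400, 1.0), (100, 100.0), scale="log")
```

The same pixel row yields very different values under the two scales, which is why the axis scale should be stated rather than assumed.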
### 7. Follow-up Options
Ask the user which of these they want:
- Rebuild the chart in a specified format (Excel formula, Python matplotlib, D3, etc.)
- Produce a narrative description of what the chart shows
- Compare this data against another chart or source
- Flag potentially misleading visual choices in the original (truncated axes, misleading scales, etc.)
## Quality Checks
- [ ] Every extracted number specifies which series it belongs to
- [ ] Confidence levels are explicit for ambiguous points
- [ ] Low-confidence values are flagged separately, not silently included
- [ ] Assumptions about axis scale and interpolation are stated
- [ ] CSV output is clean and directly usable
## Anti-Patterns
- [ ] Do not silently include low-confidence data points in the main table — flag them separately so the user knows which values to verify
- [ ] Do not assume a linear scale without confirming it — logarithmic axes make extracted values incorrect by orders of magnitude if misread
- [ ] Do not report extracted values with false precision — if the chart's Y-axis only shows gridlines every 10 units, a reported value of 37 is invented, not extracted
- [ ] Do not omit the assumptions and caveats section — partial image quality, overlapping bars, or unlabelled axes must be disclosed
## Example Trigger Phrases
- "Extract the data from this chart"
- "Transcribe the numbers in this graph"
- "Turn this chart image into a spreadsheet"
- "Digitise this chart so I can rebuild it"
- "What are the exact values in this bar chart?"
## Why This Works Better on Opus 4.7
Earlier models struggled with pixel-level data transcription from charts, often hallucinating values or misreading gridline positions. Opus 4.7 uses a higher image resolution (2576px vs 1568px) with coordinates mapping 1:1 to pixels, making chart data extraction reliable for practical use.
@@ -0,0 +1,195 @@
---
name: cohort-analysis
description: "Structure a cohort analysis for retention, LTV, or behavioural patterns. Use when asked to run a cohort analysis, analyse retention by cohort, segment users by behaviour over time, or calculate lifetime value by acquisition period. Produces a complete cohort analysis framework with methodology, cohort definitions, retention curves, and prioritised interventions."
---
# Cohort Analysis Skill
This skill produces a structured cohort analysis covering retention curves, LTV estimation, behavioural segmentation, and actionable interventions. Output is ready to present to product leadership or share with growth and data teams.
## Required Inputs
Ask the user for these if not provided:
- **Analysis goal** (retention improvement / LTV modelling / behavioural segmentation / churn prediction)
- **Product or feature being analysed**
- **Cohort definition** — what groups users? (acquisition month, signup channel, plan tier, feature adoption)
- **Observation window** — how many periods to track? (e.g. 12 months, 8 weeks)
- **Key metric** — what are you measuring per cohort? (retention rate, revenue, engagement score, feature usage)
- **Available data** — what tables/metrics are available? (paste schema or describe)
- **Baseline** — any existing retention benchmarks or goals?
## Output Structure
---
# Cohort Analysis: [Product / Feature]
**Analysis type:** [Retention / LTV / Behavioural / Churn]
**Cohort definition:** [Acquisition month / Signup channel / Plan tier / Feature adoption date]
**Observation window:** [X months / weeks]
**Primary metric:** [Metric name]
**Date prepared:** [Date]
---
## 1. Cohort Definitions
| Cohort | Period | Size | Description |
|---|---|---|---|
| [Cohort 1] | [Jan 2025] | [N users] | [e.g. Users who signed up in Jan 2025 via organic] |
| [Cohort 2] | [Feb 2025] | [N users] | [...] |
**Cohort logic:**
- Cohort entry event: [First sign-up / First purchase / Feature activation]
- Cohort exit criteria: [Churned / Downgraded / No activity for 30 days]
- Exclusions: [Trial users / Internal test accounts / Users with < X days of data]
---
## 2. Retention Curve
**How to read:** Each cell shows what % of the cohort performed the key metric in period N.
| Cohort | Period 0 | Period 1 | Period 2 | Period 3 | Period 6 | Period 12 |
|---|---|---|---|---|---|---|
| Jan 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| Feb 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| [Trend] | — | [↑/↓ vs prior] | [...] | [...] | [...] | [...] |
**Retention plateau:** [At what period does retention flatten? What % does it flatten at?]
**Key observations:**
- [e.g. Period 1 → Period 2 drop is the largest — average X% churn in first 30 days]
- [e.g. Cohorts acquired via [channel] retain X% better at Period 6]
- [e.g. Retention has improved from X% → Y% at Period 3 comparing oldest to newest cohort]
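
As a rough illustration of how the retention table above can be computed, here is a minimal Python sketch. The input shapes (`signups` as a dict, `events` as tuples) and the function names are assumptions made for this example, not part of any existing schema. Unlike the template, where Period 0 is 100% by definition, this sketch counts Period 0 from key events observed in the signup month.

```python
from collections import defaultdict
from datetime import date

def month_index(d):
    """Number of months since year 0, used to take month differences."""
    return d.year * 12 + d.month

def retention_table(signups, events):
    """signups: {user_id: signup_date}; events: iterable of (user_id, event_date).

    Returns {cohort: {period: retention_pct}}, where cohort is (year, month).
    """
    cohort_users = defaultdict(set)
    for user_id, signup in signups.items():
        cohort_users[(signup.year, signup.month)].add(user_id)

    # (cohort, period) -> users who performed the key event in that period
    active = defaultdict(set)
    for user_id, event_date in events:
        signup = signups.get(user_id)
        if signup is None:
            continue  # event from a user outside the analysed cohorts
        period = month_index(event_date) - month_index(signup)
        if period >= 0:
            active[((signup.year, signup.month), period)].add(user_id)

    table = defaultdict(dict)
    for (cohort, period), users in active.items():
        # The denominator is the full cohort, not just the active users.
        table[cohort][period] = round(100.0 * len(users) / len(cohort_users[cohort]), 1)
    return {cohort: dict(sorted(periods.items())) for cohort, periods in table.items()}

# Illustrative data only.
signups = {"a": date(2025, 1, 5), "b": date(2025, 1, 20), "c": date(2025, 2, 3)}
events = [
    ("a", date(2025, 1, 6)),
    ("b", date(2025, 1, 21)),
    ("a", date(2025, 2, 10)),
    ("c", date(2025, 2, 4)),
]
table = retention_table(signups, events)
```

Each user belongs to exactly one cohort because the cohort key is derived from a single signup date.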
---
## 3. LTV Projection (if applicable)
**ARPU per period:** [£/$/€ X per active user per month]
**Retention curve used:** [Which cohort or blended average]
| Period | Retained % | Revenue per user | Cumulative LTV |
|---|---|---|---|
| Month 1 | [X%] | [£X] | [£X] |
| Month 3 | [X%] | [£X] | [£X] |
| Month 6 | [X%] | [£X] | [£X] |
| Month 12 | [X%] | [£X] | [£X] |
**Blended LTV:** [£X at 12 months — based on blended retention across cohorts]
**LTV by segment:**
| Segment | LTV (12M) | vs Baseline |
|---|---|---|
| [Organic] | [£X] | [+X%] |
| [Paid] | [£X] | [-X%] |
| [Enterprise] | [£X] | [+X%] |
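
One simple way to fill the Cumulative LTV column is to sum retained share times ARPU across periods. The sketch below assumes a flat ARPU and ignores discounting and gross margin. The function name and the sample retention figures are hypothetical.

```python
def cumulative_ltv(retention_by_month, arpu_per_month):
    """retention_by_month: fraction of the cohort still active in months 1..N.

    Returns the running LTV per acquired user after each month.
    """
    ltv = 0.0
    running = []
    for retained in retention_by_month:
        # Expected revenue this month per originally acquired user.
        ltv += retained * arpu_per_month
        running.append(round(ltv, 2))
    return running

# Illustrative figures: 100%, 60%, 50%, 45% retained; ARPU of 20 per month.
ltv_curve = cumulative_ltv([1.0, 0.6, 0.5, 0.45], 20.0)
```

Running the same function per segment with that segment's observed retention and ARPU gives the "LTV by segment" rows.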
---
## 4. Behavioural Segmentation
Group cohorts by behaviour patterns, not just acquisition date:
| Segment | Definition | Size | Retention (P6) | LTV (12M) |
|---|---|---|---|---|
| **Power users** | [Used core feature ≥ 3x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Casual users** | [Used 1-2x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Dormant** | [Logged in but did not use core feature] | [X%] | [X%] | [£X] |
| **Never activated** | [Signed up but never completed onboarding] | [X%] | [X%] | [£X] |
**Activation threshold insight:** [What action — taken within the first X days — most strongly predicts retention? This is the "aha moment" to optimise for.]
---
## 5. Leading Indicators of Churn
List the signals that appear **before** users churn, so teams can intervene:
| Signal | How early does it appear? | Churn correlation | Intervention |
|---|---|---|---|
| [No login for 7 days] | [7 days before churn] | [Strong] | [Re-engagement email sequence] |
| [Support ticket with escalation] | [14 days before churn] | [Moderate] | [CSM outreach within 48 hours] |
| [Feature usage dropped >50% WoW] | [10 days before churn] | [Strong] | [In-app nudge with use-case tutorial] |
---
## 6. Cohort Comparison: What's Changed Over Time
Compare oldest and newest cohorts to assess whether product improvements are showing up in retention:
| Metric | [Oldest cohort — e.g. Jan 2024] | [Newest cohort — e.g. Jan 2025] | Change |
|---|---|---|---|
| Period 1 retention | [X%] | [X%] | [↑/↓ X pp] |
| Period 3 retention | [X%] | [X%] | [↑/↓ X pp] |
| Activation rate | [X%] | [X%] | [↑/↓ X pp] |
| Avg. sessions in first 30 days | [X] | [X] | [↑/↓] |
**Verdict:** [Are more recent cohorts performing better or worse? What shipped in that period that might explain the change?]
---
## 7. Recommendations
Prioritise by impact on retention curve:
| # | Recommendation | Target segment | Expected impact | Effort | Priority |
|---|---|---|---|---|---|
| 1 | [e.g. Redesign onboarding to hit activation milestone in day 1, not day 7] | [Never-activated segment] | [+X pp P1 retention] | [Medium] | P1 |
| 2 | [e.g. Launch re-engagement sequence at day 7 inactivity trigger] | [Dormant segment] | [+X pp P2 retention] | [Low] | P1 |
| 3 | [e.g. Introduce power-user features earlier to accelerate habit formation] | [Casual users] | [+X pp P6 LTV] | [High] | P2 |
---
## 8. SQL Reference (if applicable)
Provide the core cohort query so data teams can replicate or extend the analysis:
```sql
-- Retention cohort query (Snowflake-style DATE_TRUNC / DATEDIFF; adapt for other dialects)
WITH cohorts AS (
  -- One row per user, assigned to exactly one cohort by signup month
  SELECT user_id, DATE_TRUNC('month', created_at) AS cohort_month
  FROM users
  WHERE created_at >= '[start_date]'
),
cohort_sizes AS (
  -- Counted before joining to events, so inactive users stay in the denominator
  SELECT cohort_month, COUNT(DISTINCT user_id) AS cohort_size
  FROM cohorts
  GROUP BY 1
)
SELECT
  c.cohort_month,
  DATE_TRUNC('month', e.event_date) AS activity_month,
  DATEDIFF('month', c.cohort_month, e.event_date) AS period,
  COUNT(DISTINCT e.user_id) AS retained_users,
  s.cohort_size,
  ROUND(COUNT(DISTINCT e.user_id) * 100.0 / s.cohort_size, 1) AS retention_rate
FROM cohorts c
JOIN events e ON c.user_id = e.user_id
JOIN cohort_sizes s ON c.cohort_month = s.cohort_month
WHERE e.event_type = '[key_retention_event]'
GROUP BY 1, 2, 3, 5
ORDER BY 1, 3;
```
---
## Quality Checks
- [ ] Cohort definition is unambiguous — the same user cannot appear in two cohorts
- [ ] Retention curve shows a clear plateau, or the analysis notes that the window is too short to see one
- [ ] LTV projection uses observed retention, not assumed
- [ ] Behavioural segments are mutually exclusive and exhaustive
- [ ] Recommendations are tied to specific cohort or segment findings — not generic growth advice
- [ ] Leading indicators are observable in production data, not just in theory
## Anti-Patterns
- [ ] Do not allow the same user to appear in multiple cohorts — overlapping cohorts produce retention numbers that cannot be compared or acted upon
- [ ] Do not use an assumed ARPU in LTV projections — use observed revenue per retained user per period, not a blended average that hides segment differences
- [ ] Do not draw conclusions from cohorts too small to be statistically meaningful — flag minimum cohort size thresholds and note when a cohort is too small to trust
- [ ] Do not conflate retention rate with engagement rate — a user who logs in but does not complete the key retention event is not retained by the definition used
- [ ] Do not make recommendations without connecting them to specific cohort or segment findings — generic growth advice that could apply to any product adds no value
## Example Trigger Phrases
- "Run a cohort analysis for our SaaS product"
- "Analyse retention by acquisition month for the last 12 cohorts"
- "What's the LTV of users who came via paid vs organic?"
- "Build a cohort retention model showing period 0 through period 12"
- "Segment users by behaviour and show me which group retains best"
@@ -114,6 +114,14 @@ Flag any fields that may not exist in current data infrastructure.
- [ ] Data requirements section flags any missing fields
- [ ] Filters are practical and don't require IT to configure
## Anti-Patterns
- [ ] Do not specify metrics that the available data sources cannot actually support — always validate data availability
- [ ] Do not include more than 8-10 primary metrics on a single dashboard — more creates noise, not insight
- [ ] Do not skip the primary business question — a dashboard without a north-star question becomes a vanity metrics display
- [ ] Do not choose chart types for aesthetic reasons — every chart type must match the data relationship it represents
- [ ] Do not leave filter configurations vague — specify exact filter values, not just filter categories
## Example Trigger Phrases
- "Design a dashboard to track [business process]"
@@ -0,0 +1,229 @@
---
name: data-pipeline-spec
description: "Design an ETL/ELT data pipeline specification. Use when asked to design a data pipeline, spec an ETL or ELT process, document a data ingestion workflow, or plan a data integration. Produces a complete pipeline spec with sources, transforms, destinations, SLAs, error handling, and data quality rules."
---
# Data Pipeline Spec Skill
This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.
## Required Inputs
Ask the user for these if not provided:
- **Pipeline purpose** — what business question or workflow does this pipeline serve?
- **Source systems** — where does data come from? (databases, APIs, files, event streams)
- **Destination** — where does data land? (data warehouse, data lake, downstream DB, reporting tool)
- **Transformation type** — ETL (transform before loading) or ELT (load raw, transform in warehouse)?
- **Frequency / SLA** — how often must data be fresh? (real-time / hourly / daily / weekly)
- **Volume estimate** — approximate rows/events per run
- **Data quality requirements** — completeness, deduplication, freshness, schema enforcement
- **Team or stack** — any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)
## Output Structure
---
# Data Pipeline Spec: [Pipeline Name]
**Purpose:** [One sentence — what decision or workflow does this pipeline enable?]
**Type:** [ETL / ELT / Streaming / Batch]
**Owner:** [Team or individual]
**Version:** [1.0]
**Date:** [Date]
**Status:** [Draft / Under Review / Approved]
---
## 1. Overview
[2-3 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why.]
**Architecture diagram (text):**
```
[Source A] ──┐
[Source B] ──┤──► [Ingestion Layer] ──► [Transform Layer] ──► [Destination] ──► [Consumers]
[Source C] ──┘
```
---
## 2. Sources
| Source | System | Connection type | Data format | Update pattern | Volume |
|---|---|---|---|---|---|
| [Source 1] | [PostgreSQL / Salesforce / S3 / Kafka] | [JDBC / REST API / SDK / Webhook] | [JSON / CSV / Parquet / CDC] | [Append / Full refresh / Incremental] | [X rows/day] |
| [Source 2] | [...] | [...] | [...] | [...] | [...] |
**Incremental key (if applicable):** [The column used to identify new or changed records — e.g. `updated_at`, `event_id`]
**Authentication:** [API key / OAuth / IAM role / connection string — note where credentials are stored]
---
## 3. Ingestion Layer
**Tool:** [Fivetran / Airbyte / Kafka Connect / custom script / dbt source]
**Ingestion method:**
- [ ] Full extract (full table refresh each run)
- [ ] Incremental extract (only new/changed rows since last run)
- [ ] CDC (change data capture from database transaction log)
- [ ] Event streaming (continuous ingestion from Kafka/Kinesis)
**Raw landing zone:** [Where raw data lands before transformation — e.g. `raw.salesforce_opportunities` in Snowflake, S3 bucket `s3://data-raw/crm/`]
**Schema handling:** [Strict schema enforcement / Schema evolution allowed / Union schema]
---
## 4. Transformation Logic
List each transformation in execution order. For ELT pipelines, this is the dbt model or SQL layer.
| Step | Name | Description | Input | Output | Tool |
|---|---|---|---|---|---|
| 1 | [Deduplicate events] | [Remove duplicate event rows based on event_id] | `raw.events` | `staging.events_deduped` | [dbt / SQL / Spark] |
| 2 | [Join user profile] | [Enrich events with user attributes from CRM] | `staging.events_deduped`, `raw.users` | `staging.events_enriched` | [...] |
| 3 | [Aggregate to daily] | [Roll up to user×day grain] | `staging.events_enriched` | `mart.user_daily_activity` | [...] |
**Business logic rules:**
- [e.g. Revenue is recognised on `payment_confirmed_at`, not `payment_initiated_at`]
- [e.g. Users with an `@company.com` email address are excluded from all metrics]
- [e.g. Currency conversion uses the ECB rate from the first business day of each month]
**Slowly Changing Dimensions (SCD) — if applicable:**
- [e.g. `users.plan_tier` is SCD Type 2 — keep history of plan changes with `valid_from` / `valid_to`]
---
## 5. Destination
| Destination | System | Schema / Table | Write mode | Consumers |
|---|---|---|---|---|
| [Primary] | [Snowflake / BigQuery / Redshift / PostgreSQL] | [`analytics.mart_user_activity`] | [Append / Upsert / Full replace] | [Looker / Metabase / downstream pipeline] |
| [Secondary] | [...] | [...] | [...] | [...] |
**Partitioning / Clustering:** [e.g. Partitioned by `event_date`, clustered by `user_id` — reduces query cost for time-range scans]
**Retention policy:** [e.g. Raw data retained for 90 days; mart tables retained indefinitely]
---
## 6. Scheduling & SLAs
| SLA | Target | Breach action |
|---|---|---|
| **Data freshness** | [Data must be ≤ X hours old by HH:MM UTC] | [Page on-call / alert Slack channel] |
| **Pipeline completion** | [Must complete within X minutes of trigger] | [Alert and auto-retry] |
| **Availability** | [Pipeline must run successfully X% of days per month] | [Incident review] |
**Schedule:** [Cron expression and human description — e.g. `0 6 * * *` — daily at 06:00 UTC]
**Trigger type:**
- [ ] Time-based (cron)
- [ ] Event-based (triggered by upstream pipeline success / file arrival / Kafka lag)
- [ ] Manual (ad hoc runs only)
**Backfill strategy:** [How to reprocess historical data if the pipeline fails or logic changes — e.g. parameterised date range, full drop-and-reload]
---
## 7. Data Quality Rules
| Check | Table | Rule | Failure action |
|---|---|---|---|
| Completeness | `staging.events` | `event_id IS NOT NULL` — 100% of rows | Block load / Alert |
| Uniqueness | `mart.user_daily_activity` | `(user_id, date)` must be unique | Block load |
| Freshness | `mart.user_daily_activity` | `max(event_date) >= CURRENT_DATE - 1` | Alert |
| Volume | `staging.events` | Row count within ±20% of 7-day average | Alert |
| Referential integrity | `staging.events` | All `user_id` values exist in `users` table | Alert |
**DQ tool:** [dbt tests / Great Expectations / Monte Carlo / custom SQL assertions]
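
For teams writing custom assertions rather than using one of the tools above, the volume and uniqueness rules in the table could be sketched as below. This is a minimal illustration under assumed inputs (row counts as plain numbers, rows as dicts). The function names are hypothetical and not taken from any DQ library.

```python
from statistics import mean

def volume_within_tolerance(todays_rows, last_7_days, tolerance=0.20):
    """True if today's row count is within +/- tolerance of the 7-day average."""
    baseline = mean(last_7_days)
    return abs(todays_rows - baseline) <= tolerance * baseline

def duplicate_keys(rows, key_fields):
    """Return the set of key tuples that appear more than once in rows."""
    seen, dupes = set(), set()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes

# Illustrative data only.
history = [1000, 980, 1020, 1010, 990, 1005, 995]
rows = [
    {"user_id": 1, "date": "2025-01-01"},
    {"user_id": 1, "date": "2025-01-02"},
    {"user_id": 1, "date": "2025-01-01"},  # violates (user_id, date) uniqueness
]
volume_ok = volume_within_tolerance(1150, history)
dupes = duplicate_keys(rows, ("user_id", "date"))
```

A non-empty `dupes` set would map to the "Block load" failure action, while a failed volume check would map to "Alert".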
---
## 8. Error Handling & Recovery
**Retry policy:** [e.g. 3 retries with exponential back-off: 5 min, 20 min, 60 min]
**Failure modes and responses:**
| Failure | Detection | Response | Owner |
|---|---|---|---|
| Source unavailable | HTTP 5xx / connection timeout | Retry 3×, then alert and skip run | Data engineering |
| Schema change in source | Column missing or type mismatch | Block load, alert schema owner | Data owner + engineering |
| DQ check fails | dbt test failure / assertion error | Block load for P1 checks; alert for P2 | Data engineering |
| Partial load | Row count < expected threshold | Alert; do not publish to consumers until resolved | Data engineering |
**Dead-letter queue:** [Where failed records are routed for manual inspection — e.g. `raw.dlq_events`]
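
The retry policy described in this section can be sketched as a small wrapper. This is an assumption-laden illustration, not an orchestrator's API: `run_with_retries` is a hypothetical name, the delays mirror the example policy (5, 20, and 60 minutes), and `sleep` is injectable so the logic can be exercised without waiting.

```python
import time

def run_with_retries(task, delays_seconds=(300, 1200, 3600), sleep=time.sleep):
    """Call task(); after each failure, wait for the next delay and try again.

    Returns (result, None) on success, or (None, last_error) once every
    retry has been used.
    """
    last_error = None
    for attempt in range(len(delays_seconds) + 1):
        try:
            return task(), None
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < len(delays_seconds):
                sleep(delays_seconds[attempt])
    return None, last_error

# Usage with a fake task that fails twice, then succeeds.
waits = []
calls = {"count": 0}

def flaky_extract():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source unavailable")
    return "loaded"

result, error = run_with_retries(flaky_extract, sleep=waits.append)
```

When `error` is not `None`, the caller would alert and route the failed batch to the dead-letter location named above.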
---
## 9. Monitoring & Observability
**Metrics to track:**
- Pipeline run duration (p50, p95)
- Rows processed per run
- DQ check pass rate
- Source freshness lag
- Error rate per source
**Alerting:**
- [Slack channel: #data-alerts]
- [PagerDuty: data-on-call escalation for P1 SLA breaches]
- [Dashboard: [link to monitoring dashboard]]
**Logging:** [What gets logged and where — e.g. Airflow task logs to CloudWatch, structured JSON to data lake]
---
## 10. Dependencies & Sequencing
**Upstream dependencies:** [Which pipelines or data sources must succeed before this pipeline runs?]
**Downstream dependents:** [Which dashboards, pipelines, or models depend on this pipeline's output?]
```
[upstream pipeline A] ──► THIS PIPELINE ──► [downstream dashboard B]
└──► [downstream pipeline C]
```
**Coordination mechanism:** [Airflow DAG dependency / dbt ref() / event trigger / manual gate]
---
## 11. Security & Compliance
- **PII fields:** [List columns containing PII — e.g. `email`, `ip_address`, `name`]
- **Masking / Pseudonymisation:** [e.g. email hashed with SHA-256 before landing in mart layer]
- **Access control:** [Who can query the destination tables? — e.g. Role-based access in Snowflake]
- **Data residency:** [Which regions is data permitted to transit and rest in?]
- **Audit trail:** [Is pipeline execution auditable for compliance purposes? Where are logs retained?]
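
As an illustration of the masking example above, email pseudonymisation with SHA-256 might look like the following sketch. The salt is an assumption added here (the text only mentions hashing), and `pseudonymise_email` is a hypothetical helper name.

```python
import hashlib

def pseudonymise_email(email, salt):
    """Return a hex SHA-256 digest of the salted, normalised email address."""
    # Normalise first so the same address always maps to the same digest.
    normalised = email.strip().lower()
    return hashlib.sha256((salt + normalised).encode("utf-8")).hexdigest()

# Illustrative values only; a real salt would come from a secrets store.
digest_a = pseudonymise_email("  Alice@Example.com ", salt="demo-salt")
digest_b = pseudonymise_email("alice@example.com", salt="demo-salt")
```

Because the digest is stable for a given salt, it can still serve as a join key in the mart layer without exposing the raw address.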
---
## Quality Checks
- [ ] Every source has an incremental key or full-refresh justification
- [ ] Business logic rules are documented, not just the SQL
- [ ] SLAs are agreed with consumers, not set unilaterally by engineering
- [ ] DQ checks cover completeness, uniqueness, freshness, and volume
- [ ] Failure modes include a documented recovery owner
- [ ] PII fields are identified and a treatment plan is specified
## Anti-Patterns
- [ ] Do not spec a pipeline without defining SLAs — "as fast as possible" is not an acceptable freshness target
- [ ] Do not omit error handling and dead-letter queue strategy — every pipeline must specify what happens to failed records
- [ ] Do not design loads that are not idempotent, and do not leave the deduplication key undocumented — assume reruns will happen
- [ ] Do not leave data quality rules implicit — schema validation, null checks, and referential integrity must be explicit
- [ ] Do not ignore schema evolution — specify how upstream schema changes are detected and handled
## Example Trigger Phrases
- "Design a data pipeline for our Salesforce to Snowflake sync"
- "Write a pipeline spec for ingesting Stripe events into our data warehouse"
- "Build an ETL spec for our user activity data"
- "Document our dbt pipeline from raw events to the analytics mart"
- "Spec out the pipeline that feeds the executive dashboard"
@@ -94,6 +94,14 @@ Suggest a 3-tier dashboard structure:
- [ ] Dashboard tiers are tailored to the product stage
- [ ] All metric definitions are unambiguous (formula or clear description)
## Anti-Patterns
- [ ] Do not set a North Star metric that measures business activity (revenue, pageviews) rather than customer value delivered — this creates incentives misaligned with product quality
- [ ] Do not define metrics without specifying the formula or data source — an ambiguous metric will be measured differently by different people
- [ ] Do not skip counter-metrics — optimising any single metric without a guard rail will eventually produce perverse incentives
- [ ] Do not include more than 4-5 metrics in a daily team view — a dashboard with 20 metrics is a dashboard nobody looks at
- [ ] Do not classify all metrics as "leading" — be honest about which are lagging outcome metrics and which genuinely predict future outcomes
## Example Trigger Phrases
- "Build a metrics framework for [product]"
@@ -1,6 +1,6 @@
---
name: sql-query-explainer
description: "Explain, optimise, or translate SQL queries into plain language. Use when asked to explain a SQL query, optimise slow SQL, write a data dictionary, translate SQL to plain English for non-technical stakeholders, or review a query for correctness and performance. Works across PostgreSQL, MySQL, BigQuery, Snowflake, and standard SQL."
description: "Explains, optimises, writes, and documents SQL queries. Use when asked to explain a SQL query, optimise slow SQL, translate SQL to plain English for non-technical stakeholders, write a query from a natural language description, or produce query documentation. Produces plain-English explanations, annotated optimised queries, or a data dictionary covering output shape, assumptions, and known limitations. Works across PostgreSQL, MySQL, BigQuery, Snowflake, and standard SQL."
---
# SQL Query Explainer Skill
Binary file not shown.
@@ -1,12 +1,21 @@
---
name: ab-test-planner
description: Designs statistically rigorous A/B tests for product features, UI changes, onboarding flows, and pricing experiments. Use when asked to set up an experiment, run an A/B test, calculate sample size, or interpret test results. Triggers on "A/B test", "experiment", "split test", "statistical significance", "sample size".
description: "Design statistically rigorous A/B tests for product features, UI changes, onboarding flows, and pricing experiments. Use when asked to set up an experiment, design an A/B test, calculate sample size, or interpret test results. Produces a complete test plan with hypothesis, variant definitions, sample size, duration estimate, guardrail metrics, and a results interpretation guide."
---
# A/B Test Planner Skill
Design experiments that produce trustworthy results — not just directional signals. Every test output includes hypothesis, success metrics, sample size, duration, and a results interpretation guide.
## Required Inputs
Ask the user for these if not provided:
- **What is being tested** (feature, UI change, copy, pricing, onboarding step)
- **Hypothesis** (or ask to help formulate one)
- **Primary metric** (conversion rate, click-through, completion rate, etc.)
- **Baseline rate** and **minimum detectable effect** (MDE)
- **Daily eligible users** (to calculate duration)
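
The inputs above are enough for a first-pass sample size and duration estimate. The sketch below uses the standard two-proportion normal approximation for a two-sided test. The defaults (alpha 0.05, power 0.80), the absolute MDE convention, and the function names are assumptions for illustration, not something this skill prescribes.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users needed per variant to detect an absolute lift of mde_abs."""
    p1, p2 = baseline, baseline + mde_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

def duration_days(n_per_variant, daily_eligible_users, variants=2):
    """Days needed to fill every variant, assuming all eligible users enter the test."""
    return math.ceil(n_per_variant * variants / daily_eligible_users)

# Illustrative inputs: 10% baseline, 2 percentage point MDE, 1,000 eligible users per day.
n = sample_size_per_variant(baseline=0.10, mde_abs=0.02)
days = duration_days(n, daily_eligible_users=1000)
```

If the computed duration is shorter than two weeks, the test would still run for the two-week minimum in the quality checks to cover weekly seasonality.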
## Experiment Design Checklist
Before running any test, confirm:
@@ -93,3 +102,20 @@ Flag if traffic is too low to reach significance in under 8 weeks — recommend
- If user wants to test multiple variants, explain the multiple comparisons problem and recommend a Bonferroni correction or a Bayesian approach
- If traffic is very low (<1,000 users/day), recommend qualitative alternatives: moderated testing, 5-second tests, or user interviews
- Never approve a test with no guardrail metrics — always protect revenue, retention, or core engagement
## Anti-Patterns
- [ ] Do not run a test without a directional hypothesis — "let's see what happens" produces uninterpretable results
- [ ] Do not declare a winner before reaching the pre-planned sample size — peeking at results inflates false positive rates
- [ ] Do not test multiple independent changes in a single variant — you won't know which change caused the result
- [ ] Do not use engagement metrics (clicks, time-on-page) as the primary metric when the goal is revenue or retention — proxy metrics mislead
- [ ] Do not ignore guardrail metrics — a conversion lift that causes a support ticket spike is not a win
## Quality Checks
- [ ] Hypothesis is directional (predicts a specific direction and magnitude, not "let's see")
- [ ] Primary metric is singular (guardrail metrics are secondary)
- [ ] Sample size is calculated from actual MDE and baseline (not guessed)
- [ ] Test duration accounts for weekly seasonality (minimum 2 weeks)
- [ ] Guardrail metrics are defined (at least one to protect revenue or core engagement)
- [ ] Rollback trigger is specified with a concrete threshold
@@ -1,6 +1,6 @@
---
name: go-to-market-planner
description: Builds go-to-market (GTM) plans for product launches, feature releases, and new market entries. Use when planning a product launch, writing a GTM strategy, defining launch tiers, or coordinating cross-functional launch activities. Triggers on "go-to-market", "GTM plan", "product launch plan", "launch strategy", "release plan".
description: "Build a go-to-market plan for any product launch, feature release, or new market entry. Use when planning a product launch, writing a GTM strategy, defining launch tiers, or coordinating cross-functional launch activities. Produces a tiered GTM plan with messaging, cross-functional activity tracker, success metrics, and launch day checklist."
---
# Go-to-Market Planner Skill
@@ -106,9 +106,36 @@ Always confirm tier with the user before proceeding.
---
## Required Inputs
Ask the user for these if not provided:
- **Product or feature name**
- **Target launch date**
- **Launch tier** (Tier 1 / 2 / 3 — or describe scope and the skill will classify)
- **Target audience** (who benefits and who it's NOT for)
- **Key message** (what's the headline outcome for the customer)
- **PM and launch owner**
## Guidelines
- Never plan a Tier 1 launch without at least 8 weeks of lead time
- Always include a "Not for" section — it prevents misdirected sales and support tickets
- Recommend a soft launch to 5-10% of users before full rollout for any Tier 1 or 2 launch
- Post-launch retrospective should be scheduled at launch planning time — don't leave it to chance
## Quality Checks
- [ ] Launch tier is confirmed and appropriate for scope
- [ ] "Not for" section is included to prevent misdirected sales and support
- [ ] Every function has at least one activity with a named owner and due date
- [ ] Success metrics include a measurement window (30/60/90 days)
- [ ] Rollback procedure is confirmed for Tier 1 and 2 launches
- [ ] Post-launch retrospective is scheduled
## Anti-Patterns
- [ ] Do not build a Tier 1 GTM plan for an incremental feature update — tier the launch appropriately before planning
- [ ] Do not create activity lists without named owners and due dates — unowned tasks do not get done
- [ ] Do not skip the rollback procedure for Tier 1 and 2 launches — every significant launch must have an abort plan
- [ ] Do not treat marketing and engineering as separate tracks — cross-functional coordination is the whole point of a GTM plan
- [ ] Do not set success metrics without a defined measurement window — "increase signups" is not a measurable target
@@ -0,0 +1,101 @@
---
name: pptx-slide-auditor
description: "Audit a PowerPoint presentation for layout issues, text overflow, visual hierarchy problems, and consistency gaps. Use when asked to review a slide deck, check a presentation before a meeting, audit slides for layout problems, or QA a deck before sharing. Produces a slide-by-slide report with issues ranked by severity and specific fixes. Best used with Claude Opus 4.7 or newer for reliable slide-level vision analysis."
---
# PPTX Slide Auditor Skill
Runs a systematic visual and structural audit of a PowerPoint presentation — identifying layout issues, text overflow, inconsistent styling, weak visual hierarchy, and slides that will cause problems in a presentation setting. Built to leverage Opus 4.7 vision improvements for pixel-level layout analysis.
## Required Inputs
Ask the user for these if not provided:
- **The deck** (upload the .pptx file or individual slide screenshots)
- **Audience** (internal team / executive / external client / conference / investor)
- **Presentation mode** (presented live / sent to read / shared async on video)
- **Areas of concern** (optional — e.g. "I think slide 12 is overcrowded")
## Output Structure
### 1. Deck Overview
| Metric | Result |
|---|---|
| Total slides | N |
| Overall status | Ready / Minor fixes needed / Major revisions required |
| Readability score | /10 |
| Visual consistency score | /10 |
| Most common issue | [Pattern observed across multiple slides] |
### 2. Slide-by-Slide Audit
For each slide with issues:
**Slide N: [Slide title]**
- Status: Ready / Fix before sending / Major revision
- Issues found:
- [Specific issue with exact location — e.g. "Body text extends beyond the text frame on the right side"]
- [Issue 2]
- Suggested fix: [Specific action — move element, reduce text, resize]
Slides with no issues: just list the slide numbers. Do not write anything else about them.
### 3. Pattern Issues Across the Deck
Issues that repeat across multiple slides:
**[Pattern title — e.g. "Inconsistent body text size"]**
- Slides affected: [list]
- Root cause: [master slide issue / manual overrides / mixed templates]
- Fix: [Single action to resolve across all affected slides]
### 4. Visual Hierarchy Check
| Dimension | Status | Notes |
|---|---|---|
| Title consistency (size, font, colour) | Pass / Fail | |
| Body text readability at presentation distance | Pass / Fail | |
| Image placement alignment | Pass / Fail | |
| Whitespace and breathing room | Pass / Fail | |
| Data visualisation clarity | Pass / Fail / N/A | |
### 5. Audience-Specific Flags
Based on the stated audience:
- **Executive audience:** flag slides with too much text, complex tables, or unclear bottom-line messages
- **External client:** flag slides with internal jargon, unfinished placeholder text, or confidentiality concerns
- **Live presentation:** flag slides that will be hard to read from the back of a room
- **Async/video:** flag slides that assume a presenter voiceover
### 6. Prioritised Fix List
| # | Fix | Slide | Effort | Impact |
|---|---|---|---|---|
| 1 | [Specific fix] | Slide N | Low/Med/High | High |
Order by: fixes before handoff (critical) > consistency fixes (high) > polish (medium).
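A minimal sketch of the ordering rule above. It assumes each fix is a record with hypothetical `priority`, `slide`, and `fix` fields; these names are illustrative and not part of the skill's output format.

```python
# Hypothetical priority ranks taken from the ordering rule: lower sorts first.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2}


def order_fixes(fixes):
    """Sort fixes by priority, breaking ties by slide number."""
    return sorted(fixes, key=lambda f: (PRIORITY_RANK[f["priority"]], f["slide"]))


fixes = [
    {"fix": "Align logos", "slide": 3, "priority": "medium"},
    {"fix": "Remove placeholder text", "slide": 9, "priority": "critical"},
    {"fix": "Unify body text size", "slide": 2, "priority": "high"},
]
ordered = order_fixes(fixes)
```

Sorting on priority first means a critical fix on slide 9 is listed ahead of a polish item on slide 3, which is the behaviour the fix list calls for.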
## Quality Checks
- [ ] Every issue references a specific slide number and location on the slide
- [ ] Pattern issues are identified separately from slide-specific issues
- [ ] Fix list is ordered by impact, not by slide order
- [ ] Audience-appropriate concerns flagged explicitly
- [ ] Slides without issues are listed briefly, not ignored
## Anti-Patterns
- [ ] Do not flag stylistic preferences as issues — only report genuine layout problems, overflow, and consistency errors
- [ ] Do not produce a flat list of issues — group by severity (Critical / Major / Minor) so fixes can be prioritised
- [ ] Do not skip slides without commenting — every slide must have an explicit pass or issue status
- [ ] Do not suggest redesigning content — the audit scope is layout, consistency, and readability, not messaging
- [ ] Do not report the same issue type repeatedly across slides without summarising the pattern — consolidate repeated issues
## Example Trigger Phrases
- "Audit this slide deck before my board meeting"
- "Review this PowerPoint for layout issues"
- "Check this presentation for consistency problems"
- "QA my deck before I send it to the client"
- "What is wrong with slide 7 in this deck?"
## Why This Works Better on Opus 4.7
Earlier models struggled with precise spatial analysis of slide layouts — they would hallucinate issues or miss obvious overflow problems. Opus 4.7 vision improvements mean coordinates map 1:1 to pixels, making slide-level issue detection reliable without manual screenshot annotation.
@@ -1,12 +1,21 @@
---
name: product-launch-checklist
description: Generates comprehensive pre-launch, launch day, and post-launch checklists for product releases. Use when preparing for a product launch, feature release, or major update. Triggers on "launch checklist", "pre-launch", "launch day", "release checklist", "ship checklist", "go-live checklist".
description: "Generate a comprehensive pre-launch, launch day, and post-launch checklist for any product release. Use when preparing for a product launch, feature release, or major update. Produces a role-assigned, tiered checklist covering engineering readiness, marketing and comms, support, and post-launch monitoring."
---
# Product Launch Checklist Skill
Never launch without checking everything. Generate a complete, role-assigned checklist covering pre-launch readiness, launch day execution, and post-launch monitoring.
## Required Inputs
Ask the user for these if not provided:
- **Launch name** and planned launch date
- **Launch tier** (1 = major product launch, 2 = significant feature release, 3 = incremental update)
- **Team members and their roles** (engineering lead, PM, marketing, support, etc.)
- **Feature description** (what is being launched)
- **Rollback capability** (can this be feature-flagged or reverted quickly?)
## How to Use This Skill
Provide:
@@ -109,6 +118,22 @@ The skill generates a tiered checklist. Tier 3 launches use only the Essentials
---
## Quality Checks
- [ ] Launch tier confirmed before generating checklist (scope determines depth)
- [ ] Go/No-Go decision has a named owner and a specific decision time
- [ ] Rollback procedure is documented and tested (not just planned)
- [ ] Feature flag expansion is staged (5% → 50% → 100%), not all-at-once
- [ ] Post-launch retrospective is scheduled at launch time
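One common way to stage a feature flag expansion is deterministic bucketing, sketched below. The skill does not prescribe this mechanism; the function names and the hashing choice are assumptions made for illustration.

```python
import hashlib

STAGES = (5, 50, 100)  # rollout percentages from the staged-expansion check above


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, rollout_percent: int) -> bool:
    """A user is in the rollout when their bucket falls below the percentage."""
    return bucket(user_id) < rollout_percent


users = [f"user-{n}" for n in range(1000)]
enabled_per_stage = {pct: {u for u in users if is_enabled(u, pct)} for pct in STAGES}
```

Because the bucket is stable per user, everyone enabled at 5% stays enabled at 50% and 100%, so widening the rollout never switches the feature off for an existing user.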
## Anti-Patterns
- [ ] Do not apply a Tier 1 checklist to an incremental update — tier the launch appropriately before generating the checklist
- [ ] Do not launch on a Friday without confirmed weekend engineering coverage
- [ ] Do not leave the Go/No-Go decision owner as "the team" — it must be a named individual
- [ ] Do not skip the rollback plan for Tier 1 and 2 launches — know the revert time before going live
- [ ] Do not close the launch without scheduling the post-launch retrospective — it must be booked at launch time, not after
## Guidelines
- The Go/No-Go decision must have a named owner — "the team" is not an owner
@@ -1,19 +1,20 @@
---
name: retro-analysis
description: Analyse sprint delivery data and produce a structured retrospective brief
tool_integration: Jira, Miro
description: "Analyses sprint delivery data and produces a structured retrospective brief. Use when asked to run a retrospective, analyse sprint data, prepare a retro brief, or turn sprint metrics into discussion prompts. Produces a data-grounded retrospective brief with completion stats, pattern analysis, Start/Stop/Continue prompts, and one concrete experiment for next sprint."
---
# Retrospective Analysis Skill
## Purpose
Generate a data-grounded retrospective brief that separates facts from feelings, so the team spends retro time on solutions rather than debating what happened.
## Required Inputs
- Sprint tickets: planned vs. completed
- Carry-over tickets and reasons
- Tickets reopened after closing
- Any incidents or unplanned work
- Sprint velocity vs. historical average
Ask the user for these if not provided:
- **Sprint tickets: planned vs. completed**
- **Carry-over tickets and reasons** (if known)
- **Tickets reopened after closing** (quality signal)
- **Any incidents or unplanned work** (scope creep signal)
- **Sprint velocity vs. historical average** (trend context)
## Process
1. Calculate: completion rate, carry-over rate, unplanned work percentage
@@ -21,8 +22,9 @@ Generate a data-grounded retrospective brief that separates facts from feelings,
3. Note any process or communication breakdowns visible in the data
4. Prepare 3 "Start / Stop / Continue" prompts based on the data — not generic, specific to this sprint
5. Suggest 1 concrete experiment for the next sprint based on the biggest friction point
6. **Validate** — Confirm each prompt is specific to this sprint (not a recycled generic prompt), and that the recommended experiment is concrete and measurable
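The calculations in step 1 could be sketched as follows. The exact definitions are assumptions: carry-over is taken as planned tickets not completed, and unplanned work is measured as a share of everything completed. Adjust them to match how your team counts.

```python
def sprint_metrics(planned, completed_planned, unplanned_done):
    """Return completion rate, carry-over rate, and unplanned work share."""
    if planned <= 0:
        raise ValueError("planned must be a positive ticket count")
    carried_over = planned - completed_planned
    total_done = completed_planned + unplanned_done
    return {
        "completion_rate": completed_planned / planned,
        "carry_over_rate": carried_over / planned,
        # Share of all completed work that was not planned at sprint start.
        "unplanned_pct": unplanned_done / total_done if total_done else 0.0,
    }


metrics = sprint_metrics(planned=20, completed_planned=15, unplanned_done=5)
```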
## Output Format
## Output Structure
### Sprint [Number] Retrospective Brief
@@ -35,9 +37,25 @@ Generate a data-grounded retrospective brief that separates facts from feelings,
[2-3 observations grounded in the numbers above]
**Discussion Prompts:**
- Start: [specific prompt]
- Stop: [specific prompt]
- Continue: [specific prompt]
- Start: [specific prompt based on this sprint's data]
- Stop: [specific prompt based on this sprint's data]
- Continue: [specific prompt based on this sprint's data]
**Suggested Experiment for Next Sprint:**
[One concrete, testable process change]
[One concrete, testable process change — with a specific success metric]
## Quality Checks
- [ ] Each Start/Stop/Continue prompt names a specific behaviour, not a vague category
- [ ] The recommended experiment is testable in one sprint
- [ ] Carry-over analysis identifies the ticket type or cause, not just the count
- [ ] Data observations don't assign blame — they describe patterns
- [ ] Velocity trend is mentioned in context (is this a one-off or a pattern?)
## Anti-Patterns
- [ ] Do not assign blame to individuals in the retrospective brief — observations must describe patterns, not people
- [ ] Do not produce Start/Stop/Continue prompts that are vague categories — each must name a specific behaviour
- [ ] Do not recommend an experiment that cannot be completed within one sprint — small, testable experiments only
- [ ] Do not treat carry-over tickets as a velocity problem without first identifying the root cause category
- [ ] Do not run the same retrospective format every sprint — vary the format to prevent engagement fatigue
@@ -1,19 +1,20 @@
---
name: sprint-brief
description: Generate a structured sprint brief from Jira sprint data and goals
tool_integration: Jira, Slack
description: "Generate a structured sprint brief from sprint data and goals. Use when asked to write a sprint brief, create a sprint summary, document sprint goals and scope, or produce a team-facing sprint overview. Produces a scannable brief with sprint goal, rationale, grouped work, critical path, risks, and definition of done."
---
# Sprint Brief Skill
## Purpose
Produce a clear, scannable sprint brief that every team member — engineer, designer, PM — can read in under three minutes and understand exactly what we're doing and why.
## Required Inputs
- Sprint name and number
- Sprint goal (1-2 sentences)
- Ticket list with owners
- Known dependencies or blockers
- Carry-over items from previous sprint
Ask the user for these if not provided:
- **Sprint name and number**
- **Sprint goal** (1-2 sentences — flag if too vague)
- **Ticket list with owners** (or a description of the work)
- **Known dependencies or blockers**
- **Carry-over items from previous sprint** (if any)
## Process
1. Read sprint goal and check it's specific and measurable — flag if it's too vague
@@ -21,11 +22,12 @@ Produce a clear, scannable sprint brief that every team member — engineer, des
3. Identify the critical path — which tickets must complete for the sprint goal to be met?
4. Flag risks: tickets with unclear acceptance criteria, missing designs, unresolved dependencies
5. Note carry-over items and whether they affect this sprint's goal
6. **Validate** — Confirm the sprint goal is achievable given the ticket scope and capacity. If the critical path items alone would fill the sprint, flag it as overloaded.
## Output Format
## Output Structure
### Sprint [Number] Brief — [Dates]
**Sprint Goal:** [1-2 sentences]
**Sprint Goal:** [1-2 sentences — specific and measurable]
**Why This Sprint Matters:** [Connect to quarterly OKR in 2-3 sentences]
**What We're Building:**
@@ -41,3 +43,19 @@ Produce a clear, scannable sprint brief that every team member — engineer, des
**Carry-over from Last Sprint:** [List + impact on current goal]
**Definition of Done:** [Specific, agreed criteria for sprint success]
## Quality Checks
- [ ] Sprint goal is specific enough to score pass/fail at the end of the sprint
- [ ] Critical path items are named — not just "the important ones"
- [ ] Every risk has a mitigation or owner (not just "this is a risk")
- [ ] Carry-over items are connected to their impact on this sprint's goal
- [ ] Definition of Done is agreed criteria, not a task list
## Anti-Patterns
- [ ] Do not write a sprint goal as a task list — the goal must be a single outcome-focused statement that can be scored pass/fail
- [ ] Do not leave the critical path unnamed — "the important tickets" is not a critical path
- [ ] Do not list risks without a mitigation or owner — a risk without a response is just a worry list
- [ ] Do not ignore carry-over items' impact on this sprint's capacity and goal
- [ ] Do not write a Definition of Done that mixes task completion with outcome criteria — they must be observable and agreed before the sprint starts
@@ -1,6 +1,6 @@
---
name: sprint-planning
description: Structures and facilitates sprint planning sessions. Use when asked to plan a sprint, organise backlog items, assign story points, create sprint goals, or prepare sprint planning meeting agendas. Triggers on phrases like "plan sprint", "sprint planning", "sprint goal", "sprint backlog".
description: "Structure and facilitate sprint planning sessions. Use when asked to plan a sprint, organise backlog items, assign story points, create sprint goals, or prepare sprint planning agendas. Produces a sprint goal, velocity-calibrated backlog, capacity plan, risk flags, and a structured sprint planning meeting agenda."
---
# Sprint Planning Skill
@@ -15,7 +15,7 @@ Transform raw backlog items into a structured, achievable sprint with clear goal
- **Sprint Planning Agenda** — structured 2-hour meeting agenda with timings
- **Risk Flags** — blockers or dependencies that could derail the sprint
## Inputs to Request From User
## Required Inputs
Ask for (if not already provided):
- Sprint duration (1 or 2 weeks)
@@ -87,3 +87,19 @@ Story points to commit = Historical velocity × Availability factor
- Recommend the team commits to 80% of available capacity, not 100%
- If no velocity data is provided, assume 20-30 points for a 5-person team as a starting point
- Highlight any story with unclear ownership as a blocker
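A small sketch of the capacity arithmetic described above. It assumes the 80% recommendation is applied on top of velocity multiplied by the availability factor, and it rounds down to stay conservative; both choices are assumptions rather than rules stated by the skill.

```python
import math


def points_to_commit(historical_velocity, availability_factor, commit_ratio=0.8):
    """Story points to commit, leaving slack for unplanned work."""
    if not 0 <= availability_factor <= 1:
        raise ValueError("availability_factor must be between 0 and 1")
    available = historical_velocity * availability_factor
    # Round down so the commitment never exceeds the recommended ratio.
    return math.floor(available * commit_ratio)


commitment = points_to_commit(historical_velocity=30, availability_factor=0.9)
```

With a velocity of 30 and 90% availability, the available capacity is 27 points and the suggested commitment is 21.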
## Quality Checks
- [ ] Sprint goal is outcome-focused (not "implement X" — something like "users can do Y")
- [ ] Team capacity is calculated using actual availability, not theoretical 100%
- [ ] Every story has an acceptance criterion (flag any that don't)
- [ ] Stories estimated at 8+ points are flagged for splitting
- [ ] Carry-overs from last sprint are accounted for in capacity
## Anti-Patterns
- [ ] Do not write sprint goals as task lists — goals must be outcome-focused and scoreable pass/fail at sprint end
- [ ] Do not commit to 100% of available capacity — always recommend 80% to preserve slack for unplanned work
- [ ] Do not carry stories with no acceptance criteria into the sprint — flag them as blockers before committing
- [ ] Do not allow stories estimated at 8+ points into the sprint without splitting them first
- [ ] Do not ignore carry-over items when calculating capacity — they consume capacity and must be accounted for before new work is pulled in
@@ -1,12 +1,20 @@
---
name: technical-spec-template
description: Creates structured technical specification documents that bridge product requirements and engineering implementation. Use when writing a tech spec, engineering spec, system design doc, or API specification. Triggers on "technical spec", "tech spec", "engineering spec", "system design doc", "API spec", "implementation spec".
description: "Create structured technical specification documents that bridge product requirements and engineering implementation. Use when writing a tech spec, engineering spec, system design doc, or API specification. Produces a complete spec with problem statement, proposed solution, data model, API design, alternatives considered, security considerations, testing plan, and rollout strategy."
---
# Technical Spec Template Skill
Write technical specifications that engineers actually read — clear problem framing, unambiguous requirements, explicit decisions, and documented trade-offs.
## Required Inputs
Ask the user for these if not provided:
- **Feature or system description** (what needs to be specced)
- **Related PRD or product brief** (if available)
- **Engineering reviewers** (whose sign-off is needed)
- **Known constraints** (technical limitations, security requirements, performance targets)
## When to Write a Tech Spec
Write a tech spec when:
@@ -131,3 +139,19 @@ Error codes: [list]
- Security and privacy sections are never optional for features that touch user data
- Recommend async review: engineers read first, then a 30-minute sync to resolve questions
- Keep the spec updated as implementation progresses — stale specs are worse than no specs
## Quality Checks
- [ ] Problem statement contains no solution language
- [ ] Non-goals explicitly list at least 2 things that might be assumed in scope
- [ ] At least 2 alternative approaches are documented with reasons for rejection
- [ ] Security and privacy section is completed for any feature touching user data
- [ ] All open questions have a named owner and due date (not "TBD")
## Anti-Patterns
- [ ] Do not include solution language in the problem statement — the problem must be described independently of the proposed solution
- [ ] Do not omit alternatives considered — a spec that considers only one approach has not been properly evaluated
- [ ] Do not leave open questions as "TBD" without a named owner and due date — unresolved questions are blockers
- [ ] Do not skip security and privacy sections for any feature that touches user data
- [ ] Do not write a non-goals section that is empty — always list at least two things that might be assumed in scope
@@ -0,0 +1,226 @@
---
name: user-story-writer
description: "Write well-structured user stories with acceptance criteria and edge cases. Use when asked to write user stories, create tickets from a feature brief, convert a PRD into stories, or write acceptance criteria. Produces ready-to-estimate stories in the standard format with clear acceptance criteria, edge cases, and definition of done."
---
# User Story Writer Skill
This skill produces production-ready user stories from a feature brief, PRD section, or verbal description. Each story follows the standard format with a clear who/what/why, behavioural acceptance criteria in Given/When/Then format, edge cases, and definition of done. Output is ready to paste into Jira, Linear, or your planning tool.
## Required Inputs
Ask the user for these if not provided:
- **Feature or change** to break into stories — paste the brief, PRD section, or describe the feature
- **User types / personas** involved (e.g. admin, end user, guest, API consumer)
- **Scope** — are we writing one story or decomposing an epic into a full set of stories?
- **Acceptance criteria format preference** — Given/When/Then, bullet checklist, or both?
- **Technical constraints or notes** — anything the engineering team has flagged that should shape the stories
## Output Structure
For each story:
---
## Story: [Short title — verb + noun, e.g. "Filter search results by date range"]
**Epic:** [Parent epic name — e.g. "Advanced Search"]
**Story ID:** [Jira/Linear ID — leave blank if not yet created]
**Priority:** [P1 / P2 / P3]
**Story points:** [Leave blank — for engineering to estimate]
---
### User Story
> **As a** [specific user type — not "user"],
> **I want to** [concrete action they want to take],
> **So that** [the outcome they achieve — business value, not feature description].
**Example:**
> As an **account manager**,
> I want to **filter my client list by last contact date**,
> so that I **can quickly identify clients I haven't spoken to in over 30 days and prioritise outreach**.
---
### Context
[1-3 sentences of context that isn't in the user story itself: when does this story matter, what triggers the need, how does it fit into a larger flow. This helps engineers understand why before they ask.]
---
### Acceptance Criteria
**Format: Given / When / Then**
Each criterion tests one specific behaviour. Write one GWT per observable outcome — not one GWT for the whole feature.
**AC1: [Short name for this criterion]**
```
Given [starting state or context]
When [user action]
Then [observable system behaviour]
```
**AC2: [Short name]**
```
Given [...]
When [...]
Then [...]
```
**AC3: [Short name]**
```
Given [...]
When [...]
Then [...]
```
---
### Edge Cases
[List scenarios that are non-obvious but must be handled. These become additional ACs or notes to engineering.]
- [ ] **[Edge case 1]:** [e.g. User applies a date filter that returns 0 results — show empty state with clear messaging and a "clear filters" action]
- [ ] **[Edge case 2]:** [e.g. User has >10,000 clients — filter must not degrade load time >200ms]
- [ ] **[Edge case 3]:** [e.g. Date filter persists across page refresh — or explicitly should not if that's the decision]
- [ ] **[Permission edge case]:** [e.g. Read-only users can see the filter but cannot save filter presets]
---
### Out of Scope
[Explicitly state what this story does NOT cover — prevents scope creep and clarifies where the next story begins.]
- Saving and sharing filter presets (separate story — see [Story X])
- Bulk actions on filtered results
- Exporting filtered client list to CSV
---
### Definition of Done
- [ ] Acceptance criteria all pass
- [ ] Edge cases handled (or explicitly deferred with a new ticket raised)
- [ ] Unit tests written for each AC
- [ ] Works on mobile viewport (if applicable)
- [ ] Accessibility: keyboard navigable and screen-reader compatible
- [ ] Error states are handled and copy approved
- [ ] Product and design have reviewed in staging
- [ ] No console errors in production build
---
## Epic Decomposition Template
If the user provides an epic or feature brief, decompose it into a full set of stories before writing them:
**Epic:** [Name]
**Goal:** [What outcome does completing this epic achieve?]
**Stories:**
| # | Story | Notes | Dependencies |
|---|---|---|---|
| 1 | [Core happy path story — the simplest version of the feature that delivers value] | | |
| 2 | [Validation / error handling story] | | Depends on #1 |
| 3 | [Edge case or power user story] | | Depends on #1 |
| 4 | [Admin or configuration story] | | |
| 5 | [Performance or scale story — if applicable] | | Depends on #1 |
**Suggested sprint order:** [Which stories are P1 for MVP? Which can follow in a later sprint?]
---
## Common Story Anti-Patterns — and Fixes
Use these to review stories before handing to engineering:
| Anti-pattern | Example | Fix |
|---|---|---|
| **Solution in the story** | "As a user I want a dropdown filter" | Remove the UI decision — "As a user I want to filter by date range" |
| **Vague "so that"** | "so that it's easier to use" | Make it specific — "so that I can prioritise outreach without opening each record manually" |
| **Too big** | Story covers 5 distinct user flows | Split into separate stories per flow |
| **No acceptance criteria** | Story has description only | Add at least 3 GWT criteria before engineering starts |
| **ACs that test the solution, not the behaviour** | "Given the dropdown is open, When I select an option" | Test the outcome — "Given I have applied a date filter, When I view my results, Then only clients last contacted in that date range appear" |
| **Missing empty state** | No AC for what happens with 0 results | Add it — empty states are part of the feature |
| **Missing error state** | No AC for network failure or invalid input | Add error handling ACs explicitly |
---
## Example: Full Story Set for a Feature
**Feature brief:** "Allow users to export their invoice history as a PDF or CSV"
---
### Story 1: Export invoice list as CSV
> As a **finance admin**,
> I want to **export my invoice history as a CSV file**,
> so that I can **import it into our accounting software without manual data entry**.
**AC1: Successful export**
```
Given I am on the Invoices page with at least one invoice
When I click "Export" and select "CSV"
Then a CSV file is downloaded containing all visible invoices with columns: Invoice ID, Date, Amount, Status, Customer Name
```
**AC2: Empty state**
```
Given I am on the Invoices page with no invoices
When I click "Export"
Then the export button is disabled and a tooltip reads "No invoices to export"
```
**AC3: Filtered export**
```
Given I have applied a date filter showing invoices from Jan 2026 only
When I click "Export" and select "CSV"
Then the export contains only invoices from Jan 2026 — not all invoices
```
**Edge cases:**
- [ ] Export with >10,000 invoices — must complete in <30s or show a progress indicator
- [ ] Export triggered on mobile — downloads to device's default download location
**Out of scope:** PDF export (Story 2), scheduled exports (future epic)
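To show how a Given/When/Then criterion maps onto an automated check, here is a rough sketch of AC3 (filtered export) as a test. The `export_csv` helper and the invoice records are hypothetical stand-ins for the real application code.

```python
import csv
import io
from datetime import date

COLUMNS = ["Invoice ID", "Date", "Amount", "Status", "Customer Name"]


def export_csv(invoices):
    """Serialise the currently visible invoices to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(invoices)
    return buffer.getvalue()


def check_filtered_export():
    # Given: a date filter showing invoices from Jan 2026 only
    all_invoices = [
        {"Invoice ID": "INV-1", "Date": date(2026, 1, 15), "Amount": 100,
         "Status": "Paid", "Customer Name": "Acme"},
        {"Invoice ID": "INV-2", "Date": date(2026, 2, 3), "Amount": 250,
         "Status": "Open", "Customer Name": "Globex"},
    ]
    visible = [i for i in all_invoices
               if (i["Date"].year, i["Date"].month) == (2026, 1)]
    # When: the user exports as CSV
    rows = list(csv.DictReader(io.StringIO(export_csv(visible))))
    # Then: the export contains only invoices from Jan 2026
    assert [r["Invoice ID"] for r in rows] == ["INV-1"]
    return rows


exported_rows = check_filtered_export()
```

Each Given, When, and Then clause becomes one commented step, which keeps the test readable next to the story it verifies.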
---
### Story 2: Export invoice list as PDF
> As a **finance admin**,
> I want to **export my invoice history as a formatted PDF**,
> so that I can **share a professional summary with our accountant**.
[... ACs follow same pattern ...]
---
## Quality Checks
- [ ] Every story has a specific user type — not "a user" or "the system"
- [ ] The "so that" explains business value — not just feature description
- [ ] Each AC tests one observable outcome — not a bundle of behaviours
- [ ] Empty states, error states, and edge cases are explicitly handled
- [ ] Out of scope is documented — not assumed
- [ ] Stories are independent — they can be shipped individually without depending on unreleased work (except where explicitly noted)
## Anti-Patterns
- [ ] Do not write user stories from a technical perspective — every story must be from the user's point of view and state their goal
- [ ] Do not write acceptance criteria that are untestable — every criterion must have a clear pass/fail condition
- [ ] Do not create stories that are too large to complete in a single sprint — break epics into estimable, independently deliverable stories
- [ ] Do not omit edge cases — unhappy paths and error states are required, not optional
- [ ] Do not skip the Definition of Done — without it, "done" means different things to different people
## Example Trigger Phrases
- "Write user stories for [feature] from this brief"
- "Break this PRD section into user stories with acceptance criteria"
- "Convert these feature requirements into Jira tickets"
- "Write the user stories and ACs for [feature name]"
- "Decompose this epic into individual stories ready for sprint planning"
@@ -167,6 +167,14 @@ Ask the user for these if not provided:
- [ ] Effort estimates are included for prioritisation
- [ ] Testing recommendations are included
## Anti-Patterns
- [ ] Do not rely solely on automated scanning tools — automated checks catch ~30% of issues; manual keyboard and screen reader testing is required
- [ ] Do not label an issue "minor" simply because it only affects a small percentage of users — for those users it may block all access
- [ ] Do not add ARIA roles to fix broken semantics — use correct semantic HTML first; ARIA is a last resort
- [ ] Do not confuse colour contrast of text with colour contrast of UI components — they have different minimum ratios (4.5:1 vs 3:1)
- [ ] Do not audit only the happy path — error states, empty states, and loading states must also meet accessibility requirements
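The two minimum ratios mentioned above can be checked with the WCAG 2.x contrast formula. This is a minimal sketch: the function names and the `MINIMUM_RATIO` mapping are illustrative, and the 4.5:1 text threshold applies to normal-size text.

```python
MINIMUM_RATIO = {"text": 4.5, "ui_component": 3.0}


def _linear(channel):
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a #rrggbb colour."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(colour_a, colour_b):
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(colour_a, colour_b, kind):
    """True when the pair meets the minimum ratio for 'text' or 'ui_component'."""
    return contrast_ratio(colour_a, colour_b) >= MINIMUM_RATIO[kind]
```

A mid-grey border might pass as a UI component at 3:1 while failing as body text at 4.5:1, which is why the two checks should be reported separately.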
## Example Trigger Phrases
- "Audit this design for accessibility"
@@ -1,6 +1,6 @@
---
name: design-critique
description: "Give structured, constructive feedback on any design. Use when asked to critique a design, review a UI, give feedback on a Figma file or wireframe, assess a user flow, or evaluate a design against UX principles. Applies Jobs-to-be-Done, Gestalt principles, and usability heuristics to give actionable feedback."
description: "Gives structured, constructive feedback on any design using UX frameworks. Use when asked to critique a design, review a UI, give feedback on a Figma file or wireframe, assess a user flow, or evaluate a design against UX principles. Applies Jobs-to-be-Done, Gestalt principles, and usability heuristics to give actionable feedback with prioritised issues and specific recommendations."
---
# Design Critique Skill
@@ -121,6 +121,14 @@ Prioritised list of the 3 most impactful changes. Each should be actionable in t
- [ ] Priority levels (High/Medium/Low) reflect actual impact on user goal
- [ ] Heuristic assessment only covers visible elements
## Anti-Patterns
- [ ] Do not lead with visual preference (e.g. "I don't like the colour") — every issue must reference a UX principle or user impact
- [ ] Do not invent problems in the "What's Working" section — manufactured praise undermines the entire critique
- [ ] Do not provide the same priority level (High/Medium/Low) to every issue — prioritisation requires genuine judgment about user impact
- [ ] Do not skip the JTBD section for product screens — connecting feedback to the user's job-to-be-done is what separates UX critique from aesthetic opinion
- [ ] Do not give recommendations that require a full redesign when the user is in high-fidelity — scope recommendations to the design stage
## Example Trigger Phrases
- "Critique this design: [description or image]"
@@ -0,0 +1,223 @@
---
name: design-system-audit
description: "Audit a design system for consistency, coverage, and quality. Use when asked to audit a design system, review a component library, assess design token coverage, or evaluate the health of a shared design system. Produces a structured audit with a health score, component coverage gaps, token inconsistencies, accessibility issues, and a prioritised remediation roadmap."
---
# Design System Audit Skill
This skill produces a structured audit of a design system — covering component coverage, token consistency, documentation quality, accessibility compliance, contribution processes, and adoption health. Output is ready for a design system team, design leadership, or an engineering team evaluating their shared component library.
## Required Inputs
Ask the user for these if not provided:
- **Design system name** and what product(s) it serves
- **Audit scope** — component library / design tokens / documentation / contribution process / all of the above
- **Current tooling** — Figma / Storybook / Zeroheight / custom / combination?
- **Team using it** — how many designers and engineers, how many products?
- **Known pain points** — what do teams complain about most?
- **Governance model** — centralised team / federated contributors / no dedicated team?
- **Goal of the audit** — improve adoption / prepare for a rebrand / onboard new teams / justify investment?
## Output Structure
---
# Design System Audit: [System Name]
**Products served:** [List of products / apps]
**Audit scope:** [Full / Components only / Tokens only / Documentation]
**Auditor:** [Name / Team]
**Date:** [Date]
**Stakeholders:** [Design lead, Eng lead, CPO, etc.]
---
## Overall Health Score
| Dimension | Score (1-5) | Status |
|---|---|---|
| Component coverage | [X/5] | 🟢/🟡/🔴 |
| Token consistency | [X/5] | 🟢/🟡/🔴 |
| Documentation quality | [X/5] | 🟢/🟡/🔴 |
| Accessibility compliance | [X/5] | 🟢/🟡/🔴 |
| Adoption rate | [X/5] | 🟢/🟡/🔴 |
| Contribution process | [X/5] | 🟢/🟡/🔴 |
| **Overall** | **[X/5]** | 🟢/🟡/🔴 |
**Summary:** [2-3 sentences. What is the overall state of the design system? What are the top 2 issues and what is the biggest strength?]
---
## 1. Component Coverage Audit
**How to assess:** Compare components in the design system against the actual UI patterns in the product. Every pattern that exists in production but not in the system is a coverage gap.
### Component Inventory
| Category | Components present | Coverage | Gap |
|---|---|---|---|
| **Navigation** | [Navbar, Sidebar, Breadcrumb, Tabs] | [80%] | [Missing: Mega menu, mobile drawer] |
| **Forms & Inputs** | [Text input, Dropdown, Checkbox, Radio, Toggle, Date picker] | [90%] | [Missing: Multi-select, Rich text editor] |
| **Feedback & Alerts** | [Toast, Banner, Modal, Tooltip] | [60%] | [Missing: Inline validation, Progress indicator, Skeleton loader] |
| **Data Display** | [Table, Card, Badge, Avatar] | [50%] | [Missing: Data grid, Stat card, Timeline, Gantt] |
| **Layout** | [Grid, Container, Divider, Spacer] | [70%] | [Missing: Responsive breakpoint utilities] |
| **Buttons & Actions** | [Button, Icon button, FAB, Link] | [100%] | [None] |
**Coverage score:** [X% of production UI patterns are covered by the design system]
**Most impactful gaps:**
1. [Most used pattern not in the system — causing most duplication]
2. [...]
3. [...]
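The coverage score described above can be sketched as a simple set comparison. The pattern names are taken from the example inventory; the function itself is an illustrative assumption, not a prescribed method.

```python
def coverage(production_patterns, system_components):
    """Return the share of production patterns covered, plus the gaps."""
    production = set(production_patterns)
    if not production:
        raise ValueError("at least one production pattern is required")
    covered = production & set(system_components)
    gaps = sorted(production - covered)
    return len(covered) / len(production), gaps


score, gaps = coverage(
    production_patterns=["Toast", "Banner", "Modal", "Tooltip", "Inline validation"],
    system_components=["Toast", "Banner", "Modal", "Tooltip", "Button"],
)
```

Components that exist in the system but not in production (here, `Button`) do not raise the score, since coverage is measured against what the product actually uses.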
---
## 2. Component Quality Audit
For each component, assess against these quality criteria:
| Component | States complete | Responsive | Accessibility | Dark mode | Props documented | Code matches Figma |
|---|---|---|---|---|---|---|
| Button | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Modal | ⚠️ Loading state missing | ✅ | ✅ | ❌ | ⚠️ Partial | ✅ |
| Table | ❌ Sorting state missing | ❌ No mobile layout | ⚠️ No aria-sort | ❌ | ❌ | ⚠️ Drift |
| [Component] | [...] | [...] | [...] | [...] | [...] | [...] |
**Legend:** ✅ Complete — ⚠️ Partial / inconsistent — ❌ Missing
**Components with critical quality issues (fix before anything else):**
- [Component name]: [Specific issue and why it's blocking]
- [...]
---
## 3. Design Token Audit
**Token coverage:**
| Token type | Defined | Used consistently | Issues |
|---|---|---|---|
| **Colour** | [X tokens defined] | [⚠️ — 12 hardcoded hex values found in Figma] | [Inconsistent use of primary-500 vs primary-600 for CTAs across products] |
| **Typography** | [X tokens defined] | [✅] | [None — all type styles use token scale] |
| **Spacing** | [X tokens defined] | [⚠️ — custom spacing used in X components] | [Engineers using arbitrary px values instead of spacing tokens in X components] |
| **Border radius** | [X tokens defined] | [❌ — not defined; each component has hardcoded values] | [Button, card, modal all use different radius values with no token] |
| **Shadow / elevation** | [X tokens defined] | [⚠️] | [3 different drop-shadow values in use; no elevation scale] |
| **Animation / motion** | [X tokens defined] | [❌ — not defined] | [Transition durations inconsistent across components] |
**Semantic token layer:** [Does the system have semantic tokens (e.g. `color.action.primary` on top of `color.blue.500`) or only primitive tokens?]
**Token drift:** [Are code tokens and Figma tokens in sync? Use a tool like Token Studio, Style Dictionary, or manual comparison.]
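To make the semantic layer and the drift check concrete, here is a small sketch. It assumes tokens are available as flat name-to-value dictionaries; the hex values, the `resolve` and `drifted` helpers, and the sample exports are all hypothetical and do not reflect any particular tool's export format.

```python
# Primitive tokens hold raw values; semantic tokens alias other tokens.
# All values here are illustrative placeholders.
PRIMITIVES = {"color.blue.500": "#3B82F6", "color.blue.600": "#2563EB"}
SEMANTIC = {"color.action.primary": "color.blue.500"}


def resolve(token: str) -> str:
    """Follow semantic aliases until a primitive value is reached."""
    seen = set()
    while token in SEMANTIC:
        if token in seen:
            raise ValueError(f"Alias cycle at {token}")
        seen.add(token)
        token = SEMANTIC[token]
    return PRIMITIVES[token]


def drifted(figma: dict[str, str], code: dict[str, str]) -> dict[str, tuple]:
    """Return tokens whose values differ, or that exist on only one side."""
    names = sorted(figma.keys() | code.keys())
    return {n: (figma.get(n), code.get(n)) for n in names
            if figma.get(n) != code.get(n)}


print(resolve("color.action.primary"))
print(drifted({"radius.sm": "4px", "radius.md": "8px"},
              {"radius.sm": "4px", "radius.md": "6px", "radius.lg": "12px"}))
```

A token that appears on only one side shows up with `None` for the missing value, which separates "out of sync" from "never defined".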
---
## 4. Documentation Quality Audit
**Assessment per component / pattern:**
| Document type | Quality | Issues |
|---|---|---|
| **Usage guidelines** | [⚠️ — X% of components have guidelines] | [Button and Form components documented; Navigation and Data Display mostly undocumented] |
| **Do / Don't examples** | [❌ — mostly absent] | [Engineers frequently misuse components because intent is unclear] |
| **Accessibility notes** | [⚠️ — present for some components] | [No consistent format; accessibility notes missing for interactive components] |
| **Code examples** | [✅ — all Storybook components have code examples] | [...] |
| **Changelog** | [❌ — no component-level changelog exists] | [Breaking changes are not communicated; causes unexpected UI regressions] |
| **Migration guides** | [❌ — absent] | [Teams don't know how to upgrade to new component versions] |
**Documentation score:** [X% of components have complete, usable documentation]
**Most common designer / engineer complaint about docs:** [e.g. "I can't find whether to use Modal or Drawer for this use case — no guidance exists"]
---
## 5. Accessibility Audit
**WCAG 2.2 compliance status:**
| Criterion | Level | Status | Components affected |
|---|---|---|---|
| Colour contrast (text) | AA | [✅ / ⚠️ / ❌] | [e.g. ❌ — Disabled state text fails 4.5:1 ratio in 3 components] |
| Colour contrast (UI components) | AA | [✅ / ⚠️ / ❌] | [...] |
| Keyboard navigation | AA | [✅ / ⚠️ / ❌] | [⚠️ — Modal focus trap not implemented; Dropdown not keyboard accessible] |
| Focus visible | AA | [✅ / ⚠️ / ❌] | [...] |
| Screen reader support (ARIA) | AA | [✅ / ⚠️ / ❌] | [❌ — Table component lacks aria-sort; Icon buttons have no aria-label] |
| Touch target size | AA | [✅ / ⚠️ / ❌] | [⚠️ — Mobile tap targets below the 24×24 CSS px AA minimum in X components; 44×44px is the AAA target] |
| Motion / animation | AA | [✅ / ⚠️ / ❌] | [...] |
**Critical accessibility blockers (must fix before next release):**
1. [Most critical issue — e.g. Keyboard users cannot close Modal — focus trap missing]
2. [...]
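The text-contrast row above can be checked programmatically. The sketch below follows the WCAG 2.x relative luminance and contrast ratio definitions; the function names and the sample colours are illustrative assumptions, and the check ignores opacity and anti-aliasing.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_text(foreground: str, background: str, large: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large else 4.5)


# Hypothetical disabled-state grey on a white surface.
print(round(contrast_ratio("#777777", "#FFFFFF"), 2),
      passes_aa_text("#777777", "#FFFFFF"))
```

Running this over every text and background token pair in the system gives the "components affected" column concrete evidence rather than a visual judgement.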
---
## 6. Adoption Audit
**Adoption by team / product:**
| Product / Team | Components used from system | Custom components built outside system | Adoption score |
|---|---|---|---|
| [Product A] | [X% of UI uses system components] | [Y custom components] | [High / Medium / Low] |
| [Product B] | [...] | [...] | [...] |
**Why teams are not adopting:**
| Barrier | Severity | Evidence |
|---|---|---|
| [Component doesn't exist] | High | [Top reason in team survey] |
| [Component exists but doesn't meet use case] | Medium | [Modal component lacks X state needed by Product B] |
| [Documentation too sparse to know how to use it] | Medium | [...] |
| [No one enforces system use — easier to build custom] | High | [...] |
| [System is out of date with product's current visual language] | Medium | [...] |
---
## 7. Contribution Process Audit
| Dimension | Current state | Assessment |
|---|---|---|
| **How to contribute** | [Documented / Not documented] | [✅ / ❌] |
| **Contribution criteria** | [Clear entry bar for what goes in the system] | [⚠️ — unclear who decides what becomes a system component vs stays local] |
| **Review process** | [Who reviews contributions and how long it takes] | [❌ — no formal review; contributions sit unreviewed for weeks] |
| **Release cadence** | [How often system releases happen] | [⚠️ — sporadic; no set cadence] |
| **Breaking change policy** | [How breaking changes are handled and communicated] | [❌ — no policy; breaking changes are a surprise] |
| **Versioning** | [Semantic versioning in place?] | [✅ — all packages use semver] |
---
## 8. Prioritised Remediation Roadmap
| Priority | Initiative | Impact | Effort | Timeline |
|---|---|---|---|---|
| P1 | Fix [X] critical accessibility issues (keyboard nav, ARIA) | Critical — legal + user impact | Medium | Sprint 1-2 |
| P1 | Define and implement border radius and shadow token scale | High — ends inconsistency | Low | Sprint 1 |
| P1 | Document top 10 most-used components (usage + do/don't) | High — unblocks adoption | Medium | Sprint 2-4 |
| P2 | Build Skeleton loader + Inline validation components (top 2 gaps) | High — eliminates custom duplication | High | Quarter 2 |
| P2 | Establish contribution process with SLA for reviews | Medium — enables growth | Low | Sprint 3 |
| P3 | Dark mode token support | Medium — product parity | High | Quarter 3 |
| P3 | Design-code token sync tooling (Token Studio / Style Dictionary) | Medium — reduces drift | Medium | Quarter 2-3 |
---
## Quality Checks
- [ ] Coverage gaps are identified by comparing the design system to actual production UI, not assumed
- [ ] Accessibility issues cite specific WCAG criterion and affected components
- [ ] Adoption barriers are backed by evidence (interviews, survey, usage data) — not assumed
- [ ] Remediation roadmap has effort estimates and is sequenced by impact
- [ ] Both Figma and code (Storybook/implementation) are assessed — not just Figma
- [ ] Stakeholders from design, engineering, and product have reviewed the audit
## Anti-Patterns
- [ ] Do not assess only the Figma library without checking the code implementation — Figma-code drift is one of the most common and costly design system failures
- [ ] Do not score adoption without interviewing teams — audit tool metrics miss the human reasons teams build custom components instead of using the system
- [ ] Do not treat all component gaps equally — prioritise gaps based on how many production screens rely on custom implementations, not alphabetically
- [ ] Do not recommend adding more components without first auditing documentation quality — an undocumented component is often worse than no component
- [ ] Do not schedule remediation without a named owner per initiative — design system improvements without ownership consistently stall
## Example Trigger Phrases
- "Audit our design system for consistency and coverage"
- "Review our component library and identify gaps"
- "Assess the health of our shared design system"
- "Run a design system audit before we do a rebrand"
- "What's wrong with our design system and what should we fix first?"
@@ -152,6 +152,14 @@ For each insight: "This means we should [design/product implication]" or "This c
- [ ] Synthesis framework is included
- [ ] Incentive recommendation is included
## Anti-Patterns
- [ ] Do not write a research plan without clearly stated research objectives — every methodology choice must flow from the objectives
- [ ] Do not design a plan that mixes generative and evaluative research without clearly separating them
- [ ] Do not omit screener criteria — recruiting unqualified participants invalidates the research
- [ ] Do not write discussion guide questions that are leading — questions must be neutral and open-ended
- [ ] Do not skip the incentive recommendation — uncompensated research has lower participant quality and completion rates
## Example Trigger Phrases
- "Write a research plan for [feature or product area]"
@@ -1,36 +1,66 @@
---
name: assumption-mapper
description: Extract and risk-rate all hidden assumptions in a product brief or PRD
tool_integration: Miro
description: "Extract and risk-rate hidden assumptions in a product brief or PRD. Use when asked to review a product brief for assumptions, audit a PRD for risks, find hidden assumptions, validate product plans, or run an assumption analysis. Produces a prioritised assumption map with confidence and impact scores, recommended validation methods, and critical assumption flags."
---
# Assumption Mapping Skill
## Purpose
# Assumption Mapper Skill
Surface and prioritize the untested assumptions embedded in any product plan before development begins.
## Required Inputs
Ask the user for these if not provided:
- **Product brief, PRD, or concept description** (even rough notes work)
- **Stage** (concept / discovery / pre-build / post-launch — affects which assumptions matter most)
## Process
1. Read the provided brief, PRD, or concept description
2. Extract all assumptions across four categories:
2. Extract assumptions across four categories:
- **Desirability** (do users want this?)
- **Feasibility** (can we build it?)
- **Viability** (will it sustain the business?)
- **Usability** (can users actually use it?)
3. For each assumption, score:
3. Score each assumption:
- Confidence (1-5): How sure are we this is true?
- Impact (1-5): How badly does the plan fail if this assumption is wrong?
- Priority = Impact minus Confidence (higher score = test this first)
4. Output a ranked list with recommended validation methods
- Priority = Impact − Confidence (higher = test first)
4. **Validate completeness** — Ensure at least one assumption per category. If a category is empty, re-read the brief looking specifically for that type.
5. Output a ranked list with recommended validation methods
## Output Format
## Output Structure
### Assumption Map: [Feature/Product Name]
| Assumption | Category | Confidence | Impact | Priority Score | Recommended Validation |
|------------|----------|------------|--------|----------------|----------------------|
| Assumption | Category | Confidence | Impact | Priority | Validation Method |
|------------|----------|------------|--------|----------|-------------------|
| [assumption] | [type] | [1-5] | [1-5] | [score] | [method] |
#### Top 3 Assumptions to Validate First
[Detailed recommendations for highest-priority items]
#### Critical Assumptions (Impact 4+ and Confidence 2 or below)
[Flagged items with detailed validation recommendations]
## Notes
- Flag any assumption that scores 4+ on Impact and 2 or below on Confidence as CRITICAL
- Suggest specific research methods: usability test, survey, prototype test, data analysis
#### Top 3 Assumptions to Validate First
[Detailed recommendations including specific research method, estimated effort, and what the result would change]
## Example (Partial)
Input: *"We're building a self-serve onboarding flow to reduce time-to-value for SMB customers."*
| Assumption | Category | Confidence | Impact | Priority | Validation Method |
|------------|----------|------------|--------|----------|-------------------|
| SMB users can complete onboarding without human help | Usability | 2 | 5 | 3 | Unmoderated usability test (n=8) |
| Faster onboarding correlates with higher retention | Viability | 3 | 4 | 1 | Cohort analysis of current onboarding times vs. 90-day retention |
| The current onboarding is the primary reason for slow time-to-value | Desirability | 2 | 4 | 2 | User interviews with recent churned SMB accounts |
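The scoring used in the example rows above can be sketched in a few lines. This is an illustration only: the `Assumption` class and helper names are assumptions, while the rules (Priority = Impact minus Confidence, CRITICAL at Impact 4+ and Confidence 2 or below, at least one assumption per category) come from the skill text.

```python
from dataclasses import dataclass

CATEGORIES = ("Desirability", "Feasibility", "Viability", "Usability")


@dataclass
class Assumption:
    text: str
    category: str
    confidence: int  # 1-5: how sure are we this is true?
    impact: int      # 1-5: how badly does the plan fail if it is wrong?

    @property
    def priority(self) -> int:
        # Higher score means test this first.
        return self.impact - self.confidence

    @property
    def critical(self) -> bool:
        return self.impact >= 4 and self.confidence <= 2


def rank(assumptions: list[Assumption]) -> list[Assumption]:
    """Sort assumptions from highest to lowest priority."""
    return sorted(assumptions, key=lambda a: a.priority, reverse=True)


def missing_categories(assumptions: list[Assumption]) -> list[str]:
    """Categories with no assumption yet; re-read the brief for these."""
    present = {a.category for a in assumptions}
    return [c for c in CATEGORIES if c not in present]


rows = [
    Assumption("SMB users can complete onboarding without human help", "Usability", 2, 5),
    Assumption("Faster onboarding correlates with higher retention", "Viability", 3, 4),
    Assumption("Current onboarding is the primary reason for slow time-to-value", "Desirability", 2, 4),
]

for a in rank(rows):
    flag = " CRITICAL" if a.critical else ""
    print(f"{a.priority} | {a.category} | {a.text}{flag}")
print("Missing categories:", missing_categories(rows))
```

On the partial example, the completeness check reports that Feasibility has no assumption yet, which is the situation step 4 of the process is meant to catch.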
## Anti-Patterns
- [ ] Do not only surface desirability assumptions — feasibility and viability assumptions are equally likely to kill a product and are often overlooked
- [ ] Do not assign high confidence to an assumption just because it hasn't been challenged yet — absence of evidence is not evidence
- [ ] Do not recommend "user interviews" as the validation method for every assumption — some assumptions require quantitative data, competitive analysis, or technical spikes
- [ ] Do not list assumptions that cannot be tested — every assumption in the map must have a plausible validation method, or it should be flagged as unknowable and treated as a risk
## Quality Checks
- [ ] At least one assumption per category (Desirability, Feasibility, Viability, Usability)
- [ ] All Impact 4+ / Confidence ≤2 assumptions flagged as CRITICAL
- [ ] Each validation method is specific (not just "do research" — name the method and sample size)
- [ ] Priority scores are consistent (Impact − Confidence, higher = more urgent)
@@ -0,0 +1,223 @@
---
name: customer-journey-map
description: "Build a customer journey map for a product, service, or experience. Use when asked to map a customer journey, create a user journey, document touchpoints and pain points, or design an experience map. Produces a complete journey map with stages, touchpoints, emotions, pain points, and prioritised opportunities."
---
# Customer Journey Map Skill
This skill produces a complete customer journey map covering every stage from awareness through advocacy. Each stage includes touchpoints, customer actions, emotions, pain points, and specific improvement opportunities. Output is ready for use in product discovery, UX design, or cross-functional alignment workshops.
## Required Inputs
Ask the user for these if not provided:
- **Product or service** being mapped
- **Customer persona** — which customer segment is this map for? (be specific — one persona per map)
- **Journey scope** — full end-to-end (awareness → advocacy), or a specific phase (e.g. onboarding only)?
- **Current state or future state?** — mapping how it works today, or designing how it should work?
- **Data sources** — any research, user interviews, support tickets, NPS comments, analytics available?
- **Goal of the map** — what decision will this inform? (redesign, prioritisation, stakeholder alignment, new feature)
## Output Structure
---
# Customer Journey Map: [Product / Service]
**Persona:** [Name — e.g. "Sarah, the overwhelmed HR manager"]
**Journey scope:** [Full end-to-end / Onboarding / Purchase / Renewal]
**Current or future state:** [Current state / Desired future state]
**Prepared by:** [Name / Team]
**Date:** [Date]
**Based on:** [Research sources — interviews, analytics, support data, assumed/hypothetical]
---
## Persona Summary
| | |
|---|---|
| **Name** | [Sarah] |
| **Role** | [HR Manager at a 200-person professional services firm] |
| **Goal** | [Reduce time spent on manual employee data management] |
| **Frustrations** | [Too many tools that don't talk to each other; always chasing approvals] |
| **Tech comfort** | [Moderate — comfortable with SaaS tools but not a power user] |
| **Decision power** | [Recommends tools; budget approved by CHRO] |
---
## Journey Overview
```
AWARENESS → CONSIDERATION → DECISION → ONBOARDING → ADOPTION → ADVOCACY
[Stage 1]   [Stage 2]       [Stage 3]  [Stage 4]    [Stage 5]  [Stage 6]
```
**Overall experience rating (current state):** [😤 Frustrating / 😐 Neutral / 😊 Positive]
---
## Stage 1: Awareness
*How does the customer first discover the product exists?*
**Customer goal at this stage:** [e.g. Realise they have a problem worth solving — or find a solution to a specific pain]
| Element | Detail |
|---|---|
| **Trigger** | [What event makes them start looking? — e.g. Manual process breaks down / peer recommendation / saw ad] |
| **Where they are** | [Google search / LinkedIn / conference / colleague conversation / email newsletter] |
| **What they do** | [e.g. Searches "automate employee onboarding" / asks peers in HR community / clicks LinkedIn ad] |
| **Emotion** | [😤 Frustrated — overwhelmed by manual processes and hoping for a better way] |
| **Pain points** | [Overwhelming number of options / hard to know which tools are credible / can't tell what's B2B vs B2C from homepage] |
| **Opportunities** | [SEO content targeting the trigger keyword / LinkedIn thought leadership / peer community presence] |
---
## Stage 2: Consideration
*The customer is actively evaluating options. What do they do to decide?*
| Element | Detail |
|---|---|
| **Customer goal** | [Narrow down from many options to a shortlist of 2-3] |
| **What they do** | [Reads G2/Capterra reviews / watches demo video / downloads comparison guide / asks peers who use something similar] |
| **Touchpoints** | [Website / review sites / social proof / demo request flow / sales email] |
| **Emotion** | [😕 Anxious — worried about making the wrong choice; past tool purchases haven't delivered] |
| **Pain points** | [Pricing not visible on website / demo requires a call before seeing the product / unclear if it works with their existing stack] |
| **Opportunities** | [Self-serve demo or interactive product tour / transparent pricing page / ROI calculator / case studies from similar company size] |
---
## Stage 3: Decision
*The customer is ready to buy — or not. What makes them commit?*
| Element | Detail |
|---|---|
| **Customer goal** | [Get sign-off from CHRO and justify the decision with a business case] |
| **What they do** | [Books sales call / requests security questionnaire / builds internal business case / negotiates contract] |
| **Touchpoints** | [AE / sales call / security review / contract / procurement process] |
| **Emotion** | [😬 Cautious — doesn't want to be wrong; presenting to leadership adds pressure] |
| **Pain points** | [Sales process is slow / security questionnaire takes weeks / contract terms are non-standard and require legal] |
| **Opportunities** | [Security FAQ self-serve / standard contract with predictable terms / champion toolkit (slides, business case template) to help them sell internally] |
---
## Stage 4: Onboarding
*The customer has bought. Now they need to get value fast.*
| Element | Detail |
|---|---|
| **Customer goal** | [Get the product working and show their CHRO it was a good decision] |
| **What they do** | [Receives welcome email / attends kickoff call / configures integrations / invites team] |
| **Touchpoints** | [Onboarding email sequence / in-product onboarding checklist / CSM / help centre / integrations marketplace] |
| **Emotion** | [😬 Anxious but hopeful — excited about potential but stressed about the setup work] |
| **Pain points** | [Setup is more complex than expected / IT required for SSO but IT is slow to respond / generic onboarding doesn't match their use case] |
| **Opportunities** | [Role-specific onboarding paths / IT connector with pre-filled request template / quick win email at day 3 (show them one thing that already works)] |
**Key moment of truth:** [What single moment in this stage determines whether they'll become an active user or ghost? — e.g. "First time the product saves them 30 minutes on a task they used to do manually"]
---
## Stage 5: Adoption
*The customer is using the product. Are they getting consistent value?*
| Element | Detail |
|---|---|
| **Customer goal** | [Make the product a regular part of their workflow; demonstrate ROI to leadership] |
| **What they do** | [Uses core features daily / discovers new features / hits a limitation / contacts support / attends webinar] |
| **Touchpoints** | [Product UI / in-app notifications / email / support / community / customer success manager] |
| **Emotion** | [Variable — some days 😊 when the product works well; some days 😤 when hitting a gap or bug] |
| **Pain points** | [Feature they expected isn't there / reporting doesn't show the metric leadership wants / power features are too complex / feels like they're underutilising what they're paying for] |
| **Opportunities** | [Proactive CSM check-in at day 30 / in-product feature discovery / usage dashboard for the customer to see their own ROI / community for peer learning] |
**Adoption health indicators:**
- [DAU/MAU ratio — what does healthy look like?]
- [Feature X used by Y% of seats within Z weeks]
- [First NPS survey at 60 days — target score]
---
## Stage 6: Advocacy
*The customer loves the product. How do you turn them into a referral engine?*
| Element | Detail |
|---|---|
| **Customer goal** | [Solve problems faster; feel like an expert; feel valued as a customer] |
| **What they do** | [Refers a peer / writes a G2 review / participates in case study / speaks at event / becomes a power user / joins community] |
| **Touchpoints** | [CSM / community / review request email / referral programme / case study outreach / conference sponsorship] |
| **Emotion** | [😊 Proud — the tool is part of their professional identity; they feel smart for choosing it] |
| **Pain points** | [Referral programme is clunky / no structured way to connect with peers / case study process is slow and effortful for them] |
| **Opportunities** | [One-click G2 review request at high-satisfaction moment / peer community / referral programme with meaningful reward / case study process that does most of the work for them] |
---
## Emotion Curve
Plot the customer's emotional experience across the journey:
```
High 😊 │ * * *
│ *
Neutral 😐│ * *
│ *
Low 😤 │ * *
└────────────────────────────────────────────────────
Aware Consider Decide Onboard Adopt Advocate
```
**Lowest point:** [Which stage has the worst experience — and why?]
**Highest point:** [When is the customer most delighted — what drove it?]
**Biggest drop:** [Where does sentiment fall most sharply — this is usually the biggest opportunity]
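If each stage is given a numeric sentiment score, the three callouts above can be derived mechanically. The sketch below assumes a hypothetical scale from -2 (frustrated) to +2 (delighted) and made-up scores; neither the scale nor the `summarise_curve` helper is part of the skill.

```python
def summarise_curve(scores: dict[str, int]) -> dict:
    """Find the lowest point, highest point, and sharpest stage-to-stage drop.

    `scores` maps stage name to sentiment, in journey order.
    """
    stages = list(scores)
    lowest = min(stages, key=scores.get)
    highest = max(stages, key=scores.get)
    # Positive size means sentiment fell between consecutive stages.
    drops = [(scores[a] - scores[b], a, b) for a, b in zip(stages, stages[1:])]
    biggest = max(drops, key=lambda d: d[0], default=None)
    if biggest is None or biggest[0] <= 0:
        biggest_drop = None  # Sentiment never falls along the journey.
    else:
        biggest_drop = {"from": biggest[1], "to": biggest[2], "size": biggest[0]}
    return {"lowest": lowest, "highest": highest, "biggest_drop": biggest_drop}


# Hypothetical scores for the six stages.
curve = {
    "Awareness": -1, "Consideration": 0, "Decision": 1,
    "Onboarding": -2, "Adoption": 0, "Advocacy": 2,
}
print(summarise_curve(curve))
```

The scores still have to come from research evidence; the calculation only keeps the callouts consistent with the plotted curve.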
---
## Prioritised Opportunities
| Opportunity | Stage | Impact on customer | Effort to fix | Priority |
|---|---|---|---|---|
| [Self-serve product tour before sales call] | Consideration | [High — removes top buying barrier] | [Medium] | P1 |
| [Quick win email at day 3] | Onboarding | [High — builds early habit] | [Low] | P1 |
| [IT SSO setup template] | Onboarding | [Medium — removes specific blocker] | [Low] | P2 |
| [30-day proactive CSM check-in] | Adoption | [Medium — catches churn signals early] | [Medium] | P2 |
| [Peer referral programme] | Advocacy | [High for growth — reduces CAC] | [High] | P3 |
---
## What We Don't Know (Research Gaps)
| Gap | How to close it | Priority |
|---|---|---|
| [What actually triggers the decision to start looking?] | [5 JTBD interviews with recent buyers] | [High] |
| [What causes customers to stall in onboarding?] | [Drop-off analysis in onboarding funnel + 3 interviews with churned customers] | [High] |
| [What % of customers have reached the advocacy stage?] | [Product analytics — identify power users; NPS by cohort] | [Medium] |
---
## Quality Checks
- [ ] Map covers one specific persona — not "all customers"
- [ ] Each stage includes the customer's emotional state — not just actions
- [ ] Pain points are the customer's pain — not the company's pain
- [ ] Opportunities are specific enough to become backlog items or design prompts
- [ ] Emotion curve shows the real experience — not an aspirationally positive version
- [ ] Research gaps are documented — the map reflects what is known, not assumed
## Anti-Patterns
- [ ] Do not build the map from assumptions alone — ground at least the pain points in real customer data or research
- [ ] Do not treat all journey stages as equally weighted — identify the highest-friction moments explicitly
- [ ] Do not omit the emotional layer — a journey map without emotions is a process flow, not a customer map
- [ ] Do not create generic touchpoints that apply to any product — each touchpoint must be specific to this product and customer
- [ ] Do not leave opportunities unranked — prioritise by impact and feasibility
## Example Trigger Phrases
- "Map the customer journey for [product]"
- "Build a user journey from awareness to advocacy"
- "Create a journey map for our onboarding experience"
- "Map out the touchpoints and pain points for [customer type]"
- "Design an experience map for [process or product]"
@@ -1,6 +1,6 @@
---
name: discovery-interview-guide
description: Creates structured user discovery interview guides with screener questions, discussion guides, and synthesis frameworks. Use when planning user interviews, customer discovery sessions, Jobs-to-be-Done research, or problem validation. Triggers on "user interview", "discovery interview", "customer research", "JTBD", "problem validation".
description: "Create a structured user discovery interview guide with screener questions, a discussion guide, and a synthesis framework. Use when planning user interviews, customer discovery sessions, Jobs-to-be-Done research, or problem validation. Produces a complete guide covering warm-up, problem exploration, and a per-session synthesis template."
---
# Discovery Interview Guide Skill
@@ -81,6 +81,23 @@ Understand the competitive landscape from their perspective.
---
## Required Inputs
Ask the user for these if not provided:
- **Research topic or question** (what decision will this inform?)
- **Target participant profile** (role, behaviour, company type)
- **Session length** (30 / 45 / 60 / 90 minutes)
- **Number of interviews planned**
- **Known hypotheses to test or avoid confirming prematurely** (optional)
## Quality Checks
- [ ] No future-tense questions ("would you...") — only past-behaviour questions
- [ ] Product or solution not mentioned until after pain is confirmed
- [ ] Questions open-ended (cannot be answered yes/no)
- [ ] Synthesis template included for per-session notes
- [ ] Screener questions identify and disqualify wrong participants
## Guidelines
- Recommend 5-8 interviews to reach thematic saturation for most discovery questions
@@ -88,3 +105,11 @@ Understand the competitive landscape from their perspective.
- If user is new to interviewing: remind them to stay silent after asking a question (aim for 80/20 participant-to-interviewer talking ratio)
- Never synthesise during the interview — do it after, when you can look across sessions
- Flag confirmation bias: if user writes questions that lead toward a predetermined answer, rewrite them as open-ended alternatives
## Anti-Patterns
- [ ] Do not use future-tense questions ("Would you use this?") — hypothetical responses do not predict real behaviour and produce false confidence in an idea
- [ ] Do not mention your product or solution before problem exploration is complete — doing so anchors the participant's responses and invalidates the discovery
- [ ] Do not synthesise across fewer than 5 interviews — themes from 2-3 interviews reflect anecdote, not pattern; wait for saturation
- [ ] Do not write screener questions that are too easy to pass — if participants can guess the "right" answer, you will recruit the wrong people
- [ ] Do not treat participant opinions as evidence of future behaviour — what people say they will do consistently diverges from what they actually do
@@ -1,6 +1,6 @@
---
name: job-story-mapper
description: Writes Jobs-to-be-Done (JTBD) job stories and maps customer jobs across functional, social, and emotional dimensions. Use when defining user needs, writing job stories, conducting JTBD research, or reframing features around customer outcomes. Triggers on "job story", "JTBD", "jobs to be done", "when I want to", "user need", "hire a product".
description: "Write Jobs-to-be-Done (JTBD) job stories and map customer jobs across functional, social, and emotional dimensions. Use when defining user needs, writing job stories, conducting JTBD research, or reframing features around customer outcomes. Produces a job story map with opportunity scoring, pain intensity ratings, and product opportunity analysis."
---
# Job Story Mapper Skill
@@ -105,6 +105,31 @@ Rate each job story on:
---
## Quality Checks
- [ ] Job stories use the "When / I want to / So I can" format (not user story format)
- [ ] Situation is specific (not "as a user" — a real moment or trigger)
- [ ] All three dimensions covered: functional, emotional, social
- [ ] Opportunity score calculated for each job story
- [ ] Current workaround identified for each high-opportunity story
- [ ] Product opportunity is distinct from "build the feature" (it's an outcome)
## Required Inputs
Ask the user for these if not provided:
- **Product or feature area** to map (e.g. onboarding, checkout, dashboard)
- **User type or persona** (who are we mapping jobs for?)
- **Source material** (user interview notes, support tickets, discovery findings, or describe from memory)
- **Scope** (full product job map vs. a single feature area)
## Anti-Patterns
- [ ] Do not write job stories that describe a feature rather than a situation-motivation pair
- [ ] Do not skip the social and emotional dimensions — mapping only functional jobs misses the most defensible differentiation opportunities
- [ ] Do not define situations too broadly ("as a user who wants to manage their work") — the situation must be a specific moment or trigger
- [ ] Do not conflate opportunity scoring with priority — a high opportunity score still requires feasibility and strategic fit assessment
- [ ] Do not produce a job map without identifying current workarounds — the workaround reveals what the job is worth to the customer
## Guidelines
- Never write a job story for a feature — write it for the situation that makes the feature valuable
@@ -1,21 +1,29 @@
---
name: user-interview-synthesis
description: Synthesise user interview transcripts into structured research findings
tool_integration: Notion
description: "Synthesises user interview transcripts into structured research findings. Use when asked to analyse interview notes, synthesise qualitative research, identify themes from interviews, or turn raw interview data into actionable product insights. Produces a themed synthesis with supporting quotes per theme, 'so what' implications, and recommended next steps."
---
# User Interview Synthesis Skill
## Purpose
Transform raw interview transcripts into a structured synthesis document that surfaces themes, pain points, and actionable insights.
## Required Inputs
Ask the user for these if not provided:
- **Interview transcripts or notes** (even rough notes work)
- **Number of participants and their profiles** (role, company size, context)
- **Research questions** (what was the study trying to answer?)
- **Date range** of research (for context)
## Process
1. Read all provided transcripts fully before drawing conclusions
2. Identify recurring themes (minimum 3 mentions to qualify as a theme)
3. Categorize findings into: Pain Points, Workflow Insights, Feature Requests, Delight Moments
4. Select 2-3 verbatim quotes per theme that best represent the pattern
5. Draft "So What" implications for each theme — what does this mean for the product?
6. **Validate** — Confirm every theme has quotes from at least 3 participants. Flag any insight resting on fewer as low-confidence.
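The validation step above can be sketched as a count of distinct participants per theme. The data shape (pairs of theme and participant ID), the `split_themes` name, and the sample codes are assumptions for illustration. The sketch counts distinct participants rather than raw mentions, so one person repeating a point three times does not qualify it as a theme.

```python
from collections import defaultdict


def split_themes(coded_quotes: list[tuple[str, str]],
                 min_participants: int = 3) -> tuple[list[str], list[str]]:
    """Split coded quotes into (themes, low-confidence signals).

    Each item in `coded_quotes` is (theme, participant_id).
    """
    participants = defaultdict(set)
    for theme, participant_id in coded_quotes:
        participants[theme].add(participant_id)
    themes = sorted(t for t, p in participants.items()
                    if len(p) >= min_participants)
    low_confidence = sorted(t for t, p in participants.items()
                            if len(p) < min_participants)
    return themes, low_confidence


# Hypothetical coding of quotes from four participants.
quotes = [
    ("Manual data entry", "P1"), ("Manual data entry", "P2"),
    ("Manual data entry", "P4"), ("Slow approvals", "P1"),
    ("Slow approvals", "P1"), ("Slow approvals", "P3"),
    ("Wants mobile app", "P2"),
]
themes, signals = split_themes(quotes)
print("Themes:", themes)
print("Low-confidence signals:", signals)
```

Anything in the second list belongs in the Low-Confidence Signals section of the output, not in the main themes.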
## Output Format
## Output Structure
### Research Synthesis: [Study Name]
**Participants:** [n]
@@ -24,15 +32,29 @@ Transform raw interview transcripts into a structured synthesis document that su
#### Theme 1: [Theme Name]
- Summary (2-3 sentences)
- Supporting quotes
- Supporting quotes (from at least 3 participants)
- Implication for product
[Repeat for each theme]
#### Low-Confidence Signals (1-2 participants only)
[Findings worth tracking but not acting on yet — note what further research would confirm or deny]
#### Recommended Next Steps
[Specific, actionable recommendations based on findings]
## Quality Checks
- Every theme must be supported by quotes from at least 3 participants
- Implications must connect to product decisions, not just observations
- Avoid researcher bias — let the data lead
- [ ] Every theme is supported by quotes from at least 3 participants
- [ ] Implications connect to specific product decisions, not just observations
- [ ] Researcher bias check: no leading language, findings don't all support one hypothesis
- [ ] Single-source signals are flagged separately, not mixed into main themes
- [ ] Research questions from the study brief are each addressed (even if the answer is "inconclusive")
## Anti-Patterns
- [ ] Do not mix single-source signals into main themes — insights cited by only one participant must be flagged separately
- [ ] Do not write implications that are observations restated rather than product decisions enabled
- [ ] Do not include themes that only support the project hypothesis — contradictory findings must be surfaced, not omitted
- [ ] Do not present findings without quotes — every theme requires verbatim evidence from at least 3 participants
- [ ] Do not leave research questions unanswered — each question from the study brief must be explicitly addressed, even if the answer is inconclusive
@@ -1,13 +1,13 @@
{
"$schema": "https://anthropic.com/claude-code/plugin.schema.json",
"name": "pm-engineering",
"version": "1.0.0",
"description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record. Structured outputs for engineering teams and technical PMs.",
"version": "4.0.0",
"description": "Engineering & tech skills: Code Review Checklist, Incident Postmortem, API Docs Writer, Architecture Decision Record, Debugging Log Analyser, PR Description Writer, System Design Interview, Changelog Generator, Test Strategy Doc, Runbook Writer, CI/CD Playbook, SLO & Error Budget, Developer Onboarding Doc, On-Call Runbook, Security Threat Model, Performance Budget, Database Schema Design, Database Migration Plan, Technical Debt Register, RFC Writer, Capacity Planning, Load Testing Plan, Disaster Recovery Plan, Feature Flag Guide, Dependency Audit, Service Catalog Entry, Monitoring Setup Guide, Local Dev Setup, API Versioning Strategy, Infra-as-Code Review, Engineering Weekly Report, Tech Radar, Sprint Velocity Analysis, Microservices Decomposition, Engineering Hiring Rubric. 35 structured skills for engineering teams, SREs, and technical PMs.",
"author": {
"name": "Mohit Aggarwal",
"email": "mohit15856@gmail.com"
},
"homepage": "https://github.com/mohitagw15856/pm-claude-skills",
"license": "MIT",
"keywords": ["product-management", "engineering", "code-review", "incident-postmortem", "api-documentation", "adr", "architecture"]
"keywords": ["product-management", "engineering", "code-review", "incident-postmortem", "api-documentation", "adr", "architecture", "debugging", "pull-request", "system-design", "changelog", "test-strategy", "runbook", "devops", "cicd", "slo", "error-budget", "onboarding", "oncall", "sre", "reliability", "security", "threat-model", "performance", "database", "migration", "technical-debt", "rfc", "capacity-planning", "load-testing", "disaster-recovery", "feature-flags", "dependency-audit", "service-catalog", "monitoring", "observability", "tech-radar", "microservices", "hiring", "velocity"]
}
@@ -13,10 +13,12 @@ Ask the user for these if not provided:
- **API or endpoint details** (raw spec, Postman export, or verbal description)
- **Auth method** (API key / Bearer token / OAuth 2.0 / None)
- **Base URL**
- **API version** (e.g. v1, v2.3, or "unversioned" — affects deprecation notes and versioning headers)
- **Rate limits** (requests per second/minute per token or IP, if known — or "unknown")
- **Audience** (internal developers / external partners / public)
- **Output format** (Markdown / OpenAPI YAML / Plain prose)
- **Output format** (Markdown for developer portals and READMEs / Plain prose for Confluence or Notion — note: OpenAPI YAML is not produced by this skill)
## Output Structure
## Output Format
For each endpoint, produce the following:
@@ -133,13 +135,21 @@ data = response.json()
- [ ] Every parameter is documented (type, required/optional, description)
- [ ] Response fields are fully documented with types
- [ ] All relevant error codes are listed with resolution guidance
- [ ] Code examples are copy-paste runnable (no pseudocode)
- [ ] Error codes cover at minimum: 400 (bad request), 401/403 (auth), 404 (not found), 429 (rate limited), 500 (server error) — or explicitly note which don't apply to this endpoint
- [ ] Code examples use the actual base URL and a realistic placeholder token — no examples reference undefined variables or "YOUR_ENDPOINT" outside the snippet
- [ ] Auth method is clearly stated at the top
- [ ] Enum values are listed where applicable
- [ ] Pagination documented if the endpoint is a list endpoint
## Example Trigger Phrases
## Anti-Patterns
- [ ] Do not document only the happy path — every endpoint must have error codes for at least 400, 401/403, 404, 429, and 500
- [ ] Do not use placeholder values like "YOUR_ENDPOINT" or "INSERT_TOKEN" in code examples — use realistic-looking placeholders anchored to the actual base URL
- [ ] Do not skip enum values for fields with a fixed set of accepted values — undocumented enums cause integration bugs
- [ ] Do not omit pagination documentation on list endpoints — developers who miss this will build integrations that silently miss data
- [ ] Do not describe what a field "is" without describing what it "does" — "the ID" is not documentation; "the unique identifier used to retrieve or update this resource" is
## Usage Examples
- "Document this API endpoint: [paste spec or description]"
- "Turn this Postman collection into developer docs"
- "Write API reference docs for [endpoint]"
@@ -0,0 +1,320 @@
---
name: api-versioning-strategy
description: "Write an API versioning strategy document for a service or API platform. Use when asked to define versioning policy, plan API deprecation, classify breaking changes, or document version lifecycle. Produces a complete versioning strategy with breaking-change classification table, deprecation timeline, migration guide template, and client communication template."
---
# API Versioning Strategy
Produce a complete API versioning strategy document that gives a service team durable, consistent rules for evolving their API without breaking consumers. This document covers the versioning scheme selection (with rationale), lifecycle policy from introduction through sunset, a precise breaking-change classification, and all the communication artifacts a team needs when deprecating a version. Engineers should be able to hand this document to a new team member or external consumer and have them understand exactly what to expect.
## Required Inputs
Ask for these if not already provided:
- **API type** — REST, GraphQL, or gRPC (each has different versioning mechanics)
- **Current versioning approach** — URL path (`/v1/`), request header, query parameter, or none; if none, document starts fresh
- **Number of existing versions and active consumer count** — needed to size the lifecycle policy and migration scope
- **Deprecation timeline constraints** — any hard deadlines (contract SLAs, compliance windows, annual release cycles)
- **Consumer type** — internal teams only, external partners, public API, or mix (affects communication channel choices)
If any input is missing, ask before producing the document. For GraphQL, note that the versioning approach differs substantially (schema evolution over versioning) and tailor the scheme section accordingly.
## Output Format
---
# API Versioning Strategy: [Service Name]
**Owner:** [Team Name]
**API Type:** [REST / GraphQL / gRPC]
**Document Version:** 1.0
**Last Reviewed:** [Date]
**Next Review:** [Date + 6 months]
---
## 1. Versioning Scheme
### Selected Approach: [URL Path / Request Header / Query Parameter]
| Scheme | Example | Pros | Cons | Verdict |
|--------|---------|------|------|---------|
| URL Path | `/v2/orders` | Visible in logs and bookmarks; trivial to route | Violates strict REST resource identity; clutters URL space | **Recommended for public-facing REST APIs** |
| `Accept` Header | `Accept: application/vnd.[service].v2+json` | Keeps URLs clean; proper content negotiation | Harder to test in browser; less visible in logs | Recommended for internal APIs with controlled clients |
| Query Parameter | `/orders?version=2` | Easy to retrofit without URL restructuring | Often missed in client code; cache-key complications | Acceptable only for read-heavy APIs already in production |
| GraphQL Schema Evolution | Field deprecation + `@deprecated` directive | No versioning needed for additive changes | Requires disciplined schema design | **Recommended for GraphQL APIs** |
**Rationale for [chosen scheme]:** [One paragraph explaining why this scheme fits the API type, consumer type, and operational context provided. Reference the specific inputs — e.g., "Because this API has external partners who integrate via generated clients, URL path versioning provides the most predictable routing behavior and eliminates header negotiation complexity."]
### Version Format
```
[Base URL]/v{MAJOR}/{resource}
Examples:
https://api.[company].com/v1/orders
https://api.[company].com/v2/orders/{id}/items
Version identifier: integer only (v1, v2, v3)
No minor versions in the URL — minor/patch changes are non-breaking and deployed continuously.
```
---
## 2. Version Lifecycle Policy
### Lifecycle Stages
```
STABLE ──────────────────────────────────────────────────►
├─ STABLE Active development, full SLA, new consumers allowed
├─ DEPRECATED Announced, timeline posted, migration docs live.
│ New consumers blocked. Existing consumers receive warnings.
├─ SUNSET Requests return HTTP 410 Gone + migration pointer.
│ 30-day window before routing is removed.
└─ RETIRED Routing removed, docs archived, no traffic accepted.
```
| Stage | Duration | SLA Applies | New Consumers Allowed | Required Action |
|-------|----------|-------------|----------------------|-----------------|
| Stable | Until superseded | Yes — full | Yes | None |
| Deprecated | [12 months / adjust per constraint] | Yes — degraded acceptable | No | Migrate before sunset date |
| Sunset | 30-day window | Best-effort only | No | Migrate immediately |
| Retired | Permanent | None | No | — |
**Minimum Stable Period:** A version must remain Stable for at least [6 / 12] months before deprecation can be announced.
**Maximum Simultaneous Versions:** No more than [2] versions in Stable or Deprecated status at any time. Releasing v3 requires committing to a sunset date for v1 in the same announcement.
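As a rough illustration of the lifecycle table, the stage of a version on a given day could be derived from its deprecation and sunset dates. The function name and signature are assumptions made for this sketch; the 30-day sunset window is the one stated in the table.

```python
from datetime import date, timedelta

SUNSET_WINDOW = timedelta(days=30)  # 30-day window from the lifecycle table


def lifecycle_stage(today, deprecated_on=None, sunset_on=None):
    """Return the lifecycle stage name for a version on `today`.

    `deprecated_on` and `sunset_on` are None until they have been announced.
    """
    if deprecated_on is None or today < deprecated_on:
        return "STABLE"
    if sunset_on is None or today < sunset_on:
        return "DEPRECATED"
    if today < sunset_on + SUNSET_WINDOW:
        return "SUNSET"
    return "RETIRED"


stage = lifecycle_stage(date(2026, 6, 1), date(2026, 1, 1), date(2027, 1, 1))
```

A gateway or documentation build could call a helper like this to decide whether to show a deprecation banner, return 410, or drop routing entirely.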
---
## 3. Breaking vs. Non-Breaking Change Classification
Apply this table before every API change. If a change is marked Breaking, it requires a new major version. When uncertain, default to Breaking.
| Change Type | Specific Example | Classification | Rationale |
|-------------|-----------------|----------------|-----------|
| Remove a response field | Delete `order.legacy_id` from response | **Breaking** | Clients reading this field will null-pointer or fail |
| Rename a field | `user_name` → `username` | **Breaking** | Clients referencing old name receive null |
| Change field type | `"amount": "10.00"` → `"amount": 10.00` | **Breaking** | Type mismatch at deserialization |
| Make optional field required | `email` required in POST body | **Breaking** | Existing callers omitting it receive 400 |
| Remove an endpoint | `DELETE /v1/widgets/{id}` removed | **Breaking** | Existing callers receive 404 |
| Change HTTP method | `GET /search` → `POST /search` | **Breaking** | Bookmarked or cached GET calls fail |
| Change authentication scheme | API key → OAuth2 | **Breaking** | All clients must re-authenticate |
| Restructure error response shape | Error JSON schema changed | **Breaking** | Error-handling code misparses responses |
| Expand enum values (response) | New `status: "on_hold"` value returned | **Breaking** | Switch statements with no default fall through |
| Change pagination defaults | `page_size` default 20 → 50 | **Breaking** | Response length changes unexpectedly |
| Tighten input validation | Max length 100 → 50 | **Breaking** | Previously valid inputs now rejected |
| Add new optional field to response | Add `order.tax_breakdown` | Non-Breaking | Clients ignore unknown fields per spec |
| Add new optional request parameter | Add `?include_archived=true` | Non-Breaking | Ignored by existing clients |
| Add a new endpoint | `GET /v1/orders/{id}/audit` | Non-Breaking | No existing client references it |
| Relax input validation | Min length 10 → 5 | Non-Breaking | Existing valid inputs remain valid |
| Performance or latency improvement | Response time reduced | Non-Breaking | — |
| Add new enum value (request-only) | Accept new `type: "express"` | Non-Breaking | Existing values still accepted |
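The "when uncertain, default to Breaking" rule can be expressed as an allow-list: only change types explicitly listed as non-breaking avoid a new major version. The snake_case keys below are hypothetical labels for the rows of the table, chosen only for this sketch.

```python
# Hypothetical labels for the Non-Breaking rows of the classification table.
NON_BREAKING = frozenset({
    "add_optional_response_field",
    "add_optional_request_parameter",
    "add_endpoint",
    "relax_input_validation",
    "performance_improvement",
    "add_request_enum_value",
})


def requires_new_major_version(change_type: str) -> bool:
    """Return True unless the change is explicitly listed as non-breaking."""
    return change_type not in NON_BREAKING
```

Using an allow-list rather than a deny-list means a change type nobody has classified yet is treated as breaking, which matches the stated default.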
---
## 4. Deprecation Process
### Step-by-Step Deprecation Checklist
- [ ] **T-0 (Decision day):** Engineering lead approves deprecation. New version confirmed Stable. Sunset date set.
- [ ] **T-0:** Update API docs — add deprecation banner to all v[N] endpoint pages.
- [ ] **T-0:** Add `Deprecation` and `Sunset` response headers to all v[N] responses (see format below).
- [ ] **T-0:** Block new consumer onboarding for v[N] in API gateway and developer portal.
- [ ] **T-0:** Send initial deprecation notice to all registered consumers (see Section 5 template).
- [ ] **T-0:** Open tracking issue in engineering backlog linking all known consumers to their migration status.
- [ ] **T minus 30 days:** Send 30-day warning to all consumers still sending v[N] traffic.
- [ ] **T minus 7 days:** Send final warning. If consumer traffic > 100 req/day, escalate directly to their engineering lead.
- [ ] **Sunset date:** Switch v[N] routing to return `HTTP 410 Gone` with body pointing to migration guide.
- [ ] **T plus 30 days:** Remove routing rules. Archive documentation. Close tracking issue.
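The dated steps of the checklist can be derived from the sunset date alone. The sketch below assumes the "T minus" and "T plus" offsets are measured from the sunset date; the function and key names are hypothetical.

```python
from datetime import date, timedelta


def deprecation_milestones(sunset: date) -> dict:
    """Derive the checklist's dated steps from the sunset date."""
    return {
        "thirty_day_warning": sunset - timedelta(days=30),
        "final_warning": sunset - timedelta(days=7),
        "sunset": sunset,
        "routing_removed": sunset + timedelta(days=30),
    }


milestones = deprecation_milestones(date(2027, 1, 1))
```

Generating the dates once and pasting them into the notice templates avoids the announcement and the gateway configuration drifting apart.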
### Deprecation Response Headers
```http
HTTP/1.1 200 OK
Deprecation: true
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Link: <https://docs.[company].com/api/migration/v1-to-v2>; rel="successor-version"
```
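One way to avoid hand-typed date errors in these headers is to generate them. The sketch below uses Python's standard `email.utils.format_datetime` to produce the HTTP date; the function name and the `docs.example.com` URL are assumptions for illustration, and the `Deprecation: true` value simply mirrors the format shown above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset_at: datetime, migration_url: str) -> dict:
    """Build the three deprecation headers for a response.

    `sunset_at` must be timezone-aware; it is converted to UTC so the
    HTTP date can end in "GMT".
    """
    if sunset_at.tzinfo is None:
        raise ValueError("sunset_at must be timezone-aware")
    return {
        "Deprecation": "true",
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
        "Link": f'<{migration_url}>; rel="successor-version"',
    }


headers = deprecation_headers(
    datetime(2027, 1, 1, tzinfo=timezone.utc),
    "https://docs.example.com/api/migration/v1-to-v2",  # hypothetical URL
)
```

Because the weekday is computed from the date rather than typed, the header cannot carry a weekday that disagrees with the date.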
### Sunset Response Body
```http
HTTP/1.1 410 Gone
Content-Type: application/json
```
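The block above shows only the status line and content type. A JSON body pointing to the migration guide might be built as follows; every field name here (`error`, `message`, `migration_guide`) is a hypothetical choice for this sketch, not a schema defined by this strategy.

```python
import json


def sunset_body(version: str, successor: str, migration_url: str) -> str:
    """Serialise a 410 Gone body that points callers at the migration guide."""
    return json.dumps({
        "error": "api_version_sunset",  # hypothetical error code
        "message": f"API {version} has been sunset. Migrate to {successor}.",
        "migration_guide": migration_url,
    })


body = sunset_body("v1", "v2", "https://docs.example.com/api/migration/v1-to-v2")
```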
---
## 5. Client Communication Templates
### Initial Deprecation Notice
```
Subject: [Action Required] [Service Name] API v[N] Deprecation — Sunset [Date]
Hi [Team / Partner Name],
We are deprecating [Service Name] API v[N], effective [Sunset Date].
What this means for you:
- v[N] continues to work normally until [Sunset Date]
- After [Sunset Date], all v[N] requests return HTTP 410 Gone
- v[N+1] is available today and fully stable
Your current usage: approximately [X] requests/day as of [Date].
Estimated migration effort: [Small: < 1 day | Medium: 1–3 days | Large: 3–10 days]
Migration resources:
Migration guide: [URL]
Changelog: [URL]
Office hours: [Date/Time/Link]
Support: [Slack channel or email]
Key dates:
[Date] Deprecation announced (today)
[Date] New consumer onboarding blocked for v[N]
[Date] 30-day warning sent to remaining consumers
[Sunset Date] v[N] returns 410 Gone
Reply to this message or contact us at [channel] with questions.
[Your Name], [Team Name]
```
### 30-Day Warning
```
Subject: [30 Days Remaining] [Service Name] API v[N] sunsets [Date]
Hi [Team / Partner Name],
[Service Name] API v[N] sunsets in 30 days on [Date].
Your current v[N] traffic: [X] requests/day — migration is not yet complete.
If you have a technical blocker requiring an extension, contact us before
[Date minus 14 days]. Extensions require a documented blocker and a committed
migration completion date.
Migration guide: [URL] | Support: [channel]
```
---
## 6. Migration Guide Template
Publish one migration guide per version transition at `docs.[company].com/api/migration/v[N]-to-v[N+1]`.
```markdown
# Migration Guide: v[N] → v[N+1]
**Estimated effort:** [Small: < 1 day | Medium: 1–3 days | Large: 3–10 days]
**Breaking changes in this guide:** [count]
## Quick Start
Update your base URL:
Before: https://api.[company].com/v[N]/
After: https://api.[company].com/v[N+1]/
## Breaking Changes
### 1. [Field Rename: user_name → username]
**Affected endpoints:** `GET /users/{id}`, `POST /users`
Before (v[N]):
{ "user_name": "alice" }
After (v[N+1]):
{ "username": "alice" }
Migration: Replace all references to `user_name` with `username` in request
builders and response parsers.
### 2. [Next breaking change — repeat structure]
## New Capabilities in v[N+1]
| Feature | Description | Docs |
|---------|-------------|------|
| [Feature name] | [Brief description] | [Link] |
## SDK Upgrade Reference
| Language | Package | v[N+1] Version | Install Command |
|----------|---------|----------------|-----------------|
| Python | `[company]-sdk` | `2.0.0` | `pip install [company]-sdk==2.0.0` |
| Node.js | `@[company]/sdk` | `2.0.0` | `npm install @[company]/sdk@2.0.0` |
| Go | `github.com/[company]/sdk-go` | `v2.0.0` | `go get github.com/[company]/sdk-go/v2` |
| Java | `com.[company]:sdk` | `2.0.0` | Update pom.xml / build.gradle |
## Migration Validation Checklist
- [ ] Base URL updated to v[N+1]
- [ ] All renamed fields updated in request serializers
- [ ] All renamed fields updated in response deserializers
- [ ] Error-handling code updated for new error shape
- [ ] Integration tests passing against v[N+1] in staging
- [ ] Load test completed against v[N+1] — latency within acceptable range
- [ ] Rollback plan documented if issues arise post-cutover
```
---
## 7. Version-Specific Documentation
- Maintain separate documentation pages for each Stable and Deprecated version.
- Deprecated version docs carry a persistent banner: "This version is deprecated. Sunset date: [Date]. [Migrate to v[N+1]]."
- OpenAPI specs, Protobuf definitions, or GraphQL schemas are tagged and archived per version in the repository under `/api/v[N]/`.
- A root-level CHANGELOG.md records every breaking and non-breaking change by version — not buried in commit history.
---
## 8. SDK Versioning Alignment
| API Version | SDK Major Version | SDK GA Date | SDK EOL Date |
|-------------|------------------|-------------|--------------|
| v[1] | 1.x | [Date] | [API Sunset + 90 days] |
| v[2] | 2.x | [Date] | Active |
- SDK major versions align 1:1 with API major versions.
- SDK minor versions track non-breaking API additions.
- SDK EOL dates trail API sunset dates by 90 days to give consumers extra runway.
- SDKs emit a runtime deprecation warning log line when the underlying API version is Deprecated.
---
*Strategy authored by [Team Name] — questions to [Slack channel or email]*
---
## Anti-Patterns
- [ ] Do not classify expanding an enum (new response values) as non-breaking — clients with exhaustive switch statements will break when they receive an unexpected enum value
- [ ] Do not set a sunset date without confirming it is achievable for the largest consumer — a sunset that forces consumers to miss a legal deadline will be ignored or escalated
- [ ] Do not maintain more than two simultaneous stable/deprecated versions — each additional supported version multiplies maintenance burden and consumer confusion
- [ ] Do not use "monitor traffic" as the sole mechanism for knowing when all consumers have migrated — track named consumers against migration completion explicitly
- [ ] Do not skip the migration guide — consumers will delay migration indefinitely without a step-by-step guide that estimates effort
## Quality Checks
- [ ] Versioning scheme recommendation includes explicit rationale tied to the API type and consumer type provided — not a generic recommendation
- [ ] Breaking-change table covers at minimum: field removal, field rename, type change, making optional field required, endpoint removal, enum expansion, and default value change
- [ ] Deprecation timeline durations are filled in with concrete values, not left as abstract placeholders
- [ ] All three communication artifacts are present: initial deprecation notice, 30-day warning, and migration guide template
- [ ] Sunset response headers (`Deprecation`, `Sunset`, `Link`) use correct RFC date format and real URL structure
- [ ] SDK versioning alignment table is present and ties SDK major versions explicitly to API major versions
- [ ] Maximum simultaneous supported versions is stated with a concrete number
@@ -10,6 +10,7 @@ This skill produces a complete Architecture Decision Record (ADR) following the
## Required Inputs
Ask the user for these if not provided:
- **ADR number** (sequential number in your ADR registry — e.g. 012; or "next available" if unknown)
- **Decision title** (brief, e.g. "Use PostgreSQL as primary datastore")
- **Context** (what situation led to this decision needing to be made?)
- **Options considered** (at least 2; if only 1 is given, prompt for alternatives that were considered or ruled out)
@@ -17,8 +18,9 @@ Ask the user for these if not provided:
- **Reason for choice**
- **Status** (Proposed / Accepted / Deprecated / Superseded)
- **Author and date**
- **Team context** (optional — team size, relevant experience, org constraints; helps calibrate formality and depth of the Context section)
## Output Structure
## Output Format
---
@@ -89,13 +91,13 @@ For each option, produce:
## Implementation Notes
[Optional but valuable: any specific patterns, gotchas, or guidance for the team implementing based on this decision. Link to relevant tickets, RFCs, or design docs if applicable.]
[Include if the decision has non-obvious implementation gotchas, or if there are related tickets/RFCs implementers will need. Skip only if the decision is purely tooling selection with no implementation ambiguity.]
---
## Review Date
[Optional: "This decision should be reviewed if [condition] — e.g. team grows beyond 20 engineers, or traffic exceeds 10M requests/day."]
[Include unless the decision is permanent or self-evidently final. State a specific trigger condition — e.g. "Review if team grows beyond 20 engineers or traffic exceeds 10M requests/day" — not just "should be reviewed periodically".]
---
@@ -107,10 +109,18 @@ For each option, produce:
- [ ] Consequences include *negative* consequences — no decision is consequence-free
- [ ] Decision is stated in plain language in the Decision section
- [ ] Risks section identifies what would invalidate this decision
- [ ] Written for someone with no prior context on this decision
- [ ] Context section states the problem explicitly in its first 1–2 sentences (does not assume the reader knows what problem the team was solving)
- [ ] Each rejected option's "Why ruled out" explanation names a specific constraint or trade-off (not a circular statement like "didn't meet our requirements")
## Example Trigger Phrases
## Anti-Patterns
- [ ] Do not write an ADR after the decision has already been fully implemented and the team has moved on — ADRs written retrospectively often omit the real reasons and alternatives
- [ ] Do not list only the chosen option — rejected options with honest reasons are the most valuable part of an ADR for future readers
- [ ] Do not write consequences that are all positive — every architectural decision involves trade-offs; an ADR with no negative consequences was not scrutinised honestly
- [ ] Do not leave the status as "Proposed" indefinitely — an ADR that no one has approved is not guiding anyone's decisions
- [ ] Do not write context that assumes the reader already knows what problem was being solved — the context section exists precisely for readers who lack that background
## Usage Examples
- "Write an ADR for using [technology]"
- "Document our decision to [architectural choice]"
- "Create an architecture decision record for [topic]"
@@ -0,0 +1,366 @@
---
name: capacity-planning
description: "Produce a capacity planning document for a service covering traffic forecasts, resource requirements, and scaling strategy. Use when asked to plan infrastructure capacity, forecast resource needs, model traffic growth, define scaling strategy, or produce a capacity review for a service. Produces a structured capacity plan covering current baseline metrics, growth projections, resource requirements per tier, scaling strategy, cost projections, capacity triggers, and an infrastructure action roadmap."
---
# Capacity Planning Skill
Produce a complete capacity planning document for a service. Capacity planning is not about predicting the future exactly — it is about understanding current headroom, modelling growth, and ensuring the team takes infrastructure action before a constraint becomes an incident.
A good capacity plan answers: what is running out first, how long before it runs out, what does it cost to fix it, and who decides when to act.
## Required Inputs
Ask for these if not already provided:
- **Service name and description** — what the service does and who depends on it
- **Current traffic and usage metrics** — requests per second (or per day), active users, data volume — whatever units are most natural for this service
- **Current resource utilisation** — CPU %, memory %, disk usage, connection pool utilisation, DB query throughput
- **Growth rate or projections** — historical growth rate, or known upcoming events (product launch, sales cycle, seasonal peak)
- **Tech stack and infrastructure** — cloud provider, compute type (VMs, containers, serverless), database, caching layer, CDN
- **Cost constraints** — current infrastructure spend, acceptable cost ceiling, or target cost per unit of traffic
## Output Format
---
# Capacity Plan: [Service Name]
**Service:** [Name] | **Team:** [Team name]
**Author:** [Name] | **Last updated:** [Date]
**Planning horizon:** [12 months — [Month Year] to [Month Year]]
**Review cadence:** [Quarterly]
---
## 1. Executive Summary
[3–5 sentences covering: current state, the most critical capacity constraint, the timeline before it becomes a risk, the recommended action, and the cost implication. Written for an engineering manager or VP who needs the key facts without reading the full document.]
**Critical finding:** [e.g. "The database connection pool will reach 90% utilisation within 6 weeks at current growth. Without action, this will cause request queueing and latency spikes under normal traffic."]
**Recommended immediate action:** [e.g. "Increase connection pool limit and add a read replica within the next 2 weeks."]
**Estimated cost impact:** [e.g. "Recommended changes add ~$[X]/month to infrastructure spend."]
---
## 2. Current Baseline
*All metrics are 30-day averages unless noted. Date captured: [Date]*
### Traffic
| Metric | Value | Peak (7-day) | Notes |
|---|---|---|---|
| Requests per second (avg) | [X req/s] | [X req/s] | [Peak time / day of week] |
| Requests per day | [X M/day] | [X M/day] | — |
| Active users (DAU/MAU) | [X] / [X] | — | — |
| [Service-specific metric — e.g. jobs processed/hour] | [X] | [X] | — |
| [Service-specific metric — e.g. GB ingested/day] | [X GB] | [X GB] | — |
### Compute
| Resource | Current utilisation | Instance type | Count | Notes |
|---|---|---|---|---|
| CPU (avg) | [X%] | [e.g. c5.2xlarge] | [X] | Peak: [X%] |
| Memory (avg) | [X%] | — | — | Peak: [X%] |
| Network egress | [X Mbps] | — | — | — |
| Container / pod count | [X] | [e.g. 2 vCPU / 4 GB] | — | Auto-scaling range: [X–Y] |
### Database
| Resource | Current utilisation | Spec | Notes |
|---|---|---|---|
| CPU | [X%] | [e.g. db.r5.2xlarge] | Peak: [X%] |
| Memory | [X%] | [X GB RAM] | — |
| Storage used | [X GB] of [Y GB] ([Z%]) | [Y GB provisioned] | Growth: [~X GB/month] |
| IOPS (avg) | [X] of [Y provisioned] | [Y IOPS] | Peak: [X IOPS] |
| Connection pool | [X] of [Y max] ([Z%]) | Max connections: [Y] | [ORM pool size: X] |
| Query P99 latency | [X ms] | — | [Slowest query: X] |
| Read/write ratio | [X%] reads / [Y%] writes | — | — |
### Cache
| Resource | Current utilisation | Spec | Notes |
|---|---|---|---|
| Memory used | [X GB] of [Y GB] ([Z%]) | [e.g. cache.r6g.large] | Eviction rate: [X%] |
| Hit rate | [X%] | — | Miss rate: [Y%] |
| Connections | [X] | Max: [Y] | — |
### Storage / Object Store
| Resource | Current usage | Growth rate | Notes |
|---|---|---|---|
| [S3 / GCS / Blob] | [X GB / TB] | [~X GB/month] | [Lifecycle policies in place? Y/N] |
| Disk (if applicable) | [X GB] of [Y GB] | [~X GB/month] | [RAID / EBS type] |
### Cost Baseline
| Component | Current monthly cost | % of total |
|---|---|---|
| Compute (app servers) | $[X] | [X%] |
| Database | $[X] | [X%] |
| Cache | $[X] | [X%] |
| Storage | $[X] | [X%] |
| CDN / bandwidth | $[X] | [X%] |
| Other ([describe]) | $[X] | [X%] |
| **Total** | **$[X]** | 100% |
**Unit economics:** $[X] per [1,000 requests / 1,000 users / GB processed]
---
## 3. Growth Projections
### Assumptions
| Assumption | Value | Source | Confidence |
|---|---|---|---|
| Monthly traffic growth rate | [X%] | [Historical trend / product forecast] | [High / Medium / Low] |
| Seasonal peak factor | [+X% in [month(s)]] | [Last year's data / expected launch] | [High / Medium] |
| Upcoming events | [e.g. Marketing campaign — [Month], expected +[X]% traffic spike] | [Marketing plan] | [Medium] |
| User growth | [X new users/month] | [Sales pipeline / growth model] | [Medium] |
| Data growth | [X GB/month] | [Current trend] | [High] |
### Traffic Forecast
| Timeframe | Req/s (avg) | Req/s (peak) | DAU | Data volume (cumulative) |
|---|---|---|---|---|
| **Now** (baseline) | [X] | [X] | [X] | [X GB/TB] |
| **+3 months** | [X] | [X] | [X] | [X GB/TB] |
| **+6 months** | [X] | [X] | [X] | [X GB/TB] |
| **+12 months** | [X] | [X] | [X] | [X GB/TB] |
*Growth formula: [Baseline] × (1 + [monthly rate])^[months] + seasonal adjustment*
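The growth formula is simple compound growth, and can be sketched directly. The function name and the example figures (1,000 req/s baseline, 5% monthly growth) are illustrative assumptions, not values from any real service.

```python
def project(baseline: float, monthly_rate: float, months: int,
            seasonal_adjustment: float = 0.0) -> float:
    """Apply baseline * (1 + monthly_rate) ** months + seasonal adjustment."""
    return baseline * (1 + monthly_rate) ** months + seasonal_adjustment


# Fill the forecast rows for a hypothetical 1,000 req/s baseline at 5% per month.
forecast = {months: round(project(1000, 0.05, months)) for months in (0, 3, 6, 12)}
```

Note that 5% monthly growth compounds to roughly 80% over a year, not 60%, which is why the forecast uses the exponent rather than multiplying the rate by the number of months.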
### Capacity Headroom Analysis
**When does each resource run out at current utilisation and projected growth?**
| Resource | Current utilisation | Safe ceiling | Headroom remaining | Months to ceiling |
|---|---|---|---|---|
| App CPU | [X%] | 70% | [X%] | [X months] |
| App memory | [X%] | 80% | [X%] | [X months] |
| DB CPU | [X%] | 70% | [X%] | [X months] |
| DB storage | [X GB] of [Y GB] | 80% = [Z GB] | [X GB] | [X months] |
| DB IOPS | [X] of [Y] | 80% = [Z] | [X IOPS] | [X months] |
| DB connections | [X] of [Y] | 80% = [Z] | [X] | [X months] |
| Cache memory | [X GB] of [Y GB] | 75% = [Z GB] | [X GB] | [X months] |
| Storage (object) | [X TB] | No hard limit — cost trigger | — | [Cost trigger: $X/month] |
**Red flags** (resources hitting ceiling within 3 months):
- [Resource]: [current]% → ceiling in [X weeks] — **Action required**
- [Resource]: [current]% → ceiling in [X weeks] — **Action required**
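The "Months to ceiling" column can be estimated by inverting the compound growth formula. This sketch assumes utilisation grows at the same monthly rate as traffic, which is a simplification; the function name and example numbers are hypothetical.

```python
import math


def months_to_ceiling(current: float, ceiling: float, monthly_rate: float) -> float:
    """Solve current * (1 + monthly_rate) ** m = ceiling for m."""
    if current >= ceiling:
        return 0.0  # already at or past the safe ceiling
    if monthly_rate <= 0:
        return math.inf  # no growth, so the ceiling is never reached
    return math.log(ceiling / current) / math.log(1 + monthly_rate)


# Hypothetical example: DB CPU at 50% with a 70% safe ceiling, growing 5% per month.
db_cpu_months = months_to_ceiling(50, 70, 0.05)
is_red_flag = db_cpu_months < 3  # the 3-month red-flag threshold used above
```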
---
## 4. Resource Requirements
### Compute Requirements
| Timeframe | Required instances | Recommended instance type | Auto-scaling range | Notes |
|---|---|---|---|---|
| Now | [X] | [type] | [min: X, max: Y] | Current configuration |
| +3 months | [X] | [type] | [min: X, max: Y] | [Any instance type change needed?] |
| +6 months | [X] | [type or upgrade] | [min: X, max: Y] | [Consider [larger type / horizontal scale]] |
| +12 months | [X] | [type or upgrade] | [min: X, max: Y] | [State of horizontal vs vertical decision] |
**Memory headroom target:** Maintain ≥30% available memory at average load; ≥20% at peak.
**CPU headroom target:** Maintain ≥30% available CPU at average load; ≥15% at peak.
### Database Requirements
| Timeframe | Instance type | Storage | IOPS | Read replica | Notes |
|---|---|---|---|---|---|
| Now | [type] | [X GB] | [X] | [Y/N] | Current |
| +3 months | [type] | [X GB] | [X] | [Y/N] | [Upgrade storage / IOPS] |
| +6 months | [type or upgrade] | [X GB] | [X] | **Yes** | [Read replica recommended by this point] |
| +12 months | [type] | [X GB] | [X] | [X replicas] | [Consider sharding / partitioning at this scale] |
**Storage growth management:**
- Current growth: [~X GB/month]
- Storage auto-scaling: [Enabled / Not enabled — enable by [date]]
- Archiving policy: [Records older than X months moved to [cold storage / archive tier]]
### Cache Requirements
| Timeframe | Node type | Nodes | Memory | Notes |
|---|---|---|---|---|
| Now | [type] | [X] | [X GB] | Current |
| +6 months | [type] | [X] | [X GB] | [Scale out or upgrade] |
| +12 months | [type] | [X] | [X GB] | [Cluster mode if >Y GB required] |
---
## 5. Scaling Strategy
### Compute — Horizontal Scaling
**Decision: [Horizontal / Vertical / Both]**
[State the scaling strategy and the reasoning. E.g. "The application is stateless and CPU-bound; horizontal scaling is preferred. Vertical scaling is a short-term fallback only."]
**Auto-scaling configuration:**
```
Scale-out trigger: CPU > [X%] for [Y minutes] OR memory > [X%] for [Y minutes]
Scale-in trigger: CPU < [X%] for [Y minutes] AND memory < [X%] for [Y minutes]
Min instances: [X] (ensures HA across [X] AZs)
Max instances: [Y] (cost ceiling)
Cooldown period: [X seconds]
Warmup time: [X seconds] (time for new instance to be healthy)
```
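
The asymmetry in the configuration above (OR for scale-out, AND for scale-in) can be sketched as follows. This is an illustrative model only, not any cloud provider's API: the names are hypothetical, and the inputs are assumed to be averages already taken over the trigger window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    scale_out_cpu: float  # percent
    scale_out_mem: float  # percent
    scale_in_cpu: float   # percent
    scale_in_mem: float   # percent
    min_instances: int
    max_instances: int


def desired_instances(policy: ScalingPolicy, current: int,
                      cpu: float, mem: float) -> int:
    """Return the next instance count, clamped to the min/max range."""
    if cpu > policy.scale_out_cpu or mem > policy.scale_out_mem:
        target = current + 1  # either signal alone is enough to scale out
    elif cpu < policy.scale_in_cpu and mem < policy.scale_in_mem:
        target = current - 1  # both signals must be low to scale in
    else:
        target = current
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ScalingPolicy(70, 80, 30, 40, min_instances=2, max_instances=6)
print(desired_instances(policy, current=3, cpu=75, mem=50))
```

Requiring both signals before scaling in keeps the fleet from shrinking while one resource is still under pressure.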
**Limits of horizontal scaling:**
- [e.g. Database connection pool is the current bottleneck — adding more app instances without increasing DB connections will not help]
- [e.g. Session affinity required for WebSocket connections — limits pure stateless scaling]
### Database — Read Scaling
**Strategy:** [Read replica / Connection pooling via PgBouncer / Query caching / None needed yet]
**When to add a read replica:**
- DB CPU sustained >60% for >30 minutes, OR
- Read query P95 latency >50ms, OR
- Connection pool utilisation >70%
**Connection pooling:**
- Pooler: [PgBouncer / RDS Proxy / application-level / not configured]
- Pool size: [X connections per app instance × Y instances = Z total]
- Max DB connections: [configured to Z + 20% headroom]
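
The pool arithmetic above can be checked with a short sketch. The helper name and sample values are hypothetical; the 20% headroom default comes from the line above.

```python
import math


def required_db_connections(pool_per_instance: int, instances: int,
                            headroom_pct: int = 20) -> int:
    """Total app connections plus headroom, rounded up to a whole connection."""
    total = pool_per_instance * instances
    return total + math.ceil(total * headroom_pct / 100)


# Hypothetical example: 10 connections per app instance across 3 instances.
print(required_db_connections(10, 3))
```

Re-run the calculation with the auto-scaling maximum rather than the current instance count, since that is when the database sees the most connections.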
### Caching Strategy
**Cache policy:** [Cache-aside / Write-through / Write-behind]
**TTL strategy:**
| Data type | TTL | Invalidation method |
|---|---|---|
| [e.g. User profile] | [5 minutes] | [Explicit invalidation on update] |
| [e.g. Product catalog] | [1 hour] | [TTL expiry — eventual consistency acceptable] |
| [e.g. Session data] | [24 hours] | [Explicit invalidation on logout] |
**Cache miss handling:** [Describe what happens on a cache miss — does it fall through gracefully or cause a thundering herd risk?]
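
For reference, a cache-aside read path with a TTL might look like the sketch below. It is an in-process illustration with hypothetical names, not a Redis or Memcached client, and it does not guard against a thundering herd: concurrent misses on the same key would each call the loader.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside: read from the cache, fall through to the loader on a miss."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: entry has not expired yet
        value = loader(key)  # miss or expired: go to the source of truth
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Explicit invalidation, e.g. on update or logout."""
        self._store.pop(key, None)


cache = TTLCache(ttl_seconds=300)
profile = cache.get_or_load("user:1", lambda key: {"id": key})
print(profile)
```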
---
## 6. Cost Projections
### Infrastructure Cost Forecast
| Component | Now (monthly) | +3 months | +6 months | +12 months |
|---|---|---|---|---|
| Compute | $[X] | $[X] | $[X] | $[X] |
| Database | $[X] | $[X] | $[X] | $[X] |
| Cache | $[X] | $[X] | $[X] | $[X] |
| Storage | $[X] | $[X] | $[X] | $[X] |
| CDN / bandwidth | $[X] | $[X] | $[X] | $[X] |
| **Total** | **$[X]** | **$[X]** | **$[X]** | **$[X]** |
| MoM growth % | — | [X%] | [X%] | [X%] |
**Unit economics trend:**
| Timeframe | Cost per 1k requests | Cost per user/month | Notes |
|---|---|---|---|
| Now | $[X] | $[X] | Baseline |
| +6 months | $[X] | $[X] | [Improving / worsening — why] |
| +12 months | $[X] | $[X] | [Target: $X per 1k requests] |
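
The two unit-economics columns are simple ratios of total monthly cost. A minimal sketch, with hypothetical function names and sample figures:

```python
def cost_per_1k_requests(monthly_cost: float, monthly_requests: int) -> float:
    """Monthly infrastructure cost divided by thousands of requests served."""
    if monthly_requests <= 0:
        raise ValueError("monthly_requests must be positive")
    return monthly_cost / monthly_requests * 1000


def cost_per_user(monthly_cost: float, monthly_active_users: int) -> float:
    """Monthly infrastructure cost divided by monthly active users."""
    if monthly_active_users <= 0:
        raise ValueError("monthly_active_users must be positive")
    return monthly_cost / monthly_active_users


# Hypothetical example: $5,000/month, 100M requests, 20,000 active users.
print(cost_per_1k_requests(5000, 100_000_000), cost_per_user(5000, 20_000))
```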
**Cost optimisation opportunities:**
| Opportunity | Estimated saving | Effort | Timeline |
|---|---|---|---|
| [e.g. Reserved instances for baseline compute] | $[X/month] | Low | Immediate |
| [e.g. S3 lifecycle policy — move objects >90 days to Glacier] | $[X/month] | Low | This sprint |
| [e.g. Right-size [instance] — current is overprovisioned] | $[X/month] | Low | This sprint |
| [e.g. Optimise top-5 slow queries — reduce DB compute need] | $[X/month] | Medium | Next quarter |
---
## 7. Capacity Triggers and Actions
Define the thresholds that require explicit action — not retrospective fixes after an incident.
| Resource | Watch (amber) | Act (red — schedule work) | Emergency (incident risk) |
|---|---|---|---|
| App CPU (sustained avg) | >60% | >70% | >85% |
| App memory | >70% | >80% | >90% |
| DB CPU | >55% | >65% | >80% |
| DB storage | >65% | >75% | >85% |
| DB connections | >60% | >70% | >85% |
| Cache memory / eviction | Hit rate <90% | Hit rate <85% | Hit rate <75% |
| Error rate | >0.5% | >1% | >2% |
| P99 latency | >2× baseline | >3× baseline | >5× baseline |
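
The utilisation rows of this table can be encoded directly so that alerts and reports use the same thresholds. The sketch below covers only the "higher is worse" percentage rows; cache hit rate (where lower is worse) and the baseline-relative latency row would need separate handling. The dictionary keys and function name are hypothetical.

```python
# (watch, act, emergency) thresholds in percent, copied from the table above.
THRESHOLDS = {
    "app_cpu": (60, 70, 85),
    "app_memory": (70, 80, 90),
    "db_cpu": (55, 65, 80),
    "db_storage": (65, 75, 85),
    "db_connections": (60, 70, 85),
}


def classify(resource: str, utilisation_pct: float) -> str:
    """Map a utilisation reading to ok / watch / act / emergency."""
    watch, act, emergency = THRESHOLDS[resource]
    if utilisation_pct > emergency:
        return "emergency"
    if utilisation_pct > act:
        return "act"
    if utilisation_pct > watch:
        return "watch"
    return "ok"


print(classify("db_cpu", 66))
```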
**When a Watch threshold is crossed:**
- Engineer who observes it creates a ticket with capacity label
- Ticket reviewed in next sprint planning
**When an Act threshold is crossed:**
- On-call engineer creates a ticket marked P2
- Tech lead reviews within 24 hours
- Action plan documented and scheduled within 1 sprint
**When an Emergency threshold is crossed:**
- Treat as a potential incident — page on-call
- Emergency scaling actions taken immediately (see runbook)
- Root cause investigation starts within 2 hours
**Emergency scaling runbook:** [Link to oncall-runbook for capacity incidents]
---
## 8. Infrastructure Action Roadmap
### Immediate Actions (next 2 weeks)
| Action | Owner | Effort | Justification |
|---|---|---|---|
| [e.g. Increase DB connection pool limit to X] | [Name] | [2 hours] | [DB connections at X% — hitting ceiling in X weeks] |
| [e.g. Enable storage auto-scaling on RDS] | [Name] | [30 min] | [Storage at X% — prevents emergency at X months] |
| [e.g. Add S3 lifecycle policy for [bucket]] | [Name] | [1 hour] | [Storage growing at $X/month unnecessarily] |
### This Quarter (within 3 months)
| Action | Owner | Effort | Justification |
|---|---|---|---|
| [e.g. Add read replica to production DB] | [Name] | [1 day] | [DB CPU projected to hit 65% in 2 months] |
| [e.g. Increase max auto-scaling limit from X to Y] | [Name] | [2 hours] | [Current max is too close to expected peak] |
| [e.g. Configure PgBouncer for connection pooling] | [Name] | [3 days] | [Reduce per-connection overhead; headroom for growth] |
### Next Quarter (3–6 months)
| Action | Owner | Effort | Justification |
|---|---|---|---|
| [e.g. Upgrade DB instance class — [current] → [next]] | [Name] | [2 hours — blue/green] | [DB CPU projected to hit 70% by Q[X]] |
| [e.g. Implement caching for [high-read endpoint]] | [Name] | [1 week] | [Reduce DB read load by estimated [X%]] |
| [e.g. Evaluate horizontal DB sharding] | [Name] | [2 weeks (spike)] | [At 12-month projections, single DB hits limits] |
### Horizon (6–12 months)
| Action | Description | Trigger condition |
|---|---|---|
| [e.g. Multi-region deployment] | [Active-passive setup in eu-west-2] | [DAU exceeds X or SLA requires 99.99%] |
| [e.g. Database sharding or migration to distributed DB] | [Evaluate CockroachDB / Vitess] | [Single-node DB projected to hit ceiling] |
| [e.g. CDN expansion] | [Add PoPs in [region]] | [Latency SLO breached for [geography]] |
---
## Anti-Patterns
- [ ] Do not set capacity trigger thresholds without knowing the baseline — a "CPU > 70%" alert is meaningless if you don't know what normal looks like
- [ ] Do not plan only for average traffic — capacity plans that don't model peak load will result in incidents during the events that matter most
- [ ] Do not conflate vertical and horizontal scaling — adding more app servers without addressing database connection limits will not resolve the constraint
- [ ] Do not present growth projections as certainties — all forecasts have uncertainty; state the confidence level and provide a conservative and optimistic scenario
- [ ] Do not defer action items without a named owner and a specific date — a roadmap with no owners is a wish list
## Quality Checks
- [ ] Every resource has a quantified current utilisation and a projected months-to-ceiling — no hand-waving
- [ ] The most critical constraint is called out in the executive summary with a specific timeline
- [ ] Growth projections state their assumptions and confidence level — not presented as certainties
- [ ] Capacity triggers define amber/red thresholds and name who acts at each level
- [ ] Cost projections include unit economics, not just absolute totals
- [ ] The infrastructure roadmap has named owners and effort estimates — not just a wish list
- [ ] Auto-scaling configuration includes both scale-out AND scale-in triggers, and a min/max range
- [ ] Actions are ordered by urgency — immediate items are genuinely immediate, not backlog filler
@@ -0,0 +1,97 @@
---
name: changelog-generator
description: "Convert a git log, commit list, or release notes into a polished, user-facing changelog. Use when writing release notes, generating a CHANGELOG.md entry, or documenting what changed in a version. Produces a structured changelog section with version header, categorised changes, and migration notes."
---
# Changelog Generator Skill
Converts raw git commits, a diff summary, or developer release notes into a polished changelog entry — categorised, user-facing, and following Keep a Changelog conventions.
## Required Inputs
Ask for these if not provided:
- **Commits or release notes** (paste `git log --oneline`, raw commit messages, or a description of what changed)
- **Version number** (e.g. 2.4.0, v1.0.0-beta.2)
- **Release date** (or "today")
- **Audience** (developers using an API / end users of a product / internal team — affects language)
- **Any breaking changes** (flag these explicitly if known)
- **Previous version behaviour** (optional — paste the previous changelog entry or describe what is changing; needed for accurate "Changed" entries)
- **Scope** (whole product / specific package or module — e.g. "payments SDK only", "iOS app", "all services")
## Output Format
Follow [Keep a Changelog](https://keepachangelog.com) format:
---
## [X.Y.Z] — YYYY-MM-DD
### Breaking Changes ⚠️
[Only include if there are breaking changes]
- **[Breaking change]:** [What changed and what it breaks]
- **Migration required:** [Specific action the user must take]
### Added
- [New feature or capability, written from the user's perspective]
- [Another addition]
### Changed
- [Changed behaviour — what it did before vs. what it does now]
- [Performance improvement with measurable impact if known]
### Fixed
- [Bug fixed — describe what was broken, not the fix implementation]
- [Another fix]
### Deprecated
- [Deprecated thing] — use [replacement] instead. Will be removed in [version].
### Removed
- [Removed thing] — was deprecated in [version]
### Security
- [Security fix — describe the vulnerability class, not exploit details]
---
---
> **Skill guidance — do not include the following section in the delivered changelog:**
## Formatting Rules Applied
**Language:** Write for the reader, not the committer. "Add dark mode support" not "implement ThemeProvider with dark palette variant".
**Breaking changes:** Always call these out first with ⚠️. Include a migration path.
**Bug fixes:** Describe what was broken, not what was changed. "Fix crash when user has no profile picture" not "null-check avatar URL before rendering".
**Granularity:** Group related commits into one line. Don't list every micro-commit separately.
**Tone:** Active voice, imperative mood. "Add", "Fix", "Remove" — not "Added", "Fixed", "Removed".
**Empty sections:** Omit any section with no entries. Don't include empty `### Fixed` blocks.
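
As a starting point for the categorisation step, the sketch below sorts commit subjects into sections and omits empty ones. It assumes the commits use Conventional Commits prefixes (`feat`, `fix`, `perf`) and that the input is one subject per line, for example from `git log --format=%s`. The prefix-to-section mapping is an assumption, and the output is a first draft: entries still need rewriting into user-facing language and grouping by hand.

```python
import re
from typing import Dict, Iterable, List

# Assumed mapping; docs, ci, chore and similar types are treated as internal.
TYPE_TO_SECTION = {"feat": "Added", "perf": "Changed", "fix": "Fixed"}
SECTION_ORDER = ["Added", "Changed", "Fixed"]
COMMIT_RE = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?!?:\s*(?P<subject>.+)$")


def draft_changelog(version: str, date: str, subjects: Iterable[str]) -> str:
    sections: Dict[str, List[str]] = {name: [] for name in SECTION_ORDER}
    for raw in subjects:
        match = COMMIT_RE.match(raw.strip())
        if match is None:
            continue  # not a prefixed commit subject; handle manually
        section = TYPE_TO_SECTION.get(match.group("type"))
        if section is None:
            continue  # internal change, not user-facing
        subject = match.group("subject")
        sections[section].append(subject[0].upper() + subject[1:])
    lines = [f"## [{version}] - {date}"]
    for name in SECTION_ORDER:
        if sections[name]:  # empty sections are omitted
            lines.append(f"### {name}")
            lines.extend(f"- {entry}" for entry in sections[name])
    return "\n".join(lines)


print(draft_changelog("2.4.0", "2026-06-09", [
    "feat(web): add Skill Playground",
    "fix(web): propagate mid-stream API errors",
    "docs(readme): add live GitHub Pages link",
]))
```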
## Quality Checks
- [ ] Breaking changes are at the top with migration instructions
- [ ] All entries are user-facing language (no internal variable names or implementation details)
- [ ] Related commits are grouped into single entries (not listed individually)
- [ ] Version and date header is correct
- [ ] Empty sections are omitted
- [ ] No entries start with past-tense verbs (no "Added", "Fixed", "Removed" — use "Add", "Fix", "Remove")
- [ ] Every breaking change entry includes a specific migration action (not just "update your code")
## Anti-Patterns
- [ ] Do not include implementation details in changelog entries — users need to know what changed for them, not how the code was refactored internally
- [ ] Do not list every micro-commit as a separate entry — related commits should be grouped into one user-facing change
- [ ] Do not omit the migration path for breaking changes — a breaking change entry without a specific migration action forces users to read the source code
- [ ] Do not include empty sections — a "### Fixed" section with no entries signals the template was filled in carelessly
- [ ] Do not write breaking changes in the same casual tone as minor additions — breaking changes must be visually prominent and call out migration requirements explicitly
## Usage Examples
- "Write a changelog for version [X]" + [paste commits]
- "Generate release notes from these commits"
- "Turn this git log into a CHANGELOG entry"
- "Write the CHANGELOG.md update for this release"
- "What changed in this release?" + [paste commit list]
@@ -0,0 +1,309 @@
---
name: cicd-playbook
description: "Write a CI/CD pipeline playbook for a service or team. Use when asked to document a CI/CD pipeline, write a deployment process, define release gates, document build and test stages, or create a deployment guide. Produces a structured playbook covering pipeline stages, environment definitions, deployment gates, rollback procedures, and on-call responsibilities."
---
# CI/CD Playbook Skill
Produce a complete, actionable CI/CD playbook for a service or team — covering everything a new engineer needs to understand, contribute to, and operate the pipeline safely.
A good playbook is not a diagram. It is a document that answers: what runs, when, why, who owns it, and what to do when it breaks.
## Required Inputs
Ask for these if not already provided:
- **Service name** and brief description
- **Tech stack** — language, framework, containerisation (Docker, etc.)
- **Source control** — GitHub / GitLab / Bitbucket, branching strategy
- **CI platform** — GitHub Actions / CircleCI / Jenkins / BuildKite / other
- **CD platform / deployment target** — Kubernetes, ECS, Lambda, Heroku, VMs, etc.
- **Environments** — e.g. dev, staging, production (and any canary / feature environments)
- **Deployment frequency** — how often does the team ship?
- **Any existing gates** — manual approvals, smoke tests, feature flags
- **On-call setup** — who's responsible during deploys?
## Output Format
---
# CI/CD Playbook: [Service Name]
**Service:** [Name] | **Team:** [Team name]
**Last updated:** [Date] | **Owner:** [Name / role]
**Pipeline platform:** [CI tool] → [CD tool / platform]
---
## Overview
[2–3 sentences describing what this service does and why the CI/CD pipeline is structured the way it is. Include the deployment target and how frequently the team ships.]
**Deployment frequency:** [Multiple times per day / Daily / Weekly / On-demand]
**Average pipeline duration:** [X minutes]
**Rollback time (p95):** [X minutes]
---
## Pipeline Stages
```
[Branch push]
[1. Build & Lint] ──fail──▶ ❌ Block PR
[2. Unit Tests] ──fail──▶ ❌ Block PR
[3. Integration Tests] ──fail──▶ ❌ Block PR
[4. Security Scan] ──fail──▶ ⚠️ [Block / Warn — specify]
[5. Build Artefact / Container Image]
[6. Deploy to Staging] ──fail──▶ ❌ Block promotion
[7. Smoke Tests (Staging)]
[8. Manual Approval Gate] ──(if required)
[9. Deploy to Production] ──fail──▶ 🔁 Auto-rollback (if configured)
[10. Post-deploy checks]
```
---
## Stage Definitions
### Stage 1 — Build & Lint
**What runs:** [Build command] + [Linter — e.g. ESLint, golangci-lint, flake8]
**Trigger:** Every commit to any branch
**Blocking:** Yes — PR cannot be merged if this fails
**Typical duration:** [X minutes]
**Owner if it fails:** PR author
**Common failure causes:**
- [e.g. Missing dependency — run `npm install` locally before pushing]
- [e.g. Lint rule violation — run `npm run lint -- --fix` to auto-fix most issues]
---
### Stage 2 — Unit Tests
**What runs:** [Test command — e.g. `npm test`, `go test ./...`, `pytest`]
**Coverage gate:** [X]% minimum — pipeline fails below this threshold
**Trigger:** Every commit
**Blocking:** Yes
**Typical duration:** [X minutes]
**Coverage report:** [Where to find it — e.g. uploaded to Codecov, available in CI artifacts]
---
### Stage 3 — Integration Tests
**What runs:** [Test suite description — e.g. "API integration tests against a test database using Docker Compose"]
**Environment:** [Ephemeral test environment / shared test DB / etc.]
**Trigger:** Every commit to `main` and feature branches targeting `main`
**Blocking:** Yes
**Typical duration:** [X minutes]
**If slow:** [e.g. "Integration tests can be skipped locally with `SKIP_INTEGRATION=true` — never skip in CI"]
---
### Stage 4 — Security Scan
**Tools:** [e.g. Snyk, Trivy, OWASP Dependency Check, Semgrep]
**What it checks:** [Dependency vulnerabilities / SAST / secrets detection — list what applies]
**Blocking on:** Critical and High severity findings
**Non-blocking on:** Medium and Low (flagged, not blocking)
**Trigger:** Every commit to `main`
**How to handle a flagged vulnerability:**
1. Check if a fix is available — upgrade the dependency
2. If no fix available, open a security ticket and add a suppression with justification
3. Never suppress without a ticket and owner
---
### Stage 5 — Build Artefact
**What is produced:** [Docker image / binary / zip — be specific]
**Registry:** [ECR / GCR / Docker Hub / Artifactory — URL]
**Tagging convention:** `[service-name]:[git-sha]` (also tagged `:latest` on `main`)
**Trigger:** Commits to `main` only (not feature branches)
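
The tagging convention can be expressed as a small helper, shown here as a sketch. The function name and the sample service name are hypothetical; the rule that only `main` produces an artefact comes from the trigger above.

```python
from typing import List


def image_tags(service: str, git_sha: str, branch: str) -> List[str]:
    """Return the tags to push for a build, per the convention above."""
    if branch != "main":
        return []  # feature branches do not produce an artefact
    return [f"{service}:{git_sha}", f"{service}:latest"]


print(image_tags("payments-api", "5721cd3a49", "main"))
```

Deployments should reference the SHA tag rather than `:latest`, so a rollback can name an exact image.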
---
### Stage 6 — Deploy to Staging
**Deployment method:** [e.g. Helm upgrade / kubectl apply / ecs deploy / Terraform apply]
**Staging URL:** [URL]
**Trigger:** Automatic on successful artefact build from `main`
**Who can deploy to staging:** Any engineer (automatic)
**Environment variables:** Managed in [Vault / AWS SSM / GitHub Secrets / etc.]
**Staging is not production:** [Any differences in config, scale, or data — state them here]
---
### Stage 7 — Smoke Tests (Staging)
**What runs:** [Description — e.g. "10 critical path tests covering login, core API endpoints, and payment flow"]
**Tool:** [e.g. Playwright / Postman / custom script]
**Pass criteria:** All smoke tests pass within [X seconds] timeout
**Blocking:** Yes — production deploy will not proceed if smoke tests fail
**Smoke test suite location:** [Link to test files or folder]
---
### Stage 8 — Manual Approval Gate
**Required for:** [Production deploys / deploys affecting >X% of traffic / deploys to specific regions]
**Who can approve:** [e.g. Any engineer on the team / Lead engineer / On-call engineer]
**Approval timeout:** [e.g. 24 hours — auto-cancelled if no approval]
**How to approve:** [GitHub Actions approve step / Slack command / other — with link]
**When to withhold approval:**
- Active incident in production
- Deploy is outside the deployment window (see below)
- On-call engineer has not been notified
---
### Stage 9 — Deploy to Production
**Deployment method:** [Same as staging or different — specify]
**Deployment window:** [e.g. Monday–Thursday 09:00–16:00 UTC — no deploys on Fridays or before bank holidays]
**Canary / progressive rollout:** [Yes — X% initial traffic, full rollout after Y minutes / No — full deploy]
**Deployment notifications:** [Slack channel — #deployments]
**Who is on-call during deploy:** Deploying engineer is responsible until post-deploy checks pass.
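
A deployment window is easier to enforce when the pipeline checks it, not just the approver. The sketch below uses the example window from above (Monday to Thursday, 09:00 to 16:00 UTC) and does not model bank holidays; the function name is hypothetical.

```python
from datetime import datetime, time, timezone


def in_deploy_window(now: datetime) -> bool:
    """True if `now` falls Monday-Thursday, 09:00 (inclusive) to 16:00 (exclusive) UTC."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    now_utc = now.astimezone(timezone.utc)
    is_allowed_day = now_utc.weekday() <= 3  # Monday is 0, Thursday is 3
    is_allowed_time = time(9, 0) <= now_utc.time() < time(16, 0)
    return is_allowed_day and is_allowed_time


print(in_deploy_window(datetime.now(timezone.utc)))
```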
---
### Stage 10 — Post-Deploy Checks
**Automated checks (run for [X minutes] after deploy):**
- [ ] Error rate: <[X]% (baseline: [Y]%)
- [ ] P99 latency: <[X]ms (baseline: [Y]ms)
- [ ] [Key business metric]: within [X]% of baseline
**Where to watch:** [Datadog / Grafana / CloudWatch dashboard — link]
**If a check fails:** See Rollback Procedure below.
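
The three checks above can be evaluated together so that a single failure list drives the rollback decision. This is a sketch with hypothetical names and thresholds; in practice the observed values would come from the monitoring system named under "Where to watch".

```python
from typing import List


def failed_post_deploy_checks(error_rate_pct: float, p99_ms: float,
                              business_metric: float, *,
                              max_error_rate_pct: float, max_p99_ms: float,
                              metric_baseline: float,
                              metric_tolerance_pct: float) -> List[str]:
    """Return the names of the checks that failed; an empty list means healthy."""
    failures = []
    if error_rate_pct >= max_error_rate_pct:
        failures.append("error_rate")
    if p99_ms >= max_p99_ms:
        failures.append("p99_latency")
    # The business metric may deviate in either direction from its baseline.
    deviation_pct = abs(business_metric - metric_baseline) / metric_baseline * 100
    if deviation_pct > metric_tolerance_pct:
        failures.append("business_metric")
    return failures


print(failed_post_deploy_checks(
    1.5, 450, 80,
    max_error_rate_pct=1.0, max_p99_ms=500,
    metric_baseline=100, metric_tolerance_pct=10,
))
```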
---
## Environments
| Environment | Purpose | Deploy trigger | URL | Data |
|---|---|---|---|---|
| **Dev** | Local development | Manual | localhost | Seeded test data |
| **Staging** | Pre-production validation | Automatic (main) | [URL] | Anonymised prod copy |
| **Production** | Live traffic | Manual approval | [URL] | Live data |
---
## Branching Strategy
**Model:** [Trunk-based / GitFlow / GitHub Flow — describe briefly]
| Branch | Purpose | Who merges | Deploy target |
|---|---|---|---|
| `main` | Production-ready code | PR + review | Staging → Production |
| `feature/*` | Feature development | Author | None (CI only) |
| `hotfix/*` | Critical production fixes | Lead engineer | Can bypass staging gate with approval |
**Hotfix process:** [Describe when and how to use a hotfix branch — what level of incident justifies bypassing the standard process]
---
## Rollback Procedure
**Automated rollback:** [Yes — triggered if post-deploy error rate exceeds [X]% / No — manual only]
**Manual rollback steps:**
```bash
# 1. Identify the last known good image tag
[command to list recent deployments]
# 2. Deploy the previous version
[deployment command with previous tag]
# 3. Confirm rollback is live
[smoke test command or health check URL]
# 4. Notify the team
[Slack command or template]
```
**Rollback decision authority:** Any engineer on-call can initiate a rollback without waiting for approval.
**After a rollback:**
1. Create a post-deploy incident report (see [incident-postmortem skill])
2. Do not re-deploy the same commit without fixing the root cause
3. Notify [stakeholder / support team] of the rollback and expected fix timeline
---
## Secrets and Configuration Management
**Secret store:** [Vault / AWS SSM / GitHub Secrets / Doppler — specify]
**How to add a new secret:**
1. [Step 1]
2. [Step 2]
**Who has access:** [Role or team]
**Rotation policy:** [How often secrets are rotated and who owns it]
**Never do:** Commit secrets to source control, even in `.env` files. The pipeline includes secret scanning (Stage 4) which will flag this.
---
## Common Failures and Fixes
| Failure | Likely cause | Fix |
|---|---|---|
| Build fails with "module not found" | Dependency not installed | Run `[install command]` and commit the lock file |
| Integration tests timeout | Test DB not seeded / external service down | Check [service] status; re-run pipeline |
| Smoke tests fail after staging deploy | Environment variable missing | Check [config location]; compare staging and prod env vars |
| Production deploy stuck at approval | Approver not notified | Tag `@[on-call handle]` in `#deployments` |
| Post-deploy error rate spike | Bad deploy / upstream dependency | Check [dashboard]; initiate rollback if the spike persists for >5 min |
---
## On-Call Responsibilities During Deploy
- The deploying engineer is responsible for monitoring post-deploy checks for [X minutes] after a production deploy
- If you cannot monitor after deploying, hand off explicitly to another engineer in `#deployments`
- For deploys outside business hours: only hotfixes — always page the on-call engineer before deploying
---
## Anti-Patterns
- [ ] Do not describe a rollback procedure that has never been tested — a theoretical rollback is not a rollback plan; test it in staging before production
- [ ] Do not allow deploys on Fridays or before holidays without an explicit on-call engineer who will monitor through the weekend
- [ ] Do not commit secrets to source control even in non-production branches — secret scanning in the pipeline catches this, but prevention is the standard
- [ ] Do not skip post-deploy monitoring after a production deploy — the deploying engineer must watch error rates and latency for the specified observation window
- [ ] Do not suppress a security scan finding without a linked ticket and a named owner — suppressions without accountability accumulate into unmanaged risk
## Quality Checks
- [ ] Every stage has a clear owner when it fails
- [ ] Rollback procedure is tested — not theoretical
- [ ] Secrets management section names the actual tool used (not "use secrets management")
- [ ] Deployment window is specific — not "during business hours"
- [ ] Post-deploy check thresholds are calibrated to actual baseline metrics
@@ -0,0 +1,290 @@
---
name: claude-superpowers
description: "Activate a 4-stage coding discipline framework that forces Claude to plan before coding, isolate changes on a branch, write tests first, and self-review output twice before presenting it. Use when starting a complex coding task, when past Claude sessions produced broken first drafts, or when you want to prevent rework cycles. Produces a confirmed written plan, isolated feature branch, test-first implementation, and a double-reviewed output with a correctness and code-quality checklist."
---
# Claude Superpowers Skill
Stop Claude from shipping the first thing it writes. Superpowers mode locks Claude into four stages — Plan, Isolate, Test First, Double Review — so that what it presents at the end is actually right.
The default problem: Claude sprints out of the gate, writes the whole thing in one shot, and it looks great — until someone runs it. It doesn't plan. It doesn't test. It doesn't verify. The result: code that breaks on edge cases, debugging rounds that burn tokens, and rework that costs more than doing it right the first time.
> **Credit:** Inspired by a skill from Nate Herk's YouTube channel — adapted and extended for this library.
---
## Required Inputs
No inputs required. Superpowers activates on command, then applies to whatever coding task follows.
---
## The Four Stages
### Stage 1 — Plan
Before writing a single line of code, Claude must produce a written plan and wait for user confirmation.
**Plan format:**
```
PLAN
════
TASK
[One-sentence restatement of what was asked. If anything is ambiguous, flag it here before proceeding.]
APPROACH
[2–4 sentences describing the implementation approach and key decisions. If there are multiple valid approaches, briefly explain why this one was chosen.]
FILES TO CREATE OR MODIFY
- [path/to/file.ts] — [what changes: create / modify / delete — one line reason]
- [path/to/file.ts] — [what changes]
EDGE CASES I WILL HANDLE
- [Edge case 1]
- [Edge case 2]
- [Edge case 3]
EDGE CASES I AM NOT HANDLING (out of scope)
- [Out of scope case — reason]
ASSUMPTIONS
- [Any assumption made where the requirements were unclear]
Confirm this plan before I start coding.
```
Claude must not proceed until the user says yes (or provides corrections). If the user corrects the plan, revise and re-confirm before starting.
---
### Stage 2 — Isolate
Claude works in isolation until the output is complete and reviewed. Nothing touches the main project until explicitly approved.
**Isolation rules:**
- If git is available: create a feature branch before making any changes. Branch name format: `superpowers/[task-slug]`
- If no git: note that changes are being made to a working copy and flag all modified files at the end for user review before they're considered "shipped"
- Do not modify files outside the scope defined in the plan unless the user explicitly expands scope during the session
- If new scope is discovered mid-task (e.g. a dependency needs to change), surface it: "This requires also modifying [X] — should I include that in scope?"
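
The branch name format can be derived from the task description. A minimal sketch, assuming a lowercase, hyphen-separated slug is acceptable (the skill does not define the slug rules, and `branch_name` is a hypothetical helper):

```python
import re


def branch_name(task: str) -> str:
    """Build `superpowers/[task-slug]` from a free-text task description."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    if not slug:
        raise ValueError("task description produced an empty slug")
    return f"superpowers/{slug}"


print(branch_name("Add rate limiting to /login!"))
```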
**On starting Stage 2, announce:**
```
ISOLATE
Working in isolation on branch: superpowers/[task-slug]
No changes will be considered final until Stage 4 review is complete.
```
---
### Stage 3 — Test First
Before writing the implementation, write the tests (or at minimum, define the expected behaviour as executable assertions).
**Test-first approach:**
1. Write tests that define the expected behaviour for the task
2. Write tests that cover each edge case identified in the plan
3. Run the tests — they should fail (implementation doesn't exist yet)
4. Confirm the tests are failing for the right reason before writing implementation
5. Write the implementation
6. Run the tests — they should now pass
7. If tests fail: fix the implementation, not the tests
**If the project has no test setup:** flag it and offer two options:
- Option A: Set up a minimal test harness before proceeding (recommended)
- Option B: Define the expected behaviour as a checklist of manual verification steps (faster but weaker)
**Test summary to show before writing implementation:**
```
TESTS WRITTEN
─────────────
File: [test file path]
Tests:
✗ [test description — covers: happy path]
✗ [test description — covers: edge case 1]
✗ [test description — covers: edge case 2]
✗ [test description — covers: error state]
All tests failing as expected. Starting implementation.
```
---
### Stage 4 — Double Review
After completing the code and running tests, Claude reviews its own work twice before presenting it. Neither review is a formality.
**Review 1 — "Does this match what was asked for?"**
Check the completed code against the original request and confirmed plan:
- Does it do everything that was asked?
- Does it handle all edge cases from the plan?
- Are there any mismatches between what was planned and what was built?
- Are there any assumptions baked in that weren't confirmed?
**Review 2 — "Is this good code?"**
Check for technical quality independent of the requirements:
- Obvious bugs or logic errors
- Missing error handling (especially at boundaries: API calls, file I/O, user input)
- Security issues (injection vulnerabilities, exposed secrets, missing auth checks)
- Readability: would another developer understand this in 6 months?
- Performance: any obvious inefficiencies on the critical path?
- Dead code or unused imports introduced
**Double Review output format:**
```
REVIEW 1 — CORRECTNESS
───────────────────────
✅ Handles [requirement 1]
✅ Handles [requirement 2]
✅ Edge case [X] covered
⚠️ [Issue found — what it is and what was changed to fix it]
REVIEW 2 — CODE QUALITY
────────────────────────
✅ Error handling present at all API boundaries
✅ No obvious security issues
⚠️ [Issue found — what it was and how it was fixed]
✅ Readable — no unexplained complexity
VERDICT: [Ready to present / Fixed N issues before presenting]
```
If issues are found in either review, fix them and note what was fixed. Present the corrected version, not the original draft.
---
## Activation Response
When the user triggers Superpowers mode, respond with:
```
Superpowers mode active.
I'll work in 4 stages for every coding task this session:
1. PLAN — Write a plan and wait for your confirmation before coding
2. ISOLATE — Work on a branch; nothing ships until you approve
3. TEST — Write tests before the implementation
4. REVIEW — Review my own work twice before presenting it
What are we building?
```
---
## Output Structure
### Full task flow (all four stages)
```
PLAN
════
[Plan format as above]
Confirm this plan before I start coding.
---
[User confirms]
---
ISOLATE
Working in isolation on branch: superpowers/[task-slug]
TESTS WRITTEN
─────────────
[Test summary — all failing]
Starting implementation.
---
[Implementation runs]
---
REVIEW 1 — CORRECTNESS
───────────────────────
[Checklist]
REVIEW 2 — CODE QUALITY
────────────────────────
[Checklist]
VERDICT: Ready to present.
---
COMPLETE
════════
[Summary of what was built, files created/modified, how to run/test it]
Branch: superpowers/[task-slug] — merge when ready.
```
---
## CLAUDE.md Installation Text
After activating Superpowers for the session, provide the user with the exact text to add to their `CLAUDE.md` to make it permanent:
````
```
## Superpowers Framework
This framework is always active for coding tasks in this project.
### Stage 1 — Plan
Before writing any code: produce a written plan including task restatement, approach, files to create/modify, edge cases to handle, and assumptions. Wait for explicit user confirmation before proceeding.
### Stage 2 — Isolate
Work on a feature branch (superpowers/[task-slug]) or clearly flagged working copy. Nothing is considered shipped until the user approves after Stage 4.
### Stage 3 — Test First
Write tests before writing the implementation. Tests should fail before implementation, pass after. If no test setup exists, offer to create one or produce a manual verification checklist.
### Stage 4 — Double Review
After completing code, run two reviews before presenting:
- Review 1: Does this match what was asked for? Check against original request and plan.
- Review 2: Is this good code? Check for bugs, missing error handling, security issues, readability.
Fix any issues found. Present the corrected version. Show the review checklist.
```
````
Tell the user: "Add this to your CLAUDE.md and Superpowers will be active permanently for this project."
---
## Quality Checks
- [ ] Stage 1 plan was shown and user explicitly confirmed before any code was written
- [ ] Plan includes: task restatement, approach, files to modify, edge cases in scope, edge cases out of scope, assumptions
- [ ] Ambiguities in the original request were flagged in the plan (not silently assumed)
- [ ] Stage 2 isolation: a feature branch was created (or flagged as working copy if no git)
- [ ] Stage 3 tests were written before implementation — not after
- [ ] Tests were run and confirmed to be failing before implementation started
- [ ] Stage 4 Review 1 checked against the original request — not just against the plan
- [ ] Stage 4 Review 2 checked for bugs, error handling, security, readability — all four
- [ ] Issues found in either review were fixed before presenting — not flagged as "things to fix later"
- [ ] Final output shows what was built, which files were changed, and how to run/test it
- [ ] CLAUDE.md installation text was offered after activation
---
## Anti-Patterns
- [ ] Do not proceed to Stage 2 without explicit user confirmation of the plan — coding before confirmation defeats the entire purpose of the planning stage
- [ ] Do not write tests after the implementation and call it "test-first" — tests must be written and confirmed failing before the implementation starts
- [ ] Do not skip the Double Review when time is tight — the review is most valuable precisely when speed is the priority, because that is when errors are most likely
- [ ] Do not expand scope during Stage 2 without surfacing it — silent scope expansion produces code the user did not approve and may not want
- [ ] Do not mark both reviews as clean without actually performing them — a rubber-stamp review produces false confidence and defeats the framework
## Example Trigger Phrases
- "Enable superpowers mode"
- "Activate superpowers"
- "Turn on superpowers for this session"
- "Use the superpowers framework"
- "Make sure you plan before coding"
- "I want you to review your work before showing me"
- "Write tests first this time"
- "Slow down and plan it out before you start building"
- "Work on a branch and show me a plan before touching anything"
@@ -1,114 +1,122 @@
---
name: code-review-checklist
description: "Generate a tailored code review checklist for any PR, language, or risk level. Use when asked to create a code review checklist, review guidelines, PR standards, or quality gates for a codebase. Produces a structured, prioritised checklist adapted to the specific language, PR type, and risk level."
description: "Generate a tailored code review checklist for any pull request based on the language, type of change, and risk level. Use when asked to review code, check a PR, review a pull request, or generate a code review checklist. Produces a focused checklist with language-specific checks, risk-level-appropriate depth, and a clear approve/request-changes recommendation."
---
# Code Review Checklist Skill
This skill generates a structured, prioritised code review checklist tailored to a specific PR, language, and risk level. It helps reviewers be thorough without being bureaucratic.
Produces a tailored code review checklist for a specific pull request — scaled to the language, type of change, and risk level. Not a generic template.
## Required Inputs
Ask the user for these if not provided:
- **Programming language(s)** (e.g. Python, TypeScript, Go, Java)
- **PR type** (new feature / bug fix / refactor / performance improvement / security patch / infrastructure change)
- **Risk level** (Low: internal tooling, Low traffic / Medium: user-facing feature / High: payment, auth, data pipeline, public API)
- **Team context** (optional: team size, seniority mix, any known recurring issues)
- **Language and framework** (e.g. TypeScript + React / Python + FastAPI / Go)
- **Type of change** (feature / bug fix / refactor / dependency upgrade / security patch / performance)
- **Risk level** (low / medium / high / critical)
- **PR description** (paste the description or link to the PR)
- **Code or diff** (optional — paste key changed files or a `git diff`; significantly improves checklist specificity)
- **Author context** (new starter / experienced / external contributor)
## Output Structure
### 1. Checklist Header
**PR:** [Title if provided]
**Language:** [Language]
**Type:** [PR Type]
**Risk Level:** [Low / Medium / High]
**Estimated review depth:** [Quick scan ~15 min / Standard ~30 min / Deep review ~60 min+]
## Output Format
---
### 2. The Checklist
# Code Review: [PR Title or Reference]
Organise into sections. Mark each item with a priority indicator:
- 🔴 **MUST** — Blocking. PR should not merge without this.
- 🟡 **SHOULD** — Important. Address before merge unless there's a good reason not to.
- 🟢 **CONSIDER** — Nice to have. Worth a comment but not blocking.
### 1. PR Overview
**Scope assessment:** [Small / Medium / Large / Too large — should be split]
**Recommended review depth:** [Skim / Standard / Deep dive]
**Estimated review time:** [e.g. 20-30 min; use 5 min per 50 lines of diff as a rough guide]
#### Section A: Correctness
- 🔴 Does the code do what the ticket/requirement describes?
- 🔴 Are edge cases handled? (nulls, empty arrays, zero values, max values)
- 🔴 Are error states handled and surfaced appropriately?
- 🟡 Does the happy path have adequate test coverage?
- 🟡 Are failure paths tested?
### 2. Correctness Checks
#### Section B: Security (scale with risk level — expand for High risk PRs)
- 🔴 [High risk only] Is user input sanitised before use in queries or commands?
- 🔴 [High risk only] Are auth/permission checks in place?
- 🟡 Are secrets/credentials committed anywhere? (check .env handling)
- 🟡 Are third-party dependencies known-safe versions?
Language-specific correctness checks — choose based on the language stated:
#### Section C: Performance
- 🟡 Are there N+1 query patterns in database calls?
- 🟡 Is there unnecessary work inside loops?
- 🟢 Are database queries indexed appropriately?
- 🟢 Is caching considered where appropriate?
**For TypeScript/JavaScript:**
- Type definitions match actual usage
- No implicit `any` in non-test code
- Async/await used consistently; no unhandled promises
- Null/undefined handling is explicit
#### Section D: Readability & Maintainability
- 🟡 Are function and variable names clear without needing a comment to explain them?
- 🟡 Are complex logic blocks explained with inline comments?
- 🟢 Is the code consistent with existing patterns in the codebase?
- 🟢 Are there any magic numbers that should be named constants?
**For Python:**
- Type hints present on public functions
- Exception handling is specific (no bare except)
- Resources are closed (context managers, with blocks)
#### Section E: Language-Specific Checks
[Populate this section based on the specified language. Examples below:]
**For Go:**
- Errors are handled or explicitly ignored with a comment
- Context propagation is correct
- Goroutine lifetimes are bounded
**Python:**
- 🟡 Are type hints used on function signatures?
- 🟡 Are exceptions caught specifically (not bare `except:`)?
- 🟢 Does it follow PEP 8 (or the team's linter config)?
[Include only the section matching the stated language]
**TypeScript/JavaScript:**
- 🔴 Are there any `any` types that should be properly typed?
- 🟡 Are async/await patterns used consistently (no mixed Promise.then chains)?
- 🟢 Are there unnecessary re-renders in React components?
### 3. Change-Type-Specific Checks
**Go:**
- 🔴 Are errors checked (not ignored with `_`)?
- 🟡 Are goroutines properly managed to prevent leaks?
- 🟢 Are exported functions documented?
**For bug fixes:**
- A test exists that would have caught this bug
- The fix addresses root cause, not symptom
- Related code paths checked for the same issue
#### Section F: PR Hygiene
- 🟡 Is the PR a reasonable size? (>500 lines diff suggests it should be split)
- 🟡 Does the PR description explain *why*, not just *what*?
- 🟢 Are there linked tickets or context in the PR description?
- 🟢 Are migration scripts or deployment notes included if needed?
**For features:**
- Acceptance criteria met
- Edge cases handled (empty, large, concurrent)
- Error paths tested, not just happy path
- Telemetry/logging added for debugging
---
**For refactors:**
- Behaviour unchanged (tests still pass)
- No scope creep — refactor only
- Complexity reduced, not just moved
### 3. Risk-Specific Additions
**For dependency upgrades:**
- Breaking changes reviewed
- Security advisories checked
- License compatibility verified
For **High risk** PRs, always add:
- 🔴 Has this been tested in a staging environment?
- 🔴 Is there a rollback plan?
- 🔴 Has a second reviewer been assigned?
[Include only the section matching the stated change type]
For **Infrastructure / DB changes**, always add:
- 🔴 Are migrations backward-compatible?
- 🔴 Has the migration been tested against production data volume?
### 4. Risk-Appropriate Checks
**Low risk:** basic correctness, style conventions, test coverage
**Medium risk:** above + rollback plan, monitoring updates, performance considerations
**High risk:** above + security implications, data migration safety, feature flag/gradual rollout
**Critical risk:** above + staging validation plan, incident response plan, post-deploy verification checklist
### 5. Testing Adequacy
- Unit tests cover new logic
- Integration tests cover the contract changes
- Edge cases tested
- Failure modes tested
- Performance tests if performance-sensitive
### 6. Review Decision Framework
**Approve if:** [2-3 specific conditions based on this PR]
**Request changes if:** [Specific blockers]
**Comment (non-blocking) if:** [Items worth discussing but not blocking merge]
### 7. Common Pitfalls for This Change Type
Based on the change type and language, flag 2-3 things reviewers typically miss for this combination.
---
## Quality Checks
- [ ] Checklist is tailored to the stated language (not generic)
- [ ] Change-type-specific section is included
- [ ] Risk-appropriate depth matches stated risk level
- [ ] Decision framework includes at least one named blocking condition and one named non-blocking comment condition
- [ ] Common pitfalls are specific to the stated language + change-type combo (not generic advice like "watch out for bugs")
- [ ] Checklist is tailored to the specified language (not generic)
- [ ] Risk level is reflected in the MUST vs SHOULD balance
- [ ] Language-specific section covers the most common issues for that language
- [ ] PR hygiene section is always present
- [ ] High-risk additions are included when risk level = High
## Anti-Patterns
## Example Trigger Phrases
- [ ] Do not generate a generic checklist that ignores the stated language — a Python checklist and a Go checklist have fundamentally different correctness concerns
- [ ] Do not treat "looks fine" as a valid review outcome — the checklist exists to surface specific concerns, not validate a superficial read
- [ ] Do not scope a "high risk" review the same as a "low risk" review — depth must scale with the stated risk level
- [ ] Do not flag every stylistic preference as a blocking issue — distinguish between blocking correctness issues and non-blocking comments
- [ ] Do not skip the "common pitfalls" section for the stated language and change-type combination — this is where the most valuable knowledge lives
- "Generate a code review checklist for a Python bug fix PR"
- "Give me a review checklist for a high-risk TypeScript auth change"
- "What should I check in this Go PR?"
- "Create PR review standards for our team"
## Usage Examples
- "Generate a code review checklist for [PR description]"
- "What should I check in this pull request?"
- "Give me a code review checklist for a [language] [change type]"
- "Review checklist for a high-risk PR in [language]"
@@ -0,0 +1,248 @@
---
name: context-mode
description: "Activate output filtering, session logging, and auto-resume to keep Claude Code sessions productive across resets. Use when starting a long or complex coding session, when previous sessions lost context mid-task, or when you need Claude to resume exactly where it left off after a reset. Installs a session.log at project root, filters verbose command output to preserve context, and automatically resumes in-progress tasks after any Claude reset."
---
# Context Mode Skill
Fix the two session killers that end most Claude Code sessions in under 30 minutes: context bloat from raw command output, and memory loss after a reset.
Context Mode runs three systems simultaneously to keep sessions alive:
- **Output Filtering** — strips verbose command output before it enters context
- **Session Log** — writes a running log of everything that happened
- **Auto-Resume** — reads the log on reset and picks up exactly where you left off
> **Credit:** Inspired by a skill from Nate Herk's YouTube channel — adapted and extended for this library.
---
## Required Inputs
No inputs required. Context Mode activates on command.
Optional: user can specify a custom log file path if they don't want `session.log` in the project root.
---
## How Context Mode Works
### Part 1 — Output Filtering
The problem: every time Claude Code runs a command, the full raw output enters the context window. A single `npm install` can dump hundreds of lines. A test suite run? Thousands. Within 30 minutes, the context is full of noise and Claude resets.
The fix: before any command output enters context, filter it to the useful summary only.
**What gets kept:**
- Last 10 lines of stdout
- Every line containing `error`, `warn`, `fail`, `exception`, `traceback`, or `fatal` (case-insensitive)
- The exit code
- A one-line summary of what the command did and whether it succeeded
**What gets discarded:**
- Middle section of long stdout (replaced with `[... N lines of output truncated ...]`)
- Progress bars, download indicators, verbose install logs
- Repeated identical lines (deduplicated)
**Filtering summary format:**
```
COMMAND: [command run]
STATUS: [exit code — success / failed]
SUMMARY: [one sentence: what happened]
ERRORS: [any error/warn lines — or "none"]
TAIL: [last 10 lines of stdout]
```
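The filtering rules above can be sketched as a small function. This is an illustration only, under stated assumptions: `filter_output` is a hypothetical name, "repeated identical lines" is read as consecutive repeats, and the one-sentence SUMMARY field is left out because it is written by Claude rather than computed.

```python
import re

# Keywords taken from the "What gets kept" list above.
FLAGGED = re.compile(r"error|warn|fail|exception|traceback|fatal", re.IGNORECASE)


def filter_output(command, stdout, exit_code, tail_size=10):
    """Reduce raw command output to the fields of the filtering summary."""
    lines = stdout.splitlines()

    # Assumption: "repeated identical lines" means consecutive repeats.
    deduped = []
    for line in lines:
        if not deduped or deduped[-1] != line:
            deduped.append(line)

    tail = deduped[-tail_size:]
    return {
        "command": command,
        "exit_code": exit_code,
        "status": "success" if exit_code == 0 else "failed",
        # Every flagged line is kept, wherever it appears in the output.
        "errors": [line.strip() for line in deduped if FLAGGED.search(line)],
        # Number of lines dropped from the middle, for the truncation marker.
        "truncated": len(deduped) - len(tail),
        "tail": tail,
    }


raw = "\n".join(["fetching package..."] * 200 + ["WARN deprecated dependency", "added 312 packages"])
result = filter_output("npm install", raw, 0)
print(result["status"], result["errors"])
```

Returning a dictionary rather than a formatted string keeps the sketch independent of the exact summary layout.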
---
### Part 2 — Session Log
Claude maintains a running log file at `[project root]/session.log`. This file is written after every significant action and is the source of truth for resuming after a reset.
**Session log format:**
```
SESSION LOG
===========
Started: [timestamp]
Branch: [current git branch]
Directory: [working directory]
FILES EDITED
────────────
[timestamp] [file path] — [one-line description of what changed]
COMMANDS RUN
────────────
[timestamp] [command] — [outcome: success / failed — brief reason]
TASKS IN PROGRESS
─────────────────
[ ] [Task description — what's been done so far and what's left]
[x] [Completed task]
LAST USER PROMPT
────────────────
[The most recent instruction from the user, verbatim]
LAST ACTION TAKEN
─────────────────
[What Claude did last, in one sentence]
```
**Log update rules:**
- Write to `session.log` after every file edit
- Write to `session.log` after every command run
- Update "Tasks in Progress" when a task is started, progressed, or completed
- Always overwrite "Last User Prompt" and "Last Action Taken" with the current values — don't append, replace
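One way to satisfy both the append rules and the replace rules is to hold the state in memory and rewrite the whole file on every update. A minimal sketch, assuming a simplified rendering of the log format (no header block, no task list); the class `SessionLog` and its method names are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import List


def _stamp() -> str:
    return datetime.now().isoformat(timespec="seconds")


@dataclass
class SessionLog:
    path: Path
    files_edited: List[str] = field(default_factory=list)
    commands_run: List[str] = field(default_factory=list)
    last_user_prompt: str = ""
    last_action: str = ""

    def record_edit(self, file_path: str, description: str) -> None:
        self.files_edited.append(f"{_stamp()} {file_path}: {description}")  # history: append
        self.last_action = f"Edited {file_path}"  # single value: replace
        self._write()

    def record_command(self, command: str, outcome: str) -> None:
        self.commands_run.append(f"{_stamp()} {command}: {outcome}")
        self.last_action = f"Ran {command}"
        self._write()

    def set_prompt(self, prompt: str) -> None:
        self.last_user_prompt = prompt  # overwrite, never append
        self._write()

    def _write(self) -> None:
        # Rewriting the full file keeps the single-value sections from accumulating.
        sections = [
            "SESSION LOG",
            "FILES EDITED", *self.files_edited,
            "COMMANDS RUN", *self.commands_run,
            "LAST USER PROMPT", self.last_user_prompt,
            "LAST ACTION TAKEN", self.last_action,
        ]
        self.path.write_text("\n".join(sections) + "\n", encoding="utf-8")
```

Writing after every call matches the rule that the log is updated after every file edit and every command run, at the cost of one small file write per action.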
---
### Part 3 — Resume on Reset
When a new Claude session starts, the first action is:
1. Check for `session.log` in the project root
2. If found, read it and announce the resume:
```
Resuming session.
Branch: [branch]
Last working on: [last task in progress]
Files edited: [list from session log]
Tasks pending: [incomplete tasks]
Last prompt: "[last user prompt]"
Continuing from where we left off.
```
3. Continue with the next logical step — don't ask "what should I do?" — check the task list and carry on
If no `session.log` exists, start fresh and initialise the log.
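The resume check can be sketched as below. It assumes the log sits at `session.log` under the project root and that open tasks are lines starting with `[ ]`, as in the log format above; `pending_tasks` and `load_resume_state` are hypothetical names.

```python
from pathlib import Path
from typing import List, Optional


def pending_tasks(log_text: str) -> List[str]:
    """Return descriptions of unchecked '[ ]' tasks, in log order."""
    tasks = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[ ]"):
            tasks.append(stripped[3:].strip())
    return tasks


def load_resume_state(project_root: Path) -> Optional[dict]:
    """Return resume info if session.log exists, else None (start fresh)."""
    log_path = project_root / "session.log"
    if not log_path.is_file():
        return None
    text = log_path.read_text(encoding="utf-8")
    return {"log_path": log_path, "pending": pending_tasks(text)}
```

A `None` result corresponds to the "start fresh and initialise the log" branch; any other result feeds the resume announcement.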
---
## Activation Response
When the user triggers Context Mode, respond with:
```
Context Mode active.
Session log initialised at: [absolute path to session.log]
Output filtering: enabled
Auto-resume: enabled
I'll maintain your session state across resets. Long sessions won't lose context.
```
Then immediately initialise `session.log` with the current timestamp, branch, and directory.
---
## Output Structure
### On activation
```
Context Mode active.
Session log initialised at: [path]
Output filtering: enabled
Auto-resume: enabled
I'll maintain your session state across resets. Long sessions won't lose context.
```
### On command execution (filtered output format)
```
COMMAND: npm test
STATUS: exit 1 — failed
SUMMARY: 47 tests passed, 3 failed in auth.test.ts
ERRORS: Error: Expected 200, received 401 (line 84)
Error: Token not found in response (line 112)
TAIL:
✓ login with valid credentials (23ms)
✓ logout clears session (11ms)
✗ refresh token after expiry
...
```
### On reset / new session (resume announcement)
```
Resuming session.
Branch: feature/auth-refresh
Last working on: Fixing token refresh logic in auth.service.ts
Files edited: src/auth/auth.service.ts, src/auth/auth.test.ts
Tasks pending: [ ] Fix failing test on line 112
[ ] Run full test suite once fix is applied
Last prompt: "The refresh token test is still failing — look at the 401 handling"
Continuing from where we left off.
```
---
## CLAUDE.md Installation Text
After activating Context Mode for the session, provide the user with the exact text to add to their `CLAUDE.md` to make it permanent across all sessions:
````
```
## Context Mode
Context Mode is always active in this project.
### Output Filtering
Before any command output enters context, filter it to:
- Last 10 lines of stdout
- Any lines containing: error, warn, fail, exception, traceback, fatal (case-insensitive)
- Exit code
- One-line summary of what the command did
Use this format for filtered output:
COMMAND: [command]
STATUS: [exit code — success/failed]
SUMMARY: [one sentence]
ERRORS: [error lines or "none"]
TAIL: [last 10 lines]
### Session Log
Maintain a running session log at ./session.log. Write to it after every file edit and every command run. Track: files edited, commands run, tasks in progress, last user prompt, last action taken. Format defined in Context Mode skill.
### Auto-Resume
At the start of every new session, check for ./session.log. If it exists, read it and announce the resume state. Continue from the last task in progress without asking for instructions.
```
````
Tell the user: "Add this to your CLAUDE.md and Context Mode will be active permanently for this project — even after you close and reopen the session."
---
## Quality Checks
- [ ] `session.log` was initialised immediately on activation (not deferred)
- [ ] Log path shown to user is the absolute path, not relative
- [ ] Output filtering is applied on the very next command run — not just announced
- [ ] Filtered output format includes: command, status, summary, errors, and tail — all five fields
- [ ] Session log tracks all five categories: files edited, commands run, tasks in progress, last user prompt, last action taken
- [ ] Resume announcement reads the actual log contents — not a generic template
- [ ] On resume, Claude continues the work without prompting the user for instructions
- [ ] CLAUDE.md installation text was offered after activation
- [ ] Log update rule is clear: "Last User Prompt" and "Last Action Taken" replace previous values, not append
---
## Example Trigger Phrases
- "Enable context mode"
- "Turn on context mode for this session"
- "Activate long session mode"
- "I keep losing context — fix it"
- "Set up session logging"
- "Keep track of what you've done so you can resume after a reset"
- "Enable output filtering to save context"
- "Set up auto-resume so we don't lose our place"
@@ -0,0 +1,462 @@
---
name: database-migration-plan
description: "Write a safe, zero-downtime database migration plan for a schema change. Use when asked to plan a database migration, design a zero-downtime schema change, document an expand/contract migration, produce a rollback procedure for a database change, or coordinate a database schema update with a deployment. Produces a structured migration plan covering migration objectives, backward compatibility analysis, expand/contract phase breakdown, exact SQL, rollback steps per phase, data validation queries, and a deployment runbook."
---
# Database Migration Plan Skill
Produce a complete, safe database migration plan for a schema change. A migration plan is not just the SQL — it is a coordinated sequence of steps that ensures the application stays available, data stays consistent, and every step can be rolled back independently.
The expand/contract pattern is the default approach: expand the schema to support both old and new states, migrate the application, then contract to remove the old state. Never combine schema changes and data backfills in a single migration that runs during deployment.
## Required Inputs
Ask for these if not already provided:
- **Current schema state** — the DDL or description of the table(s) as they are now
- **Target schema state** — the DDL or description of what the table(s) should look like after migration
- **Migration reason** — why this change is being made (new feature, performance fix, normalization, compliance)
- **Database engine** — PostgreSQL, MySQL, SQLite, CockroachDB, etc.
- **Estimated data volume** — approximate number of rows in affected tables
- **Deployment constraints** — is any downtime allowed? What is the expected traffic level during migration? Are there multiple app instances running?
- **Rollback window** — how long after deploy can the team roll back before the migration becomes irreversible?
## Output Format
---
# Database Migration Plan: [Migration Name]
**Service:** [Name] | **Team:** [Team name]
**Author:** [Name] | **Reviewed by:** [Name / DBA]
**Date:** [Date] | **Target deploy date:** [Date]
**Database engine:** [PostgreSQL X.X / MySQL X.X]
**Ticket:** [JIRA-XXX]
---
## 1. Migration Overview
**What is changing:**
[1-2 sentences: the specific schema change, e.g. "Adding a non-nullable `organisation_id` column to the `users` table and backfilling it from the `accounts` table."]
**Why:**
[1-2 sentences: the business or technical reason driving the change.]
**Migration type:** [Additive only / Additive + backfill / Column rename / Column type change / Table restructure / Index change]
**Zero-downtime:** [Yes — using expand/contract / No — requires maintenance window — state duration]
**Estimated migration duration:**
- Expand phase: [~X minutes]
- Data backfill: [~X minutes/hours — based on X rows at Y rows/second]
- Contract phase: [~X minutes after app version deployed]
---
## 2. Backward Compatibility Analysis
Before writing a single line of SQL, assess whether each change is backward compatible with the currently deployed application code.
| Change | Backward compatible? | Risk | Notes |
|---|---|---|---|
| [e.g. Add nullable column `org_id`] | Yes | Low | Old app ignores new column |
| [e.g. Backfill `org_id`] | Yes | Medium | Old app unaffected; new app reads backfilled values |
| [e.g. Add NOT NULL constraint to `org_id`] | **No** | High | Old app that inserts without `org_id` will fail |
| [e.g. Drop old column `account_id`] | **No** | High | Old app that reads `account_id` will fail |
| [e.g. Add index on `org_id`] | Yes | Low | Additive; no breaking change |
| [e.g. Rename column] | **No** | High | Never rename in one step; use expand/contract |
**Summary:** [e.g. "This migration requires the expand/contract pattern across 3 deployment phases because steps 3 and 4 are not backward compatible."]
---
## 3. Expand/Contract Phases
### Phase Overview
```
Phase 1 — EXPAND
Deploy migration: add new column (nullable), create new indexes
Old app: continues to work (ignores new column)
New app: not yet deployed
Duration: [~X min] | Rollback: trivial — drop new column
Phase 2 — BACKFILL + DUAL-WRITE
Deploy app update: writes to both old and new columns
Run backfill: populate new column for existing rows
Validate: confirm 100% of rows have non-null new column
Duration: [~X hours depending on data volume]
Rollback: deploy previous app version; new column is still nullable
Phase 3 — ENFORCE + SWITCH
Deploy migration: add NOT NULL constraint (the old column is dropped in Phase 4)
Deploy app update: reads only from new column
Duration: [~X min] | Rollback: requires forward-fix (constraint must be dropped first)
Phase 4 — CONTRACT (optional cleanup)
Deploy migration: drop deprecated columns, rename if needed
Final state matches target schema
Rollback: not recommended — contract changes are destructive
```
---
### Phase 1 — Expand Schema
**Goal:** Add the new column and structures without breaking the existing application.
**Deploy order:** Run migration first, then (optionally) deploy app.
**Application state:** Old app running; no app changes required yet.
```sql
-- Migration: 001_add_org_id_to_users.sql
BEGIN;
-- Add nullable column (safe: old app ignores it)
ALTER TABLE users
  ADD COLUMN org_id UUID NULL
  REFERENCES organisations(id) ON DELETE RESTRICT;
COMMIT;
-- Add index NOW, not in Phase 3: building an index on a large table during Phase 3 is risky
-- CONCURRENTLY does not block reads or writes, so it is safe on live traffic
-- It cannot run inside a transaction block, so it runs after COMMIT as its own statement
CREATE INDEX CONCURRENTLY users_org_id_idx ON users (org_id);
```
**Validation after Phase 1:**
```sql
-- Confirm column exists and is nullable
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'org_id';
-- Expected: is_nullable = 'YES'
-- Confirm index exists
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users' AND indexname = 'users_org_id_idx';
```
**Rollback (Phase 1 only):**
```sql
-- DROP INDEX CONCURRENTLY cannot run inside a transaction block, so run it on its own first
DROP INDEX CONCURRENTLY IF EXISTS users_org_id_idx;
ALTER TABLE users DROP COLUMN IF EXISTS org_id;
```
---
### Phase 2 — Backfill Existing Data
**Goal:** Populate the new column for all existing rows before enforcing NOT NULL.
**When to run:** After Phase 1 is live and stable. Can be run as a background job or a one-time script.
**Application state:** Deploy app version that dual-writes to both old and new columns.
**App code change required:**
```
// All INSERT and UPDATE operations must now set BOTH account_id (old) and org_id (new)
// until Phase 3 is complete. This ensures new rows are populated during the backfill window.
```
**Backfill script — batch processing:**
```sql
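-- Run in batches to avoid long-held locks. Adjust batch size based on table size and DB load.
-- Target: no single batch takes more than 5 seconds.
-- PostgreSQL's UPDATE has no LIMIT clause, so each batch is picked by key in a subquery.
-- Assumes users has a primary key column named id.
DO $$
DECLARE
  batch_size INT := 1000;
  affected INT;
BEGIN
  LOOP
    UPDATE users
    SET org_id = accounts.organisation_id
    FROM accounts
    WHERE users.account_id = accounts.id
      AND users.id IN (
        SELECT u.id
        FROM users u
        JOIN accounts a ON a.id = u.account_id
        WHERE u.org_id IS NULL
          AND a.organisation_id IS NOT NULL
        LIMIT batch_size
      );
    GET DIAGNOSTICS affected = ROW_COUNT;
    EXIT WHEN affected = 0;
    -- Commit each batch so its row locks are released
    -- (PostgreSQL 11+; run this DO block outside an explicit transaction)
    COMMIT;
    -- Pause between batches to avoid saturating I/O
    PERFORM pg_sleep(0.1);
  END LOOP;
END $$;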
```
**Monitoring during backfill:**
```sql
-- Check progress — run periodically during backfill
SELECT
COUNT(*) FILTER (WHERE org_id IS NOT NULL) AS backfilled,
COUNT(*) FILTER (WHERE org_id IS NULL) AS remaining,
COUNT(*) AS total,
ROUND(
100.0 * COUNT(*) FILTER (WHERE org_id IS NOT NULL) / COUNT(*), 2
) AS pct_complete
FROM users;
```
**Backfill completion validation:**
```sql
-- Must return 0 before proceeding to Phase 3
SELECT COUNT(*) AS unbackfilled_rows
FROM users
WHERE org_id IS NULL;
-- Confirm no new rows written without org_id (dual-write working)
SELECT COUNT(*) AS recent_missing
FROM users
WHERE org_id IS NULL
AND created_at > now() - INTERVAL '1 hour';
```
**Rollback (Phase 2 — app only):**
- Deploy previous app version (single-write to old column)
- `org_id` column remains nullable; no data is lost
- Backfilled values remain; harmless
---
### Phase 3 — Enforce Constraints
**Goal:** Add NOT NULL constraint and remove dependency on the old column.
**Prerequisites:** Phase 2 backfill must be 100% complete (zero rows with `org_id IS NULL`).
**Deploy order:** Run migration, then deploy app version that reads only from `org_id`.
**PostgreSQL — use NOT VALID + VALIDATE for large tables:**
```sql
-- Step 1: Add constraint as NOT VALID (no full table scan — instant)
ALTER TABLE users
ADD CONSTRAINT users_org_id_not_null
CHECK (org_id IS NOT NULL) NOT VALID;
-- Step 2: VALIDATE CONSTRAINT (takes a SHARE UPDATE EXCLUSIVE lock — allows reads and writes)
-- Run this separately, as it can take minutes on large tables
ALTER TABLE users
VALIDATE CONSTRAINT users_org_id_not_null;
-- Step 3: Once validated, convert to actual NOT NULL
-- (PostgreSQL 12+ trusts the validated check constraint and skips the table scan, so this is near-instant)
ALTER TABLE users
ALTER COLUMN org_id SET NOT NULL;
-- Step 4: Drop the now-redundant check constraint
ALTER TABLE users
DROP CONSTRAINT users_org_id_not_null;
```
**Validation after Phase 3:**
```sql
-- Confirm NOT NULL is enforced
SELECT column_name, is_nullable
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'org_id';
-- Expected: is_nullable = 'NO'
-- Test that insert without org_id fails (run in a transaction and roll back)
BEGIN;
INSERT INTO users (email) VALUES ('test@example.com');
-- Expected: ERROR: null value in column "org_id" violates not-null constraint
ROLLBACK;
```
**Rollback (Phase 3):**
```sql
-- Drop the NOT NULL constraint (restores nullable state)
ALTER TABLE users ALTER COLUMN org_id DROP NOT NULL;
-- Then deploy previous app version (dual-write)
-- Note: Once app code reading the new column is live, rolling back the constraint
-- without rolling back the app will cause issues — plan this carefully.
```
---
### Phase 4 — Contract (Remove Old Column)
**Goal:** Remove the old column once the app no longer references it.
**Prerequisites:** Phase 3 fully deployed and stable for at least [X days/hours rollback window].
**Warning:** This phase is destructive — the old column's data is permanently deleted.
```sql
BEGIN;
-- Drop the old column
ALTER TABLE users DROP COLUMN account_id;
-- Drop any indexes that referenced the old column
DROP INDEX IF EXISTS users_account_id_idx;
COMMIT;
```
**Pre-drop validation:**
```sql
-- Confirm no application queries still reference the old column
-- (Check this in code review and via a search of the codebase before running)
-- grep -r "account_id" app/
-- Confirm the column is safe to drop
SELECT COUNT(*) FROM users WHERE account_id IS NOT NULL;
-- Should be 0 (or irrelevant once new column is canonical)
```
**Rollback:** Not straightforward — dropped column data cannot be recovered. Only proceed to Phase 4 after the rollback window has passed and the change is confirmed stable.
---
## 4. Data Validation Plan
Run these queries before and after the full migration to confirm data integrity.
**Pre-migration baseline:**
```sql
-- Record these values before any migration step
SELECT COUNT(*) AS total_users FROM users;
SELECT COUNT(*) AS total_orgs FROM organisations;
SELECT MIN(created_at), MAX(created_at) FROM users;
-- Check for any anomalies in the source data before backfill
SELECT COUNT(*) AS users_without_account
FROM users WHERE account_id IS NULL;
```
**Post-backfill integrity check:**
```sql
-- All users have an org that exists
SELECT COUNT(*) AS orphaned_org_refs
FROM users u
WHERE u.org_id IS NOT NULL
AND NOT EXISTS (
SELECT 1 FROM organisations o WHERE o.id = u.org_id
);
-- Expected: 0
-- org_id matches expected value from source column
SELECT COUNT(*) AS mismatched_backfill
FROM users u
JOIN accounts a ON u.account_id = a.id
WHERE u.org_id IS DISTINCT FROM a.organisation_id; -- unlike !=, this also catches rows where one side is NULL
-- Expected: 0
-- Row count unchanged (no rows created or deleted by migration)
SELECT COUNT(*) AS total_users_after FROM users;
-- Must match pre-migration baseline
```
**Post-contract final check:**
```sql
-- Old column is gone
SELECT COUNT(*) FROM information_schema.columns
WHERE table_schema = current_schema() AND table_name = 'users' AND column_name = 'account_id';
-- Expected: 0
-- New column is NOT NULL
SELECT is_nullable FROM information_schema.columns
WHERE table_schema = current_schema() AND table_name = 'users' AND column_name = 'org_id';
-- Expected: NO
```
---
## 5. Performance Impact Assessment
| Step | Lock type | Lock duration | Traffic impact |
|---|---|---|---|
| Add nullable column | ACCESS EXCLUSIVE | Milliseconds | Negligible |
| CREATE INDEX CONCURRENTLY | SHARE UPDATE EXCLUSIVE | Minutes (proportional to table size) | Reads and writes continue |
| Batch backfill | Row-level locks only | <5s per batch | Low if batches are small |
| ADD CONSTRAINT NOT VALID | ACCESS EXCLUSIVE | Milliseconds | Negligible |
| VALIDATE CONSTRAINT | SHARE UPDATE EXCLUSIVE | Minutes | Reads and writes continue |
| ALTER COLUMN SET NOT NULL | ACCESS EXCLUSIVE | Milliseconds (if check constraint validated) | Negligible |
| DROP COLUMN | ACCESS EXCLUSIVE | Milliseconds | Negligible |
**Expected load increase during backfill:**
- DB CPU: [estimated % increase during batch writes]
- DB I/O: [estimated increase]
- Monitoring threshold to pause backfill: [e.g. DB CPU > 80% for >2 minutes]
**Backfill rate estimate:**
- Table size: [X million rows]
- Batch size: [1000 rows]
- Pause between batches: [100ms]
- Estimated total duration: [X hours at Y rows/second]
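The duration estimate above is simple arithmetic, sketched here in Python. The function name and the `batch_seconds` figure are illustrative assumptions; the per-batch time should be measured on staging rather than guessed.

```python
import math


def estimate_backfill(total_rows: int, batch_size: int,
                      batch_seconds: float, pause_seconds: float) -> dict:
    """Estimate backfill duration from table size and batch settings.

    batch_seconds is an assumed average execution time per batch; measure it
    on a staging database with production-scale data before relying on it.
    """
    batches = math.ceil(total_rows / batch_size)
    # Every batch is followed by a pause except the last one.
    total_seconds = batches * batch_seconds + max(batches - 1, 0) * pause_seconds
    return {
        "batches": batches,
        "hours": total_seconds / 3600,
        "rows_per_second": total_rows / total_seconds if total_seconds else 0.0,
    }


# Hypothetical inputs: 10M rows, 1000-row batches, 50 ms per batch, 100 ms pause.
estimate = estimate_backfill(10_000_000, 1000, 0.05, 0.1)
print(estimate)
```

With these made-up inputs the pause dominates the run time, which suggests tuning the pause before the batch size when the estimate is too long.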
---
## 6. Deployment Runbook
Follow this checklist on the day of migration. Mark each step as done before proceeding.
**Pre-migration (day before):**
- [ ] DBA / tech lead has reviewed the migration plan
- [ ] Performance impact assessed; monitoring dashboards ready
- [ ] Backfill script tested on a staging DB with production-scale data
- [ ] Rollback procedure tested on staging
- [ ] On-call engineer briefed; Slack channel [#db-migrations] set up for coordination
- [ ] Maintenance window scheduled (if required)
**Phase 1 — Expand (T+0):**
- [ ] Take a manual DB snapshot / verify automated backup is recent
- [ ] Run `001_expand_add_org_id.sql` on production
- [ ] Run Phase 1 validation queries — confirm pass
- [ ] Deploy app version with dual-write
- [ ] Monitor error rate for [10 minutes]
**Phase 2 — Backfill (T+[X hours]):**
- [ ] Confirm Phase 1 has been stable for [X hours]
- [ ] Start backfill script in a screen/tmux session
- [ ] Monitor progress via backfill progress query every [5 minutes]
- [ ] Monitor DB CPU and I/O — pause if thresholds exceeded
- [ ] Run completion validation — confirm 0 unbackfilled rows
- [ ] Run integrity checks — confirm 0 orphaned refs, 0 mismatches
**Phase 3 — Enforce (T+[X days]):**
- [ ] Confirm backfill 100% complete and stable for [X hours]
- [ ] Add NOT VALID constraint
- [ ] Run VALIDATE CONSTRAINT (monitor duration and lock waits)
- [ ] Alter column to NOT NULL
- [ ] Run Phase 3 validation queries
- [ ] Deploy app version reading only from new column
- [ ] Monitor error rate for [30 minutes]
**Phase 4 — Contract (T+[X days after rollback window]):**
- [ ] Confirm rollback window has passed — no incidents, no rollback needed
- [ ] Search codebase for references to old column — confirm zero
- [ ] Run DROP COLUMN migration
- [ ] Run final integrity checks
- [ ] Close migration ticket; update schema documentation
---
## Quality Checks
- [ ] Every migration phase has an independent rollback procedure — no phase assumes the next one has run
- [ ] Batch backfill script includes a pause between batches to avoid saturating I/O
- [ ] NOT NULL constraints use the NOT VALID + VALIDATE pattern on tables with >100k rows
- [ ] The app dual-write period is explicitly defined — old column writes are not dropped until Phase 3 is deployed
- [ ] Data validation queries include a row count check to confirm no data loss
- [ ] Lock types are identified for every DDL statement — no "should be fine" assumptions
- [ ] The deployment runbook names who runs each step, not just what to run
- [ ] Phase 4 (contract) is explicitly gated on the rollback window passing — not run on the same day as Phase 3
## Anti-Patterns
- [ ] Do not combine the expand and contract phases into a single deployment — they must be separated by a deployment cycle
- [ ] Do not run DDL changes without first testing on a production-sized data clone
- [ ] Do not skip the NOT VALID + VALIDATE pattern for constraint additions on large tables — it causes full table locks
- [ ] Do not define a rollback as "restore from backup" — each phase must have an explicit, fast rollback procedure
- [ ] Do not omit dual-write logic during the transition period — removing the old column before all writers are updated causes data loss
@@ -0,0 +1,364 @@
---
name: database-schema-design
description: "Document or design a database schema with entity relationships, table definitions, constraints, indexes, and access patterns. Use when asked to design a database, document an existing schema, model entities and relationships, define table structures, plan an index strategy, or produce a data model for review. Produces a structured schema document covering an ER diagram, table DDL definitions, index strategy, access pattern analysis, normalization decisions, and migration notes."
---
# Database Schema Design Skill
Produce a complete database schema design document for a given domain. A schema document is not just a list of tables — it is a record of decisions: what was modelled, how entities relate, which queries the schema is optimised for, and what trade-offs were made.
A good schema design document lets an engineer understand the data model, query it correctly, extend it safely, and write migrations without breaking things.
## Required Inputs
Ask for these if not already provided:
- **Domain description** — what the system does; what business objects are being modelled
- **Entities and relationships** — the main things in the domain and how they relate (e.g. "a User has many Orders; an Order has many OrderItems; an OrderItem references a Product")
- **Expected query patterns** — the most important read and write queries (e.g. "fetch all orders for a user, sorted by date"; "look up a product by SKU")
- **Database engine** — PostgreSQL, MySQL, SQLite, CockroachDB, etc. — this affects DDL syntax and available types
- **Expected data volume** — approximate row counts, growth rate, and any partitioning needs
- **Constraints** — any existing conventions, naming standards, or migration constraints to respect
## Output Format
---
# Database Schema Design: [Domain / Service Name]
**Service:** [Name] | **Team:** [Team name]
**Author:** [Name] | **Reviewed by:** [Name]
**Date:** [Date] | **Database engine:** [PostgreSQL X.X / MySQL X.X / etc.]
**Status:** [Draft / Reviewed / Approved]
---
## 1. Overview
[2-3 sentences describing the domain being modelled, the scope of this schema, and any key design philosophy (e.g. "this schema prioritises read performance for the customer-facing API over write simplicity", or "designed for eventual migration to multi-tenancy")]
**In scope:**
- [Entity or subsystem]
- [Entity or subsystem]
**Out of scope:**
- [e.g. Analytics / reporting tables — separate schema]
- [e.g. Audit log tables — covered in separate design doc]
---
## 2. Entity Relationship Diagram
```
┌───────────────────┐         ┌───────────────────────┐
│       users       │         │     organisations     │
│───────────────────│         │───────────────────────│
│ id (PK)           │    ┌───▶│ id (PK)               │
│ org_id (FK) ──────┼────┘    │ name                  │
│ email             │         │ plan                  │
│ display_name      │         │ created_at            │
│ created_at        │         └───────────────────────┘
│ updated_at        │
└─────────┬─────────┘
          │ 1
          │ N
┌─────────▼─────────┐         ┌───────────────────────┐
│     [table_a]     │         │       [table_b]       │
│───────────────────│         │───────────────────────│
│ id (PK)           │    N    │ id (PK)               │
│ user_id (FK) ─────┼────────▶│ [table_a]_id (FK)     │
│ [field]           │         │ [field]               │
│ [field]           │         │ [field]               │
│ created_at        │         │ created_at            │
└───────────────────┘         └───────────────────────┘
```
**Relationship summary:**
| Entity A | Relationship | Entity B | Notes |
|---|---|---|---|
| organisations | has many | users | An org can have many users |
| users | has many | [table_a] | Soft-deleted on user deletion |
| [table_a] | has many | [table_b] | Cascade delete |
| [table_b] | belongs to | [table_a] | Non-nullable FK |
| [table_c] | many-to-many (via [join_table]) | [table_d] | Join table with metadata |
---
## 3. Table Definitions
### `organisations`
[1 sentence describing what this table stores and its role in the domain.]
```sql
CREATE TABLE organisations (
    id          UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    name        VARCHAR(255) NOT NULL,
    slug        VARCHAR(100) NOT NULL UNIQUE,
    plan        VARCHAR(50)  NOT NULL DEFAULT 'free'
                CHECK (plan IN ('free', 'pro', 'enterprise')),
    settings    JSONB        NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ  NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ  NOT NULL DEFAULT now()
);
```
| Column | Type | Nullable | Default | Notes |
|---|---|---|---|---|
| id | UUID | No | gen_random_uuid() | Surrogate PK — UUID preferred over serial for distributed use |
| name | VARCHAR(255) | No | — | Display name; not unique |
| slug | VARCHAR(100) | No | — | URL-safe identifier; unique across all orgs |
| plan | VARCHAR(50) | No | 'free' | Constrained to known values via CHECK |
| settings | JSONB | No | {} | Flexible config; avoid for queryable fields |
| created_at | TIMESTAMPTZ | No | now() | Always use TIMESTAMPTZ, not TIMESTAMP |
| updated_at | TIMESTAMPTZ | No | now() | Updated via trigger (see below) |
---
### `users`
[1 sentence describing what this table stores.]
```sql
CREATE TABLE users (
    id              UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    org_id          UUID         NOT NULL REFERENCES organisations(id)
                    ON DELETE RESTRICT,
    email           VARCHAR(254) NOT NULL,
    display_name    VARCHAR(255) NOT NULL DEFAULT '',
    role            VARCHAR(50)  NOT NULL DEFAULT 'member'
                    CHECK (role IN ('owner', 'admin', 'member', 'viewer')),
    email_verified  BOOLEAN      NOT NULL DEFAULT false,
    deleted_at      TIMESTAMPTZ  NULL,
    created_at      TIMESTAMPTZ  NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ  NOT NULL DEFAULT now(),
    CONSTRAINT users_email_org_unique UNIQUE (email, org_id)
);
```
| Column | Type | Nullable | Default | Notes |
|---|---|---|---|---|
| id | UUID | No | gen_random_uuid() | — |
| org_id | UUID | No | — | FK to organisations; RESTRICT prevents orphaning |
| email | VARCHAR(254) | No | — | RFC 5321 max length; unique per org (not globally) |
| role | VARCHAR(50) | No | 'member' | Application-level RBAC |
| deleted_at | TIMESTAMPTZ | Yes | NULL | Soft delete; NULL = active |
**Soft delete policy:** Rows with `deleted_at IS NOT NULL` are considered deleted. All application queries MUST filter `WHERE deleted_at IS NULL` unless explicitly fetching deleted records. Use a view or ORM scope to enforce this.
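One way to enforce the soft-delete filter is a view that hides deleted rows, sketched below with Python's built-in `sqlite3` module as a stand-in for the target engine. The view name `active_users` and the reduced column set are assumptions for illustration, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, deleted_at TEXT);
    -- Hypothetical view name; application reads go through it instead of the table.
    CREATE VIEW active_users AS
        SELECT id, email FROM users WHERE deleted_at IS NULL;
    INSERT INTO users (id, email, deleted_at) VALUES
        (1, 'a@example.com', NULL),
        (2, 'b@example.com', '2024-03-01T00:00:00Z');
    """
)

# Only the row with deleted_at IS NULL is visible through the view.
active = conn.execute("SELECT id, email FROM active_users ORDER BY id").fetchall()
print(active)
```

Reading through the view means a forgotten `WHERE deleted_at IS NULL` cannot leak deleted rows, while code that needs deleted records can still query the base table explicitly.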
---
### `[table_a]`
[Description of what this table models.]
```sql
CREATE TABLE [table_a] (
    id          UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID         NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    [field_1]   VARCHAR(255) NOT NULL,
    [field_2]   TEXT         NULL,
    [field_3]   INTEGER      NOT NULL DEFAULT 0 CHECK ([field_3] >= 0),
    status      VARCHAR(50)  NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'active', 'archived')),
    metadata    JSONB        NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ  NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ  NOT NULL DEFAULT now()
);
```
| Column | Type | Nullable | Notes |
|---|---|---|---|
| user_id | UUID | No | CASCADE delete — when user is deleted, their [table_a] rows are too |
| [field_1] | VARCHAR(255) | No | [Reason for length constraint] |
| status | VARCHAR(50) | No | State machine: pending → active → archived (no other transitions) |
| metadata | JSONB | No | [What is stored here and why it's not a typed column] |
---
### `[join_table]` *(Many-to-many)*
[Description of the relationship this table represents.]
```sql
CREATE TABLE [join_table] (
    [table_c]_id  UUID        NOT NULL REFERENCES [table_c](id) ON DELETE CASCADE,
    [table_d]_id  UUID        NOT NULL REFERENCES [table_d](id) ON DELETE CASCADE,
    granted_by    UUID        NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
    granted_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY ([table_c]_id, [table_d]_id)
);
```
**Why a composite PK:** The combination of `[table_c]_id + [table_d]_id` is the natural key — each association is unique and the primary key doubles as the uniqueness constraint without needing a separate index.
---
## 4. Index Strategy
For each table, define which indexes are created and why. Include the query they are designed to serve.
| Table | Index name | Columns | Type | Query served | Notes |
|---|---|---|---|---|---|
| users | `users_org_id_idx` | `(org_id)` | B-tree | `SELECT * FROM users WHERE org_id = $1` | FK lookup; required for join performance |
| users | `users_email_lower_idx` | `(lower(email))` | B-tree (functional) | `WHERE lower(email) = lower($1)` | Case-insensitive email lookup |
| users | `users_active_by_org_idx` | `(org_id, created_at DESC)` | B-tree | `WHERE org_id = $1 AND deleted_at IS NULL ORDER BY created_at DESC` | Partial index candidate (see below) |
| [table_a] | `[table_a]_user_id_status_idx` | `(user_id, status)` | B-tree | `WHERE user_id = $1 AND status = 'active'` | Compound — order matters |
| [table_a] | `[table_a]_metadata_gin_idx` | `metadata` | GIN | `WHERE metadata @> '{"key": "value"}'` | Only add if JSONB queried frequently |
**Partial indexes (PostgreSQL):**
```sql
-- Index only active (non-deleted) users — dramatically smaller for soft-delete tables
CREATE INDEX users_active_email_idx
ON users (email, org_id)
WHERE deleted_at IS NULL;
-- Index only pending items — avoids indexing the majority of rows
CREATE INDEX [table_a]_pending_idx
ON [table_a] (user_id, created_at)
WHERE status = 'pending';
```
**Index design principles applied:**
- FKs that appear in JOIN conditions always have an index
- Compound indexes follow selectivity order: most selective column first
- Functional indexes for case-insensitive lookups
- GIN indexes only where JSONB containment queries are frequent
- Partial indexes for status-filtered queries on large tables
---
## 5. Access Pattern Analysis
Document the primary queries this schema is designed to serve. For each, show the query, the indexes used, and any caveats.
### AP-1: Fetch all active users for an organisation (paginated)
**Frequency:** Very high — called on every dashboard load
**Query:**
```sql
SELECT id, email, display_name, role, created_at
FROM users
WHERE org_id = $1
AND deleted_at IS NULL
ORDER BY created_at DESC
LIMIT 50 OFFSET $2;
```
**Index used:** `users_active_by_org_idx` (org_id, created_at DESC)
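**Notes:** Use keyset pagination (`WHERE created_at < $cursor`, with `id` as a tie-breaker because `created_at` is not unique) at scale; OFFSET degrades past ~10k rows because the skipped rows are still scanned.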
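A minimal sketch of keyset pagination for this access pattern, using Python's built-in `sqlite3` as a stand-in for the target engine. The `(created_at, id)` cursor, the `fetch_page` helper, and the reduced table are assumptions for illustration; `id` is added as a tie-breaker because two rows can share a `created_at` value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, org_id INTEGER, "
    "created_at TEXT, deleted_at TEXT)"
)
rows = [
    (1, 1, "2024-01-01", None),
    (2, 1, "2024-01-02", None),
    (3, 1, "2024-01-02", None),          # same timestamp as id 2: needs a tie-breaker
    (4, 1, "2024-01-03", "2024-02-01"),  # soft-deleted, must be skipped
    (5, 1, "2024-01-04", None),
]
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)


def fetch_page(conn, org_id, after=None, limit=2):
    """Return (rows, next_cursor); `after` is the (created_at, id) of the last row seen."""
    sql = "SELECT id, created_at FROM users WHERE org_id = ? AND deleted_at IS NULL"
    params = [org_id]
    if after is not None:
        # Strictly "before" the cursor in (created_at DESC, id DESC) order.
        sql += " AND (created_at < ? OR (created_at = ? AND id < ?))"
        params += [after[0], after[0], after[1]]
    sql += " ORDER BY created_at DESC, id DESC LIMIT ?"
    params.append(limit)
    page = conn.execute(sql, params).fetchall()
    next_cursor = (page[-1][1], page[-1][0]) if len(page) == limit else None
    return page, next_cursor


page1, cur = fetch_page(conn, 1)
page2, cur = fetch_page(conn, 1, cur)
print(page1, page2)
```

Each page costs roughly the same regardless of depth, because the query seeks to the cursor position instead of scanning and discarding earlier rows.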
---
### AP-2: Look up a user by email (case-insensitive)
**Frequency:** High — every authentication attempt
**Query:**
```sql
SELECT id, org_id, role, email_verified
FROM users
WHERE lower(email) = lower($1)
AND deleted_at IS NULL;
```
**Index used:** `users_email_lower_idx`
**Notes:** Can return multiple rows if the same email exists in more than one org. The application resolves the match by org context.
---
### AP-3: Fetch [table_a] items for a user by status
**Frequency:** High
**Query:**
```sql
SELECT *
FROM [table_a]
WHERE user_id = $1
AND status = $2
ORDER BY created_at DESC
LIMIT 25;
```
**Index used:** `[table_a]_user_id_status_idx`
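**Notes:** Compound index covers both filter columns. Both are equality filters, so either column order would serve this query; `user_id` leads so the index also serves lookups by `user_id` alone. The `ORDER BY created_at DESC` still needs a sort step unless the index is extended to `(user_id, status, created_at DESC)`.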
---
### AP-4: [Add further access patterns as needed]
---
## 6. Normalization Decisions
Document deliberate choices to normalize or denormalize, with reasoning.
| Decision | Approach | Reasoning |
|---|---|---|
| [e.g. Organisation name on users table?] | **Not denormalized** — always join to organisations | Avoid stale copies; org name changes are infrequent and joining is cheap |
| [e.g. Status history] | **Not in this table** — separate `[table_a]_status_history` if needed | Current status is all that's needed for 99% of queries; history is auditing, not application data |
| [e.g. JSONB `settings` column on organisations] | **Denormalized into JSONB** | Settings are read together; never queried by field; schema changes don't require migrations |
| [e.g. Computed aggregate counts] | **Not stored** — computed at query time | Counts are small; maintaining a counter column requires careful locking; use `SELECT COUNT(*)` with the index |
---
## 7. Triggers and Automation
```sql
-- Automatically update updated_at on any row modification
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Apply to all tables with updated_at
CREATE TRIGGER users_updated_at
BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION set_updated_at();
CREATE TRIGGER [table_a]_updated_at
BEFORE UPDATE ON [table_a]
FOR EACH ROW EXECUTE FUNCTION set_updated_at();
```
---
## 8. Migration Notes
If this schema is being introduced to an existing system, note the migration approach.
| Step | Description | Backward compatible | Risk |
|---|---|---|---|
| 1 | Create `organisations` table | Yes — additive | Low |
| 2 | Create `users` table | Yes — additive | Low |
| 3 | Backfill `org_id` on existing users | **Requires dual-write period** | Medium |
| 4 | Add NOT NULL constraint on `org_id` | Requires backfill to be 100% complete | Medium |
| 5 | Remove deprecated columns | Requires app code updated first | Low once app deployed |
**Backfill strategy:** [Describe how to handle existing data — batch size, rate limiting, validation queries]
**Rollback:** Each migration step should be independently reversible. See [database-migration-plan skill] for the full rollback procedure template.
---
## Quality Checks
- [ ] Every table has a primary key and a `created_at` column — no implicit ordering by row insertion
- [ ] Every foreign key has a corresponding index — no missing FK indexes that would cause full table scans on joins
- [ ] All timestamp columns use TIMESTAMPTZ, not TIMESTAMP, so timezone awareness is explicit
- [ ] Soft-delete tables document the convention and where the filter is enforced (ORM scope, view, or query standard)
- [ ] Every access pattern in the design has a supporting index or an explicit note that a full table scan is acceptable
- [ ] JSONB columns are justified — not used as a substitute for proper schema design on queryable fields
- [ ] Normalization decisions are documented with reasoning, not just stated
- [ ] Migration notes address existing data if this is a schema change, not a greenfield schema
## Anti-Patterns
- [ ] Do not use JSONB columns as a substitute for proper relational schema design on fields that will be queried
- [ ] Do not add indexes speculatively — every index must be justified by a specific access pattern
- [ ] Do not omit timezone-awareness — use TIMESTAMPTZ, never plain TIMESTAMP
- [ ] Do not design without documenting normalization decisions — future maintainers need the reasoning, not just the structure
- [ ] Do not skip the access patterns section — schema without query patterns cannot be evaluated for correctness
@@ -0,0 +1,87 @@
---
name: debugging-log-analyser
description: "Parse error logs, stack traces, and crash reports into a structured root cause diagnosis. Use when an application is throwing exceptions, crashing, or producing unexpected errors and you need to understand why and what to fix. Produces a structured diagnosis with error classification, stack trace walkthrough, probable root cause with confidence level, affected code path, a concrete code-level fix suggestion, and ordered next debugging steps."
---
# Debugging Log Analyser Skill
Parses raw error logs, stack traces, and crash reports into a structured diagnosis with probable root cause, affected code path, and specific next steps — no hand-waving.
## Required Inputs
Ask for these if not provided:
- **The log / stack trace / error output** (paste directly or describe the error)
- **Language and framework** (e.g. Node.js + Express, Python + Django, Java Spring, Go)
- **Context** (what changed before this started, e.g. recent deploy, config change, increased traffic, new input data; "nothing changed" is also useful to know)
- **Frequency** (one-off / intermittent / consistent / regression after a specific change)
- **Environment** (local dev / staging / production)
- **What they've already tried** (if anything)
## Output Format
---
# Debugging Report: [Service/App Name]
### 1. Error Classification
**Error type:** [Runtime exception / Build error / Config error / Network error / Memory error / Unknown]
**Severity:** [Fatal / Critical / Warning / Informational]
**Recurrence pattern:** [One-off / Intermittent / Consistent / On-startup / Under load]
### 2. Stack Trace Analysis
Walk the stack frame by frame, starting from the origin:
- **Origin frame:** [File, line, function where it started]
- **Propagation path:** [How it travelled through the call stack]
- **Crash point:** [Where it ultimately threw/panicked/exited]
For each significant frame, note whether it is:
- User code (fixable here)
- Framework/library code (usually a misuse issue)
- System/runtime code (usually a config or environment issue)
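The frame classification above can be sketched as a small parser for Python-style tracebacks. The path markers (`site-packages`, `/usr/lib/python`) are heuristic assumptions that would need adapting per environment and language, and the sample trace is invented for illustration.

```python
import re

FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def classify_frame(path: str) -> str:
    """Heuristic classification; the path markers are assumptions to adapt."""
    if "site-packages" in path or "node_modules" in path:
        return "framework/library"
    if path.startswith("/usr/lib/python") or path.startswith("<frozen"):
        return "system/runtime"
    return "user code"


def walk_trace(trace: str) -> list:
    """Return (category, path, line, function) for each frame, in trace order."""
    return [
        (classify_frame(m["path"]), m["path"], int(m["line"]), m["func"])
        for m in FRAME_RE.finditer(trace)
    ]


# Hypothetical trace for illustration only.
sample = '''Traceback (most recent call last):
  File "/app/handlers/orders.py", line 42, in create_order
    total = pricing.compute(cart)
  File "/venv/lib/python3.11/site-packages/somelib/core.py", line 10, in compute
    return _sum(items)
  File "/usr/lib/python3.11/functools.py", line 888, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
TypeError: unsupported operand type(s)
'''
frames = walk_trace(sample)
for frame in frames:
    print(frame)
```

Python prints the most recent call last, so the first tuple is the origin frame and the last is the crash point; other runtimes order frames differently.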
### 3. Root Cause Assessment
**Probable root cause:** [1-2 sentence plain-English statement]
**Confidence:** [High / Medium / Low — and why]
**Alternative causes to rule out:** [If confidence is not high]
### 4. Affected Code Path
**Entry point:** [Where the triggering call began]
**Key function(s) involved:** [Specific functions/methods named in the trace]
**Data that triggered it:** [If inferable from the log — e.g. null value, malformed JSON]
### 5. Suggested Fix
Provide a concrete, code-level suggestion:
- What to change (the minimal fix)
- Why this fixes the root cause
- Any trade-offs or risks in the fix
- A short code snippet if helpful
### 6. Next Debugging Steps
If the root cause is uncertain, provide an ordered list of 3-5 specific debugging actions:
1. [Specific thing to check — file, log line, config value]
2. [Specific reproduction step or isolation test]
3. [Specific tool command — e.g. `strace`, `pprof`, `--verbose`, add logging at X]
### 7. Prevention
One or two concrete things that would prevent this class of error recurring:
- Better input validation at [point]
- Add monitoring/alerting for [condition]
- Test that covers [scenario]
---
## Quality Checks
- [ ] Root cause is specific (not "there might be a null pointer issue")
- [ ] At least one concrete code-level fix is suggested
- [ ] Next steps are actionable commands, not vague advice
- [ ] Suggested fix references the actual language/framework in the input (not a generic fix that could apply to any language)
- [ ] Confidence level includes a stated reason (not just "High" or "Low" with no explanation)
- [ ] Prevention is proactive (not just "add error handling")
## Usage Examples
- "Why is this crashing?" + [paste log]
- "Can you analyse this stack trace?"
- "I'm getting this error, what does it mean?"
- "Debug this log for me"
- "What's causing this exception?"
@@ -0,0 +1,340 @@
---
name: dependency-audit
description: "Audits project dependencies for security vulnerabilities, license compliance issues, outdated packages, and transitive dependency risk. Use when asked to audit dependencies, review package security, check license compliance, assess dependency health, or produce a vulnerability report. Produces a vulnerability findings table, license compliance matrix, update priority matrix, dependency health score, and 30-day remediation plan."
---
# Dependency Audit Skill
Produce a complete dependency audit report for a project — covering security vulnerabilities (with CVE references), license compliance against policy, outdated packages prioritised by risk, transitive dependency risk analysis, and a concrete remediation plan with timeline. A good dependency audit gives the team a clear, prioritised action list — not a raw dump of audit output that no one acts on.
## Required Inputs
Ask for these if not already provided:
- **Project language and ecosystem** — npm, pip/PyPI, Maven/Gradle, Go modules, Cargo, RubyGems, NuGet, or mixed
- **Dependency list or package manifest** — paste the contents of `package.json`, `requirements.txt`, `go.mod`, `pom.xml`, etc., or provide the audit tool output
- **License policy** — which licenses are allowed, which are restricted (e.g. "GPL is prohibited", "MIT/Apache/BSD only", or "no policy yet — recommend one")
- **Current security tooling** — Dependabot, Snyk, OWASP Dependency-Check, npm audit, pip-audit, or none
## Output Format
---
# Dependency Audit Report: [Project Name]
**Ecosystem:** [npm / pip / Maven / Go / etc.]
**Audit date:** [Date]
**Auditor:** [Name]
**Total direct dependencies:** [N]
**Total transitive dependencies:** [N]
**Audit tool(s) used:** [npm audit / pip-audit / Snyk / OWASP Dependency-Check / etc.]
---
## Executive Summary
| Category | Finding | Risk level |
|---|---|---|
| Critical vulnerabilities | [N] CVEs requiring immediate action | [Critical / High / Low] |
| High vulnerabilities | [N] CVEs — fix within 7 days | [High / Medium] |
| License violations | [N] packages with non-compliant licenses | [High / Low] |
| Severely outdated packages | [N] packages > 2 major versions behind | [Medium] |
| Packages with no active maintenance | [N] packages — no commits in 12+ months | [Medium] |
| **Overall dependency health score** | **[Score]/100** | **[Red / Amber / Green]** |
**Scoring methodology:** Start at 100 and deduct per finding. Critical CVEs: 20 each. High CVEs: 10 each. License violations: 15 each. Abandoned packages: 5 each. Maximum deduction: 100. Score ≥80 = Green, 60-79 = Amber, <60 = Red.
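The scoring rule can be written out as a short function. This is a sketch of the stated deductions only; the function name and the example counts are hypothetical.

```python
def health_score(critical: int, high: int, license_violations: int, abandoned: int):
    """Return (score, band) using the deductions stated in the methodology."""
    deduction = 20 * critical + 10 * high + 15 * license_violations + 5 * abandoned
    score = 100 - min(deduction, 100)  # total deduction is capped at 100
    if score >= 80:
        band = "Green"
    elif score >= 60:
        band = "Amber"
    else:
        band = "Red"
    return score, band


# Hypothetical audit: 1 critical CVE, 1 high CVE, 0 license violations, 1 abandoned package.
print(health_score(1, 1, 0, 1))  # (65, 'Amber')
```

Note that a single critical CVE leaves the score at exactly 80, which still counts as Green under these thresholds; teams that want any critical finding to show Amber would need to adjust the weights.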
**Immediate actions required:**
1. [Most critical action — e.g. "Upgrade lodash from 4.17.11 to 4.17.21 to fix CVE-2021-23337 (Critical — prototype pollution)"]
2. [Second action]
3. [Third action]
---
## 1. Security Vulnerability Findings
### Critical and High Severity (Act within 24-72 hours)
| Package | Installed version | Fix version | CVE | Severity | CVSS score | Description | Exploitability |
|---|---|---|---|---|---|---|---|
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | Critical | [9.x] | [e.g. Prototype pollution via `merge` function — remote code execution possible] | [Known exploit / PoC available / No known exploit] |
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | High | [7.x] | [e.g. Path traversal in file serving utility] | [PoC available] |
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | High | [7.x] | [e.g. Regular expression denial of service (ReDoS)] | [No known exploit] |
### Medium Severity (Fix within 30 days)
| Package | Installed version | Fix version | CVE | Severity | CVSS score | Description |
|---|---|---|---|---|---|---|
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | Medium | [5.x] | [Description] |
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | Medium | [4.x] | [Description] |
### Low Severity (Fix within 90 days or accept risk)
| Package | Installed version | Fix version | CVE | Severity | Description |
|---|---|---|---|---|---|
| [package-name] | [X.Y.Z] | [A.B.C] | [CVE-YYYY-NNNNN] | Low | [Description] |
### Vulnerabilities With No Fix Available
| Package | CVE | Severity | Recommended mitigation |
|---|---|---|---|
| [package-name] | [CVE-YYYY-NNNNN] | [High] | [e.g. "Remove this package — alternative: [replacement]"] |
| [package-name] | [CVE-YYYY-NNNNN] | [Medium] | [e.g. "Vendor has a fix in progress — track issue [URL]. Mitigate by [X]"] |
---
## 2. License Compliance Matrix
### License Policy Reference
| License | Category | Policy | Notes |
|---|---|---|---|
| MIT | Permissive | Allowed | Attribution required in distributed products |
| Apache 2.0 | Permissive | Allowed | Attribution + NOTICE file required |
| BSD 2-Clause / 3-Clause | Permissive | Allowed | Attribution required |
| ISC | Permissive | Allowed | |
| MPL 2.0 | Weak copyleft | Allowed with review | Source disclosure required for modified MPL files only |
| LGPL v2 / v3 | Weak copyleft | Allowed with review | Dynamic linking permitted; static linking may require disclosure |
| GPL v2 / v3 | Strong copyleft | **Restricted** | May require open-sourcing the entire codebase — legal review required |
| AGPL v3 | Strong copyleft | **Restricted** | Network use triggers copyleft — especially risky for SaaS |
| SSPL | Source available | **Prohibited** | Not OSI-approved — treat as proprietary |
| Proprietary / Commercial | Commercial | **Requires contract** | Verify license covers current use case and scale |
| Unknown / Unlicensed | — | **Prohibited** | No license = all rights reserved — cannot use legally |
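A policy table like this can be applied mechanically to a package inventory. The sketch below assumes licenses are reported as SPDX-style identifiers and covers only part of the table. The "Needs review" fallback for licenses missing from the policy is an added assumption, and the package names are hypothetical.

```python
from typing import Optional

# Policy values copied from the reference table; adjust to the project's own policy.
LICENSE_POLICY = {
    "MIT": "Allowed",
    "Apache-2.0": "Allowed",
    "BSD-2-Clause": "Allowed",
    "BSD-3-Clause": "Allowed",
    "ISC": "Allowed",
    "MPL-2.0": "Allowed with review",
    "LGPL-3.0": "Allowed with review",
    "GPL-3.0": "Restricted",
    "AGPL-3.0": "Restricted",
    "SSPL-1.0": "Prohibited",
}


def policy_for(license_id: Optional[str]) -> str:
    """Look up the policy for one license identifier."""
    if not license_id:
        return "Prohibited"  # no declared license: all rights reserved
    # Assumption: a declared license missing from the policy goes to manual review.
    return LICENSE_POLICY.get(license_id, "Needs review (not in policy)")


def compliance_issues(packages: dict) -> list:
    """Return (package, license, policy) for every package that is not plainly Allowed."""
    issues = []
    for name, license_id in sorted(packages.items()):
        policy = policy_for(license_id)
        if policy != "Allowed":
            issues.append((name, license_id or "Unknown", policy))
    return issues


# Hypothetical inventory.
packages = {"pkg-a": "MIT", "pkg-b": "GPL-3.0", "pkg-c": None, "pkg-d": "Zlib"}
issues = compliance_issues(packages)
print(issues)
```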
### Findings: Packages With Compliance Issues
| Package | License | Issue | Recommendation | Risk if unaddressed |
|---|---|---|---|---|
| [package-name] | GPL v3 | Copyleft — may require open-sourcing this project | Replace with [alternative] or get legal sign-off | Legal / IP risk |
| [package-name] | AGPL v3 | Network copyleft — SaaS use triggers disclosure | Replace with [alternative] | Legal / IP risk |
| [package-name] | Proprietary | License may not cover current usage tier | Verify license scope with vendor | Contract breach |
| [package-name] | Unknown | No license declared in package metadata | Contact maintainer or replace | Cannot use legally |
### All Licenses in Use (Full Inventory)
| License | Package count | Compliance status |
|---|---|---|
| MIT | [N] | Compliant |
| Apache 2.0 | [N] | Compliant |
| BSD-3-Clause | [N] | Compliant |
| ISC | [N] | Compliant |
| MPL 2.0 | [N] | Review required |
| GPL v3 | [N] | **Non-compliant** |
| Unknown | [N] | **Non-compliant** |
---
## 3. Outdated Package Analysis
### Severely Outdated (2+ major versions behind — high upgrade effort)
| Package | Installed | Latest stable | Versions behind | Last updated | Breaking changes summary |
|---|---|---|---|---|---|
| [package-name] | [1.x.x] | [3.x.x] | 2 major | [Date] | [e.g. "API redesign in v2; async support added in v3"] |
| [package-name] | [0.x.x] | [2.x.x] | 2 major | [Date] | [Summary] |
### Moderately Outdated (1 major version behind)
| Package | Installed | Latest stable | Versions behind | Security fix in newer version? |
|---|---|---|---|---|
| [package-name] | [2.x.x] | [3.x.x] | 1 major | [Yes — CVE-YYYY-NNNNN / No] |
| [package-name] | [4.x.x] | [5.x.x] | 1 major | [No] |
### Minor/Patch Updates Available (Low risk to update)
| Package | Installed | Latest | Contains security fix? |
|---|---|---|---|
| [package-name] | [2.3.1] | [2.3.9] | [Yes / No] |
| [package-name] | [1.0.0] | [1.2.1] | [No] |
---
## 4. Dependency Graph Risk Analysis
### Transitive Dependency Risk
Transitive (indirect) dependencies carry risk because they are not explicitly managed. These are the highest-risk transitive dependencies in this project:
| Vulnerable transitive dep | Pulled in by | Installed version | Fix available | Action |
|---|---|---|---|---|
| [transitive-package] | [direct-parent] | [X.Y.Z] | [Yes — upgrade [parent] to [version]] | Upgrade direct dependency [parent] |
| [transitive-package] | [direct-parent] | [X.Y.Z] | [No] | Remove [parent] or use [alternative] |
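To fill in the "Pulled in by" column, the dependency graph can be searched from each direct dependency. The sketch below assumes the graph is already available as a mapping from package to its dependencies (for example, parsed from a lockfile); the function name and all package names are hypothetical.

```python
from collections import deque


def direct_parents(graph: dict, direct_deps: list, target: str) -> list:
    """Return the direct dependencies whose dependency subtree contains `target`."""
    result = []
    for root in direct_deps:
        seen, queue = {root}, deque([root])
        while queue:  # breadth-first search from this direct dependency
            node = queue.popleft()
            if node == target:
                result.append(root)
                break
            for child in graph.get(node, []):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
    return result


# Hypothetical graph: package -> packages it depends on.
graph = {
    "web-framework": ["http-parser", "templating"],
    "http-parser": ["string-utils"],
    "cli-tool": ["string-utils"],
    "templating": [],
}
parents = direct_parents(graph, ["web-framework", "cli-tool", "logger"], "string-utils")
print(parents)
```

When more than one direct dependency pulls in the same vulnerable package, every one of them has to be upgraded or overridden before the vulnerable version leaves the tree.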
### Dependency Concentration Risk
These packages are depended on by many other packages in the project — a vulnerability or deprecation would have cascading effects:
| Package | Depended on by (N packages) | Actively maintained? | Risk level |
|---|---|---|---|
| [package-name] | [N] | [Yes / No — last commit: date] | [High / Medium] |
| [package-name] | [N] | [Yes] | [Medium] |
### Abandoned / Unmaintained Packages
| Package | Last release | Last commit | Weekly downloads | Recommended alternative |
|---|---|---|---|---|
| [package-name] | [Date] | [Date] | [N] | [alternative-package] |
| [package-name] | [Date] | [Date] | [N] | [Maintained fork: URL] |
---
## 5. Remediation Plan
### 30-Day Plan
**Week 1: Critical vulnerabilities (Days 1-7)**
| Action | Owner | Package | Effort | Notes |
|---|---|---|---|---|
| Upgrade [package] [old] → [new] | [Name] | [package-name] | [30 min] | [No API changes / check breaking changes guide: URL] |
| Replace [package] with [alternative] | [Name] | [package-name] | [2 hours] | [No fix available — must replace] |
| Patch override for [transitive-dep] | [Name] | [transitive-dep] | [15 min] | [Add resolutions/overrides entry in manifest] |
```bash
# Commands for Week 1 upgrades:
# npm
npm install [package]@[target-version]
npm audit fix --force # use with caution — may introduce breaking changes
# pip
pip install --upgrade [package]==[target-version]
pip-audit --fix # if using pip-audit
# Go
go get [module]@[version]
go mod tidy
# Maven
# Update pom.xml version property, then:
mvn versions:use-latest-releases -DallowMajorUpdates=false
mvn dependency:resolve
```
**Week 2 — High vulnerabilities and license violations (Days 8–14)**
| Action | Owner | Package | Effort | Notes |
|---|---|---|---|---|
| Upgrade [package] | [Name] | [package-name] | [1 hour] | |
| Replace GPL-licensed [package] | [Name] | [package-name] | [4 hours] | [Alternative: [package]] |
| Legal review for [package] license | Legal team | [package-name] | [Legal team SLA] | [Submit via [process]] |
**Week 3 — Medium vulnerabilities and abandoned packages (Days 15–21)**
| Action | Owner | Package | Effort | Notes |
|---|---|---|---|---|
| Upgrade [package] | [Name] | [package-name] | [30 min] | |
| Replace abandoned [package] | [Name] | [package-name] | [2 hours] | [Maintained fork or alternative: [URL]] |
**Week 4 — Process improvements (Days 22–30)**
| Action | Owner | Effort | Notes |
|---|---|---|---|
| Enable Dependabot / Renovate for automated PRs | [Name] | [2 hours] | [Config in Section 6] |
| Add `npm audit` / `pip-audit` to CI — fail on Critical/High | [Name] | [1 hour] | [Config in Section 6] |
| Document license policy in CONTRIBUTING.md | [Name] | [1 hour] | [Based on policy in Section 2] |
| Schedule next quarterly audit | [Name] | [15 min] | [Add to team calendar] |
---
## 6. Policy Recommendations
### Automated Vulnerability Scanning in CI
Add the following to your CI pipeline to catch vulnerabilities before they merge:
```yaml
# GitHub Actions — adapt for your CI platform
dependency-audit:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    # npm
    - name: npm audit
      run: npm audit --audit-level=high
      # Fails build on High or Critical vulnerabilities
    # pip
    - name: pip-audit
      run: |
        pip install pip-audit
        pip-audit --requirement requirements.txt
      # pip-audit has no severity filter; it fails the build on any known vulnerability
    # Go
    - name: govulncheck
      run: |
        go install golang.org/x/vuln/cmd/govulncheck@latest
        govulncheck ./...
```
### Dependabot / Renovate Configuration
```yaml
# .github/dependabot.yml — automated dependency update PRs
version: 2
updates:
  - package-ecosystem: "[npm / pip / gomod / maven]"
    directory: "/"
    schedule:
      interval: "weekly"
      day: "monday"
    open-pull-requests-limit: 10
    labels:
      - "dependencies"
      - "automated"
    ignore:
      # Ignore major version bumps — review these manually
      - dependency-name: "*"
        update-types: ["version-update:semver-major"]
```
### License Scanning
```bash
# npm — license checker
# --onlyAllow and --failOn cannot be combined; the allowlist alone already rejects GPL/AGPL/LGPL
npx license-checker --onlyAllow 'MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC'
# Python — pip-licenses
pip install pip-licenses
pip-licenses --allow-only="MIT;Apache Software License;BSD License;ISC License" \
--fail-on="GNU General Public License"
# Go — go-licenses
go install github.com/google/go-licenses@latest
go-licenses check ./... --allowed_licenses=MIT,Apache-2.0,BSD-2-Clause,BSD-3-Clause
```
---
## 7. Dependency Health Score Detail
| Category | Max points | Score | Notes |
|---|---|---|---|
| No critical vulnerabilities | 30 | [N]/30 | -20 per critical CVE |
| No high vulnerabilities | 20 | [N]/20 | -10 per high CVE |
| License compliance | 20 | [N]/20 | -15 per violation |
| No abandoned packages | 15 | [N]/15 | -5 per abandoned package |
| Up-to-date major versions | 10 | [N]/10 | -2 per major version behind |
| Automated scanning enabled | 5 | [N]/5 | All-or-nothing |
| **Total** | **100** | **[Score]/100** | **[Red / Amber / Green]** |
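
The arithmetic behind the table can be sketched as a small script. This is only an illustration: it reads the Notes column as a point deduction per finding and floors each category at zero, both of which are assumptions rather than rules stated in this template. The function names `category_score` and `health_score` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: per-finding deductions and the zero floor are assumptions.

# category_score MAX PENALTY COUNT -> prints max(0, MAX - PENALTY * COUNT)
category_score() {
  local max=$1 penalty=$2 count=$3
  local score=$(( max - penalty * count ))
  (( score < 0 )) && score=0
  echo "$score"
}

# health_score CRITICAL HIGH LICENSE_VIOLATIONS ABANDONED MAJORS_BEHIND SCANNING(yes|no)
health_score() {
  local total=0
  total=$(( total + $(category_score 30 20 "$1") ))  # critical CVEs
  total=$(( total + $(category_score 20 10 "$2") ))  # high CVEs
  total=$(( total + $(category_score 20 15 "$3") ))  # license violations
  total=$(( total + $(category_score 15 5 "$4") ))   # abandoned packages
  total=$(( total + $(category_score 10 2 "$5") ))   # major versions behind
  [[ $6 == yes ]] && total=$(( total + 5 ))          # automated scanning, all-or-nothing
  echo "$total"
}

# 1 critical, 2 high, 0 license violations, 1 abandoned, 3 majors behind, no scanning
health_score 1 2 0 1 3 no   # prints 44
```

Computing the score from counted findings in this way keeps it reproducible between quarterly audits.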
---
## Quality Checks
- [ ] Every Critical and High CVE has a named owner and a resolution date in the 30-day plan
- [ ] License findings have been reviewed by legal or a named engineer with authority to accept the risk
- [ ] Transitive dependency vulnerabilities are included — not just direct dependencies
- [ ] Abandoned packages have a concrete replacement recommendation, not just "consider replacing"
- [ ] CI pipeline change is included — the audit findings should be the last time these are caught manually
- [ ] The dependency health score is calculated from actual findings, not estimated
- [ ] Remediation plan actions are specific commands or steps, not "upgrade package X" without version targets
## Anti-Patterns
- [ ] Do not report only direct dependencies — transitive dependency vulnerabilities are often more dangerous and are the most commonly missed
- [ ] Do not present raw audit tool output without interpretation — a table of 200 CVEs with no prioritisation is worse than no audit at all
- [ ] Do not assign all Critical CVEs as "fix immediately" without checking whether an exploitable path exists in your usage context
- [ ] Do not make license compliance decisions without legal input — flagging a GPL dependency without a recommendation is incomplete work
- [ ] Do not complete the audit without including a CI/CD pipeline step — a one-time audit that leaves the door open for new vulnerabilities is not a remediation
@@ -0,0 +1,340 @@
---
name: developer-onboarding-doc
description: "Write a developer onboarding document for a service, codebase, or team. Use when asked to write a developer guide, service README, onboarding doc for a new engineer, codebase orientation, or getting-started guide for a technical team. Produces a structured doc covering service overview, architecture, local setup, key patterns, testing, deployment, and who to ask for what."
---
# Developer Onboarding Document Skill
Produce a complete developer onboarding document for a service or team — covering everything a new engineer needs to be productive within their first week.
A good onboarding doc is not a wiki dump. It answers the questions a new engineer actually has on day one, in the order they'll have them.
## Required Inputs
Ask for these if not already provided:
- **Service name** and what it does
- **Team** responsible for it
- **Tech stack** — language(s), framework(s), database(s), message queues, etc.
- **Key external dependencies** — upstream services, third-party APIs
- **Deployment target** — Kubernetes, ECS, Lambda, bare metal, etc.
- **Local dev setup** — how to run locally (Docker Compose, local DB, etc.)
- **Testing approach** — unit, integration, E2E; test commands
- **Deployment process** — summary of how code gets to production
- **On-call setup** — who's on-call, how alerts work
- **Contacts** — tech lead, platform team, related service owners
## Output Format
---
# Developer Onboarding: [Service Name]
**Team:** [Team name] | **Tech lead:** [Name]
**Last updated:** [Date] | **Updated by:** [Name]
> If something in this doc is wrong or out of date, fix it now — it will affect every engineer who onboards after you.
---
## What This Service Does
[3–5 sentences. What problem does this service solve? Who calls it, and who does it call? What would break if this service went down?]
**Service type:** [API / Background worker / Event consumer / Data pipeline / etc.]
**Consumers:** [List internal services or external clients that depend on this service]
**Dependencies:** [List upstream services, databases, and third-party APIs this service calls]
**Architecture diagram:** [Link or embed — even a rough ASCII diagram helps]
```
[Caller A] ──→ [This Service] ──→ [Database]
                     └──→ [Downstream Service]
```
---
## Codebase Orientation
**Repository:** [Link]
**Main branch:** `[main / master]`
**Language:** [e.g. Go 1.22 / Node.js 20 / Python 3.12]
**Framework:** [e.g. Express / FastAPI / Gin / Rails]
### Key directories
```
[repo-root]/
├── [src/ or cmd/] # Application code
│ ├── [handlers/] # HTTP handlers / controllers
│ ├── [services/] # Business logic
│ ├── [repository/] # Database access layer
│ └── [models/] # Data models / types
├── [tests/] # Test files
├── [migrations/] # Database migrations
├── [scripts/] # Utility scripts
├── [.github/workflows/] # CI/CD pipeline definitions
└── [docs/] # Additional documentation
```
**Where to start reading:** [Point to 2–3 key files that give the best orientation — e.g. `main.go`, `routes.js`, `app.py`]
### Things that might surprise you
- [Unusual pattern 1 — e.g. "We use event sourcing — state is derived from an event log, not stored directly"]
- [Unusual pattern 2 — e.g. "Auth is handled by the gateway — this service trusts the `X-User-Id` header"]
- [Unusual pattern 3 — any non-obvious decisions or legacy choices]
---
## Local Development Setup
**Estimated setup time:** [X minutes for a fresh machine]
### Prerequisites
- [ ] [Tool 1] — version [X] — [install link]
- [ ] [Tool 2] — version [X] — [install link]
- [ ] Access to [repo / internal package registry] — request from [who]
- [ ] [Any secrets or credentials needed] — request from [who]
### Step-by-step setup
```bash
# 1. Clone the repo
git clone [repo URL]
cd [repo-name]
# 2. Copy and configure environment variables
cp .env.example .env
# Edit .env — see "Environment Variables" section below
# 3. Start dependencies (database, cache, etc.)
[docker compose up -d / make deps / etc.]
# 4. Install dependencies
[npm install / go mod download / pip install -r requirements.txt]
# 5. Run database migrations
[migration command]
# 6. Start the service
[start command]
# 7. Verify it's working
curl http://localhost:[PORT]/health
# Expected: {"status":"ok"}
```
**If this doesn't work:** Check [Troubleshooting section below] or ask in `#[channel]`.
### Environment Variables
| Variable | Required | Description | Example |
|---|---|---|---|
| `DATABASE_URL` | Yes | Connection string for the primary DB | `postgres://localhost:5432/[db]` |
| `[VAR_2]` | Yes | [Description] | [Example] |
| `[VAR_3]` | No | [Description — default value] | [Example] |
**Secrets for local dev:** [Where to get them — e.g. "Run `[command]` to pull from Vault" or "Ask [person] in #[channel]"]
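
Because missing values in `.env` tend to fail silently, a quick comparison against `.env.example` can catch them before the service starts. The helper below is a minimal sketch, assuming both files use plain `KEY=value` lines (no `export` prefix, no quoting rules); `missing_env_keys` is a hypothetical name, not part of any existing tooling.

```bash
# missing_env_keys EXAMPLE_FILE ENV_FILE
# Prints every variable name defined in EXAMPLE_FILE that is absent
# from ENV_FILE or present there with an empty value.
missing_env_keys() {
  local example=$1 env=$2 key
  # Take KEY from lines shaped like KEY=value; comments and blank lines are skipped.
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$example" | cut -d= -f1 | while read -r key; do
    # A key counts as set only when it has a non-empty value in ENV_FILE.
    grep -Eq "^${key}=.+" "$env" || echo "$key"
  done
}

# Usage: missing_env_keys .env.example .env
```

An empty result means every documented variable has a value. Optional variables that are intentionally left blank will also be listed, so treat the output as a prompt to check rather than a hard failure.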
### Useful local commands
```bash
[start command] # Start the service
[test command] # Run all tests
[lint command] # Run linter
[format command] # Format code
[migration command] # Run pending migrations
[seed command] # Seed local database
```
---
## Testing
**Testing philosophy:** [e.g. "We test at the integration layer — unit tests for pure functions, integration tests for anything touching the DB or external services"]
### Running tests
```bash
# All tests
[test command]
# Unit tests only
[unit test command]
# Integration tests (requires local deps running)
[integration test command]
# A specific test file or test case
[test command with filter]
```
**Test coverage:** [X]% (minimum required to pass CI: [Y]%)
**Coverage report:** [Where to find it]
### Writing tests
- **Unit tests:** [Where to put them — e.g. alongside source files as `*_test.go`]
- **Integration tests:** [Where to put them — e.g. `tests/integration/`]
- **Test database:** [How it works — e.g. "Each test gets a clean transaction that rolls back on teardown — see `tests/helpers/db.go`"]
- **Mocking:** [Policy — e.g. "We mock at the repository layer — don't mock the DB directly"]
---
## Making Changes
### Branching
[Branch naming convention — e.g. `feature/[ticket-id]-short-description`, `fix/[ticket-id]-short-description`]
### Before opening a PR
- [ ] Tests pass locally
- [ ] Linter passes (`[lint command]`)
- [ ] New behaviour has test coverage
- [ ] Any new environment variables are added to `.env.example` and documented
- [ ] Database migrations are backward-compatible (old code can run against new schema)
### Code review
- **Reviewers:** [Who to request review from — e.g. "Any engineer on [team]; lead review required for auth changes"]
- **Expected review time:** [X hours / 1 business day]
- **PR template:** [Link or auto-generated by GitHub]
### Database migrations
```bash
# Create a new migration
[migration create command]
# Apply pending migrations
[migration up command]
# Roll back last migration
[migration down command]
```
**Migration rules:**
- All migrations must be backward-compatible — old code must run against the new schema
- Never rename or drop a column in a single migration — do it in two steps (add new, migrate data, drop old)
- Test your rollback before merging
---
## Deployment
**How code gets to production:** [1–2 sentence summary — link to full CI/CD playbook if it exists]
1. Merge to `main` → automatic deploy to staging
2. Smoke tests run on staging
3. Manual approval → deploy to production
4. Post-deploy monitoring for [X minutes]
**Deployment docs:** [Link to CI/CD playbook or pipeline docs]
**Who can deploy:** [Any engineer / Lead engineer / On-call engineer — specify]
**Deployment channel:** `#[deployments channel]`
---
## Monitoring and Observability
**Dashboard:** [Datadog / Grafana / CloudWatch — link]
**Logs:** [Log aggregation tool and link — e.g. "Logs are in Datadog under service:[name]"]
**Traces:** [Tracing tool and link if applicable]
**Alerts:** [Where alerts fire — e.g. PagerDuty / Slack #alerts-[service]]
**Key metrics to know:**
- **Error rate:** Should be <[X]% (alert at [Y]%)
- **P99 latency:** Should be <[X]ms
- **[Business metric]:** [e.g. "Queue depth should be <100 items"]
---
## On-Call
**On-call schedule:** [PagerDuty / Opsgenie link]
**Who's on-call now:** [Link to current schedule or `#oncall` channel]
**Escalation:** [On-call → [team lead] → [EM] — after [X] minutes unacknowledged]
**If you get paged:**
1. Acknowledge the alert
2. Check [dashboard link] for the first clue
3. Common alert runbooks: [link to oncall-runbook or runbook-writer output]
4. If you can't resolve in [X minutes], escalate to [person/channel]
---
## Key Contacts
| Role | Name | Best way to reach |
|---|---|---|
| Tech lead | [Name] | Slack: @[handle] |
| On-call rotation | [Team] | PagerDuty / `#on-call` |
| Platform / infra | [Team] | `#platform` Slack channel |
| Database / DBA | [Name or team] | `#database` Slack channel |
| [Upstream service] owner | [Name] | Slack: @[handle] |
**Where to ask questions:**
- General engineering: `#engineering`
- This service specifically: `#[service-name]`
- Urgent / production issues: `#incidents`
---
## Troubleshooting
### "The service won't start locally"
1. Check that Docker / dependencies are running: `[command]`
2. Check `.env` is populated — missing values cause silent failures
3. Check logs: `[log command]`
4. Ask in `#[channel]`
### "Tests are failing locally but passing in CI"
- Check your local dependency versions match CI: `[version check command]`
- Try a clean install: `[clean install command]`
- Integration tests need local deps running — `[start deps command]`
### "I can't access [internal tool / system]"
- Request access through [process — e.g. Okta self-serve / ask your manager]
### "Something looks wrong in production"
1. Check [dashboard] for the error spike
2. Check recent deploys in `#deployments`
3. If it's an active incident, page on-call via [PagerDuty / Slack command]
---
## Further Reading
- [Architecture Decision Records (ADRs)](./docs/decisions/) — why the codebase is the way it is
- [API documentation](./docs/api/) or [link to external docs]
- [Incident runbooks](./docs/runbooks/)
- [CI/CD pipeline documentation](./docs/cicd/)
- [Team working agreements](./docs/team/)
---
## Quality Checks
- [ ] Local setup instructions work on a fresh machine — tested recently
- [ ] Environment variables table is complete and accurate
- [ ] "Things that might surprise you" captures the actual surprises (ask a recent joiner)
- [ ] On-call section has real links, not placeholders
- [ ] Contacts are current — team members with real Slack handles
- [ ] Troubleshooting covers the top 3 actual questions new joiners ask
## Anti-Patterns
- [ ] Do not document the ideal setup — document the actual setup; real oddities and gotchas are what new engineers need most
- [ ] Do not leave placeholder contacts like "ask your manager" — name specific people for each domain or the doc becomes useless when the new joiner has an urgent question
- [ ] Do not write the onboarding doc without reviewing it with a recent joiner — the author is blind to what they take for granted
- [ ] Do not include every piece of architectural detail — an onboarding doc that covers everything teaches nothing; link to deeper docs instead
- [ ] Do not skip the "things that might surprise you" section — undocumented non-obvious patterns are the number one cause of wasted engineering time in the first week
@@ -0,0 +1,568 @@
---
name: disaster-recovery-plan
description: "Write a disaster recovery plan for a service or system — covering RPO/RTO targets, failure scenario runbooks, backup and restore procedures, DR testing cadence, and communication templates. Use when asked to write a DR plan, document failover procedures, create recovery runbooks, define RTO/RPO targets, or prepare for a disaster recovery game day. Produces a full DR document with per-scenario recovery runbooks, backup validation procedures, testing schedule, and communication templates."
---
# Disaster Recovery Plan Skill
Produce a complete disaster recovery plan for a service or system — giving engineers, SREs, and on-call responders everything they need to recover from a disaster scenario in the shortest possible time. A good DR plan is tested regularly, has exact commands (not vague instructions), and makes RTO/RPO targets measurable so the team knows whether recovery succeeded.
## Required Inputs
Ask for these if not already provided:
- **Service name** and what it does (business function and technical role)
- **Criticality tier** — business impact of extended downtime (e.g. Tier 1 = revenue-critical, Tier 2 = ops impact, Tier 3 = internal only)
- **Current infrastructure setup** — cloud provider, regions/zones, deployment model (Kubernetes, ECS, VMs, serverless)
- **RPO/RTO requirements** — Recovery Point Objective (how much data loss is acceptable) and Recovery Time Objective (how long can it be down)
- **Backup strategy** — what is backed up, how often, where backups are stored, retention policy
- **On-call contacts** — names and contact details for the responder chain
## Output Format
---
# Disaster Recovery Plan: [Service Name]
**Team:** [Team name] | **Tech lead:** [Name]
**Criticality tier:** [Tier 1 / Tier 2 / Tier 3] | **Last tested:** [Date]
**Next DR test:** [Date] | **Document owner:** [Name]
**Last updated:** [Date] | **Review cycle:** Quarterly
> **Emergency? Skip to Section 3 — Failure Scenario Runbooks.** Find the scenario that matches your situation and follow the steps exactly.
---
## 1. Recovery Targets
| Target | Value | Rationale |
|---|---|---|
| RPO (Recovery Point Objective) | [X minutes/hours] | [e.g. "Last committed transaction — database replication is synchronous"] |
| RTO (Recovery Time Objective) | [Y minutes/hours] | [e.g. "Revenue impact begins at 30 min; target recovery in 15 min"] |
| MTTR target (non-disaster) | [Z minutes] | [Operational incidents, not DR events] |
| Data retention (backups) | [N days/weeks] | [Compliance requirement or operational policy] |
| Backup frequency | [Every X hours] | [RPO-driven — backup interval must be ≤ RPO] |
**What these mean in practice:**
- If a database is corrupted, we can lose at most [X minutes] of transactions before the business impact is unacceptable.
- The service must be operational again within [Y minutes/hours] of declaring a DR event.
- If either target cannot be met, escalate to [Engineering Manager] immediately.
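
The rule that the backup interval must not exceed the RPO is easy to check mechanically. The snippet below is a sketch under the assumption that both values are expressed in whole minutes; `rpo_check` is a hypothetical helper name.

```bash
# rpo_check RPO_MINUTES BACKUP_INTERVAL_MINUTES
# Succeeds (exit 0) when backups run at least as often as the RPO requires.
# With periodic backups the worst-case data loss equals the interval itself.
rpo_check() {
  local rpo=$1 interval=$2
  if (( interval <= rpo )); then
    echo "OK: backup every ${interval} min satisfies RPO of ${rpo} min"
  else
    echo "FAIL: backup every ${interval} min exceeds RPO of ${rpo} min"
    return 1
  fi
}

rpo_check 60 15   # prints "OK: backup every 15 min satisfies RPO of 60 min"
```

A check like this could run whenever the backup schedule or the RPO in the table above changes, so the two never drift apart unnoticed.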
---
## 2. Failure Scenario Inventory
| Scenario | Likelihood | Impact | RTO target | RPO target | Runbook |
|---|---|---|---|---|---|
| Single availability zone failure | Medium | [Partial / Full outage] | [15 min] | [0 — no data loss] | Section 3.1 |
| Full region failure | Low | Full outage | [60 min] | [5 min] | Section 3.2 |
| Database corruption / data loss | Low | Full outage | [90 min] | [RPO value] | Section 3.3 |
| Critical dependency outage | High | [Partial degradation] | [30 min] | [N/A] | Section 3.4 |
| Security breach / ransomware | Very low | Full outage + investigation | [4 hours] | [Last clean backup] | Section 3.5 |
| Accidental bulk data deletion | Low | Partial or full data loss | [60 min] | [RPO value] | Section 3.6 |
---
## 3. Failure Scenario Runbooks
### 3.1 Single Availability Zone Failure
**Trigger:** One AZ becomes unreachable — pods/instances in that zone stop responding.
**Detection:** PagerDuty alert `[AlertName]` fires, or cloud provider status page shows AZ degradation.
**Expected RTO:** [15 minutes] | **Expected RPO:** Zero (no data loss if multi-AZ replication is working)
**Step 1 — Confirm the failure**
```bash
# Check pod/instance health across zones
kubectl get pods -o wide -n [namespace] | grep -v Running
# Check which nodes are affected
kubectl get nodes -o wide | grep -v Ready
# Verify cloud provider AZ status
# AWS: https://health.aws.amazon.com/health/status
# GCP: https://status.cloud.google.com
```
**Step 2 — Assess whether auto-recovery has occurred**
```bash
# If using auto-scaling, check if replacement instances launched
kubectl get pods -n [namespace] --watch
# Check deployment replica count
kubectl get deployment [service-name] -n [namespace]
# Verify load balancer health checks are passing
[cloud provider CLI command to check target group health]
```
**Step 3 — Force rescheduling if auto-recovery stalled**
```bash
# Cordon the affected node so no new pods schedule on it
kubectl cordon [node-name]
# Drain the node — moves all pods to healthy nodes
kubectl drain [node-name] --ignore-daemonsets --delete-emptydir-data
# Verify pods have rescheduled successfully
kubectl get pods -o wide -n [namespace]
```
**Step 4 — Verify service health**
```bash
# Smoke test key endpoints
curl -s -o /dev/null -w "%{http_code}" https://[service-url]/health
curl -s -o /dev/null -w "%{http_code}" https://[service-url]/[critical-endpoint]
# Check error rate in monitoring
[dashboard link or query]
```
**Recovery confirmed when:** All pods are Running, health check returns 200, error rate is at baseline.
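
Rather than re-running the health check by hand, the "health check returns 200" criterion can be polled. This is a minimal sketch, assuming the service exposes the `/health` endpoint used in Step 4; the function name `wait_for_healthy` and its default attempt count and delay are illustrative choices, not values from this plan.

```bash
# wait_for_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it returns HTTP 200; returns non-zero if it never does.
wait_for_healthy() {
  local url=$1 attempts=${2:-30} delay=${3:-10} code i
  for (( i = 1; i <= attempts; i++ )); do
    # --max-time stops a hung connection from stalling the loop.
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url" || true)
    if [[ $code == 200 ]]; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "still unhealthy after ${attempts} attempts (last status: ${code})" >&2
  return 1
}

# Usage: wait_for_healthy "https://[service-url]/health" 30 10
```

Pod status and error rate still need to be confirmed separately; this only covers the HTTP check.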
---
### 3.2 Full Region Failure
**Trigger:** The primary region is entirely unavailable.
**Detection:** All service health checks failing, cloud provider status page confirms region-wide event.
**Expected RTO:** [60 minutes] | **Expected RPO:** [5 minutes — based on cross-region replication lag]
**Step 1 — Confirm regional failure (5 minutes)**
```bash
# Confirm the primary region is unreachable
ping -c 3 [primary-region-endpoint] || echo "Primary region unreachable"
# Check replication lag on standby region database
[command to check replica lag — e.g. for RDS: aws rds describe-db-instances --region [dr-region]]
```
**Step 2 — Declare DR event and notify (2 minutes)**
Post to `#incidents`:
```
🔴 DR EVENT — [Service Name] — Region Failure
Primary region: [region] — UNREACHABLE
Activating failover to: [dr-region]
Incident commander: [Name]
Next update: 15 minutes
```
Page [Engineering Manager] and [CTO/VP Eng] via PagerDuty.
**Step 3 — Promote DR database (10 minutes)**
```bash
# AWS RDS — promote read replica to primary
aws rds promote-read-replica \
--db-instance-identifier [dr-replica-identifier] \
--region [dr-region]
# Wait for promotion to complete
aws rds wait db-instance-available \
--db-instance-identifier [dr-replica-identifier] \
--region [dr-region]
# Record the new database endpoint
aws rds describe-db-instances \
--db-instance-identifier [dr-replica-identifier] \
--region [dr-region] \
--query 'DBInstances[0].Endpoint.Address'
```
**Step 4 — Deploy service in DR region (20 minutes)**
```bash
# Update service configuration to point at DR database
kubectl set env deployment/[service-name] \
DATABASE_URL=[new-dr-database-url] \
-n [namespace] \
--context [dr-region-context]
# Scale up the DR deployment
kubectl scale deployment/[service-name] --replicas=[N] \
-n [namespace] \
--context [dr-region-context]
# Verify all pods are running
kubectl get pods -n [namespace] --context [dr-region-context]
```
**Step 5 — Cut over DNS / load balancer (5 minutes)**
```bash
# Update DNS to point to DR region load balancer
# AWS Route 53:
aws route53 change-resource-record-sets \
--hosted-zone-id [zone-id] \
--change-batch file://dr-failover-dns.json
# Verify DNS propagation (may take up to [TTL] seconds)
dig [service-domain] @8.8.8.8
```
**Step 6 — Verify end-to-end**
```bash
# Full smoke test against DR endpoint
curl -s https://[service-url]/health
[run automated smoke test suite if available]
```
**Recovery confirmed when:** DNS resolves to DR region, smoke tests pass, error rate is at baseline.
**Post-failover actions (not urgent — after service is stable):**
- Do not fail back to primary until root cause is confirmed resolved
- Document data loss window (check replication lag at time of failure)
- Begin post-incident review — see [incident-postmortem skill]
---
### 3.3 Database Corruption or Data Loss
**Trigger:** Data in the database is corrupted, deleted, or otherwise incorrect due to a software bug, operator error, or hardware fault.
**Detection:** Application errors referencing missing/invalid data, monitoring alerts on query error rate, user reports.
**Expected RTO:** [90 minutes] | **Expected RPO:** [Backup interval — e.g. 1 hour]
**Step 1 — Stop the bleeding immediately**
```bash
# Put the service into maintenance mode to prevent further writes to corrupted data
[command to enable maintenance mode — e.g. kubectl set env deployment/[name] MAINTENANCE_MODE=true]
# Or: scale down the service to zero to prevent writes
kubectl scale deployment/[service-name] --replicas=0 -n [namespace]
```
**Step 2 — Assess scope of corruption**
```bash
# Identify which tables/records are affected
[SQL query to check data integrity — e.g.]
# psql $DATABASE_URL -c "SELECT COUNT(*) FROM [table] WHERE [integrity check condition]"
# Determine when corruption started (cross-reference with deploy times and error logs)
[log query to find earliest error — e.g. in Datadog:]
# service:[service-name] status:error "[corruption error message]" | sort by timestamp asc
```
**Step 3 — Identify the correct restore point**
```bash
# List available backups
[command to list backups — e.g. for RDS:]
aws rds describe-db-snapshots \
--db-instance-identifier [db-identifier] \
--query 'DBSnapshots[*].[SnapshotCreateTime,DBSnapshotIdentifier]' \
--output table
# Choose the most recent backup BEFORE corruption started
# Record the chosen snapshot ID: [snapshot-id]
```
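
Choosing "the most recent backup before corruption started" can be scripted once the corruption start time is known. The sketch below assumes the snapshot list arrives as one `TIMESTAMP ID` pair per line (for example, the `describe-db-snapshots` query above run with `--output text`) and that all timestamps share one ISO 8601 format and time zone, so string order matches chronological order. `latest_snapshot_before` is a hypothetical helper name.

```bash
# latest_snapshot_before CUTOFF
# Reads "TIMESTAMP ID" lines on stdin and prints the ID of the newest
# snapshot whose timestamp sorts strictly before CUTOFF.
# Prints nothing when no snapshot predates CUTOFF.
latest_snapshot_before() {
  local cutoff=$1
  # Appending "" forces awk to compare as strings rather than numbers.
  awk -v cutoff="$cutoff" '($1 "") < (cutoff "") { print $1, $2 }' \
    | LC_ALL=C sort \
    | tail -n 1 \
    | awk '{ print $2 }'
}

# Usage (snapshot IDs here are made up for illustration):
printf '%s\n' \
  '2024-05-01T01:00:00Z snap-a' \
  '2024-05-01T03:00:00Z snap-c' \
  '2024-05-01T02:00:00Z snap-b' \
  | latest_snapshot_before '2024-05-01T02:30:00Z'   # prints snap-b
```

If the output is empty, no available snapshot predates the corruption and the restore point needs to come from another source, such as transaction log backups.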
**Step 4 — Restore from backup**
```bash
# Restore to a NEW database instance (never overwrite production directly)
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier [service-name]-restored-[date] \
--db-snapshot-identifier [snapshot-id] \
--region [region]
# Wait for restore to complete
aws rds wait db-instance-available \
--db-instance-identifier [service-name]-restored-[date]
# Get the restored instance endpoint
aws rds describe-db-instances \
--db-instance-identifier [service-name]-restored-[date] \
--query 'DBInstances[0].Endpoint.Address'
```
**Step 5 — Validate restored data**
```bash
# Connect to restored database and verify integrity
psql [restored-db-endpoint] -U [user] -d [database] -c "[data integrity query]"
# Confirm record counts match expectations
psql [restored-db-endpoint] -U [user] -d [database] -c "SELECT COUNT(*) FROM [critical-table]"
```
**Step 6 — Point service at restored database**
```bash
kubectl set env deployment/[service-name] \
DATABASE_URL=postgres://[user]:[pass]@[restored-endpoint]/[db] \
-n [namespace]
kubectl scale deployment/[service-name] --replicas=[N] -n [namespace]
```
**Recovery confirmed when:** Service is running against restored database, data integrity checks pass, error rate is at baseline.
---
### 3.4 Critical Dependency Outage
**Trigger:** A service that [service name] depends on is unavailable or degraded.
**Detection:** Increased error rate or latency on endpoints that call [dependency], alerts from dependency owner.
**Expected RTO:** Depends on dependency — [30 minutes for mitigation, resolution depends on dependency owner]
**Dependency map:**
| Dependency | Criticality | Degraded behaviour | Mitigation |
|---|---|---|---|
| [Database] | Critical — all writes fail | Full outage | Activate DR database (Section 3.3) |
| [Cache — Redis] | High — latency increases | Performance degradation | Bypass cache, serve from DB |
| [Auth service] | Critical — auth fails | All authenticated endpoints fail | Return cached tokens (if implemented) |
| [Message queue] | Medium — async processing delays | Writes succeed, async jobs queue | Queue backlog — see on-call runbook |
| [External API — name] | Low — feature X unavailable | Graceful degradation | Feature flag to disable feature X |
**Mitigation steps:**
```bash
# Enable circuit breaker / fallback for [dependency] if implemented
kubectl set env deployment/[service-name] [DEPENDENCY]_CIRCUIT_BREAKER=open -n [namespace]
# Enable feature flag to disable [dependency-backed feature]
[feature flag CLI command or dashboard link]
# Check if dependency has a status page
# [Dependency status URL]
```
**Escalation:** Contact [dependency] on-call via [PagerDuty / Slack `#[channel]`]. Share your service's error rate and the time dependency errors started.
---
### 3.5 Security Breach or Ransomware
**Trigger:** Evidence of unauthorized access, data exfiltration, or encryption of service data.
**Detection:** Security tooling alert, unusual access patterns, user reports of data exposure.
**Expected RTO:** [4+ hours — prioritise containment over speed] | **Expected RPO:** [Last verified clean backup]
**Step 1 — Isolate immediately**
```bash
# Take the service offline — do not attempt to recover while breach is active
kubectl scale deployment/[service-name] --replicas=0 -n [namespace]
# Revoke all API keys and service account credentials immediately
[command to rotate secrets — e.g. via Vault or cloud provider]
# Block all external access at network level
[firewall/security group command to deny all inbound traffic]
```
**Step 2 — Notify security team immediately**
Page [Security lead] via PagerDuty. Do NOT attempt to remediate without security team involvement.
Post to `#security-incidents` (private channel, not `#incidents`):
```
🔴 SECURITY INCIDENT — [Service Name]
Time detected: [Time]
Evidence: [One sentence — what was observed]
Actions taken: Service isolated, credentials revoked
Awaiting: Security team guidance
```
**Step 3 — Preserve evidence**
```bash
# Export current logs before any remediation
[log export command — preserve evidence for forensics]
# Snapshot the current state of all infrastructure
[snapshot/image command]
```
**Steps 4+ — Follow security team guidance.** Do not restore from backup until security team confirms the attack vector is closed.
---
### 3.6 Accidental Bulk Data Deletion
**Trigger:** An operator, script, or application bug has deleted records in bulk.
**Detection:** Sudden drop in record counts, user reports of missing data, application errors.
**Expected RTO:** [60 minutes] | **Expected RPO:** [Backup interval]
```bash
# Step 1 — Stop further writes immediately
kubectl scale deployment/[service-name] --replicas=0 -n [namespace]
# Step 2 — Determine what was deleted and when
psql $DATABASE_URL -c "
SELECT schemaname, relname,
n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC LIMIT 10;
"
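# Step 3 — Check whether the delete is still pending or was rolled back (PostgreSQL)
# Rows removed by a committed DELETE are invisible to ordinary queries, even before VACUUM runs;
# this query only surfaces rows locked or deleted by transactions that have not committed
psql $DATABASE_URL -c "
SELECT * FROM [table]
WHERE xmax != 0 -- rows locked, or deleted by an uncommitted or aborted transaction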
LIMIT 100;
"
# Step 4 — If not recoverable via MVCC, restore from backup
# Follow Section 3.3 (Database Corruption runbook) from Step 3 onward
```
---
## 4. Backup and Restore Procedures
### Backup Configuration
| Data store | Backup type | Frequency | Retention | Location |
|---|---|---|---|---|
| [Primary database] | Automated snapshots | Every [N] hours | [N] days | [S3 bucket / cloud storage path] |
| [Primary database] | Transaction log backups | Continuous | [N] days | [Location] |
| [Secondary store — e.g. Redis] | RDB dump | Daily | [N] days | [Location] |
| [Blob/object storage] | Cross-region replication | Continuous | [N] days | [DR region bucket] |
| [Config / secrets] | Terraform state + Vault backup | On change | Indefinite | [Location] |
### Backup Validation (Run Weekly)
```bash
# Test restore of latest database backup to a throwaway instance
# Compute the identifier once so the restore, wait, and delete steps target the same instance
TEST_ID="[service-name]-backup-test-$(date +%Y%m%d)"
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier "$TEST_ID" \
--db-snapshot-identifier $(aws rds describe-db-snapshots \
--db-instance-identifier [db-id] \
--query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
--output text)
# Wait for restore, then run integrity checks
aws rds wait db-instance-available --db-instance-identifier "$TEST_ID"
psql [test-instance-endpoint] -c "[integrity check query]"
# Confirm row counts match recent production values (allow ≤ RPO difference)
psql [test-instance-endpoint] -c "SELECT COUNT(*) FROM [critical-table]"
# Destroy the test instance
aws rds delete-db-instance \
--db-instance-identifier "$TEST_ID" \
--skip-final-snapshot
```
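The row-count comparison in the validation script is easy to misjudge by eye. A minimal sketch of that check follows, assuming both counts have already been fetched (for example with `psql -At`). The function name `row_count_ok` and its tolerance argument are illustrative and not part of any tool.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed when the restored row count is within an allowed
# distance of the production count (roughly the rows written in one RPO window).
row_count_ok() {
  local prod=$1 restored=$2 allowed=$3
  local diff=$(( prod - restored ))
  # Use the absolute difference: deletes in production can make the backup larger.
  if (( diff < 0 )); then
    diff=$(( -diff ))
  fi
  (( diff <= allowed ))
}

# Example: production has 10000 rows, the restored backup has 9950, and up to
# 100 rows of drift is acceptable for the backup interval.
if row_count_ok 10000 9950 100; then
  echo "row count within tolerance"
else
  echo "row count drift exceeds tolerance"
fi
```

The tolerance should come from the agreed RPO and the table's typical write rate rather than a fixed guess.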
---
## 5. DR Testing Cadence
Regular testing is mandatory. An untested DR plan is not a DR plan.
| Test type | Frequency | Who runs it | Pass criteria |
|---|---|---|---|
| Backup restore validation | Weekly (automated) | On-call rotation | Restore completes, integrity checks pass |
| Zone failover drill | Monthly | Engineering team | RTO target met, zero data loss |
| Region failover drill | Quarterly | Engineering + SRE | RTO/RPO targets met |
| Full DR game day | Annually | Engineering + stakeholders | All scenarios exercised, gaps documented |
| Chaos engineering (infra failures) | Weekly (automated) | Chaos engineering tooling | Service degrades gracefully, recovers automatically |
### Game Day Procedure
1. **Pre-game day (1 week before):** Notify all stakeholders, freeze production changes for the day, prepare DR environment.
2. **Scope definition:** Choose 2-3 scenarios from Section 2. Document expected outcomes before the test.
3. **Execute:** One person acts as incident commander, others execute the runbook steps, and a separate observer watches and times each step.
4. **Measure:** Record actual RTO and RPO against targets for each scenario.
5. **Debrief (same day):** Document gaps, runbook inaccuracies, and automation opportunities.
6. **Action items:** File tickets for every gap found. Priority: P1 items must be fixed before next game day.
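The "Measure" step can be scripted so every scenario is judged the same way. The sketch below assumes the observer records outage start and confirmed recovery as epoch seconds. The function name `rto_result` and the output format are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare a measured recovery time against its RTO target.
# Arguments: outage start (epoch seconds), recovery confirmed (epoch seconds),
# RTO target in minutes.
rto_result() {
  local start=$1 end=$2 target_min=$3
  # Round elapsed seconds up to whole minutes so a partial minute still counts.
  local actual_min=$(( (end - start + 59) / 60 ))
  local verdict="MET"
  if (( actual_min > target_min )); then
    verdict="MISSED"
  fi
  echo "target=${target_min}min actual=${actual_min}min ${verdict}"
}

# Example: recovery confirmed 1500 seconds (25 minutes) after the outage began,
# against a 30-minute target.
rto_result 0 1500 30   # prints: target=30min actual=25min MET
```

Rounding up is a deliberate choice here: a recovery that takes 30 minutes and 1 second is reported as 31 minutes and therefore as a miss.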
---
## 6. Communication Plan
### Internal Communication During DR Event
**Incident commander responsibilities:**
- Declare the DR event and open the incident channel
- Post updates every 15 minutes minimum
- Make the call to fail over (do not let the team decide by committee)
- Notify business stakeholders of expected recovery time
**Notify these people at DR event start:**
| Role | Name | Contact | When to notify |
|---|---|---|---|
| Engineering manager | [Name] | [Slack / Phone] | Immediately |
| CTO / VP Engineering | [Name] | [Phone] | Tier 1 services: immediately |
| Customer success lead | [Name] | [Slack] | If customer-facing impact |
| Security lead | [Name] | [Slack / PagerDuty] | If breach suspected |
| Legal / compliance | [Name] | [Email / Phone] | If data loss involves PII |
### Communication Templates
**DR event declared:**
```
🔴 DR EVENT — [Service Name]
Time: [HH:MM UTC]
Scenario: [Zone failure / Region failure / Data loss / etc.]
Impact: [Who is affected and how]
RTO target: [X minutes]
Incident commander: [Name]
War room: [Slack channel / call link]
Next update: [Time + 15 min]
```
**Status update (every 15 minutes):**
```
🔴 DR UPDATE — [Service Name] — [HH:MM UTC]
Status: [Investigating / Executing recovery / Verifying]
Progress: [One sentence on current step]
Blockers: [Any — or "None"]
Updated RTO estimate: [Time]
Next update: [Time + 15 min]
```
**Recovery confirmed:**
```
✅ DR RESOLVED — [Service Name] — [HH:MM UTC]
Total downtime: [X minutes]
Data loss: [None / X minutes of transactions]
RTO target: [X min] — Actual: [Y min] — [MET / MISSED]
RPO target: [X min] — Actual: [Y min] — [MET / MISSED]
Root cause: [One sentence]
Post-incident review: [Scheduled for / Link when created]
```
---
## 7. DR Readiness Checklist
Run this checklist quarterly and before any major infrastructure change:
**Backups:**
- [ ] Automated backups are running and alerts fire if they fail
- [ ] Most recent backup restore was tested within the last 7 days
- [ ] Backup retention meets RPO and compliance requirements
- [ ] Backups are stored in a separate region / account from primary
**Failover infrastructure:**
- [ ] DR region / environment exists and is provisioned (not just documented)
- [ ] DNS failover procedure is documented with exact commands
- [ ] DR database replica is current (replication lag is within RPO)
- [ ] Service can be deployed in DR region with a single command or automated pipeline
**Runbooks:**
- [ ] All runbooks in Section 3 have been tested within the last quarter
- [ ] Runbook commands have been verified against current infrastructure (no stale references)
- [ ] Contact list is current (no departed employees)
**Access:**
- [ ] On-call engineers have access to DR region console / CLI
- [ ] Service account credentials for DR region are provisioned and tested
- [ ] Break-glass accounts exist for emergency access if SSO is unavailable
**Monitoring:**
- [ ] Monitoring exists in DR region (not just primary)
- [ ] Alerts fire correctly when DR environment has issues
---
## Quality Checks
- [ ] RPO and RTO targets are specific numbers, not ranges, and are agreed with the business
- [ ] Every command in every runbook has been run by a human in the last quarter — not copied from documentation untested
- [ ] DR database exists in the DR region and replication lag is monitored
- [ ] Backup restore has been tested end-to-end within the last 7 days
- [ ] The game day schedule is on the team calendar — not just documented here
- [ ] Contact list contains current phone numbers, not just Slack handles (Slack may be down during a DR event)
- [ ] Security breach runbook (3.5) explicitly names the security team contact and does not attempt self-remediation
- [ ] All thresholds (RTO/RPO) are visible in the monitoring dashboard so actual vs. target is measurable in real time
## Anti-Patterns
- [ ] Do not write runbook commands without testing them — an untested command in a runbook is actively dangerous during a real disaster when cognitive load is highest
- [ ] Do not set RTO/RPO targets without business sign-off — technical teams often set aspirational targets that do not reflect actual business cost tolerance for downtime
- [ ] Do not include only the "happy path" of each failover scenario — runbooks must explicitly cover what to do when the recovery step itself fails
- [ ] Do not list Slack handles as the only escalation contact — Slack may be unavailable during a region-wide failure; phone numbers are mandatory
- [ ] Do not schedule DR game days without pre-committing to fix the gaps found — a game day that produces action items no one owns is theater, not preparedness
@@ -0,0 +1,346 @@
---
name: engineering-hiring-rubric
description: "Build an engineering hiring rubric and technical interview scorecard for evaluating software engineers at a specific level. Use when asked to create an interview rubric, design a hiring process, build a technical scorecard, or standardize engineer evaluation. Produces a full interview scorecard, behavioral question bank, technical question set with evaluation criteria, system design rubric, and debrief agenda."
---
# Engineering Hiring Rubric
Produce a complete hiring rubric and interview scorecard for evaluating software engineers at a specific role and level. The rubric must be specific enough that two interviewers who have never compared notes will score the same candidate within one level of each other. That requires: explicit behavioral anchors (what does "Strong Hire" look like vs. "Hire" for each competency), calibrated technical questions with written evaluation criteria, and a structured debrief format that surfaces signal rather than recency bias. Include calibration notes to help interviewers recognize and counter common evaluation biases.
## Required Inputs
Ask for these if not already provided:
- **Role** — backend, frontend, fullstack, SRE/platform, data, ML, or mobile engineer
- **Level** — junior (L3/IC2), mid (L4/IC3), senior (L5/IC4), or staff (L6/IC5); clarify the company's level naming if different
- **Team context** — what the team builds, team size, and what problems this hire will work on in the first year
- **Tech stack** — primary languages and frameworks for the technical questions; list the stack explicitly
- **Interview format** — which rounds are used (phone screen, coding, system design, behavioral, take-home); if not specified, produce a recommended format
## Output Format
---
# Engineering Hiring Rubric: [Role] — [Level]
**Role:** [e.g., Senior Backend Engineer]
**Level equivalent:** [e.g., L5 / IC4 / Senior]
**Team:** [Team name and one-sentence description of what they build]
**Tech stack:** [Languages and frameworks]
**Interview loop:** [List the rounds in order]
---
## 1. Role Definition and Level Expectations
### What This Role Does
[2-3 sentences describing the scope of work: what systems they'll own, what problems they'll solve, and who they'll work with. Make this specific to the team context provided.]
### Level Bar
Define the minimum bar for a Hire recommendation at this level. This is not the ideal candidate description — it is the floor.
| Dimension | [Level] Floor | One Level Below (No Hire) | One Level Above (Stretch) |
|-----------|--------------|---------------------------|---------------------------|
| Technical scope | [e.g., "Owns a service or major feature area end-to-end with minimal guidance"] | [e.g., "Completes well-defined tasks; needs guidance on scope and approach"] | [e.g., "Leads cross-team technical initiatives; sets technical direction"] |
| Problem solving | [e.g., "Breaks ambiguous problems into concrete sub-problems independently"] | [e.g., "Solves defined problems well; struggles with ambiguity"] | [e.g., "Identifies problems others miss; structures organization-level technical challenges"] |
| Code quality | [e.g., "Writes production-ready code; anticipates edge cases; reviewable without significant rework"] | [e.g., "Writes working code that requires significant review feedback"] | [e.g., "Sets code quality standards; designs reusable abstractions adopted by others"] |
| Communication | [e.g., "Communicates technical decisions clearly to peers and stakeholders"] | [e.g., "Communicates well with direct team; struggles with cross-team or stakeholder comms"] | [e.g., "Drives technical consensus across teams; writes documents others reference"] |
| Ownership | [e.g., "Sees work to production; monitors after deploy; follows up on issues proactively"] | [e.g., "Delivers assigned work; escalates issues but doesn't drive them to resolution"] | [e.g., "Owns outcomes across teams; improves team processes and systems beyond their own work"] |
---
## 2. Interview Loop Structure
| Round | Format | Duration | Interviewer | Competencies Assessed |
|-------|--------|----------|-------------|----------------------|
| Phone screen | Video call, technical questions | 45 min | [Hiring manager or senior engineer] | Problem solving, communication, basic technical depth |
| Coding interview 1 | Live coding — [platform] | 60 min | [Engineer] | Coding, data structures, code quality |
| Coding interview 2 | Live coding — [platform] | 60 min | [Engineer] | Algorithms, debugging, code quality |
| System design | Whiteboard / shared doc | 60 min | [Senior/Staff engineer] | System design, scalability, technical communication |
| Behavioral | Structured interview | 45 min | [Hiring manager] | Ownership, collaboration, growth mindset |
| [Optional] Take-home | Asynchronous project | [X hours] | [Reviewer] | Code quality, thoroughness, real-world problem solving |
**Interview coverage matrix:** Each competency dimension must be assessed by at least 2 independent interviewers.
| Competency | Phone Screen | Coding 1 | Coding 2 | System Design | Behavioral |
|-----------|-------------|---------|---------|--------------|-----------|
| Coding | ○ | ● | ● | ○ | |
| System design | ○ | | | ● | |
| Problem solving | ● | ● | ● | ● | |
| Code quality | | ● | ● | | |
| Communication | ● | ● | ● | ● | ● |
| Ownership | ○ | | | ○ | ● |
| Debugging | | ● | ● | | |
● = Primary signal ○ = Secondary signal
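The two-interviewer coverage rule can be checked mechanically once the matrix is written down. The sketch below assumes a hypothetical input format of one `competency:round` pair per line, listing every primary or secondary signal. The function name `undercovered_competencies` is illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read "competency:round" pairs on stdin and print every
# competency that is assessed in fewer than two distinct rounds.
undercovered_competencies() {
  # sort -u drops duplicate pairs so each round is counted once per competency.
  sort -u \
    | awk -F: '{ count[$1]++ } END { for (c in count) if (count[c] < 2) print c }' \
    | sort
}

# Example with a deliberately reduced matrix: System design has only one round.
printf '%s\n' \
  "Coding:Coding 1" "Coding:Coding 2" \
  "System design:System Design" \
  "Ownership:Phone Screen" "Ownership:Behavioral" \
  | undercovered_competencies
# prints: System design
```

An empty result means every listed competency meets the rule. A competency that appears in no round at all will not be reported, so the input should be built from the full competency list.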
---
## 3. Coding Interview Guide
### Question Selection
Choose 1-2 problems per coding round. Problems should be solvable in 30-40 minutes with the remaining time for discussion and follow-ups. Prefer problems with multiple solution tiers so you can see how far candidates take their thinking.
### Problem Template
**Problem: [Title]**
*Prompt (read to candidate):*
> [Problem statement — be specific. Include constraints (input size, value ranges). Avoid ambiguity that tests problem-reading rather than problem-solving.]
*Example:*
> Given a list of integers representing stock prices at each minute of a trading day, return the maximum profit you could achieve by making exactly one buy and one sell. You may not sell before you buy.
**Clarifying questions a strong candidate will ask:**
- [e.g., "Can the list be empty?" / "Are all values positive?" / "Can profit be negative — i.e., should we return 0 if no profit is possible?"]
**Solution tiers:**
| Tier | Approach | Time Complexity | Space Complexity | Signals |
|------|----------|-----------------|-----------------|---------|
| Baseline | [Brute force — O(n²) nested loop] | O(n²) | O(1) | Can solve the problem; understands correctness |
| Expected | [Single pass, tracking min price seen so far] | O(n) | O(1) | Strong problem solver; explains tradeoff |
| Strong | [Generalizes to k transactions, or extends to cooldown variant without prompting] | O(nk) for k transactions; O(n) for cooldown | O(k) for k transactions; O(1) for cooldown | Staff-level generalization thinking |
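For the example stock-price problem, the Baseline and Expected tiers can be sketched as reference solutions for interviewers. Shell is used here only to match the other code in this document; in a real loop the reference solution would normally be kept in the team's interview language. The function names are hypothetical, and the sketch assumes integer prices and that 0 is returned when no profit is possible (one of the clarifying questions above).

```bash
#!/usr/bin/env bash
# Baseline tier: compare every buy/sell pair. O(n^2) time, O(1) extra space.
max_profit_bruteforce() {
  local -a prices=("$@")
  local n=${#prices[@]} best=0 i j diff
  for (( i = 0; i < n; i++ )); do
    for (( j = i + 1; j < n; j++ )); do
      diff=$(( prices[j] - prices[i] ))
      if (( diff > best )); then best=$diff; fi
    done
  done
  echo "$best"
}

# Expected tier: single pass tracking the lowest price seen so far.
# O(n) time, O(1) extra space.
max_profit_single_pass() {
  local best=0 min_price price
  if (( $# == 0 )); then
    echo 0
    return
  fi
  min_price=$1
  for price in "$@"; do
    if (( price < min_price )); then
      min_price=$price
    elif (( price - min_price > best )); then
      best=$(( price - min_price ))
    fi
  done
  echo "$best"
}

max_profit_bruteforce 7 1 5 3 6 4    # prints 5 (buy at 1, sell at 6)
max_profit_single_pass 7 1 5 3 6 4   # prints 5
```

Having both tiers on hand lets the interviewer confirm a candidate's brute-force answer before prompting toward the single-pass version.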
**Follow-up questions:**
- [e.g., "What if you could make at most k trades?"]
- [e.g., "How would you test this function? Write me 3 test cases."]
- [e.g., "Walk me through your code as if you're explaining it in a code review."]
**Evaluation rubric for this problem:**
| Signal | Strong Hire | Hire | No Hire |
|--------|------------|------|---------|
| Problem comprehension | Asks 1-2 clarifying questions immediately; identifies edge cases before coding | Understands the problem after 1 prompt; misses 1-2 edge cases | Misunderstands the problem or requires repeated clarification |
| Solution quality | O(n) solution; clean code; handles all edge cases | O(n) with hints; code is readable but has minor issues | O(n²) with hints, or correct solution with significant issues |
| Code quality | Well-named variables; logical structure; would pass code review | Functional but verbose or inconsistently named | Hard to follow; would require significant review feedback |
| Communication | Narrates thinking throughout; explains complexity; self-corrects | Explains solution when asked; answers follow-ups well | Silent during coding; unable to explain their approach |
| Follow-ups | Extends solution confidently; identifies further improvements | Handles follow-ups with moderate prompting | Unable to extend or explain tradeoffs |
---
## 4. System Design Interview Guide
### [Level]-Appropriate Design Scope
At [Level], expect the candidate to:
- [e.g., Senior: "Design a complete system with capacity estimates, component breakdown, and discussion of failure modes"]
- [e.g., Mid: "Design the core components of a system; may need prompting on scalability and failure handling"]
- [e.g., Junior: "Design a simple client-server system; focus on clarity of thinking over complete distributed systems knowledge"]
### Sample Design Question
**Question:** "Design [a URL shortener / a rate limiter / a notification service / a ride-matching system — choose one relevant to the team's domain]."
**Evaluation dimensions:**
| Dimension | What to assess | Strong Hire | Hire | No Hire |
|-----------|---------------|------------|------|---------|
| Requirements clarification | Does the candidate ask before designing? | Asks scope, scale, SLA, and key use cases before drawing anything | Asks some questions; may miss scale or SLA | Starts designing immediately without clarifying |
| High-level design | Can they describe the major components? | Clear component breakdown with justified choices; covers data flow | Reasonable breakdown; may overcomplicate or undercomplicate | Missing key components or cannot explain data flow |
| Data model | Can they design a schema or data structure for the system? | Models the core entities with normalization/denormalization tradeoffs discussed | Reasonable schema; may miss indexing or partitioning needs | Cannot model the data or produces clearly wrong schema |
| Scalability | Can they identify and address bottlenecks? | Identifies bottlenecks proactively; proposes horizontal scaling, caching, or sharding as appropriate | Discusses scaling when prompted; reasonable solutions | Cannot identify bottlenecks or proposes solutions that don't match the scale |
| Failure handling | Do they think about what happens when things break? | Proactively discusses failure modes: single points of failure, retry logic, idempotency | Discusses failure when prompted; identifies some failure modes | Does not think about failure; assumes happy path |
| Communication | Is the design explained clearly? | Could run this meeting with a team of engineers at a real company | Clear enough to follow; some gaps in explanation | Difficult to follow; interviewer cannot understand the design |
### Design Probing Questions
Use these to probe depth after the candidate presents their design:
- "Walk me through what happens when a write request comes in at peak load — 10,000 requests per second."
- "Your primary database just failed. What happens to the system?"
- "You estimated X QPS. How would your design change if it needed to handle 100× that?"
- "Where is the first place this system would fall over under load?"
- "How would you monitor this in production? What would your on-call runbook look like?"
---
## 5. Behavioral Interview Question Bank
Map every question to a competency. Ask 4-6 questions per behavioral round using STAR format (Situation, Task, Action, Result). Do not ask leading questions.
### Competency: Ownership and Delivery
1. "Tell me about a time you owned something end-to-end — from design through production monitoring. What did you do when something went wrong after launch?"
- *Strong signal:* Describes proactive monitoring setup, a specific incident they caught themselves, and what they changed
- *Weak signal:* Describes writing the code and handing off; no discussion of production behavior
2. "Describe a project that was significantly delayed or failed. What was your role, and what did you take responsibility for?"
- *Strong signal:* Direct ownership of their contribution to the failure; specific changes to how they work
- *Weak signal:* Attributes all delay to external factors; no reflection on their own actions
### Competency: Technical Judgment
3. "Tell me about a significant technical decision you made. What options did you consider, and how did you decide?"
- *Strong signal:* Named alternatives with clear tradeoffs; explains who they consulted; reflects on whether they'd decide the same way today
- *Weak signal:* "I knew X was the right answer" without describing the decision process
4. "Describe a time you had to push back on a technical direction — either from management or from peers. What happened?"
- *Strong signal:* Evidence-based disagreement; constructive communication; willing to commit once decision was made even if they lost the argument
- *Weak signal:* Either never pushed back or pushed back emotionally without evidence
### Competency: Collaboration and Communication
5. "Tell me about a time you had to explain a complex technical concept to a non-technical stakeholder. How did you approach it?"
- *Strong signal:* Used analogy or simplified model; confirmed understanding; adapted to the audience
- *Weak signal:* "I explained it technically and told them to trust me"
6. "Describe a situation where you and a peer strongly disagreed on an approach. How did it resolve?"
- *Strong signal:* Sought a third opinion or data; focused on the right outcome, not being right; maintained relationship
- *Weak signal:* Escalated immediately or capitulated without engaging
### Competency: Growth and Learning
7. "What is a significant technical mistake you made in the last two years? What did you learn from it?"
- *Strong signal:* Specific mistake, clear causal analysis, concrete behavioral change afterward
- *Weak signal:* Cannot name a specific mistake; describes a minor issue to avoid vulnerability
8. "How do you stay current in [relevant technical area]? Give me a specific example of something you learned recently and applied."
- *Strong signal:* Named sources, applied learning in a specific project with a concrete outcome
- *Weak signal:* "I read blogs" with no specifics; no applied example
---
## 6. Full Interview Scorecard
Complete one scorecard per interview round. Collect all scorecards before the debrief.
```
INTERVIEW SCORECARD
===================
Candidate: ______________________
Interviewer: ______________________
Round: ______________________
Date: ______________________
Interview format: ______________________
COMPETENCY RATINGS
Rate each dimension independently. Do not average.
Scale: 1 = Strong No Hire | 2 = No Hire | 3 = Hire | 4 = Strong Hire
1 2 3 4 Notes
Coding / Technical skill [ ] [ ] [ ] [ ] ___________________________
Problem solving [ ] [ ] [ ] [ ] ___________________________
System design [ ] [ ] [ ] [ ] ___________________________
Code quality [ ] [ ] [ ] [ ] ___________________________
Debugging [ ] [ ] [ ] [ ] ___________________________
Communication [ ] [ ] [ ] [ ] ___________________________
Ownership [ ] [ ] [ ] [ ] ___________________________
Collaboration [ ] [ ] [ ] [ ] ___________________________
SPECIFIC EVIDENCE
What did the candidate do or say that drove your rating?
(Required — write observable behaviors, not impressions)
Strongest signal (positive):
___________________________________________________________________________
Strongest concern or gap:
___________________________________________________________________________
OVERALL RECOMMENDATION
[ ] Strong Hire [ ] Hire [ ] No Hire [ ] Strong No Hire
OVERALL RECOMMENDATION RATIONALE
(Required — 3-5 sentences minimum. State your recommendation, the evidence
that supports it, and the specific gap or risk if not a Strong Hire)
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
Level signal: This candidate demonstrated [ L_ / L_ ] level behaviors.
SHOULD INTERVIEWERS DISCUSS BEFORE DEBRIEF?
[ ] No — I have a clear independent signal
[ ] Yes — I need context on [specific area] to complete my assessment
```
---
## 7. Hiring Recommendation Framework
| Recommendation | Meaning | When to use |
|---------------|---------|-------------|
| **Strong Hire** | Confident the candidate will exceed the level bar and be a high performer on the team | Evidence across 3+ competencies at above-bar level; no significant concerns |
| **Hire** | Confident the candidate meets the level bar; will perform well | Meets bar on all must-hire competencies; may have 1 area to develop |
| **No Hire** | Does not meet the level bar | Below bar on 1+ must-hire competency, or gap too large to close quickly |
| **Strong No Hire** | Clear mismatch — well below the bar, or a specific disqualifying signal | Significant gaps across multiple competencies, or a values/behavior concern |
**Must-hire competencies for [Role] at [Level]:** [List 3-4 competencies where a No Hire score on any one of them means the overall recommendation must be No Hire, regardless of performance elsewhere. Example: "Coding and System Design are must-hire competencies for a Senior Backend Engineer. Strong performance on Behavioral dimensions cannot compensate for a No Hire on Coding."]
**Debrief rule:** A Strong Hire can override one No Hire only if: (a) the No Hire is not on a must-hire competency, and (b) the Strong Hire interviewer can articulate why the concern is not disqualifying. A Strong No Hire cannot be overridden — escalate to hiring manager.
---
## 8. Debrief Agenda
Do not discuss scorecards verbally before the debrief. Every interviewer submits a written scorecard first.
```
DEBRIEF AGENDA — [Candidate Name]
Duration: 45 minutes
Facilitator: [Hiring Manager]
0:00-0:05 SCORECARD REVIEW
Each interviewer states their overall recommendation only (no rationale yet).
Facilitator notes alignment and disagreements on whiteboard/doc.
0:05-0:15 EVIDENCE ROUND
Go around the table. Each interviewer shares:
- Their strongest positive signal (observable behavior, not impression)
- Their biggest concern (observable behavior, not impression)
No discussion yet — just evidence gathering.
0:15-0:30 DISCUSS DISAGREEMENTS
Address only the competency dimensions where interviewers disagree.
Anchor discussion on: "What did you observe?" not "What do you think?"
If interviewers assessed different competencies, disagreement may reflect
insufficient signal — note this.
0:30-0:40 DECISION
Reach a decision on overall recommendation.
If consensus: state the recommendation and rationale.
If not consensus: hiring manager makes the call and states why.
0:40-0:45 PROCESS NOTES
- Were any questions unclear or hard to compare across candidates?
- Any bias signals observed during the debrief? (see Section 9)
- Feedback to improve the process for next time.
```
---
## 9. Calibration and Bias Reduction Notes
Brief every interviewer on these before they conduct their first interview for this role.
| Bias | How it manifests | Counter-measure |
|------|-----------------|-----------------|
| Halo effect | Strong performance in round 1 colors ratings in round 2 | Submit scorecard before reading others; rate each competency independently |
| Similarity bias | "I liked them" correlates with "they think like me" | Require observable evidence for every rating; check: "Is this a signal about their ability or their similarity to me?" |
| Recency bias | Final impression dominates overall rating | Take notes during the interview; write evidence immediately after; debrief uses written evidence, not memory |
| Expectation anchoring | First interviewer's opinion anchors all others | No verbal discussion between interviewers before debrief; written scorecards submitted before debrief starts |
| Culture fit as cover | "Not a culture fit" without specific behavioral evidence | "Culture fit" is not a valid dimension on this scorecard; use Collaboration and Communication with evidence |
| Credential bias | Degree or previous employer overweights rating | Do not list educational background in pre-interview briefing documents; focus on demonstrated behaviors |
| Confidence ≠ Competence | Articulate candidates rated higher regardless of correctness | Grade the answer quality, not the delivery style; use written rubrics per question |
---
## Quality Checks
- [ ] Level bar table defines a concrete floor for the level — not aspirational traits — with a comparison to one level below and above
- [ ] Every behavioral question includes explicit Strong Hire and Weak/No Hire signal descriptions — not just the question text
- [ ] Coding problem(s) include solution tiers with time and space complexity, plus a per-question rubric with behavioral anchors
- [ ] System design rubric evaluates at minimum: requirements clarification, component design, data model, scalability, and failure handling
- [ ] Scorecard uses observable behavior fields ("What did the candidate do or say") — not impression fields
- [ ] Must-hire competencies are explicitly named for the role and level
- [ ] Debrief agenda enforces written scorecard submission before verbal discussion to prevent anchoring
## Anti-Patterns
- [ ] Do not use a single behavioral anchor description per competency — you must define what Strong Hire AND No Hire look like separately, or interviewers cannot calibrate
- [ ] Do not allow "culture fit" as a standalone assessment dimension — it masks similarity bias; all judgments must use observable behavioral evidence
- [ ] Do not let interviewers share scorecard feedback before the debrief — verbal pre-debrief discussion anchors everyone to the first opinion expressed
- [ ] Do not set the same must-hire competency list for all engineering roles — a senior backend engineer and a frontend engineer have different non-negotiable competencies
- [ ] Do not skip the calibration bias notes section — interviewers who have never been briefed on halo effect, recency bias, and credential bias will reproduce them in every loop
@@ -0,0 +1,172 @@
---
name: engineering-weekly-report
description: "Write a weekly engineering status report for a team, service, or initiative. Use when asked to write a team update, weekly engineering report, sprint status email, or standing team communication to stakeholders. Produces a concise, scannable weekly report covering shipping progress, metrics, decisions, blockers, and next-week priorities."
---
# Engineering Weekly Report
Produce a weekly engineering status report that a team can send to stakeholders, their engineering manager, and the team itself. The format is fixed week-over-week so readers know exactly where to look — shipping progress at the top, decisions in the middle, risks and next steps at the bottom. The report must be readable in under 2 minutes. Avoid prose walls: use bullet points, status tags, and short tables. If metrics are not provided, leave the metrics section with [data needed] markers rather than fabricating numbers.
## Required Inputs
Ask for these if not already provided:
- **Team name and report period** — team name plus week number or date range (e.g., "Platform Team, Week 21, May 12-16")
- **Work items shipped this week** — what was completed and released or merged
- **Work items in progress** — what is actively being worked on, with rough percent-complete if known
- **Blocked items** — what is blocked, who owns the block, and what is needed to unblock
- **Key decisions made** — any architecture, process, or priority decisions made this week
- **Decisions needed next week** — any decisions that need to be made soon and who needs to make them
- **Risks and escalations** — anything that threatens next week's commitments or needs leadership visibility
- **Next week's top priorities** — the 3-5 things the team plans to accomplish next week
Optional but useful:
- **Key metrics** — reliability (error rate, p99 latency), velocity (story points completed), or other health indicators
- **Team health notes** — PTO, new joins, attrition, morale signals worth noting
- **Sprint or iteration number** — if the team runs sprints
## Output Format
---
# Engineering Weekly Report — [Team Name]
**Week:** [Week Number] | [Date Range, e.g., May 12-16, 2025]
**Author:** [Name or Team Lead]
**Distribution:** [e.g., Eng leadership, Product, Team]
---
## Shipping Progress
### Shipped This Week
| Item | Description | Impact |
|------|-------------|--------|
| [Feature / Fix / Infra change] | [One-line description] | [Who benefits / what it unblocks] |
| [Feature / Fix / Infra change] | [One-line description] | [Who benefits / what it unblocks] |
| [Feature / Fix / Infra change] | [One-line description] | [Who benefits / what it unblocks] |
### In Progress
| Item | Owner | Status | Target Ship |
|------|-------|--------|-------------|
| [Work item] | [Name] | [~40% / On Track / At Risk] | [Date or Sprint] |
| [Work item] | [Name] | [~70% / On Track / At Risk] | [Date or Sprint] |
| [Work item] | [Name] | [~20% / On Track / At Risk] | [Date or Sprint] |
### Blocked
| Item | Blocked Since | Blocker Description | Owner | Needed To Unblock |
|------|--------------|--------------------|----|-------------------|
| [Work item] | [Date] | [What is blocking progress] | [Name] | [Specific ask — decision, resource, dependency] |
If no items are blocked: *No active blockers.*
---
## Key Metrics
*Metrics reported as of [Date]. Prior week shown in the Last Week column.*
| Metric | This Week | Last Week | Trend | Target |
|--------|-----------|-----------|-------|--------|
| Error rate (5xx) | [X%] | [X%] | [↑ / ↓ / →] | < [threshold] |
| p99 latency | [Xms] | [Xms] | [↑ / ↓ / →] | < [threshold] |
| Deployment frequency | [X deploys] | [X deploys] | [↑ / ↓ / →] | [target] |
| Story points completed | [X] | [X] | [↑ / ↓ / →] | [sprint target] |
| On-call page volume | [X pages] | [X pages] | [↑ / ↓ / →] | < [threshold] |
**Metrics notes:** [Any context that makes the numbers meaningful — e.g., "Error rate spike on Tuesday tied to downstream dependency outage, resolved by EOD."]
If metrics are not provided: replace table rows with `[data needed — provide metric values for this section]`.
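The metrics rule above (real numbers or an explicit marker, never an estimate) can be sketched as a small row renderer. This is a minimal illustration, not part of the skill: the function names `fmt`, `trend`, and `metric_row` are hypothetical, and the exact marker text is shortened to `[data needed]`.

```python
from typing import Optional

DATA_NEEDED = "[data needed]"  # marker used instead of inventing a value


def fmt(value: Optional[float], unit: str) -> str:
    """Render a metric value, or the marker when no value was supplied."""
    return DATA_NEEDED if value is None else f"{value}{unit}"


def trend(this_week: Optional[float], last_week: Optional[float]) -> str:
    """Return an arrow for the week-over-week direction; needs both values."""
    if this_week is None or last_week is None:
        return DATA_NEEDED
    if this_week > last_week:
        return "↑"
    if this_week < last_week:
        return "↓"
    return "→"


def metric_row(name: str, this_week: Optional[float], last_week: Optional[float],
               unit: str, target: str) -> str:
    """Build one Markdown row for the Key Metrics table."""
    return (f"| {name} | {fmt(this_week, unit)} | {fmt(last_week, unit)} "
            f"| {trend(this_week, last_week)} | {target} |")


print(metric_row("p99 latency", 180, 210, "ms", "< 250ms"))
print(metric_row("Error rate (5xx)", None, 0.4, "%", "< 1%"))
```

The arrow only states direction; whether up is good or bad depends on the metric, which is why the Target column stays next to it.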
---
## Decisions
### Made This Week
| Decision | Rationale | Owner | Stakeholders Informed |
|----------|-----------|-------|----------------------|
| [Decision description] | [Why — 1 sentence] | [Name] | [Yes / No — who] |
| [Decision description] | [Why — 1 sentence] | [Name] | [Yes / No — who] |
If no decisions were made: *No major decisions this week.*
### Needed Next Week
| Decision | Context | Deadline | Decision Owner |
|----------|---------|----------|----------------|
| [What needs to be decided] | [Why it matters, what happens if delayed] | [Date] | [Name or role] |
If no decisions are pending: *No decisions pending.*
---
## Risks and Escalations
| Risk | Likelihood | Impact | Mitigation | Escalate To |
|------|-----------|--------|-----------|-------------|
| [Risk description] | [High/Med/Low] | [High/Med/Low] | [What we're doing about it] | [Name/role if escalation needed] |
**Escalations this week:** [Any item that needs immediate leadership attention — call it out explicitly here, do not bury it in a table row. If none: "None."]
---
## Team Health
| Item | Status |
|------|--------|
| Team capacity this week | [X of Y people at full capacity] |
| PTO / out of office | [Names and dates, or "None"] |
| New joins / departures | [Name, role, and date, or "None"] |
| On-call this week | [Name] |
| On-call next week | [Name] |
**Team notes:** [Any morale, workload, or team dynamic signals worth surfacing — keep this factual and constructive. If nothing to note: omit this line.]
---
## Next Week's Priorities
*The [3-5] things this team will ship or meaningfully advance next week.*
1. **[Priority item]** — [One sentence: what done looks like and who owns it]
2. **[Priority item]** — [One sentence: what done looks like and who owns it]
3. **[Priority item]** — [One sentence: what done looks like and who owns it]
4. **[Priority item]** — [One sentence: what done looks like and who owns it]
5. **[Priority item]** — [One sentence: what done looks like and who owns it]
**Capacity risk:** [If the team is at reduced capacity next week (PTO, incidents, etc.), note it here so stakeholders calibrate expectations.]
---
## Appendix: Sprint Scorecard (if applicable)
| Sprint | Committed | Completed | Completion Rate | Carried Over |
|--------|-----------|-----------|----------------|--------------|
| Sprint [N-1] | [X pts] | [X pts] | [X%] | [X pts] |
| Sprint [N] (current) | [X pts] | [X pts — partial] | [X% at midpoint] | TBD |
---
*Questions or corrections: [Slack channel or email] | Next report: [Date]*
---
## Quality Checks
- [ ] Every blocked item names a specific owner and states what is concretely needed to unblock it — not just "waiting on X"
- [ ] Decisions-needed table includes a deadline and a named decision owner, not a vague "TBD"
- [ ] Metrics table is either populated with real numbers or explicitly marked `[data needed]` — no fabricated metrics
- [ ] Next week's priorities are written as outcomes ("ship X", "complete Y migration") not as activities ("work on X")
- [ ] Escalations that need leadership attention are called out explicitly in the Risks section — not just buried in a table row
- [ ] The entire report is readable in under 2 minutes — if it is longer than one printed page, trim it
- [ ] Report period (week number and date range) is clearly stated in the header
## Anti-Patterns
- [ ] Do not fabricate metrics — if data is not available, mark the field as `[data needed]` rather than estimating; stakeholders making decisions on invented numbers is actively harmful
- [ ] Do not write next week's priorities as activities ("work on X") — they must be outcomes ("ship X", "complete Y migration") so stakeholders can evaluate whether the team delivered
- [ ] Do not bury escalations inside a risk table row — anything needing leadership attention must be called out explicitly in the Escalations section
- [ ] Do not list blocked items without naming a specific owner and a concrete unblocking action — "waiting on X" is not a blocker entry, it is a placeholder
- [ ] Do not write a report that exceeds two printed pages — length signals the author has not done the editorial work of deciding what matters to stakeholders
@@ -0,0 +1,377 @@
---
name: feature-flag-guide
description: "Write a feature flag management guide and lifecycle playbook for a service or team — covering flag taxonomy, creation checklist, rollout strategy, monitoring requirements, cleanup policy, and governance. Use when asked to document feature flag practices, create a flag rollout plan, write a feature flag policy, or guide a team on flag lifecycle management. Produces a flag lifecycle playbook, taxonomy reference, per-flag creation template, rollout decision tree, and cleanup checklist."
---
# Feature Flag Guide Skill
Produce a complete feature flag management guide for a service or team — covering how flags are named and categorised, how to create and roll out a flag safely, what to monitor during rollout, when and how to clean up flags, and who is responsible for each stage. Feature flags without discipline become permanent technical debt. This guide gives the team a repeatable process so flags are created intentionally, rolled out safely, and removed when done.
## Required Inputs
Ask for these if not already provided:
- **Service or team name** — scope of the guide
- **Feature flag platform** — LaunchDarkly, Split, Unleash, Flagsmith, Flipt, or a custom/in-house solution
- **Flag being documented** (if writing a per-flag guide) or "general guide" (if writing team-wide policy)
- **Rollout constraints** — any compliance, data privacy, or contractual constraints on who can see a feature (e.g. HIPAA, EU-only, enterprise customers only)
## Output Format
---
# Feature Flag Management Guide: [Service / Team Name]
**Team:** [Team name] | **Platform:** [LaunchDarkly / Split / Unleash / Custom]
**Document owner:** [Name] | **Last updated:** [Date]
**Review cycle:** Quarterly, and whenever the flag platform changes
---
## 1. Flag Taxonomy
Every flag belongs to exactly one category. The category determines default behaviour, who can enable it in production, and when it must be cleaned up.
| Type | Purpose | Default state | Production gate | Max lifetime |
|---|---|---|---|---|
| **Release flag** | Controls rollout of a new feature — decouples deploy from release | Off | Tech lead approval | 90 days from feature launch |
| **Experiment flag** | A/B or multivariate test — measures impact of a change | Off (control group) | Product + tech lead | Duration of experiment + 30 days |
| **Ops flag** | Operational control — circuit breaker, kill switch, throttle | On (normal behaviour) | On-call engineer can toggle | Indefinite (review annually) |
| **Permission flag** | Gates access by user segment, tier, or region | Off (restricted) | Product + Account owner | Indefinite (review annually) |
**When in doubt:** If the flag is temporary (tied to a specific feature launch), it is a Release flag. If it will exist forever as a control knob, it is an Ops flag.
---
## 2. Flag Naming Convention
All flags must follow this naming scheme:
```
[type]-[service]-[feature-description]
```
| Segment | Values | Example |
|---|---|---|
| type | `release`, `exp`, `ops`, `perm` | `release` |
| service | Short service identifier, lowercase, hyphenated | `payments` |
| feature-description | Kebab-case description, max 5 words | `new-checkout-flow` |
**Full examples:**
- `release-payments-new-checkout-flow` — release flag for a new checkout feature in the payments service
- `exp-search-personalized-ranking` — experiment on personalized search ranking
- `ops-api-rate-limit-override` — operational flag to override API rate limits
- `perm-dashboard-beta-users-only` — permission flag gating dashboard for beta users
**Do not:**
- Use ticket numbers in flag names (`release-JIRA-1234` → not searchable or self-describing)
- Use dates in flag names (`release-dark-mode-jan-2024` → flags outlive their dates)
- Use vague names (`release-new-thing` → not useful when you have 50 flags)
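One way to enforce the naming scheme above is a small validator run in CI or a pre-commit hook. The sketch below is an assumption-laden illustration: `validate_flag_name` is a hypothetical helper, and because service identifiers may themselves contain hyphens, it takes the expected service name as an argument rather than trying to parse it out of the flag name.

```python
import re

FLAG_TYPES = ("release", "exp", "ops", "perm")
WORD = re.compile(r"[a-z0-9]+")      # one lowercase kebab-case word
YEAR = re.compile(r"(19|20)\d{2}")   # a bare four-digit year, e.g. 2024
MAX_DESCRIPTION_WORDS = 5


def validate_flag_name(name: str, service: str) -> list[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    problems: list[str] = []
    flag_type, _, rest = name.partition("-")
    if flag_type not in FLAG_TYPES:
        problems.append(f"unknown type prefix: {flag_type!r}")
    prefix = f"{service}-"
    if not rest.startswith(prefix):
        problems.append(f"service segment should be {service!r}")
        return problems  # cannot check the description without the service
    words = rest[len(prefix):].split("-")
    if not all(WORD.fullmatch(word) for word in words):
        problems.append("description must be lowercase kebab-case")
    if len(words) > MAX_DESCRIPTION_WORDS:
        problems.append("description is longer than 5 words")
    if any(YEAR.fullmatch(word) for word in words):
        problems.append("description contains a year")
    return problems


print(validate_flag_name("release-payments-new-checkout-flow", "payments"))
print(validate_flag_name("release-web-dark-mode-jan-2024", "web"))
```

Vague names such as `release-new-thing` pass a syntactic check like this one, so that rule still needs a human reviewer.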
---
## 3. Flag Creation Checklist
Complete every item before creating a flag in the production environment.
**Before creating the flag:**
- [ ] Flag type determined from taxonomy (Section 1)
- [ ] Flag name follows naming convention (Section 2)
- [ ] Flag owner assigned — one named engineer responsible for cleanup
- [ ] Cleanup date set in the flag description field (for Release and Experiment flags)
- [ ] Rollout strategy defined — see Section 4
- [ ] Monitoring plan defined — see Section 5
- [ ] Code review approved with flag guard in place
**Flag description field (required):**
```
Type: [Release / Experiment / Ops / Permission]
Owner: [Name]
Linked ticket: [JIRA-XXXX or GitHub issue URL]
Purpose: [One sentence — what this flag controls]
Cleanup by: [Date — required for Release and Experiment flags; "Annual review" for Ops/Permission]
Rollout plan: [Link to this document or inline summary]
```
**Code requirements:**
```python
# Good — behaviour is clear when flag is off, and cleanup is obvious
# (handle_request is an illustrative wrapper so the returns are valid Python)
def handle_request(request, user_context):
    if flag_client.is_enabled("release-[service]-[feature]", user_context):
        return new_feature_handler(request)
    else:
        return existing_handler(request)

# Bad — nested flags, ternaries, and implicit defaults make cleanup error-prone
result = new_handler() if (f1 and not f2) or f3 else old_handler()
```
---
## 4. Rollout Strategy
### Decision Tree
Use this decision tree to pick the right rollout strategy for a Release or Experiment flag:
```
Is the change reversible without a deploy?
├── No → Use an Ops flag with manual enable, not a percentage rollout
└── Yes → Continue

Is there a user-level identifier available (user ID, session ID)?
├── No → Use server-side percentage (stateless, but inconsistent per user)
└── Yes → Use user-based percentage (consistent experience per user) ← preferred

Is the change risky (touches payments, auth, or data writes)?
├── Yes → Start at 1% → 5% → 25% → 50% → 100%, with 24-hour holds
└── No → Start at 10% → 50% → 100%, with 4-hour holds

Does the change affect specific customer tiers or geographies?
├── Yes → Use segment-based targeting, not percentage rollout
└── No → Use percentage rollout
```
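The "user-based percentage" branch of the tree relies on each user landing in the same bucket on every request. Flag platforms normally handle this internally; the sketch below only illustrates the idea for a custom or in-house solution, and the names `bucket_for` and `is_enabled` and the choice of SHA-256 are assumptions, not any platform's API.

```python
import hashlib

BUCKETS = 10_000  # basis points, so one bucket is 0.01% of users


def bucket_for(flag_key: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range [0, BUCKETS)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def is_enabled(flag_key: str, user_id: str, rollout_percent: float) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return bucket_for(flag_key, user_id) < round(rollout_percent * 100)


flag = "release-payments-new-checkout-flow"
print(is_enabled(flag, "user-42", 100))  # every user is inside a 100% rollout
```

Hashing the flag key together with the user ID keeps cohorts independent across flags, and because the bucket never changes, a user who is in the 5% cohort stays enabled when the rollout advances to 25%.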
### Rollout Stages
| Stage | Percentage | Hold duration | Pass criteria before advancing |
|---|---|---|---|
| Canary | 1% | 24 hours | Error rate within SLO, no P1 incidents |
| Early rollout | 5-10% | 24 hours | Error rate and latency match control group |
| Partial rollout | 25-50% | 24-48 hours | Business metrics not degraded vs. control |
| Majority | 75% | 24 hours | Final check — no regressions |
| Full rollout | 100% | 48 hours | Stable — schedule cleanup |
**Do not skip stages for Release flags on production.** Speed of rollout is not worth a production incident.
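The stage table can be expressed as a gate that refuses to advance until both the hold duration and the pass criteria are satisfied. This is a sketch under stated assumptions: `Stage`, `STAGES`, and `next_stage` are hypothetical names, the upper bound of each percentage range and the lower bound of each hold range are used, and the pass criteria are reduced to a single boolean the caller supplies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    percent: int
    min_hold_hours: int


# Values taken from the Rollout Stages table (range bounds are an assumption).
STAGES = [
    Stage("Canary", 1, 24),
    Stage("Early rollout", 10, 24),
    Stage("Partial rollout", 50, 24),
    Stage("Majority", 75, 24),
    Stage("Full rollout", 100, 48),
]


def next_stage(current: Stage, hours_held: float,
               pass_criteria_met: bool) -> Optional[Stage]:
    """Return the stage to advance to, or None if the rollout should hold."""
    if not pass_criteria_met or hours_held < current.min_hold_hours:
        return None
    index = STAGES.index(current)
    if index + 1 >= len(STAGES):
        return None  # already at full rollout; the next step is cleanup
    return STAGES[index + 1]


print(next_stage(STAGES[0], hours_held=26, pass_criteria_met=True))
```

Because the function only ever returns the immediately following stage, skipping a stage is not expressible, which mirrors the rule stated above.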
### Segment-Based Targeting
Use segment targeting when the rollout must be restricted:
```yaml
# LaunchDarkly segment example — adapt for your platform
targeting_rules:
  - clause:
      attribute: "subscription_tier"
      operator: "in"
      values: ["enterprise", "team"]
    serve: "on"
  - clause:
      attribute: "country"
      operator: "in"
      values: ["US", "CA", "GB"]
    serve: "on"
default: "off"
```
---
## 5. Monitoring Requirements
Every flag that is not at 0% or 100% rollout requires active monitoring. Do not roll out a flag and walk away.
### Required Metrics Per Flag
| Metric | What to compare | Alert threshold |
|---|---|---|
| Error rate | Flag-on cohort vs. flag-off cohort | >2× baseline error rate in flag-on group |
| p99 latency | Flag-on vs. flag-off | >20% higher latency in flag-on group |
| [Primary business metric] | Flag-on vs. flag-off | >5% degradation in flag-on group |
| [Conversion / completion rate] | Flag-on vs. flag-off | >2% drop in flag-on group |
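The first two alert thresholds in the table can be checked mechanically once both cohorts are measured. The sketch below assumes "baseline" means the flag-off cohort and uses hypothetical names (`CohortMetrics`, `breached_thresholds`); the business-metric rows are left out because their definitions are service-specific.

```python
from dataclasses import dataclass


@dataclass
class CohortMetrics:
    error_rate: float       # fraction of requests that failed, e.g. 0.004
    p99_latency_ms: float   # 99th percentile latency in milliseconds


def breached_thresholds(flag_on: CohortMetrics,
                        flag_off: CohortMetrics) -> list[str]:
    """Compare the flag-on cohort against flag-off and list breached alerts."""
    alerts: list[str] = []
    if flag_on.error_rate > 2 * flag_off.error_rate:
        alerts.append("error rate above 2x the flag-off cohort")
    if flag_on.p99_latency_ms > 1.20 * flag_off.p99_latency_ms:
        alerts.append("p99 latency more than 20% above the flag-off cohort")
    return alerts


control = CohortMetrics(error_rate=0.004, p99_latency_ms=240)
treatment = CohortMetrics(error_rate=0.010, p99_latency_ms=300)
print(breached_thresholds(treatment, control))
```

Note that with a flag-off error rate of exactly zero, any error in the flag-on cohort trips the first alert; small cohorts may need a minimum sample size before the comparison is meaningful.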
**Setting up split metric monitoring in [LaunchDarkly / Split / Datadog]:**
```
1. Navigate to the flag → Metrics tab
2. Add metric: [primary business metric]
3. Add metric: error_rate (service-level)
4. Add metric: p99_latency (endpoint-level)
5. Set alert: notify [flag owner] in Slack #[team-channel] if metric degrades by [threshold]
6. Set experiment duration: [N days] if this is an Experiment flag
```
### Guardrail Metrics
These metrics must never degrade, regardless of what the primary metric shows. If a guardrail is breached, roll back immediately — do not wait for investigation.
- Error rate exceeds SLO threshold ([X]%)
- p99 latency exceeds SLO threshold ([Y] ms)
- [Service-specific guardrail — e.g. payment failure rate, auth failure rate]
**Immediate rollback command if guardrail is breached:**
```bash
# [LaunchDarkly CLI]
ld-cli flag update [project-key] [flag-key] --default-variation off
# [Split CLI]
split-cli update-treatment [flag-name] --treatment "off" --percentage 100
# [Unleash CLI / API]
curl -X POST https://[unleash-host]/api/admin/features/[flag-name]/disable \
  -H "Authorization: [admin-token]"
# [Custom — adapt to your implementation]
[command or dashboard step]
```
---
## 6. Per-Flag Creation Template
Copy this template into your flag's description field and the linked ticket when creating a new flag:
```markdown
## Flag: [flag-name]
**Type:** [Release / Experiment / Ops / Permission]
**Owner:** [Name] ([Slack handle])
**Created:** [Date]
**Cleanup by:** [Date]
**Linked ticket:** [URL]
### Purpose
[One paragraph: what this flag controls, why it exists, what "on" and "off" mean]
### Rollout Plan
| Stage | Target | Date | Approved by |
|---|---|---|---|
| Canary | 1% | [Date] | [Name] |
| Early | 10% | [Date] | [Name] |
| Partial | 50% | [Date] | [Name] |
| Full | 100% | [Date] | [Name] |
### Monitoring
- Primary metric: [metric name and dashboard link]
- Guardrail metrics: error rate < [X]%, p99 < [Y] ms
- Alert channel: #[team-channel]
### Rollback Procedure
[Exact steps to turn the flag off in an emergency — should take < 2 minutes]
### Cleanup Checklist
- [ ] Flag at 100% for 48+ hours with no incidents
- [ ] Code path for flag-off branch removed from codebase
- [ ] Flag deleted from [platform]
- [ ] Ticket closed
```
---
## 7. Emergency Kill-Switch Procedure
When a flag needs to be disabled immediately due to a production incident:
**Time target: flag disabled within 2 minutes of decision.**
```
1. Go to [platform URL] — bookmark this: [URL]
2. Search for the flag by name: [flag-name]
3. Set to 0% / "off" for ALL users
4. Verify the service error rate drops within 60 seconds
5. Post to #incidents:
   "🟡 Feature flag [flag-name] disabled — rolling back [feature description].
   Owner: [name]. Error rate before: [X]%. Monitoring for recovery."
6. Page the flag owner if not already aware
```
**For ops flags (kill switches that must turn OFF normally-on behaviour):**
```bash
# These flags are "on" by default and turned "off" to disable a feature
# Confirm the flag polarity before toggling — "off" may mean "disabled" or "enabled" depending on naming
# Flag [flag-name]: OFF = [feature behaviour when off]
[kill switch command for your platform]
```
---
## 8. Stale Flag Policy and Cleanup
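Stale flags are flags that have outlived their purpose: they are past their cleanup date, or they have sat fully rolled out (or fully off) long after the 48-hour stability window that makes them eligible for cleanup. Stale flags are technical debt.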
### Stale Flag Definition
A flag is stale if ANY of the following are true:
- It is a Release flag past its cleanup date
- It has been at 100% (or 0%) rollout for more than 30 days
- Its linked ticket is closed and code cleanup has not happened
- Its owner has left the team
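The four conditions above translate directly into a predicate. The sketch below applies them literally; `FlagRecord` and its fields are a hypothetical inventory shape, not a schema from any flag platform, so the data would need to be assembled from the platform, the ticket tracker, and the team roster.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class FlagRecord:
    key: str
    flag_type: str                  # "release", "exp", "ops", or "perm"
    cleanup_by: Optional[date]      # None for flags on annual review
    rollout_percent: int
    rollout_unchanged_since: date
    ticket_closed: bool
    code_cleanup_done: bool
    owner_on_team: bool


def stale_reasons(flag: FlagRecord, today: date) -> list[str]:
    """Return every stale condition the flag meets; empty means not stale."""
    reasons: list[str] = []
    if flag.flag_type == "release" and flag.cleanup_by and today > flag.cleanup_by:
        reasons.append("release flag past its cleanup date")
    unchanged_for = today - flag.rollout_unchanged_since
    if flag.rollout_percent in (0, 100) and unchanged_for > timedelta(days=30):
        reasons.append("at 0% or 100% for more than 30 days")
    if flag.ticket_closed and not flag.code_cleanup_done:
        reasons.append("ticket closed but code cleanup not done")
    if not flag.owner_on_team:
        reasons.append("owner has left the team")
    return reasons


flag = FlagRecord("release-payments-new-checkout-flow", "release",
                  date(2025, 5, 1), 100, date(2025, 5, 15), False, False, True)
print(stale_reasons(flag, today=date(2025, 5, 20)))
```

Returning the reasons rather than a bare boolean gives the weekly review something concrete to paste into the reminder message.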
### Cleanup Checklist
```
[ ] Flag is at 100% rollout and has been stable for 48+ hours
[ ] Monitoring shows no issues for the flag-on cohort
[ ] Code changes:
    [ ] Remove the flag check from application code
    [ ] Remove the "off" code path entirely — do not leave dead code
    [ ] Remove any flag-related tests that test the off behaviour
    [ ] Update any documentation that references the flag
[ ] PR merged and deployed to production
[ ] Flag deleted from [platform] (do not just disable — delete)
[ ] Cleanup ticket closed
[ ] Flag owner confirms cleanup in Slack: "Flag [name] has been cleaned up — [commit link]"
```
**Automated stale flag detection:**
```bash
# Run weekly. This query lists flags created more than 30 days ago (creationDate is epoch milliseconds);
# treat the result as candidates and confirm cleanup date and rollout state before acting.
# [Platform-specific query — adapt:]
# LaunchDarkly API
curl -s "https://app.launchdarkly.com/api/v2/flags/[project-key]" \
  -H "Authorization: [api-key]" | \
  jq '.items[] | select(.creationDate < (now - 2592000) * 1000) | {key: .key, created: .creationDate}'
# Notify #engineering-housekeeping with list of stale flags
```
### Stale Flag Escalation
| Age past cleanup date | Action |
|---|---|
| 0-14 days | Slack reminder to flag owner |
| 14-30 days | Slack reminder to flag owner + tech lead |
| 30+ days | Tech lead assigns cleanup, creates ticket with P2 priority |
| 60+ days | Engineering manager reviews — flag may be force-deleted |
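The escalation ladder can be automated as a lookup on days past the cleanup date. In this sketch `escalation_action` is a hypothetical helper, and because adjacent rows in the table share their boundary days, it assumes a boundary day belongs to the later, stricter row.

```python
def escalation_action(days_past_cleanup: int) -> str:
    """Map days past the cleanup date to the action from the escalation table."""
    if days_past_cleanup < 0:
        return "No action: cleanup date not yet reached"
    if days_past_cleanup < 14:
        return "Slack reminder to flag owner"
    if days_past_cleanup < 30:
        return "Slack reminder to flag owner + tech lead"
    if days_past_cleanup < 60:
        return "Tech lead assigns cleanup, creates ticket with P2 priority"
    return "Engineering manager reviews; flag may be force-deleted"


for days in (3, 20, 45, 90):
    print(days, "->", escalation_action(days))
```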
---
## 9. Governance
### Who Can Do What
| Action | Who | Approval required |
|---|---|---|
| Create a flag (any environment) | Any engineer | None — but must complete creation checklist |
| Enable a flag in development | Any engineer | None |
| Enable a flag in staging | Any engineer | None |
| Enable a flag in production (0-10%) | Flag owner | Tech lead awareness |
| Advance rollout in production (10-100%) | Flag owner | Tech lead sign-off per stage |
| Enable an Ops flag in production | On-call engineer | None — these are break-glass controls |
| Delete a flag | Flag owner | Tech lead confirmation that code cleanup is done |
| Create a Permission flag | Flag owner | Product manager approval |
### Audit Logging
All flag changes in production must be traceable. Ensure the following are configured in [platform]:
- **Change log:** Every production flag change logs: who changed it, what they changed, and when.
- **Slack notifications:** Production flag changes post to `#[team]-flag-changes` automatically.
- **Quarterly review:** Every quarter, the tech lead reviews the full flag inventory, confirms owners are current, and removes flags with no owner.
---
## Quality Checks
- [ ] Every flag has an owner named in its description — no orphan flags
- [ ] Release and Experiment flags have a cleanup date set — not open-ended
- [ ] Monitoring is configured for every flag currently between 1-99% rollout
- [ ] The emergency kill-switch procedure has been tested — on-call engineers have bookmarked the platform URL and know the steps
- [ ] Stale flag detection runs automatically and results are reviewed weekly
- [ ] Code review checklist includes: "Does this PR introduce a flag? If yes, is the creation checklist complete?"
- [ ] At least one person other than the flag owner knows how to disable any given flag in an emergency
## Anti-Patterns
- [ ] Do not create release flags without a cleanup date — flags without expiry dates become permanent technical debt that accumulates silently until the codebase is unmaintainable
- [ ] Do not skip monitoring setup for flags between 1-99% rollout — a partially-rolled-out flag without metric comparison is a risk without a sensor
- [ ] Do not nest flags inside other flags — compound flag logic makes cleanup nearly impossible and creates untestable code paths
- [ ] Do not allow flag owners to leave the team without reassigning ownership — orphan flags with no owner never get cleaned up
- [ ] Do not use feature flags as a permanent configuration system — flags that have been at 100% or 0% for more than 30 days must be cleaned up; using flags as permanent config couples business logic to a feature flag platform
@@ -5,7 +5,7 @@ description: "Write a structured incident postmortem or post-incident review. Us
# Incident Postmortem Skill
This skill produces a complete, blameless incident postmortem document following industry-standard format. Output is ready to share with engineering teams, leadership, and affected stakeholders.
This skill produces a complete, blameless incident postmortem document following industry-standard format. Output enforces blameless framing throughout — system gaps over individual failures — and drives toward specific, closeable action items rather than vague process commitments.
## Required Inputs
@@ -20,8 +20,10 @@ Ask the user for these if not provided:
- **How it was resolved**
- **Initial thoughts on root cause**
- **Action items already identified** (optional)
- **Responders** (who was on-call or responded — names or roles; used for the timeline, not for blame)
- **Customer or external communications sent** (optional — any status page updates, emails, or support messages with timestamps)
## Output Structure
## Output Format
---
@@ -131,13 +133,22 @@ Rules for action items:
- [ ] Timeline has no blame-focused language
- [ ] Root cause is specific (not "human error")
- [ ] Root cause answers "why did this happen?" not just "what happened?" — it names a system or process gap, not a symptom
- [ ] Contributing factors explain the systemic gaps
- [ ] Every action item has an owner and due date
- [ ] "What went well" section is genuine, not token
- [ ] No action item contains vague language like "improve monitoring", "increase resilience", or "better testing" — each must name a specific change
- [ ] Executive summary is readable by non-technical leadership
## Example Trigger Phrases
## Anti-Patterns
- [ ] Do not assign blame to individuals — postmortems must focus on system and process failures
- [ ] Do not write action items with vague language like "improve monitoring" — each must name a specific, ownable change
- [ ] Do not skip the contributing factors — root cause alone misses the systemic issues that enable incidents
- [ ] Do not omit the detection timeline — how long it took to detect matters as much as how long it took to resolve
- [ ] Do not treat the postmortem as closed until all action items have named owners and due dates
## Usage Examples
- "Write a postmortem for the [incident name] outage"
- "Help me write a P1 incident report"
- "Generate an RCA document for [service] going down on [date]"
@@ -0,0 +1,300 @@
---
name: infra-as-code-review
description: "Write an infrastructure-as-code review checklist and conduct a structured review of Terraform, CloudFormation, Pulumi, or Ansible code. Use when asked to review IaC code, audit infrastructure configurations, check cloud security posture, or produce a reusable IaC review checklist. Produces a structured review report with severity-categorized findings, remediation guidance, and a reusable checklist."
---
# Infrastructure-as-Code Review
Produce a structured infrastructure-as-code review that applies security, reliability, and operational quality standards to a specific body of IaC code. The output serves two purposes: an actionable review report for the code at hand (with findings by severity and specific remediation steps), and a reusable checklist the team can apply to every future IaC change. If the user provides actual code, analyze it and populate the findings table with real issues. If no code is provided, produce the checklist and a template findings report.
## Required Inputs
Ask for these if not already provided:
- **IaC tool** — Terraform, CloudFormation, Pulumi, Ansible, or CDK
- **Cloud provider** — AWS, GCP, Azure, or multi-cloud
- **What the code provisions** — a brief description (e.g., "VPC, EKS cluster, and RDS instance for the payments service")
- **Security policies or naming standards in use** — any existing org standards to check against; if none, use sensible defaults
- **The IaC code itself** — paste or describe it; if not provided, produce the checklist template only and note findings require code
## Output Format
---
# IaC Review Report: [What Is Being Provisioned]
**Reviewer:** [Name / Claude]
**IaC Tool:** [Terraform / CloudFormation / Pulumi / Ansible / CDK]
**Cloud Provider:** [AWS / GCP / Azure]
**Code Location:** [Repo path or PR link]
**Review Date:** [Date]
**Overall Risk:** [Critical / High / Medium / Low]
---
## Executive Summary
| Severity | Finding Count | Resolved in This Review | Carry-Over Risk |
|----------|---------------|------------------------|-----------------|
| Critical | [n] | [n] | [Yes/No — explain] |
| High | [n] | [n] | [Yes/No — explain] |
| Medium | [n] | [n] | [Yes/No — explain] |
| Low | [n] | [n] | [Yes/No — explain] |
| **Total** | **[n]** | **[n]** | |
**Recommendation:** [Approve / Approve with Required Changes / Block — one sentence rationale]
---
## Findings
### Critical Findings
#### CRIT-01: [Finding Title]
| Field | Detail |
|-------|--------|
| **Severity** | Critical |
| **Category** | [IAM / Secrets / Encryption / Network / State / Naming / Cost] |
| **Resource** | `[resource_type.resource_name]` |
| **File / Line** | `[path/to/file.tf:42]` |
| **Risk** | [What can go wrong — be specific about the attack vector or failure mode] |
**Current code:**
```hcl
# [paste the problematic snippet]
resource "aws_s3_bucket" "data" {
  bucket = "my-bucket"
  acl    = "public-read" # PROBLEM: public read access
}
```
**Remediation:**
```hcl
resource "aws_s3_bucket" "data" {
  bucket = "my-bucket"
}

resource "aws_s3_bucket_public_access_block" "data" {
  bucket                  = aws_s3_bucket.data.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```
**Why this matters:** [One sentence linking the specific risk to business impact — data exposure, compliance violation, etc.]
---
#### CRIT-02: [Next Critical Finding — repeat structure]
---
### High Findings
#### HIGH-01: [Finding Title]
| Field | Detail |
|-------|--------|
| **Severity** | High |
| **Category** | [Category] |
| **Resource** | `[resource_type.resource_name]` |
| **File / Line** | `[path/to/file.tf:line]` |
| **Risk** | [Specific risk description] |
**Current code:**
```hcl
# [problematic snippet]
```
**Remediation:**
```hcl
# [fixed snippet]
```
---
### Medium Findings
#### MED-01: [Finding Title]
| Field | Detail |
|-------|--------|
| **Severity** | Medium |
| **Category** | [Category] |
| **Resource** | `[resource_type.resource_name]` |
| **File / Line** | `[path/to/file.tf:line]` |
| **Risk** | [Specific risk description] |
**Remediation:** [Prose or code snippet — choose whichever is clearer for this finding]
---
### Low Findings
#### LOW-01: [Finding Title]
| Field | Detail |
|-------|--------|
| **Severity** | Low |
| **Category** | [Category] |
| **Resource** | `[resource_type.resource_name]` |
| **File / Line** | `[path/to/file.tf:line]` |
| **Suggestion** | [What to improve and why] |
---
## Reusable IaC Review Checklist
Use this checklist on every IaC pull request. Check every item; mark N/A only when the item genuinely does not apply to the resources being provisioned.
### 1. IAM and Access Control
- [ ] No wildcard actions (`"*"`) in IAM policies — policies follow least-privilege
- [ ] No wildcard resource (`"*"`) in IAM policies unless explicitly justified with a comment
- [ ] IAM roles use condition keys to restrict scope (e.g., `aws:RequestedRegion`, `sts:ExternalId`)
- [ ] No IAM access keys or credentials hardcoded or in plaintext variables
- [ ] EC2 / compute instances use instance profiles, not hardcoded credentials
- [ ] S3 bucket policies do not allow public access unless the bucket is explicitly a public asset bucket
- [ ] Cross-account trust policies name specific account IDs, not `"*"`
- [ ] Service accounts (GCP) / managed identities (Azure) follow naming conventions and have documented purpose
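The first two checklist items lend themselves to an automated pre-check on AWS IAM policy JSON. The sketch below is illustrative only: `find_wildcards` is a hypothetical helper, it looks only for a bare `"*"` in `Allow` statements, and it does not catch service-wide wildcards such as `s3:*`, which still need reviewer judgement.

```python
import json


def _as_list(value):
    """IAM allows a single value or a list; normalise both to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcards(policy_json: str) -> list[str]:
    """List Allow statements whose Action or Resource is a bare '*'."""
    policy = json.loads(policy_json)
    findings: list[str] = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        if "*" in _as_list(statement.get("Action")):
            findings.append(f"Statement[{index}]: wildcard Action")
        if "*" in _as_list(statement.get("Resource")):
            findings.append(f"Statement[{index}]: wildcard Resource")
    return findings


admin_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
})
print(find_wildcards(admin_policy))
```

A finding here is a prompt for the reviewer, not an automatic rejection, since the checklist allows a wildcard resource when it is justified in a comment.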
### 2. Secrets Management
- [ ] No secrets, passwords, tokens, or API keys in plaintext in any `.tf`, `.yaml`, or `.json` file
- [ ] No secrets in variable default values
- [ ] Secrets sourced from Secrets Manager / Parameter Store / Vault — not from environment variables passed at plan time
- [ ] `sensitive = true` is set on all output values and variables that contain secrets (Terraform)
- [ ] State backend is encrypted — no unencrypted state files contain sensitive data
- [ ] `.gitignore` or equivalent excludes `*.tfvars`, `terraform.tfstate`, and any file that may contain resolved secrets
### 3. Encryption at Rest
- [ ] Storage resources (S3, EBS, RDS, DynamoDB, GCS, Azure Blob) have encryption at rest enabled
- [ ] Customer-managed keys (CMK/KMS) are used where required by policy — not solely AWS/GCP/Azure managed keys
- [ ] KMS key rotation is enabled for all CMKs
- [ ] Database snapshots have encryption enabled
- [ ] Encryption is not disabled via `encrypted = false` or equivalent
### 4. Encryption in Transit
- [ ] Load balancers terminate TLS — HTTP-only listeners redirect to HTTPS or are absent
- [ ] Minimum TLS version is 1.2; TLS 1.0 and 1.1 are explicitly disabled
- [ ] RDS / database connections require SSL (`require_ssl = true` or equivalent parameter)
- [ ] Internal service-to-service calls use TLS where the network is not fully private
- [ ] S3 bucket policies include a `Deny` on non-TLS requests (`aws:SecureTransport: false`)
### 5. Network and Public Access
- [ ] Security groups / firewall rules do not permit `0.0.0.0/0` ingress except on ports 80/443 for public-facing services
- [ ] SSH (port 22) and RDP (port 3389) are not open to `0.0.0.0/0`
- [ ] Databases are in private subnets — not directly internet-routable
- [ ] `publicly_accessible = false` on RDS instances unless explicitly required and documented
- [ ] VPC has flow logs enabled
- [ ] Network ACLs and security groups are layered (defense in depth)
- [ ] S3 bucket public access block is enabled at the account and bucket level
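The two ingress items at the top of this list can be pre-screened before a human review. The sketch below assumes rules have already been extracted into plain dictionaries with `cidr_blocks`, `from_port`, and `to_port` keys (for example from a parsed plan); that shape and the name `public_ingress_findings` are assumptions, not a provider schema.

```python
PUBLIC_CIDR = "0.0.0.0/0"
ALLOWED_PUBLIC_PORTS = {80, 443}


def public_ingress_findings(rules: list[dict]) -> list[str]:
    """Flag ingress rules open to the internet on anything but ports 80/443."""
    findings: list[str] = []
    for rule in rules:
        if PUBLIC_CIDR not in rule.get("cidr_blocks", []):
            continue
        from_port, to_port = rule["from_port"], rule["to_port"]
        # Only a single-port rule on an allowed port is acceptable;
        # any range is flagged even if it happens to include 80 or 443.
        single_allowed_port = from_port == to_port and from_port in ALLOWED_PUBLIC_PORTS
        if not single_allowed_port:
            findings.append(f"ports {from_port}-{to_port} open to {PUBLIC_CIDR}")
    return findings


rules = [
    {"cidr_blocks": ["0.0.0.0/0"], "from_port": 443, "to_port": 443},
    {"cidr_blocks": ["0.0.0.0/0"], "from_port": 22, "to_port": 22},
    {"cidr_blocks": ["10.0.0.0/8"], "from_port": 5432, "to_port": 5432},
]
print(public_ingress_findings(rules))
```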
### 6. Logging, Monitoring, and Audit
- [ ] CloudTrail / Cloud Audit Logs / Azure Monitor is enabled across all regions
- [ ] S3 access logging is enabled on buckets containing sensitive or regulated data
- [ ] RDS enhanced monitoring or equivalent is enabled
- [ ] CloudWatch alarms or equivalent are defined for critical metrics (CPU, disk, error rate)
- [ ] Log retention periods are defined — logs are neither retained indefinitely nor deleted within 7 days
### 7. Naming and Tagging Standards
- [ ] All resources follow the team's naming convention: `[env]-[team]-[resource-type]-[identifier]`
- [ ] Required tags are present on all taggable resources:
  - [ ] `Environment` (e.g., prod / staging / dev)
  - [ ] `Team` or `Owner`
  - [ ] `Service` or `Application`
  - [ ] `CostCenter` (if required by finance policy)
  - [ ] `ManagedBy: terraform` (or equivalent IaC tool tag)
- [ ] No resources with default names (e.g., `default-vpc`, `launch-wizard-1`)
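The naming and tagging checks above lend themselves to automation. The following is a minimal sketch, assuming lowercase names, a fixed set of environment prefixes, and resources already extracted as a name plus a tag dictionary; the regular expression, `TAG_GROUPS`, and `check_resource` are illustrative names, not part of any Terraform tooling.

```python
import re

# Assumed convention: [env]-[team]-[resource-type]-[identifier], all lowercase.
# The allowed env values here are hypothetical; substitute your team's list.
NAME_PATTERN = re.compile(
    r"^(prod|staging|dev)-[a-z0-9]+-[a-z0-9]+-[a-z0-9-]+$"
)

# Each group is satisfied when at least one of its tags is present.
TAG_GROUPS = [
    {"Environment"},
    {"Team", "Owner"},
    {"Service", "Application"},
    {"ManagedBy"},
]


def check_resource(name: str, tags: dict[str, str]) -> list[str]:
    """Return a list of violations; an empty list means the resource complies."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append(f"name '{name}' does not follow the naming convention")
    for group in TAG_GROUPS:
        if group.isdisjoint(tags):
            problems.append("missing tag: " + " or ".join(sorted(group)))
    return problems
```

`CostCenter` is left out of `TAG_GROUPS` because the checklist makes it conditional on finance policy; add it as another group if it applies.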
### 8. State Management and Backend
- [ ] Remote state backend is configured — no local state in repository
- [ ] State backend uses locking (DynamoDB for S3 backend, etc.)
- [ ] State backend bucket/storage has versioning enabled
- [ ] State backend bucket/storage has access logging enabled
- [ ] Workspaces or separate state files are used per environment — no shared state between prod and non-prod
- [ ] `terraform.tfstate` and `*.tfstate.backup` are in `.gitignore`
### 9. Module and Resource Structure
- [ ] Modules are versioned with explicit version pins — no floating `source = "git::...?ref=main"`
- [ ] Provider versions are pinned in `required_providers` — no unconstrained `>= x.y`
- [ ] Terraform version is pinned in `required_version`
- [ ] Modules have a clear single responsibility — not one module that provisions everything
- [ ] No copy-paste duplication — repeated patterns use modules or loops (`for_each`, `count`)
- [ ] Outputs expose only what downstream consumers need — no unnecessary output sprawl
### 10. Environment Parity
- [ ] Prod and non-prod environments use the same module code, parameterized by environment variable
- [ ] Instance sizes and replica counts differ by environment via variables — not by separate code branches
- [ ] Non-prod does not have security controls disabled "to save money" (encryption off, logging off)
### 11. Cost Impact
- [ ] Large instance types (e.g., `r5.16xlarge`) or storage allocations are justified in a comment
- [ ] Data transfer costs are considered for cross-region or cross-AZ architectures
- [ ] Reserved instance or committed use discount eligibility is noted for long-lived resources
- [ ] Auto-scaling is configured for variable workloads — no fixed oversized fleets for spiky traffic
- [ ] Lifecycle policies are set on S3 buckets storing time-bounded data (logs, backups)
### 12. Drift Risk
- [ ] No resources that are commonly mutated in the console are managed by IaC without import documentation
- [ ] `lifecycle { prevent_destroy = true }` is set on stateful resources in production (databases, state buckets)
- [ ] `ignore_changes` is used sparingly and each instance is documented with a rationale comment
- [ ] A plan is run against the live environment as part of the PR process — no unreviewed drift
---
## Findings Summary Table
| ID | Title | Severity | Category | File | Status |
|----|-------|----------|----------|------|--------|
| CRIT-01 | [Title] | Critical | [Category] | [file:line] | Open |
| HIGH-01 | [Title] | High | [Category] | [file:line] | Open |
| MED-01 | [Title] | Medium | [Category] | [file:line] | Open |
| LOW-01 | [Title] | Low | [Category] | [file:line] | Open |
---
## Required Actions Before Merge
List only Critical and High findings that must be resolved before this code is merged:
1. **CRIT-01 [Title]** — [One-line remediation instruction]
2. **HIGH-01 [Title]** — [One-line remediation instruction]
Medium and Low findings should be tracked as follow-up issues with a committed resolution date.
---
*Review conducted by [Reviewer] on [Date] — checklist version [1.0]*
---
## Quality Checks
- [ ] Every finding includes: severity, category, specific resource name, file and line number, current code, and fixed code
- [ ] Checklist covers all 12 categories: IAM, Secrets, Encryption at Rest, Encryption in Transit, Network, Logging, Naming/Tagging, State, Module Structure, Environment Parity, Cost, and Drift
- [ ] Executive summary table is filled with real counts — not all zeros or all placeholders
- [ ] "Required Actions Before Merge" section lists only Critical and High items
- [ ] Code snippets in findings show both the problematic code AND the corrected version
- [ ] Overall risk rating is justified by the highest-severity open finding
- [ ] Checklist items are binary (checkable) — not narrative observations
## Anti-Patterns
- [ ] Do not mark a finding as Low if it involves hardcoded credentials or secrets in any form — always Critical
- [ ] Do not review IaC in isolation from the deployment context — networking and IAM must be evaluated together
- [ ] Do not produce narrative findings without the specific resource name, file, and line number
- [ ] Do not skip the "Required Actions Before Merge" summary — reviewers need a clear blocking list, not just a full report
- [ ] Do not approve code where encryption at rest or in transit is missing on data stores, even if not explicitly flagged by the requester
@@ -0,0 +1,440 @@
---
name: load-testing-plan
description: "Write a load and performance testing plan for a service. Use when asked to create a performance test plan, write load testing documentation, define stress or soak test scenarios, or set performance regression gates for CI. Produces a complete test plan document with scenario definitions, k6/Locust script skeleton, threshold table, result interpretation guide, and CI integration steps."
---
# Load Testing Plan Skill
Produce a complete load and performance testing plan for a service — covering test objectives, scenario definitions, tooling configuration, success thresholds, and CI integration. A good load testing plan eliminates ambiguity about what "performance is acceptable" means, so engineers can run tests and get a pass/fail answer without having to interpret raw numbers themselves.
## Required Inputs
Ask for these if not already provided:
- **Service name and key endpoints** — which endpoints are under test (path, method, typical request/response shape)
- **Current traffic baseline** — current requests/sec, p50/p99 latency, error rate under normal load
- **Peak traffic expectations** — expected peak RPS (e.g. 10× baseline for flash sales, or seasonality peak)
- **SLO targets** — latency SLOs (p99 < X ms), error rate SLO (< Y%), availability target
- **Preferred testing tool** — k6, Locust, JMeter, Gatling, or no preference
- **Test environment availability** — dedicated load test environment, staging, or production (with traffic shaping)
## Output Format
---
# Load Testing Plan: [Service Name]
**Author:** [Name] | **Team:** [Team name]
**Date:** [Date] | **Review cycle:** Before each major release and quarterly
**Testing tool:** [k6 / Locust / JMeter / Gatling]
**Test environment:** [Environment name and URL]
---
## 1. Objectives and Scope
**What we are testing:** [Service name] handles [describe function — e.g. "user authentication requests from the mobile and web clients"]. This plan validates that the service meets its SLOs under expected and elevated traffic conditions.
**In scope:**
- [Endpoint 1: METHOD /path — description]
- [Endpoint 2: METHOD /path — description]
- [Endpoint 3: METHOD /path — description]
**Out of scope:**
- [Any endpoints explicitly excluded and why — e.g. "admin APIs — low traffic, excluded from load test"]
- [Third-party integrations that cannot be load-tested — mock them instead]
---
## 2. Performance Targets (Success Criteria)
Every scenario has explicit pass/fail thresholds. A test run FAILS if any threshold is breached.
| Metric | Baseline scenario | Stress scenario | Spike scenario | Soak scenario |
|---|---|---|---|---|
| p50 latency | < [X] ms | < [X × 1.5] ms | < [X × 2] ms | < [X] ms |
| p95 latency | < [Y] ms | < [Y × 1.5] ms | < [Y × 2] ms | < [Y] ms |
| p99 latency | < [Z] ms | < [Z × 2] ms | < [Z × 3] ms | < [Z] ms |
| Error rate | < [0.1]% | < [1]% | < [2]% | < [0.1]% |
| Throughput | ≥ [N] RPS | ≥ [N × 3] RPS | N/A | ≥ [N] RPS |
| Failed requests | 0 (5xx) | < [threshold] | < [threshold] | 0 (5xx) |
**SLO reference:** These thresholds are derived from the service SLOs — p99 < [Z ms], error rate < [0.1]%, availability [99.9]%.
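To keep the table internally consistent, the per-scenario limits can be computed from the baseline numbers rather than typed by hand. This is a sketch only: the multipliers are copied from the table above, while `Baseline` and `thresholds` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    p50_ms: float
    p95_ms: float
    p99_ms: float
    rps: float


# (p50, p95, p99) latency multipliers per scenario, taken from the targets table.
LATENCY_MULTIPLIERS = {
    "baseline": (1.0, 1.0, 1.0),
    "stress": (1.5, 1.5, 2.0),
    "spike": (2.0, 2.0, 3.0),
    "soak": (1.0, 1.0, 1.0),
}

# Minimum throughput multiplier; None mirrors the "N/A" cell for the spike scenario.
THROUGHPUT_MULTIPLIERS = {"baseline": 1.0, "stress": 3.0, "spike": None, "soak": 1.0}


def thresholds(base: Baseline, scenario: str) -> dict:
    """Return the latency ceilings and throughput floor for one scenario."""
    m50, m95, m99 = LATENCY_MULTIPLIERS[scenario]
    tput = THROUGHPUT_MULTIPLIERS[scenario]
    return {
        "p50_ms": base.p50_ms * m50,
        "p95_ms": base.p95_ms * m95,
        "p99_ms": base.p99_ms * m99,
        "min_rps": None if tput is None else base.rps * tput,
    }
```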
---
## 3. Traffic Model
**Baseline traffic (current production):**
- Average RPS: [N] req/sec
- Peak RPS (observed): [N] req/sec
- Request distribution by endpoint:
  - [Endpoint 1]: [X]% of traffic
  - [Endpoint 2]: [Y]% of traffic
  - [Endpoint 3]: [Z]% of traffic
**Simulated user behaviour:**
- Think time between requests: [X–Y] seconds (randomised)
- Session duration: [N] minutes average
- Authenticated vs anonymous ratio: [X]%/[Y]%
- Geographic distribution: [Region 1 X]%, [Region 2 Y]%
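The traffic model translates into two small sampling decisions per simulated request: which endpoint to call and how long to pause afterwards. A minimal sketch, assuming the endpoint mix is expressed as relative weights and think time is uniformly distributed; `pick_endpoint` and `think_time` are illustrative names.

```python
import random


def pick_endpoint(weights: dict[str, float], rng: random.Random) -> str:
    """Choose an endpoint with probability proportional to its traffic share."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def think_time(min_s: float, max_s: float, rng: random.Random) -> float:
    """Return a randomised pause, in seconds, between min_s and max_s."""
    return rng.uniform(min_s, max_s)
```

Passing an explicit `random.Random` instance lets a test run be replayed with the same seed when a result needs to be reproduced.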
---
## 4. Test Scenarios
### Scenario 1: Baseline (Steady-State)
**Purpose:** Confirm the service performs acceptably under normal production load.
**Duration:** 10 minutes
**Load profile:** Ramp to [N] RPS over 2 minutes, hold for 8 minutes.
**Concurrency:** [N] virtual users
**Pass criteria:** All thresholds in the Baseline column of the targets table above.
---
### Scenario 2: Stress Test
**Purpose:** Find the breaking point — how much load can the service handle before SLOs are breached?
**Duration:** 20–30 minutes
**Load profile:** Ramp from [N] RPS (baseline) to [N × 5] RPS in 5-minute steps. Hold each step for 5 minutes. Stop at first SLO breach.
**Concurrency:** Scales with RPS target
**What to record:**
- RPS at which p99 latency first exceeds SLO
- RPS at which error rate first exceeds SLO
- Whether the service recovers when load drops back to baseline
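The step profile and the "first breach" reading can be sketched as follows. This assumes each step adds one multiple of the baseline (1×, 2×, ... 5×), which is one interpretation of "5-minute steps"; `stress_steps` and `first_breach` are hypothetical helpers, and the observed p99 values would come from your load tool.

```python
def stress_steps(baseline_rps: float, max_multiplier: int = 5, hold_minutes: int = 5) -> list[dict]:
    """Build the stepped load profile: one step per baseline multiple."""
    return [
        {
            "start_minute": i * hold_minutes,
            "target_rps": baseline_rps * (i + 1),
            "hold_minutes": hold_minutes,
        }
        for i in range(max_multiplier)
    ]


def first_breach(steps: list[dict], observed_p99_ms: list[float], slo_p99_ms: float):
    """Return the target RPS of the first step whose p99 exceeds the SLO, or None."""
    for step, p99 in zip(steps, observed_p99_ms):
        if p99 > slo_p99_ms:
            return step["target_rps"]
    return None
```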
---
### Scenario 3: Spike Test
**Purpose:** Simulate a sudden traffic surge (flash sale, viral event, bot attack).
**Duration:** 15 minutes
**Load profile:** Hold at [N] RPS (baseline) for 3 minutes, spike to [N × 10] RPS instantly, hold for 5 minutes, drop back to baseline for 7 minutes.
**What to record:**
- Latency during spike and recovery
- Whether the service sheds load gracefully (rate limiting, queue depth)
- Time to recover to baseline latency after spike ends
---
### Scenario 4: Soak / Endurance Test
**Purpose:** Detect memory leaks, connection pool exhaustion, and slow degradation over time.
**Duration:** 4–8 hours (run overnight)
**Load profile:** Steady [N × 1.5] RPS (50% above baseline) for entire duration.
**What to watch:**
- Memory usage trend over time (should not grow unboundedly)
- Error rate trend (should be flat, not creeping up)
- GC pause frequency (JVM/Go services)
- Database connection pool utilisation
- p99 latency trend (should not creep up over hours)
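"Should not creep up" can be turned into a number by fitting a straight line through the samples and looking at the slope. The sketch below uses an ordinary least-squares fit over `(hours_since_start, value)` pairs; the function names and the growth limit are assumptions to tune per service, not an established rule.

```python
def slope_per_hour(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of value against time for (hours, value) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one point in time")
    return numerator / denominator


def is_creeping(samples: list[tuple[float, float]], max_growth_per_hour: float) -> bool:
    """True when the fitted trend grows faster than the allowed rate."""
    return slope_per_hour(samples) > max_growth_per_hour
```

The same check applies to memory in MB, p99 latency in ms, or error rate, as long as the limit is expressed in the matching unit per hour.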
---
## 5. Test Environment Requirements
### Infrastructure
| Component | Requirement | Notes |
|---|---|---|
| Service under test | Isolated from production | [N] replicas, matching prod resource limits |
| Database | Separate instance with production-scale data | See Data Seeding below |
| Cache (Redis/Memcached) | Empty at test start | Ensures cold-start conditions are tested |
| Load generator | Separate from service under test | [N] vCPUs, [N] GB RAM minimum |
| Network | Low-latency path to service | Do not run generator on same host |
### Data Seeding
Before every test run, ensure the environment has:
```bash
# Seed test users (needed for authenticated endpoint tests)
[seed command or script path — e.g. python scripts/seed_load_test_users.py --count 10000]
# Seed test data for read endpoints
[seed command — e.g. ./scripts/seed_products.sh --count 50000]
# Verify seed completed
[verification command — e.g. psql $DB_URL -c "SELECT COUNT(*) FROM users WHERE load_test=true"]
```
**Test data rules:**
- Never use real production user data in load tests
- Tag all test-generated records with `load_test=true` for easy cleanup
- Run cleanup after each test: `[cleanup command]`
---
## 6. Tooling Setup
### k6 Script Skeleton
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('error_rate');
const endpointLatency = new Trend('endpoint_latency', true);

// Test configuration: override per scenario
export const options = {
  scenarios: {
    baseline: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '2m', target: [BASELINE_VUS] },
        { duration: '8m', target: [BASELINE_VUS] },
        { duration: '1m', target: 0 },
      ],
    },
  },
  thresholds: {
    http_req_duration: [
      'p(95)<[Y_MS]',
      'p(99)<[Z_MS]',
    ],
    // 0.01 = 1%; tighten to match the error-rate target in Section 2
    error_rate: ['rate<0.01'],
    http_req_failed: ['rate<0.01'],
  },
};

// Auth helper: setup() runs once per test run, so every VU shares this token
export function setup() {
  const loginRes = http.post('[BASE_URL]/auth/login', JSON.stringify({
    username: `load_test_user_${Math.floor(Math.random() * 10000)}@example.com`,
    password: '[LOAD_TEST_PASSWORD]',
  }), { headers: { 'Content-Type': 'application/json' } });
  check(loginRes, { 'login ok': (r) => r.status === 200 });
  return { token: loginRes.json('access_token') };
}

export default function (data) {
  const headers = {
    Authorization: `Bearer ${data.token}`,
    'Content-Type': 'application/json',
  };

  // Endpoint 1: [Description]
  const res1 = http.get('[BASE_URL]/[endpoint-1]', { headers });
  check(res1, {
    '[endpoint-1] status 200': (r) => r.status === 200,
    '[endpoint-1] latency < [X]ms': (r) => r.timings.duration < [X],
  });
  errorRate.add(res1.status >= 400);
  endpointLatency.add(res1.timings.duration, { endpoint: '[endpoint-1]' });

  // Randomised think time between THINK_TIME_MIN and THINK_TIME_MAX seconds
  sleep(Math.random() * ([THINK_TIME_MAX] - [THINK_TIME_MIN]) + [THINK_TIME_MIN]);

  // Endpoint 2: [Description]
  const res2 = http.post('[BASE_URL]/[endpoint-2]',
    JSON.stringify({ [key]: '[value]' }),
    { headers }
  );
  check(res2, {
    '[endpoint-2] status 201': (r) => r.status === 201,
  });
  errorRate.add(res2.status >= 400);
}
```
### Locust Script Skeleton (alternative)
```python
from locust import HttpUser, task, between
import random


class [ServiceName]User(HttpUser):
    wait_time = between([THINK_TIME_MIN], [THINK_TIME_MAX])
    token = None

    def on_start(self):
        """Called once per simulated user — authenticate."""
        user_id = random.randint(1, 10000)
        response = self.client.post("/auth/login", json={
            "username": f"load_test_user_{user_id}@example.com",
            "password": "[LOAD_TEST_PASSWORD]",
        })
        self.token = response.json()["access_token"]
        self.headers = {"Authorization": f"Bearer {self.token}"}

    @task([WEIGHT_1])  # Weight = relative frequency
    def [endpoint_1_task](self):
        """[Endpoint 1 description]"""
        with self.client.get(
            "/[endpoint-1]",
            headers=self.headers,
            catch_response=True
        ) as response:
            if response.elapsed.total_seconds() > [LATENCY_THRESHOLD]:
                response.failure(f"Too slow: {response.elapsed.total_seconds()}s")

    @task([WEIGHT_2])
    def [endpoint_2_task](self):
        """[Endpoint 2 description]"""
        self.client.post(
            "/[endpoint-2]",
            json={"[key]": "[value]"},
            headers=self.headers,
        )
```
### Running Tests
```bash
# k6 — run baseline scenario
k6 run --env BASE_URL=https://[test-env-url] scripts/load_test.js
# k6 — run stress scenario with output to InfluxDB
k6 run --out influxdb=http://[influxdb-host]:8086/k6 \
--env SCENARIO=stress \
scripts/load_test.js
# Locust — headless run
locust -f locustfile.py \
--headless \
--users [N] \
--spawn-rate [N] \
--run-time 10m \
--host https://[test-env-url] \
--csv=results/[run-id]
# Locust — web UI (interactive)
locust -f locustfile.py --host https://[test-env-url]
```
---
## 7. Metrics to Capture
Capture all of the following during every test run. Missing any of these makes result comparison unreliable.
| Metric | Source | Why it matters |
|---|---|---|
| p50, p95, p99, p999 latency per endpoint | Load tool | SLO validation |
| Error rate (4xx, 5xx) per endpoint | Load tool | SLO validation |
| Requests/sec (throughput) | Load tool | Capacity baseline |
| CPU utilisation (%) | Infra monitoring | Saturation signal |
| Memory utilisation (%) | Infra monitoring | Leak detection |
| GC pause time / frequency | JVM/Go metrics | Latency spike root cause |
| DB connection pool: active/idle/waiting | DB metrics | Pool exhaustion detection |
| DB query latency (p99) | DB metrics | Downstream bottleneck |
| Cache hit rate | Cache metrics | Miss storm detection |
| Pod/instance count (if autoscaling) | Infra | Scaling behaviour |
| Network in/out bytes | Infra | Bandwidth saturation |
---
## 8. Result Analysis Framework
After each test run, work through this analysis in order:
**Step 1 — Pass/fail check**
Compare all captured metrics against the thresholds in Section 2. Record pass/fail per scenario.
**Step 2 — Latency distribution**
Plot the full latency histogram, not just percentiles. A bimodal distribution (two humps) indicates two distinct code paths — investigate the slow hump.
**Step 3 — Error correlation**
If errors occurred, correlate them with:
- Time of occurrence (was it during ramp-up, steady state, or spike?)
- Specific endpoint (is it one endpoint or all?)
- Infrastructure events (CPU spike, OOM, DB connection exhaustion?)
**Step 4 — Saturation analysis**
Graph CPU, memory, and connection pool over time. If any resource reached 80%+ of capacity, it is a candidate bottleneck — even if SLOs passed this run.
**Step 5 — Compare to baseline run**
Every run should be compared to the previous run. A 10% regression in p99 latency warrants investigation even if it is still within SLO.
**Regression classification:**
| Change | Classification | Action |
|---|---|---|
| p99 within 5% of previous run | Green — no regression | No action |
| p99 5–15% worse than previous | Yellow — watch | Investigate before next release |
| p99 >15% worse than previous | Red — regression | Block release, file ticket |
| Error rate increased vs previous | Red — regression | Block release |
| SLO threshold breached | Critical | Block release, page on-call |
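The classification table can be encoded as a single function so that every run is labelled the same way. This is a sketch under stated assumptions: the rows are evaluated from most to least severe, an improvement of any size counts as green, and exactly 5% or 15% falls into the milder band; `classify` and its parameter names are illustrative.

```python
def classify(
    prev_p99_ms: float,
    curr_p99_ms: float,
    prev_error_rate: float,
    curr_error_rate: float,
    slo_p99_ms: float,
    slo_error_rate: float,
) -> str:
    """Map a run-over-run comparison to critical / red / yellow / green."""
    # An SLO breach outranks every relative comparison.
    if curr_p99_ms > slo_p99_ms or curr_error_rate > slo_error_rate:
        return "critical"
    # Any increase in error rate is treated as a regression.
    if curr_error_rate > prev_error_rate:
        return "red"
    change = (curr_p99_ms - prev_p99_ms) / prev_p99_ms
    if change > 0.15:
        return "red"
    if change > 0.05:
        return "yellow"
    return "green"
```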
---
## 9. CI Integration
Add load tests as a gated step in the release pipeline. Run the baseline scenario on every release candidate; run all scenarios weekly.
```yaml
# Example: GitHub Actions job (adapt for your CI platform)
load-test:
  runs-on: ubuntu-latest
  needs: [deploy-staging]
  if: github.ref == 'refs/heads/main'
  steps:
    - uses: actions/checkout@v4
    - name: Install k6
      run: |
        curl -s https://dl.k6.io/key.gpg | sudo apt-key add -
        echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
        sudo apt-get update && sudo apt-get install -y k6
    - name: Seed test data
      run: [seed command]
    - name: Run baseline load test
      run: |
        k6 run \
          --env BASE_URL=${{ secrets.LOAD_TEST_ENV_URL }} \
          --out json=results.json \
          scripts/load_test.js
      env:
        LOAD_TEST_ENV_URL: ${{ secrets.LOAD_TEST_ENV_URL }}
    - name: Check thresholds
      run: |
        # k6 exits non-zero when any threshold fails, so the previous step has already failed the build
        echo "k6 threshold check complete"
    - name: Upload results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: load-test-results-${{ github.run_id }}
        path: results.json
    - name: Cleanup test data
      if: always()
      run: [cleanup command]
```
**CI gates summary:**
- Baseline scenario runs on every release to staging
- Full scenario suite (stress, spike, soak) runs weekly on a schedule
- Any threshold failure blocks promotion to production
- Results are archived for trend analysis
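For trend analysis, the archived `results.json` can be reduced to a few percentiles per run. The sketch below assumes the line-delimited format written by `k6 run --out json`, where each sample is an object with `"type": "Point"`, a `"metric"` name, and the value under `data.value`. It uses a nearest-rank percentile, so the numbers may differ slightly from the ones k6 prints in its own summary; both function names are illustrative.

```python
import json
import math


def durations_from_k6_json(lines) -> list[float]:
    """Collect http_req_duration samples (ms) from k6 JSON output lines."""
    values = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") == "Point" and record.get("metric") == "http_req_duration":
            values.append(float(record["data"]["value"]))
    return values


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

A typical use is to open the artifact, pass the file object to `durations_from_k6_json`, and store `percentile(values, 99)` alongside the run ID for the comparison in Section 8.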
---
## Quality Checks
- [ ] All key endpoints are covered by at least one test scenario — no production endpoint is untested
- [ ] Thresholds are derived from actual SLO targets, not guesses
- [ ] Test data seeding is scripted and reproducible — tests do not rely on pre-existing environment state
- [ ] The load generator runs on separate infrastructure from the service under test
- [ ] CI integration blocks promotion on threshold failure — not just records results
- [ ] Soak test has been run at least once to establish a memory and connection pool baseline
- [ ] Results comparison to previous run is part of the analysis — not just absolute pass/fail
## Anti-Patterns
- [ ] Do not set thresholds without grounding them in actual SLO targets or production baselines — arbitrary numbers produce meaningless pass/fail results
- [ ] Do not run the load generator on the same host as the service under test — this contaminates both the test results and the service metrics
- [ ] Do not use production user data in load test seeding — all test data must be synthetic, tagged, and cleaned up after each run
- [ ] Do not skip the soak test on first deployment — only a soak test reveals slow memory leaks and connection pool exhaustion that short tests miss
- [ ] Do not treat a passing baseline test as evidence the service handles spikes — baseline, stress, spike, and soak scenarios test fundamentally different failure modes
@@ -0,0 +1,492 @@
---
name: local-dev-setup
description: "Write a local development environment setup guide for a service or project — covering prerequisites, repository setup, environment variables, local service dependencies, database seeding, running the service, running tests, common gotchas, IDE recommendations, and first-contribution checklist. Use when asked to write a dev setup guide, create onboarding documentation for engineers, document local environment setup, or write a getting-started guide for a codebase. Produces a complete setup guide that a new engineer can follow from zero to running tests in under 30 minutes, with a troubleshooting section for the most common setup failures."
---
# Local Dev Setup Skill
Produce a complete local development environment setup guide for a service or project — walking a new engineer from zero (a clean laptop) to a working local environment with passing tests in under 30 minutes. A good setup guide reduces onboarding time, prevents the "it works on my machine" problem, and lets engineers make their first contribution with confidence. Write every step as a concrete command or action — not a description of what needs to happen.
## Required Inputs
Ask for these if not already provided:
- **Service name** and what it does
- **Tech stack** — language, framework, database, cache, message queue, and any external services
- **Dependencies** — databases, caches, message queues, and external services (mocked or real)
- **Test framework** — how tests are run and what the test suite covers
- **CI/CD platform** — GitHub Actions, CircleCI, Jenkins, etc. (for context on what "passing CI" means locally)
## Output Format
---
# Local Development Setup: [Service Name]
**Tech stack:** [Language + version] | [Framework] | [Database] | [Cache]
**Estimated setup time:** [20–30 minutes] on a clean machine
**Last verified:** [Date] on [macOS Ventura 13.x / Ubuntu 22.04]
**Questions?** Ask in [Slack: #[team-channel]] or ping [@tech-lead-handle]
> **First contribution?** Complete setup first (this doc), then read [CONTRIBUTING.md] for code standards and PR process.
---
## Prerequisites
Install these tools before starting. The versions listed are the minimum required — newer patch versions are fine, newer major versions may have compatibility issues.
### Required Tools
| Tool | Required version | Install |
|---|---|---|
| [Git] | 2.x+ | Pre-installed on most systems; or `brew install git` |
| [Language runtime — e.g. Go] | [1.22+] | [https://go.dev/dl/ or `brew install go`] |
| [Docker] | 24.x+ | [https://docs.docker.com/get-docker/] |
| [Docker Compose] | 2.x+ | Included with Docker Desktop; or `brew install docker-compose` |
| [Make] | Any | Pre-installed on macOS/Linux |
| [Tool — e.g. Node.js] | [20.x+] | [`brew install node` or https://nodejs.org] |
| [Tool — e.g. psql client] | [15+] | `brew install postgresql@15` (client only) |
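A quick way to confirm the required tools are installed is to look each one up on `PATH`. This is a minimal sketch: the tool list is a hypothetical example to replace with the table above, and it checks presence only, not minimum versions.

```python
import shutil

# Hypothetical list; mirror the Required Tools table for your service.
REQUIRED_TOOLS = ["git", "docker", "make"]


def missing_tools(tools, which=shutil.which) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found on PATH")
```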
### Optional but Recommended
| Tool | Purpose | Install |
|---|---|---|
| [direnv] | Auto-load `.envrc` environment variables | `brew install direnv` + [setup instructions](https://direnv.net) |
| [jq] | Pretty-print JSON in terminal | `brew install jq` |
| [k9s] | Kubernetes cluster UI (if using K8s locally) | `brew install k9s` |
| [mkcert] | Local HTTPS certificates | `brew install mkcert` |
### Required Accounts and Access
Before starting, make sure you have:
- [ ] GitHub access to [org/repo] — request via [access request process / Slack: #it-help]
- [ ] [AWS / GCP / Azure] account with [dev environment] access — request via [process]
- [ ] [Internal tool — e.g. 1Password] for retrieving development secrets — request via [process]
- [ ] [VPN access] if required to reach internal services — request via [process]
---
## 1. Repository Setup
```bash
# Clone the repository
git clone git@github.com:[org]/[repo-name].git
cd [repo-name]
# Install git hooks (required — enforces commit message format and runs pre-commit checks)
make install-hooks
# Or manually:
# cp scripts/hooks/pre-commit .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit
# Verify your git setup
git config user.name # should be your name
git config user.email # should be your work email
```
**If you see a permission denied error on clone:** Your SSH key is not added to GitHub. Follow [GitHub's SSH key guide](https://docs.github.com/en/authentication/connecting-to-github-with-ssh) or use HTTPS with a personal access token instead.
---
## 2. Environment Variables
The service requires environment variables for configuration. **Never commit actual secrets to the repository.**
### Step 1 — Copy the example file
```bash
cp .env.example .env.local
```
### Step 2 — Fill in the values
Open `.env.local` in your editor. Below is a description of every variable and where to get its value:
| Variable | Description | Where to get it | Example (not real) |
|---|---|---|---|
| `APP_ENV` | Environment name | Set to `development` | `development` |
| `APP_PORT` | Port the service listens on | Set to `8080` for local | `8080` |
| `DATABASE_URL` | PostgreSQL connection string | Use value from Docker Compose (Section 3) | `postgres://app:password@localhost:5432/[service]_dev` |
| `REDIS_URL` | Redis connection string | Use value from Docker Compose | `redis://localhost:6379` |
| `SECRET_KEY` | Application secret key | Generate with: `openssl rand -hex 32` | `[random 64-char hex]` |
| `[EXTERNAL_SERVICE]_API_KEY` | API key for [External Service] | Retrieve from [1Password vault: "Dev API Keys"] or ask [name] | — |
| `[EXTERNAL_SERVICE]_BASE_URL` | Base URL for [External Service] | Use sandbox URL: `https://sandbox.[external-service].com` | `https://sandbox.stripe.com` |
| `LOG_LEVEL` | Logging verbosity | Set to `debug` for local development | `debug` |
| `[FEATURE_FLAG_SDK_KEY]` | Feature flag platform SDK key | Retrieve from [LaunchDarkly/Split dev project] | — |
**Using direnv (recommended):** Create an `.envrc` file containing the line `dotenv .env.local`, then run `direnv allow`. Variables will load automatically when you `cd` into the project.
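A common setup failure is a variable that exists in `.env.example` but was never copied into `.env.local`. The following sketch compares the keys of the two files; it assumes simple `KEY=value` lines with optional `export` prefixes and `#` comments, and `parse_env_keys` and `missing_keys` are illustrative helper names.

```python
def parse_env_keys(text: str) -> set[str]:
    """Extract variable names from dotenv-style text."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example_text: str, local_text: str) -> list[str]:
    """Return variables defined in the example file but absent from the local file."""
    return sorted(parse_env_keys(example_text) - parse_env_keys(local_text))
```

Reading both files with `pathlib.Path(...).read_text()` and printing the result of `missing_keys` gives a quick check before the first `make run`.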
---
## 3. Local Service Dependencies
All infrastructure dependencies run in Docker Compose. You do not need to install PostgreSQL, Redis, or Kafka locally.
```bash
# Start all dependencies (PostgreSQL, Redis, and any other services)
docker compose up -d
# Verify all containers are healthy
docker compose ps
# Expected output: all services show "healthy" status
# View logs if something is not healthy
docker compose logs [service-name]
```
### What Docker Compose Starts
| Service | Port | Purpose | Health check |
|---|---|---|---|
| PostgreSQL [version] | `5432` | Primary database | `pg_isready -U app` |
| Redis [version] | `6379` | Cache and session store | `redis-cli ping` |
| [Kafka + Zookeeper] | `9092` / `2181` | Message queue | `kafka-topics.sh --bootstrap-server localhost:9092 --list` |
| [Mock server — e.g. WireMock] | `8089` | Mocks for external APIs in tests | `curl localhost:8089/__admin` |
| [LocalStack] | `4566` | AWS service emulation (S3, SQS, etc.) | `aws --endpoint-url=http://localhost:4566 s3 ls` |
**If a container exits immediately:** See Troubleshooting section — common causes are port conflicts and Docker memory limits.
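When scripting the setup, it helps to wait until the dependency ports accept connections before running migrations. The sketch below polls TCP ports on localhost; the port numbers would come from the table above, and `port_open` and `wait_for_ports` are hypothetical helpers. An open port only shows that something is listening, so `docker compose ps` remains the authoritative health check.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(ports, host="localhost", deadline_s=60.0, interval_s=1.0, probe=port_open):
    """Poll until every port is reachable; return the ports still closed at the deadline."""
    pending = list(ports)
    end = time.monotonic() + deadline_s
    while pending:
        pending = [p for p in pending if not probe(host, p)]
        if not pending or time.monotonic() >= end:
            break
        time.sleep(interval_s)
    return pending
```

For example, `wait_for_ports([5432, 6379])` returns an empty list once PostgreSQL and Redis are both accepting connections.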
### Stopping Dependencies
```bash
# Stop containers (preserves data volumes)
docker compose stop
# Stop and remove containers (clears data — use when you want a fresh start)
docker compose down -v
```
---
## 4. Install Dependencies and Build
```bash
# Install language dependencies
# Go:
go mod download
# Node.js:
npm install # or: yarn install / pnpm install
# Python:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements-dev.txt
# Verify build compiles cleanly
make build
# Expected: no errors; binary or compiled output in [./bin/ or ./dist/]
```
---
## 5. Database Setup and Seeding
```bash
# Run database migrations (creates tables and schema)
make db-migrate
# Or directly:
# [Migration command — e.g. "go run ./cmd/migrate up" or "alembic upgrade head" or "npm run db:migrate"]
# Verify migrations applied
# psql $DATABASE_URL -c "\dt" # should list all tables
# Seed the database with development data
make db-seed
# Or directly:
# [Seed command — e.g. "go run ./cmd/seed" or "python scripts/seed.py" or "npm run db:seed"]
# Verify seed data is present
# psql $DATABASE_URL -c "SELECT COUNT(*) FROM [primary-table]"
# Expected: [N] rows
```
**What the seed creates:**
- [N] test user accounts (credentials in [scripts/seed/README.md or .env.example])
- [N] sample [resources] for development and testing
- Admin account: `[admin@example.com]` / password: see `.env.example` for dev password variable
**To reset to a clean state:**
```bash
docker compose down -v # wipe database volume
docker compose up -d # start fresh
make db-migrate
make db-seed
```
---
## 6. Running the Service
```bash
# Run the service locally
make run
# Or directly:
# [Run command — e.g. "go run ./cmd/server" or "python app.py" or "npm run dev"]
# Expected output:
# [Example of healthy startup log lines — e.g.:]
# {"level":"info","message":"Database connected","host":"localhost","port":5432}
# {"level":"info","message":"Redis connected","host":"localhost","port":6379}
# {"level":"info","message":"Server listening","port":8080}
```
### Verify It's Working
```bash
# Health check
curl http://localhost:8080/health
# Expected: {"status":"ok","version":"[git-sha]"}
# Test a key endpoint (authenticated)
# First, get a dev token:
curl -X POST http://localhost:8080/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"[dev-user-from-seed]@example.com","password":"[dev-password-from-env]"}'
# Copy the token from the response, then:
curl http://localhost:8080/api/v1/[resource] \
-H "Authorization: Bearer [token-from-above]"
# Expected: 200 with JSON response
```
### Hot Reload (for Development)
```bash
# Run with hot reload — service restarts automatically on file changes
make run-dev
# Or:
# [Hot reload command — e.g. "air" for Go / "uvicorn --reload" for Python / "npm run dev" for Node]
```
---
## 7. Running Tests
```bash
# Run the full test suite
make test
# Or:
# [Test command — e.g. "go test ./..." or "pytest" or "npm test"]
# Run tests with coverage report
make test-coverage
# Coverage report: [./coverage.html or stdout]
# Run a specific test file or test case
# Go: go test ./pkg/[package]/... -run TestFunctionName
# Python: pytest tests/test_[module].py::TestClass::test_method -v
# Node: npm test -- --testPathPattern=[filename]
# Run only unit tests (fast — no external dependencies)
make test-unit
# Run only integration tests (requires Docker Compose dependencies running)
make test-integration
```
**Expected test results:**
- Unit tests: [N] tests, all pass, [<30] seconds
- Integration tests: [N] tests, all pass, [<2] minutes
- Coverage: [≥80]% (enforced in CI — tests fail below this threshold)
**Before pushing a PR, always run:**
```bash
make lint # code linting — must pass
make test # full test suite — must pass
make build # verify compilation — must pass
```
---
## 8. IDE Setup
### VS Code (Recommended)
Install the recommended extensions (VS Code will prompt you automatically):
```json
// .vscode/extensions.json — already in the repository
{
"recommendations": [
"[language-extension — e.g. golang.go]",
"dbaeumer.vscode-eslint",
"esbenp.prettier-vscode",
"ms-azuretools.vscode-docker",
"eamodio.gitlens"
]
}
```
Workspace settings are in `.vscode/settings.json` — format on save is enabled, linter is configured automatically.
**[Language]-specific setup:**
```
[e.g. Go: The gopls language server is installed automatically by the Go extension.
Run "Go: Install/Update Tools" from the command palette after installing the extension.]
```
### JetBrains (IntelliJ / GoLand / PyCharm / WebStorm)
- Open the project root as the project directory
- [Language SDK]: set to [version] — File → Project Structure → SDKs
- Run configurations are checked into `.idea/runConfigurations/` — they appear automatically
- Enable "Run formatters on save" in Settings → Tools → Actions on Save
---
## 9. Common Gotchas and Troubleshooting
### Docker container exits immediately on startup
**Symptom:** `docker compose ps` shows a container as `Exited (1)` seconds after starting.
```bash
# Check the container logs for the error
docker compose logs [container-name]
# Common causes:
# 1. Port already in use — find and kill the conflicting process:
lsof -ti tcp:[port] | xargs kill -9
# 2. Docker doesn't have enough memory — allocate at least 4GB in Docker Desktop:
# Docker Desktop → Settings → Resources → Memory → 4GB
# 3. M1/M2 Mac architecture mismatch — add platform directive to docker-compose.yml:
# platform: linux/amd64
```
### Database connection refused
**Symptom:** Service fails to start with "connection refused" or "dial tcp localhost:5432: connect: connection refused"
```bash
# Is PostgreSQL actually running?
docker compose ps postgres
# If not running: docker compose up -d postgres
# Is it on the right port?
lsof -i :5432
# Can you connect manually?
psql postgres://app:password@localhost:5432/[service]_dev -c "SELECT 1"
# If using a custom DATABASE_URL, verify it matches the docker-compose.yml settings exactly
```
### Migrations fail with "relation already exists"
**Symptom:** `make db-migrate` errors with "ERROR: relation [table] already exists"
```bash
# Check current migration state
[migration status command — e.g. "go run ./cmd/migrate status" or "alembic current"]
# The database may be in a partial state — reset it:
docker compose down -v
docker compose up -d
make db-migrate # should now succeed on a clean database
```
### Tests fail with "connection refused" or dependency errors
**Symptom:** Integration tests fail because they cannot connect to PostgreSQL or Redis.
```bash
# Integration tests need Docker Compose running
docker compose up -d
# Verify all containers are healthy before running tests
docker compose ps # all should show "healthy"
# If containers are running but tests still fail, check environment variables:
make test-integration # should pick up .env.local automatically
# If not: source .env.local && make test-integration
```
### `make lint` fails on a fresh checkout
**Symptom:** Lint errors on files you have not modified.
```bash
# Formatting issue — auto-fix with:
# Go:
gofmt -w .
goimports -w .
# Python:
black .
isort .
# Node/TypeScript:
npm run lint:fix
# Or: npx eslint --fix . && npx prettier --write .
# Re-run lint to confirm
make lint
```
### Environment variables not loading
**Symptom:** Service starts but immediately fails with "missing required environment variable: [VAR]"
```bash
# Verify .env.local exists and has all required variables
cat .env.local | grep "^[A-Z]" | awk -F= '{print $1}'
# Compare against required variables in .env.example
diff <(grep "^[A-Z_]*=" .env.example | cut -d= -f1 | sort) \
<(grep "^[A-Z_]*=" .env.local | cut -d= -f1 | sort)
# Variables missing from .env.local are prefixed with "<"; variables only in .env.local are prefixed with ">"
```
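The same comparison can be sketched in Python for environments where process substitution is unavailable (for example, a non-bash shell). This is a minimal sketch that assumes plain `KEY=value` lines; the function names `env_keys` and `missing_keys` are illustrative and not part of the repository.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example_text: str, local_text: str) -> list[str]:
    """Names present in the example file but absent from the local file."""
    return sorted(env_keys(example_text) - env_keys(local_text))


if __name__ == "__main__":
    example = Path(".env.example")
    local = Path(".env.local")
    # Only report when both files exist in the current directory.
    if example.exists() and local.exists():
        for name in missing_keys(example.read_text(), local.read_text()):
            print(f"missing from .env.local: {name}")
```

Run it from the repository root; it prints nothing when every variable in `.env.example` is also defined in `.env.local`.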
---
## 10. First Contribution Checklist
Before opening your first pull request, verify:
**Setup complete:**
- [ ] `make build` passes with no errors
- [ ] `make test` passes — all tests green
- [ ] `make lint` passes — no lint errors
- [ ] Service starts and health check returns 200
- [ ] You can authenticate and call at least one API endpoint
**Git and GitHub:**
- [ ] You have read [CONTRIBUTING.md] — code standards, commit message format, PR process
- [ ] Your git user.name and user.email are set correctly
- [ ] Pre-commit hooks are installed (`ls .git/hooks/pre-commit` should list the file)
- [ ] You have branched from `main` (not committing directly to main)
**Development workflow:**
- [ ] You know how to run a specific test: `[test command for single test]`
- [ ] You know how to reset the database: `docker compose down -v && docker compose up -d && make db-migrate && make db-seed`
- [ ] You have joined [Slack: #[team-channel]] and [#[service-consumers-channel] if applicable]
- [ ] You have read the [architecture overview doc / README] — you understand what this service does
**First PR:**
- [ ] Changes are small and focused — one logical change per PR
- [ ] Tests are added or updated for your change
- [ ] `make test && make lint && make build` all pass locally before requesting review
- [ ] PR description explains what changed and why (use the [pr-description-writer skill] if needed)
---
## Quality Checks
- [ ] A new engineer with no prior knowledge of the project can follow this guide from start to finish without asking anyone for help
- [ ] Every command is tested on a clean environment — not written from memory and assumed to work
- [ ] Environment variables table covers every variable in `.env.example` — no undocumented variables
- [ ] The troubleshooting section covers the 5 most common real failures observed during onboarding — not theoretical issues
- [ ] Docker Compose version and Docker Desktop memory requirements are stated explicitly
- [ ] "Expected output" is shown for key commands so engineers know whether a step succeeded
- [ ] Setup time estimate is honest — verified by timing a real onboarding session, not estimated
## Anti-Patterns
- [ ] Do not write setup steps from memory without testing them on a clean machine — steps that skip implicit knowledge break for new engineers
- [ ] Do not leave environment variables undocumented — every variable in .env.example must appear in the Variables table with a description and source
- [ ] Do not write troubleshooting entries for theoretical issues — only include problems that have actually occurred during real onboarding sessions
- [ ] Do not assume Docker Desktop is configured correctly — memory limits and platform (M1/M2) compatibility must be explicitly called out
- [ ] Do not omit expected output for key commands — without "expected output", engineers cannot tell whether a step succeeded or silently failed
@@ -0,0 +1,298 @@
---
name: microservices-decomposition
description: "Design a microservices decomposition for a monolith or new system, defining service boundaries, ownership, communication patterns, and migration plan. Use when asked to decompose a monolith, define service boundaries, design a microservices architecture, or plan a strangler-fig migration. Produces a bounded context map, service inventory table, communication pattern decisions, data ownership matrix, migration roadmap, and risk register."
---
# Microservices Decomposition
Produce a complete microservices decomposition design for a system — whether decomposing an existing monolith or designing service boundaries for a new system. Ground the decomposition in Domain-Driven Design (DDD) concepts: identify bounded contexts first, then derive service boundaries from them. Include communication pattern decisions (sync vs. async, event vs. RPC), data ownership rules, and a pragmatic migration plan if decomposing a monolith. Conway's Law is real — include an organizational alignment section. The deliverable should be specific enough that a team can begin implementation, not an abstract architectural diagram.
## Required Inputs
Ask for these if not already provided:
- **System or domain description** — what the system does, its core domain, and the key business processes it supports
- **Current architecture** — monolith (describe the tech stack and rough module structure), partial services (list existing services), or greenfield
- **Team structure** — number of teams, team names if known, and approximate team sizes; this drives service ownership
- **Performance and scalability requirements** — any specific SLAs, load characteristics, or scaling constraints per domain area
- **Migration constraints** — what cannot be rewritten all at once, hard deadlines, zero-downtime requirements, budget constraints
- **Integration points** — external systems, third-party APIs, or legacy systems that cannot be changed
If decomposing a monolith, also ask for: approximate codebase size, what is most painful to change today, and where the team experiences the most coupling-related friction.
## Output Format
---
# Microservices Decomposition: [System Name]
**Author:** [Name / Team]
**Date:** [Date]
**Architecture type:** [Monolith decomposition / New system design]
**Current state:** [One sentence describing what exists today]
**Target state:** [One sentence describing the desired end state]
---
## 1. Domain Analysis
### Core Domain
[One paragraph: what is the core domain of this system? What does the business fundamentally do? What gives it competitive differentiation? The core domain gets the most investment and the cleanest service boundaries.]
### Domain Map
List every significant subdomain before assigning service boundaries. Classify each subdomain:
| Subdomain | Type | Description | Current Location in Monolith |
|-----------|------|-------------|------------------------------|
| [Subdomain, e.g., Order Management] | Core | [What it does and why it matters] | [Module/package name or "new"] |
| [Subdomain, e.g., Inventory] | Core | [Description] | [Location] |
| [Subdomain, e.g., Notifications] | Supporting | [Description] | [Location] |
| [Subdomain, e.g., Billing] | Supporting | [Description] | [Location] |
| [Subdomain, e.g., Reporting] | Generic | [Description — candidates for off-the-shelf solutions] | [Location] |
| [Subdomain, e.g., User Auth] | Generic | [Description] | [Location] |
**Subdomain types:** Core = competitive differentiation, build with care; Supporting = necessary but not differentiating, build pragmatically; Generic = commodity, buy or use open source.
---
## 2. Bounded Context Map (ASCII)
```
┌─────────────────────────────────────────────────────────────────┐
│ [System Name] │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ [Context A] │ │ [Context B] │ │
│ │ │─ ─►│ │ │
│ │ [key concepts] │ │ [key concepts] │ │
│ └──────────────────┘ └──────────────────┘ │
│ │ │ │
│ │ event │ sync │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ [Context C] │ │ [Context D] │ │
│ │ │ │ │ │
│ │ [key concepts] │ │ [key concepts] │ │
│ └──────────────────┘ └──────────────────┘ │
│ │ │
│ ┌────────┘ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ [Context E] │ │
│ │ [key concepts] │ │
│ └──────────────────┘ │
│ │
│ External: [Third-party system] ──► [Context that owns it] │
└─────────────────────────────────────────────────────────────────┘
Legend: ──► sync call   ─ ─► async event   ═══ shared kernel
```
Render this map using the actual bounded contexts derived from the domain analysis. Place contexts that communicate frequently closer together. Label relationship types on arrows.
### Context Relationships
| Upstream Context | Downstream Context | Relationship Type | Integration Pattern |
|-----------------|-------------------|------------------|---------------------|
| [Context A] | [Context B] | Customer-Supplier | REST API call |
| [Context B] | [Context C] | Published Language | Domain events via message bus |
| [Context X] | [Context Y] | Conformist | [Downstream conforms to upstream's model] |
| [Context X] | [Context Y] | Anti-Corruption Layer | [ACL translates upstream model to local model] |
---
## 3. Proposed Service Inventory
| Service Name | Bounded Context | Core Responsibility | Team Owner | Tech Stack | Priority |
|-------------|----------------|--------------------|-----------|-----------|---------|
| [service-name] | [Context] | [One sentence: what this service owns and does] | [Team] | [Language/framework] | [P1/P2/P3] |
| [service-name] | [Context] | [Responsibility] | [Team] | [Stack] | [Priority] |
| [service-name] | [Context] | [Responsibility] | [Team] | [Stack] | [Priority] |
| [service-name] | [Context] | [Responsibility] | [Team] | [Stack] | [Priority] |
| [service-name] | [Context] | [Responsibility] | [Team] | [Stack] | [Priority] |
**Service count:** [N proposed services] for [M bounded contexts]. [Note if any context maps to multiple services and why — e.g., "the Orders context splits into order-intake and order-fulfillment because they have different scalability requirements."]
### Service Responsibility Rules (applied to every service above)
- Single bounded context ownership — a service does not straddle two bounded contexts
- Owns its own data — no direct database access by other services
- Independently deployable — no coordinated deploys required with other services
- Has a named team owner — no shared ownership of a single service across teams
- Exposes a defined API contract — not internal implementation
---
## 4. Inter-Service Communication Patterns
### Pattern Decision Matrix
| Communication Need | Recommended Pattern | Rationale |
|-------------------|--------------------|-----------|
| Query another service's current state | Synchronous REST / gRPC | Low latency required; caller needs immediate response |
| Notify other services of a state change | Async domain event | Decouples services; multiple consumers; sender doesn't care when it's processed |
| Long-running workflow spanning services | Async saga (choreography or orchestration) | No single service owns the full workflow; rollback needed if steps fail |
| Read-heavy cross-service aggregation | CQRS read model / materialized view | Avoid chatty sync calls at read time; build purpose-fit read models |
| Real-time push to clients | WebSocket gateway service | Centralizes connection management; services emit events, gateway pushes |
### Per-Service Communication Decisions
| Service | Calls (sync) | Publishes (events) | Subscribes to (events) |
|---------|-------------|-------------------|----------------------|
| [service-name] | [service-name (endpoint)] | [EventName] | [EventName] |
| [service-name] | — | [EventName], [EventName] | [EventName] |
| [service-name] | [service-name (endpoint)] | — | [EventName] |
### Event Catalog
| Event Name | Producer | Consumers | Payload (key fields) | Trigger |
|-----------|---------|---------|---------------------|---------|
| [OrderPlaced] | [order-service] | [inventory-service, notification-service] | `orderId, customerId, lineItems, totalAmount` | Customer submits order |
| [InventoryReserved] | [inventory-service] | [order-service] | `orderId, reservationId, items` | Inventory successfully reserved |
| [PaymentProcessed] | [payment-service] | [order-service, notification-service] | `orderId, paymentId, amount, status` | Payment confirmed |
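To make a catalog row concrete, here is a minimal sketch of how the `OrderPlaced` event might be represented and serialized. The payload fields come from the table above; the envelope fields (`event_name`, `producer`, `event_id`, `occurred_at`) and the helper `to_message` are assumptions added for illustration, not a prescribed schema.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderPlaced:
    # Payload fields taken from the event catalog row.
    orderId: str
    customerId: str
    lineItems: list
    totalAmount: int
    # Envelope fields below are illustrative assumptions.
    event_name: str = "OrderPlaced"
    producer: str = "order-service"
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_message(event: OrderPlaced) -> str:
    """Serialize the event to a JSON string suitable for a message bus."""
    return json.dumps(asdict(event), sort_keys=True)
```

A unique `event_id` per event lets consumers deduplicate redelivered messages, which matters whenever the bus offers at-least-once delivery.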
---
## 5. Data Ownership Matrix
Each piece of data has exactly one owning service. Other services may cache or project a read model, but they do not write to the owner's database.
| Data Entity | Owner Service | Authoritative Store | Consumers | Access Pattern |
|-------------|--------------|--------------------|-----------|---------------|
| [Order] | [order-service] | [PostgreSQL] | [fulfillment-service, reporting-service] | Event subscription + read API |
| [Customer] | [customer-service] | [PostgreSQL] | [order-service, notification-service] | Sync API call |
| [Product Catalog] | [catalog-service] | [PostgreSQL] | [order-service, inventory-service] | Sync API + cached local copy |
| [Inventory Level] | [inventory-service] | [Redis + PostgreSQL] | [catalog-service (read only)] | Event subscription |
| [Payment Record] | [payment-service] | [PostgreSQL] | [order-service] | Event subscription |
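The "exactly one owning service" rule can be checked mechanically once the matrix is filled in. The sketch below assumes the matrix has been exported as `(entity, owner_service)` pairs; the function name `find_ownership_conflicts` is hypothetical.

```python
from collections import defaultdict


def find_ownership_conflicts(rows):
    """Return entities claimed by more than one owning service.

    `rows` is an iterable of (entity, owner_service) pairs.
    """
    owners = defaultdict(set)
    for entity, owner in rows:
        owners[entity].add(owner)
    # Keep only entities with two or more distinct owners.
    return {entity: sorted(names) for entity, names in owners.items() if len(names) > 1}
```

An empty result means every entity has a single owner; any entry in the result is a boundary decision that still needs to be made.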
### Data Migration (if decomposing a monolith)
| Data Entity | Current Location | Target Service | Migration Approach | Data Volume | Risk |
|-------------|-----------------|---------------|-------------------|-------------|------|
| [Entity] | [monolith.orders table] | [order-service] | Dual-write then cut over | [X rows] | [High/Med/Low] |
| [Entity] | [monolith.users table] | [customer-service] | Extract and sync via CDC | [X rows] | [High/Med/Low] |
---
## 6. API Contract Definitions
Define the surface area for each service. Full OpenAPI specs are written separately; this section establishes the contract boundaries.
### [service-name] API
**Base path:** `/api/v1/[resource]`
**Owner team:** [Team]
**SLA:** [p99 latency target, availability target]
| Endpoint | Method | Description | Auth Required | Rate Limit |
|----------|--------|-------------|--------------|------------|
| `/[resources]` | GET | List [resources] with pagination | Yes | [X req/min] |
| `/[resources]/{id}` | GET | Get single [resource] by ID | Yes | [X req/min] |
| `/[resources]` | POST | Create new [resource] | Yes | [X req/min] |
| `/[resources]/{id}` | PUT | Update [resource] | Yes | [X req/min] |
| `/[resources]/{id}` | DELETE | Soft-delete [resource] | Yes — elevated | [X req/min] |
[Repeat for each service.]
---
## 7. Strangler Fig Migration Plan (for monolith decomposition)
Use the strangler fig pattern: extract services incrementally, route traffic through a facade, and retire monolith modules one at a time.
### Migration Phases
```
Phase 1: Foundation (Weeks 1-[N])
- Deploy service infrastructure (CI/CD, observability, service mesh)
- Extract lowest-risk, highest-value service first
- Monolith continues to serve all traffic
Phase 2: First Extractions (Weeks [N]-[M])
- Extract P1 services
- API gateway routes selected traffic to new services
- Monolith handles remaining traffic via facade pattern
- Both paths write to shared DB during transition (dual-write)
Phase 3: Core Domain Services (Weeks [M]-[P])
- Extract P1 core domain services
- Data migration for extracted services
- Remove dual-write paths for completed migrations
Phase 4: Monolith Retirement (Weeks [P]-[Q])
- Extract remaining services
- Monolith serves no production traffic
- Decommission monolith infrastructure
```
### Phase-by-Phase Roadmap
| Phase | Service to Extract | Migration Approach | Team | Duration | Dependencies | Success Criteria |
|-------|------------------|--------------------|------|----------|-------------|-----------------|
| 1 | [service-name] | [Strangler facade / Branch by abstraction / Event interception] | [Team] | [X weeks] | [Infra ready, CI/CD pipeline] | [Traffic fully on new service, zero errors for 2 weeks] |
| 2 | [service-name] | [Approach] | [Team] | [X weeks] | [Phase 1 complete] | [Success metric] |
| 3 | [service-name] | [Approach] | [Team] | [X weeks] | [Phase 2 complete] | [Success metric] |
### Rollback Plan
For each migration phase, define the rollback trigger and mechanism:
- **Rollback trigger:** Error rate on new service > [X%] sustained for [Y minutes], or p99 latency > [threshold]
- **Rollback mechanism:** API gateway feature flag reverts all traffic to monolith path in < 5 minutes
- **Data rollback:** Dual-write maintained for [X weeks] after cutover to allow replay if needed
---
## 8. Organizational Alignment (Conway's Law)
Conway's Law: the architecture of a system mirrors the communication structure of the organization that builds it. Design service ownership to match team boundaries — or change the team boundaries.
| Service | Proposed Owner Team | Current Team Assignment | Change Required |
|---------|--------------------|-----------------------|-----------------|
| [service-name] | [Team A] | [Same / Different] | [No change / Transfer to Team A / New team needed] |
| [service-name] | [Team B] | [Team A currently] | [Transfer ownership] |
**Misalignments identified:**
- [Misalignment 1: e.g., "The notification service spans two teams today. Assign it entirely to Team B which already owns the messaging domain."]
- [Misalignment 2: e.g., "The reporting service is owned by Data Eng but consumers are Product teams — establish a clear API contract and SLA."]
**Team topology recommendation:** [Describe the recommended team structure — stream-aligned teams, platform team, enabling team — and how it maps to the proposed services.]
---
## 9. Risk Register
| Risk | Likelihood | Impact | Mitigation | Owner |
|------|-----------|--------|-----------|-------|
| Data consistency across services during migration | High | High | Dual-write with reconciliation job; event sourcing for critical domains | [Name] |
| Distributed transaction complexity (sagas) | Medium | High | Start with choreography; add orchestration only when choreography becomes unmanageable | [Name] |
| Service mesh operational overhead | Medium | Medium | Start without a mesh; add after 5+ services deployed | [Name] |
| Network latency replacing in-process calls | Medium | Medium | Cache aggressively; design read models to avoid chatty sync calls | [Name] |
| Conway's Law friction during transition | High | Medium | Align team structure before starting extraction, not after | [Name] |
| Over-decomposition (nanoservices) | Medium | High | Enforce minimum service size rule: a service must justify its own team/deployment overhead | [Name] |
| Observability gaps during migration | High | High | Deploy distributed tracing before first extraction; establish correlation IDs | [Name] |
| [Context-specific risk] | [Level] | [Level] | [Mitigation] | [Owner] |
---
*Questions about this design: [Slack channel or contact]*
---
## Quality Checks
- [ ] Bounded context map is an ASCII diagram with labeled relationships — not a prose description of the contexts
- [ ] Every service in the inventory table has a named team owner and a clear single-sentence responsibility statement
- [ ] Data ownership matrix assigns every key entity to exactly one owning service — no shared ownership
- [ ] Communication pattern decisions explain WHY sync vs. async was chosen for each interaction type
- [ ] If decomposing a monolith, the strangler fig migration plan has phases with durations, dependencies, and success criteria
- [ ] Risk register addresses at minimum: data consistency, distributed transactions, and Conway's Law alignment
- [ ] Organizational alignment section maps services to teams and identifies misalignments that need to be resolved
## Anti-Patterns
- [ ] Do not define service boundaries before completing the domain analysis — services derived without bounded context mapping will split the wrong things and couple the wrong things
- [ ] Do not assign multiple teams as co-owners of a single service — shared ownership is no ownership; every service needs exactly one team accountable for it
- [ ] Do not default to synchronous REST calls for all inter-service communication — using sync calls where async events would decouple services creates cascading failure modes
- [ ] Do not propose more than one service per bounded context without a clear justification — over-decomposition (nanoservices) creates operational overhead that exceeds the decomposition benefit
- [ ] Do not begin migration without deploying distributed tracing first — migrating without observability means flying blind when the first extraction causes a production incident
@@ -0,0 +1,444 @@
---
name: monitoring-setup-guide
description: "Write a monitoring setup guide for a service — defining what to measure, how to alert on it, and how to build the observability stack covering the four golden signals, business metrics, log strategy, distributed tracing, alerting rules, dashboard layout, and observability debt. Use when asked to set up monitoring for a service, define alerting strategy, write an observability plan, create a dashboard specification, or document logging standards for a team. Produces a metric definitions table, alert rules specification, dashboard layout wireframe, log schema, tracing setup checklist, and monitoring gap analysis."
---
# Monitoring Setup Guide Skill
Produce a complete monitoring setup guide for a service — defining exactly what to measure, how to structure logs, how to configure alerts with actionable thresholds, and how to build dashboards that answer real operational questions. A good monitoring guide eliminates "we don't know what's happening in production" as a root cause category, and gives on-call engineers a single source of truth for what healthy looks like.
## Required Inputs
Ask for these if not already provided:
- **Service name and description** — what the service does and its role in the system
- **Tech stack** — language, framework, and infrastructure (e.g. Go/gRPC on Kubernetes, Python/FastAPI on ECS)
- **Current monitoring tooling** — Datadog, Prometheus + Grafana, CloudWatch, New Relic, Honeycomb, or none yet
- **Key user journeys** — the 2-4 most important things a user or consumer does with the service (these drive what to alert on)
- **Existing alerts** — paste any existing alert configurations or describe what's currently monitored
## Output Format
---
# Monitoring Setup Guide: [Service Name]
**Team:** [Team name] | **Tech lead:** [Name]
**Stack:** [Language/Framework] on [Infrastructure]
**Monitoring platform:** [Datadog / Prometheus+Grafana / CloudWatch / etc.]
**Date:** [Date] | **Review cycle:** Quarterly
---
## 1. Monitoring Philosophy
Good monitoring answers three questions:
1. **Is the service healthy right now?** (alerting)
2. **Was it healthy in the past, and is it trending worse?** (dashboards + SLO tracking)
3. **Why did something fail?** (logs + traces)
This guide defines the answers for [Service Name]. Every alert must be actionable — if an on-call engineer cannot take a specific action in response to the alert, the alert should not exist.
**Key user journeys monitored:**
- Journey 1: [e.g. "User submits a payment — POST /charges, receives confirmation"]
- Journey 2: [e.g. "User views transaction history — GET /transactions"]
- Journey 3: [e.g. "Subscription renewal job runs — background worker processes billing events"]
---
## 2. The Four Golden Signals
Apply the four golden signals specifically to [Service Name]:
### Latency
Latency measures how long requests take to complete. Track it separately for successful and failed requests: errors that fail fast pull the aggregate down and can mask slow successful requests, while slow errors are a problem in their own right.
| Metric | Description | Source | Dimensions |
|---|---|---|---|
| `[service].request.duration_ms` | End-to-end request latency | Application instrumentation | `endpoint`, `method`, `status_code` |
| `[service].db.query_duration_ms` | Database query latency | ORM / query instrumentation | `query_name`, `table` |
| `[service].external.request_duration_ms` | Outbound call latency to dependencies | HTTP client instrumentation | `target_service`, `endpoint` |
| `[service].queue.processing_duration_ms` | Time to process one message (if applicable) | Consumer instrumentation | `queue_name`, `message_type` |
**Latency SLO targets:**
| Endpoint / operation | p50 target | p95 target | p99 target |
|---|---|---|---|
| `GET /api/v1/[resource]` | < [50] ms | < [200] ms | < [500] ms |
| `POST /api/v1/[resource]` | < [100] ms | < [400] ms | < [1000] ms |
| `GET /health` | < [10] ms | < [20] ms | < [50] ms |
| [Background job name] | < [5] sec | < [15] sec | < [60] sec |
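To check raw samples against the targets above, percentiles can be computed per outcome group. The sketch below uses the nearest-rank method and treats only 5xx responses as failures; both choices, and the names `percentile` and `latency_summary`, are assumptions for illustration. A metrics platform will normally compute these from histograms instead.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list, with pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(requests):
    """Summarize latency separately for server errors and all other requests.

    `requests` is an iterable of (duration_ms, status_code) pairs.
    """
    groups = {"ok": [], "server_error": []}
    for duration_ms, status in requests:
        groups["server_error" if status >= 500 else "ok"].append(duration_ms)
    # Omit a group entirely when it has no samples.
    return {
        name: {p: percentile(vals, p) for p in (50, 95, 99)}
        for name, vals in groups.items()
        if vals
    }
```

Keeping the two groups apart makes it visible when a burst of fast 5xx responses would otherwise make the aggregate look healthier than it is.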
### Traffic
Traffic measures demand on the system. Use it to detect unexpected spikes and traffic drops (which can indicate upstream failures), and to plan capacity.
| Metric | Description | Source |
|---|---|---|
| `[service].request.count` | Requests per second | Application / load balancer |
| `[service].request.count_by_endpoint` | RPS broken down by endpoint | Application |
| `[service].queue.messages_consumed_per_second` | Consumer throughput | Queue consumer |
| `[service].queue.depth` | Messages waiting in queue | Queue metrics |
**Traffic baselines (update after observing production for 2+ weeks):**
| Time period | Expected RPS | Low-traffic floor | Spike ceiling |
|---|---|---|---|
| Peak (weekday business hours) | [N] RPS | [N × 0.5] RPS | [N × 5] RPS |
| Off-peak (nights/weekends) | [N × 0.2] RPS | [N × 0.05] RPS | [N] RPS |
### Errors
Errors measure the fraction of requests that fail. Distinguish between client errors (4xx — caller is doing something wrong) and server errors (5xx — the service is broken).
| Metric | Description | Alert on? |
|---|---|---|
| `[service].request.error_rate` | 5xx errors / total requests | Yes — see alert rules |
| `[service].request.client_error_rate` | 4xx errors / total requests | Threshold alert — sudden spike may indicate API misuse |
| `[service].dependency.error_rate` | Errors calling downstream dependencies | Yes — downstream health signal |
| `[service].queue.dlq_depth` | Messages in dead-letter queue | Yes — indicates processing failures |
### Saturation
Saturation measures how "full" the service is: how close its most constrained resources are to maximum capacity.
| Resource | Metric | Alert threshold | Source |
|---|---|---|---|
| CPU | `[service].cpu.utilisation_pct` | >80% sustained 5 min | Container / VM metrics |
| Memory | `[service].memory.utilisation_pct` | >85% sustained 5 min | Container / VM metrics |
| DB connections | `[service].db.connection_pool.utilisation_pct` | >75% | Application / DB metrics |
| Thread pool / goroutines | `[service].runtime.goroutine_count` / `thread_count` | >N (establish baseline) | Runtime metrics |
| Disk (if applicable) | `[service].disk.utilisation_pct` | >75% | Infrastructure |
| Queue depth (if applicable) | `[service].queue.depth` | >[backlog threshold] | Queue metrics |
---
## 3. Business Metrics
Beyond the golden signals, track metrics that measure whether the service is delivering business value. These matter for SLO reporting and product dashboards.
| Metric | Description | Source | Alert? |
|---|---|---|---|
| `[service].[primary_action].success_rate` | [e.g. "Payment success rate"] | Application | Yes — if drops >5% vs 1h average |
| `[service].[primary_action].count` | [e.g. "Payments processed per minute"] | Application | Yes — sudden drop (traffic anomaly) |
| `[service].[resource].created_per_hour` | [e.g. "New accounts created"] | Application / DB | No — informational |
| `[service].cache.hit_rate` | Fraction of requests served from cache | Cache instrumentation | Yes — if drops below [60]% |
| `[service].job.[name].success_rate` | [Background job success rate] | Job framework | Yes — if drops below [99]% |
---
## 4. Log Strategy
### Structured Logging Schema
All logs must be structured JSON. Do not emit unstructured text logs in production. Every log line must include the mandatory fields.
**Mandatory fields (every log line):**
```json
{
"timestamp": "2024-01-15T10:23:45.123Z",
"level": "info",
"service": "[service-name]",
"version": "[git-sha-short]",
"trace_id": "[uuid-from-request-context]",
"span_id": "[span-uuid]",
"request_id": "[uuid-per-request]",
"message": "[human readable description]"
}
```
**Request log (emit for every HTTP request):**
```json
{
"timestamp": "...",
"level": "info",
"service": "[service-name]",
"event": "http_request",
"method": "POST",
"path": "/api/v1/[resource]",
"status_code": 201,
"duration_ms": 45,
"user_id": "[uuid — DO NOT log PII directly]",
"request_id": "[uuid]",
"trace_id": "[uuid]"
}
```
**Error log (emit for every error with context):**
```json
{
"timestamp": "...",
"level": "error",
"service": "[service-name]",
"event": "error",
"error_code": "[application-error-code]",
"error_message": "[description — no sensitive data]",
"stack_trace": "[stack trace]",
"request_id": "[uuid]",
"trace_id": "[uuid]",
"context": {
"[key]": "[relevant context without PII]"
}
}
```
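One way to emit the mandatory fields from a Python service is a custom `logging.Formatter`. This is a minimal sketch, not a required implementation: the class name `JsonFormatter` is hypothetical, and it assumes correlation IDs are passed per call through the standard `extra=` argument. Note that Python's `WARNING` level renders as `warning`, not `warn`, unless mapped explicitly.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the mandatory fields."""

    def __init__(self, service, version):
        super().__init__()
        self.service = service
        self.version = version

    def format(self, record):
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname.lower(),
            "service": self.service,
            "version": self.version,
            # Correlation IDs are expected via `extra=`; None when absent.
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)
```

Attach it with `handler.setFormatter(JsonFormatter("my-service", "abc1234"))` and log with `logger.info("payment processed", extra={"trace_id": trace_id, "request_id": request_id})`.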
### Log Levels — When to Use Each
| Level | Use when | Example |
|---|---|---|
| `error` | Something failed that requires attention — this should page on-call eventually | Database query failed, external API returned 5xx, required config missing |
| `warn` | Something unexpected happened but service is still functioning | Retry succeeded after failure, cache miss on expected hit, rate limit approaching |
| `info` | Significant business events and request lifecycle | Request received, payment processed, user authenticated, job started/completed |
| `debug` | Detailed diagnostic information — off in production by default | Query parameters, intermediate computation results, cache key lookups |
### What NOT to Log
**Never log:**
- Passwords, tokens, API keys, or secrets (even hashed)
- Full credit card numbers or PAN data
- Social security numbers or government IDs
- Full names + dates of birth + contact info in the same log line (PII aggregation)
- Request/response bodies in full (use field-level extraction instead)
- Health check requests (too noisy — exclude `GET /health` from access logs)
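
The logging rules above can be enforced in code rather than by convention. Below is a minimal sketch using only the Python standard library; the `SENSITIVE_KEYS` deny-list, the `fields` attribute name, and the `JsonFormatter` class are illustrative assumptions, not part of any existing service.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical deny-list; extend it to match your own data model.
SENSITIVE_KEYS = {"password", "token", "api_key", "secret", "card_number", "ssn"}

# Map stdlib level names onto the level names used in this standard.
LEVEL_NAMES = {"WARNING": "warn", "CRITICAL": "error"}


def redact(fields: dict) -> dict:
    """Return a copy of fields with sensitive values masked, recursing into nested dicts."""
    clean = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)
        else:
            clean[key] = value
    return clean


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # UTC timestamp with millisecond precision (suffix is +00:00, not Z)
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
            "level": LEVEL_NAMES.get(record.levelname, record.levelname.lower()),
            "service": self.service,
            "message": record.getMessage(),
        }
        # Structured fields arrive via logging's extra={"fields": {...}} argument.
        entry.update(redact(getattr(record, "fields", {})))
        return json.dumps(entry)
```

Attach it with `handler.setFormatter(JsonFormatter("[service-name]"))` and log with `logger.info("payment processed", extra={"fields": {"request_id": "..."}})`. A deny-list only catches known key names, so it complements code review and does not replace it.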
---
## 5. Distributed Tracing Setup
Distributed tracing is mandatory for any service that calls other services. It enables root-cause analysis across service boundaries.
### Instrumentation Checklist
```
[ ] Tracing library installed:
    - Go: go.opentelemetry.io/otel
    - Python: opentelemetry-sdk, opentelemetry-instrumentation
    - Node: @opentelemetry/sdk-node
    - Java: opentelemetry-java-instrumentation
[ ] Tracer initialized at service startup with service name and version
[ ] Trace context propagated via W3C Trace Context headers:
    traceparent: 00-[trace-id]-[span-id]-01
    tracestate: [optional vendor-specific]
[ ] Automatic instrumentation enabled for:
    [ ] Inbound HTTP/gRPC requests (creates root span)
    [ ] Outbound HTTP/gRPC calls (creates child spans)
    [ ] Database queries (creates child spans with sanitized query)
    [ ] Cache operations (Redis, Memcached)
    [ ] Message queue produce/consume
[ ] Custom spans added for:
    [ ] Key business operations ([e.g. payment processing, user lookup])
    [ ] Background jobs (each job execution = root span)
    [ ] Third-party API calls with custom attributes
[ ] Span attributes to capture on all spans:
    - user.id (if authenticated — no PII)
    - deployment.environment (production/staging)
    - service.version (git SHA)
    - [service-specific key attributes]
[ ] Trace exporter configured to: [Datadog / Jaeger / Tempo / OTLP endpoint]
[ ] Sampling rate configured:
    - Production: [1-10]% of requests (adjust based on volume and cost)
    - Always sample: errors, slow requests (>p99 threshold), and 100% of [critical endpoint]
```
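
To make the `traceparent` format in the checklist concrete, here is a small sketch that parses and re-emits a version `00` header in plain Python. In a real service the OpenTelemetry SDK's built-in W3C propagator should do this for you; the `TraceContext` type and both function names are hypothetical and exist only to illustrate the header layout.

```python
import re
from typing import NamedTuple, Optional

# Version "00": 32 hex chars of trace-id, 16 hex chars of parent span-id, 2 hex chars of flags.
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a version-00 traceparent header; return None if it is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero trace or span IDs are invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def format_traceparent(ctx: TraceContext) -> str:
    """Serialise a context back into header form for an outbound call."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-{'01' if ctx.sampled else '00'}"
```

A quick propagation check is to send a request with a known `traceparent` and confirm that the same trace ID appears in the logs of every service the request touches.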
### Trace Instrumentation Examples
```python
# Python — OpenTelemetry example
from opentelemetry import trace

tracer = trace.get_tracer("[service-name]")


def process_payment(payment_data):
    # _do_process and PaymentError are placeholders for your own payment code
    with tracer.start_as_current_span("process_payment") as span:
        span.set_attribute("payment.amount_cents", payment_data["amount"])
        span.set_attribute("payment.currency", payment_data["currency"])
        # Never: span.set_attribute("payment.card_number", ...)
        try:
            result = _do_process(payment_data)
            span.set_status(trace.StatusCode.OK)
            return result
        except PaymentError as e:
            # Mark the span as failed and attach the exception before re-raising
            span.set_status(trace.StatusCode.ERROR, str(e))
            span.record_exception(e)
            raise
```
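
The sampling rule from the checklist (keep every error and slow request, plus a fixed percentage of everything else) can be sketched as a single decision function. This assumes the decision is made after the request finishes, since the error status and duration are only known then; the function name and parameters are illustrative, not an OpenTelemetry API.

```python
import hashlib


def should_keep_trace(
    trace_id: str,
    *,
    is_error: bool,
    duration_ms: float,
    slow_threshold_ms: float,
    sample_rate: float,
) -> bool:
    """Decide whether to export a finished trace."""
    # Always keep the traces that matter most during an incident.
    if is_error or duration_ms > slow_threshold_ms:
        return True
    # Hash the trace ID into a stable number in [0, 1) so every service
    # reaches the same decision for the same trace.
    digest = hashlib.sha256(trace_id.encode("ascii")).digest()
    bucket = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return bucket < sample_rate
```

Hashing the trace ID instead of calling a random number generator keeps traces whole: a trace is either kept by every service or dropped by every service.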
---
## 6. Alert Rules Specification
Every alert must have: a name, a condition, a threshold, a severity, and a clear on-call action. Alerts without a clear action should not exist.
### Alert Definitions
| Alert name | Condition | Threshold | Severity | On-call action |
|---|---|---|---|---|
| `[Service]HighErrorRate` | 5xx error rate, 5-min rolling window | >1% for 2 consecutive windows | P1 | Check recent deploys; inspect error logs; see runbook [link] |
| `[Service]CriticalErrorRate` | 5xx error rate, 2-min rolling window | >5% | P1 — immediate | Same as above — page immediately, do not wait |
| `[Service]HighP99Latency` | p99 latency on key endpoints | >2× SLO target for 3 min | P2 | Check DB latency, cache hit rate, and upstream dependencies |
| `[Service]LatencySLOBreach` | p99 latency | >SLO target for 5 consecutive minutes | P1 | SLO burn — page on-call, escalate if not resolved in 20 min |
| `[Service]HighCPU` | CPU utilisation | >80% sustained for 5 min | P2 | Check for traffic spike; scale up if needed; check for runaway processes |
| `[Service]HighMemory` | Memory utilisation | >85% sustained for 5 min | P2 | Check for memory leak (especially after deploys); restart pod if OOM imminent |
| `[Service]DBConnectionPoolHigh` | DB connection pool utilisation | >75% | P2 | Check for long-running queries; consider scaling service or increasing pool size |
| `[Service]DLQDepthHigh` | Dead-letter queue depth | >10 messages | P2 | Inspect DLQ messages for error pattern; fix bug and replay if safe |
| `[Service]TrafficDropAnomaly` | RPS, compared to same hour yesterday | >50% drop sustained 5 min | P1 | Upstream may be down; check caller health; check load balancer |
| `[Service]PrimaryActionSuccessRateDrop` | [Business metric success rate] | <[95]% over 10 min | P1 | [Service-specific action — e.g. "Check payment provider status"] |
| `[Service]DownstreamDependencyErrors` | Error rate calling [dependency] | >5% over 5 min | P2 | Check [dependency] status page; enable fallback if available |
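
The rule that every alert needs a name, condition, threshold, severity, and a clear on-call action can be checked mechanically before an alert ships. The sketch below is one possible lint, assuming alert definitions are available as plain records; `AlertRule`, `VAGUE_ACTIONS`, and `validate` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import List

# Actions too vague to act on at 3am; an illustrative list, not exhaustive.
VAGUE_ACTIONS = {"investigate", "check", "tbd", "n/a"}


@dataclass(frozen=True)
class AlertRule:
    name: str
    condition: str
    threshold: str
    severity: str
    on_call_action: str


def validate(rule: AlertRule) -> List[str]:
    """Return a list of problems; an empty list means the rule is complete."""
    problems = []
    for field in fields(rule):
        if not getattr(rule, field.name).strip():
            problems.append(f"{field.name} is empty")
    action = rule.on_call_action.strip()
    if action.lower().rstrip(".") in VAGUE_ACTIONS:
        problems.append(f"on_call_action is too vague: {action!r}")
    return problems
```

Running a check like this in CI over the alert definitions keeps alerts without a defined response from reaching production.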
### Alert Configuration Examples
```yaml
# Prometheus / Grafana alerting rules (adapt for your platform)
groups:
  - name: [service-name]-alerts
    rules:
      - alert: [Service]HighErrorRate
        expr: |
          (
            sum(rate([service]_http_requests_total{status=~"5.."}[5m]))
            /
            sum(rate([service]_http_requests_total[5m]))
          ) > 0.01
        for: 2m
        labels:
          severity: critical
          team: [team-name]
        annotations:
          summary: "High error rate on [Service Name]"
          description: "Error rate is {{ $value | humanizePercentage }} (threshold: 1%)"
          runbook_url: "[runbook link]"
      - alert: [Service]HighP99Latency
        expr: |
          histogram_quantile(0.99,
            sum(rate([service]_http_request_duration_seconds_bucket[5m])) by (le, endpoint)
          ) > [0.5]
        for: 3m
        labels:
          severity: warning
          team: [team-name]
        annotations:
          summary: "p99 latency elevated on [Service Name]"
          description: "p99 latency on {{ $labels.endpoint }} is {{ $value | humanizeDuration }}"
          runbook_url: "[runbook link]"
```
```python
# Datadog monitor configuration (Python SDK or Terraform)
import datadog
datadog.initialize(api_key="[key]", app_key="[key]")
datadog.api.Monitor.create(
type="metric alert",
query="sum(last_5m):sum:[service].http.errors{service:[service-name]} / sum:[service].http.requests{service:[service-name]} > 0.01",
name="[Service] High Error Rate",
message="Error rate exceeded 1%. @pagerduty-[service-oncall]\n\nRunbook: [link]",
tags=["service:[service-name]", "team:[team-name]"],
options={
"thresholds": {"critical": 0.01, "warning": 0.005},
"notify_no_data": False,
"evaluation_delay": 60,
}
)
```
---
## 7. Dashboard Layout Specification
The primary service dashboard must answer "is the service healthy right now?" at a glance. Use this layout:
```
┌─────────────────────────────────────────────────────────────────────┐
│ [SERVICE NAME] — Service Health Dashboard [Time range ▼] │
├───────────────┬───────────────┬───────────────┬─────────────────────┤
│ Error rate │ p99 Latency │ RPS (current)│ SLO budget remaining│
│ [BIG NUMBER] │ [BIG NUMBER] │ [BIG NUMBER] │ [BIG NUMBER / days] │
│ vs SLO: 0.1% │ vs SLO: 500ms│ vs avg: [N] │ [Error budget gauge]│
├───────────────┴───────────────┴───────────────┴─────────────────────┤
│ Error rate over time (24h) │
│ [Time series: 5xx rate line, SLO threshold line] │
├─────────────────────────────────┬───────────────────────────────────┤
│ Latency percentiles over time │ Request throughput over time │
│ [Lines: p50, p95, p99, p999] │ [Bars: RPS by endpoint] │
│ [SLO threshold horizontal line]│ │
├─────────────────────────────────┴───────────────────────────────────┤
│ Latency heatmap (all requests — shows distribution shape) │
├─────────────────────────────────┬───────────────────────────────────┤
│ CPU utilisation over time │ Memory utilisation over time │
│ [All instances/pods — lines] │ [All instances/pods — lines] │
│ [Alert threshold: 80%] │ [Alert threshold: 85%] │
├─────────────────────────────────┴───────────────────────────────────┤
│ DB: connection pool utilisation│ DB: query latency (p99 per query)│
├─────────────────────────────────┴───────────────────────────────────┤
│ [Business metric 1 over time] │ [Business metric 2 over time] │
│ e.g. Payment success rate │ e.g. Orders created/min │
└─────────────────────────────────┴───────────────────────────────────┘
```
**Second dashboard — Dependency Health:**
```
┌─────────────────────────────────────────────────────────────────────┐
│ [SERVICE NAME] — Dependency Health │
├─────────────────────────────────────────────────────────────────────┤
│ For each dependency: error rate | latency | current status │
│ [Database] [N]% errors | [N]ms p99 | ● Healthy / ⚠ Degraded │
│ [Redis] [N]% errors | [N]ms p99 | ● Healthy │
│ [External API][N]% errors | [N]ms p99 | ● Healthy │
├─────────────────────────────────────────────────────────────────────┤
│ Outbound call latency over time (one line per dependency) │
├─────────────────────────────────────────────────────────────────────┤
│ Circuit breaker / fallback state (if implemented) │
└─────────────────────────────────────────────────────────────────────┘
```
---
## 8. Observability Debt Analysis
An honest assessment of what is missing today and how urgently each gap should be closed:
| Gap | Impact | Priority | Effort | Owner | Target date |
|---|---|---|---|---|---|
| [e.g. No distributed tracing — can't see cross-service latency] | High — blind to dependency issues | P1 | [2 days] | [Name] | [Date] |
| [e.g. No business metric alerts — only infra alerts] | High — silent business failures | P1 | [1 day] | [Name] | [Date] |
| [e.g. Logs are unstructured text — not searchable] | Medium — slow incident investigation | P2 | [3 days] | [Name] | [Date] |
| [e.g. No dead-letter queue monitoring] | Medium — failed messages go unnoticed | P2 | [4 hours] | [Name] | [Date] |
| [e.g. Alert thresholds not calibrated to production baseline] | Medium — alert fatigue or missed alerts | P2 | [1 day] | [Name] | [Date] |
| [e.g. No latency heatmap — outliers invisible in averages] | Low — harder to spot tail latency issues | P3 | [2 hours] | [Name] | [Date] |
**Total observability debt: [N] items | Estimated effort: [N days]**
---
## Quality Checks
- [ ] Every alert has a named on-call action — no alert says "investigate" without specifying what to investigate first
- [ ] Alert thresholds are calibrated against production baselines, not set to default values from a template
- [ ] Structured logging is implemented — no unstructured text log lines in production
- [ ] PII is explicitly excluded from logs — a named engineer has verified this
- [ ] Distributed tracing is propagating trace IDs across all service boundaries (verify with a test request)
- [ ] The primary dashboard answers "is the service healthy?" in under 10 seconds — no hunting for the right panel
- [ ] Business metrics are tracked alongside infrastructure metrics — not just the four golden signals
- [ ] Observability debt items have owners and dates — not just "would be nice to have"
## Anti-Patterns
- [ ] Do not create alerts without a specific on-call action — an alert that just says "investigate" trains engineers to ignore it
- [ ] Do not set alert thresholds from a template without calibrating against production baselines — uncalibrated thresholds cause either alert fatigue or missed incidents
- [ ] Do not log PII, tokens, or secrets — a logging standard is incomplete without an explicit list of what must never be logged
- [ ] Do not measure only the four golden signals without adding at least one business metric alert — infrastructure health can be green while the business-critical path is silently failing
- [ ] Do not deploy distributed tracing without verifying that trace IDs propagate across all service boundaries — partial tracing is worse than no tracing because it produces misleading incomplete traces
@@ -0,0 +1,372 @@
---
name: oncall-runbook
description: "Write an on-call runbook for a service — covering alert definitions, escalation paths, common incident responses, and on-call handoff procedures. Use when asked to write an on-call guide, create alert runbooks, document escalation procedures, or prepare an on-call handoff document. Produces a structured on-call runbook with per-alert response procedures, escalation matrix, diagnostic commands, and handoff template."
---
# On-Call Runbook Skill
Produce a complete on-call runbook for a service — giving the on-call engineer everything they need to respond confidently to alerts at 3am, without having to ask anyone for help.
A good on-call runbook reduces mean time to resolution (MTTR) by eliminating the "what do I do first?" problem. It is written for the on-call engineer who has just been paged and needs to act, not for someone calmly reading documentation.
## Required Inputs
Ask for these if not already provided:
- **Service name** and what it does
- **Team** and tech lead name
- **Alert list** — names of alerts that currently page on-call
- **Monitoring setup** — Datadog / Grafana / CloudWatch / PagerDuty / etc.
- **Common failure modes** — what breaks most often, and what fixes it
- **Escalation contacts** — who to call when on-call can't resolve it
- **Deployment setup** — can on-call roll back? How?
- **Service dependencies** — what does this service depend on, and what depends on it?
## Output Format
---
# On-Call Runbook: [Service Name]
**Team:** [Team name] | **Tech lead:** [Name]
**PagerDuty service:** [Link] | **Escalation policy:** [Policy name]
**Last updated:** [Date] | **Next review:** [Date + 90 days]
> **First time on-call for this service?** Read the [developer onboarding doc] first — it covers the architecture and how things work. This runbook assumes you understand the service.
---
## Quick Reference
**Dashboard:** [Link — the first thing to open when paged]
**Logs:** [Link — where to find logs]
**Runbook index:** Jump to the alert that paged you → [Alert list below]
**Can't resolve in 30 min?** Escalate to: [Name] via [Slack / PagerDuty]
**Rollback command (memorise this):**
```bash
[rollback command — e.g. kubectl rollout undo deployment/[service-name]]
```
---
## Escalation Matrix
| Situation | Escalate to | How | After how long |
|---|---|---|---|
| Can't diagnose the alert | [Tech lead name] | Slack DM / Phone | 30 minutes |
| Alert requires infra change | [Platform team] | `#platform` Slack | Immediately |
| Customer-facing impact | [CSM / Support lead] | `#incidents` Slack | Immediately (P1) |
| Database issue | [DBA or data team] | Slack / PagerDuty | Immediately |
| [Specific dependency] down | [[Dependency] on-call] | PagerDuty / Slack | Immediately |
| Extended outage (>1 hour) | [Engineering manager] | Phone | 1 hour |
**Contacts:**
| Name | Role | Slack | Phone |
|---|---|---|---|
| [Name] | Tech lead | @[handle] | [Number] |
| [Name] | Engineering manager | @[handle] | [Number] |
| [Name] | Platform / infra | @[handle] | [Number] |
| [Platform team] | Infra on-call | `#platform` | PagerDuty |
---
## Service Architecture (Quick View)
```
[Upstream callers]
[This Service]
├──→ [Primary Database]
├──→ [Cache — e.g. Redis]
└──→ [Downstream Service / Queue]
```
**If this service is down, these are affected:** [List the callers and consumers of this service]
**If these are down, this service is affected:** [List the dependencies this service calls]
---
## Alert Runbooks
### ALERT: [Alert Name 1 — e.g. HighErrorRate]
**What it means:** [Plain English — e.g. "More than 5% of API requests are returning 5xx errors in the last 5 minutes"]
**Severity:** P1 / P2 / P3
**SLO impact:** Yes / No — [If yes: this alert means the error budget is burning at [X]× rate]
**Step 1 — Acknowledge and assess**
```bash
# Check current error rate
[query or dashboard link]
# Check which endpoints are erroring
[query or command]
```
**Step 2 — Check recent changes**
```bash
# Any deploys in the last hour?
[command or link to deployment log]
# Recent config changes?
[where to check]
```
**Step 3 — Check dependencies**
```bash
# Is the database healthy?
[health check command or link]
# Is [downstream service] healthy?
[health check command or link]
```
**Step 4 — Diagnose**
| If you see | It means | Do this |
|---|---|---|
| [Error pattern 1] | [Cause] | [Action] |
| [Error pattern 2] | [Cause] | [Action] |
| [Error pattern 3] | [Cause] | [Action] |
| No clear pattern | Unknown cause | Escalate to [name] |
**Step 5 — Fix or mitigate**
```bash
# If caused by bad deploy — roll back:
[rollback command]
# If caused by [specific issue]:
[fix command]
# If caused by upstream dependency:
[mitigation — e.g. enable circuit breaker, reduce traffic, etc.]
```
**After resolving:**
- [ ] Confirm error rate has returned to baseline
- [ ] Check no downstream services were affected
- [ ] If P1: open a post-incident review — see [incident-postmortem skill]
- [ ] Update `#incidents` with resolution summary
---
### ALERT: [Alert Name 2 — e.g. HighLatency]
**What it means:** [e.g. "P99 response time has exceeded 1s for more than 3 consecutive minutes"]
**Severity:** P1 / P2 / P3
**SLO impact:** Yes — latency SLO breach
**Step 1 — Assess scope**
```bash
# Check which endpoints are slow
[query or dashboard — broken down by endpoint]
# Check if latency is across all regions or localised
[query or command]
```
**Step 2 — Common causes and fixes**
| Cause | Signal | Fix |
|---|---|---|
| Database slow queries | DB latency spike on dashboard | [Check slow query log: `command`] |
| Cache miss storm | Cache hit rate drops on dashboard | [command or action] |
| Memory pressure / GC | High memory on service dashboard | [command or action — e.g. restart, scale up] |
| Upstream service slow | Trace shows time in external call | Escalate to [service] on-call |
| Traffic spike | Request rate spike on dashboard | [Scale up: `command`] |
**Step 3 — Escalate if unresolved in 20 minutes**
Page [Tech lead] via PagerDuty / Slack.
---
### ALERT: [Alert Name 3 — e.g. DatabaseConnectionPoolExhausted]
**What it means:** [e.g. "The service has used all available database connections — new requests will fail"]
**Severity:** P1
**SLO impact:** Yes — will cause errors immediately
**Immediate mitigation:**
```bash
# Restart the service to flush stale connections
[restart command]
# Check current connection count
[DB connection query]
```
**Diagnose root cause after stabilising:**
```bash
# Check for long-running queries holding connections
[query]
# Check if a recent deploy changed connection pool config
[where to check]
```
**Resolution:** [e.g. "Increase pool size in config / kill long-running queries / scale the service"]
---
### ALERT: [Alert Name 4 — e.g. QueueBacklogHigh / ConsumerLag]
**What it means:** [e.g. "The message queue backlog exceeds 10,000 messages — consumers are not keeping up"]
**Severity:** P2
**SLO impact:** Depends — if queue backs up, downstream systems will receive delayed data
**Step 1 — Check consumer health**
```bash
# Are consumers running?
[command]
# Consumer error rate?
[dashboard or query]
```
**Step 2 — Check message contents**
```bash
# Are there poison messages causing retries?
[command to inspect dead-letter queue or failed messages]
```
**Step 3 — Options**
| If | Then |
|---|---|
| Consumers are down | Restart consumers: `[command]` |
| Poison message in queue | Move to DLQ: `[command]` |
| Consumers healthy but slow | Scale consumers: `[command]` |
| Upstream producing too fast | Escalate to [upstream service] owner |
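
The options table above is effectively a decision tree, which can be sketched as a small function. The three boolean inputs are hypothetical signals you would read from your dashboards, and checking for a producer spike before scaling consumers is an assumed ordering, not something the table prescribes.

```python
def queue_triage(
    consumers_running: bool,
    poison_message_found: bool,
    producer_rate_spiking: bool,
) -> str:
    """Return the first matching action for a queue backlog alert."""
    if not consumers_running:
        return "Restart consumers"
    if poison_message_found:
        return "Move the poison message to the DLQ"
    if producer_rate_spiking:
        return "Escalate to the upstream service owner"
    # Consumers are healthy and input is normal, so they are simply too slow.
    return "Scale consumers"
```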
---
### ALERT: [Add additional alerts following the same pattern]
---
## Diagnostic Cheat Sheet
Common commands for quick diagnosis. Paste and run without modification.
```bash
# Service health
[health check command]
# Recent logs (last 100 lines)
[log command]
# Error logs only
[error log filter command]
# Current pod / instance status
[kubectl get pods / aws ecs describe-tasks / etc.]
# Restart the service
[restart command]
# Roll back to previous version
[rollback command]
# Database connection count
[DB query]
# Cache hit rate
[cache stats command]
# Current request rate
[metrics query]
```
---
## Useful Dashboard Links
| Dashboard | URL | Use it to |
|---|---|---|
| Service overview | [Link] | First stop — error rate, latency, request rate |
| Database | [Link] | Connection count, slow queries, replication lag |
| Infrastructure | [Link] | CPU, memory, disk |
| Queue / consumers | [Link] | Backlog depth, consumer throughput |
| Upstream dependencies | [Link] | Dependency health at a glance |
---
## Incident Communication
When you declare an incident:
**Post to `#incidents` immediately:**
```
🔴 INCIDENT — [Service Name]
Status: Investigating
Impact: [Who is affected and how]
Paged: [Your name]
Next update: [Time — max 30 min from now]
```
**Update every 30 minutes while active:**
```
🔴 UPDATE — [Service Name] — [Time]
Status: [Investigating / Identified / Mitigating / Resolved]
Latest: [One sentence on what you found or did]
Next update: [Time]
```
**On resolution:**
```
✅ RESOLVED — [Service Name] — [Time]
Duration: [X minutes]
Impact: [Summary of who was affected]
Cause: [One sentence]
Follow-up: [PIR required? Yes/No — link when created]
```
---
## On-Call Handoff
Use this template at the end of every on-call shift:
```
--- ON-CALL HANDOFF: [Service Name] ---
Date: [Date]
Outgoing: [Your name]
Incoming: [Next on-call name]
INCIDENTS THIS SHIFT:
- [Incident summary — date, duration, cause, resolution, follow-up required]
OPEN ISSUES TO WATCH:
- [Anything not fully resolved / trending in the wrong direction]
CHANGES SINCE LAST HANDOFF:
- [Deploys, config changes, infra changes that affect on-call awareness]
RUNBOOK GAPS FOUND:
- [Anything you had to figure out that isn't documented — please add it]
ANYTHING ELSE:
- [Notes for incoming on-call]
```
---
## Quality Checks
- [ ] Every alert that pages on-call has a runbook entry — no alert is missing
- [ ] Rollback command is accurate and tested recently
- [ ] Escalation contacts have current phone numbers and Slack handles
- [ ] Diagnostic commands work — they have been run by at least one person recently
- [ ] Handoff template is used at every shift change — not just during incidents
- [ ] "Things I had to figure out that weren't documented" are added to this runbook after every incident
## Anti-Patterns
- [ ] Do not write alert runbooks with vague diagnostic steps like "check the logs" — every step must specify the exact command, dashboard link, or query to run
- [ ] Do not include an alert in the runbook that has no specific on-call action — an alert that pages someone with no defined response path creates panic, not resolution
- [ ] Do not leave the rollback command undocumented or untested — a rollback procedure that has never been run will fail when needed most
- [ ] Do not list escalation contacts without phone numbers and Slack handles — email-only escalation paths are useless during a 3am incident
- [ ] Do not write the runbook once and treat it as permanent — runbooks go stale after incidents; every incident must trigger a review of the relevant runbook entries
@@ -0,0 +1,285 @@
---
name: performance-budget
description: "Define and document performance budgets for a web service or application. Use when asked to set performance targets, define SLOs for latency or throughput, establish Core Web Vitals targets, create a performance baseline, or document performance regression policy. Produces a structured performance budget covering key user journeys, Core Web Vitals, backend latency SLOs, measurement tooling, CI enforcement, and breach response process."
---
# Performance Budget Skill
Produce a complete, actionable performance budget document for a web service or application. A performance budget is not a wishlist — it is a set of measurable, enforced constraints that define what "acceptable performance" means and who is responsible when those constraints are violated.
A good performance budget answers: what are the targets, how are they measured, what triggers an investigation, and what happens when a budget is breached.
## Required Inputs
Ask for these if not already provided:
- **Service name and type** — web app, API service, mobile app, or combination
- **Key user journeys** — the 3-5 most important flows users take (e.g. "search → product page → checkout")
- **Current baseline metrics** — P50/P95/P99 latency, LCP, CLS, INP if available (state "no baseline" if not collected yet)
- **Tech stack** — frontend framework, backend language/framework, CDN, database
- **Deployment environment** — cloud provider, region(s), edge/CDN configuration
- **Cost constraints** — any budget or infrastructure limits that affect headroom
## Output Format
---
# Performance Budget: [Service Name]
**Service:** [Name] | **Team:** [Team name]
**Last updated:** [Date] | **Owner:** [Name / role]
**Environment:** [Production / Staging baseline] | **Review cadence:** [Quarterly / per-sprint]
---
## Overview
[2-3 sentences describing the service, its user-facing performance requirements, and why performance is a priority. Reference the business impact of latency — e.g. conversion rate, user retention, SLA obligations.]
**Performance philosophy:** [e.g. "Performance is a feature. Every engineer is responsible for keeping the service within budget. Regressions must be caught in CI before they reach production."]
---
## Key User Journeys
Define the critical paths that the performance budget is designed to protect.
| Journey ID | Journey name | Entry point | Exit point | Criticality |
|---|---|---|---|---|
| UJ-1 | [e.g. New user sign-up] | [Landing page] | [Dashboard] | Critical |
| UJ-2 | [e.g. Core workflow task] | [e.g. /app/tasks] | [e.g. Task complete] | High |
| UJ-3 | [e.g. Search and select] | [e.g. /search] | [e.g. Detail page] | High |
| UJ-4 | [e.g. API data fetch] | [e.g. GET /api/items] | [e.g. 200 response] | Medium |
---
## Frontend Performance Budget
*Complete this section for web and mobile applications. Skip for API-only services.*
### Core Web Vitals Targets
Targets apply to the 75th percentile of real user sessions (field data). Where lab (synthetic) measurements are used instead, they assume a mid-range Android device on a 4G connection unless otherwise stated.
| Metric | Description | Good | Needs Improvement | Poor | **Our Target** | Current baseline |
|---|---|---|---|---|---|---|
| **LCP** | Largest Contentful Paint — perceived load speed | ≤2.5s | 2.5-4.0s | >4.0s | **[≤X.Xs]** | [Xs / not measured] |
| **INP** | Interaction to Next Paint — responsiveness | ≤200ms | 200-500ms | >500ms | **[≤Xms]** | [Xms / not measured] |
| **CLS** | Cumulative Layout Shift — visual stability | ≤0.1 | 0.1-0.25 | >0.25 | **[≤0.X]** | [X.XX / not measured] |
| **FCP** | First Contentful Paint | ≤1.8s | 1.8-3.0s | >3.0s | **[≤X.Xs]** | [Xs / not measured] |
| **TTFB** | Time to First Byte | ≤800ms | 800ms-1.8s | >1.8s | **[≤Xms]** | [Xms / not measured] |
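
As an illustration of how the table is applied, the sketch below takes raw field samples, computes the 75th percentile, and rates it against the Good / Poor boundaries listed above. Time-based metrics are expressed in milliseconds here; the nearest-rank percentile method and the function names are assumptions, and your RUM tool may compute percentiles slightly differently.

```python
import math
from typing import Sequence

# (upper bound of "Good", upper bound of "Needs Improvement") per metric.
# Time metrics are in milliseconds; CLS is unitless.
THRESHOLDS = {
    "LCP": (2500, 4000),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
    "FCP": (1800, 3000),
    "TTFB": (800, 1800),
}


def p75(samples: Sequence[float]) -> float:
    """Nearest-rank 75th percentile of the samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, value: float) -> str:
    """Classify a p75 value as Good, Needs Improvement, or Poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "Good"
    if value <= needs_improvement_max:
        return "Needs Improvement"
    return "Poor"
```

For example, `rate("LCP", p75(lcp_samples_ms))` gives the rating for one page or journey, where `lcp_samples_ms` is a list of LCP values collected from real sessions.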
### Page Weight Budget
| Asset type | Max size (compressed) | Current | Status |
|---|---|---|---|
| Total page weight | [e.g. 500KB] | [XKB / unknown] | [Within / Over / Unknown] |
| JavaScript (initial load) | [e.g. 200KB] | [XKB / unknown] | [Within / Over / Unknown] |
| CSS | [e.g. 50KB] | [XKB / unknown] | [Within / Over / Unknown] |
| Images (above fold) | [e.g. 150KB] | [XKB / unknown] | [Within / Over / Unknown] |
| Web fonts | [e.g. 50KB] | [XKB / unknown] | [Within / Over / Unknown] |
| Third-party scripts | [e.g. 100KB] | [XKB / unknown] | [Within / Over / Unknown] |
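
The Status column can be filled in automatically. A minimal sketch, assuming budgets are expressed in KB of gzip-compressed bytes with 1 KB = 1024 bytes; the function names are hypothetical, and your CDN may use a different compression algorithm such as Brotli.

```python
import gzip
from typing import Dict, Optional


def compressed_kb(data: bytes) -> float:
    """Size of the payload after gzip compression, in KB."""
    return len(gzip.compress(data)) / 1024


def budget_status(budget_kb: float, measured_kb: Optional[float]) -> str:
    """Map a measurement onto the Within / Over / Unknown status values."""
    if measured_kb is None:
        return "Unknown"
    return "Within" if measured_kb <= budget_kb else "Over"


def budget_report(budgets: Dict[str, float], measured: Dict[str, float]) -> Dict[str, str]:
    """Status per asset type; asset types with no measurement are reported as Unknown."""
    return {asset: budget_status(limit, measured.get(asset)) for asset, limit in budgets.items()}
```

Reporting unmeasured asset types as "Unknown" rather than "Within" keeps gaps in measurement visible in the table.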
### Per-Journey Frontend Targets
| Journey | LCP | INP | CLS | FCP | TTFB |
|---|---|---|---|---|---|
| UJ-1: [Journey name] | [≤Xs] | [≤Xms] | [≤0.X] | [≤Xs] | [≤Xms] |
| UJ-2: [Journey name] | [≤Xs] | [≤Xms] | [≤0.X] | [≤Xs] | [≤Xms] |
| UJ-3: [Journey name] | [≤Xs] | [≤Xms] | [≤0.X] | [≤Xs] | [≤Xms] |
---
## Backend Performance Budget
### API Latency SLOs
Targets measured at the service boundary (not including client-side network latency).
| Endpoint / operation | Method | P50 | P95 | P99 | Max (hard limit) | Error rate |
|---|---|---|---|---|---|---|
| [e.g. /api/auth/login] | POST | [≤Xms] | [≤Xms] | [≤Xms] | [≤Xms] | [<X%] |
| [e.g. /api/items] | GET | [≤Xms] | [≤Xms] | [≤Xms] | [≤Xms] | [<X%] |
| [e.g. /api/items/:id] | GET | [≤Xms] | [≤Xms] | [≤Xms] | [≤Xms] | [<X%] |
| [e.g. /api/items] | POST | [≤Xms] | [≤Xms] | [≤Xms] | [≤Xms] | [<X%] |
| [e.g. Background job: sync] | — | [≤Xs] | [≤Xs] | [≤Xs] | [≤Xs] | [<X%] |
**Overall service SLOs:**
| SLO | Target | Measurement window |
|---|---|---|
| Availability | [99.X%] | 30-day rolling |
| P95 latency (all endpoints) | [≤Xms] | 30-day rolling |
| Error rate (5xx) | [<X%] | 30-day rolling |
| Throughput (sustained) | [≥X req/s] | Peak hour |
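
An availability target implies an error budget, and it helps to see the arithmetic. The sketch below converts a target into allowed downtime over the measurement window and computes a burn rate from an observed error rate. It assumes the simplest request-based definition of burn rate (observed error rate divided by the allowed error rate); the function names are illustrative.

```python
def error_budget_minutes(availability_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime allowed in the window for a given target."""
    if not 0 < availability_target < 1:
        raise ValueError("availability_target must be between 0 and 1, e.g. 0.999")
    return (1 - availability_target) * window_days * 24 * 60


def burn_rate(observed_error_rate: float, availability_target: float) -> float:
    """How many times faster than sustainable the error budget is being spent."""
    if not 0 < availability_target < 1:
        raise ValueError("availability_target must be between 0 and 1, e.g. 0.999")
    return observed_error_rate / (1 - availability_target)
```

With a 99.9% target over 30 days the budget is about 43.2 minutes, and a sustained 1% error rate burns it at roughly 10× the sustainable pace, which would exhaust the budget in about three days.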
### Database Query Budget
| Query / operation | P50 | P95 | Max | Notes |
|---|---|---|---|---|
| [e.g. User lookup by ID] | [≤Xms] | [≤Xms] | [≤Xms] | Index on `user_id` |
| [e.g. List items for user] | [≤Xms] | [≤Xms] | [≤Xms] | Paginated, max 100 rows |
| [e.g. Full-text search] | [≤Xms] | [≤Xms] | [≤Xms] | Elasticsearch / pg_trgm |
---
## Measurement Methodology
### Real User Monitoring (RUM)
**Tool:** [e.g. Google CrUX, SpeedCurve, Datadog RUM, Sentry Performance, custom]
**Data source:** [Field data from real users / Lab data from synthetic tests / Both]
**Sample rate:** [X% of sessions]
**How to access:** [Dashboard URL or tool access instructions]
**What is measured:**
- [ ] Core Web Vitals (LCP, INP, CLS) per page and journey
- [ ] Custom performance marks for business-critical interactions
- [ ] Resource timing for key assets
- [ ] Long tasks (>50ms on main thread)
### Synthetic Monitoring
**Tool:** [e.g. Lighthouse CI, WebPageTest, k6, Artillery, Playwright with performance assertions]
**Frequency:** [Every X minutes / on every deploy / nightly]
**Test location(s):** [e.g. eu-west-1, us-east-1]
**Device profile:** [Desktop 10Mbps / Mobile 4G Moto G4 / both]
**Synthetic test suite location:** [Link to test files]
### Backend Observability
**APM tool:** [e.g. Datadog, Grafana + Prometheus, New Relic, AWS X-Ray]
**Metrics collected:**
- Request rate, error rate, duration (RED metrics) per endpoint
- Database query duration and connection pool utilisation
- Cache hit/miss rates
- Background job queue depth and processing latency
**Dashboard:** [Link to primary performance dashboard]
---
## CI/CD Performance Enforcement
Performance budgets are enforced at two gates:
### Gate 1 — Build-time Bundle Analysis
**Tool:** [e.g. bundlesize, size-limit, webpack-bundle-analyzer with CI assertion]
**Config file:** [`[.bundlesizerc / .size-limit.js / etc.]`]
**Trigger:** Every PR targeting `main`
**Blocking:** Yes — PR cannot merge if bundle size budget is exceeded
```js
// Example .size-limit.js
module.exports = [
  {
    path: "dist/js/*.js",
    limit: "200 KB"
  },
  {
    path: "dist/css/*.css",
    limit: "50 KB"
  }
];
```
### Gate 2 — Synthetic Performance Tests in CI
**Tool:** [e.g. Lighthouse CI, k6, Artillery]
**Trigger:** On deploy to staging
**Blocking:** Yes — production deploy is blocked if thresholds fail
**Thresholds checked:**
- LCP ≤ [Xs]
- CLS ≤ [0.X]
- P95 API latency ≤ [Xms]
- Error rate < [X%]
**CI config location:** [`[.github/workflows/perf.yml / ci/performance.yaml]`]
**How to run locally:**
```bash
# Run Lighthouse CI against local build
[command — e.g. lhci autorun --config=lighthouserc.js]
# Run load test locally
[command — e.g. k6 run load-tests/api-smoke.js]
```
---
## Budget Breach Response Process
A budget breach is when a measured metric exceeds its target for [X consecutive measurements / X minutes sustained / a single deploy].
### Breach Severity Levels
| Severity | Condition | Response time | Who acts |
|---|---|---|---|
| P1 — Critical | >2× budget threshold in production | Immediate | On-call engineer + team lead |
| P2 — High | >1.5× budget threshold in production | Within 4 hours | On-call engineer |
| P3 — Medium | Threshold exceeded in production | Within 1 sprint | PR author + team |
| P4 — Low | Threshold exceeded in staging only | Before merge | PR author |
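
The severity table translates directly into code, which is useful if breach detection is automated. A sketch under two assumptions: the metric is "lower is better" (latency, page weight, error rate), and any environment other than `"production"` is treated as staging. The function name is hypothetical.

```python
from typing import Optional


def breach_severity(measured: float, budget: float, environment: str) -> Optional[str]:
    """Return P1-P4 for a breach, or None if the metric is within budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = measured / budget
    if ratio <= 1:
        return None
    if environment != "production":
        return "P4"  # exceeded in staging only
    if ratio > 2:
        return "P1"
    if ratio > 1.5:
        return "P2"
    return "P3"
```

For example, a P95 latency of 800ms against a 500ms budget in production is a ratio of 1.6 and therefore a P2.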
### Breach Investigation Checklist
When a breach is detected, work through this checklist in order:
**1. Identify the regression commit**
```bash
# Compare performance across recent deploys
[command — e.g. datadog metrics query, lighthouse-ci compare, git bisect]
```
**2. Classify the breach**
- [ ] Is this a code change? (new feature, refactor, dependency bump)
- [ ] Is this an infrastructure change? (new instance type, config change)
- [ ] Is this an external factor? (CDN issue, DNS, upstream dependency)
- [ ] Is this a measurement anomaly? (test environment issue, sample size)
**3. Immediate actions**
- If P1/P2 in production and a code cause is confirmed: roll back or disable the feature flag
- If cause is unknown: do not roll back immediately — gather more data first
- Notify [#performance / #incidents Slack channel] with: metric name, current value, budget target, suspected cause
**4. Resolution**
- Fix the root cause — do not just adjust the budget threshold
- Budget thresholds should only change after a team discussion and explicit approval from [tech lead / EM]
- Document the breach in the [performance log / incident record]
**Budget change policy:** Budget thresholds may only be relaxed if: (a) the feature causing the regression has measurable business value that outweighs the performance cost, and (b) the change is reviewed and approved by [tech lead].
---
## Performance Review Cadence
| Trigger | Action |
|---|---|
| Every sprint | Review P95/P99 latency trends; flag any creeping degradation |
| Every quarter | Full performance budget review — update baselines, adjust targets, audit tooling |
| After major feature launch | Re-measure all Core Web Vitals and API SLOs; update baselines |
| After infrastructure change | Re-run full synthetic test suite; confirm no regression |
| After dependency upgrade | Run bundle size diff; confirm no unexpected size increase |
**Next scheduled review:** [Date]
**Review owner:** [Name / role]
---
## Quality Checks
- [ ] Every budget threshold is a specific number — not a range or "TBD"
- [ ] Both frontend (if applicable) and backend targets are defined — not just one or the other
- [ ] Measurement tooling is named with a link to the dashboard or config file
- [ ] CI enforcement is configured for at least one gate (build-time or deploy-time)
- [ ] Budget breach response process names specific Slack channels and owners
- [ ] Budget thresholds are anchored to baseline measurements or a justified target — not pulled from thin air
- [ ] Per-journey targets are defined for critical user journeys, not just global averages
## Anti-Patterns
- [ ] Do not set budget thresholds without measuring a current baseline first — targets must be anchored to reality
- [ ] Do not define global averages only — critical user journeys need individual budgets as they may diverge significantly
- [ ] Do not omit CI enforcement — a performance budget that is not enforced in the build pipeline will not be respected
- [ ] Do not leave the breach response process without named owners and escalation channels
- [ ] Do not set budgets that apply only to one environment — production and staging targets should be documented separately if they differ
@@ -0,0 +1,98 @@
---
name: pr-description-writer
description: "Write a clear, structured pull request description from a git diff, branch summary, or commit list. Use when asked to write a PR description, draft a pull request, or document code changes. Produces a description with summary, motivation, changes made, testing steps, and reviewer guidance."
---
# PR Description Writer Skill
Writes structured, reviewer-friendly pull request descriptions from a diff, commit list, or informal notes. Covers the what, why, and how-to-review so reviewers can start immediately.
## Required Inputs
Ask for these if not provided:
- **What changed** (paste a git diff, `git log --oneline`, or describe the changes in plain English)
- **Why it was changed** (the problem being solved or feature being added)
- **How to test it** (any specific steps a reviewer needs to verify it works)
- **Risk level** (low / medium / high — affects how much reviewer guidance to include)
- **PR type** (feature / bug fix / refactor / dependency upgrade / config change / hotfix)
- **Target branch** (e.g. main / develop / release/2.4 — affects risk framing and reviewer guidance)
- **Linked issue or ticket** (e.g. JIRA-1234, GitHub #567 — or "none")
## Output Format
### Title
A clear, imperative-mood title under 72 characters:
`[type]: [concise description of what changed]`
Examples:
- `feat: add rate limiting to the public API`
- `fix: resolve race condition in session expiry`
- `refactor: extract payment logic into PaymentService`
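The title format can be checked mechanically, as in the sketch below. The list of type prefixes is taken from the Quality Checks in this skill; the regular expression, the `title_problems` helper, and the choice to require lowercase prefixes are assumptions. Imperative mood is not checked here because it cannot be verified reliably with a pattern.

```python
import re

VALID_TYPES = ("feat", "fix", "refactor", "chore", "deps", "config", "hotfix")
MAX_TITLE_LENGTH = 72  # titles must be strictly shorter than this

# "[type]: [description]" with a lowercase type and a non-empty description.
_TITLE_RE = re.compile(r"^(?P<type>[a-z]+): (?P<description>\S.*)$")


def title_problems(title: str) -> list[str]:
    """Return a list of format problems; an empty list means the title passes."""
    problems = []
    if len(title) >= MAX_TITLE_LENGTH:
        problems.append(f"title is {len(title)} characters; keep it under {MAX_TITLE_LENGTH}")
    match = _TITLE_RE.match(title)
    if match is None:
        problems.append("title does not match '[type]: [description]'")
    else:
        prefix = match.group("type")
        if prefix not in VALID_TYPES:
            problems.append(f"unknown type prefix '{prefix}'")
    return problems
```

All three example titles above pass this check with an empty list.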
### Summary
2-3 sentences covering:
- What this PR does (the change)
- Why it was needed (the problem or goal)
- The approach taken (at a high level)
### Changes Made
Bullet list of specific changes — one bullet per logical change, not per file:
- Added [X] to handle [Y]
- Refactored [A] to reduce [B]
- Removed [C] as it was replaced by [D]
- Updated [E] to fix [F]
### Screenshots / Demo
[If UI change: include before/after screenshots or a screen recording]
[If API change: include example request/response]
[If no visual change and no API contract change: omit this section entirely — do not leave it as a placeholder]
### How to Test
Step-by-step instructions a reviewer can follow:
1. [Setup step if needed]
2. [Action to take]
3. [What to verify]
4. [Edge case to check]
Include any specific commands, test data, or environment flags needed.
### Testing Checklist
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Edge cases covered
- [ ] Manual testing completed
- [ ] No regressions in existing tests
### Reviewer Notes
Flag anything that warrants extra attention:
- Areas of uncertainty where a second opinion is welcome
- Deliberate trade-offs made (and why)
- Out-of-scope items noticed but not addressed
- Dependencies on other PRs (link them)
### Related
- Closes #[issue number] (if applicable)
- Related to #[PR/issue number]
## Quality Checks
- [ ] Title is imperative mood and under 72 characters
- [ ] Summary explains what AND why (not just what)
- [ ] Changes list describes logical changes (not file-by-file changes)
- [ ] Title starts with a valid type prefix (feat / fix / refactor / chore / deps / config / hotfix)
- [ ] Testing steps are reproducible by someone unfamiliar with the code
- [ ] For high-risk PRs, Reviewer Notes flags at least one specific area of concern or deliberate trade-off; for low-risk PRs, Reviewer Notes is either omitted or kept to one line
## Anti-Patterns
- [ ] Do not write a description that only restates what changed — explain why the change was made
- [ ] Do not skip the testing steps — reviewers need to know how to verify the change works
- [ ] Do not omit the reviewer notes for high-risk PRs — flag deliberate trade-offs and areas needing careful review
- [ ] Do not describe implementation details that are obvious from the diff — add context that the diff cannot convey
- [ ] Do not produce a single paragraph — structure with headers so reviewers can navigate to what they need
## Usage Examples
- "Write a PR description for these changes" + [paste diff or description]
- "Draft a pull request for [feature]"
- "I need a PR description — here's what I changed"
- "Summarise these commits into a PR description"
- "Write the PR body for this branch"
@@ -0,0 +1,407 @@
---
name: rfc-writer
description: "Write an engineering RFC (Request for Comments) for a technical decision, architectural change, or significant implementation approach. Use when asked to write an RFC, document a technical proposal, create a design doc, write an architecture decision for review, or produce a technical specification for team feedback. Produces a complete RFC document covering problem statement, motivation, proposed solution, alternatives rejected, implementation plan, migration plan, security and performance implications, observability changes, rollout plan, and open questions."
---
# RFC Writer Skill
Produce a complete engineering RFC (Request for Comments) for a technical decision or architectural change. An RFC is a structured proposal document — not a persuasion document. Its purpose is to expose a decision to scrutiny, surface trade-offs, document alternatives considered, and create a permanent record of why a choice was made.
A good RFC makes it possible for someone who wasn't in the room to understand years later why the team built something the way they did.
## Required Inputs
Ask for these if not already provided:
- **RFC title and author** — what this RFC is about and who is proposing it
- **Problem being solved** — what is broken, missing, or inadequate today; why action is needed now
- **Proposed solution** — the approach the author is recommending, at least at a high level
- **Context and constraints** — team size, existing architecture, timeline pressures, budget limits, compliance requirements
- **Alternatives considered** — at least 2 alternative approaches the author has thought about
- **Current status** — is this pre-decision (seeking feedback) or post-decision (documenting a made decision)?
## Output Format
---
# RFC [Number]: [Title]
**Author:** [Name] | **Team:** [Team name]
**Created:** [Date] | **Last updated:** [Date]
**Status:** Draft | In Review | Approved | Rejected | Superseded by RFC-[X]
**Ticket:** [JIRA-XXX] | **Slack thread:** [#channel link]
**Review deadline:** [Date — when comments should be submitted by]
---
## Abstract
[2-4 sentences summarising the entire RFC. It should stand alone: someone reading only this should understand what is being proposed, why, and what the main trade-off is. Write this last.]
---
## 1. Problem Statement
[Describe the problem being solved. Focus on the *problem*, not the solution. Be specific and quantified where possible.]
**Current state:**
[Describe how things work today — the existing system, process, or architecture. Include any relevant constraints or limitations.]
**Why this is a problem now:**
[Why is this being addressed now rather than earlier or later? Reference metrics, incidents, product requirements, or scaling thresholds that make this urgent or timely.]
**Example of the problem in practice:**
[A concrete scenario or incident that illustrates the problem. This helps reviewers understand the real-world impact, not just the abstract description.]
```
// Example: current behaviour that illustrates the problem
[code snippet, log output, or sequence description showing the problem]
```
**Impact of not solving this:**
- [Impact 1 — e.g. "New tenant onboarding requires 3 hours of manual configuration per account"]
- [Impact 2 — e.g. "Auth service handles 400 req/s; projected to hit capacity within 8 weeks at current growth"]
- [Impact 3 — e.g. "Current approach is incompatible with the upcoming multi-region requirement"]
---
## 2. Goals and Non-Goals
**Goals:**
- [ ] [Specific, measurable outcome — e.g. "Reduce tenant onboarding time from 3 hours to <5 minutes"]
- [ ] [e.g. "Support 2,000 req/s on the auth service with P99 latency ≤50ms"]
- [ ] [e.g. "Enable multi-region deployment without changes to the application layer"]
**Non-goals:** *(what this RFC explicitly does not address)*
- [e.g. "This RFC does not address authentication for internal service-to-service calls — see RFC-042"]
- [e.g. "Performance improvements to the existing system — this RFC replaces it"]
- [e.g. "Migration of historical data — covered in a follow-on RFC"]
**Success metrics:**
| Metric | Current | Target | Measurement method |
|---|---|---|---|
| [e.g. Onboarding time] | [3 hours] | [<5 minutes] | [Prometheus histogram on onboarding job duration] |
| [e.g. Auth latency P99] | [120ms] | [≤50ms] | [Datadog APM] |
| [e.g. Engineer setup time] | [4 hours] | [<30 minutes] | [Onboarding survey] |
---
## 3. Background and Motivation
[Provide the context a reviewer needs to evaluate the proposal. This is not a repeat of the problem statement — it is the surrounding technical and business context.]
**Existing system overview:**
[Describe the relevant parts of the current architecture. Include an ASCII diagram if the relationships between components help understanding.]
```
[ASCII diagram of current architecture — optional but strongly recommended for architectural RFCs]
┌──────────┐ ┌──────────────┐ ┌──────────────┐
│ Client │────▶│ [Service A] │────▶│ [Service B] │
└──────────┘ └──────────────┘ └──────────────┘
┌──────────────┐
│ [Database] │
└──────────────┘
```
**Prior work and related decisions:**
- [RFC-XXX: Title — relevant previous decision; link]
- [ADR-XXX: Title — architectural decision record]
- [Any external standards, blog posts, or vendor documentation that informs this proposal]
**Constraints:**
- [e.g. Must remain backward compatible with v1 API clients for 12 months]
- [e.g. Team has no Rust expertise — solution must be in Python or Go]
- [e.g. Must be deployable without a maintenance window]
---
## 4. Proposed Solution
[Describe the proposed approach clearly and specifically. Include enough detail that an engineer could begin implementing from this document, but don't write the code — that is for the PR.]
### 4.1 High-Level Approach
[1-3 paragraphs describing the overall solution. Explain the key idea and why it solves the problem.]
### 4.2 Architecture
```
[ASCII diagram of the proposed architecture — what the system looks like after this RFC is implemented]
┌──────────┐     ┌──────────────────┐     ┌──────────────┐
│  Client  │────▶│ [New Component]  │────▶│ [Service B]  │
└──────────┘     └──────────────────┘     └──────────────┘
                          │                      │
                          ▼                      ▼
                  ┌──────────────┐        ┌──────────────┐
                  │  [Store A]   │        │  [Store B]   │
                  └──────────────┘        └──────────────┘
```
### 4.3 Detailed Design
[Break the solution into its key components or decisions. For each, explain what it does and why it was designed this way.]
**Component / Decision 1: [Name]**
[Description of this component — what it does, how it works, why this approach was chosen.]
```
// Example interface, API contract, or pseudocode (not implementation code)
[Relevant schema, API definition, data flow, or pseudocode]
```
**Component / Decision 2: [Name]**
[Description]
**Component / Decision 3: [Name]**
[Description]
### 4.4 API Changes
*Complete this section if the RFC introduces or modifies any API endpoints, events, or interfaces.*
**New endpoints / events:**
```
[HTTP method + path or event name]
Request: { ... }
Response: { ... }
```
**Modified endpoints:**
- `[endpoint]`: [what changes and why; backward compatibility note]
**Deprecated endpoints:**
- `[endpoint]`: deprecated in favour of `[new endpoint]` — removal timeline: [date/version]
### 4.5 Data Model Changes
*Complete this section if any database schema or data structure changes are required.*
[Describe schema changes at a high level. Reference the database-migration-plan skill for detailed migration steps.]
```sql
-- Key schema changes (abbreviated — full migration in [link])
[DDL statements for key additions/changes]
```
---
## 5. Alternatives Considered
*Every alternative must include an explicit reason why it was rejected. "We went with the proposed solution" is not a reason.*
### Alternative 1: [Name]
**Description:**
[What this alternative would involve.]
**Pros:**
- [Pro 1]
- [Pro 2]
**Cons:**
- [Con 1]
- [Con 2]
**Why rejected:**
[Specific reason — e.g. "Requires 3× the infrastructure cost", "Incompatible with multi-region requirement", "Team has no expertise in this technology and the ramp-up would miss the Q3 deadline"]
---
### Alternative 2: [Name]
**Description:**
[What this alternative would involve.]
**Pros:**
- [Pro 1]
- [Pro 2]
**Cons:**
- [Con 1]
- [Con 2]
**Why rejected:**
[Specific reason]
---
### Alternative 3: Do nothing / defer
**Description:**
Accept the current state and revisit the problem in [timeframe].
**Why rejected:**
[Why deferring is not acceptable — reference the impact of not solving this from Section 1.]
---
## 6. Implementation Plan
**Estimated effort:** [X engineer-weeks] | **Target completion:** [Date / Quarter]
**Team:** [Who is building this — names or roles]
| Phase | Description | Duration | Dependencies | Owner |
|---|---|---|---|---|
| 1 | [e.g. Core implementation — new component built and tested] | [X weeks] | [None] | [Name] |
| 2 | [e.g. Integration — connect new component to existing services] | [X weeks] | [Phase 1 complete] | [Name] |
| 3 | [e.g. Rollout — canary deploy, then full rollout] | [X weeks] | [Phase 2 + staging validated] | [Name] |
| 4 | [e.g. Cleanup — deprecate old system, remove feature flags] | [X weeks] | [Phase 3 stable for X weeks] | [Name] |
**Key milestones:**
- [ ] [Date]: [Milestone — e.g. "Core implementation complete and code-reviewed"]
- [ ] [Date]: [Milestone — e.g. "Staging environment validation complete"]
- [ ] [Date]: [Milestone — e.g. "10% canary traffic without regression"]
- [ ] [Date]: [Milestone — e.g. "Full rollout complete"]
- [ ] [Date]: [Milestone — e.g. "Old system decommissioned"]
---
## 7. Migration Plan
*Complete this section if the RFC requires migrating existing users, data, or API consumers.*
**Migration strategy:** [Big-bang / Phased / Parallel-run / Opt-in]
**Who is affected:**
- [e.g. All existing API v1 consumers — requires updated client libraries]
- [e.g. X million rows in the `orders` table require backfilling]
**Migration steps:**
1. [Step 1 — describe action, who does it, estimated duration]
2. [Step 2]
3. [Step 3]
**Backward compatibility window:** [How long will the old system/API remain available?]
**Communication plan:**
- [Who needs to be notified, when, and how — e.g. "API consumers will receive a deprecation notice 3 months before the old endpoint is removed"]
---
## 8. Security Implications
[Describe the security impact of this change. If there are no security implications, state that explicitly with reasoning — do not leave this section blank.]
| Concern | Impact | Mitigation |
|---|---|---|
| [e.g. New API endpoint exposed to internet] | [e.g. New attack surface] | [e.g. Rate limiting, auth required, WAF rules] |
| [e.g. New data stored — user PII] | [e.g. GDPR scope expanded] | [e.g. Encrypted at rest, access log, data retention policy] |
| [e.g. Service-to-service communication] | [e.g. Token forgery risk] | [e.g. mTLS between services] |
**Has a threat model been produced or updated?** [Yes — link / No — required before implementation / Not required — reason]
---
## 9. Performance Implications
[Describe the expected performance impact. Include projections for the new system and how it was estimated.]
| Metric | Current | Projected | Measurement method |
|---|---|---|---|
| [e.g. P99 latency — /api/auth] | [120ms] | [≤50ms] | [Load test results — link] |
| [e.g. Database query count per request] | [12] | [3] | [Query logging in staging] |
| [e.g. Memory per instance] | [512MB] | [768MB] | [Profiling — link] |
| [e.g. Infrastructure cost] | [$X/month] | [$Y/month] | [AWS cost calculator estimate] |
**Load testing:** [Has load testing been done? Link to results. If not, when will it be done?]
**Performance risks:**
- [Risk 1 — e.g. "New component adds a network hop that may increase tail latency under congestion — needs validation at 2× peak load"]
---
## 10. Observability Changes
*Describe what new or changed metrics, logs, traces, and alerts this RFC introduces.*
**New metrics:**
| Metric name | Type | Description | Alert threshold |
|---|---|---|---|
| `[service].[component].[metric]` | [counter/gauge/histogram] | [What it measures] | [e.g. P99 > 100ms for 5 min] |
**New log events:**
| Event | Level | When emitted | Key fields |
|---|---|---|---|
| `[event.name]` | INFO | [When] | `user_id`, `duration_ms`, `result` |
**Distributed tracing:** [Are spans added for new components? Which operations are instrumented?]
**Dashboard changes:** [New dashboard / updated existing dashboard — link]
---
## 11. Rollout Plan
**Rollout strategy:** [Feature flag / Canary / Blue-green / Gradual traffic shift / Full deploy]
| Stage | Traffic % | Duration | Success criteria | Rollback trigger |
|---|---|---|---|---|
| Internal testing | 0% (dogfood) | [X days] | [No errors in internal usage] | Any error |
| Canary | 1% | [X hours] | [Error rate <0.1%; P99 latency within budget] | Error rate >0.5% |
| Limited rollout | 10% | [X days] | [As above + business metrics stable] | Error rate >0.2% |
| Full rollout | 100% | — | [All success metrics from Section 2 met] | Any SLO breach |
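The per-stage success criteria and rollback triggers can be expressed as a three-way decision. The sketch below uses the example error-rate values from the Canary and Limited rollout rows; the `Stage` type, the `decide` function, and the intermediate "hold" outcome (neither promote nor roll back) are assumptions added for illustration, and business metrics are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    traffic_pct: int
    max_error_rate_pct: float       # success criterion: error rate must stay below this
    rollback_error_rate_pct: float  # rollback trigger: error rate above this


# Error-rate values copied from the example rows of the table.
CANARY = Stage("Canary", 1, 0.1, 0.5)
LIMITED = Stage("Limited rollout", 10, 0.1, 0.2)


def decide(stage: Stage, error_rate_pct: float, p99_within_budget: bool) -> str:
    """Return 'rollback', 'promote', or 'hold' for the current stage."""
    if error_rate_pct > stage.rollback_error_rate_pct:
        return "rollback"
    if error_rate_pct < stage.max_error_rate_pct and p99_within_budget:
        return "promote"
    return "hold"  # between the two thresholds: stay at this stage and keep watching
```

Checking the rollback trigger first means a stage is never promoted while its trigger is firing.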
**Feature flag:** [Name of feature flag, if applicable] — managed in [LaunchDarkly / Unleash / config]
**Rollback procedure:**
```
// How to roll back if the rollout needs to be reversed
1. [Step 1 — e.g. Toggle feature flag to off]
2. [Step 2 — e.g. Deploy previous version]
3. [Step 3 — e.g. Notify stakeholders]
```
---
## 12. Open Questions
[List any unresolved questions, design decisions not yet made, or areas where the author is specifically seeking feedback. Assign an owner and a resolution deadline for each.]
| # | Question | Owner | Deadline | Resolution |
|---|---|---|---|---|
| 1 | [e.g. Should we use optimistic or pessimistic locking for concurrent updates to [resource]?] | [Name] | [Date] | [Pending / [Answer]] |
| 2 | [e.g. What is the retention policy for [new data type]?] | [Name] | [Date] | [Pending / [Answer]] |
| 3 | [e.g. Do we need a read replica for this query pattern at launch, or can we defer it?] | [Name] | [Date] | [Pending / [Answer]] |
---
## 13. Decision
*To be filled in after the review period closes.*
**Decision:** [Approved / Rejected / Approved with modifications]
**Decision date:** [Date]
**Decision makers:** [Names]
**Summary of key feedback addressed:**
- [Feedback item and how it was resolved]
**Conditions of approval (if any):**
- [e.g. Must complete load testing before Phase 2 begins]
---
## Quality Checks
- [ ] The problem statement is specific and quantified — not "the current system is slow" but "P99 latency is 800ms; budget is 200ms"
- [ ] Goals section includes measurable success metrics, not aspirational statements
- [ ] Every alternative has an explicit rejection reason — not just a list of cons
- [ ] Security implications section is completed, not left blank
- [ ] Performance implications include projected numbers, not just "should be better"
- [ ] Open questions are assigned to named owners with deadlines — not floating
- [ ] The RFC is written to be read by someone who was not in the planning conversations
- [ ] Migration plan addresses all affected parties — users, API consumers, data — not just the technical steps
## Anti-Patterns
- [ ] Do not write the RFC as a persuasion document — its purpose is to expose trade-offs, not sell a decision
- [ ] Do not list alternatives without explicit rejection reasons — "we preferred the proposed solution" is not a reason
- [ ] Do not leave the security implications section blank or write "N/A" without a reasoned explanation
- [ ] Do not write open questions without assigning a named owner and a resolution deadline
- [ ] Do not skip the "impact of not solving this" section — without it, reviewers cannot assess urgency
@@ -0,0 +1,155 @@
---
name: runbook-writer
description: "Write an operational runbook for a service, incident type, or deployment procedure. Use when asked to write a runbook, create an ops guide, document an operational procedure, or prepare an incident response playbook. Produces a runbook with overview, prerequisites, step-by-step procedures, rollback steps, troubleshooting table, and escalation paths."
---
# Runbook Writer Skill
Produces operational runbooks for services, incident types, and deployment procedures — structured so an on-call engineer who's never touched the system can follow them under pressure.
## Required Inputs
Ask for these if not provided:
- **What the runbook is for** (e.g. deploying the payment service, responding to a database failover, rotating API keys)
- **Runbook type** (Deployment / Incident Response / Maintenance / Disaster Recovery)
- **System/service name and what it does** (brief description)
- **Audience** (new on-call engineers / experienced SREs / DevOps team)
- **Tech stack** (where relevant — e.g. Kubernetes, AWS RDS, Node.js)
- **Monitoring tools** (e.g. Grafana, Datadog, CloudWatch, Splunk — used to name specific dashboards and alert links in the steps)
- **Key environment details** (e.g. Kubernetes cluster name, AWS account/region, relevant namespaces or resource names — paste what's relevant for exact commands)
## Output Format
---
**Runbook:** [Runbook Title]
**Service:** [Service Name]
**Type:** [Deployment / Incident Response / Maintenance / DR]
**Last Updated:** [Insert today's date in YYYY-MM-DD format]
**Owner:** [Team or person]
**Severity:** [P1 / P2 / P3 — if incident-type]
---
### Overview
**What this runbook covers:**
[1-2 sentences on the scenario this runbook handles]
**When to use this runbook:**
- [Specific trigger condition 1 — e.g. PagerDuty alert: `high-error-rate-payment-service`]
- [Specific trigger condition 2 — e.g. Deploy needed after PR merged to `main`]
**Estimated time to complete:** [X minutes / X-Y minutes depending on outcome]
**Impact if not completed correctly:** [e.g. Payment processing degraded / Data loss risk / Users locked out]
---
### Prerequisites
**Access required:**
- [ ] [System/tool access — e.g. AWS Console: `production-account`]
- [ ] [Credential — e.g. `vault read secret/payment-service`]
- [ ] [VPN / bastion access if needed]
**Tools required:**
- [ ] [Tool name and version — e.g. `kubectl` v1.28+]
- [ ] [CLI or dashboard name]
**Before you start:**
- [ ] [Prerequisite check — e.g. Verify current deployment is healthy in Grafana]
- [ ] [Prerequisite action — e.g. Announce in `#ops-live` that you're starting]
---
### Procedure
Number every step. Use exact commands. Do not paraphrase tool names or flags.
**Step 1: [Action name]**
[What you're doing and why — one sentence]
```bash
# Exact command
[command here]
```
**Expected output:** `[what should appear if this worked]`
**If this fails:** [Exact error message to look for] → [What to do, or see Troubleshooting]
**Step 2: [Action name]**
[Same structure as Step 1]
**Step 3: Verify**
Always include a verification step after the main procedure:
```bash
[verification command]
```
**Expected state:** [What a healthy system looks like after this runbook completes]
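The step structure above (exact command, expected output, failure path) also lends itself to automation. The following sketch shows one possible shape, using Python's standard `subprocess.run`. The `Step` type and `run_step` helper are hypothetical, and matching the expected output as a substring of stdout is an assumption.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    command: list[str]    # exact command, one list item per argument
    expected_output: str  # substring that must appear in stdout
    if_this_fails: str    # what the engineer should do next


def run_step(step: Step, timeout_s: float = 60.0) -> bool:
    """Run one step; succeed only on exit code 0 with the expected output present."""
    result = subprocess.run(step.command, capture_output=True, text=True, timeout=timeout_s)
    ok = result.returncode == 0 and step.expected_output in result.stdout
    if not ok:
        print(f"{step.name} failed (exit {result.returncode}). {step.if_this_fails}")
    return ok


if __name__ == "__main__":
    # Stand-in command; a real runbook would use the exact command from the step.
    demo = Step("Step 1: Smoke check", [sys.executable, "-c", "print('healthy')"],
                "healthy", "See Troubleshooting.")
    print(run_step(demo))
```

Passing the command as a list avoids shell quoting surprises when a step is copied between environments.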
---
### Rollback
How to undo this procedure if something went wrong:
**Step R1: [Rollback action]**
```bash
[rollback command]
```
**Verify rollback:** `[command to confirm rollback succeeded]`
---
### Troubleshooting
| Symptom | Likely Cause | Resolution |
|---|---|---|
| [Error message or observable symptom] | [Why this happens] | [Exact fix or next step] |
| [Another symptom] | [Cause] | [Resolution] |
---
### Escalation
If this runbook does not resolve the issue:
| Condition | Who to Contact | How |
|---|---|---|
| [e.g. DB unavailable after 10 min] | [DBA on-call] | [PagerDuty policy: `db-oncall`] |
| [e.g. Payment provider unresponsive] | [Vendor contact] | [Contact in 1Password: `vendor-escalation`] |
**Always update the incident timeline in [tool] before escalating.**
---
### Post-Procedure Checklist
After completing the runbook:
- [ ] Announce completion in `#ops-live` with outcome
- [ ] Update the incident ticket / deploy log
- [ ] Verify alerts have resolved in monitoring dashboard
- [ ] If this revealed a gap in this runbook — update it now (link to edit process)
---
## Quality Checks
- [ ] Every step has an exact command (no "run the deploy script")
- [ ] Expected output is specified for each step so the engineer knows if it worked
- [ ] Failure path is explicit for each step (not "if it fails, investigate")
- [ ] Rollback procedure is complete and independently testable
- [ ] Escalation table has no cells containing only "[Team name]" — every row must either have a real contact or be explicitly flagged as [FILL IN: on-call rotation link]
- [ ] Rollback section contains at least one concrete command (not left as "[rollback command]" placeholder)
- [ ] Runbook can be followed by someone who has never touched this system
## Usage Examples
- "Write a runbook for [service] deployment"
- "Create an incident response runbook for [alert type]"
- "I need a runbook for [procedure]"
- "Document the operational procedure for [X]"
- "Write an ops playbook for [scenario]"
## Anti-Patterns
- [ ] Do not write steps as vague actions like "run the deploy script" — every step must include the exact command
- [ ] Do not leave the rollback section as a placeholder — a runbook without a tested rollback procedure is incomplete and dangerous
- [ ] Do not omit expected output for each step — without it, the on-call engineer cannot tell if the step succeeded
- [ ] Do not write escalation contacts as "[Team name]" — every escalation row must have a real contact or an explicit flag to fill in
- [ ] Do not assume the reader knows the system — write for someone who has never touched it before
@@ -0,0 +1,261 @@
---
name: security-threat-model
description: "Write a STRIDE-based threat model for a service or feature. Use when asked to produce a threat model, document security risks, identify attack vectors, assess a service's security posture, or prepare for a security design review. Produces a structured threat model covering assets, trust boundaries, STRIDE threat enumeration per component, risk scores, mitigation controls, and residual risk sign-off."
---
# Security Threat Model Skill
Produce a complete STRIDE-based threat model for a service or feature. A threat model is not a list of things that could go wrong — it is a structured analysis of attackers, assets, boundaries, and controls that lets an engineering team make informed, documented security decisions.
A good threat model is specific enough that a new engineer can understand what is being protected, why each control exists, and what risk the team has accepted.
## Required Inputs
Ask for these if not already provided:
- **Service name and description** — what the service does, who uses it
- **Architecture overview** — components, dependencies, data flows (a diagram description or ASCII diagram is fine)
- **Deployment environment** — cloud provider, VPC/network topology, where it runs (Kubernetes, ECS, VMs, serverless)
- **Data sensitivity** — what data does this service handle? PII, payment data, credentials, internal-only?
- **Existing controls** — authentication method, encryption in transit/at rest, current WAF/firewall, existing security scanning
- **Trust levels** — who are the principals? (anonymous public, authenticated users, internal services, admins)
## Output Format
---
# Security Threat Model: [Service Name]
**Service:** [Name] | **Team:** [Team name]
**Author:** [Name] | **Reviewed by:** [Security lead / peer]
**Date:** [Date] | **Next review:** [Date — recommend 6 months or after major architecture change]
**Classification:** [Internal / Confidential]
---
## 1. Overview
[2-3 sentences describing the service, its role in the system, and the scope of this threat model. State what is in scope and what is explicitly out of scope.]
**In scope:**
- [Component or data flow]
- [Component or data flow]
**Out of scope:**
- [e.g. Third-party payment processor internals]
- [e.g. Corporate network / end-user devices]
---
## 2. Asset Register
Assets are the things worth protecting — data, capabilities, and reputational value.
| Asset | Description | Sensitivity | Owner |
|---|---|---|---|
| [e.g. User PII] | Names, email addresses, profile data | High — GDPR-regulated | [Team] |
| [e.g. API credentials] | Service-to-service auth tokens | Critical | [Team] |
| [e.g. Session tokens] | User authentication state | High | [Team] |
| [e.g. Audit logs] | Record of user and admin actions | Medium | [Team] |
| [e.g. Service availability] | Uptime of the [X] endpoint | Medium | [Team] |
**Data classification key:**
- **Critical** — Credential material; exposure enables direct system compromise
- **High** — PII, financial data, health data; regulated or high reputational impact
- **Medium** — Internal configuration, non-sensitive business data
- **Low** — Public information, anonymised data
---
## 3. Trust Boundaries and Architecture
Trust boundaries are the lines that separate zones with different trust levels. Threats often occur when data or requests cross a boundary.
```
┌──────────────────────────────────────────────────────────────────┐
│                       INTERNET (Untrusted)                       │
│                                                                  │
│      [Public User]                         [Bot / Attacker]      │
└──────────────────────────────┬───────────────────────────────────┘
                               │ HTTPS
                  ─ ─ ─ ─ ─ ─ ─│─ ─ ─ ─ ─ ─ ─ ─
                 Trust Boundary: Public → DMZ
                  ─ ─ ─ ─ ─ ─ ─│─ ─ ─ ─ ─ ─ ─ ─
┌──────────────────────────────────────────────────────────────────┐
│                         DMZ / Edge Layer                         │
│    ┌────────────┐     ┌──────────────┐                           │
│    │ WAF / CDN  │────▶│ API Gateway  │                           │
│    └────────────┘     └──────┬───────┘                           │
└──────────────────────────────┼───────────────────────────────────┘
                  ─ ─ ─ ─ ─ ─ ─│─ ─ ─ ─ ─ ─ ─ ─
            Trust Boundary: Edge → Application VPC
                  ─ ─ ─ ─ ─ ─ ─│─ ─ ─ ─ ─ ─ ─ ─
┌──────────────────────────────────────────────────────────────────┐
│                    Application VPC (Private)                     │
│   ┌──────────────┐     ┌────────────┐     ┌──────────────────┐   │
│   │ [Service A]  │────▶│ [Service B]│────▶│ [Database]       │   │
│   └──────────────┘     └────────────┘     └──────────────────┘   │
│                                ▲                                 │
│                                │                                 │
│   ┌──────────────┐             │                                 │
│   │ Admin (IAM)  │─────────────┘                                 │
│   └──────────────┘                                               │
└──────────────────────────────────────────────────────────────────┘
```
**Trust Boundaries identified:**
| Boundary | From | To | Auth mechanism | Encrypted |
|---|---|---|---|---|
| TB-1 | Public internet | API Gateway | [JWT / OAuth / API key] | TLS 1.2+ |
| TB-2 | API Gateway | Service A | [mTLS / internal JWT / IAM role] | [Yes/No] |
| TB-3 | Service A | Database | [Connection string + IAM / username+password] | [Yes/No] |
| TB-4 | Admin | Service B | [IAM role / VPN + MFA] | TLS |
---
## 4. STRIDE Threat Analysis
STRIDE is a threat classification framework. For each significant component, enumerate threats in each category.
**STRIDE key:**
- **S** — Spoofing: Impersonating another user, service, or system
- **T** — Tampering: Modifying data or code without authorisation
- **R** — Repudiation: Denying an action occurred; insufficient audit trail
- **I** — Information Disclosure: Exposing data to unauthorised parties
- **D** — Denial of Service: Making the service unavailable
- **E** — Elevation of Privilege: Gaining capabilities beyond what is authorised
### Component: [API Gateway / Auth Layer]
| ID | Category | Threat | Attack vector | Existing control |
|---|---|---|---|---|
| T-001 | S | Attacker forges a JWT token to authenticate as another user | Weak signing key or algorithm confusion (alg:none) | [e.g. RS256 with key rotation / none] |
| T-002 | S | Attacker replays a stolen session token | Theft via XSS or network sniff | [e.g. Token expiry + refresh rotation] |
| T-003 | T | Attacker modifies request headers to bypass tenant isolation | Missing validation of tenant ID header | [e.g. Server-side tenant resolution / none] |
| T-004 | R | No audit trail for admin authentication events | Logging not configured for auth failures | [e.g. CloudTrail enabled / none] |
| T-005 | I | Auth error messages reveal whether an email exists | Verbose error responses | [e.g. Normalised error responses / none] |
| T-006 | D | Credential stuffing exhausts rate limits and blocks legitimate users | Automated login attempts | [e.g. Rate limiting per IP + CAPTCHA / none] |
| T-007 | E | Compromised low-privilege token used to call admin endpoint | Missing role check on admin routes | [e.g. RBAC middleware on all routes / none] |
### Component: [Application Service / Business Logic]
| ID | Category | Threat | Attack vector | Existing control |
|---|---|---|---|---|
| T-008 | T | SQL/NoSQL injection via unsanitised user input | Unparameterised queries | [e.g. ORM with parameterised queries / none] |
| T-009 | T | Mass assignment — attacker sets fields they should not (e.g. `isAdmin: true`) | API accepts extra fields without allowlist | [e.g. Input validation / none] |
| T-010 | I | Insecure direct object reference — user accesses another user's resource | Missing ownership check on resource ID | [e.g. Ownership middleware / none] |
| T-011 | I | Sensitive data in application logs (PII, tokens) | Over-logging in debug mode | [e.g. Log scrubbing / none] |
| T-012 | D | Unprotected expensive endpoint triggers large DB scan | No pagination or query cost limit | [e.g. Pagination enforced / none] |
| T-013 | R | Business-critical state changes not logged | No audit event on [operation] | [e.g. Audit log table / none] |
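
As an illustration of the allowlist control named for T-009, the sketch below rejects any request field that is not explicitly permitted. The field names and the `filter_update` helper are hypothetical and stand in for whatever input validation the service already uses.

```python
# Hypothetical allowlist for a profile-update endpoint (illustrative names only).
ALLOWED_PROFILE_FIELDS = frozenset({"display_name", "email"})


def filter_update(payload: dict, allowed: frozenset = ALLOWED_PROFILE_FIELDS) -> dict:
    """Return a copy of the payload, or raise if it carries fields outside the allowlist."""
    unexpected = set(payload) - allowed
    if unexpected:
        # Rejecting (rather than silently dropping) makes probing attempts visible.
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    return dict(payload)


if __name__ == "__main__":
    print(filter_update({"display_name": "Ada"}))
    try:
        filter_update({"display_name": "Ada", "isAdmin": True})
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the whole request is one design choice; dropping unknown fields and logging them is another reasonable option if clients are known to send extra data.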
### Component: [Database]
| ID | Category | Threat | Attack vector | Existing control |
|---|---|---|---|---|
| T-014 | I | Database exposed to internet (misconfigured security group) | Direct connection from outside VPC | [e.g. No public IP, security group restricts to app subnet] |
| T-015 | I | Backup snapshots not encrypted or accessible to wrong accounts | Unencrypted snapshot, public S3 | [e.g. Encrypted snapshots, private S3 bucket] |
| T-016 | E | Privilege escalation via DB account with excessive permissions | App uses a superuser DB account | [e.g. Least-privilege DB role per service / none] |
| T-017 | D | Runaway query or bulk delete causes data loss or outage | No query timeout or soft-delete | [e.g. Statement timeout, soft-delete on critical tables / none] |
### Component: [Internal Service-to-Service Communication]
| ID | Category | Threat | Attack vector | Existing control |
|---|---|---|---|---|
| T-018 | S | Rogue internal service impersonates a trusted service | No mutual authentication between services | [e.g. mTLS / service mesh / none] |
| T-019 | I | Internal traffic sniffed on shared network | Unencrypted service-to-service calls | [e.g. Service mesh with TLS / none] |
| T-020 | E | Compromised internal service calls privileged endpoints | No scoping on internal tokens | [e.g. Scoped service tokens / none] |
---
## 5. Risk Register
Score each threat: **Likelihood (1–5)** × **Impact (1–5)** = **Risk Score (1–25)**
Priority bands: Critical (20–25) | High (12–19) | Medium (6–11) | Low (1–5)
| ID | Threat summary | Likelihood | Impact | Score | Priority | Status |
|---|---|---|---|---|---|---|
| T-001 | JWT forgery — auth bypass | 2 | 5 | 10 | Medium | [Open / Mitigated / Accepted] |
| T-002 | Session token replay | 3 | 4 | 12 | High | [Open / Mitigated / Accepted] |
| T-007 | Privilege escalation via missing role check | 3 | 5 | 15 | High | [Open / Mitigated / Accepted] |
| T-008 | SQL injection | 2 | 5 | 10 | Medium | [Open / Mitigated / Accepted] |
| T-010 | IDOR — cross-user data access | 3 | 4 | 12 | High | [Open / Mitigated / Accepted] |
| T-014 | Database exposed to internet | 1 | 5 | 5 | Low | [Open / Mitigated / Accepted] |
| T-018 | Rogue internal service impersonation | 2 | 4 | 8 | Medium | [Open / Mitigated / Accepted] |
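
The scoring rule can be checked mechanically, which helps keep the Score and Priority columns consistent as rows are added. The sketch below assumes 1-5 scales for likelihood and impact and the bands Critical 20-25, High 12-19, Medium 6-11, Low 1-5; the function names are illustrative, not part of any existing tool.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Multiply likelihood by impact, each on a 1-5 scale."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact


def priority_band(score: int) -> str:
    """Map a 1-25 risk score to a priority band."""
    if not 1 <= score <= 25:
        raise ValueError(f"score must be between 1 and 25, got {score}")
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # Example rows: (threat ID, likelihood, impact)
    for threat_id, likelihood, impact in [("T-002", 3, 4), ("T-008", 2, 5), ("T-014", 1, 5)]:
        score = risk_score(likelihood, impact)
        print(threat_id, score, priority_band(score))
```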
---
## 6. Mitigations Table
For every Open threat with priority Medium or above, define a specific mitigation.
| ID | Threat | Mitigation | Owner | Target date | Ticket |
|---|---|---|---|---|---|
| T-002 | Session token replay | Implement token rotation on refresh — invalidate old token server-side immediately | [Engineer name] | [Date] | [JIRA-123] |
| T-007 | Privilege escalation | Add RBAC middleware to all `/admin/*` routes; write integration test for role boundary | [Engineer name] | [Date] | [JIRA-124] |
| T-010 | IDOR | Add ownership assertion to all resource-fetching service methods; add to code review checklist | [Engineer name] | [Date] | [JIRA-125] |
| T-011 | PII in logs | Audit logging calls for PII fields; add scrubbing to logger middleware | [Engineer name] | [Date] | [JIRA-126] |
| T-018 | Rogue service impersonation | Enable mTLS via service mesh or issue scoped service tokens per service | [Engineer name] | [Date] | [JIRA-127] |
---
## 7. Accepted Risks
Accepted risks are threats the team has decided not to mitigate right now. Every accepted risk must have a named owner and a review date.
| ID | Threat | Reason for acceptance | Risk owner | Review date |
|---|---|---|---|---|
| T-014 | Database public exposure | Database has no public IP assigned; control already in place — accepted as low likelihood | [Name] | [Date] |
| [ID] | [Threat] | [Reason — e.g. "Effort exceeds risk at current scale; re-evaluate at 10× traffic"] | [Name] | [Date] |
---
## 8. Security Controls Summary
| Control | Type | Covers threats | Implemented |
|---|---|---|---|
| JWT RS256 with 15-min expiry | Preventive | T-001, T-002 | [Yes / Partial / No] |
| RBAC middleware on all routes | Preventive | T-007, T-020 | [Yes / Partial / No] |
| Parameterised queries (ORM) | Preventive | T-008 | [Yes / Partial / No] |
| Rate limiting (100 req/min per IP) | Preventive | T-006, T-012 | [Yes / Partial / No] |
| CloudTrail / audit logging | Detective | T-004, T-013 | [Yes / Partial / No] |
| Automated SAST in CI pipeline | Detective | T-008, T-009 | [Yes / Partial / No] |
| Encrypted backups + private S3 | Preventive | T-015 | [Yes / Partial / No] |
| Least-privilege DB role | Preventive | T-016 | [Yes / Partial / No] |
| Incident response runbook | Corrective | All | [Yes / Partial / No] |
---
## 9. Review Cadence
| Trigger | Action |
|---|---|
| Every 6 months | Full threat model review — update risk scores, close mitigated items |
| Major architecture change | Update trust boundary diagram and re-run STRIDE for new components |
| Security incident | Review relevant threats; add any newly discovered vectors |
| New data classification | Add assets to register; assess whether new STRIDE categories apply |
| Third-party dependency added | Assess supply chain threats for the new dependency |
**Next scheduled review:** [Date]
**Review owner:** [Name / Security lead]
---
## Quality Checks
- [ ] Every trust boundary is named and its authentication mechanism is specified — not left as "TBD"
- [ ] Every Critical and High risk in the risk register has a mitigation with a named owner and a target date
- [ ] Every accepted risk has a named risk owner and a review date — no unowned accepted risks
- [ ] The asset register includes data sensitivity levels and at least one entry for credential material
- [ ] STRIDE analysis covers all major components — not just the API layer
- [ ] Mitigation actions are specific enough to become a ticket (not "improve security")
- [ ] The ASCII trust boundary diagram matches the architecture description provided
## Anti-Patterns
- [ ] Do not restrict STRIDE analysis to only the API layer — threats exist at every component including the database and internal services
- [ ] Do not leave mitigations as vague directives like "improve security" — every mitigation must be specific enough to become a ticket
- [ ] Do not accept risks without a named owner and a review date — unowned accepted risks are not managed risks
- [ ] Do not write a threat model that covers only theoretical threats — prioritise by likelihood and impact using the risk register
- [ ] Do not omit the asset register — without knowing what is being protected, the STRIDE analysis has no anchor
@@ -0,0 +1,300 @@
---
name: service-catalog-entry
description: "Write a service catalog entry for a microservice or internal platform service — covering service identity, purpose, architecture context, SLAs, API contract summary, data classification, dependencies, operational runbooks, and known limitations. Use when asked to document a service for an internal developer portal, write a service README for a platform catalog, create a service overview page, or onboard a new service to a service registry. Produces a complete service catalog entry suitable for an internal developer portal or wiki."
---
# Service Catalog Entry Skill
Produce a complete service catalog entry for a microservice or internal platform service — giving any engineer at the company the context they need to understand what the service does, how to depend on it, what its reliability characteristics are, and where to go when something goes wrong. A well-written catalog entry eliminates "who owns this?" and "is this safe to use?" questions that slow down teams depending on shared services.
## Required Inputs
Ask for these if not already provided:
- **Service name** — the canonical identifier used in code, monitoring, and deployments
- **Team and owner** — team name, tech lead name, and on-call contact
- **Architecture overview** — what the service does, what calls it, and what it calls
- **SLA requirements** — availability target, latency SLO, support tier, and maintenance window
- **Key APIs** — the most important endpoints other teams use (method, path, brief description)
- **Data handled** — what data the service stores or processes, sensitivity classification, retention
## Output Format
---
# Service Catalog: [Service Name]
> **[One sentence — what this service does for consumers, in plain language]**
>
> *e.g. "The Payments Service processes charge, refund, and subscription billing events for all Acme products."*
---
## Identity
| Field | Value |
|---|---|
| **Service name** | `[service-name]` |
| **Canonical repository** | [https://github.com/[org]/[repo]] |
| **Owner team** | [Team name] |
| **Tech lead** | [Name] ([Slack: @handle]) |
| **On-call rotation** | [PagerDuty service link] |
| **Slack channel** | `#[team-channel]` |
| **Support tier** | [Tier 1 — 24/7 / Tier 2 — business hours / Tier 3 — best effort] |
| **Status** | [Active / Deprecated / Sunset date: YYYY-MM-DD] |
| **Language / runtime** | [e.g. Go 1.22 / Python 3.12 / Node 20] |
| **Deployment platform** | [Kubernetes / ECS / Lambda / etc.] |
| **Environments** | [Production: URL] / [Staging: URL] / [Dev: URL] |
---
## What It Does
[Two to three paragraphs in plain language — no jargon or acronyms without explanation.]
[Paragraph 1: The business problem this service solves. What would break or be missing if this service did not exist?]
[Paragraph 2: How it works at a high level — the main processing model (e.g. request/response API, event-driven consumer, batch processor), what triggers it, and what it produces.]
[Paragraph 3: What this service is NOT responsible for — the explicit boundaries. This prevents other teams from building incorrect assumptions about scope.]
---
## Architecture Context
### System Diagram
```
[Upstream callers]           [This Service]    [Downstream dependencies]
[Web App]     ──────────→                      ──→ [Primary Database — PostgreSQL]
[Mobile API]    ────────→    [Service Name]    ──→ [Cache — Redis]
[Partner API]   ────────→   (Port 8080/gRPC)   ──→ [Message Queue — Kafka/SQS]
                                               ──→ [External Service / API]
                                   ↓ emits events to
                           [Event Bus / SNS]
                                   ↓ consumed by
                         [Downstream Service A]
                         [Downstream Service B]
```
### Who Depends on This Service
| Caller | How they use it | Contact |
|---|---|---|
| [Service / Team A] | [e.g. "Calls POST /charges to initiate payments"] | [Slack: #team-a] |
| [Service / Team B] | [e.g. "Subscribes to payment.completed events via Kafka topic"] | [Slack: #team-b] |
| [Service / Team C] | [e.g. "Calls GET /subscriptions for billing status"] | [Slack: #team-c] |
### What This Service Depends On
| Dependency | Type | Criticality | Their on-call |
|---|---|---|---|
| [PostgreSQL instance] | Database | Critical — all writes fail without it | [DBA team: #db-oncall] |
| [Redis cluster] | Cache | High — latency degrades without it | [Infra team: #infra-oncall] |
| [Kafka cluster] | Message queue | High — async events queue | [Infra team: #infra-oncall] |
| [Stripe API] | External API | Critical — payment processing fails | [vendor status: status.stripe.com] |
| [Auth Service] | Internal service | Critical — all auth fails | [Auth team: #auth-oncall] |
---
## Service Level Agreement
### Availability and Latency
| SLO | Target | Measurement window | Error budget |
|---|---|---|---|
| Availability | [99.9%] | Rolling 30 days | [43 min/month] |
| p50 latency (key endpoints) | < [50] ms | Rolling 24 hours | — |
| p99 latency (key endpoints) | < [500] ms | Rolling 24 hours | — |
| p99.9 latency (key endpoints) | < [2000] ms | Rolling 24 hours | — |
| Error rate | < [0.1]% | Rolling 1 hour | — |
**SLO dashboard:** [Link to monitoring dashboard]
**Current error budget remaining:** [Link to SLO dashboard or inline value]
### Support Tiers
| Tier | Scope | Response time | Resolution time |
|---|---|---|---|
| P1 — Service down | All authenticated requests failing | 15 minutes | 1 hour |
| P2 — Significant degradation | Error rate >1% or p99 >2× SLO | 30 minutes | 4 hours |
| P3 — Minor issues | Non-critical endpoints degraded | Next business day | 3 business days |
| Feature requests / bugs | Via standard ticket process | [Ticket SLA] | Per roadmap |
**To raise an incident:** Page via [PagerDuty service link] or post in `#incidents`.
**To raise a feature request or bug:** File a ticket in [JIRA project / GitHub repo Issues].
### Maintenance Windows
- **Planned downtime:** [e.g. "Sundays 02:00–04:00 UTC — advance notice posted to #[team-channel] 48h before"]
- **Deployment window:** [e.g. "Weekdays 10:00–16:00 UTC — no deploys on Fridays or the day before a public holiday"]
- **Breaking changes notice:** [e.g. "Minimum 30 days notice for breaking API changes — see versioning policy below"]
---
## API Contract
### Authentication
All API calls require: [e.g. "Bearer token via Authorization header. Tokens are issued by the Auth Service (`/api/v1/token`)"]
```
Authorization: Bearer [jwt-token]
Content-Type: application/json
```
### Base URL
| Environment | Base URL |
|---|---|
| Production | `https://[service-name].internal.[company].com` |
| Staging | `https://[service-name].staging.[company].com` |
| Local development | `http://localhost:[port]` |
### Key Endpoints
| Method | Path | Description | Auth required | Rate limit |
|---|---|---|---|---|
| `GET` | `/health` | Liveness and readiness check | No | None |
| `GET` | `/api/v1/[resource]` | [Description — e.g. "List resources for the authenticated user"] | Yes | [100 req/min] |
| `GET` | `/api/v1/[resource]/:id` | [Description — e.g. "Get a single resource by ID"] | Yes | [500 req/min] |
| `POST` | `/api/v1/[resource]` | [Description — e.g. "Create a new resource"] | Yes | [50 req/min] |
| `PUT` | `/api/v1/[resource]/:id` | [Description — e.g. "Update an existing resource"] | Yes | [50 req/min] |
| `DELETE` | `/api/v1/[resource]/:id` | [Description] | Yes | [20 req/min] |
**Full API documentation:** [OpenAPI/Swagger spec URL] | [Postman collection URL]
### Versioning Policy
- API version is in the URL path (`/api/v1/`, `/api/v2/`)
- Minor additions (new optional fields, new endpoints) are non-breaking — no version bump
- Breaking changes (removed fields, changed types, authentication changes) require a new major version
- Deprecated versions are supported for [90 days] after the successor reaches GA
- Deprecation notices are posted to `#[team-channel]` and emailed to registered consumers
### Error Response Format
```json
{
  "error": {
    "code": "[ERROR_CODE]",
    "message": "[Human-readable description]",
    "request_id": "[UUID — include in support tickets]",
    "details": {}
  }
}
```
Common error codes:
| HTTP status | Error code | Meaning |
|---|---|---|
| 400 | `INVALID_REQUEST` | Request body or parameters fail validation |
| 401 | `UNAUTHENTICATED` | Missing or invalid auth token |
| 403 | `FORBIDDEN` | Token valid but lacks permission for this resource |
| 404 | `NOT_FOUND` | Resource does not exist |
| 409 | `CONFLICT` | Duplicate resource or state conflict |
| 422 | `UNPROCESSABLE_ENTITY` | Request is valid but violates business rules |
| 429 | `RATE_LIMITED` | Too many requests — back off and retry |
| 500 | `INTERNAL_ERROR` | Unexpected server error — include request_id in support ticket |
| 503 | `SERVICE_UNAVAILABLE` | Downstream dependency unavailable — retry with backoff |
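
A consumer can use the error envelope and status codes to decide when to retry. The sketch below treats only 429 and 503 as retryable, since those are the two rows that recommend backing off, and computes a capped exponential delay schedule. The helper names and the default delay values are assumptions for illustration, not part of the service's client library.

```python
import json

# Statuses the table above marks as safe to retry with backoff.
RETRYABLE_STATUSES = frozenset({429, 503})


def should_retry(status: int) -> bool:
    """Return True only for statuses documented as retryable."""
    return status in RETRYABLE_STATUSES


def backoff_delays(max_attempts: int, base_seconds: float = 0.5, cap_seconds: float = 30.0) -> list[float]:
    """Delays to wait between attempts: base * 2**n, capped (no jitter in this sketch)."""
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(max_attempts - 1)]


def parse_error(body: str) -> tuple[str, str]:
    """Extract (code, request_id) from the documented error envelope."""
    error = json.loads(body)["error"]
    return error["code"], error["request_id"]


if __name__ == "__main__":
    sample = '{"error": {"code": "RATE_LIMITED", "message": "Too many requests", "request_id": "abc-123", "details": {}}}'
    print(parse_error(sample), should_retry(429), backoff_delays(4))
```

In production use, adding random jitter to each delay helps avoid many clients retrying in lockstep.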
### Events Published (if event-driven)
| Event | Topic / Queue | Schema | Published when |
|---|---|---|---|
| `[resource].created` | `[kafka-topic / sns-arn]` | [Schema URL] | [When a new resource is created] |
| `[resource].updated` | `[kafka-topic / sns-arn]` | [Schema URL] | [When a resource is modified] |
| `[resource].deleted` | `[kafka-topic / sns-arn]` | [Schema URL] | [When a resource is deleted] |
---
## Data Classification
| Data element | Sensitivity | Stored in | Retention | Encrypted at rest |
|---|---|---|---|---|
| [User PII — e.g. email, name] | [PII / Restricted] | [PostgreSQL `users` table] | [Until account deletion] | Yes |
| [Financial data — e.g. card last 4] | [PCI / Highly restricted] | [PostgreSQL `payment_methods` table] | [7 years per regulations] | Yes — field-level encryption |
| [Operational logs] | [Internal] | [CloudWatch / Datadog] | [90 days] | Yes (at rest, not searched) |
| [Anonymised analytics] | [Public] | [Data warehouse] | [Indefinite] | Yes |
**Data residency:** [e.g. "All data stored in us-east-1. EU customer data stored in eu-west-1 per GDPR requirements."]
**Compliance scope:** [e.g. SOC 2 Type II / PCI DSS Level 2 / HIPAA / GDPR]
**Data access policy:** [e.g. "Production database access requires [approval process]. Access logged and reviewed quarterly."]
---
## Operational Runbooks
| Runbook | Location | Use when |
|---|---|---|
| On-call runbook | [Wiki / GitHub link] | Responding to PagerDuty alerts |
| Deployment runbook | [Wiki / GitHub link] | Deploying a new version to production |
| Database migration runbook | [Wiki / GitHub link] | Running schema migrations |
| Rollback runbook | [Wiki / GitHub link] | Rolling back a bad deploy |
| Incident response runbook | [Wiki / GitHub link] | Declaring and managing incidents |
| Disaster recovery plan | [Wiki / GitHub link] | Zone/region failure or data loss |
**Monitoring dashboards:**
| Dashboard | Link | Use it for |
|---|---|---|
| Service overview | [Datadog / Grafana link] | Error rate, latency, throughput |
| Infrastructure | [Link] | CPU, memory, pod health |
| Database | [Link] | Query performance, connection pool |
| SLO / error budget | [Link] | Budget burn rate, availability |
| Dependency health | [Link] | Upstream dependency status |
---
## Known Limitations
Document limitations honestly — this section prevents other teams from building on incorrect assumptions.
| Limitation | Impact | Workaround | Planned fix |
|---|---|---|---|
| [e.g. No bulk write API — items must be created one at a time] | [Slow for large imports — N HTTP calls required] | [Use the batch import CLI tool for >100 items] | [Bulk API in Q3 — ticket: [URL]] |
| [e.g. List endpoints have a maximum page size of 100] | [Cannot retrieve more than 100 items in a single call] | [Paginate using `cursor` parameter] | [No current plan to increase — by design] |
| [e.g. Rate limits are per-token, not per-service] | [One high-traffic consumer can exhaust the limit for every other consumer sharing the same token] | [Request a dedicated service-account token] | [Per-service rate limits in roadmap] |
| [e.g. Eventual consistency on read-after-write for list endpoints] | [Record may not appear in list immediately after creation (<500ms lag)] | [Use GET /:id to confirm creation; do not rely on list for immediate consistency] | [Read-your-writes consistency available via `?consistent=true` — in progress] |
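
For the page-size limitation, a consumer typically wraps the list endpoint in a loop that follows the `cursor` until it runs out. The sketch below assumes a response shape with `items` and `next_cursor` keys, which is hypothetical; `fetch_page` stands in for the real HTTP call so the loop can be shown without a network dependency.

```python
from typing import Callable, Iterator, Optional


def iter_all_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item by following next_cursor until the API stops returning one.

    fetch_page(cursor) is assumed to return {"items": [...], "next_cursor": str or None}.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


if __name__ == "__main__":
    # In-memory stand-in for a list endpoint with a maximum page size of 100.
    records = [{"id": n} for n in range(250)]

    def fake_fetch(cursor: Optional[str]) -> dict:
        start = int(cursor) if cursor else 0
        end = start + 100
        return {"items": records[start:end], "next_cursor": str(end) if end < len(records) else None}

    print(sum(1 for _ in iter_all_items(fake_fetch)))
```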
---
## Getting Started
**To start using this service:**
1. Request access: [Link to access request form or instructions]
2. Get your service account credentials: [Link to process]
3. Read the API docs: [OpenAPI spec URL]
4. Try the sandbox environment: `https://[service-name].sandbox.[company].com`
5. Join the consumer Slack channel: `#[service-name]-consumers`
**Client libraries (if available):**
| Language | Package | Installation |
|---|---|---|
| [Python] | [`[package-name]`] | `pip install [package-name]` |
| [Go] | [`github.com/[org]/[package]`] | `go get github.com/[org]/[package]` |
| [TypeScript/JS] | [`@[org]/[package]`] | `npm install @[org]/[package]` |
---
## Quality Checks
- [ ] "What It Does" is written without jargon — a new engineer from another team can understand it in under 2 minutes
- [ ] SLO targets are specific numbers agreed with stakeholders — not aspirational or copied from a template
- [ ] All direct upstream consumers are listed in the "Who Depends on This Service" table — no omissions
- [ ] API error codes are accurate and tested — not aspirational documentation
- [ ] Known limitations are honest — nothing is glossed over to make the service look better than it is
- [ ] All runbook links are live — not broken references or TODO placeholders
- [ ] Data classification includes retention period and encryption status — not just sensitivity level
- [ ] The entry has been reviewed by at least one consumer team to confirm it matches their experience of the service
## Anti-Patterns
- [ ] Do not write aspirational SLO targets — targets must be agreed with stakeholders and based on historical data, not copied from a template
- [ ] Do not leave runbook links as TODO placeholders — broken or missing links make the catalog entry worse than useless during an incident
- [ ] Do not omit the "Known Limitations" section to make the service look better — undisclosed limitations cause incorrect integrations and downstream incidents
- [ ] Do not list API error codes without testing them — aspirational error documentation misleads consumers
- [ ] Do not write the "What It Does" section with jargon — a new engineer from another team must understand it in under 2 minutes
@@ -0,0 +1,239 @@
---
name: slo-error-budget
description: "Define Service Level Objectives (SLOs) and an error budget policy for a service. Use when asked to write SLOs, define SLIs, calculate an error budget, set reliability targets, or create an error budget policy. Produces a complete SLO document with SLI definitions, target calculation, error budget policy, burn rate alerts, and review cadence."
---
# SLO and Error Budget Skill
Produce a complete, implementable SLO document for a service — covering what to measure, what target to set, how to calculate the error budget, and what to do when it burns.
A good SLO is not a target to hit. It is an agreement about what reliability means for your users — and a framework for making principled trade-offs between reliability and velocity.
## Required Inputs
Ask for these if not already provided:
- **Service name** and brief description of what it does
- **Primary users** — who depends on this service and how
- **User-facing interactions** to protect — e.g. API calls, page loads, transactions
- **Current reliability data** — error rate, latency, uptime (last 30–90 days if available)
- **Existing on-call setup** — who responds to alerts?
- **Deployment frequency** — how often does the team ship?
- **Any existing SLAs** with customers — these constrain SLO targets
## Key Definitions
Always establish these before writing the SLO:
| Term | Definition |
|---|---|
| **SLI** (Service Level Indicator) | The metric being measured — e.g. "% of requests completing successfully in <500ms" |
| **SLO** (Service Level Objective) | The target for that metric — e.g. "99.5% of requests" |
| **SLA** (Service Level Agreement) | The contractual commitment to customers — must be looser than the SLO |
| **Error budget** | The allowed headroom below 100% — the budget for planned and unplanned downtime |
| **Burn rate** | How fast the error budget is being consumed |
---
## Output Format
---
# SLO Document: [Service Name]
**Service:** [Name] | **Team:** [Team name]
**Owner:** [Name / role] | **Approved by:** [Name]
**Effective date:** [Date] | **Review date:** [Date + 3 months]
**Version:** [1.0]
---
## Why This SLO Exists
[2–3 sentences. What reliability problem are we solving? What was happening before this SLO that made us need it? What decision-making does this SLO enable?]
---
## Service Overview
**What this service does:** [One sentence]
**Who depends on it:** [Internal teams / external customers / both — describe]
**Critical user journeys protected by this SLO:**
1. [Journey 1 — e.g. "User completes a payment"]
2. [Journey 2]
3. [Journey 3]
---
## SLIs — What We Measure
Define one SLI per user journey or reliability dimension. Keep it to 3–5 SLIs maximum.
### SLI 1: [Name — e.g. Request Success Rate]
| Field | Detail |
|---|---|
| **What it measures** | [e.g. "% of API requests that return a non-5xx response"] |
| **Good event definition** | [e.g. "HTTP response with status 2xx or 4xx, completed within 500ms"] |
| **Bad event definition** | [e.g. "HTTP response with status 5xx, or any response taking >500ms"] |
| **Measurement source** | [e.g. "Application load balancer access logs / Datadog APM / Prometheus"] |
| **Measured over** | Rolling 28-day window |
| **Exclusions** | [e.g. "Health check endpoints excluded / Requests during planned maintenance excluded"] |
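
To make the good/bad event split concrete, the sketch below computes a success-rate SLI from a list of request records using the example definitions: a request is bad if it returns 5xx or takes longer than 500ms, and health-check requests are excluded. The `Request` record, the `/health` path, and the threshold constant are assumptions for illustration; real data would come from the measurement source named in the table.

```python
from dataclasses import dataclass

LATENCY_THRESHOLD_MS = 500.0          # assumed threshold from the example definition
EXCLUDED_PATHS = frozenset({"/health"})  # assumed health-check path


@dataclass(frozen=True)
class Request:
    status: int
    latency_ms: float
    path: str = "/"


def is_good(request: Request) -> bool:
    """Good means not bad: no 5xx status and completed within the latency threshold."""
    return request.status < 500 and request.latency_ms <= LATENCY_THRESHOLD_MS


def sli_percent(requests: list[Request]) -> float:
    """Percentage of eligible (non-excluded) requests that were good events."""
    eligible = [r for r in requests if r.path not in EXCLUDED_PATHS]
    if not eligible:
        raise ValueError("no eligible requests in the window")
    good = sum(1 for r in eligible if is_good(r))
    return 100.0 * good / len(eligible)


if __name__ == "__main__":
    window = [
        Request(200, 120.0),
        Request(404, 40.0),
        Request(500, 15.0),
        Request(200, 800.0),
        Request(500, 5.0, path="/health"),  # excluded from the SLI
    ]
    print(f"{sli_percent(window):.1f}%")
```

Note that this sketch counts 3xx responses as good because they match neither part of the bad-event definition; state that choice explicitly in the SLI if redirects matter for the service.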
### SLI 2: [Name — e.g. Latency]
| Field | Detail |
|---|---|
| **What it measures** | [e.g. "% of requests to the /checkout endpoint that complete within 500ms (a 99% target is equivalent to P99 ≤500ms)"] |
| **Good event definition** | [e.g. "Request completes in ≤500ms"] |
| **Bad event definition** | [e.g. "Request takes >500ms"] |
| **Measurement source** | [Source] |
| **Measured over** | Rolling 28-day window |
| **Exclusions** | [Any exclusions] |
### SLI 3: [Name — e.g. Data Freshness / Queue Depth / etc.]
[Same structure]
---
## SLO Targets
| SLI | Target | Window | Error Budget |
|---|---|---|---|
| [SLI 1 name] | [X]% | 28-day rolling | [100 - X]% = [Y minutes per 28 days] |
| [SLI 2 name] | [X]% | 28-day rolling | [100 - X]% = [Y minutes per 28 days] |
| [SLI 3 name] | [X]% | 28-day rolling | [100 - X]% = [Y minutes per 28 days] |
**How targets were set:**
- Historical baseline (last 90 days): [X]%
- Target is set [above / at] historical baseline to [improve reliability / reflect current reality while formalising the commitment]
- Rationale: [1–2 sentences]
**Why 100% is NOT the target:** [Brief explanation of why targeting 100% is counterproductive — it discourages feature development and doesn't reflect user reality]
---
## Error Budget Calculation
**For SLI 1 ([Name]), at [X]% target:**
```
Error budget = (100% - SLO target) × measurement window
= (100% - [X]%) × 28 days × 24 hours × 60 minutes
= [Y]% × [Z total minutes]
= [N] minutes of allowed failure per 28-day window
```
**In plain terms:** We can afford [N] minutes of [bad events] in any rolling 28-day window before we breach the SLO.
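
As a worked instance of the formula, a 99.5% target over a 28-day window (40,320 minutes) leaves 0.5% × 40,320 = 201.6 minutes of budget. The helper below is a minimal sketch of the same arithmetic; the function name is illustrative.

```python
def error_budget_minutes(slo_target_percent: float, window_days: int = 28) -> float:
    """Minutes of allowed failure: (100% - target) applied to the whole window."""
    if not 0.0 < slo_target_percent < 100.0:
        raise ValueError("SLO target must be strictly between 0 and 100 percent")
    window_minutes = window_days * 24 * 60
    return (100.0 - slo_target_percent) / 100.0 * window_minutes


if __name__ == "__main__":
    for target in (99.0, 99.5, 99.9):
        print(f"{target}% over 28 days -> {error_budget_minutes(target):.1f} minutes")
```

The time-based figure is most meaningful for availability-style SLIs; for request-based SLIs the same percentage applies to the count of eligible requests rather than to minutes.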
---
## Burn Rate Alerts
Burn rate = how fast the error budget is being consumed relative to the budget window.
A burn rate of 1 = consuming the budget at exactly the rate that would exhaust it over 28 days.
| Alert | Burn rate | Window | Severity | Response |
|---|---|---|---|---|
| Page (critical) | >14× | 1 hour | P1 | Page on-call immediately — budget exhausted in <2 days |
| Page (high) | >6× | 6 hours | P2 | Page on-call — budget exhausted in <5 days |
| Ticket (warning) | >3× | 3 days | P3 | Create ticket — review at next team meeting |
| Info | >1× | 28 days | Info | Log only — budget on track to exhaust by end of window |
**Alert implementation:** [Link to alert config in monitoring tool — e.g. Datadog, Prometheus/Alertmanager, Grafana]
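
Time to exhaustion follows directly from the burn-rate definition: a full budget lasts the window length divided by the burn rate, so on a 28-day window a sustained 14× burn lasts 48 hours and a 6× burn lasts about 4.7 days. The sketch below shows that arithmetic along with the share of budget consumed during an alert window; the function names are illustrative.

```python
def hours_to_exhaustion(burn_rate: float, window_days: int = 28, budget_remaining: float = 1.0) -> float:
    """Hours until the remaining budget (fraction 0-1) is gone at a constant burn rate."""
    if burn_rate <= 0:
        raise ValueError("burn rate must be positive")
    if not 0.0 <= budget_remaining <= 1.0:
        raise ValueError("budget_remaining must be a fraction between 0 and 1")
    return window_days * 24 * budget_remaining / burn_rate


def budget_fraction_consumed(burn_rate: float, alert_window_hours: float, window_days: int = 28) -> float:
    """Fraction of the total budget burned during one alert evaluation window."""
    return burn_rate * alert_window_hours / (window_days * 24)


if __name__ == "__main__":
    for rate, alert_hours in ((14, 1), (6, 6), (3, 72)):
        remaining = hours_to_exhaustion(rate)
        consumed = budget_fraction_consumed(rate, alert_hours)
        print(f"{rate}x: exhausted in {remaining:.0f} h, {consumed:.1%} burned per alert window")
```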
---
## Error Budget Policy
This policy defines what to do with the error budget — both when it's healthy and when it's burning.
### When budget is healthy (>50% remaining)
- Feature development and deployments proceed at normal pace
- The team may take on riskier experiments
- Reliability improvements are scheduled but not urgent
### When budget is at risk (25–50% remaining)
- Deployment frequency reduced — team ships only well-tested changes
- One reliability improvement added to current sprint
- Weekly error budget review added to team standup
### When budget is nearly exhausted (<25% remaining)
- Feature work paused in favour of reliability improvements
- No new deployments without explicit on-call approval
- Daily review of error budget burn rate
- CSM / support notified to manage customer expectations
### When budget is exhausted (0% remaining — SLO breached)
- All feature work stops
- On-call engineer and engineering manager notified immediately
- Post-incident review (PIR) required within 5 business days
- SLO target may be temporarily relaxed (with stakeholder approval) while root cause is addressed
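
Because the policy is driven by fixed thresholds, the current state can be derived from the remaining budget automatically, for example in a dashboard or a weekly report. The sketch below assumes the four bands of this policy (above 50%, 25 to 50%, below 25%, and 0%); the function name and state labels are illustrative.

```python
def budget_policy_state(remaining_percent: float) -> str:
    """Map remaining error budget (percent of the window's budget) to a policy state."""
    if remaining_percent > 100.0:
        raise ValueError("remaining budget cannot exceed 100 percent")
    if remaining_percent <= 0.0:
        return "exhausted"          # SLO breached: all feature work stops
    if remaining_percent < 25.0:
        return "nearly exhausted"   # feature work paused, deploys need approval
    if remaining_percent <= 50.0:
        return "at risk"            # reduced deployment frequency
    return "healthy"                # normal pace


if __name__ == "__main__":
    for remaining in (80.0, 50.0, 24.9, 0.0):
        print(f"{remaining}% remaining -> {budget_policy_state(remaining)}")
```

Negative values are treated as exhausted, since an overspent budget is still a breached SLO.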
---
## Dashboard and Reporting
**SLO dashboard:** [Link to Datadog / Grafana / etc. dashboard]
**Metrics exposed:**
- Current SLO compliance (rolling 28-day)
- Error budget remaining (% and minutes)
- Burn rate (current and trend)
- Incident count and MTTR this window
**Reporting cadence:**
| Audience | Frequency | Format |
|---|---|---|
| Engineering team | Weekly | Slack summary — #[service]-slo |
| Engineering manager | Monthly | SLO review meeting |
| Stakeholders / customers | Quarterly | SLO compliance summary |
---
## Exclusions and Edge Cases
**Planned maintenance:** Error budget is not consumed during pre-announced maintenance windows. Maintenance must be communicated [X hours] in advance via [channel].
**Dependency failures:** If SLO breach is caused by an upstream dependency outside our control, document it — but it still counts against our error budget (our users don't distinguish between our failures and our dependencies' failures).
**Force majeure:** [Policy for cloud provider outages, major infrastructure events]
---
## SLO Review Cadence
| Review | When | Who | Output |
|---|---|---|---|
| Error budget review | Weekly | Team | Budget health check — adjust if burning fast |
| SLO target review | Quarterly | Team + EM | Adjust targets if baseline has shifted significantly |
| Annual SLO audit | Annually | Team + Stakeholders | Review SLIs — are we measuring the right things? |
**When to change the SLO target:**
- Historical baseline has improved significantly and target no longer reflects real reliability
- User feedback indicates the target is misaligned with what users actually experience
- The SLO is being gamed (metric is healthy but users are unhappy)
---
## Quality Checks
- [ ] SLIs are user-facing — they measure what users experience, not internal system metrics
- [ ] Good and bad events are precisely defined — no ambiguity about what counts
- [ ] Targets are based on historical data, not aspirational round numbers
- [ ] Error budget policy has clear triggers and clear actions — not "discuss as a team"
- [ ] Burn rate alerts have different windows to catch both fast burns and slow burns
- [ ] Exclusions are documented so they don't silently inflate the SLO number
## Anti-Patterns
- [ ] Do not set SLO targets at 100% — this discourages feature development and does not reflect how users experience reliability
- [ ] Do not measure internal system metrics as SLIs — SLIs must reflect what users directly experience, not internal CPU or memory
- [ ] Do not write an error budget policy with vague triggers — "discuss as a team" is not an actionable policy; triggers must be specific percentages
- [ ] Do not base targets on aspirational round numbers — always derive from historical baseline data
- [ ] Do not configure only one burn-rate alert window — a short window catches fast burns and a long window catches slow burns, so a single window leaves one of the two to exhaust the budget quietly
@@ -0,0 +1,271 @@
---
name: sprint-velocity-analysis
description: "Analyze sprint velocity data and produce an engineering team health report covering delivery trends, capacity utilization, and improvement recommendations. Use when asked to analyze sprint velocity, review team delivery health, identify delivery risks, or produce a retrospective data analysis. Produces a velocity trend analysis, health diagnosis table, top improvement recommendations with implementation steps, and a next-sprint capacity forecast."
---
# Sprint Velocity Analysis
Analyze sprint velocity data to produce an honest engineering team health report. The goal is not to generate optimistic-looking charts — it is to surface delivery patterns, identify dysfunction early, and give the team and their manager actionable recommendations. Look for: velocity trends (improving, declining, flat, erratic), story point calibration consistency, carry-over patterns that indicate chronic over-commitment, and capacity-related signals. Produce text-based trend visualizations, a health diagnosis, and specific improvement recommendations with measurable targets.
## Required Inputs
Ask for these if not already provided:
- **Sprint history** — for each sprint: sprint name/number, committed story points, completed story points, and number of items carried over to next sprint; ideally 6–8 sprints minimum
- **Team size and any changes** — current team size and any additions or departures during the data window
- **Known disruptions** — holidays, company all-hands, on-call incidents, or other events that affected specific sprints
- **Cycle time data (optional)** — if available, p50 and p90 cycle time per sprint (time from start to done)
- **Definition of Done** — what "completed" means for this team (merged to main? deployed to prod? accepted by PO?)
If cycle time data is not provided, omit that section and note it as a recommended data source to add.
## Output Format
---
# Sprint Velocity Analysis: [Team Name]
**Analysis period:** Sprint [N] through Sprint [N+7] ([Date range])
**Team size:** [X engineers] ([note any changes during period])
**Report date:** [Date]
**Data source:** [Where this data came from — Jira, Linear, spreadsheet, etc.]
---
## Velocity Trend
### Raw Data
| Sprint | Committed | Completed | Completion Rate | Carried Over | Notes |
|--------|-----------|-----------|----------------|--------------|-------|
| [Sprint N] | [X pts] | [X pts] | [X%] | [X pts / X items] | [disruption or context] |
| [Sprint N+1] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+2] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+3] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+4] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+5] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+6] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| [Sprint N+7] | [X pts] | [X pts] | [X%] | [X pts / X items] | |
| **Average** | **[X pts]** | **[X pts]** | **[X%]** | **[X pts]** | |
### Velocity Chart (Completed Points per Sprint)
```
Points
60 |
55 | ●
50 | ● ●
45 | ● ● ●
40 | ● ●
35 |
30 |
+--+--+--+--+--+--+--+--
N N+1 N+2 N+3 N+4 N+5 N+6 N+7
Sprint
● = Completed points — = Average ([X pts])
```
Generate this chart using ASCII characters based on the actual data provided. Scale the Y-axis to the data range. Plot completed (not committed) points. Mark the average as a dashed line.
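One way to produce such a chart from real data is sketched below. It is an illustration under stated assumptions, not part of the skill: the function name, the 5-point row spacing, and the use of an ASCII `-` for the average line are all choices made here.

```python
def velocity_chart(completed: list[int], step: int = 5) -> str:
    """Return a text chart of completed points, one column per sprint."""
    if not completed:
        raise ValueError("need at least one sprint")

    def snap(value: float) -> int:
        # Round to the nearest multiple of `step` so each point lands on a row.
        return round(value / step) * step

    average = sum(completed) / len(completed)
    average_row = snap(average)
    top = snap(max(completed)) + step              # one empty row of headroom
    bottom = max(0, snap(min(completed)) - step)   # scale the Y-axis to the data

    lines = ["Points"]
    for level in range(top, bottom - 1, -step):
        filler = "-" if level == average_row else " "   # dashed average line
        cells = ["●" if snap(v) == level else filler for v in completed]
        lines.append((f"{level:>4} | " + (filler * 3).join(cells)).rstrip())
    lines.append(" " * 5 + "+" + "-" * (4 * len(completed)))
    labels = ["N"] + [f"N+{i}" for i in range(1, len(completed))]
    lines.append(" " * 7 + "".join(f"{label:<4}" for label in labels).rstrip())
    lines.append(f"● = Completed points   - = Average ({average:.1f} pts)")
    return "\n".join(lines)


print(velocity_chart([40, 45, 50, 45]))
```

Values are snapped to the nearest row, so the chart shows the shape of the trend rather than exact figures; the Raw Data table remains the source of truth.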
### Trend Diagnosis
| Metric | Value | Interpretation |
|--------|-------|----------------|
| Average velocity | [X pts/sprint] | [Baseline for planning] |
| Velocity std deviation | [±X pts] | [Low < 15% of avg = stable; High > 25% = erratic] |
| Trend direction | [Improving / Flat / Declining / Erratic] | [3-sprint trailing average vs. 3-sprint leading average] |
| Average completion rate | [X%] | [Healthy: 80–95%; < 75% = chronic over-commitment] |
| Carry-over rate | [X% of committed points carried over per sprint] | [Healthy: < 15%; > 25% = systemic issue] |
| Sprints with completion rate < 75% | [X of 8 sprints] | [> 3 of 8 = structural problem, not noise] |
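The metrics in this table can be computed directly from the Raw Data columns. The sketch below is one possible reading, with several assumptions: it uses the population standard deviation, averages per-sprint completion rates rather than pooling totals, and treats a change of more than 10% between the first three and last three sprints as a real trend (`flat_band` is a hypothetical parameter; the template does not fix a threshold).

```python
from statistics import mean, pstdev


def trend_diagnosis(committed, completed, carried_pts, flat_band=0.10):
    """Compute Trend Diagnosis metrics from per-sprint point totals."""
    if not (len(committed) == len(completed) == len(carried_pts)) or len(completed) < 3:
        raise ValueError("need three aligned lists covering at least 3 sprints")

    average = mean(completed)
    spread = pstdev(completed) / average           # std dev as a share of the average
    rates = [done / plan for done, plan in zip(completed, committed)]
    carry = [pts / plan for pts, plan in zip(carried_pts, committed)]
    leading, trailing = mean(completed[:3]), mean(completed[-3:])
    change = (trailing - leading) / leading

    if spread > 0.25:                              # > 25% of average = erratic
        direction = "Erratic"
    elif change > flat_band:
        direction = "Improving"
    elif change < -flat_band:
        direction = "Declining"
    else:
        direction = "Flat"

    return {
        "average_velocity": average,
        "stability": "stable" if spread < 0.15 else "erratic" if spread > 0.25 else "moderate",
        "direction": direction,
        "avg_completion_rate": mean(rates),
        "carry_over_rate": mean(carry),
        "sprints_below_75": sum(1 for rate in rates if rate < 0.75),
    }


report = trend_diagnosis(
    committed=[50] * 8,
    completed=[40, 42, 44, 45, 46, 48, 50, 50],
    carried_pts=[10, 8, 6, 5, 4, 2, 0, 0],
)
print(report)
```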
---
## Story Point Calibration
Story points are only useful if they are applied consistently. Look for these calibration signals in the data:
| Signal | Observed | Interpretation |
|--------|----------|----------------|
| High variance in velocity despite stable team size | [Yes / No] | Suggests inconsistent estimation — same effort scored differently week to week |
| Consistent over-commitment (committed >> completed) | [Yes / No — by avg X pts per sprint] | Team is under-estimating work or ignoring historical capacity |
| Consistent under-commitment (completed >> committed by > 20%) | [Yes / No] | Team is over-padding estimates or pulling in unplanned work frequently |
| Frequent large items (> 13 pts) in carry-over | [Yes / No] | Items are too large to estimate reliably — need better decomposition |
| Velocity cliff after team change | [Yes / No — Sprint N+X] | Team did not re-baseline capacity after composition changed |
**Calibration verdict:** [Well-calibrated / Needs recalibration / Severely uncalibrated — one sentence explanation tied to the signals above]
**If recalibration is needed:** [Specific recommendation — e.g., "Run a calibration session using the last 20 completed items, re-score them as a team, and use the resulting relative sizes to anchor future estimates."]
---
## Carry-Over Pattern Analysis
Carry-over is the most reliable leading indicator of commitment reliability problems.
| Sprint | Carried-Over Items | Common Themes in Carry-Over |
|--------|-------------------|----------------------------|
| [Sprint N] | [X items / X pts] | [Technical debt, dependency blocked, scoped wrong, etc.] |
| [Sprint N+1] | [X items / X pts] | [Theme] |
| [Sprint N+2] | [X items / X pts] | [Theme] |
**Carry-over root causes identified:**
- [Root cause 1: e.g., "5 of 12 carry-overs were blocked on a third-party API integration — external dependency, not estimation failure"]
- [Root cause 2: e.g., "4 of 12 carry-overs were items estimated at 8+ points that were later found to be 2–3x larger than expected"]
- [Root cause 3: e.g., "3 of 12 carry-overs were interruptions from on-call incidents consuming unplanned capacity"]
---
## Capacity Utilization
| Sprint | Team Size | Available Capacity (pts) | Committed | Utilization % | Disruptions |
|--------|-----------|--------------------------|-----------|--------------|-------------|
| [Sprint N] | [X engineers] | [X pts] | [X pts] | [X%] | [Holiday / incident / none] |
| [Sprint N+1] | [X engineers] | [X pts] | [X pts] | [X%] | |
**Capacity calculation used:** [X engineers × Y pts/person/sprint = Z pts available. Adjust: if team capacity changed during the window, note which sprints used which team size.]
**Average utilization:** [X%]
**Utilization interpretation:** [< 70% = team is under-loaded or over-padding | 70–90% = healthy range | > 90% = no slack for unplanned work — fragile]
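The capacity formula and the utilization bands above reduce to a few lines of arithmetic. A minimal sketch, assuming a fixed points-per-person figure for the sprint; the function name and band labels are illustrative:

```python
def utilization(engineers: int, pts_per_person: float, committed: float) -> tuple[float, str]:
    """Return (utilization %, interpretation) for one sprint."""
    available = engineers * pts_per_person        # X engineers x Y pts/person/sprint
    if available <= 0:
        raise ValueError("available capacity must be positive")
    pct = 100 * committed / available
    if pct < 70:
        band = "under-loaded or over-padding"
    elif pct <= 90:
        band = "healthy"
    else:
        band = "fragile: no slack for unplanned work"
    return pct, band


print(utilization(engineers=5, pts_per_person=10, committed=40))
```

If team size changed mid-window, call the function once per sprint with that sprint's headcount rather than averaging the headcount first.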
---
## Health Diagnosis
| Dimension | Score | Evidence | Priority |
|-----------|-------|----------|----------|
| Delivery predictability | [Green / Yellow / Red] | [Average completion rate X%, std dev Y pts] | [High / Med / Low] |
| Commitment accuracy | [Green / Yellow / Red] | [Team over-commits by avg X pts/sprint] | |
| Estimation consistency | [Green / Yellow / Red] | [Velocity std dev ±X pts, calibration verdict] | |
| Carry-over hygiene | [Green / Yellow / Red] | [X% carry-over rate, root causes] | |
| Capacity management | [Green / Yellow / Red] | [Avg utilization X%, disruption handling] | |
| Trend direction | [Green / Yellow / Red] | [Trailing 3-sprint avg vs. leading 3-sprint avg] | |
**Scoring guide:** Green = operating within healthy range; Yellow = marginal — watch closely or single-sprint anomaly; Red = chronic issue requiring active intervention.
**Overall health:** [Green / Yellow / Red] — [One sentence summary: "The team delivers consistently at X pts/sprint but chronic over-commitment is eroding morale and creating a misleading picture for stakeholders."]
---
## Blocker Frequency Analysis
If blocker data was provided, complete this section. If not, note it as a recommended tracking addition.
| Blocker Category | Frequency (last 8 sprints) | Avg Days Blocked | Impact (pts delayed) |
|-----------------|--------------------------|------------------|---------------------|
| External dependency | [X occurrences] | [X days] | [X pts] |
| Technical debt / rework | [X occurrences] | [X days] | [X pts] |
| Unclear requirements | [X occurrences] | [X days] | [X pts] |
| On-call interruptions | [X occurrences] | [X days] | [X pts] |
| Environment / tooling | [X occurrences] | [X days] | [X pts] |
**Top blocker to address:** [Name the single highest-impact blocker category and what addressing it would mean for velocity.]
---
## Improvement Recommendations
Provide 3 specific recommendations ordered by expected impact. Each recommendation must include a measurable success target and implementation steps.
### Recommendation 1: [Title]
**Problem it addresses:** [Which health dimension is Red or Yellow, and what the data shows]
**What to do:**
1. [Specific action step — concrete enough that a tech lead can assign it]
2. [Next step]
3. [Next step]
**Who owns it:** [Tech lead / Engineering manager / Whole team]
**When to start:** [This sprint / Next sprint / Within 2 weeks]
**Measurable target:** [e.g., "Carry-over rate drops below 15% within 3 sprints" or "Completion rate above 80% for 4 consecutive sprints"]
**How to know it's working:** [Leading indicator to watch before the outcome metric improves — e.g., "Carry-over items decreasing sprint-over-sprint even before the target is hit"]
---
### Recommendation 2: [Title]
**Problem it addresses:** [Health dimension and evidence]
**What to do:**
1. [Step]
2. [Step]
3. [Step]
**Who owns it:** [Role]
**When to start:** [Timing]
**Measurable target:** [Specific metric and timeframe]
**How to know it's working:** [Leading indicator]
---
### Recommendation 3: [Title]
**Problem it addresses:** [Health dimension and evidence]
**What to do:**
1. [Step]
2. [Step]
**Who owns it:** [Role]
**When to start:** [Timing]
**Measurable target:** [Specific metric and timeframe]
**How to know it's working:** [Leading indicator]
---
## Next-Sprint Capacity Forecast
**Next sprint:** [Sprint N+8]
**Known team size:** [X engineers]
**Known capacity reducers:** [PTO: X days total, on-call rotation: ~Y pts of unplanned capacity, etc.]
| Factor | Impact |
|--------|--------|
| Base capacity (historical average) | [X pts] |
| PTO / planned absences | [X pts] |
| On-call overhead (estimate) | [X pts] |
| Carry-over from Sprint [N+7] | +[X pts committed capacity already spoken for] |
| **Recommended commitment ceiling** | **[X pts]** |
**Confidence:** [High — stable team and known capacity | Medium — some uncertainty in disruption level | Low — team composition uncertain]
**Recommendation for planning:** [One sentence — e.g., "Plan Sprint [N+8] to a ceiling of X pts. Given the carry-over items, prioritize completing those before pulling in new scope."]
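The forecast table is a subtraction from the historical average. The sketch below shows one interpretation: PTO and on-call overhead lower the commitment ceiling, and carry-over then uses up part of that ceiling, leaving less room for new scope. Expressing all reducers in points, and handling carry-over this way, are assumptions of the sketch.

```python
def capacity_forecast(base_pts: float, pto_pts: float, on_call_pts: float,
                      carry_over_pts: float) -> dict[str, float]:
    """Recommended commitment ceiling and the room left for new scope."""
    ceiling = max(0.0, base_pts - pto_pts - on_call_pts)
    new_scope = max(0.0, ceiling - carry_over_pts)   # carry-over is already spoken for
    return {"commitment_ceiling": ceiling, "room_for_new_scope": new_scope}


print(capacity_forecast(base_pts=45, pto_pts=5, on_call_pts=4, carry_over_pts=6))
```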
---
## Cycle Time Distribution (if data provided)
| Sprint | p50 Cycle Time | p90 Cycle Time | Items Completed |
|--------|---------------|---------------|-----------------|
| [Sprint N] | [X days] | [X days] | [X items] |
| [Average] | [X days] | [X days] | |
**Cycle time interpretation:** [p90 > 2× p50 indicates a long-tail of stuck items that deserve investigation. p50 increasing over time indicates slowing throughput independent of story point changes.]
If cycle time data was not provided: *Cycle time data was not included in this analysis. Recommend adding p50 and p90 cycle time per sprint to your tracking to detect throughput issues that story points alone cannot reveal.*
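When raw per-item cycle times are available, p50 and p90 can be derived with a nearest-rank percentile. This is a sketch under that assumption; trackers may use an interpolating method and report slightly different values.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of items at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


cycle_days = [1, 2, 2, 3, 3, 4, 5, 6, 9, 14]   # illustrative data, not from a real team
p50, p90 = percentile(cycle_days, 50), percentile(cycle_days, 90)
long_tail = p90 > 2 * p50                       # the check described above
print(f"p50={p50} days, p90={p90} days, long tail: {long_tail}")
```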
---
## Quality Checks
- [ ] Velocity chart is generated from the actual data provided — not a generic placeholder chart
- [ ] Trend diagnosis states a direction (Improving / Flat / Declining / Erratic) with a quantitative basis (trailing vs. leading average)
- [ ] Carry-over root causes are specific categories with counts — not a generic observation that carry-over exists
- [ ] Each of the 3 recommendations includes a named owner, a start date, and a measurable target with a timeframe
- [ ] Next-sprint capacity forecast uses historical average as the baseline and deducts specific known reducers
- [ ] Health diagnosis table uses Red/Yellow/Green with evidence cited in the Evidence column — no unsupported scores
- [ ] If metrics are missing (cycle time, blocker log), the report explicitly calls them out as recommended additions
## Anti-Patterns
- [ ] Do not generate the velocity chart from placeholder data — it must reflect the actual sprint data provided
- [ ] Do not diagnose trend direction without computing trailing vs leading averages — "it looks like it's declining" is not a diagnosis
- [ ] Do not list carry-over as a generic observation — identify root cause categories with counts for the analysis to be actionable
- [ ] Do not produce recommendations without a named owner, a start date, and a measurable target
- [ ] Do not score health dimensions without citing evidence in the Evidence column — unsupported Red/Yellow/Green scores are not credible
@@ -0,0 +1,141 @@
---
name: system-design-interview
description: "Structure a complete system design answer for interview questions or real architecture sessions. Use when asked to design a system, answer a system design interview question, or architect a solution at scale. Produces a structured answer covering requirements, capacity estimates, high-level design, component deep-dives, trade-offs, and follow-up considerations."
---
# System Design Interview Skill
Structures a complete, interview-grade system design response — covering clarifying questions, requirements, capacity estimates, architecture, component design, and trade-offs. Works equally well for real architecture sessions.
## Required Inputs
Ask for these if not provided:
- **The system to design** (e.g. "design a URL shortener", "design a notification service", "design Twitter's feed")
- **Scope** (interview prep / real architecture decision / practice run)
- **Scale target** (rough numbers: DAU, requests/sec, data volume — or "assume typical web scale")
- **Constraints or priorities** (e.g. prioritise availability over consistency, minimise cost, low-latency reads)
- **Time available** (interview context only: 30 / 45 / 60 minutes — skip for real architecture sessions)
- **Emphasis** (optional — any area to go deeper on, e.g. "focus on the DB design" or "spend more time on scaling")
## Output Format
### 1. Clarifying Questions
Before designing, list 4–6 questions that would change the design. Examples:
- Read-heavy or write-heavy? (affects caching and DB choice)
- Global or single-region? (affects latency requirements)
- Strong or eventual consistency? (affects storage and replication)
- Acceptable latency targets? (p50 / p99)
- Any existing infrastructure constraints?
Then proceed with stated assumptions if answering an interview question.
### 2. Functional Requirements
**Core features (must have):**
- [Feature 1]
- [Feature 2]
- [Feature 3]
**Out of scope (for this design):**
- [What's deliberately excluded and why]
### 3. Non-Functional Requirements
| Requirement | Target |
|---|---|
| Availability | [e.g. 99.9% / 99.99%] |
| Latency | [e.g. p95 < 100ms for reads] |
| Throughput | [e.g. 10k writes/sec peak] |
| Consistency | [Strong / Eventual] |
| Durability | [e.g. 99.999% — no data loss] |
### 4. Capacity Estimation
**Traffic:**
- DAU: [X]
- Reads/sec: [X] (peak: [X])
- Writes/sec: [X] (peak: [X])
**Storage:**
- Per record size: [X bytes]
- Records per day: [X]
- 5-year storage: [X GB/TB]
**Bandwidth:**
- Inbound: [X MB/s]
- Outbound: [X MB/s]
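The estimates are easier to sanity-check when the arithmetic is written out. A minimal sketch of the chain from DAU to requests per second, storage, and bandwidth; the per-user request counts, the peak factor of 2, and decimal units (1 GB = 10^9 bytes) are assumptions to replace with the numbers agreed during clarification:

```python
SECONDS_PER_DAY = 86_400


def capacity_estimate(dau: int, reads_per_user: float, writes_per_user: float,
                      record_bytes: int, peak_factor: float = 2.0, years: int = 5) -> dict:
    """Back-of-envelope traffic, storage, and bandwidth figures."""
    reads_per_sec = dau * reads_per_user / SECONDS_PER_DAY
    writes_per_sec = dau * writes_per_user / SECONDS_PER_DAY
    records_per_day = dau * writes_per_user
    storage_bytes = records_per_day * record_bytes * 365 * years
    return {
        "reads_per_sec": reads_per_sec,
        "peak_reads_per_sec": reads_per_sec * peak_factor,
        "writes_per_sec": writes_per_sec,
        "peak_writes_per_sec": writes_per_sec * peak_factor,
        "records_per_day": records_per_day,
        "storage_gb": storage_bytes / 1e9,
        "inbound_mb_per_sec": writes_per_sec * record_bytes / 1e6,
        "outbound_mb_per_sec": reads_per_sec * record_bytes / 1e6,
    }


estimate = capacity_estimate(dau=1_000_000, reads_per_user=10, writes_per_user=1, record_bytes=500)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```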
### 5. High-Level Architecture
Draw an ASCII diagram specific to this system. Do not default to the client→CDN→LB→API→Cache→DB template unless it genuinely applies. Label each component with the specific technology chosen (e.g. "Kafka" not "Message Queue", "PostgreSQL" not "DB"). Describe each component in 1–2 sentences explaining its role and why that technology was chosen.
### 6. Component Deep-Dive
Pick the 2–3 most critical/interesting components and go deep:
**[Component 1: e.g. Database Layer]**
- Choice: [Technology and why — e.g. PostgreSQL for ACID guarantees, Cassandra for write throughput]
- Schema design (high-level): [Key tables/collections and their structure]
- Indexing strategy: [What gets indexed and why]
- Replication: [Primary-replica / Multi-primary — and why]
**[Component 2: e.g. Caching Strategy]**
- Cache type: [Redis / Memcached — and why]
- What gets cached: [Hot data — e.g. user sessions, frequent reads]
- Cache invalidation: [TTL / Write-through / Write-behind — trade-offs]
- Cache hit rate target: [e.g. 95%]
**[Component 3: e.g. API Design]**
- Key endpoints: [List the 3–5 most important API calls]
- Authentication: [JWT / OAuth / API keys]
- Rate limiting: [Where and at what rate]
### 7. Data Flow
Walk through the two most critical paths end-to-end:
**Write path:** [Step 1 → Step 2 → Step 3...]
**Read path:** [Step 1 → Step 2 → Step 3...]
### 8. Scaling Bottlenecks and Mitigations
| Bottleneck | Mitigation |
|---|---|
| [e.g. DB write throughput] | [e.g. sharding by user_id, write batching] |
| [e.g. Hot-key cache misses] | [e.g. local in-process cache, probabilistic early expiry] |
| [e.g. Single region latency] | [e.g. multi-region deployment, GeoDNS routing] |
### 9. Trade-offs and Alternatives
Be explicit about what was chosen and what was sacrificed:
| Decision | Why | Trade-off |
|---|---|---|
| [e.g. Eventual consistency] | [Higher availability, lower latency] | [Stale reads possible] |
| [e.g. SQL over NoSQL] | [Complex queries, ACID transactions] | [Harder to shard horizontally] |
| [e.g. Async processing via queue] | [Decoupled, more resilient] | [Eventual delivery, harder to debug] |
### 10. Follow-up Considerations
Things to tackle in production but out of scope for this design session:
- Monitoring and alerting (what metrics matter)
- Disaster recovery and backup strategy
- Security (auth, encryption at rest/transit, rate limiting)
- Cost optimisation at scale
- Gradual rollout and feature flagging
## Quality Checks
- [ ] Clarifying questions are design-changing (not generic filler)
- [ ] Capacity estimates show the arithmetic: DAU → requests/day → requests/sec → storage per record → total storage, so the numbers can be sanity-checked
- [ ] Every row in the Trade-offs table has a non-empty Trade-off column (no rows where the trade-off is blank or says "none")
- [ ] At least 2 component deep-dives with technology choices justified
- [ ] Trade-offs section is honest (not just benefits of chosen approach)
- [ ] Data flow is described end-to-end for the critical path
## Anti-Patterns
- [ ] Do not jump to solutions before clarifying requirements — always establish functional and non-functional requirements first
- [ ] Do not present a design without discussing trade-offs — every architecture decision has costs and benefits that must be acknowledged
- [ ] Do not use vague capacity estimates — show the actual calculation (QPS, storage bytes, bandwidth) not just "this handles scale"
- [ ] Do not design for unlimited scale by default — match the design to the requirements stated
- [ ] Do not skip the data model — a system design without entity definitions and data flow is incomplete
## Usage Examples
- "Help me answer a system design interview: [question]"
- "Design [system] for a system design interview"
- "How would I architect [system] at scale?"
- "I have a system design interview — the question is [X]"
- "Design a [URL shortener / chat system / notification service / feed]"
@@ -0,0 +1,298 @@
---
name: tech-radar
description: "Build a technology radar for an engineering team, categorizing technologies into Adopt/Trial/Assess/Hold quadrants following the ThoughtWorks Tech Radar format. Use when asked to create a tech radar, evaluate the team's technology landscape, categorize tools and frameworks, or establish a technology strategy. Produces a full tech radar with quadrant tables, individual blip rationales, a decision trail, and a maintenance process guide."
---
# Tech Radar
Produce a complete technology radar document for an engineering team. The radar gives the team a shared, explicit position on every significant technology in their stack — what to standardize on, what to experiment with, what to evaluate, and what to actively stop using. Follow the ThoughtWorks Tech Radar format: four quadrants (Techniques, Tools, Platforms, Languages & Frameworks) each with four rings (Adopt, Trial, Assess, Hold). Each technology entry ("blip") gets a ring assignment, a one-paragraph rationale, and a date. Include a decision trail showing what moved and why, and a maintenance process the team can run to keep the radar current.
## Required Inputs
Ask for these if not already provided:
- **Team or company name** — for the document header
- **Current tech stack** — list every significant technology, tool, language, and platform the team currently uses
- **Technologies under active evaluation** — tools or frameworks the team is currently trying or considering
- **Technologies to deprecate or move off** — anything the team wants to stop using or is actively migrating away from
- **Strategic technology bets** — any technologies the company has made a deliberate bet on (e.g., "we're all-in on Kubernetes" or "migrating to event-driven architecture")
- **Team context** — team size, product domain, and any constraints (regulatory, compliance, vendor lock-in concerns)
If a technology is mentioned without a ring placement, use the rationale inputs to determine the appropriate ring. When uncertain between two rings, ask.
## Output Format
---
# Technology Radar: [Team / Company Name]
**Edition:** [Month Year]
**Maintained by:** [Team Name / Architecture Guild / CTO Office]
**Review cadence:** Bi-annual (every 6 months)
**Next review:** [Month Year + 6 months]
---
## How to Read This Radar
This radar reflects [Team / Company Name]'s current thinking on technologies we use, evaluate, and retire. Use it to make consistent technology choices, onboard new engineers, and have structured conversations about the stack.
**Quadrants** categorize the type of technology:
| Quadrant | What belongs here |
|----------|------------------|
| **Techniques** | Methods, patterns, and practices (e.g., trunk-based development, event sourcing) |
| **Tools** | Software tools used in the development and delivery process (e.g., linters, CI systems, observability platforms) |
| **Platforms** | Infrastructure and hosting environments (e.g., AWS, Kubernetes, Snowflake) |
| **Languages & Frameworks** | Programming languages and application frameworks (e.g., Go, React, FastAPI) |
**Rings** express our recommendation:
| Ring | Meaning | What to do |
|------|---------|-----------|
| **Adopt** | Industry-proven, working well for us — our standard choice | Use by default for new work; no special justification needed |
| **Trial** | Worth pursuing — we are experimenting with it in limited production use | Use in a bounded context with architectural oversight; share learnings |
| **Assess** | Worth exploring — we have not used it in production yet | Spike, prototype, or research; do not use in production without a review |
| **Hold** | Do not start new work with this technology | Complete existing commitments; do not expand use; plan migration |
---
## Quadrant 1: Techniques
### Adopt
| Technology | Since | Notes |
|------------|-------|-------|
| [Technique name, e.g., Trunk-based development] | [Month Year] | [One sentence: why we adopted it and what it replaced] |
| [Technique name] | [Month Year] | [One sentence rationale] |
| [Technique name] | [Month Year] | [One sentence rationale] |
**[Technique name] — Adopt**
[One paragraph rationale. Explain what problem this technique solves, why it works well in your context, and what the team should know before applying it. Reference any internal experience — e.g., "We rolled this out across 8 services in 2024 and saw a 40% reduction in merge conflicts."]
[Repeat for each Adopt-ring technique.]
### Trial
| Technology | Since | Notes |
|------------|-------|-------|
| [Technique name] | [Month Year] | [One sentence: what we're testing and where] |
**[Technique name] — Trial**
[One paragraph. What are we trialing? In which teams or services? What hypothesis are we testing? What would cause us to move it to Adopt vs. Hold?]
### Assess
| Technology | Since | Notes |
|------------|-------|-------|
| [Technique name] | [Month Year] | [One sentence: why we're interested] |
**[Technique name] — Assess**
[One paragraph. Why is this interesting to us? What would we need to see to move it to Trial? Who is responsible for the assessment?]
### Hold
| Technology | Since | Notes |
|------------|-------|-------|
| [Technique name] | [Month Year] | [One sentence: why we're stopping and what replaces it] |
**[Technique name] — Hold**
[One paragraph. Why are we putting this on hold? What is the migration path? What is the target end-state for teams still using it?]
---
## Quadrant 2: Tools
### Adopt
| Technology | Since | Notes |
|------------|-------|-------|
| [Tool name, e.g., GitHub Actions] | [Month Year] | [One sentence rationale] |
| [Tool name] | [Month Year] | [One sentence rationale] |
**[Tool name] — Adopt**
[One paragraph rationale. Why is this our standard tool? What does it do well in our context? Any configuration or usage patterns the team should follow?]
[Repeat for each Adopt-ring tool.]
### Trial
| Technology | Since | Notes |
|------------|-------|-------|
| [Tool name] | [Month Year] | [One sentence: what we're testing] |
**[Tool name] — Trial**
[One paragraph rationale and trial scope.]
### Assess
| Technology | Since | Notes |
|------------|-------|-------|
| [Tool name] | [Month Year] | [One sentence: why we're evaluating it] |
**[Tool name] — Assess**
[One paragraph: what sparked interest, who is evaluating, and timeline.]
### Hold
| Technology | Since | Notes |
|------------|-------|-------|
| [Tool name] | [Month Year] | [One sentence: what replaces it] |
**[Tool name] — Hold**
[One paragraph: deprecation rationale and migration path.]
---
## Quadrant 3: Platforms
### Adopt
| Technology | Since | Notes |
|------------|-------|-------|
| [Platform name, e.g., AWS EKS] | [Month Year] | [One sentence rationale] |
| [Platform name] | [Month Year] | [One sentence rationale] |
**[Platform name] — Adopt**
[One paragraph. What does this platform provide? What are the boundaries of its use? Any internal golden-path setup the team should follow?]
[Repeat for each Adopt-ring platform.]
### Trial
| Technology | Since | Notes |
|------------|-------|-------|
| [Platform name] | [Month Year] | [One sentence: scope of trial] |
**[Platform name] — Trial**
[One paragraph rationale and trial boundaries.]
### Assess
| Technology | Since | Notes |
|------------|-------|-------|
| [Platform name] | [Month Year] | [One sentence: why we're exploring it] |
**[Platform name] — Assess**
[One paragraph assessment plan.]
### Hold
| Technology | Since | Notes |
|------------|-------|-------|
| [Platform name] | [Month Year] | [One sentence: migration target and timeline] |
**[Platform name] — Hold**
[One paragraph: what triggered the hold decision, migration target, and timeline.]
---
## Quadrant 4: Languages & Frameworks
### Adopt
| Technology | Since | Notes |
|------------|-------|-------|
| [Language/Framework, e.g., Go] | [Month Year] | [One sentence rationale] |
| [Language/Framework] | [Month Year] | [One sentence rationale] |
**[Language/Framework] — Adopt**
[One paragraph. What is this language or framework used for? What are the team's proficiency expectations? Any frameworks or libraries that go alongside it as part of the standard choice?]
[Repeat for each Adopt-ring language or framework.]
### Trial
| Technology | Since | Notes |
|------------|-------|-------|
| [Language/Framework] | [Month Year] | [One sentence: bounded use case] |
**[Language/Framework] — Trial**
[One paragraph rationale.]
### Assess
| Technology | Since | Notes |
|------------|-------|-------|
| [Language/Framework] | [Month Year] | [One sentence: interest driver] |
**[Language/Framework] — Assess**
[One paragraph assessment plan.]
### Hold
| Technology | Since | Notes |
|------------|-------|-------|
| [Language/Framework] | [Month Year] | [One sentence: reason and migration path] |
**[Language/Framework] — Hold**
[One paragraph: deprecation rationale, existing system obligations, and timeline to retire.]
---
## Decision Trail
This log records every ring movement since the radar's first edition. Use it to understand the evolution of our technology choices.
| Technology | Quadrant | Previous Ring | New Ring | Edition | Reason |
|------------|----------|--------------|----------|---------|--------|
| [Name] | [Quadrant] | — | Adopt | [Month Year] | First placement — [one sentence why] |
| [Name] | [Quadrant] | Assess | Trial | [Month Year] | [What prompted the move — evidence, team feedback, production trial results] |
| [Name] | [Quadrant] | Trial | Adopt | [Month Year] | [Adoption rationale — usage results, team satisfaction, scale proven] |
| [Name] | [Quadrant] | Adopt | Hold | [Month Year] | [Why moved to Hold — better alternative, security concern, cost, vendor issue] |
| [Name] | [Quadrant] | — | Hold | [Month Year] | First placement — added directly to Hold because [reason] |
---
## Radar Maintenance Process
### Who Contributes
- **Architecture review group / CTO office** — final ring placement decisions
- **All engineers** — submit blip nominations via [channel or form]
- **Tech leads** — triage nominations and prepare proposals for review sessions
### Update Cadence
| Activity | Frequency | Owner |
|----------|-----------|-------|
| New blip nominations accepted | Ongoing — any engineer via [channel] | Anyone |
| Nomination triage | Monthly | Tech leads |
| Full radar review session | Every 6 months | Architecture group |
| Published radar update | Every 6 months | [Owner name or role] |
### How to Nominate a Blip
1. Submit to [Slack channel / form URL] with: technology name, quadrant, proposed ring, and one-paragraph rationale.
2. A tech lead reviews within 2 weeks and either schedules it for the next review session or requests more information.
3. At the review session, the architecture group discusses and votes. Simple majority wins; ties go to Hold pending further evidence.
4. Approved blips are added to the radar doc and the decision trail within 1 week of the session.
### Ring Change Criteria
| To move TO Adopt | To move TO Trial | To move TO Assess | To move TO Hold |
|-----------------|-----------------|-------------------|-----------------|
| Proven in multiple production systems; team broadly trained; clear operational runbook exists | At least one production use case running; architectural oversight in place; learnings documented | Concrete use case identified; spike completed or in progress; interest from at least 2 engineers | Better alternative exists; known security/compliance risk; strategic direction change; unacceptable maintenance burden |
---
*Questions about this radar: [Slack channel] | Submit a nomination: [URL or channel]*
---
## Quality Checks
- [ ] Every blip has a written rationale paragraph — not just a table row entry
- [ ] The decision trail is populated with at least the initial placement date for every blip
- [ ] Hold-ring entries include a concrete migration path or target technology, not just "stop using it"
- [ ] Ring definitions are present and include both what each ring means AND what engineers should do in response
- [ ] Maintenance process includes: nomination channel, review cadence, who decides, and ring-change criteria
- [ ] Technologies identified as "strategic bets" in the inputs are placed in Adopt (if proven) or Trial (if being rolled out)
- [ ] Technologies identified for deprecation are in Hold with a rationale that references the replacement
## Anti-Patterns
- [ ] Do not place a technology in Adopt without evidence it is proven at the team's scale — aspirational placements mislead engineers
- [ ] Do not add a blip without a written rationale paragraph — table rows without context are unusable
- [ ] Do not create a Hold entry without specifying a concrete migration path or target technology
- [ ] Do not skip the maintenance process — a radar with no process for updates becomes stale within two quarters
- [ ] Do not omit ring definitions — engineers need to know what they should do in response to each ring, not just what the ring means
@@ -0,0 +1,268 @@
---
name: technical-debt-register
description: "Document and prioritize a technical debt backlog with business impact, effort estimates, and resolution strategy. Use when asked to audit technical debt, create a debt register, prioritize tech debt for a quarter, document architectural shortcuts, or build a debt reduction roadmap. Produces a structured technical debt register covering debt inventory by category, business impact per item, effort and priority scores, top-item resolution plans, and a quarterly debt reduction roadmap."
---
# Technical Debt Register Skill
Produce a complete technical debt register for a team or service. A debt register is not a complaint list — it is a prioritized, business-impact-aware inventory that lets an engineering team make deliberate choices about which debt to pay down, in what order, and with what expected return.
Good debt management is not eliminating all debt. It is ensuring debt is visible, owned, and resolved when the interest cost exceeds the cost of fixing it.
## Required Inputs
Ask for these if not already provided:
- **Team or service name** — what team and/or service this register covers
- **Known debt items** — list of known technical debt, or ask Claude to elicit them by asking about: legacy code, missing tests, outdated dependencies, architectural shortcuts, manual processes, observability gaps, security backlogs
- **Tech stack** — language, frameworks, infrastructure (helps Claude categorise and score items correctly)
- **Team size and velocity** — number of engineers and approximate story points or days per sprint (needed for effort estimates)
- **Current quarter / planning period** — so the roadmap targets the right timeframe
## Output Format
---
# Technical Debt Register: [Team / Service Name]
**Team:** [Name] | **Service(s):** [Name(s)]
**Author:** [Name] | **Last updated:** [Date]
**Planning period:** [Q[X] [Year]] | **Review cadence:** [Monthly / Quarterly]
---
## Overview
[2–3 sentences describing the team's current debt situation, the main categories of debt, and the business context — e.g. are they in a growth phase where velocity matters, or approaching a compliance deadline where security debt is critical?]
**Total items in register:** [X]
**Unresolved items:** [X]
**Critical/High priority items:** [X]
**Estimated total resolution effort:** [X story points / X engineer-weeks]
---
## Debt Category Definitions
| Category | Description | Examples |
|---|---|---|
| **Code quality** | Code that works but is hard to change safely | Duplicated logic, deeply nested conditionals, inconsistent error handling, missing abstraction |
| **Architecture** | Structural decisions that limit scalability or increase coupling | Monolith that should be decomposed, sync calls that should be async, missing domain boundaries |
| **Testing** | Gaps in test coverage that increase regression risk | Missing unit tests, no integration tests, flaky test suite, no test data management |
| **Security** | Known vulnerabilities or missing security controls | Outdated dependencies with CVEs, missing rate limiting, hard-coded secrets, insufficient auth |
| **Dependencies** | Outdated or risky external dependencies | End-of-life libraries, major version lag, abandoned packages |
| **Infrastructure** | Infrastructure that limits reliability or developer productivity | Manual deployment steps, no IaC, single-AZ, missing autoscaling |
| **Observability** | Gaps in visibility that slow incident response | Missing metrics, no distributed tracing, poor log structure, no alerting on key SLIs |
| **Process** | Manual or error-prone operational processes | Manual DB migrations, no runbooks, tribal knowledge not documented |
---
## Debt Register
### Scoring Method
**Business impact (1–5):**
- 5 — Blocking growth, causing production incidents, or creating compliance risk
- 4 — Significantly slowing delivery or increasing incident likelihood
- 3 — Noticeable slowdown; manageable but accumulating
- 2 — Minor friction; low immediate risk
- 1 — Cosmetic or aspirational; no current business impact
**Effort to resolve (1–5, lower = easier):**
- 1 — <0.5 day; single engineer
- 2 — 0.5–2 days; single engineer
- 3 — 3–5 days; single engineer or small pair
- 4 — 1–2 weeks; team collaboration required
- 5 — >2 weeks; significant planning and coordination
**Priority score = Business impact × (6 − Effort)** *(rewards high-impact, low-effort items)*
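As a minimal sketch of the scoring method, the formula can be computed and bucketed as below. The function names `priority_score` and `priority_band` are hypothetical, and the band thresholds (Critical 20–25, High 12–19, Medium 6–11, Low 1–5) are assumed to be the ones intended by the priority distribution in this template. The sample inputs reuse the impact and effort values of three example register rows.

```python
def priority_score(impact: int, effort: int) -> int:
    """Return impact * (6 - effort); both inputs use the 1-5 scales above."""
    for name, value in (("impact", impact), ("effort", effort)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return impact * (6 - effort)


def priority_band(score: int) -> str:
    """Map a score (1-25) to a band; thresholds are assumed from the template."""
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # (ID, impact, effort) copied from the example register rows.
    examples = [("TD-002", 5, 2), ("TD-009", 4, 4), ("TD-010", 2, 1)]
    for item_id, impact, effort in examples:
        score = priority_score(impact, effort)
        print(f"{item_id}: score {score} ({priority_band(score)})")
```

Because effort is inverted, a quick win (effort 1) multiplies impact by 5, while a multi-week project (effort 5) multiplies it by 1, so the maximum score is 25 and the minimum is 1.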
---
| ID | Item | Category | Business impact (1–5) | Effort (1–5) | Priority score | Status | Owner |
|---|---|---|---|---|---|---|---|
| TD-001 | [e.g. No integration tests for payment flow] | Testing | 5 | 3 | 15 | Open | [Name] |
| TD-002 | [e.g. Authentication library 3 major versions behind] | Security | 5 | 2 | 20 | Open | [Name] |
| TD-003 | [e.g. Database queries not using connection pooling] | Architecture | 4 | 2 | 16 | Open | [Name] |
| TD-004 | [e.g. Manual deployment process for [service]] | Infrastructure | 4 | 3 | 12 | In progress | [Name] |
| TD-005 | [e.g. 200-line God function in order processing] | Code quality | 3 | 3 | 9 | Open | [Name] |
| TD-006 | [e.g. No structured logging — plain text only] | Observability | 3 | 2 | 12 | Open | [Name] |
| TD-007 | [e.g. ORM version has known N+1 query issue] | Dependencies | 3 | 3 | 9 | Open | [Name] |
| TD-008 | [e.g. No runbook for [critical operation]] | Process | 3 | 1 | 15 | Open | [Name] |
| TD-009 | [e.g. Test coverage at 34% — no meaningful safety net] | Testing | 4 | 4 | 8 | Open | [Name] |
| TD-010 | [e.g. Hard-coded config values in application code] | Code quality | 2 | 1 | 10 | Open | [Name] |
| TD-011 | [e.g. Service deployed single-AZ with no failover] | Infrastructure | 5 | 4 | 10 | Open | [Name] |
| TD-012 | [e.g. No alerting on P95 latency for [endpoint]] | Observability | 4 | 1 | 20 | Open | [Name] |
---
## Category Breakdown
```
Category distribution (by item count):
─────────────────────────────────────────────
Code quality ████████░░ [X items] ([X]%)
Architecture ██████░░░░ [X items] ([X]%)
Testing █████████░ [X items] ([X]%)
Security ████░░░░░░ [X items] ([X]%)
Dependencies ███░░░░░░░ [X items] ([X]%)
Infrastructure ████░░░░░░ [X items] ([X]%)
Observability ████░░░░░░ [X items] ([X]%)
Process ██░░░░░░░░ [X items] ([X]%)
─────────────────────────────────────────────
Priority distribution:
Critical (score 20–25): [X items]
High (score 12–19): [X items]
Medium (score 6–11): [X items]
Low (score 1–5): [X items]
```
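The category chart above can be generated from the register rather than drawn by hand. The sketch below is illustrative only: `category_bars` is a hypothetical helper, and it assumes each ten-character bar is scaled against the largest category, which the template does not specify.

```python
from collections import Counter

BAR_WIDTH = 10  # the template draws ten-character bars


def category_bars(categories: list[str]) -> list[str]:
    """Build one chart line per category, largest category first."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return []
    largest = max(counts.values())
    label_width = max(len(name) for name in counts)
    lines = []
    for name, count in counts.most_common():
        # Assumption: bar length is relative to the largest category.
        filled = round(BAR_WIDTH * count / largest)
        bar = "█" * filled + "░" * (BAR_WIDTH - filled)
        percent = round(100 * count / total)
        noun = "item" if count == 1 else "items"
        lines.append(f"{name:<{label_width}} {bar} [{count} {noun}] ({percent}%)")
    return lines


if __name__ == "__main__":
    # One entry per register item, using its Category column.
    register = ["Testing", "Testing", "Security", "Testing", "Security", "Process"]
    print("\n".join(category_bars(register)))
```

Scaling against the share of all items instead is an equally reasonable choice; whichever is used, state it next to the chart so the bars are not misread.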
---
## Top 5 Priority Items — Resolution Plans
### TD-XXX: [Highest priority item name]
**Priority score:** [Score] | **Category:** [Category] | **Owner:** [Name]
**Problem:**
[2–3 sentences describing what the debt is, how it manifests, and what pain it currently causes. Be specific — reference actual incidents, slowdowns, or risks.]
**Business impact:**
[What happens if this is not resolved? Reference any incidents, near-misses, or growth blockers. E.g. "This caused 2 production incidents in the last quarter and adds ~30 minutes of debugging time to any change in this area."]
**Resolution approach:**
[Clear description of the fix. Not "improve the code" — describe the actual work: "Extract the payment processing logic into a dedicated `PaymentService` class, write unit tests to 80% coverage, and update the 3 call sites."]
**Steps:**
1. [Specific, ticketable step]
2. [Specific, ticketable step]
3. [Specific, ticketable step]
**Acceptance criteria:**
- [ ] [Measurable criterion — e.g. "Zero hard-coded config values remain in application code"]
- [ ] [Measurable criterion — e.g. "CI pipeline passes with new tests"]
- [ ] [Measurable criterion]
**Effort estimate:** [X story points / X days]
**Suggested sprint:** [Q[X] Sprint [Y] / When [dependency] is complete]
---
### TD-XXX: [Second priority item name]
**Priority score:** [Score] | **Category:** [Category] | **Owner:** [Name]
**Problem:**
[Description]
**Business impact:**
[Impact description]
**Resolution approach:**
[Approach description]
**Steps:**
1. [Step]
2. [Step]
3. [Step]
**Acceptance criteria:**
- [ ] [Criterion]
- [ ] [Criterion]
**Effort estimate:** [X story points / X days]
**Suggested sprint:** [Sprint or timeframe]
---
### TD-XXX: [Third priority item]
*(Follow same format as above)*
---
### TD-XXX: [Fourth priority item]
*(Follow same format as above)*
---
### TD-XXX: [Fifth priority item]
*(Follow same format as above)*
---
## Debt Reduction Roadmap
### Guiding principles
- Allocate [X%] of each sprint's capacity to debt resolution — recommended 15–20% for healthy teams
- Security and dependency debt is addressed on a fixed cadence regardless of priority score
- No new feature work in modules with Critical debt unless the debt is scheduled for the current sprint
- Debt items closed without a resolution (accepted/deferred) must have a named owner and a review date
### Quarterly plan
| Quarter | Focus area | Items targeted | Estimated capacity | Expected outcome |
|---|---|---|---|---|
| **[Q1 Year]** (current) | Security + observability | TD-002, TD-012, TD-006 | [X] points / [Y] eng-days | Auth library current; latency alerting live; structured logging shipped |
| **[Q2 Year]** | Architecture + reliability | TD-003, TD-011, TD-004 | [X] points / [Y] eng-days | Connection pooling fixed; multi-AZ deployed; deploy automation complete |
| **[Q3 Year]** | Testing coverage | TD-001, TD-009 | [X] points / [Y] eng-days | Payment flow integration tests live; overall coverage ≥60% |
| **[Q4 Year]** | Code quality + process | TD-005, TD-008, TD-010 | [X] points / [Y] eng-days | God functions refactored; runbooks complete; zero hard-coded config |
### Sprint allocation model
```
Sprint capacity: [X] story points
Allocation:
├── Feature work: [X * 0.75 = ~Y] points (75%)
├── Debt resolution: [X * 0.15 = ~Y] points (15%)
└── Unplanned/bugs: [X * 0.10 = ~Y] points (10%)
Debt items that fit in one sprint ([≤Y] points each):
✓ TD-002 ([X] points)
✓ TD-012 ([X] points)
✓ TD-006 ([X] points)
✓ TD-008 ([X] points)
Multi-sprint debt items (break into phases):
~ TD-001: Phase 1 ([X] pts) → Phase 2 ([X] pts)
~ TD-009: Requires dedicated debt sprint or pairing
```
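The allocation model above is simple arithmetic, sketched here under stated assumptions: the names `SprintAllocation`, `allocate`, and `split_by_fit` are hypothetical, the story-point values are made up for illustration, and the one-sprint threshold is assumed to equal the sprint's debt budget (the template leaves `[≤Y]` open).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintAllocation:
    feature: float
    debt: float
    unplanned: float


def allocate(capacity: float, debt_pct: int = 15, unplanned_pct: int = 10) -> SprintAllocation:
    """Split sprint capacity; feature work gets whatever the other two leave."""
    if capacity < 0:
        raise ValueError("capacity must not be negative")
    if debt_pct < 0 or unplanned_pct < 0 or debt_pct + unplanned_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    debt = capacity * debt_pct / 100
    unplanned = capacity * unplanned_pct / 100
    return SprintAllocation(feature=capacity - debt - unplanned, debt=debt, unplanned=unplanned)


def split_by_fit(items: dict[str, float], debt_budget: float) -> tuple[list[str], list[str]]:
    """Return (items that fit in one sprint, items that need phasing)."""
    fits = [item_id for item_id, points in items.items() if points <= debt_budget]
    phased = [item_id for item_id, points in items.items() if points > debt_budget]
    return fits, phased


if __name__ == "__main__":
    plan = allocate(capacity=40)
    # Hypothetical estimates; replace with the team's own numbers.
    estimates = {"TD-002": 3, "TD-012": 1, "TD-001": 13}
    fits, phased = split_by_fit(estimates, plan.debt)
    print(plan)
    print("Fits in one sprint:", fits)
    print("Break into phases:", phased)
```

Fitting each item against the full debt budget is deliberately generous. A team that schedules several debt items per sprint would likely want a lower per-item threshold.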
---
## Accepted / Deferred Debt
Items where the cost of remediation currently exceeds the business value, accepted with explicit review dates.
| ID | Item | Reason for deferral | Review date | Owner |
|---|---|---|---|---|
| TD-XXX | [Item] | [e.g. "Rewrite would require 3 weeks with no user-facing value at current scale; revisit at 10× traffic"] | [Date] | [Name] |
| TD-XXX | [Item] | [e.g. "Dependency has a CVE but no upgrade path exists until Q3; mitigated by WAF rule"] | [Date] | [Name] |
**Policy:** No item may be deferred more than twice without escalation to the engineering manager.
---
## Quality Checks
- [ ] Every item has a named owner — no unowned debt
- [ ] Priority scores are calculated using the formula, not assigned arbitrarily
- [ ] Security and dependency items are not scored below their actual business impact because they feel "technical"
- [ ] Top-5 resolution plans include specific, ticketable steps — not vague descriptions like "improve test coverage"
- [ ] The quarterly roadmap allocates realistic capacity — debt allocation does not exceed actual sprint budget
- [ ] Accepted/deferred items have a review date and a named owner — no permanently deferred items
- [ ] The register distinguishes between debt (deliberate or accumulated shortcuts) and bugs (unintended defects)
- [ ] Items are closed as resolved only when acceptance criteria are met — not when the PR is merged
## Anti-Patterns
- [ ] Do not score debt items arbitrarily — priority scores must be calculated using the documented formula
- [ ] Do not conflate technical debt (deliberate shortcuts) with bugs (unintended defects) — they require different remediation strategies
- [ ] Do not underrate security and dependency items because they feel abstract — score based on actual business impact
- [ ] Do not create "permanently deferred" items — every accepted item must have a review date and named owner
- [ ] Do not include resolution plans that are vague descriptions — each plan must have specific, ticketable steps
@@ -0,0 +1,138 @@
---
name: test-strategy-doc
description: "Write a test strategy document from a feature spec, PRD, or system description. Use when asked to create a test plan, write a test strategy, define QA approach, or plan testing for a feature or release. Produces a complete test strategy with scope, risk assessment, test types, coverage targets, and a prioritised test case outline."
---
# Test Strategy Document Skill
Produces a complete test strategy from a feature spec, PRD, or system description — covering scope, test types, risk areas, coverage requirements, and a prioritised test case outline.
## Required Inputs
Ask for these if not provided:
- **Feature or system being tested** (paste a spec, PRD, or describe it in plain English)
- **Tech stack** (language and framework — e.g. TypeScript + React, Python + FastAPI)
- **Existing test coverage** (e.g. "we have unit tests but no E2E tests", "we use Jest + Playwright already", or "starting from scratch")
- **Deployment cadence** (e.g. continuous deployment / weekly releases / quarterly — affects what must be automated vs. manual)
- **Risk level** (low / medium / high / critical — affects depth and coverage requirements)
- **Timeline** (when does this need to ship — affects prioritisation)
- **Team context** (who is doing the testing — developers / dedicated QA / both)
## Output Format
### 1. Test Scope
**In scope:**
- [Specific functionality being tested]
- [Integration points covered]
- [User-facing flows included]
**Out of scope:**
- [What is deliberately not tested here — and why]
- [Dependencies owned by other teams]
**Assumptions:**
- [What the test strategy assumes is true — e.g. mocked services, test data availability]
### 2. Risk Assessment
Identify the highest-risk areas first — these drive depth and coverage:
| Area | Risk Level | Why | Test Priority |
|---|---|---|---|
| [e.g. Payment processing] | High | Money movement, regulatory | P0 — exhaustive |
| [e.g. User authentication] | High | Security boundary | P0 — exhaustive |
| [e.g. Email notifications] | Medium | External dependency | P1 — happy path + key failures |
| [e.g. UI copy changes] | Low | Visual only, reversible | P2 — smoke only |
### 3. Test Types and Coverage
**Unit Tests**
- **What:** Individual functions and methods in isolation
- **Who writes:** Developer
- **Coverage target:** [e.g. 80% line coverage on new code / 100% on critical paths]
- **Tools:** [e.g. Jest, pytest, go test]
- **Focus areas for this feature:** [Specific logic that needs unit coverage]
**Integration Tests**
- **What:** Service interactions, database operations, API contracts
- **Who writes:** Developer / QA
- **Coverage target:** [All happy paths + key failure modes]
- **Tools:** [e.g. Supertest, pytest + testcontainers]
- **Focus areas:** [Specific integrations at risk — e.g. third-party API, DB schema changes]
**End-to-End Tests**
- **What:** Critical user journeys from browser/client to database
- **Who writes:** QA / Developer
- **Coverage target:** [Top N user journeys — list them]
- **Tools:** [e.g. Playwright, Cypress, Selenium]
- **Focus areas:** [The 3–5 most critical user flows]
**Performance Tests** *(include if any row in the Risk Assessment table has performance as a risk factor, regardless of overall risk level)*
- **What:** Load, stress, or latency testing
- **Targets:** [Specific numbers — e.g. 200 req/sec at p95 < 200ms]
- **Tools:** [e.g. k6, Locust, JMeter]
**Security Tests** *(include only if the risk level is high or critical)*
- **What:** OWASP Top 10 checks relevant to this feature
- **Focus:** [Auth bypasses, injection, data exposure]
- **Tools:** [e.g. OWASP ZAP, manual penetration testing, Snyk]
### 4. Test Case Outline
Priority-ordered list of specific test cases:
**P0 — Must pass before merge:**
| Test Case | Type | Expected Outcome |
|---|---|---|
| [e.g. User can log in with valid credentials] | E2E | [Redirect to dashboard, session created] |
| [e.g. Invalid login returns 401] | Integration | [Error message displayed, no session] |
| [e.g. Password is never stored in plain text] | Unit | [bcrypt hash in DB] |
**P1 — Must pass before release:**
| Test Case | Type | Expected Outcome |
|---|---|---|
| [e.g. Login fails gracefully when DB is down] | Integration | [User sees friendly error, 503] |
| [e.g. Rate limiting blocks after 5 failed attempts] | Integration | [429 returned, account flagged] |
**P2 — Should pass, can ship with known issues tracked:**
| Test Case | Type | Expected Outcome |
|---|---|---|
| [e.g. Login page renders correctly on mobile] | E2E | [Layout matches design] |
### 5. Test Data Requirements
- [Specific test data needed — e.g. test user accounts with various states]
- [External service stubs or mocks needed]
- [Database seed data requirements]
- [Any PII concerns and how test data handles them]
### 6. Definition of Done
Testing is complete when:
- [ ] All P0 test cases pass
- [ ] All P1 test cases pass
- [ ] Code coverage meets the stated target
- [ ] No critical or high severity bugs open
- [ ] Performance targets met (if applicable)
- [ ] Security checks completed (if applicable)
## Quality Checks
- [ ] Risk table is populated and drives test priority (not filled in generically)
- [ ] Every "P0 — exhaustive" row in the Risk Assessment table has at least one corresponding P0 test case
- [ ] "Out of scope" section names at least one explicit exclusion (not left blank)
- [ ] Each test type names a concrete tool (not "some testing framework")
- [ ] Definition of Done is measurable (not "tests are done when QA is happy")
## Anti-Patterns
- [ ] Do not write a test strategy without a risk table that drives test priority — generic coverage targets are not a strategy
- [ ] Do not leave the "out of scope" section blank — every test strategy must explicitly name what is not being tested and why
- [ ] Do not specify test types without naming a concrete tool for each — "some testing framework" is not actionable
- [ ] Do not define a Definition of Done that is not measurable — "QA is happy" is not a completion criterion
- [ ] Do not create P0 risk areas without corresponding P0 test cases — risk rating must map to test coverage
## Usage Examples
- "Write a test strategy for [feature]" + [paste spec or PRD]
- "Create a test plan for [system]"
- "How should we test [feature]?"
- "I need a QA plan for this sprint"
- "What tests do we need for [X]?"
@@ -1,251 +1,100 @@
---
name: competitive-analysis
description: Analyze competitors and create competitive landscape documentation. Use when the user asks to analyze competitors, create competitive analysis, compare features with competitors, track competitive landscape, or understand competitive positioning.
description: "Analyze competitors and create competitive landscape documentation with feature matrices, positioning maps, and strategic recommendations. Use when asked to analyze competitors, create competitive analysis, compare features with competitors, build a competitive landscape, track competitive positioning, or prepare sales battlecard inputs. Produces structured competitor profiles, feature comparison matrix, win/loss analysis, and prioritised strategic recommendations."
---
# Competitive Analysis Skill
This skill creates structured competitive analyses for product decision-making.
Create structured competitive analyses for product decision-making.
## Analysis Framework
## Required Inputs
Ask the user for these if not provided:
- **Your product or company** (what you're comparing against)
- **Competitors to analyze** (or ask to identify the top 3-5)
- **Analysis focus** (full landscape / feature comparison / pricing / positioning / win-loss)
- **Audience** (product team / leadership / sales / board)
## Process
1. Gather competitor information from provided inputs and available context
2. Build profiles for each competitor
3. Create feature comparison matrix on dimensions that matter to the user's customers
4. Analyze pricing and positioning
5. Identify win/loss patterns and strategic implications
6. **Validate** — Confirm all claims reference a specific source or are flagged as assumptions. Verify feature comparisons note quality differences, not just presence/absence.
## Output Structure
### 1. Executive Summary
- **Market Position**: Where we stand relative to competitors
- **Key Findings**: Top 3-5 insights from analysis
- **Strategic Implications**: What this means for our roadmap
- **Key Findings**: Top 3-5 insights
- **Strategic Implications**: What this means for the roadmap
### 2. Competitor Profiles
For each major competitor:
**[Competitor Name]**
For each competitor:
- **Company Overview**: Size, funding, market position
- **Target Customer**: Who they serve
- **Value Proposition**: Their core positioning
- **Business Model**: How they make money
- **Strengths**: What they do well
- **Weaknesses**: Where they fall short
- **Value Proposition**: Core positioning
- **Strengths / Weaknesses**: What they do well and where they fall short
- **Recent Activity**: Major updates, funding, announcements
### 3. Feature Comparison Matrix
| Feature | Us | Competitor A | Competitor B | Competitor C |
|---------|-----|--------------|--------------|--------------|
| Core Feature 1 | ✅ Full | ✅ Full | ⚠️ Limited | ❌ None |
| Core Feature 2 | ✅ Full | ⚠️ Limited | ✅ Full | ✅ Full |
| Advanced Feature 1 | ⚠️ Beta | ❌ None | ✅ Full | ❌ None |
| [Feature] | ✅ Full | ⚠️ Limited | ❌ None | ✅ Full |
Legend:
- ✅ Full: Complete, production-ready feature
- ⚠️ Limited/Beta: Partial or in-development
- ❌ None: Feature not available
Legend: ✅ Full (production-ready) · ⚠️ Limited/Beta · ❌ None
Include notes on quality/implementation differences where significant.
Include notes on quality and implementation differences where significant.
### 4. Pricing Comparison
| Plan Type | Us | Competitor A | Competitor B | Competitor C |
|-----------|-----|--------------|--------------|--------------|
| Free/Trial | $0 | $0 | $0 | N/A |
| Starter | $29/mo | $25/mo | $39/mo | $49/mo |
| Professional | $79/mo | $89/mo | $79/mo | $99/mo |
| Enterprise | Custom | Custom | $299/mo | Custom |
| Plan | Us | Competitor A | Competitor B |
|------|-----|--------------|--------------|
| Free/Trial | [price] | [price] | [price] |
| Pro | [price] | [price] | [price] |
| Enterprise | [price] | [price] | [price] |
**Pricing Strategy Notes**:
- How our pricing compares
- Value perception
- Packaging differences
### 5. Market Positioning Map
### 5. Strengths & Weaknesses Analysis
**Our Competitive Advantages:**
1. [Strength] - [Why it matters]
2. [Strength] - [Why it matters]
3. [Strength] - [Why it matters]
**Our Gaps vs. Competition:**
1. [Gap] - [Impact on customers]
2. [Gap] - [Impact on customers]
3. [Gap] - [Impact on customers]
### 6. Customer Perception Analysis
**What Customers Say About Competitors** (from reviews, G2, social media):
**Competitor A:**
- Most Praised: [Common positive feedback]
- Most Criticized: [Common complaints]
- Typical User: [Who uses them]
**Competitor B:**
- Most Praised: [Common positive feedback]
- Most Criticized: [Common complaints]
- Typical User: [Who uses them]
### 7. Market Positioning Map
Describe or diagram positioning on key dimensions:
Position competitors on two key dimensions relevant to the market:
- Y-Axis: [e.g., Enterprise vs. SMB]
- X-Axis: [e.g., Simple vs. Comprehensive]
**Our Position**: [Where we sit and why]
**Whitespace Opportunities**: [Underserved segments]
### 8. Win/Loss Analysis
### 6. Win/Loss Analysis
**Why We Win Against Competitors:**
- Better at: [Specific capabilities]
- Target customers that value: [What matters]
**Why We Win:**
- Better at: [specific capabilities]
- Customers who value: [what matters to them]
**Why We Lose to Competitors:**
- When customers need: [Specific requirements]
- When they prioritize: [What they value]
**Why We Lose:**
- When customers need: [specific requirements]
- Their advantage: [what tips the decision]
### 9. Strategic Implications & Recommendations
### 7. Strategic Recommendations
**Immediate Actions** (0-3 months):
1. [Action] - [Rationale]
2. [Action] - [Rationale]
**Immediate Actions (0-3 months):**
1. [Action] – [Rationale]
**Medium-term Strategy** (3-12 months):
1. [Action] - [Rationale]
2. [Action] - [Rationale]
**Medium-term (3-12 months):**
1. [Action] – [Rationale]
**Long-term Positioning** (12+ months):
1. [Strategic direction] - [Rationale]
## Anti-Patterns
## Analysis Best Practices
- [ ] Do not present competitor feature claims as facts without citing a source or flagging them as assumptions — outdated or incorrect feature data misleads sales and product decisions
- [ ] Do not build a competitive analysis that only covers features — pricing, messaging, go-to-market motion, and who they hire for are equally strategic signals
- [ ] Do not treat all buyers as identical — the same product may win against Competitor A in the enterprise segment and lose in SMB; segment-specific win/loss matters
- [ ] Do not soften weaknesses and threats in the SWOT to avoid internal discomfort — an honest SWOT is only useful if the negatives are real
**Data Sources:**
- Competitor websites and documentation
- G2, Capterra, TrustRadius reviews
- Customer interviews (especially win/loss)
- Sales team feedback
- Social media and community discussions
- Industry analysts and reports
- Competitor job postings (reveal strategy)
## Quality Checks
**Quality Standards:**
✅ Use recent data (within 3-6 months)
✅ Include sources for claims
✅ Focus on verifiable facts over assumptions
✅ Consider different customer segments
✅ Update regularly (at least quarterly)
❌ Don't rely solely on competitor marketing
❌ Don't ignore smaller/emerging competitors
❌ Don't assume features work well just because they exist
❌ Don't forget about indirect/substitute competitors
**Ethical Guidelines:**
- Use only publicly available information
- Don't misrepresent competitor capabilities
- Be honest about their strengths
- Don't disparage competitors personally
## Monitoring Cadence
**Weekly**: Check for major announcements, funding, leadership changes
**Monthly**: Review feature releases, pricing changes, marketing campaigns
**Quarterly**: Comprehensive feature comparison, strategic assessment
**Annually**: Market position analysis, long-term trend evaluation
## Example Analysis Section
```
## Competitor Profile: DataSync Pro
**Company Overview**
- Founded 2019, 85 employees, $12M Series A (2023)
- Fast-growing in mid-market segment
- Strong presence in Europe
**Target Customer**
- Mid-market companies (100-1000 employees)
- Technical users comfortable with APIs
- Data-intensive operations
**Value Proposition**
"The fastest way to sync data across your entire stack"
- Focus on speed and reliability
- Developer-first approach
**Business Model**
- Freemium with generous free tier
- Usage-based pricing above free limits
- Professional services for enterprise
**Strengths**
- Superior sync speed (2-3x faster than alternatives)
- Best-in-class developer documentation
- Strong developer community (5k+ GitHub stars)
- Excellent uptime (99.97% vs industry 99.5%)
- Modern, intuitive API design
**Weaknesses**
- Limited no-code options (requires technical knowledge)
- Smaller integration library (45 vs our 120)
- No dedicated enterprise features
- Limited customization options
- Support can be slow (avg 8hr response time)
**Recent Activity**
- Jan 2026: Released real-time sync capabilities
- Dec 2025: Raised $12M Series A
- Nov 2025: Added webhooks and event streaming
- Hired ex-Stripe engineering lead as CTO
**Strategic Implications**
- Their focus on speed creates pressure on our performance
- Developer-first approach winning technical buyers
- Gaps in no-code and enterprise create opportunities
- Need to monitor their enterprise moves closely
```
## Feature Comparison Best Practices
When comparing features:
1. **Group by Category**
- Core functionality
- Integration capabilities
- Analytics/reporting
- Security/compliance
- Collaboration features
2. **Note Quality Differences**
- Not all implementations are equal
- Speed, reliability, UX matter
- Example: "Both have API, but theirs has rate limits"
3. **Consider the Complete Experience**
- Onboarding process
- Documentation quality
- Support responsiveness
- Mobile experience
4. **Identify Gaps That Matter**
- What customers actually care about
- Not just feature count
- Focus on differentiators
## Win/Loss Analysis Template
When analyzing why you win or lose deals:
**Win Against [Competitor]**
- **Scenarios**: When do we win?
- **Key Differentiators**: What tips the decision?
- **Customer Quotes**: What they tell us
- **Typical Profile**: Who chooses us?
**Loss Against [Competitor]**
- **Scenarios**: When do we lose?
- **Their Advantages**: What tips the decision?
- **Customer Quotes**: What they tell us
- **Typical Profile**: Who chooses them?
**Lessons Learned**
- What we need to improve
- What we need to communicate better
- Where we should compete differently
- [ ] All competitor claims cite a source or are flagged as assumptions
- [ ] Feature comparison notes quality differences, not just feature presence
- [ ] Strategic recommendations are specific actions, not generic advice
- [ ] Win/loss analysis reflects customer perspective, not internal assumptions
- [ ] Different customer segments are considered (not all buyers value the same things)
@@ -0,0 +1,129 @@
---
name: docx-tracked-changes
description: "Produce properly-formatted tracked changes for a Word document. Use when asked to redline a document, suggest edits to a contract or document, create tracked changes for review, or mark up a document with proposed revisions. Produces a complete redline with insertions, deletions, and margin comments that can be applied to the source document. Best used with Claude Opus 4.7 or newer for reliable tracked changes handling."
---
# Word Doc Tracked Changes Skill
Produces properly-structured tracked changes for a Word document — insertions, deletions, replacements, and margin comments formatted so they can be applied directly to the source document. Built to leverage Opus 4.7 improvements in .docx redlining and tracked changes generation.
## Required Inputs
Ask the user for these if not provided:
- **The document** (paste the text or upload the .docx)
- **Review type** (legal review / copy edit / substantive rewrite / compliance check / plain English rewrite)
- **Review scope** (full document / specific sections / specific clause type)
- **Reviewer role** (author / manager / legal counsel / subject matter expert)
## Output Structure
### 1. Redline Summary
**Document:** [Name or identifier]
**Review type:** [As stated]
**Reviewer:** [Role]
**Total changes:** [Insertions: N / Deletions: N / Comments: N]
**Overall assessment:** [1-2 sentences — is this document close to final, or does it need substantial revision?]
### 2. Top-Level Changes
Changes that affect the meaning or structure of the document:
**Change N — [Section or paragraph reference]**
- Original: "[Exact original text]"
- Suggested: "[Proposed new text]"
- Reason: [Why this change — substantive/legal/clarity]
### 3. Line-by-Line Tracked Changes
For each paragraph that needs changes, format as:
**[Paragraph reference — e.g. "Section 3, Paragraph 2"]**
Original:
> [Exact original paragraph]
Tracked changes:
> [Same paragraph with deletions marked as ~~strikethrough~~ and insertions marked as **bold**]
Clean version:
> [Final clean text after applying changes]
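The tracked-changes markup above can be derived mechanically from the original and clean paragraphs. The following is a minimal sketch, not a prescribed implementation: it assumes a word-level diff using Python's standard `difflib`, treats whitespace as single spaces, and the function name `redline` is hypothetical.

```python
import difflib


def redline(original: str, revised: str):
    """Return (marked_text, counts) for a word-level redline.

    Deletions are wrapped in ~~strikethrough~~ and insertions in **bold**.
    Assumption: words are separated by single spaces, so runs of whitespace
    in the source are not preserved.
    """
    a, b = original.split(), revised.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    pieces = []
    counts = {"insertions": 0, "deletions": 0}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(a[i1:i2]))
            continue
        # A "replace" opcode is emitted as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            pieces.append("~~" + " ".join(a[i1:i2]) + "~~")
            counts["deletions"] += 1
        if tag in ("insert", "replace"):
            pieces.append("**" + " ".join(b[j1:j2]) + "**")
            counts["insertions"] += 1
    return " ".join(pieces), counts


marked, counts = redline(
    "Payment is due within 30 days of invoice.",
    "Payment is due within 45 days of the invoice date.",
)
print(marked)
print(counts)
```

The counts returned here could feed the "Total changes" line in the Redline Summary. A word-level diff is only one possible granularity; a clause-level or sentence-level diff may read better for substantive rewrites.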
### 4. Margin Comments
Comments that flag issues without proposing a specific wording change:
**Comment N — [Location]**
"[Comment text — written as the reviewer would write it. Direct, specific, actionable.]"
Comments are for things like:
- "This clause conflicts with Section 7 — please reconcile"
- "Missing definition of [term] used throughout"
- "Confirm figure with finance team"
### 5. Stylistic Edits
Line-level stylistic changes (if scope includes copy editing):
| Location | Before | After | Reason |
|---|---|---|---|
| Para 3 | [Text] | [Text] | [Readability/grammar/consistency] |
### 6. Pattern Flags
Issues that repeat across the document:
**[Pattern — e.g. "Passive voice overuse"]**
- Instances: [count]
- Examples: [2-3 specific locations]
- Suggested approach: [How to address]
### 7. Review Completeness
| Review dimension | Covered |
|---|---|
| Grammar and syntax | Yes / No |
| Clarity and readability | Yes / No |
| Substantive accuracy | Yes / No / N/A |
| Compliance/legal check | Yes / No / N/A |
| Consistency with referenced documents | Yes / No / N/A |
### 8. How to Apply These Changes
Instructions for applying the redline:
**In Microsoft Word:**
1. Enable Track Changes (Review tab → Track Changes)
2. Apply the changes from Section 3 in order
3. Add comments from Section 4 using Review → New Comment
4. Send the redlined document back to the author for review
**In Google Docs:**
1. Switch to Suggesting mode (top right pencil icon)
2. Apply the changes from Section 3
3. Add comments using the comment button in the margin
## Quality Checks
- [ ] Every tracked change has the original text preserved exactly
- [ ] Substantive changes are separated from stylistic changes
- [ ] Comments are written as the reviewer would write them, not as meta-commentary
- [ ] Pattern issues identified separately from individual changes
- [ ] Application instructions match the target platform
## Anti-Patterns
- [ ] Do not paraphrase original text when creating tracked deletions — the original text must be preserved exactly, character for character, or the tracked change cannot be reviewed against source
- [ ] Do not mix substantive changes with stylistic edits in the same section — reviewers need to approve substantive changes at a different threshold than copy edits
- [ ] Do not write margin comments as meta-commentary about the review process ("This section needs work") — comments must be actionable instructions the author can act on
- [ ] Do not flag every imperfect sentence as a change — over-redlining trains authors to accept changes without reading, which defeats the purpose of tracked review
- [ ] Do not produce a redline without a summary of top-level changes — reviewers read the summary first and use it to decide which changes to scrutinise in detail
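The first anti-pattern can be checked mechanically: stripping the insertions from a tracked paragraph should give back the original text, and stripping the deletions should give the clean version. The sketch below is one way to do that under stated assumptions: deletions use `~~...~~`, insertions use `**...**`, the source text itself contains neither marker, and whitespace is compared after normalisation. All function names are hypothetical.

```python
import re

DELETION = re.compile(r"~~(.+?)~~")
INSERTION = re.compile(r"\*\*(.+?)\*\*")


def normalize(text: str) -> str:
    """Collapse runs of whitespace so marker removal does not cause false alarms."""
    return " ".join(text.split())


def recover_original(marked: str) -> str:
    """Drop inserted text and unwrap deleted text."""
    without_insertions = INSERTION.sub("", marked)
    return normalize(DELETION.sub(r"\1", without_insertions))


def recover_clean(marked: str) -> str:
    """Drop deleted text and unwrap inserted text."""
    without_deletions = DELETION.sub("", marked)
    return normalize(INSERTION.sub(r"\1", without_deletions))


def check_redline(original: str, marked: str, clean: str) -> list:
    """Return a list of problems; an empty list means the redline is consistent."""
    problems = []
    if recover_original(marked) != normalize(original):
        problems.append("tracked changes do not reproduce the original text")
    if recover_clean(marked) != normalize(clean):
        problems.append("clean version does not match the tracked changes")
    return problems


problems = check_redline(
    original="Payment is due within 30 days of invoice.",
    marked="Payment is due in ~~30~~ **45** days of invoice.",  # "within" was paraphrased
    clean="Payment is due in 45 days of invoice.",
)
print(problems)
```

In this usage example the original is reported as not reproduced, because "within" was silently changed to "in" outside any tracked deletion. Because whitespace is normalised, this is a word-for-word check rather than a strict character-for-character one.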
## Example Trigger Phrases
- "Redline this contract"
- "Create tracked changes for this document"
- "Mark up this document with proposed edits"
- "Review this and suggest changes in tracked changes format"
- "Give me a redline version of this draft"
## Why This Works Better on Opus 4.7
Tracked changes require the model to preserve source text exactly while suggesting alternatives — earlier models would paraphrase the original or lose track of which text was original vs suggested. Opus 4.7 improvements specifically target this workflow.

Some files were not shown because too many files have changed in this diff.