justin/pm-claude-skills

Files

T

mohitagw15856 c0544fb76a feat: v7.0.0 — 6 new engineering skills, badges, milestone tracker, SKILL_REQUEST.md

New skills added to pm-engineering bundle (now 10 skills total):
- debugging-log-analyser: stack trace → structured root cause diagnosis + fix
- pr-description-writer: diff/commits → reviewer-ready PR description
- system-design-interview: full system design with capacity, components, trade-offs
- changelog-generator: git log → polished Keep a Changelog entry
- test-strategy-doc: spec/PRD → complete test strategy with P0/P1 test cases
- runbook-writer: operational runbooks with exact commands, rollback, and escalation

README updates:
- 5 shields.io badges (stars, skill count, version, install, license)
- "See It in Action" demo section
- pm-engineering added to Quick Install list
- Star Milestone Tracker (100/250/500/1000 stars roadmap)
- Engineering table extended from 4 to 10 skills (41–50)
- Article 14 link resolved from remote merge

Config updates:
- marketplace.json: v6.0.0 → v7.0.0, "106 skills"
- pm-engineering plugin.json: v1.0.0 → v2.0.0

New file: SKILL_REQUEST.md — community skill voting board

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-23 15:21:43 +01:00

4.7 KiB

Raw Blame History

name, description

name	description
runbook-writer	Write an operational runbook for a service, incident type, or deployment procedure. Use when asked to write a runbook, create an ops guide, document an operational procedure, or prepare an incident response playbook. Produces a runbook with overview, prerequisites, step-by-step procedures, rollback steps, troubleshooting table, and escalation paths. Optimised for Opus 4.7 and newer models.

Runbook Writer Skill

Produces operational runbooks for services, incident types, and deployment procedures — structured so an on-call engineer who's never touched the system can follow them under pressure.

Required Inputs

Ask for these if not provided:

What the runbook is for (e.g. deploying the payment service, responding to a database failover, rotating API keys)
Runbook type (Deployment / Incident Response / Maintenance / Disaster Recovery)
System/service name and what it does (brief description)
Audience (new on-call engineers / experienced SREs / DevOps team)
Tech stack (where relevant — e.g. Kubernetes, AWS RDS, Node.js)

Output Structure

Runbook: [Runbook Title] Service: [Service Name] Type: [Deployment / Incident Response / Maintenance / DR] Last Updated: [Date] Owner: [Team or person] Severity: [P1 / P2 / P3 — if incident-type]

Overview

What this runbook covers: [1–2 sentences on the scenario this runbook handles]

When to use this runbook:

[Specific trigger condition 1 — e.g. PagerDuty alert: high-error-rate-payment-service]
[Specific trigger condition 2 — e.g. Deploy needed after PR merged to main]

Estimated time to complete: [X minutes / X–Y minutes depending on outcome]

Impact if not completed correctly: [e.g. Payment processing degraded / Data loss risk / Users locked out]

Prerequisites

Access required:

[System/tool access — e.g. AWS Console: production-account]
[Credential — e.g. vault read secret/payment-service]
[VPN / bastion access if needed]

Tools required:

[Tool name and version — e.g. kubectl v1.28+]
[CLI or dashboard name]

Before you start:

[Prerequisite check — e.g. Verify current deployment is healthy in Grafana]
[Prerequisite action — e.g. Announce in #ops-live that you're starting]

Procedure

Number every step. Use exact commands. Do not paraphrase tool names or flags.

Step 1: [Action name] [What you're doing and why — one sentence]

# Exact command
[command here]

Expected output: [what should appear if this worked] If this fails: [Exact error message to look for] → [What to do, or see Troubleshooting]

Step 2: [Action name] [Same structure as Step 1]

Step 3: Verify Always include a verification step after the main procedure:

[verification command]

Expected state: [What a healthy system looks like after this runbook completes]

Rollback

How to undo this procedure if something went wrong:

Step R1: [Rollback action]

[rollback command]

Verify rollback: [command to confirm rollback succeeded]

Troubleshooting

Symptom	Likely Cause	Resolution
[Error message or observable symptom]	[Why this happens]	[Exact fix or next step]
[Another symptom]	[Cause]	[Resolution]

Escalation

If this runbook does not resolve the issue:

Condition	Who to Contact	How
[e.g. DB unavailable after 10 min]	[DBA on-call]	[PagerDuty policy: `db-oncall`]
[e.g. Payment provider unresponsive]	[Vendor contact]	[Contact in 1Password: `vendor-escalation`]

Always update the incident timeline in [tool] before escalating.

Post-Procedure Checklist

After completing the runbook:

Announce completion in #ops-live with outcome
Update the incident ticket / deploy log
Verify alerts have resolved in monitoring dashboard
If this revealed a gap in this runbook — update it now (link to edit process)

Quality Checks

Every step has an exact command (no "run the deploy script")
Expected output is specified for each step so engineer knows if it worked
Failure path is explicit for each step (not "if it fails, investigate")
Rollback procedure is complete and independently testable
Escalation paths name specific contacts, not just team names
Runbook can be followed by someone who has never touched this system

Example Trigger Phrases

"Write a runbook for [service] deployment"
"Create an incident response runbook for [alert type]"
"I need a runbook for [procedure]"
"Document the operational procedure for [X]"
"Write an ops playbook for [scenario]"