Files
pm-skills/pm-execution/skills/strategy-red-team/SKILL.md
T
Pawel Huryn 8202bdd7f1 Release v2.0.0: add pm-ai-shipping plugin, red-team execution skill, refresh README
New
- pm-ai-shipping (9th plugin) — AI Shipping Kit: document a vibe-coded app, audit
  security/performance against intended behavior, map test coverage, and compile a
  reviewer-ready shipping packet (2 skills, 5 commands).
- pm-execution: strategy-red-team skill + /red-team-prd command (now 16 skills, 11 commands).

Changed
- Bump all versions 1.0.1 -> 2.0.0 (marketplace.json + all 9 plugin.json) in lockstep.
- README: new plugins.png hero + examples.png in "How It Works"; counts updated to
  9 plugins / 68 skills / 42 commands across tagline, install block, and per-plugin sections.
- CLAUDE.md: 9-plugin structure, plugin table, and version note updated.

Validator: 9 plugins, 68 skills, 42 commands, 110 components, 0 warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 18:49:54 +02:00

4.4 KiB
Raw Blame History

name, description
name description
strategy-red-team Red-team a PRD, roadmap, or strategy by attacking its load-bearing assumptions before reality does. Steelmans then attacks each claim, ranks failure modes by impact × likelihood × cheapness-to-test, and returns the cheapest test and kill criteria for each. Use when stress-testing a plan, pressure-testing a strategy, challenging assumptions, or preparing a doc for executive review.

Strategy Red-Team: Attack the Assumptions Before Reality Does

Purpose

You are a sharp, fair adversary reviewing $ARGUMENTS. Most plans only survived polite feedback. This skill finds the load-bearing assumptions that would make the plan fail, attacks them honestly, and returns — for each — the evidence to get this week, the kill criteria, and the cheapest test.

Context

A red-team is not a pre-mortem. A pre-mortem imagines the plan already failed and narrates why. A red-team attacks the load-bearing assumptions and logic now, while there's still time to test the cheapest one. It improves judgment, not just confidence.

The goal is a sharper decision, not a longer risk list. Five real kill-assumptions with tests beat twenty generic risks.

Instructions

  1. Extract every claim. Read the plan and list what it asserts as true — about the user, the market, the constraint, the mechanism, the timeline. Separate load-bearing claims (if false, the plan dies) from cosmetic ones. Only load-bearing claims are worth attacking.

  2. Steelman, then attack. For each load-bearing claim, first state the strongest version of why it might be true. Then attack that — not a strawman. An attack on a weak version of the claim is worthless.

  3. Write each failure mode as "Fails if ___." Be concrete and falsifiable. "Fails if activation isn't actually the constraint" beats "execution risk."

  4. Rank by (impact if wrong) × (likelihood wrong) × (cheapness to test). The top of the list is what to test this week — high-impact, plausibly wrong, and cheap to check. Surface that ranking; don't bury the lede.

  5. Self-refute, don't fabricate. Default to "this risk is real" unless the plan already cites evidence against it. But if a claim is genuinely well-reasoned, say so plainly — a red-team that manufactures doubt is as useless as one that rubber-stamps. Never invent a weakness the plan doesn't have.

  6. For each surviving kill-assumption, give the operator something to do:

    • Fails if: the precise condition that breaks the plan
    • Evidence to get this week: the specific data, query, or conversation that would confirm or kill it cheaply
    • Kill criterion: the threshold at which you'd stop or change course
    • Cheapest test: the smallest experiment that moves the belief
  7. Optional cross-model mode. If the user asks for a second opinion and another model (Codex, Gemini, a second Claude) is reachable, run the same plan through it and flag where the two disagree — different model families miss different things. Default is single-model; don't add this friction unless asked.

  8. Structure the output (make it screenshot-native):

    ## Red-Team: [plan in one line]
    
    ### Top Kill-Assumptions (ranked)
    For each (35 max):
    - **Claim:** [the load-bearing assertion]
    - **Fails if:** [concrete, falsifiable condition]
    - **Evidence to get this week:** [specific]
    - **Kill criterion:** [threshold]
    - **Cheapest test:** [smallest experiment]
    
    ### What's Well-Reasoned
    [State explicitly what holds up — and why. Don't manufacture doubt.]
    
    ### What I Couldn't Assess
    [Gaps where the plan didn't give enough to judge.]
    

Notes

  • No strawmanning — attack the steelman or don't attack.
  • No generic risk lists — every item must be specific to this plan.
  • No fabrication — if it's sound, say so.
  • Rank ruthlessly — the cheapest high-impact test is the whole point.
  • The emotional job is relief from the fear of confidently shipping the wrong bet, so end with what to do, not just what to fear.

Further Reading