feat: v12.0.0 — 150-skill milestone, 15 new skills across 10 bundles

Adds 15 new skills reaching the 150-skill milestone: Data & Analytics (pm-data): - cohort-analysis: retention curves, LTV projection, behavioural segmentation, SQL reference queries - data-pipeline-spec: ETL/ELT design with SLAs, DQ rules, error handling, compliance Customer Success (pm-cs): - renewal-playbook: health snapshot, value story, commercial scenarios, objection responses, 16-week timeline - customer-success-plan: joint success plan with milestones, mutual commitments, escalation path People & Leadership (pm-people): - 360-feedback-template: survey instrument + narrative report with strengths and development themes - team-health-check: Spotify-model assessment across 7 dimensions with facilitation guide Operations (pm-operations): - risk-register: L×I scoring, RAG heat map, mitigation and contingency plans - raci-matrix: role definitions, decision map, anti-pattern guide, communication template Marketing & GTM (pm-gtm): - social-media-strategy: audience profile, content pillars, KPIs, 4-week starter calendar - product-positioning-doc: April Dunford-style positioning, messaging hierarchy, persona messaging Discovery (pm-discovery): - customer-journey-map: stage-by-stage journey with touchpoints, emotions, and prioritised opportunities Delivery (pm-delivery): - user-story-writer: Given/When/Then ACs, edge cases, definition of done, epic decomposition Advanced (pm-advanced): - ai-ethics-review: fairness, bias, transparency, privacy, safety, accountability, societal impact Sales (pm-sales): - partnership-proposal: mutual value, commercial model, joint GTM plan, governance Design (pm-design): - design-system-audit: component coverage, token consistency, WCAG, adoption, remediation roadmap Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 21:58:13 +01:00
parent 94e53d38a8
commit ae6ea4d53e
32 changed files with 6630 additions and 152 deletions
@@ -0,0 +1,187 @@
+---
+name: cohort-analysis
+description: "Structure a cohort analysis for retention, LTV, or behavioural patterns. Use when asked to run a cohort analysis, analyse retention by cohort, segment users by behaviour over time, or calculate lifetime value by acquisition period. Produces a complete cohort analysis framework with methodology, cohort definitions, retention curves, and prioritised interventions."
+---
+
+# Cohort Analysis Skill
+
+This skill produces a structured cohort analysis covering retention curves, LTV estimation, behavioural segmentation, and actionable interventions. Output is ready to present to product leadership or share with growth and data teams.
+
+## Required Inputs
+
+Ask the user for these if not provided:
+- **Analysis goal** (retention improvement / LTV modelling / behavioural segmentation / churn prediction)
+- **Product or feature being analysed**
+- **Cohort definition** — what groups users? (acquisition month, signup channel, plan tier, feature adoption)
+- **Observation window** — how many periods to track? (e.g. 12 months, 8 weeks)
+- **Key metric** — what are you measuring per cohort? (retention rate, revenue, engagement score, feature usage)
+- **Available data** — what tables/metrics are available? (paste schema or describe)
+- **Baseline** — any existing retention benchmarks or goals?
+
+## Output Structure
+
+---
+
+# Cohort Analysis: [Product / Feature]
+
+**Analysis type:** [Retention / LTV / Behavioural / Churn]
+**Cohort definition:** [Acquisition month / Signup channel / Plan tier / Feature adoption date]
+**Observation window:** [X months / weeks]
+**Primary metric:** [Metric name]
+**Date prepared:** [Date]
+
+---
+
+## 1. Cohort Definitions
+
+| Cohort | Period | Size | Description |
+|---|---|---|---|
+| [Cohort 1] | [Jan 2025] | [N users] | [e.g. Users who signed up in Jan 2025 via organic] |
+| [Cohort 2] | [Feb 2025] | [N users] | [...] |
+
+**Cohort logic:**
+- Cohort entry event: [First sign-up / First purchase / Feature activation]
+- Cohort exit criteria: [Churned / Downgraded / No activity for 30 days]
+- Exclusions: [Trial users / Internal test accounts / Users with < X days of data]
+
+---
+
+## 2. Retention Curve
+
+**How to read:** Each cell shows what % of the cohort performed the key metric in period N.
+
+| Cohort | Period 0 | Period 1 | Period 2 | Period 3 | Period 6 | Period 12 |
+|---|---|---|---|---|---|---|
+| Jan 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
+| Feb 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
+| [Trend] | — | [↑/↓ vs prior] | [...] | [...] | [...] | [...] |
+
+**Retention plateau:** [At what period does retention flatten? What % does it flatten at?]
+
+**Key observations:**
+- [e.g. Period 1 → Period 2 drop is the largest — average X% churn in first 30 days]
+- [e.g. Cohorts acquired via [channel] retain X% better at Period 6]
+- [e.g. Retention has improved from X% → Y% at Period 3 comparing oldest to newest cohort]
+
+---
+
+## 3. LTV Projection (if applicable)
+
+**ARPU per period:** [£/$/€ X per active user per month]
+**Retention curve used:** [Which cohort or blended average]
+
+| Period | Retained % | Revenue per user | Cumulative LTV |
+|---|---|---|---|
+| Month 1 | [X%] | [£X] | [£X] |
+| Month 3 | [X%] | [£X] | [£X] |
+| Month 6 | [X%] | [£X] | [£X] |
+| Month 12 | [X%] | [£X] | [£X] |
+
+**Blended LTV:** [£X at 12 months — based on blended retention across cohorts]
+
+**LTV by segment:**
+| Segment | LTV (12M) | vs Baseline |
+|---|---|---|
+| [Organic] | [£X] | [+X%] |
+| [Paid] | [£X] | [-X%] |
+| [Enterprise] | [£X] | [+X%] |
+
+---
+
+## 4. Behavioural Segmentation
+
+Group cohorts by behaviour patterns, not just acquisition date:
+
+| Segment | Definition | Size | Retention (P6) | LTV (12M) |
+|---|---|---|---|---|
+| **Power users** | [Used core feature ≥ 3x/week in first 30 days] | [X%] | [X%] | [£X] |
+| **Casual users** | [Used 1–2x/week in first 30 days] | [X%] | [X%] | [£X] |
+| **Dormant** | [Logged in but did not use core feature] | [X%] | [X%] | [£X] |
+| **Never activated** | [Signed up but never completed onboarding] | [X%] | [X%] | [£X] |
+
+**Activation threshold insight:** [What action — taken within the first X days — most strongly predicts retention? This is the "aha moment" to optimise for.]
+
+---
+
+## 5. Leading Indicators of Churn
+
+List the signals that appear **before** users churn, so teams can intervene:
+
+| Signal | How early does it appear? | Churn correlation | Intervention |
+|---|---|---|---|
+| [No login for 7 days] | [7 days before churn] | [Strong] | [Re-engagement email sequence] |
+| [Support ticket with escalation] | [14 days before churn] | [Moderate] | [CSM outreach within 48 hours] |
+| [Feature usage dropped >50% WoW] | [10 days before churn] | [Strong] | [In-app nudge with use-case tutorial] |
+
+---
+
+## 6. Cohort Comparison: What's Changed Over Time
+
+Compare oldest and newest cohorts to assess whether product improvements are showing up in retention:
+
+| Metric | [Oldest cohort — e.g. Jan 2024] | [Newest cohort — e.g. Jan 2025] | Change |
+|---|---|---|---|
+| Period 1 retention | [X%] | [X%] | [↑/↓ X pp] |
+| Period 3 retention | [X%] | [X%] | [↑/↓ X pp] |
+| Activation rate | [X%] | [X%] | [↑/↓ X pp] |
+| Avg. sessions in first 30 days | [X] | [X] | [↑/↓] |
+
+**Verdict:** [Are more recent cohorts performing better or worse? What shipped in that period that might explain the change?]
+
+---
+
+## 7. Recommendations
+
+Prioritise by impact on retention curve:
+
+| # | Recommendation | Target segment | Expected impact | Effort | Priority |
+|---|---|---|---|---|---|
+| 1 | [e.g. Redesign onboarding to hit activation milestone in day 1, not day 7] | [Never-activated segment] | [+X pp P1 retention] | [Medium] | P1 |
+| 2 | [e.g. Launch re-engagement sequence at day 7 inactivity trigger] | [Dormant segment] | [+X pp P2 retention] | [Low] | P1 |
+| 3 | [e.g. Introduce power-user features earlier to accelerate habit formation] | [Casual users] | [+X pp P6 LTV] | [High] | P2 |
+
+---
+
+## 8. SQL Reference (if applicable)
+
+Provide the core cohort query so data teams can replicate or extend the analysis:
+
+```sql
+-- Retention cohort query
+SELECT
+  DATE_TRUNC('month', u.created_at) AS cohort_month,
+  DATE_TRUNC('month', e.event_date) AS activity_month,
+  DATEDIFF('month', u.created_at, e.event_date) AS period,
+  COUNT(DISTINCT e.user_id) AS retained_users,
+  COUNT(DISTINCT c.user_id) AS cohort_size,
+  ROUND(COUNT(DISTINCT e.user_id) * 100.0 / COUNT(DISTINCT c.user_id), 1) AS retention_rate
+FROM users u
+JOIN events e ON u.user_id = e.user_id
+JOIN (
+  SELECT user_id, DATE_TRUNC('month', created_at) AS cohort_month
+  FROM users
+  WHERE created_at >= '[start_date]'
+) c ON u.user_id = c.user_id AND DATE_TRUNC('month', u.created_at) = c.cohort_month
+WHERE e.event_type = '[key_retention_event]'
+GROUP BY 1, 2, 3
+ORDER BY 1, 3;
+```
+
+---
+
+## Quality Checks
+
+- [ ] Cohort definition is unambiguous — the same user cannot appear in two cohorts
+- [ ] Retention curve shows a clear plateau, or the analysis notes that the window is too short to see one
+- [ ] LTV projection uses observed retention, not assumed
+- [ ] Behavioural segments are mutually exclusive and exhaustive
+- [ ] Recommendations are tied to specific cohort or segment findings — not generic growth advice
+- [ ] Leading indicators are observable in production data, not just in theory
+
+## Example Trigger Phrases
+
+- "Run a cohort analysis for our SaaS product"
+- "Analyse retention by acquisition month for the last 12 cohorts"
+- "What's the LTV of users who came via paid vs organic?"
+- "Build a cohort retention model showing period 0 through period 12"
+- "Segment users by behaviour and show me which group retains best"
@@ -0,0 +1,221 @@
+---
+name: data-pipeline-spec
+description: "Design an ETL/ELT data pipeline specification. Use when asked to design a data pipeline, spec an ETL or ELT process, document a data ingestion workflow, or plan a data integration. Produces a complete pipeline spec with sources, transforms, destinations, SLAs, error handling, and data quality rules."
+---
+
+# Data Pipeline Spec Skill
+
+This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.
+
+## Required Inputs
+
+Ask the user for these if not provided:
+- **Pipeline purpose** — what business question or workflow does this pipeline serve?
+- **Source systems** — where does data come from? (databases, APIs, files, event streams)
+- **Destination** — where does data land? (data warehouse, data lake, downstream DB, reporting tool)
+- **Transformation type** — ETL (transform before loading) or ELT (load raw, transform in warehouse)?
+- **Frequency / SLA** — how often must data be fresh? (real-time / hourly / daily / weekly)
+- **Volume estimate** — approximate rows/events per run
+- **Data quality requirements** — completeness, deduplication, freshness, schema enforcement
+- **Team or stack** — any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)
+
+## Output Structure
+
+---
+
+# Data Pipeline Spec: [Pipeline Name]
+
+**Purpose:** [One sentence — what decision or workflow does this pipeline enable?]
+**Type:** [ETL / ELT / Streaming / Batch]
+**Owner:** [Team or individual]
+**Version:** [1.0]
+**Date:** [Date]
+**Status:** [Draft / Under Review / Approved]
+
+---
+
+## 1. Overview
+
+[2–3 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why.]
+
+**Architecture diagram (text):**
+
+```
+[Source A] ──┐
+[Source B] ──┤──► [Ingestion Layer] ──► [Transform Layer] ──► [Destination] ──► [Consumers]
+[Source C] ──┘
+```
+
+---
+
+## 2. Sources
+
+| Source | System | Connection type | Data format | Update pattern | Volume |
+|---|---|---|---|---|---|
+| [Source 1] | [PostgreSQL / Salesforce / S3 / Kafka] | [JDBC / REST API / SDK / Webhook] | [JSON / CSV / Parquet / CDC] | [Append / Full refresh / Incremental] | [X rows/day] |
+| [Source 2] | [...] | [...] | [...] | [...] | [...] |
+
+**Incremental key (if applicable):** [The column used to identify new or changed records — e.g. `updated_at`, `event_id`]
+
+**Authentication:** [API key / OAuth / IAM role / connection string — note where credentials are stored]
+
+---
+
+## 3. Ingestion Layer
+
+**Tool:** [Fivetran / Airbyte / Kafka Connect / custom script / dbt source]
+
+**Ingestion method:**
+- [ ] Full extract (full table refresh each run)
+- [ ] Incremental extract (only new/changed rows since last run)
+- [ ] CDC (change data capture from database transaction log)
+- [ ] Event streaming (continuous ingestion from Kafka/Kinesis)
+
+**Raw landing zone:** [Where raw data lands before transformation — e.g. `raw.salesforce_opportunities` in Snowflake, S3 bucket `s3://data-raw/crm/`]
+
+**Schema handling:** [Strict schema enforcement / Schema evolution allowed / Union schema]
+
+---
+
+## 4. Transformation Logic
+
+List each transformation in execution order. For ELT pipelines, this is the dbt model or SQL layer.
+
+| Step | Name | Description | Input | Output | Tool |
+|---|---|---|---|---|---|
+| 1 | [Deduplicate events] | [Remove duplicate event rows based on event_id] | `raw.events` | `staging.events_deduped` | [dbt / SQL / Spark] |
+| 2 | [Join user profile] | [Enrich events with user attributes from CRM] | `staging.events_deduped`, `raw.users` | `staging.events_enriched` | [...] |
+| 3 | [Aggregate to daily] | [Roll up to user×day grain] | `staging.events_enriched` | `mart.user_daily_activity` | [...] |
+
+**Business logic rules:**
+- [e.g. Revenue is recognised on `payment_confirmed_at`, not `payment_initiated_at`]
+- [e.g. Users in the `internal@company.com` domain are excluded from all metrics]
+- [e.g. Currency conversion uses the ECB rate from the first business day of each month]
+
+**Slowly Changing Dimensions (SCD) — if applicable:**
+- [e.g. `users.plan_tier` is SCD Type 2 — keep history of plan changes with `valid_from` / `valid_to`]
+
+---
+
+## 5. Destination
+
+| Destination | System | Schema / Table | Write mode | Consumers |
+|---|---|---|---|---|
+| [Primary] | [Snowflake / BigQuery / Redshift / PostgreSQL] | [`analytics.mart_user_activity`] | [Append / Upsert / Full replace] | [Looker / Metabase / downstream pipeline] |
+| [Secondary] | [...] | [...] | [...] | [...] |
+
+**Partitioning / Clustering:** [e.g. Partitioned by `event_date`, clustered by `user_id` — reduces query cost for time-range scans]
+
+**Retention policy:** [e.g. Raw data retained for 90 days; mart tables retained indefinitely]
+
+---
+
+## 6. Scheduling & SLAs
+
+| SLA | Target | Breach action |
+|---|---|---|
+| **Data freshness** | [Data must be ≤ X hours old by HH:MM UTC] | [Page on-call / alert Slack channel] |
+| **Pipeline completion** | [Must complete within X minutes of trigger] | [Alert and auto-retry] |
+| **Availability** | [Pipeline must run successfully X% of days per month] | [Incident review] |
+
+**Schedule:** [Cron expression and human description — e.g. `0 6 * * *` — daily at 06:00 UTC]
+
+**Trigger type:**
+- [ ] Time-based (cron)
+- [ ] Event-based (triggered by upstream pipeline success / file arrival / Kafka lag)
+- [ ] Manual (ad hoc runs only)
+
+**Backfill strategy:** [How to reprocess historical data if the pipeline fails or logic changes — e.g. parameterised date range, full drop-and-reload]
+
+---
+
+## 7. Data Quality Rules
+
+| Check | Table | Rule | Failure action |
+|---|---|---|---|
+| Completeness | `staging.events` | `event_id IS NOT NULL` — 100% of rows | Block load / Alert |
+| Uniqueness | `mart.user_daily_activity` | `(user_id, date)` must be unique | Block load |
+| Freshness | `mart.user_daily_activity` | `max(event_date) >= CURRENT_DATE - 1` | Alert |
+| Volume | `staging.events` | Row count within ±20% of 7-day average | Alert |
+| Referential integrity | `staging.events` | All `user_id` values exist in `users` table | Alert |
+
+**DQ tool:** [dbt tests / Great Expectations / Monte Carlo / custom SQL assertions]
+
+---
+
+## 8. Error Handling & Recovery
+
+**Retry policy:** [e.g. 3 retries with exponential back-off: 5 min, 20 min, 60 min]
+
+**Failure modes and responses:**
+
+| Failure | Detection | Response | Owner |
+|---|---|---|---|
+| Source unavailable | HTTP 5xx / connection timeout | Retry 3×, then alert and skip run | Data engineering |
+| Schema change in source | Column missing or type mismatch | Block load, alert schema owner | Data owner + engineering |
+| DQ check fails | dbt test failure / assertion error | Block load for P1 checks; alert for P2 | Data engineering |
+| Partial load | Row count < expected threshold | Alert; do not publish to consumers until resolved | Data engineering |
+
+**Dead-letter queue:** [Where failed records are routed for manual inspection — e.g. `raw.dlq_events`]
+
+---
+
+## 9. Monitoring & Observability
+
+**Metrics to track:**
+- Pipeline run duration (p50, p95)
+- Rows processed per run
+- DQ check pass rate
+- Source freshness lag
+- Error rate per source
+
+**Alerting:**
+- [Slack channel: #data-alerts]
+- [PagerDuty: data-on-call escalation for P1 SLA breaches]
+- [Dashboard: [link to monitoring dashboard]]
+
+**Logging:** [What gets logged and where — e.g. Airflow task logs to CloudWatch, structured JSON to data lake]
+
+---
+
+## 10. Dependencies & Sequencing
+
+**Upstream dependencies:** [Which pipelines or data sources must succeed before this pipeline runs?]
+
+**Downstream dependents:** [Which dashboards, pipelines, or models depend on this pipeline's output?]
+
+```
+[upstream pipeline A] ──► THIS PIPELINE ──► [downstream dashboard B]
+                                          └──► [downstream pipeline C]
+```
+
+**Coordination mechanism:** [Airflow DAG dependency / dbt ref() / event trigger / manual gate]
+
+---
+
+## 11. Security & Compliance
+
+- **PII fields:** [List columns containing PII — e.g. `email`, `ip_address`, `name`]
+- **Masking / Pseudonymisation:** [e.g. email hashed with SHA-256 before landing in mart layer]
+- **Access control:** [Who can query the destination tables? — e.g. Role-based access in Snowflake]
+- **Data residency:** [Which regions is data permitted to transit and rest in?]
+- **Audit trail:** [Is pipeline execution auditable for compliance purposes? Where are logs retained?]
+
+---
+
+## Quality Checks
+
+- [ ] Every source has an incremental key or full-refresh justification
+- [ ] Business logic rules are documented, not just the SQL
+- [ ] SLAs are agreed with consumers, not set unilaterally by engineering
+- [ ] DQ checks cover completeness, uniqueness, freshness, and volume
+- [ ] Failure modes include a documented recovery owner
+- [ ] PII fields are identified and a treatment plan is specified
+
+## Example Trigger Phrases
+
+- "Design a data pipeline for our Salesforce to Snowflake sync"
+- "Write a pipeline spec for ingesting Stripe events into our data warehouse"
+- "Build an ETL spec for our user activity data"
+- "Document our dbt pipeline from raw events to the analytics mart"
+- "Spec out the pipeline that feeds the executive dashboard"