feat: v12.0.0 — 150-skill milestone, 15 new skills across 10 bundles
Adds 15 new skills reaching the 150-skill milestone:

Data & Analytics (pm-data):
- cohort-analysis: retention curves, LTV projection, behavioural segmentation, SQL reference queries
- data-pipeline-spec: ETL/ELT design with SLAs, DQ rules, error handling, compliance

Customer Success (pm-cs):
- renewal-playbook: health snapshot, value story, commercial scenarios, objection responses, 16-week timeline
- customer-success-plan: joint success plan with milestones, mutual commitments, escalation path

People & Leadership (pm-people):
- 360-feedback-template: survey instrument + narrative report with strengths and development themes
- team-health-check: Spotify-model assessment across 7 dimensions with facilitation guide

Operations (pm-operations):
- risk-register: L×I scoring, RAG heat map, mitigation and contingency plans
- raci-matrix: role definitions, decision map, anti-pattern guide, communication template

Marketing & GTM (pm-gtm):
- social-media-strategy: audience profile, content pillars, KPIs, 4-week starter calendar
- product-positioning-doc: April Dunford-style positioning, messaging hierarchy, persona messaging

Discovery (pm-discovery):
- customer-journey-map: stage-by-stage journey with touchpoints, emotions, and prioritised opportunities

Delivery (pm-delivery):
- user-story-writer: Given/When/Then ACs, edge cases, definition of done, epic decomposition

Advanced (pm-advanced):
- ai-ethics-review: fairness, bias, transparency, privacy, safety, accountability, societal impact

Sales (pm-sales):
- partnership-proposal: mutual value, commercial model, joint GTM plan, governance

Design (pm-design):
- design-system-audit: component coverage, token consistency, WCAG, adoption, remediation roadmap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,187 @@
---
name: cohort-analysis
description: "Structure a cohort analysis for retention, LTV, or behavioural patterns. Use when asked to run a cohort analysis, analyse retention by cohort, segment users by behaviour over time, or calculate lifetime value by acquisition period. Produces a complete cohort analysis framework with methodology, cohort definitions, retention curves, and prioritised interventions."
---

# Cohort Analysis Skill

This skill produces a structured cohort analysis covering retention curves, LTV estimation, behavioural segmentation, and actionable interventions. Output is ready to present to product leadership or share with growth and data teams.

## Required Inputs

Ask the user for these if not provided:
- **Analysis goal** (retention improvement / LTV modelling / behavioural segmentation / churn prediction)
- **Product or feature being analysed**
- **Cohort definition** — what groups users? (acquisition month, signup channel, plan tier, feature adoption)
- **Observation window** — how many periods to track? (e.g. 12 months, 8 weeks)
- **Key metric** — what are you measuring per cohort? (retention rate, revenue, engagement score, feature usage)
- **Available data** — what tables/metrics are available? (paste schema or describe)
- **Baseline** — any existing retention benchmarks or goals?

## Output Structure

---

# Cohort Analysis: [Product / Feature]

**Analysis type:** [Retention / LTV / Behavioural / Churn]
**Cohort definition:** [Acquisition month / Signup channel / Plan tier / Feature adoption date]
**Observation window:** [X months / weeks]
**Primary metric:** [Metric name]
**Date prepared:** [Date]

---

## 1. Cohort Definitions

| Cohort | Period | Size | Description |
|---|---|---|---|
| [Cohort 1] | [Jan 2025] | [N users] | [e.g. Users who signed up in Jan 2025 via organic] |
| [Cohort 2] | [Feb 2025] | [N users] | [...] |

**Cohort logic:**
- Cohort entry event: [First sign-up / First purchase / Feature activation]
- Cohort exit criteria: [Churned / Downgraded / No activity for 30 days]
- Exclusions: [Trial users / Internal test accounts / Users with < X days of data]

---

## 2. Retention Curve

**How to read:** Each cell shows the percentage of the cohort that was active on the key metric in period N.

| Cohort | Period 0 | Period 1 | Period 2 | Period 3 | Period 6 | Period 12 |
|---|---|---|---|---|---|---|
| Jan 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| Feb 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| [Trend] | — | [↑/↓ vs prior] | [...] | [...] | [...] | [...] |

**Retention plateau:** [At what period does retention flatten? What % does it flatten at?]

**Key observations:**
- [e.g. Period 1 → Period 2 drop is the largest — average X% churn in first 30 days]
- [e.g. Cohorts acquired via [channel] retain X% better at Period 6]
- [e.g. Retention has improved from X% → Y% at Period 3 comparing oldest to newest cohort]
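
The retention table above can be derived from two inputs: each user's cohort period and the periods in which they performed the key event. The sketch below is one minimal way to do that in Python; the function name, the integer period indexes, and the sample data are all hypothetical and not part of the skill itself.

```python
from collections import defaultdict

def retention_matrix(signup_period, activity):
    """signup_period: {user_id: cohort period index}.
    activity: iterable of (user_id, period index) pairs for the key event.
    Returns {cohort: {periods since signup: percent of cohort active}}."""
    cohort_users = defaultdict(set)
    for user, cohort in signup_period.items():
        cohort_users[cohort].add(user)

    active = defaultdict(set)  # (cohort, offset) -> users active in that period
    for user, period in activity:
        cohort = signup_period.get(user)
        if cohort is None or period < cohort:
            continue  # skip unknown users and events dated before signup
        active[(cohort, period - cohort)].add(user)

    matrix = defaultdict(dict)
    for (cohort, offset), users in active.items():
        # Denominator is the full cohort, including users with no later activity
        matrix[cohort][offset] = round(100.0 * len(users) / len(cohort_users[cohort]), 1)
    return {c: dict(sorted(row.items())) for c, row in sorted(matrix.items())}

# Hypothetical data: period 0 = Jan 2025, period 1 = Feb 2025, and so on
signups = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}
events = [("a", 0), ("b", 0), ("c", 0), ("d", 0), ("a", 1), ("b", 1), ("a", 2),
          ("e", 1), ("f", 1), ("e", 2)]
matrix = retention_matrix(signups, events)
print(matrix)
```

The denominator is fixed at cohort size, so a user who goes quiet lowers the rate rather than dropping out of the calculation.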

---

## 3. LTV Projection (if applicable)

**ARPU per period:** [£/$/€ X per active user per month]
**Retention curve used:** [Which cohort or blended average]

| Period | Retained % | Revenue per user | Cumulative LTV |
|---|---|---|---|
| Month 1 | [X%] | [£X] | [£X] |
| Month 3 | [X%] | [£X] | [£X] |
| Month 6 | [X%] | [£X] | [£X] |
| Month 12 | [X%] | [£X] | [£X] |

**Blended LTV:** [£X at 12 months — based on blended retention across cohorts]

**LTV by segment:**

| Segment | LTV (12M) | vs Baseline |
|---|---|---|
| [Organic] | [£X] | [+X%] |
| [Paid] | [£X] | [-X%] |
| [Enterprise] | [£X] | [+X%] |
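
As a rough illustration of how the cumulative LTV column can be filled in, the sketch below multiplies each period's observed retention by ARPU and keeps a running total. It assumes a constant ARPU and no discounting; both are simplifications, and the function name and figures are hypothetical.

```python
def cumulative_ltv(retention_pct, arpu):
    """retention_pct: observed % of the cohort active in each period, from period 0.
    arpu: revenue per active user per period (assumed constant here).
    Returns the cumulative LTV per acquired user after each period."""
    total = 0.0
    curve = []
    for pct in retention_pct:
        total += (pct / 100.0) * arpu  # expected revenue from one acquired user
        curve.append(round(total, 2))
    return curve

# Hypothetical observed retention for four periods and an ARPU of 20 per period
observed = [100, 60, 50, 45]
ltv_curve = cumulative_ltv(observed, arpu=20.0)
print(ltv_curve)  # [20.0, 32.0, 42.0, 51.0]
```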

---

## 4. Behavioural Segmentation

Group cohorts by behaviour patterns, not just acquisition date:

| Segment | Definition | Size | Retention (P6) | LTV (12M) |
|---|---|---|---|---|
| **Power users** | [Used core feature ≥ 3x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Casual users** | [Used 1–2x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Dormant** | [Logged in but did not use core feature] | [X%] | [X%] | [£X] |
| **Never activated** | [Signed up but never completed onboarding] | [X%] | [X%] | [£X] |

**Activation threshold insight:** [What action — taken within the first X days — most strongly predicts retention? This is the "aha moment" to optimise for.]
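
One way to keep segments mutually exclusive and exhaustive is to evaluate the rules in a fixed order and return the first match. The sketch below does that for the four example segments; the field names (`onboarded`, `core_uses_per_week`) and the exact cut-offs are assumptions chosen for illustration.

```python
def segment(user):
    """user: dict with hypothetical fields `onboarded` (bool) and
    `core_uses_per_week` (average over the first 30 days).
    Rules run in order, so every user lands in exactly one segment."""
    if not user["onboarded"]:
        return "never_activated"
    uses = user["core_uses_per_week"]
    if uses >= 3:
        return "power"
    if uses >= 1:
        return "casual"
    return "dormant"  # onboarded, but used the core feature less than weekly

examples = [
    {"onboarded": True, "core_uses_per_week": 4.0},
    {"onboarded": True, "core_uses_per_week": 1.5},
    {"onboarded": True, "core_uses_per_week": 0.0},
    {"onboarded": False, "core_uses_per_week": 0.0},
]
labels = [segment(u) for u in examples]
print(labels)
```

Because the final branch is a catch-all, fractional averages such as 2.5 uses per week still fall into a segment instead of slipping between definitions.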

---

## 5. Leading Indicators of Churn

List the signals that appear **before** users churn, so teams can intervene:

| Signal | How early does it appear? | Churn correlation | Intervention |
|---|---|---|---|
| [No login for 7 days] | [7 days before churn] | [Strong] | [Re-engagement email sequence] |
| [Support ticket with escalation] | [14 days before churn] | [Moderate] | [CSM outreach within 48 hours] |
| [Feature usage dropped >50% WoW] | [10 days before churn] | [Strong] | [In-app nudge with use-case tutorial] |
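
A signal such as a week-over-week usage drop only helps if it can be computed from production data. The sketch below shows one possible check for the ">50% WoW" example; the function name and input shape are hypothetical.

```python
def usage_drop_flag(weekly_usage, threshold=0.5):
    """weekly_usage: weekly core-feature counts for one user, oldest first.
    Returns True when the latest week fell by more than `threshold`
    (as a fraction) compared with the week before."""
    if len(weekly_usage) < 2 or weekly_usage[-2] == 0:
        return False  # not enough history, or nothing to drop from
    prev, curr = weekly_usage[-2], weekly_usage[-1]
    return (prev - curr) / prev > threshold

print(usage_drop_flag([12, 10, 4]))  # True: 10 -> 4 is a 60% drop
print(usage_drop_flag([12, 10, 5]))  # False: exactly 50% is not "more than 50%"
```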

---

## 6. Cohort Comparison: What's Changed Over Time

Compare oldest and newest cohorts to assess whether product improvements are showing up in retention:

| Metric | [Oldest cohort — e.g. Jan 2024] | [Newest cohort — e.g. Jan 2025] | Change |
|---|---|---|---|
| Period 1 retention | [X%] | [X%] | [↑/↓ X pp] |
| Period 3 retention | [X%] | [X%] | [↑/↓ X pp] |
| Activation rate | [X%] | [X%] | [↑/↓ X pp] |
| Avg. sessions in first 30 days | [X] | [X] | [↑/↓] |

**Verdict:** [Are more recent cohorts performing better or worse? What shipped in that period that might explain the change?]

---

## 7. Recommendations

Prioritise by impact on retention curve:

| # | Recommendation | Target segment | Expected impact | Effort | Priority |
|---|---|---|---|---|---|
| 1 | [e.g. Redesign onboarding to hit activation milestone in day 1, not day 7] | [Never-activated segment] | [+X pp P1 retention] | [Medium] | P1 |
| 2 | [e.g. Launch re-engagement sequence at day 7 inactivity trigger] | [Dormant segment] | [+X pp P2 retention] | [Low] | P1 |
| 3 | [e.g. Introduce power-user features earlier to accelerate habit formation] | [Casual users] | [+X% LTV at P6] | [High] | P2 |

---

## 8. SQL Reference (if applicable)

Provide the core cohort query so data teams can replicate or extend the analysis:

```sql
-- Retention cohort query.
-- DATE_TRUNC / DATEDIFF syntax varies by warehouse; adapt to your dialect.
WITH cohorts AS (
    -- One row per user with their cohort month
    SELECT user_id, DATE_TRUNC('month', created_at) AS cohort_month
    FROM users
    WHERE created_at >= '[start_date]'
),
cohort_sizes AS (
    -- Count the cohort before joining to events, so inactive users stay in the denominator
    SELECT cohort_month, COUNT(DISTINCT user_id) AS cohort_size
    FROM cohorts
    GROUP BY 1
),
activity AS (
    SELECT
        c.cohort_month,
        DATE_TRUNC('month', e.event_date) AS activity_month,
        DATEDIFF('month', c.cohort_month, DATE_TRUNC('month', e.event_date)) AS period,
        COUNT(DISTINCT e.user_id) AS retained_users
    FROM cohorts c
    JOIN events e ON c.user_id = e.user_id
    WHERE e.event_type = '[key_retention_event]'
    GROUP BY 1, 2, 3
)
SELECT
    a.cohort_month,
    a.activity_month,
    a.period,
    a.retained_users,
    s.cohort_size,
    ROUND(a.retained_users * 100.0 / s.cohort_size, 1) AS retention_rate
FROM activity a
JOIN cohort_sizes s ON a.cohort_month = s.cohort_month
ORDER BY 1, 3;
```

---

## Quality Checks

- [ ] Cohort definition is unambiguous — the same user cannot appear in two cohorts
- [ ] Retention curve shows a clear plateau, or the analysis notes that the window is too short to see one
- [ ] LTV projection uses observed retention, not assumed
- [ ] Behavioural segments are mutually exclusive and exhaustive
- [ ] Recommendations are tied to specific cohort or segment findings — not generic growth advice
- [ ] Leading indicators are observable in production data, not just in theory

## Example Trigger Phrases

- "Run a cohort analysis for our SaaS product"
- "Analyse retention by acquisition month for the last 12 cohorts"
- "What's the LTV of users who came via paid vs organic?"
- "Build a cohort retention model showing period 0 through period 12"
- "Segment users by behaviour and show me which group retains best"
@@ -0,0 +1,221 @@
---
name: data-pipeline-spec
description: "Design an ETL/ELT data pipeline specification. Use when asked to design a data pipeline, spec an ETL or ELT process, document a data ingestion workflow, or plan a data integration. Produces a complete pipeline spec with sources, transforms, destinations, SLAs, error handling, and data quality rules."
---

# Data Pipeline Spec Skill

This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.

## Required Inputs

Ask the user for these if not provided:
- **Pipeline purpose** — what business question or workflow does this pipeline serve?
- **Source systems** — where does data come from? (databases, APIs, files, event streams)
- **Destination** — where does data land? (data warehouse, data lake, downstream DB, reporting tool)
- **Transformation type** — ETL (transform before loading) or ELT (load raw, transform in warehouse)?
- **Frequency / SLA** — how often must data be fresh? (real-time / hourly / daily / weekly)
- **Volume estimate** — approximate rows/events per run
- **Data quality requirements** — completeness, deduplication, freshness, schema enforcement
- **Team or stack** — any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)

## Output Structure

---

# Data Pipeline Spec: [Pipeline Name]

**Purpose:** [One sentence — what decision or workflow does this pipeline enable?]
**Type:** [ETL / ELT / Streaming / Batch]
**Owner:** [Team or individual]
**Version:** [1.0]
**Date:** [Date]
**Status:** [Draft / Under Review / Approved]

---

## 1. Overview

[2–3 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why.]

**Architecture diagram (text):**

```
[Source A] ──┐
[Source B] ──┤──► [Ingestion Layer] ──► [Transform Layer] ──► [Destination] ──► [Consumers]
[Source C] ──┘
```

---

## 2. Sources

| Source | System | Connection type | Data format | Update pattern | Volume |
|---|---|---|---|---|---|
| [Source 1] | [PostgreSQL / Salesforce / S3 / Kafka] | [JDBC / REST API / SDK / Webhook] | [JSON / CSV / Parquet / CDC] | [Append / Full refresh / Incremental] | [X rows/day] |
| [Source 2] | [...] | [...] | [...] | [...] | [...] |

**Incremental key (if applicable):** [The column used to identify new or changed records — e.g. `updated_at`, `event_id`]

**Authentication:** [API key / OAuth / IAM role / connection string — note where credentials are stored]

---

## 3. Ingestion Layer

**Tool:** [Fivetran / Airbyte / Kafka Connect / custom script / dbt source]

**Ingestion method:**
- [ ] Full extract (full table refresh each run)
- [ ] Incremental extract (only new/changed rows since last run)
- [ ] CDC (change data capture from database transaction log)
- [ ] Event streaming (continuous ingestion from Kafka/Kinesis)

**Raw landing zone:** [Where raw data lands before transformation — e.g. `raw.salesforce_opportunities` in Snowflake, S3 bucket `s3://data-raw/crm/`]

**Schema handling:** [Strict schema enforcement / Schema evolution allowed / Union schema]
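
For the incremental extract option, a common pattern is to persist a high-water mark on the incremental key and pull only rows beyond it on the next run. The sketch below illustrates the idea in plain Python; the function name and sample rows are hypothetical, and it assumes `updated_at` values are ISO-8601 strings in a single timezone so that string comparison matches time order.

```python
def incremental_extract(rows, last_watermark, key="updated_at"):
    """Return the rows changed since `last_watermark` plus the new watermark
    to persist for the next run. If nothing changed, the watermark is kept."""
    new_rows = [r for r in rows if r[key] > last_watermark]
    new_watermark = max((r[key] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

# Hypothetical source table snapshot
source = [
    {"id": 1, "updated_at": "2025-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2025-01-02T00:00:00Z"},
    {"id": 3, "updated_at": "2025-01-03T00:00:00Z"},
]
batch, watermark = incremental_extract(source, "2025-01-01T00:00:00Z")
print([r["id"] for r in batch], watermark)
```

A strict `>` comparison can miss rows that share the watermark timestamp but were written later, so some teams re-read a small overlap window and deduplicate downstream.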

---

## 4. Transformation Logic

List each transformation in execution order. For ELT pipelines, this is the dbt model or SQL layer.

| Step | Name | Description | Input | Output | Tool |
|---|---|---|---|---|---|
| 1 | [Deduplicate events] | [Remove duplicate event rows based on event_id] | `raw.events` | `staging.events_deduped` | [dbt / SQL / Spark] |
| 2 | [Join user profile] | [Enrich events with user attributes from CRM] | `staging.events_deduped`, `raw.users` | `staging.events_enriched` | [...] |
| 3 | [Aggregate to daily] | [Roll up to user×day grain] | `staging.events_enriched` | `mart.user_daily_activity` | [...] |

**Business logic rules:**
- [e.g. Revenue is recognised on `payment_confirmed_at`, not `payment_initiated_at`]
- [e.g. Users with an internal `@company.com` email address are excluded from all metrics]
- [e.g. Currency conversion uses the ECB rate from the first business day of each month]

**Slowly Changing Dimensions (SCD) — if applicable:**
- [e.g. `users.plan_tier` is SCD Type 2 — keep history of plan changes with `valid_from` / `valid_to`]
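
To make the example steps concrete, the sketch below walks through step 1 (deduplicate on `event_id`) and step 3 (roll up to user×day) in plain Python. It skips the enrichment join, and the function names and sample rows are hypothetical; in an ELT pipeline the same logic would normally live in SQL or dbt models.

```python
from collections import Counter

def dedupe_events(events):
    """Keep the first row seen for each event_id."""
    seen, unique = set(), []
    for e in events:
        if e["event_id"] not in seen:
            seen.add(e["event_id"])
            unique.append(e)
    return unique

def daily_activity(events):
    """Roll events up to one count per (user_id, event_date)."""
    return Counter((e["user_id"], e["event_date"]) for e in events)

# Hypothetical raw events, including one duplicate delivery of e1
raw = [
    {"event_id": "e1", "user_id": "u1", "event_date": "2025-01-01"},
    {"event_id": "e1", "user_id": "u1", "event_date": "2025-01-01"},
    {"event_id": "e2", "user_id": "u1", "event_date": "2025-01-01"},
    {"event_id": "e3", "user_id": "u2", "event_date": "2025-01-02"},
]
mart = daily_activity(dedupe_events(raw))
print(dict(mart))
```

Running the aggregate on the deduplicated rows is what keeps a redelivered event from inflating the daily count.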

---

## 5. Destination

| Destination | System | Schema / Table | Write mode | Consumers |
|---|---|---|---|---|
| [Primary] | [Snowflake / BigQuery / Redshift / PostgreSQL] | [`analytics.mart_user_activity`] | [Append / Upsert / Full replace] | [Looker / Metabase / downstream pipeline] |
| [Secondary] | [...] | [...] | [...] | [...] |

**Partitioning / Clustering:** [e.g. Partitioned by `event_date`, clustered by `user_id` — reduces query cost for time-range scans]

**Retention policy:** [e.g. Raw data retained for 90 days; mart tables retained indefinitely]

---

## 6. Scheduling & SLAs

| SLA | Target | Breach action |
|---|---|---|
| **Data freshness** | [Data must be ≤ X hours old by HH:MM UTC] | [Page on-call / alert Slack channel] |
| **Pipeline completion** | [Must complete within X minutes of trigger] | [Alert and auto-retry] |
| **Availability** | [Pipeline must run successfully X% of days per month] | [Incident review] |

**Schedule:** [Cron expression and human description — e.g. `0 6 * * *` — daily at 06:00 UTC]

**Trigger type:**
- [ ] Time-based (cron)
- [ ] Event-based (triggered by upstream pipeline success / file arrival / Kafka lag)
- [ ] Manual (ad hoc runs only)

**Backfill strategy:** [How to reprocess historical data if the pipeline fails or logic changes — e.g. parameterised date range, full drop-and-reload]

---

## 7. Data Quality Rules

| Check | Table | Rule | Failure action |
|---|---|---|---|
| Completeness | `staging.events` | `event_id IS NOT NULL` — 100% of rows | Block load / Alert |
| Uniqueness | `mart.user_daily_activity` | `(user_id, date)` must be unique | Block load |
| Freshness | `mart.user_daily_activity` | `max(event_date) >= CURRENT_DATE - 1` | Alert |
| Volume | `staging.events` | Row count within ±20% of 7-day average | Alert |
| Referential integrity | `staging.events` | All `user_id` values exist in `users` table | Alert |

**DQ tool:** [dbt tests / Great Expectations / Monte Carlo / custom SQL assertions]
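
If the team opts for custom assertions rather than a dedicated DQ tool, the completeness, uniqueness, and volume rules above reduce to small predicate functions. The sketch below is one possible shape for them; the function names and sample rows are hypothetical and not taken from any DQ library.

```python
def check_completeness(rows, column):
    """True when every row has a non-null value in `column`."""
    return all(r.get(column) is not None for r in rows)

def check_uniqueness(rows, columns):
    """True when the combination of `columns` is unique across rows."""
    keys = [tuple(r[c] for c in columns) for r in rows]
    return len(keys) == len(set(keys))

def check_volume(row_count, trailing_counts, tolerance=0.20):
    """True when row_count is within ±tolerance of the trailing average."""
    avg = sum(trailing_counts) / len(trailing_counts)
    return abs(row_count - avg) <= tolerance * avg

# Hypothetical mart rows: the second and third share (user_id, date)
rows = [
    {"user_id": "u1", "date": "2025-01-01"},
    {"user_id": "u2", "date": "2025-01-01"},
    {"user_id": "u2", "date": "2025-01-01"},
]
print(check_completeness(rows, "user_id"))         # True
print(check_uniqueness(rows, ["user_id", "date"]))  # False
print(check_volume(115, [100] * 7))                 # True
```

Each check returns a boolean, so the caller can decide whether a failure blocks the load or only raises an alert, as the table's last column requires.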

---

## 8. Error Handling & Recovery

**Retry policy:** [e.g. 3 retries with exponential back-off: 5 min, 20 min, 60 min]

**Failure modes and responses:**

| Failure | Detection | Response | Owner |
|---|---|---|---|
| Source unavailable | HTTP 5xx / connection timeout | Retry 3×, then alert and skip run | Data engineering |
| Schema change in source | Column missing or type mismatch | Block load, alert schema owner | Data owner + engineering |
| DQ check fails | dbt test failure / assertion error | Block load for P1 checks; alert for P2 | Data engineering |
| Partial load | Row count < expected threshold | Alert; do not publish to consumers until resolved | Data engineering |

**Dead-letter queue:** [Where failed records are routed for manual inspection — e.g. `raw.dlq_events`]
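
The example retry policy can be sketched as exponential back-off with a cap. The version below assumes a 5-minute base delay, a multiplier of 4, and a 60-minute ceiling, which reproduces the 5, 20, 60 minute schedule; those parameters, the function name, and the simulated task are all illustrative assumptions.

```python
import time

def run_with_retries(task, retries=3, base_delay=300, factor=4,
                     max_delay=3600, sleep=time.sleep):
    """Call task(); after each failure wait base_delay * factor**attempt seconds,
    capped at max_delay, then try again. Re-raises the last error once all
    retries are used up."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise
            sleep(min(base_delay * factor ** attempt, max_delay))

# Simulated source that fails three times, then succeeds
calls = {"n": 0}
def flaky_source():
    calls["n"] += 1
    if calls["n"] < 4:
        raise ConnectionError("source unavailable")
    return "ok"

waits = []  # record the delays instead of actually sleeping
result = run_with_retries(flaky_source, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the function testable; in production the default `time.sleep` or the orchestrator's own retry settings would apply.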

---

## 9. Monitoring & Observability

**Metrics to track:**
- Pipeline run duration (p50, p95)
- Rows processed per run
- DQ check pass rate
- Source freshness lag
- Error rate per source

**Alerting:**
- [Slack channel: #data-alerts]
- [PagerDuty: data-on-call escalation for P1 SLA breaches]
- [Dashboard: [link to monitoring dashboard]]

**Logging:** [What gets logged and where — e.g. Airflow task logs to CloudWatch, structured JSON to data lake]
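
For the run-duration metric, p50 and p95 can be computed from recent run times with a nearest-rank percentile. The sketch below is a minimal stdlib version; monitoring tools often use interpolated percentiles instead, so results may differ slightly, and the sample durations are hypothetical.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    observations at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical run durations in minutes for the last ten runs
durations = [12, 15, 11, 14, 40, 13, 12, 16, 13, 90]
p50 = percentile(durations, 50)
p95 = percentile(durations, 95)
print(p50, p95)  # 13 90
```

Tracking p95 alongside p50 surfaces the occasional slow run that a median alone would hide.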

---

## 10. Dependencies & Sequencing

**Upstream dependencies:** [Which pipelines or data sources must succeed before this pipeline runs?]

**Downstream dependents:** [Which dashboards, pipelines, or models depend on this pipeline's output?]

```
[upstream pipeline A] ──► THIS PIPELINE ──► [downstream dashboard B]
                                        └──► [downstream pipeline C]
```

**Coordination mechanism:** [Airflow DAG dependency / dbt ref() / event trigger / manual gate]

---

## 11. Security & Compliance

- **PII fields:** [List columns containing PII — e.g. `email`, `ip_address`, `name`]
- **Masking / Pseudonymisation:** [e.g. email hashed with SHA-256 before landing in mart layer]
- **Access control:** [Who can query the destination tables? — e.g. Role-based access in Snowflake]
- **Data residency:** [Which regions is data permitted to transit and rest in?]
- **Audit trail:** [Is pipeline execution auditable for compliance purposes? Where are logs retained?]
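
The SHA-256 masking example can be illustrated with Python's `hashlib`. The sketch below normalises the address before hashing so the same person always maps to the same token; the optional `salt` parameter and the function name are assumptions added for illustration, not something the spec template prescribes.

```python
import hashlib

def pseudonymise_email(email, salt=""):
    """Return a hex SHA-256 digest of the normalised email.
    `salt` is a hypothetical secret prefix that makes precomputed
    lookups of common addresses harder."""
    normalised = email.strip().lower()
    return hashlib.sha256((salt + normalised).encode("utf-8")).hexdigest()

token_a = pseudonymise_email("Jane.Doe@example.com")
token_b = pseudonymise_email("  jane.doe@example.com ")
print(token_a == token_b)  # True: casing and whitespace do not change the token
```

An unsalted hash of an email address can often be reversed by hashing candidate addresses, so whether a salt or keyed hash is needed is a decision for the security review.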

---

## Quality Checks

- [ ] Every source has an incremental key or full-refresh justification
- [ ] Business logic rules are documented, not just the SQL
- [ ] SLAs are agreed with consumers, not set unilaterally by engineering
- [ ] DQ checks cover completeness, uniqueness, freshness, and volume
- [ ] Failure modes include a documented recovery owner
- [ ] PII fields are identified and a treatment plan is specified

## Example Trigger Phrases

- "Design a data pipeline for our Salesforce to Snowflake sync"
- "Write a pipeline spec for ingesting Stripe events into our data warehouse"
- "Build an ETL spec for our user activity data"
- "Document our dbt pipeline from raw events to the analytics mart"
- "Spec out the pipeline that feeds the executive dashboard"