feat: v12.0.0 — 150-skill milestone, 15 new skills across 10 bundles

Adds 15 new skills reaching the 150-skill milestone:

Data & Analytics (pm-data):
- cohort-analysis: retention curves, LTV projection, behavioural segmentation, SQL reference queries
- data-pipeline-spec: ETL/ELT design with SLAs, DQ rules, error handling, compliance

Customer Success (pm-cs):
- renewal-playbook: health snapshot, value story, commercial scenarios, objection responses, 16-week timeline
- customer-success-plan: joint success plan with milestones, mutual commitments, escalation path

People & Leadership (pm-people):
- 360-feedback-template: survey instrument + narrative report with strengths and development themes
- team-health-check: Spotify-model assessment across 7 dimensions with facilitation guide

Operations (pm-operations):
- risk-register: L×I scoring, RAG heat map, mitigation and contingency plans
- raci-matrix: role definitions, decision map, anti-pattern guide, communication template

Marketing & GTM (pm-gtm):
- social-media-strategy: audience profile, content pillars, KPIs, 4-week starter calendar
- product-positioning-doc: April Dunford-style positioning, messaging hierarchy, persona messaging

Discovery (pm-discovery):
- customer-journey-map: stage-by-stage journey with touchpoints, emotions, and prioritised opportunities

Delivery (pm-delivery):
- user-story-writer: Given/When/Then ACs, edge cases, definition of done, epic decomposition

Advanced (pm-advanced):
- ai-ethics-review: fairness, bias, transparency, privacy, safety, accountability, societal impact

Sales (pm-sales):
- partnership-proposal: mutual value, commercial model, joint GTM plan, governance

Design (pm-design):
- design-system-audit: component coverage, token consistency, WCAG, adoption, remediation roadmap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
mohitagw15856
2026-05-26 21:58:13 +01:00
parent 94e53d38a8
commit ae6ea4d53e
32 changed files with 6630 additions and 152 deletions
@@ -0,0 +1,187 @@
---
name: cohort-analysis
description: "Structure a cohort analysis for retention, LTV, or behavioural patterns. Use when asked to run a cohort analysis, analyse retention by cohort, segment users by behaviour over time, or calculate lifetime value by acquisition period. Produces a complete cohort analysis framework with methodology, cohort definitions, retention curves, and prioritised interventions."
---
# Cohort Analysis Skill
This skill produces a structured cohort analysis covering retention curves, LTV estimation, behavioural segmentation, and actionable interventions. Output is ready to present to product leadership or share with growth and data teams.
## Required Inputs
Ask the user for these if not provided:
- **Analysis goal** (retention improvement / LTV modelling / behavioural segmentation / churn prediction)
- **Product or feature being analysed**
- **Cohort definition** — what groups users? (acquisition month, signup channel, plan tier, feature adoption)
- **Observation window** — how many periods to track? (e.g. 12 months, 8 weeks)
- **Key metric** — what are you measuring per cohort? (retention rate, revenue, engagement score, feature usage)
- **Available data** — what tables/metrics are available? (paste schema or describe)
- **Baseline** — any existing retention benchmarks or goals?
## Output Structure
---
# Cohort Analysis: [Product / Feature]
**Analysis type:** [Retention / LTV / Behavioural / Churn]
**Cohort definition:** [Acquisition month / Signup channel / Plan tier / Feature adoption date]
**Observation window:** [X months / weeks]
**Primary metric:** [Metric name]
**Date prepared:** [Date]
---
## 1. Cohort Definitions
| Cohort | Period | Size | Description |
|---|---|---|---|
| [Cohort 1] | [Jan 2025] | [N users] | [e.g. Users who signed up in Jan 2025 via organic] |
| [Cohort 2] | [Feb 2025] | [N users] | [...] |
**Cohort logic:**
- Cohort entry event: [First sign-up / First purchase / Feature activation]
- Cohort exit criteria: [Churned / Downgraded / No activity for 30 days]
- Exclusions: [Trial users / Internal test accounts / Users with < X days of data]
---
## 2. Retention Curve
**How to read:** Each cell shows what % of the cohort performed the key metric in period N.
| Cohort | Period 0 | Period 1 | Period 2 | Period 3 | Period 6 | Period 12 |
|---|---|---|---|---|---|---|
| Jan 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| Feb 2025 | 100% | [X%] | [X%] | [X%] | [X%] | [X%] |
| [Trend] | — | [↑/↓ vs prior] | [...] | [...] | [...] | [...] |
**Retention plateau:** [At what period does retention flatten? What % does it flatten at?]
**Key observations:**
- [e.g. Period 1 → Period 2 drop is the largest — average X% churn in first 30 days]
- [e.g. Cohorts acquired via [channel] retain X% better at Period 6]
- [e.g. Retention has improved from X% → Y% at Period 3 comparing oldest to newest cohort]
---
## 3. LTV Projection (if applicable)
**ARPU per period:** [£/$/€ X per active user per month]
**Retention curve used:** [Which cohort or blended average]
| Period | Retained % | Revenue per user | Cumulative LTV |
|---|---|---|---|
| Month 1 | [X%] | [£X] | [£X] |
| Month 3 | [X%] | [£X] | [£X] |
| Month 6 | [X%] | [£X] | [£X] |
| Month 12 | [X%] | [£X] | [£X] |
**Blended LTV:** [£X at 12 months — based on blended retention across cohorts]
**LTV by segment:**
| Segment | LTV (12M) | vs Baseline |
|---|---|---|
| [Organic] | [£X] | [+X%] |
| [Paid] | [£X] | [-X%] |
| [Enterprise] | [£X] | [+X%] |
---
## 4. Behavioural Segmentation
Group cohorts by behaviour patterns, not just acquisition date:
| Segment | Definition | Size | Retention (P6) | LTV (12M) |
|---|---|---|---|---|
| **Power users** | [Used core feature ≥ 3x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Casual users** | [Used 12x/week in first 30 days] | [X%] | [X%] | [£X] |
| **Dormant** | [Logged in but did not use core feature] | [X%] | [X%] | [£X] |
| **Never activated** | [Signed up but never completed onboarding] | [X%] | [X%] | [£X] |
**Activation threshold insight:** [What action — taken within the first X days — most strongly predicts retention? This is the "aha moment" to optimise for.]
---
## 5. Leading Indicators of Churn
List the signals that appear **before** users churn, so teams can intervene:
| Signal | How early does it appear? | Churn correlation | Intervention |
|---|---|---|---|
| [No login for 7 days] | [7 days before churn] | [Strong] | [Re-engagement email sequence] |
| [Support ticket with escalation] | [14 days before churn] | [Moderate] | [CSM outreach within 48 hours] |
| [Feature usage dropped >50% WoW] | [10 days before churn] | [Strong] | [In-app nudge with use-case tutorial] |
---
## 6. Cohort Comparison: What's Changed Over Time
Compare oldest and newest cohorts to assess whether product improvements are showing up in retention:
| Metric | [Oldest cohort — e.g. Jan 2024] | [Newest cohort — e.g. Jan 2025] | Change |
|---|---|---|---|
| Period 1 retention | [X%] | [X%] | [↑/↓ X pp] |
| Period 3 retention | [X%] | [X%] | [↑/↓ X pp] |
| Activation rate | [X%] | [X%] | [↑/↓ X pp] |
| Avg. sessions in first 30 days | [X] | [X] | [↑/↓] |
**Verdict:** [Are more recent cohorts performing better or worse? What shipped in that period that might explain the change?]
---
## 7. Recommendations
Prioritise by impact on retention curve:
| # | Recommendation | Target segment | Expected impact | Effort | Priority |
|---|---|---|---|---|---|
| 1 | [e.g. Redesign onboarding to hit activation milestone in day 1, not day 7] | [Never-activated segment] | [+X pp P1 retention] | [Medium] | P1 |
| 2 | [e.g. Launch re-engagement sequence at day 7 inactivity trigger] | [Dormant segment] | [+X pp P2 retention] | [Low] | P1 |
| 3 | [e.g. Introduce power-user features earlier to accelerate habit formation] | [Casual users] | [+X pp P6 LTV] | [High] | P2 |
---
## 8. SQL Reference (if applicable)
Provide the core cohort query so data teams can replicate or extend the analysis:
```sql
-- Retention cohort query
SELECT
DATE_TRUNC('month', u.created_at) AS cohort_month,
DATE_TRUNC('month', e.event_date) AS activity_month,
DATEDIFF('month', u.created_at, e.event_date) AS period,
COUNT(DISTINCT e.user_id) AS retained_users,
COUNT(DISTINCT c.user_id) AS cohort_size,
ROUND(COUNT(DISTINCT e.user_id) * 100.0 / COUNT(DISTINCT c.user_id), 1) AS retention_rate
FROM users u
JOIN events e ON u.user_id = e.user_id
JOIN (
SELECT user_id, DATE_TRUNC('month', created_at) AS cohort_month
FROM users
WHERE created_at >= '[start_date]'
) c ON u.user_id = c.user_id AND DATE_TRUNC('month', u.created_at) = c.cohort_month
WHERE e.event_type = '[key_retention_event]'
GROUP BY 1, 2, 3
ORDER BY 1, 3;
```
---
## Quality Checks
- [ ] Cohort definition is unambiguous — the same user cannot appear in two cohorts
- [ ] Retention curve shows a clear plateau, or the analysis notes that the window is too short to see one
- [ ] LTV projection uses observed retention, not assumed
- [ ] Behavioural segments are mutually exclusive and exhaustive
- [ ] Recommendations are tied to specific cohort or segment findings — not generic growth advice
- [ ] Leading indicators are observable in production data, not just in theory
## Example Trigger Phrases
- "Run a cohort analysis for our SaaS product"
- "Analyse retention by acquisition month for the last 12 cohorts"
- "What's the LTV of users who came via paid vs organic?"
- "Build a cohort retention model showing period 0 through period 12"
- "Segment users by behaviour and show me which group retains best"
@@ -0,0 +1,221 @@
---
name: data-pipeline-spec
description: "Design an ETL/ELT data pipeline specification. Use when asked to design a data pipeline, spec an ETL or ELT process, document a data ingestion workflow, or plan a data integration. Produces a complete pipeline spec with sources, transforms, destinations, SLAs, error handling, and data quality rules."
---
# Data Pipeline Spec Skill
This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.
## Required Inputs
Ask the user for these if not provided:
- **Pipeline purpose** — what business question or workflow does this pipeline serve?
- **Source systems** — where does data come from? (databases, APIs, files, event streams)
- **Destination** — where does data land? (data warehouse, data lake, downstream DB, reporting tool)
- **Transformation type** — ETL (transform before loading) or ELT (load raw, transform in warehouse)?
- **Frequency / SLA** — how often must data be fresh? (real-time / hourly / daily / weekly)
- **Volume estimate** — approximate rows/events per run
- **Data quality requirements** — completeness, deduplication, freshness, schema enforcement
- **Team or stack** — any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)
## Output Structure
---
# Data Pipeline Spec: [Pipeline Name]
**Purpose:** [One sentence — what decision or workflow does this pipeline enable?]
**Type:** [ETL / ELT / Streaming / Batch]
**Owner:** [Team or individual]
**Version:** [1.0]
**Date:** [Date]
**Status:** [Draft / Under Review / Approved]
---
## 1. Overview
[23 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why.]
**Architecture diagram (text):**
```
[Source A] ──┐
[Source B] ──┤──► [Ingestion Layer] ──► [Transform Layer] ──► [Destination] ──► [Consumers]
[Source C] ──┘
```
---
## 2. Sources
| Source | System | Connection type | Data format | Update pattern | Volume |
|---|---|---|---|---|---|
| [Source 1] | [PostgreSQL / Salesforce / S3 / Kafka] | [JDBC / REST API / SDK / Webhook] | [JSON / CSV / Parquet / CDC] | [Append / Full refresh / Incremental] | [X rows/day] |
| [Source 2] | [...] | [...] | [...] | [...] | [...] |
**Incremental key (if applicable):** [The column used to identify new or changed records — e.g. `updated_at`, `event_id`]
**Authentication:** [API key / OAuth / IAM role / connection string — note where credentials are stored]
---
## 3. Ingestion Layer
**Tool:** [Fivetran / Airbyte / Kafka Connect / custom script / dbt source]
**Ingestion method:**
- [ ] Full extract (full table refresh each run)
- [ ] Incremental extract (only new/changed rows since last run)
- [ ] CDC (change data capture from database transaction log)
- [ ] Event streaming (continuous ingestion from Kafka/Kinesis)
**Raw landing zone:** [Where raw data lands before transformation — e.g. `raw.salesforce_opportunities` in Snowflake, S3 bucket `s3://data-raw/crm/`]
**Schema handling:** [Strict schema enforcement / Schema evolution allowed / Union schema]
---
## 4. Transformation Logic
List each transformation in execution order. For ELT pipelines, this is the dbt model or SQL layer.
| Step | Name | Description | Input | Output | Tool |
|---|---|---|---|---|---|
| 1 | [Deduplicate events] | [Remove duplicate event rows based on event_id] | `raw.events` | `staging.events_deduped` | [dbt / SQL / Spark] |
| 2 | [Join user profile] | [Enrich events with user attributes from CRM] | `staging.events_deduped`, `raw.users` | `staging.events_enriched` | [...] |
| 3 | [Aggregate to daily] | [Roll up to user×day grain] | `staging.events_enriched` | `mart.user_daily_activity` | [...] |
**Business logic rules:**
- [e.g. Revenue is recognised on `payment_confirmed_at`, not `payment_initiated_at`]
- [e.g. Users in the `internal@company.com` domain are excluded from all metrics]
- [e.g. Currency conversion uses the ECB rate from the first business day of each month]
**Slowly Changing Dimensions (SCD) — if applicable:**
- [e.g. `users.plan_tier` is SCD Type 2 — keep history of plan changes with `valid_from` / `valid_to`]
---
## 5. Destination
| Destination | System | Schema / Table | Write mode | Consumers |
|---|---|---|---|---|
| [Primary] | [Snowflake / BigQuery / Redshift / PostgreSQL] | [`analytics.mart_user_activity`] | [Append / Upsert / Full replace] | [Looker / Metabase / downstream pipeline] |
| [Secondary] | [...] | [...] | [...] | [...] |
**Partitioning / Clustering:** [e.g. Partitioned by `event_date`, clustered by `user_id` — reduces query cost for time-range scans]
**Retention policy:** [e.g. Raw data retained for 90 days; mart tables retained indefinitely]
---
## 6. Scheduling & SLAs
| SLA | Target | Breach action |
|---|---|---|
| **Data freshness** | [Data must be ≤ X hours old by HH:MM UTC] | [Page on-call / alert Slack channel] |
| **Pipeline completion** | [Must complete within X minutes of trigger] | [Alert and auto-retry] |
| **Availability** | [Pipeline must run successfully X% of days per month] | [Incident review] |
**Schedule:** [Cron expression and human description — e.g. `0 6 * * *` — daily at 06:00 UTC]
**Trigger type:**
- [ ] Time-based (cron)
- [ ] Event-based (triggered by upstream pipeline success / file arrival / Kafka lag)
- [ ] Manual (ad hoc runs only)
**Backfill strategy:** [How to reprocess historical data if the pipeline fails or logic changes — e.g. parameterised date range, full drop-and-reload]
---
## 7. Data Quality Rules
| Check | Table | Rule | Failure action |
|---|---|---|---|
| Completeness | `staging.events` | `event_id IS NOT NULL` — 100% of rows | Block load / Alert |
| Uniqueness | `mart.user_daily_activity` | `(user_id, date)` must be unique | Block load |
| Freshness | `mart.user_daily_activity` | `max(event_date) >= CURRENT_DATE - 1` | Alert |
| Volume | `staging.events` | Row count within ±20% of 7-day average | Alert |
| Referential integrity | `staging.events` | All `user_id` values exist in `users` table | Alert |
**DQ tool:** [dbt tests / Great Expectations / Monte Carlo / custom SQL assertions]
---
## 8. Error Handling & Recovery
**Retry policy:** [e.g. 3 retries with exponential back-off: 5 min, 20 min, 60 min]
**Failure modes and responses:**
| Failure | Detection | Response | Owner |
|---|---|---|---|
| Source unavailable | HTTP 5xx / connection timeout | Retry 3×, then alert and skip run | Data engineering |
| Schema change in source | Column missing or type mismatch | Block load, alert schema owner | Data owner + engineering |
| DQ check fails | dbt test failure / assertion error | Block load for P1 checks; alert for P2 | Data engineering |
| Partial load | Row count < expected threshold | Alert; do not publish to consumers until resolved | Data engineering |
**Dead-letter queue:** [Where failed records are routed for manual inspection — e.g. `raw.dlq_events`]
---
## 9. Monitoring & Observability
**Metrics to track:**
- Pipeline run duration (p50, p95)
- Rows processed per run
- DQ check pass rate
- Source freshness lag
- Error rate per source
**Alerting:**
- [Slack channel: #data-alerts]
- [PagerDuty: data-on-call escalation for P1 SLA breaches]
- [Dashboard: [link to monitoring dashboard]]
**Logging:** [What gets logged and where — e.g. Airflow task logs to CloudWatch, structured JSON to data lake]
---
## 10. Dependencies & Sequencing
**Upstream dependencies:** [Which pipelines or data sources must succeed before this pipeline runs?]
**Downstream dependents:** [Which dashboards, pipelines, or models depend on this pipeline's output?]
```
[upstream pipeline A] ──► THIS PIPELINE ──► [downstream dashboard B]
└──► [downstream pipeline C]
```
**Coordination mechanism:** [Airflow DAG dependency / dbt ref() / event trigger / manual gate]
---
## 11. Security & Compliance
- **PII fields:** [List columns containing PII — e.g. `email`, `ip_address`, `name`]
- **Masking / Pseudonymisation:** [e.g. email hashed with SHA-256 before landing in mart layer]
- **Access control:** [Who can query the destination tables? — e.g. Role-based access in Snowflake]
- **Data residency:** [Which regions is data permitted to transit and rest in?]
- **Audit trail:** [Is pipeline execution auditable for compliance purposes? Where are logs retained?]
---
## Quality Checks
- [ ] Every source has an incremental key or full-refresh justification
- [ ] Business logic rules are documented, not just the SQL
- [ ] SLAs are agreed with consumers, not set unilaterally by engineering
- [ ] DQ checks cover completeness, uniqueness, freshness, and volume
- [ ] Failure modes include a documented recovery owner
- [ ] PII fields are identified and a treatment plan is specified
## Example Trigger Phrases
- "Design a data pipeline for our Salesforce to Snowflake sync"
- "Write a pipeline spec for ingesting Stripe events into our data warehouse"
- "Build an ETL spec for our user activity data"
- "Document our dbt pipeline from raw events to the analytics mart"
- "Spec out the pipeline that feeds the executive dashboard"