---
description: "Write an engineering RFC (Request for Comments) for a technical decision, architectural change, or significant implementation approach. Use when asked to write an RFC, document a technical proposal, create a design doc, write an architecture decision for review, or produce a technical specification for team feedback. Produces a complete RFC document covering problem statement, motivation, proposed solution, alternatives rejected, implementation plan, migration plan, security and performance implications, observability changes, rollout plan, and open questions."
globs:
alwaysApply: false
---

# RFC Writer Skill

Produce a complete engineering RFC (Request for Comments) for a technical decision or architectural change. An RFC is a structured proposal document — not a persuasion document. Its purpose is to expose a decision to scrutiny, surface trade-offs, document alternatives considered, and create a permanent record of why a choice was made.

A good RFC makes it possible for someone who wasn't in the room to understand years later why the team built something the way they did.

## Required Inputs

Ask for these if not already provided:
- **RFC title and author** — what this RFC is about and who is proposing it
- **Problem being solved** — what is broken, missing, or inadequate today; why action is needed now
- **Proposed solution** — the approach the author is recommending, at least at a high level
- **Context and constraints** — team size, existing architecture, timeline pressures, budget limits, compliance requirements
- **Alternatives considered** — at least 2 alternative approaches the author has thought about
- **Current status** — is this pre-decision (seeking feedback) or post-decision (documenting a made decision)?

## Output Format

---

# RFC [Number]: [Title]

**Author:** [Name] | **Team:** [Team name]
**Created:** [Date] | **Last updated:** [Date]
**Status:** Draft | In Review | Approved | Rejected | Superseded by RFC-[X]
**Ticket:** [JIRA-XXX] | **Slack thread:** [#channel link]
**Review deadline:** [Date — when comments should be submitted by]

---

## Abstract

[2–4 sentences summarising the entire RFC. Should stand alone — someone reading only this should understand what is being proposed, why, and what the main trade-off is. Write this last.]

---

## 1. Problem Statement

[Describe the problem being solved. Focus on the *problem*, not the solution. Be specific and quantified where possible.]

**Current state:**
[Describe how things work today — the existing system, process, or architecture. Include any relevant constraints or limitations.]

**Why this is a problem now:**
[Why is this being addressed now rather than earlier or later? Reference metrics, incidents, product requirements, or scaling thresholds that make this urgent or timely.]

**Example of the problem in practice:**
[A concrete scenario or incident that illustrates the problem. This helps reviewers understand the real-world impact, not just the abstract description.]

```
// Example: current behaviour that illustrates the problem
[code snippet, log output, or sequence description showing the problem]
```

**Impact of not solving this:**
- [Impact 1 — e.g. "New tenant onboarding requires 3 hours of manual configuration per account"]
- [Impact 2 — e.g. "Auth service handles 400 req/s; projected to hit capacity within 8 weeks at current growth"]
- [Impact 3 — e.g. "Current approach is incompatible with the upcoming multi-region requirement"]

---

## 2. Goals and Non-Goals

**Goals:**
- [ ] [Specific, measurable outcome — e.g. "Reduce tenant onboarding time from 3 hours to <5 minutes"]
- [ ] [e.g. "Support 2,000 req/s on the auth service with P99 latency ≤50ms"]
- [ ] [e.g. "Enable multi-region deployment without changes to the application layer"]

**Non-goals:** *(what this RFC explicitly does not address)*
- [e.g. "This RFC does not address authentication for internal service-to-service calls — see RFC-042"]
- [e.g. "Performance improvements to the existing system — this RFC replaces it"]
- [e.g. "Migration of historical data — covered in a follow-on RFC"]

**Success metrics:**
| Metric | Current | Target | Measurement method |
|---|---|---|---|
| [e.g. Onboarding time] | [3 hours] | [<5 minutes] | [Prometheus histogram on onboarding job duration] |
| [e.g. Auth latency P99] | [120ms] | [≤50ms] | [Datadog APM] |
| [e.g. Engineer setup time] | [4 hours] | [<30 minutes] | [Onboarding survey] |

---

## 3. Background and Motivation

[Provide the context a reviewer needs to evaluate the proposal. This is not a repeat of the problem statement — it is the surrounding technical and business context.]

**Existing system overview:**
[Describe the relevant parts of the current architecture. Include an ASCII diagram if the relationships between components help understanding.]

```
[ASCII diagram of current architecture — optional but strongly recommended for architectural RFCs]

┌──────────┐     ┌──────────────┐     ┌──────────────┐
│  Client  │────▶│  [Service A] │────▶│  [Service B] │
└──────────┘     └──────────────┘     └──────────────┘
                        │
                        ▼
                 ┌──────────────┐
                 │  [Database]  │
                 └──────────────┘
```

**Prior work and related decisions:**
- [RFC-XXX: Title — relevant previous decision; link]
- [ADR-XXX: Title — architectural decision record]
- [Any external standards, blog posts, or vendor documentation that informs this proposal]

**Constraints:**
- [e.g. Must remain backward compatible with v1 API clients for 12 months]
- [e.g. Team has no Rust expertise — solution must be in Python or Go]
- [e.g. Must be deployable without a maintenance window]

---

## 4. Proposed Solution

[Describe the proposed approach clearly and specifically. Include enough detail that an engineer could begin implementing from this document, but don't write the code — that is for the PR.]

### 4.1 High-Level Approach

[1–3 paragraphs describing the overall solution. Explain the key idea and why it solves the problem.]

### 4.2 Architecture

```
[ASCII diagram of the proposed architecture — what the system looks like after this RFC is implemented]

┌──────────┐     ┌──────────────────┐     ┌──────────────┐
│  Client  │────▶│  [New Component] │────▶│  [Service B] │
└──────────┘     └──────────────────┘     └──────────────┘
                         │                       │
                         ▼                       ▼
                 ┌──────────────┐         ┌──────────────┐
                 │  [Store A]   │         │  [Store B]   │
                 └──────────────┘         └──────────────┘
```

### 4.3 Detailed Design

[Break the solution into its key components or decisions. For each, explain what it does and why it was designed this way.]

**Component / Decision 1: [Name]**

[Description of this component — what it does, how it works, why this approach was chosen.]

```
// Example interface, API contract, or pseudocode (not implementation code)
[Relevant schema, API definition, data flow, or pseudocode]
```

**Component / Decision 2: [Name]**

[Description]

**Component / Decision 3: [Name]**

[Description]

### 4.4 API Changes

*Complete this section if the RFC introduces or modifies any API endpoints, events, or interfaces.*

**New endpoints / events:**
```
[HTTP method + path or event name]
Request: { ... }
Response: { ... }
```

**Modified endpoints:**
- `[endpoint]`: [what changes and why; backward compatibility note]

**Deprecated endpoints:**
- `[endpoint]`: deprecated in favour of `[new endpoint]` — removal timeline: [date/version]

### 4.5 Data Model Changes

*Complete this section if any database schema or data structure changes are required.*

[Describe schema changes at a high level. Reference the database-migration-plan skill for detailed migration steps.]

```sql
-- Key schema changes (abbreviated — full migration in [link])
[DDL statements for key additions/changes]
```

---

## 5. Alternatives Considered

*Every alternative must include an explicit reason why it was rejected. "We went with the proposed solution" is not a reason.*

### Alternative 1: [Name]

**Description:**
[What this alternative would involve.]

**Pros:**
- [Pro 1]
- [Pro 2]

**Cons:**
- [Con 1]
- [Con 2]

**Why rejected:**
[Specific reason — e.g. "Requires 3× the infrastructure cost", "Incompatible with multi-region requirement", "Team has no expertise in this technology and the ramp-up would miss the Q3 deadline"]

---

### Alternative 2: [Name]

**Description:**
[What this alternative would involve.]

**Pros:**
- [Pro 1]
- [Pro 2]

**Cons:**
- [Con 1]
- [Con 2]

**Why rejected:**
[Specific reason]

---

### Alternative 3: Do nothing / defer

**Description:**
Accept the current state and revisit the problem in [timeframe].

**Why rejected:**
[Why deferring is not acceptable — reference the impact of not solving this from Section 1.]

---

## 6. Implementation Plan

**Estimated effort:** [X engineer-weeks] | **Target completion:** [Date / Quarter]
**Team:** [Who is building this — names or roles]

| Phase | Description | Duration | Dependencies | Owner |
|---|---|---|---|---|
| 1 | [e.g. Core implementation — new component built and tested] | [X weeks] | [None] | [Name] |
| 2 | [e.g. Integration — connect new component to existing services] | [X weeks] | [Phase 1 complete] | [Name] |
| 3 | [e.g. Rollout — canary deploy, then full rollout] | [X weeks] | [Phase 2 + staging validated] | [Name] |
| 4 | [e.g. Cleanup — deprecate old system, remove feature flags] | [X weeks] | [Phase 3 stable for X weeks] | [Name] |

**Key milestones:**
- [ ] [Date]: [Milestone — e.g. "Core implementation complete and code-reviewed"]
- [ ] [Date]: [Milestone — e.g. "Staging environment validation complete"]
- [ ] [Date]: [Milestone — e.g. "10% canary traffic without regression"]
- [ ] [Date]: [Milestone — e.g. "Full rollout complete"]
- [ ] [Date]: [Milestone — e.g. "Old system decommissioned"]

---

## 7. Migration Plan

*Complete this section if the RFC requires migrating existing users, data, or API consumers.*

**Migration strategy:** [Big-bang / Phased / Parallel-run / Opt-in]

**Who is affected:**
- [e.g. All existing API v1 consumers — requires updated client libraries]
- [e.g. X million rows in the `orders` table require backfilling]

**Migration steps:**
1. [Step 1 — describe action, who does it, estimated duration]
2. [Step 2]
3. [Step 3]

**Backward compatibility window:** [How long will the old system/API remain available?]

**Communication plan:**
- [Who needs to be notified, when, and how — e.g. "API consumers will receive a deprecation notice 3 months before the old endpoint is removed"]

---

## 8. Security Implications

[Describe the security impact of this change. If there are no security implications, state that explicitly with reasoning — do not leave this section blank.]

| Concern | Impact | Mitigation |
|---|---|---|
| [e.g. New API endpoint exposed to internet] | [e.g. New attack surface] | [e.g. Rate limiting, auth required, WAF rules] |
| [e.g. New data stored — user PII] | [e.g. GDPR scope expanded] | [e.g. Encrypted at rest, access log, data retention policy] |
| [e.g. Service-to-service communication] | [e.g. Token forgery risk] | [e.g. mTLS between services] |

**Has a threat model been produced or updated?** [Yes — link / No — required before implementation / Not required — reason]

---

## 9. Performance Implications

[Describe the expected performance impact. Include projections for the new system and how it was estimated.]

| Metric | Current | Projected | Measurement method |
|---|---|---|---|
| [e.g. P99 latency — /api/auth] | [120ms] | [≤50ms] | [Load test results — link] |
| [e.g. Database query count per request] | [12] | [3] | [Query logging in staging] |
| [e.g. Memory per instance] | [512MB] | [768MB] | [Profiling — link] |
| [e.g. Infrastructure cost] | [$X/month] | [$Y/month] | [AWS cost calculator estimate] |

**Load testing:** [Has load testing been done? Link to results. If not, when will it be done?]

**Performance risks:**
- [Risk 1 — e.g. "New component adds a network hop that may increase tail latency under congestion — needs validation at 2× peak load"]

---

## 10. Observability Changes

*Describe what new or changed metrics, logs, traces, and alerts this RFC introduces.*

**New metrics:**
| Metric name | Type | Description | Alert threshold |
|---|---|---|---|
| `[service].[component].[metric]` | [counter/gauge/histogram] | [What it measures] | [e.g. P99 > 100ms for 5 min] |

**New log events:**
| Event | Level | When emitted | Key fields |
|---|---|---|---|
| `[event.name]` | INFO | [When] | `user_id`, `duration_ms`, `result` |

**Distributed tracing:** [Are spans added for new components? Which operations are instrumented?]

**Dashboard changes:** [New dashboard / updated existing dashboard — link]

---

## 11. Rollout Plan

**Rollout strategy:** [Feature flag / Canary / Blue-green / Gradual traffic shift / Full deploy]

| Stage | Traffic % | Duration | Success criteria | Rollback trigger |
|---|---|---|---|---|
| Internal testing | 0% (dogfood) | [X days] | [No errors in internal usage] | Any error |
| Canary | 1% | [X hours] | [Error rate <0.1%; P99 latency within budget] | Error rate >0.5% |
| Limited rollout | 10% | [X days] | [As above + business metrics stable] | Error rate >0.2% |
| Full rollout | 100% | — | [All success metrics from Section 2 met] | Any SLO breach |

**Feature flag:** [Name of feature flag, if applicable] — managed in [LaunchDarkly / Unleash / config]

**Rollback procedure:**
```
// How to roll back if the rollout needs to be reversed
1. [Step 1 — e.g. Toggle feature flag to off]
2. [Step 2 — e.g. Deploy previous version]
3. [Step 3 — e.g. Notify stakeholders]
```

---

## 12. Open Questions

[List any unresolved questions, design decisions not yet made, or areas where the author is specifically seeking feedback. Assign an owner and a resolution deadline for each.]

| # | Question | Owner | Deadline | Resolution |
|---|---|---|---|---|
| 1 | [e.g. Should we use optimistic or pessimistic locking for concurrent updates to [resource]?] | [Name] | [Date] | [Pending / [Answer]] |
| 2 | [e.g. What is the retention policy for [new data type]?] | [Name] | [Date] | [Pending / [Answer]] |
| 3 | [e.g. Do we need a read replica for this query pattern at launch, or can we defer it?] | [Name] | [Date] | [Pending / [Answer]] |

---

## 13. Decision

*To be filled in after the review period closes.*

**Decision:** [Approved / Rejected / Approved with modifications]
**Decision date:** [Date]
**Decision makers:** [Names]

**Summary of key feedback addressed:**
- [Feedback item and how it was resolved]

**Conditions of approval (if any):**
- [e.g. Must complete load testing before Phase 2 begins]

---

## Quality Checks

- [ ] The problem statement is specific and quantified — not "the current system is slow" but "P99 latency is 800ms; budget is 200ms"
- [ ] Goals section includes measurable success metrics, not aspirational statements
- [ ] Every alternative has an explicit rejection reason — not just a list of cons
- [ ] Security implications section is completed, not left blank
- [ ] Performance implications include projected numbers, not just "should be better"
- [ ] Open questions are assigned to named owners with deadlines — not floating
- [ ] The RFC is written to be read by someone who was not in the planning conversations
- [ ] Migration plan addresses all affected parties — users, API consumers, data — not just the technical steps

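
Some of the structural checks above can be partly automated. The following is a minimal Python sketch, not part of this skill's tooling: `REQUIRED_SECTIONS`, `split_sections`, and `check_rfc` are hypothetical names, and the heading list is copied from the Output Format above. It assumes the drafted RFC uses the same `## ` headings, and it does not skip `## ` lines that appear inside fenced code blocks. Sections marked "Complete this section if..." may be legitimately omitted, so treat a "missing section" report as a prompt to review rather than a failure.

```python
import re

# Section titles copied from the Output Format in this skill (assumption:
# the drafted RFC keeps these exact level-2 headings).
REQUIRED_SECTIONS = [
    "Abstract",
    "1. Problem Statement",
    "2. Goals and Non-Goals",
    "3. Background and Motivation",
    "4. Proposed Solution",
    "5. Alternatives Considered",
    "6. Implementation Plan",
    "7. Migration Plan",
    "8. Security Implications",
    "9. Performance Implications",
    "10. Observability Changes",
    "11. Rollout Plan",
    "12. Open Questions",
    "13. Decision",
]


def split_sections(rfc_text: str) -> dict[str, str]:
    """Map each level-2 heading title to the text beneath it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in rfc_text.splitlines():
        match = re.match(r"^## (.+?)\s*$", line)  # "### 4.1 ..." does not match
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def check_rfc(rfc_text: str) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    sections = split_sections(rfc_text)
    problems = [f"missing section: {t}" for t in REQUIRED_SECTIONS if t not in sections]
    security = sections.get("8. Security Implications")
    # Ignore the trailing "---" separator, then look for an empty or bare "N/A" body.
    if security is not None and security.strip("-\n ").upper() in {"", "N/A", "NONE"}:
        problems.append("security implications left blank or bare N/A")
    return problems
```
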
## Anti-Patterns

- [ ] Do not write the RFC as a persuasion document — its purpose is to expose trade-offs, not sell a decision
- [ ] Do not list alternatives without explicit rejection reasons — "we preferred the proposed solution" is not a reason
- [ ] Do not leave the security implications section blank or write "N/A" without a reasoned explanation
- [ ] Do not write open questions without assigning a named owner and a resolution deadline
- [ ] Do not skip the "impact of not solving this" section — without it, reviewers cannot assess urgency
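
The open-questions rule above could be checked mechanically. This is a rough Python sketch under stated assumptions: `unassigned_questions` and `PLACEHOLDERS` are hypothetical names, the table is assumed to follow the five-column layout from Section 12, and cell text is assumed not to contain a literal `|` character.

```python
# Values treated as "not yet assigned" (assumption: template placeholders
# such as [Name] and [Date] are left untouched when nobody has been assigned).
PLACEHOLDERS = {"", "[Name]", "[Date]", "TBD"}


def unassigned_questions(table_markdown: str) -> list[str]:
    """Return the '#' cell of each question row lacking an owner or a deadline."""
    flagged = []
    for line in table_markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip short rows, the header row, and the |---|---| separator row.
        if len(cells) < 4 or cells[0] == "#" or set(cells[0]) <= {"-", ":"}:
            continue
        number, _question, owner, deadline = cells[:4]
        if owner in PLACEHOLDERS or deadline in PLACEHOLDERS:
            flagged.append(number)
    return flagged
```
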