Compare commits

..

8 Commits

Author SHA1 Message Date
mohitagw15856 83bfff4f2f docs: surface v20.1.0 in README changelog and latest-release line (#53)
The structured CHANGELOG.md already had 20.1.0; this updates the README's
embedded 'What's New' changelog and the top 'Latest release' line to match.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 20:35:03 +01:00
mohitagw15856 0c33330211 release: v20.1.0 — star nudges & eval hardening (#52)
Bump to 20.1.0. Folds the prior Unreleased items (CI leaderboard, PR-based
results flow, faster/hang-proof evals) plus the new star CTAs into a
[20.1.0] changelog section. Updates the README version badge.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 20:29:01 +01:00
mohitagw15856 82beaed5c6 feat: add star CTA to CLI list output and MCP server banner (#51)
More touchpoints to convert users into stargazers: the `list` command
footer and the MCP server's stderr startup banner (stderr is safe — it
never corrupts the JSON-RPC stream on stdout).


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 20:25:13 +01:00
mohitagw15856 511bad19b0 feat: nudge npm users to star the repo (CLI + README + funding) (#49)
- CLI prints a star CTA after a successful install and in --help
- README adds a prominent star line below the badges (npm renders this)
- package.json gains a funding field so npm shows a Fund/Star link


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 19:03:41 +01:00
mohitagw15856 63cef03324 Merge pull request #46 from mohitagw15856/eval-results
chore(evals): refresh leaderboard results
2026-06-18 13:41:58 +01:00
mohitagw15856 c28825dd38 chore(evals): refresh leaderboard results 2026-06-18 12:40:15 +00:00
mohitagw15856 4209963cff Leaderboard workflow: open a PR instead of pushing to protected main (#45)
The eval run worked (12 scored runs) but the final step failed: it pushed
evals/results.json directly to main, which the branch ruleset blocks
("Changes must be made through a pull request").

- eval-leaderboard.yml: replace the direct commit/push with
  peter-evans/create-pull-request@v7 (branch eval-results), add
  pull-requests: write. Merging that PR triggers the Pages deploy (which
  watches evals/results.json) to publish real numbers.
- evals/README documents the PR flow + the required "Allow GitHub Actions to
  create and approve pull requests" setting.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 13:33:15 +01:00
mohitagw15856 827d7f62ec Make evals fast and hang-proof (timeout, retry, concurrency) (#44)
The "Run evals" step ran 24 API calls sequentially with no request timeout, so
it was slow and could stall indefinitely if one call hung.

- bin/lib/anthropic.mjs: per-request timeout (120s) via AbortController + retry
  (2x, backoff) on 429/5xx/timeout. Fails fast on 4xx (bad key/model).
- evals/run-evals.mjs: run (case × model) tasks through a concurrency pool
  (default 4, --concurrency to tune); preserves result order.
- eval-leaderboard.yml: job timeout-minutes: 20 as a safety net.

Applies to the next run. The hardening also benefits the Action runner and
`generate`, which share the client.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 13:30:06 +01:00
10 changed files with 286 additions and 49 deletions
+15 -12
View File
@@ -21,6 +21,7 @@ on:
permissions:
contents: write
pull-requests: write
concurrency:
group: eval-leaderboard
@@ -29,6 +30,7 @@ concurrency:
jobs:
evaluate:
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout
uses: actions/checkout@v4
@@ -53,15 +55,16 @@ jobs:
- name: Build the leaderboard page (sanity check)
run: node scripts/build-leaderboard.mjs
- name: Commit results
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add evals/results.json
if git diff --cached --quiet; then
echo "No change in results."
else
git commit -m "chore(evals): refresh leaderboard results"
git push
echo "Committed evals/results.json — the Pages deploy will render real numbers."
fi
- name: Open a PR with the refreshed results
uses: peter-evans/create-pull-request@v7
with:
add-paths: evals/results.json
branch: eval-results
delete-branch: true
commit-message: "chore(evals): refresh leaderboard results"
title: "chore(evals): refresh leaderboard results"
body: |
Auto-generated by the **Update Skill Leaderboard** workflow.
Merging this publishes the **real** numbers on the live leaderboard — the
Pages deploy is triggered by changes to `evals/results.json`.
+15
View File
@@ -9,13 +9,28 @@ each new wave of skills bumps the **major** version, extensions and fixes bump
## [Unreleased]
## [20.1.0] — Star Nudges & Eval Hardening — 2026-06-18
### Added
- **Star the repo, from anywhere you use it.** Tasteful, non-spammy calls-to-action that turn
npm/CLI users into stargazers — no `postinstall` hook: a prompt after a successful
`npx pm-claude-skills add`, in `--help`, in `list`, in the MCP server's startup banner, a
CTA below the README badges (npm renders it on the package page), and a `funding` field in
`package.json` so npm shows a Fund/Sponsor link.
- **One-click leaderboard updates in CI** — `.github/workflows/eval-leaderboard.yml`
("Update Skill Leaderboard") runs the evals with the `ANTHROPIC_API_KEY` secret, commits
`evals/results.json`, and the Pages deploy re-renders the public leaderboard with real
numbers — no local key needed. The deploy workflow now also triggers on
`evals/results.json`.
### Changed
- **Leaderboard workflow opens a PR** instead of pushing to `main` (which the branch
ruleset blocks). After it runs, merge the auto-created results PR to publish real numbers.
- **Faster, hang-proof evals.** The Anthropic client now has a per-request timeout (120s)
and limited retries (429/5xx/timeout); the eval harness runs cases concurrently
(default 4). The leaderboard workflow has a 20-minute job timeout. A 24-call run that
was sequential now finishes in a few minutes and can't stall a job indefinitely.
## [20.0.0] — Agentic Tooling — 2026-06-18
### Added
+14 -3
View File
@@ -12,17 +12,19 @@
[![Platforms](https://img.shields.io/badge/works%20with-Claude%20%7C%20ChatGPT%20%7C%20Gemini%20%7C%20Cursor%20%7C%20Codex%20%7C%20Hermes-8A2BE2)](#-works-with--cross-tool-compatibility)
[![SkillCheck](https://img.shields.io/github/actions/workflow/status/mohitagw15856/pm-claude-skills/skillcheck.yml?branch=main&label=SkillCheck)](.github/workflows/skillcheck.yml)
[![Security Audit](https://img.shields.io/github/actions/workflow/status/mohitagw15856/pm-claude-skills/skill-audit.yml?branch=main&label=security%20audit)](.github/workflows/skill-audit.yml)
[![Version](https://img.shields.io/badge/version-20.0.0-brightgreen)](https://github.com/mohitagw15856/pm-claude-skills/releases)
[![Version](https://img.shields.io/badge/version-20.1.0-brightgreen)](https://github.com/mohitagw15856/pm-claude-skills/releases)
[![Install](https://img.shields.io/badge/Install%20in%20Claude%20Code-2%20minutes-orange)](https://github.com/mohitagw15856/pm-claude-skills#-quick-install-2-minutes)
[![License](https://img.shields.io/badge/license-MIT-lightgrey)](LICENSE)
[![Sponsor](https://img.shields.io/badge/sponsor-❤️-ff69b4)](https://github.com/sponsors/mohitagw15856)
### ⭐ If this saves you time, [star the repo](https://github.com/mohitagw15856/pm-claude-skills) — it's the #1 way to help others find it.
> **PM stands for Professional, not just Product Management.**
> 167 professional skills + 4 agent templates across 26 bundles covering 18 professions. Built for Claude Code — and now portable to ChatGPT, Gemini, and Hermes Agent. Built by a PM, used by everyone.
A community-built library of professional skills for every field — product management, engineering, customer success, marketing, social media, writers, design, legal, finance, HR, sales, operations, research, and more. Each skill is a structured `SKILL.md` file that teaches an AI assistant how to produce professional-grade outputs for your workflows. Skills run natively in **Claude Code** and **Hermes Agent** (same open `SKILL.md` standard), and ship as ready-to-paste exports for **ChatGPT** and **Gemini** — see [Works With](#-works-with--cross-tool-compatibility).
**🆕 Latest release (v20.0.0 — Agentic Tooling):** run any skill in CI with the new **[GitHub Action](action/)**, turn your docs into a skill with **`npx pm-claude-skills generate`**, and compare skills across models on the **[Skill Leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html)** (LLM-judge evals). See the [changelog](#-changelog).
**🆕 Latest release (v20.1.0 — Star Nudges & Eval Hardening):** run any skill in CI with the **[GitHub Action](action/)**, turn your docs into a skill with **`npx pm-claude-skills generate`**, compare skills across models on the **[Skill Leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html)** (now one-click in CI), and — if it saves you time — ⭐ the repo. See the [changelog](#-changelog).
<!-- DEMO: replace web/docs-assets/playground.png below with web/docs-assets/playground-demo.gif
once recorded (see web/docs-assets/README.md for how). The link goes to the live app. -->
@@ -403,7 +405,14 @@ More templates will follow. If you want to contribute one, see the [template con
The highlights are below. For the structured, [Keep a Changelog](https://keepachangelog.com/)-format history, see **[CHANGELOG.md](CHANGELOG.md)**.
### 🆕 What's New in v20.0.0 — Agentic Tooling
### 🆕 What's New in v20.1.0 — Star Nudges & Eval Hardening
- **Star the repo, from anywhere you use it** — tasteful, non-spammy CTAs (no `postinstall`): after a successful `npx pm-claude-skills add`, in `--help`, in `list`, in the MCP server banner, below the README badges, and a `funding` link on npm.
- **One-click leaderboard in CI** — the "Update Skill Leaderboard" workflow runs the evals with your `ANTHROPIC_API_KEY` secret and opens a results PR; merge it to publish real numbers.
- **Faster, hang-proof evals** — per-request timeout + retries in the API client and concurrent eval runs, so a CI run finishes in minutes and can't stall.
<details>
<summary><strong>v20.0.0 — Agentic Tooling</strong> (click to expand)</summary>
The library starts *doing* the work, not just describing it:
@@ -411,6 +420,8 @@ The library starts *doing* the work, not just describing it:
- **`generate` command** — `npx pm-claude-skills generate --from <url|file>` turns your docs into a standard-compliant `SKILL.md`.
- **Skill evals + Leaderboard** — LLM-as-judge scoring of skills across models, rendered as a public [leaderboard](https://mohitagw15856.github.io/pm-claude-skills/leaderboard.html).
</details>
<details>
<summary><strong>v19.0.0 — Security Auditor, Personas & Catalog</strong> (click to expand)</summary>
+5
View File
@@ -19,6 +19,7 @@ import { homedir } from 'node:os';
import { createRequire } from 'node:module';
const PKG_ROOT = dirname(dirname(fileURLToPath(import.meta.url)));
const STAR = '⭐ Find this useful? Star the repo: https://github.com/mohitagw15856/pm-claude-skills';
const VERSION = (() => {
try { return createRequire(import.meta.url)('../package.json').version; } catch { return '0.0.0'; }
})();
@@ -128,6 +129,7 @@ function add(opts) {
aider: `Load any of them with: aider --read ${join(target, '<skill>.md')}`,
}[agent] || `Restart ${agent} — it auto-discovers SKILL.md skills in ${target} by their description.`;
console.log(note);
console.log(`\n${STAR}`);
}
}
@@ -139,6 +141,7 @@ function list() {
console.log('\nNative SKILL.md agents: claude, hermes, codex, openclaw (install skill folders).');
console.log('Claude also gets subagents + slash commands. Cursor/Windsurf install rule files;');
console.log('Aider installs conventions you load with "aider --read".');
console.log(`\n${STAR}`);
}
const HELP = `pm-claude-skills — install professional Agent Skills into any AI coding tool.
@@ -155,6 +158,8 @@ Examples:
npx pm-claude-skills add --agent codex --link
npx pm-claude-skills generate --from <url|file> # turn your docs into a SKILL.md (needs ANTHROPIC_API_KEY)
${STAR}
`;
const opts = parse(process.argv.slice(2));
+41 -15
View File
@@ -6,31 +6,57 @@ const API_URL = 'https://api.anthropic.com/v1/messages';
/**
* Call the Anthropic Messages API and return the concatenated text output.
* Adds a per-request timeout and limited retries so a slow/transient failure
* can't hang a CI job forever.
* @param {object} o
* @param {string} o.apiKey - Anthropic API key.
* @param {string} [o.model] - Model id (default claude-sonnet-4-6).
* @param {string} [o.system]- System prompt.
* @param {Array} o.messages- [{role, content}] messages.
* @param {number} [o.maxTokens]
* @param {number} [o.timeoutMs] - Per-request timeout (default 120s).
* @param {number} [o.retries] - Retries on timeout / 429 / 5xx (default 2).
* @returns {Promise<string>}
*/
export async function complete({ apiKey, model = 'claude-sonnet-4-6', system, messages, maxTokens = 4096 }) {
export async function complete({ apiKey, model = 'claude-sonnet-4-6', system, messages, maxTokens = 4096, timeoutMs = 120000, retries = 2 }) {
if (!apiKey) throw new Error('Missing Anthropic API key (set ANTHROPIC_API_KEY).');
const res = await fetch(API_URL, {
method: 'POST',
headers: {
'content-type': 'application/json',
'x-api-key': apiKey,
'anthropic-version': '2023-06-01',
},
body: JSON.stringify({ model, max_tokens: maxTokens, ...(system ? { system } : {}), messages }),
});
if (!res.ok) {
const body = await res.text().catch(() => '');
throw new Error(`Anthropic API ${res.status}: ${body.slice(0, 500)}`);
let lastErr;
for (let attempt = 0; attempt <= retries; attempt++) {
const ctrl = new AbortController();
const timer = setTimeout(() => ctrl.abort(), timeoutMs);
try {
const res = await fetch(API_URL, {
method: 'POST',
headers: {
'content-type': 'application/json',
'x-api-key': apiKey,
'anthropic-version': '2023-06-01',
},
body: JSON.stringify({ model, max_tokens: maxTokens, ...(system ? { system } : {}), messages }),
signal: ctrl.signal,
});
if (res.ok) {
const data = await res.json();
return (data.content || []).map((c) => c.text || '').join('').trim();
}
const body = await res.text().catch(() => '');
// Retry transient server / rate-limit errors; fail fast on 4xx (bad key/model).
if ((res.status === 429 || res.status >= 500) && attempt < retries) {
lastErr = new Error(`Anthropic API ${res.status}`);
} else {
throw new Error(`Anthropic API ${res.status}: ${body.slice(0, 500)}`);
}
} catch (e) {
if (e.name === 'AbortError') e = new Error(`Anthropic API request timed out after ${timeoutMs}ms`);
const retryable = /timed out/.test(e.message) || e.name === 'TypeError' || /Anthropic API (429|5\d\d)/.test(e.message);
if (!retryable || attempt >= retries) throw e;
lastErr = e;
} finally {
clearTimeout(timer);
}
await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt)); // backoff: 1s, 2s, 4s
}
const data = await res.json();
return (data.content || []).map((c) => c.text || '').join('').trim();
throw lastErr || new Error('Anthropic API request failed.');
}
/** Parse "name: value" YAML-ish frontmatter + body from a SKILL.md string. */
+7 -3
View File
@@ -30,9 +30,13 @@ back to `results.example.json` (clearly labelled) so the page renders before you
### No local key? Run it in CI
Add an `ANTHROPIC_API_KEY` repo secret, then go to **Actions → "Update Skill Leaderboard"
→ Run workflow**. It runs the evals, commits `evals/results.json`, and the Pages deploy
re-renders the public leaderboard with real numbers — no laptop required.
1. Add an `ANTHROPIC_API_KEY` repo secret.
2. Enable **Settings → Actions → General → Workflow permissions → "Allow GitHub Actions to
create and approve pull requests"** (so the workflow can open its results PR — `main`
requires PRs).
3. **Actions → "Update Skill Leaderboard" → Run workflow.** It runs the evals and opens a
PR with `evals/results.json`. **Merge that PR** and the Pages deploy re-renders the
public leaderboard with real numbers — no laptop required.
## Add a case
+148
View File
@@ -0,0 +1,148 @@
{
"generatedAt": "2026-06-18T12:40:14.995Z",
"judge": "claude-opus-4-8",
"models": [
"claude-sonnet-4-6",
"claude-haiku-4-5-20251001"
],
"dimensions": [
"structure",
"completeness",
"usefulness",
"grounding"
],
"results": [
{
"skill": "rice-prioritisation",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 5
},
"overall": 5
},
{
"skill": "rice-prioritisation",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 4
},
"overall": 4.75
},
{
"skill": "prd-template",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 4
},
"overall": 4.75
},
{
"skill": "prd-template",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 4,
"usefulness": 5,
"grounding": 3
},
"overall": 4.25
},
{
"skill": "cs-health-scorecard",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 5
},
"overall": 5
},
{
"skill": "cs-health-scorecard",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 4
},
"overall": 4.75
},
{
"skill": "executive-summary",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 5
},
"overall": 5
},
{
"skill": "executive-summary",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 4
},
"overall": 4.75
},
{
"skill": "competitive-analysis",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 4,
"usefulness": 5,
"grounding": 5
},
"overall": 4.75
},
{
"skill": "competitive-analysis",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 4,
"usefulness": 5,
"grounding": 4
},
"overall": 4.5
},
{
"skill": "sprint-planning",
"model": "claude-sonnet-4-6",
"scores": {
"structure": 5,
"completeness": 5,
"usefulness": 5,
"grounding": 4
},
"overall": 4.75
},
{
"skill": "sprint-planning",
"model": "claude-haiku-4-5-20251001",
"scores": {
"structure": 5,
"completeness": 4,
"usefulness": 4,
"grounding": 3
},
"overall": 4
}
]
}
+35 -15
View File
@@ -61,33 +61,53 @@ function parseScores(text) {
return s;
}
// Run an async worker over `items` with at most `limit` in flight.
async function pool(items, limit, worker) {
const out = [];
let i = 0;
await Promise.all(Array.from({ length: Math.min(limit, items.length) }, async () => {
while (i < items.length) {
const idx = i++;
out[idx] = await worker(items[idx]);
}
}));
return out;
}
async function scoreTask({ c, body, description, model }) {
try {
const output = await complete({ apiKey, model, system: runPrompt(body), messages: [{ role: 'user', content: c.input }], maxTokens: 3000 });
const judged = await complete({ apiKey, model: judge, messages: [{ role: 'user', content: judgePrompt(description, output) }], maxTokens: 200 });
const scores = parseScores(judged);
const overall = DIMENSIONS.reduce((a, d) => a + scores[d], 0) / DIMENSIONS.length;
process.stderr.write(`${c.skill} on ${model}${overall.toFixed(2)}/5\n`);
return { skill: c.skill, model, scores, overall: Math.round(overall * 100) / 100 };
} catch (e) {
process.stderr.write(`${c.skill} on ${model} — FAILED (${e.message})\n`);
return null;
}
}
async function main() {
if (!apiKey) { console.error('Set ANTHROPIC_API_KEY to run evals.'); process.exit(1); }
const concurrency = parseInt(arg('concurrency', '4'), 10) || 4;
const { cases } = JSON.parse(readFileSync(casesPath, 'utf8'));
const results = [];
// Build the full (case × model) task list.
const tasks = [];
for (const c of cases) {
const skillFile = join(root, 'skills', c.skill, 'SKILL.md');
if (!existsSync(skillFile)) { console.error(`skip ${c.skill}: no SKILL.md`); continue; }
const { meta, body } = parseSkill(readFileSync(skillFile, 'utf8'));
for (const model of models) {
process.stderr.write(`Running ${c.skill} on ${model}`);
try {
const output = await complete({ apiKey, model, system: runPrompt(body), messages: [{ role: 'user', content: c.input }], maxTokens: 3000 });
const judged = await complete({ apiKey, model: judge, messages: [{ role: 'user', content: judgePrompt(meta.description || c.skill, output) }], maxTokens: 200 });
const scores = parseScores(judged);
const overall = DIMENSIONS.reduce((a, d) => a + scores[d], 0) / DIMENSIONS.length;
results.push({ skill: c.skill, model, scores, overall: Math.round(overall * 100) / 100 });
process.stderr.write(`${overall.toFixed(2)}/5\n`);
} catch (e) {
process.stderr.write(`FAILED (${e.message})\n`);
}
}
for (const model of models) tasks.push({ c, body, description: meta.description || c.skill, model });
}
process.stderr.write(`Scoring ${tasks.length} runs (concurrency ${concurrency})…\n`);
const results = (await pool(tasks, concurrency, scoreTask)).filter(Boolean);
const out = { generatedAt: new Date().toISOString(), judge, models, dimensions: DIMENSIONS, results };
writeFileSync(outPath, JSON.stringify(out, null, 2));
console.log(`\nWrote ${outPath}${results.length} scored runs. Build the page: node scripts/build-leaderboard.mjs`);
console.log(`\nWrote ${outPath}${results.length}/${tasks.length} scored runs. Build the page: node scripts/build-leaderboard.mjs`);
}
main();
+1
View File
@@ -166,6 +166,7 @@ function handle(msg) {
}
process.stderr.write(`[${SERVER_NAME}] MCP server ready — ${SKILLS.length} skills, ${TOOLS.length} tools.\n`);
process.stderr.write(`[${SERVER_NAME}] ⭐ Star the repo: https://github.com/mohitagw15856/pm-claude-skills\n`);
const rl = createInterface({ input: process.stdin });
rl.on('line', (line) => {
const s = line.trim();
+5 -1
View File
@@ -1,6 +1,6 @@
{
"name": "pm-claude-skills",
"version": "20.0.0",
"version": "20.1.0",
"type": "module",
"description": "167 professional Agent Skills (SKILL.md) + subagents + slash commands for Claude, ChatGPT, Gemini, Cursor, Codex & Hermes. Install into any AI coding tool with: npx pm-claude-skills add --agent <tool>.",
"keywords": [
@@ -29,6 +29,10 @@
"bugs": {
"url": "https://github.com/mohitagw15856/pm-claude-skills/issues"
},
"funding": {
"type": "github",
"url": "https://github.com/mohitagw15856/pm-claude-skills"
},
"author": "Mohit Aggarwal",
"bin": {
"pm-claude-skills": "bin/cli.mjs",