Commit Graph

10 Commits

Author SHA1 Message Date
mohitagw15856 827d7f62ec Make evals fast and hang-proof (timeout, retry, concurrency) (#44)
The "Run evals" step ran 24 API calls sequentially with no request timeout, so
it was slow and could stall indefinitely if one call hung.

- bin/lib/anthropic.mjs: per-request timeout (120s) via AbortController + retry
  (2x, backoff) on 429/5xx/timeout. Fails fast on 4xx (bad key/model).
- evals/run-evals.mjs: run (case × model) tasks through a concurrency pool
  (default 4, --concurrency to tune); preserves result order.
- eval-leaderboard.yml: job timeout-minutes: 20 as a safety net.

Applies to the next run. The hardening also benefits the Action runner and
`generate`, which share the client.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 13:30:06 +01:00
mohitagw15856 edb663ad72 CI workflow to run evals and update the leaderboard (#43)
Lets the leaderboard show real numbers without a local key: the new
"Update Skill Leaderboard" workflow (workflow_dispatch) runs the eval harness
with the ANTHROPIC_API_KEY secret, commits evals/results.json, and the Pages
deploy re-renders the public leaderboard with real data.

- .github/workflows/eval-leaderboard.yml: manual trigger, contents: write,
  runs run-evals.mjs + build-leaderboard.mjs, commits results.json.
- deploy-playground.yml: also trigger on evals/results.json (and the build
  scripts) so the committed results refresh the live page.
- evals/README + CHANGELOG document the CI route.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 12:58:45 +01:00
mohitagw15856 3ccfd6b5c7 Dogfood the Action + bump to v20.0.0 (Agentic Tooling) (#42)
- .github/workflows/pr-description.yml: uses our own Action (uses: ./action)
  to auto-write this repo's PR descriptions when a PR opens empty; skips
  quietly without ANTHROPIC_API_KEY and on forks. A living demo.
- Version -> 20.0.0 (Agentic Tooling): bundles the GitHub Action, generate
  command, and evals/leaderboard for npm. README badge + What's New (v19
  collapsed), CHANGELOG [Unreleased] -> [20.0.0], SECURITY table.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 12:52:37 +01:00
mohitagw15856 51bf4be52f AI-powered tooling: GitHub Action, generate command, evals + leaderboard (#41)
Three features riding 2026 trends (agentic CI, codegen, evals), sharing one
dependency-free Anthropic client (bin/lib/anthropic.mjs).

1. GitHub Action (action/) — run any skill in a consumer repo's CI:
   uses: mohitagw15856/pm-claude-skills/action@main. Composite action +
   run.mjs (loads the bundled SKILL.md, calls the API, exposes result as a
   step output / file). Docs with auto-PR-description example.

2. generate command — `npx pm-claude-skills generate --from <url|file>` turns
   a team's docs into a SKILL.md following the authoring standard
   (bin/generate.mjs, wired into the CLI; needs ANTHROPIC_API_KEY).

3. Skill evals + Leaderboard — evals/run-evals.mjs runs each case across models
   and scores output with an LLM judge (structure/completeness/usefulness/
   grounding); scripts/build-leaderboard.mjs renders web/leaderboard.html
   (built in the Pages deploy, falls back to clearly-labelled example data).
   Linked from README, catalog, and playground.

Offline-testable parts verified (prompt building, skill loading, graceful
errors, leaderboard render). SkillCheck/audit/exports all green.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 08:37:40 +01:00
mohitagw15856 e9bc1d0626 Security auditor, personas, orchestration, docs catalog & roadmap (#35)
Closes the remaining gaps vs alirezarezvani/claude-skills across trust, content
types, discoverability, and community.

Security (trust signal + useful):
- scripts/skill-audit.mjs scans skills/*/SKILL.md + each skill's scripts/ for
  prompt injection, exfiltration, dynamic code exec, destructive shell, secrets,
  and hidden text. HIGH fails CI (.github/workflows/skill-audit.yml) + a badge.
- New skill-security-auditor skill teaches the same review (production tier).

Content types:
- output-styles/ — 4 personas (Startup CTO, Growth Marketer, Solo Founder,
  Product Leader) as Claude Code output styles; --agent claude installs them too.
- ORCHESTRATION.md — Skill Chain / Multi-Agent Handoff / Domain Deep-Dive /
  Solo Sprint patterns.

Discoverability:
- scripts/build-docs.mjs generates a server-rendered, SEO-indexable
  web/catalog.html of all skills (built in the Pages deploy; gitignored).
  Linked from README + playground.

Community:
- ROADMAP.md (now/next/later + good-first-issues).

README badges/sections, TIERS (47 production), CHANGELOG, package.json files,
and exports/web index all updated. SkillCheck + security audit + exports verified.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-18 08:09:14 +01:00
mohitagw15856 1d18e50c68 Fix npm-publish tag check to accept capital-V tags (#31)
The release tag check stripped only a lowercase 'v', so a tag like V17.0.0
failed against package.json 17.0.0. Strip a leading v OR V.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-17 22:50:34 +01:00
mohitagw15856 abdf20acf3 Automated npm publish via GitHub Actions (#30)
Lets the package ship to npm without a local npm install: publish a GitHub
Release and CI runs `npm publish` using an NPM_TOKEN repo secret.

- .github/workflows/npm-publish.yml: triggers on release published (and manual
  dispatch), verifies the release tag matches package.json version, then
  publishes with provenance (id-token: write) to the public registry.

One-time setup by the maintainer: create an npm Automation token and add it as
the NPM_TOKEN repository secret. Documented in the workflow header.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-17 15:22:04 +01:00
mohitagw15856 05b6d799f0 SkillCheck validator, Cursor exports, and per-agent installers (#27)
Three more learnings from alirezarezvani/claude-skills, applied:

1. SkillCheck validator (scripts/skillcheck.mjs) — validates every SKILL.md
   against the authoring standard (frontmatter, name/folder match, trigger +
   produces clauses, required headings) plus tier referential integrity.
   Errors fail CI; --strict fails on warnings too. New skillcheck.yml workflow
   and a SkillCheck status badge in the README. Current: 0 errors / 14 advisory
   warnings across 172 skills.

2. Cursor export platform — build-exports.mjs now generates
   exports/cursor/<bundle>/<skill>/<skill>.mdc rule files. The PLATFORMS
   registry now supports per-skill filenames (file as a function).

3. Per-agent installers — scripts/install.sh unifies install for
   claude/hermes/codex/openclaw/cursor (--link, --target, --dry-run, --list).
   Curl-able one-liners codex-install.sh, openclaw-install.sh, and
   cursor-install.sh clone the library and install in a single command.

README documents the one-line installs and Cursor exports; CHANGELOG and the
authoring standard updated.


Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-17 13:38:31 +01:00
Claude 572b8acf8c Add multi-platform export generator (single source of truth)
Make the library multi-platform without duplicating content. Each
skills/<name>/SKILL.md body remains the single source of truth; a new
generator renders platform-ready exports from it.

- scripts/build-exports.mjs — dependency-free Node generator with a PLATFORMS
  registry so new platforms (Gemini, Cursor, …) are a few lines. Ships ChatGPT
  exports at exports/chatgpt/<bundle>/<skill>/SYSTEM_PROMPT.md (172 skills),
  plus generated index READMEs. Supports --platform and --check.
- exports/ — generated ChatGPT system prompts, ready to paste into a Custom GPT.
- .github/workflows/check-generated.yml — fails a PR if exports or
  web/skills.json drift from the source skills.
- README "Works With" now documents the ready-to-use exports and regen command.
- CHANGELOG + SKILL-AUTHORING-STANDARD note the generated artifacts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px
2026-06-17 08:01:20 +00:00
Mohit f956b4c329 ci: auto-deploy Skill Playground to GitHub Pages
On push to main, rebuild web/skills.json from the SKILL.md files and publish
web/ to GitHub Pages, so the live site always reflects the current skill
library. Manual runs supported via workflow_dispatch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:01:09 +01:00