CI workflow to run evals and update the leaderboard (#43)

Lets the leaderboard show real numbers without a local key: the new "Update Skill Leaderboard" workflow (workflow_dispatch) runs the eval harness with the ANTHROPIC_API_KEY secret, commits evals/results.json, and the Pages deploy re-renders the public leaderboard with real data.

- .github/workflows/eval-leaderboard.yml: manual trigger, contents: write, runs run-evals.mjs + build-leaderboard.mjs, commits results.json.
- deploy-playground.yml: also trigger on evals/results.json (and the build scripts) so the committed results refresh the live page.
- evals/README + CHANGELOG document the CI route.

Claude-Session: https://claude.ai/code/session_016JWn5jRD5tcEFKrubjQ6Px
Co-authored-by: Claude <noreply@anthropic.com>
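
For orientation, the workflow described in the commit message could look roughly like the sketch below. This is not the committed eval-leaderboard.yml: the job name, the Node version, the action versions, the location of run-evals.mjs, and the commit message text are all assumptions made for illustration. Only the trigger, the contents: write permission, the ANTHROPIC_API_KEY secret, the two script names, and the evals/results.json output come from the message itself.

```yaml
# Hypothetical sketch of .github/workflows/eval-leaderboard.yml (not the committed file).
name: Update Skill Leaderboard

on:
  workflow_dispatch:  # manual trigger only, as stated in the commit message

permissions:
  contents: write  # needed so the job can push the refreshed results file

jobs:
  update-leaderboard:  # job name is an assumption
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20  # assumed Node version
      - name: Run evals
        run: node scripts/run-evals.mjs  # path is an assumption
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
      - name: Build leaderboard
        run: node scripts/build-leaderboard.mjs
      - name: Commit results
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add evals/results.json
          # Commit only when the results actually changed.
          git diff --cached --quiet || git commit -m "Update eval results"
          git push
```

One design point to keep in mind with a layout like this: on GitHub Actions, pushes made with the default GITHUB_TOKEN do not start push-triggered workflows, so the deploy may need a manual dispatch or a different token. How the actual repository handles this is not shown here.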
@@ -10,6 +10,10 @@ on:
    paths:
      - 'skills/**'
      - 'web/**'
      - 'evals/results.json'
      - 'skill-tiers.json'
      - 'scripts/build-docs.mjs'
      - 'scripts/build-leaderboard.mjs'
      - '.github/workflows/deploy-playground.yml'
  workflow_dispatch: