eval: new baseline on the 4-endpoint embed pool index #9

Merged
justin merged 1 commit from eval-baseline-20260610 into main 2026-06-10 20:38:25 -04:00
Owner

Adds eval/results/baseline-20260610.md: 22 queries run against the prod image index, which was rebuilt today on the 4-endpoint GPU pool with the resilient embedder (PR #8). Compared with the 2026-05-22 baseline: dense MRR 0.539→0.924, bm25+rerank 0.920→0.959, hybrid_rrf+rerank 0.875→0.960. The reranker legs ran against the prod jina-rerank. The file carries a provenance header.
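
For readers unfamiliar with the metrics quoted here, the following is a minimal sketch of how mean reciprocal rank (MRR) and reciprocal rank fusion (RRF) are commonly computed. It is not taken from this repo's eval harness: the function names, the toy document IDs, and the `k=60` constant (the conventional RRF default) are all assumptions for illustration, and "hybrid_rrf" is only presumed to mean fusing the dense and bm25 rankings this way.

```python
from collections import defaultdict


def mean_reciprocal_rank(ranked_ids_per_query, relevant_ids_per_query):
    """Average over queries of 1/rank of the first relevant hit (0 if none)."""
    if not ranked_ids_per_query:
        raise ValueError("need at least one query")
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_ids_per_query)


def rrf_fuse(rankings, k=60):
    """Fuse several rankings: each doc scores sum(1 / (k + rank)) across lists.

    k=60 is the conventional default, assumed here rather than read from the repo.
    Ties are broken by doc ID so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy data: two queries, first relevant hits at ranks 1 and 2.
mrr = mean_reciprocal_rank([["a", "b"], ["c", "d"]], [{"a"}, {"d"}])

# Hypothetical dense and bm25 rankings for a single query.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "a"]])
```

With 22 queries, a single query moving its first relevant hit from rank 2 to rank 1 shifts MRR by about 0.023, so small differences between configurations should be read with that granularity in mind.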

🤖 Generated with Claude Code

justin added 1 commit 2026-06-10 20:38:24 -04:00
22 queries against the prod image index rebuilt today on the expanded
GPU pool with the resilient embedder (PR #8): dense MRR 0.539→0.924,
bm25+rerank 0.920→0.959, hybrid_rrf+rerank 0.875→0.960 vs the
2026-05-22 baseline. No regression from mixed-provenance embeddings.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
justin merged commit 992d66e3d1 into main 2026-06-10 20:38:25 -04:00

Reference: justin/hvm-docs#9