seed-mcp/eval/results/baseline.md

# seed-mcp retrieval eval — k=5

_21 golden queries × 4 retrievers_

## Summary

| Retriever | Passed | Recall | P@1 | MRR | Avg ms |
|---|---|---|---|---|---|
| **hybrid+rerank** | 21/21 | 100.00% | 90.48% | 0.905 | 2064 |
| **bm25** | 20/21 | 95.24% | 80.95% | 0.833 | 5 |
| **hybrid** | 15/21 | 71.43% | 61.90% | 0.619 | 73 |
| **dense** | 14/21 | 66.67% | 38.10% | 0.440 | 79 |

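**Recall** = % of queries where at least one of the top-k (k=5) chunks satisfied the spec. **P@1** = % of queries where the very first result satisfied it. **MRR** = mean over all queries of `1 / rank-of-first-satisfying-result` (0 if missed).

In the per-query table, `#n` is the rank of the first satisfying chunk. The last two rows show ✅ with no rank; the summary figures are consistent with those rows counting toward Passed and Recall while adding 0 to P@1 and MRR.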
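As a rough illustration of how these metrics fall out of per-query ranks, the sketch below recomputes the **dense** row from the per-query table. The `(passed, rank)` representation and the `summarize` helper are assumptions made for this example, not the eval harness's actual code. Rows shown as ✅ without a rank are modeled as passed with `rank=None`, which is an interpretation that happens to reproduce the summary figures.

```python
from typing import Optional, Sequence, Tuple

# (passed, rank): rank is the 1-based position of the first satisfying
# chunk, or None when the table shows no rank for that query.
# This shape is hypothetical; the real harness may store results differently.
Result = Tuple[bool, Optional[int]]


def summarize(results: Sequence[Result]) -> dict:
    """Compute Passed, Recall, P@1 and MRR over a list of per-query results."""
    n = len(results)
    passed = sum(1 for ok, _ in results if ok)
    first = sum(1 for _, rank in results if rank == 1)
    # Queries with no rank contribute 0 to the reciprocal-rank sum.
    rr_sum = sum(1.0 / rank for _, rank in results if rank)
    return {
        "passed": passed,
        "recall": passed / n,
        "p_at_1": first / n,
        "mrr": rr_sum / n,
    }


# The dense column, transcribed from the per-query table in row order.
MISS: Result = (False, None)
dense = (
    [MISS] * 5                                              # five ID-style queries, all missed
    + [(True, r) for r in (1, 1, 2, 1, 3, 5, 1, 1, 1, 5)]   # ten ranked hits
    + [MISS, MISS]                                          # DKC65-95, NK1701
    + [(True, 1), (True, 1)]                                # silage corn, soybean 2025
    + [(True, None), (True, None)]                          # last two rows: ✅ with no rank
)

s = summarize(dense)
print(f"{s['passed']}/{len(dense)} | {s['recall']:.2%} | {s['p_at_1']:.2%} | {s['mrr']:.3f}")
# prints: 14/21 | 66.67% | 38.10% | 0.440
```

The printed values match the dense row of the summary table, which suggests the unranked ✅ rows are counted as passes but not as ranked hits.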

## Per-query results

| Query | bm25 | dense | hybrid | hybrid+rerank |
|---|---|---|---|---|
| `DKC62-08RIB ratings` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `AG29XF4 disease ratings` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `WB6430 westbred wheat` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `E085Z5 corn` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `AP Iliad wheat performance` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `drought tolerant corn for sandy soil short season Iowa` | ✅ #2 | ✅ #1 | ✅ #1 | ✅ #1 |
| `soybean cyst nematode SCN resistant variety` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `Phytophthora resistance Rps3a soybean` | ✅ #1 | ✅ #2 | ✅ #1 | ✅ #1 |
| `XtendFlex soybean Northern Plains` | ❌ | ✅ #1 | ✅ #1 | ✅ #1 |
| `Hard Red Spring wheat stripe rust resistance` | ✅ #1 | ✅ #3 | ✅ #1 | ✅ #1 |
| `Soft White Winter wheat Pacific Northwest` | ✅ #1 | ✅ #5 | ✅ #1 | ✅ #1 |
| `Goss's Wilt resistance corn` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `best corn 2024 Iowa` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `Indiana corn yield comparison 2024` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `AP Iliad Idaho wheat trial` | ✅ #1 | ✅ #5 | ✅ #1 | ✅ #1 |
| `DKC65-95 corn yield in trials` | ✅ #1 | ❌ | ✅ #1 | ✅ #1 |
| `NK1701 corn trials head to head` | ✅ #1 | ❌ | ❌ | ✅ #1 |
| `silage corn high milk per acre dairy` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `soybean 2025 Minnesota top performers` | ✅ #1 | ✅ #1 | ✅ #1 | ✅ #1 |
| `Pioneer P1142 hybrid recommendation` | ✅ | ✅ | ✅ | ✅ |
| `DKC65-20 yield Alabama trial` | ✅ | ✅ | ✅ | ✅ |