
claude 54094a0d43 Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs)

Independent third-party performance data — land-grant programs that test every
entered brand side-by-side with replication + LSD stats. This is the legitimate
way to get Pioneer / DEKALB / Brevant / Channel performance the corpus can't
scrape directly (data_type=trial, results[] shape; falls through the trial
chunker).

- illinois_vt_trials (30 docs, 1,392 rows) — U of Illinois VT. Per-region XLSX
  (openpyxl), corn + soy + WHEAT, 2024+2025. Rich per-site agronomic metadata;
  corn-following-corn vs -soybean kept distinct.
- iowa_icpt_trials (24 docs, 674 rows) — Iowa State ICPT. ASP.NET GridView
  (viewstate postback for year/district), corn + soy by district x season.
- ohio_ocpt_trials (69 docs, 4,647 rows) — OSU/CFAES OCPT. Report PDF
  (pdfplumber; per-site column groups split by header Yield-token count +
  x-coord footnote bucketing), corn + soy per site, 2024+2025.

91 distinct seed brands across the three; majors confirmed present in the
independent rankings: DEKALB 395, Golden Harvest 249, Channel 241, NK 212,
Xitavo 135, LG 103, Pioneer 88, Asgrow 59. (A brand only appears where it
ENTERED a given program — e.g. Brevant not in Iowa, DEKALB/Channel not in
Illinois — true negatives, not parse gaps.)

- rag/chunk.py: gated `include_region` on _render_gh_plot_chunk; the 3 university
  sources route through it so the region/district is in the embedded chunk +
  labeled "variety trial (cross-vendor, independent third-party)". Existing plot
  sources (gh/lg/agrigold/proharvest) unchanged.
- requirements.txt: openpyxl (Illinois XLSX; scrape-time only).
- sources.json + README/CLAUDE/lessons: registered + attributed; lessons
  trial-data + Pioneer entries updated (Pioneer/DEKALB performance now available
  indirectly via these trials).

Validation: all 123 chunk via rag.chunk.chunks_from_trial (0 errors), 0
out-of-range yields, 0 dup keys. Public land-grant data; attribution recorded in
each tos_note. CI rebuilds the index from the committed corpus.

2026-06-10 08:35:50 -04:00

| Path | Last commit | Date |
| --- | --- | --- |
| .gitea/workflows | CI fix + Drawbar-stack deploy pattern | 2026-05-25 17:23:03 -04:00 |
| corpus | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| deploy | CI fix + Drawbar-stack deploy pattern | 2026-05-25 17:23:03 -04:00 |
| docs_mcp | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| eval | Phase 6/7: wire rerank + eval harness, 100% pass on 21 golden queries | 2026-05-25 17:02:57 -04:00 |
| rag | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| scrape | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| scripts | seed-mcp scaffold: clone docs-mcp-template, customize for crop_seed PRODUCT_NAME | 2026-05-25 12:28:49 -04:00 |
| .gitignore | Phase 4-5: deployable container + corpus snapshot + CI fixes | 2026-05-25 13:40:05 -04:00 |
| CLAUDE.md | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| Dockerfile | seed-mcp scaffold: clone docs-mcp-template, customize for crop_seed PRODUCT_NAME | 2026-05-25 12:28:49 -04:00 |
| PLAN.md | seed-mcp scaffold: clone docs-mcp-template, customize for crop_seed PRODUCT_NAME | 2026-05-25 12:28:49 -04:00 |
| README.md | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| requirements.txt | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |
| sources.json | Add university-extension variety trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 trial docs) | 2026-06-10 08:35:50 -04:00 |

README.md

seed-mcp

MCP server over the public catalogs of major US row-crop seed vendors — variety identity (what each hybrid IS) plus yield-trial data (how they actually perform in real cooperator fields). Sibling project to crop-chem-docs (pesticide labels), feeding the same Drawbar farm-advisor AI.

Deployed 2026-05-25 on trashpanda as a sibling sidecar to chem-mcp; the Drawbar advisor calls it via the seed: prefix.

What's in the corpus

~9,300 indexed records (one chunk each) across two complementary surfaces:

Variety identity — 2,398 records

| Source | Count | Vendor | Brand |
| --- | --- | --- | --- |
| `bayer_seeds` | 931 | Bayer | DEKALB / Channel (corn) / Asgrow (soy) / WestBred (wheat) / Deltapine |
| `latham` | 264 | Latham Hi-Tech Seeds | Latham (corn / soy); independent family brand, Alexander IA |
| `stine` | 217 | Stine Seed Company | Stine (corn / soy); largest US independent, Adel IA |
| `lg_seeds` | 170 | AgReliant | LG Seeds (corn / soy / sorghum) |
| `golden_harvest` | 139 | Syngenta | Golden Harvest (corn / soy) |
| `robseeco` | 130 | RobSeeCo | Rob-See-Co / Innotech (corn / soy); independent, Elkhorn NE; from the seed-guide PDF |
| `nk` | 122 | Syngenta | NK (corn / soy) |
| `proharvest` | 119 | ProHarvest Seeds | ProHarvest / Apex (corn / soy / wheat); independent Corn Belt brand |
| `agrigold` | 111 | AgReliant | AgriGold (corn / soy) |
| `first_choice` | 78 | 1st Choice Seeds | 1st Choice (corn / soy / wheat); employee-owned independent, Rushville IN |
| `burrus` | 64 | Burrus Seed | Burrus / Power Plus / DONMARIO (corn / soy); independent family, Arenzville IL |
| `ebberts_seeds` | 29 | Ebbert's Seeds | Ebbert's (corn / soy / wheat); independent E. Corn Belt breeder |
| `agripro` | 24 | Syngenta | AgriPro (wheat: HRW / HRS / HWS / SWW) |

Yield-trial data — 6,910 documents

| Source | Count | Notes |
| --- | --- | --- |
| `gh_plot_reports` | 4,299 | Golden Harvest plot reports, 2024+2025. Cross-vendor head-to-head: DEKALB / NK / GH / Pioneer / Channel all appear in the same trial rankings. |
| `lg_plot_reports` | 1,307 | LG Seeds (AgReliant) cross-vendor plots, top-5 per site, 2024+2025. |
| `agrigold_plot_reports` | 1,006 | AgriGold (AgReliant) cross-vendor plots, full ranking + rich plot management, 2024+2025. |
| `proharvest_plots` | 161 | ProHarvest Seeds per-cooperator harvest reports (corn / soy, 2024+2025). Many are cross-vendor (ProHarvest / Apex vs Pioneer / DEKALB / Becks / Channel / Wyffels). Structured rank/yield/%H2O/test-wt where the PDF fits the template; off-template third-party reports kept verbatim. |
| `ohio_ocpt_trials` | 69 | University-extension trial (OSU/CFAES); corn + soy per site, 2024+2025. Independent third-party; ranks Channel / DEKALB / NK / Golden Harvest / LG / AgriGold / Beck's etc. side-by-side. |
| `illinois_vt_trials` | 30 | University-extension trial (U of Illinois VT); corn + soy + wheat, 2024+2025. Pioneer / NK + many regionals; rich per-site agronomic metadata. |
| `iowa_icpt_trials` | 24 | University-extension trial (Iowa State / ICPT); corn + soy by district, 2024+2025. Pioneer / DEKALB / Asgrow / NK / Golden Harvest. |
| `agripro_trials` | 14 | Regional wheat trial PDF summaries (PNW, Western Plains, Northern Plains, etc.) |

The three university-extension sources (ohio_ocpt_trials, illinois_vt_trials, iowa_icpt_trials) are independent third-party performance data: land-grant programs that test every entered brand side-by-side with replication + LSD stats, including majors whose performance data we can't scrape directly (Pioneer / DEKALB / Brevant). The publisher is the university; the seed brands live in each result row's brand field.
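Because the brand sits on each result row rather than on the document, per-brand coverage has to be counted across rows. A minimal sketch of that tally follows; only `data_type`, `results`, and `brand` are taken from this repo's description, while `source`, `product`, `yield_bu_ac`, the placeholder product names, and the yield values are made up for illustration.

```python
from collections import Counter

# Hypothetical trial documents. Only data_type / results / brand mirror the
# documented shape; every other field and all values are illustrative.
trial_docs = [
    {
        "source": "iowa_icpt_trials",
        "data_type": "trial",
        "results": [
            {"brand": "DEKALB", "product": "hybrid-A", "yield_bu_ac": 241.3},
            {"brand": "Pioneer", "product": "hybrid-B", "yield_bu_ac": 238.9},
        ],
    },
    {
        "source": "ohio_ocpt_trials",
        "data_type": "trial",
        "results": [
            {"brand": "DEKALB", "product": "hybrid-C", "yield_bu_ac": 229.0},
            {"brand": "Channel", "product": "hybrid-D", "yield_bu_ac": 233.4},
        ],
    },
]


def brand_row_counts(docs):
    """Count result rows per seed brand across all trial documents."""
    counts = Counter()
    for doc in docs:
        if doc.get("data_type") != "trial":
            continue  # variety-identity records carry no results[] rows
        for row in doc.get("results", []):
            counts[row["brand"]] += 1
    return counts


counts = brand_row_counts(trial_docs)
print(counts.most_common())
```

A brand with a zero count for one program is not necessarily a parse gap: a brand only appears where it entered that program.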

Not in the corpus (documented in `docs_mcp/lessons.md`)

- Pioneer / Corteva (all brands): ToS bans automation. This now covers the whole Corteva family (Pioneer, Brevant, Hoegemeyer, which is the consolidation brand absorbing Seed Consultants / Dairyland / Nu-Tech / Terral, and the upcoming Vylor spinoff); all share the same corteva.com ToU. A curated fallback lesson points the farmer at a local dealer; legitimate Corteva-data paths are an official license (openinnovation@corteva.com) or university-extension trial data.
- NK yield-results: fiddly ASMX/SOAP endpoint; needs a dedicated reverse-engineering session.
- Bayer per-variety trial data: not publicly indexed (DEKALB / Asgrow trial data flows through Channel reps). Partially covered by the cross-vendor results in the GH plot reports and by the university-extension trials.

MCP tools (6)

| Tool | Purpose |
| --- | --- |
| `search_docs` | Variety IDENTITY: what a hybrid IS (disease ratings, traits, maturity). Hybrid dense+BM25 + cross-encoder rerank + variety-code prefilter. |
| `search_trials` | Variety PERFORMANCE: head-to-head yield trial results. Filterable by crop, state, year, product. |
| `get_page` | Full canonical record for one variety + structured ratings header sourced from the sidecar JSON. |
| `lookup_variety` | Raw sidecar JSON for one variety; fact-check tool, call before quoting any specific rating value. |
| `list_versions` | Discover facets (sources, vendors, brands, crops) currently indexed. |
| `crop_seed_api_lessons` | Curated knowledge: Pioneer fallback policy, scale-direction differences across vendors, trait glossary, SCN race coverage notes. |

search_docs defaults to data_type="variety"; search_trials uses data_type="trial" — single Chroma collection, metadata-filtered.
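The single-collection design can be pictured as one list of chunks scoped by metadata equality. The sketch below is a stdlib-only stand-in for that filter, not the server's code: the real filtering happens inside Chroma, and the helper names (`select`, `search_docs_scope`, `search_trials_scope`), the chunk ids, and the metadata values are hypothetical.

```python
# Hypothetical chunks in one shared collection; metadata values are made up.
CHUNKS = [
    {"id": "v1", "metadata": {"data_type": "variety", "crop": "corn"}},
    {"id": "t1", "metadata": {"data_type": "trial", "crop": "corn",
                              "state": "IA", "year": 2025}},
    {"id": "t2", "metadata": {"data_type": "trial", "crop": "soy",
                              "state": "OH", "year": 2024}},
]


def select(chunks, data_type, **facets):
    """Return ids of chunks whose metadata equals every requested facet."""
    wanted = {"data_type": data_type}
    wanted.update({k: v for k, v in facets.items() if v is not None})
    return [
        c["id"]
        for c in chunks
        if all(c["metadata"].get(k) == v for k, v in wanted.items())
    ]


def search_docs_scope(chunks, data_type="variety"):
    """Scope used by search_docs: identity records unless overridden."""
    return select(chunks, data_type)


def search_trials_scope(chunks, crop=None, state=None, year=None):
    """Scope used by search_trials: always trial records, optional facets."""
    return select(chunks, "trial", crop=crop, state=state, year=year)


print(search_docs_scope(CHUNKS))
print(search_trials_scope(CHUNKS, crop="corn"))
```

Keeping both surfaces in one collection means one index build and one embedder pass; the cost is that every query must carry the `data_type` filter.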

Retrieval — eval-validated

From eval/results/baseline.md (21 golden queries, k=5):

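| Retriever | Pass | Recall | P@1 | MRR | Avg ms |
| --- | --- | --- | --- | --- | --- |
| hybrid+rerank | 21/21 | 100% | 90% | 0.905 | 2064 |
| bm25 | 20/21 | 95% | 81% | 0.833 | 5 |
| hybrid (no rerank) | 15/21 | 71% | 62% | 0.619 | 73 |
| dense | 14/21 | 67% | 38% | 0.440 | 79 |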
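For readers unfamiliar with the column names, here is a minimal sketch of how Recall@k, P@1, and MRR are commonly computed from ranked results. It assumes a query passes when at least one expected document appears in the top k; the definitions in eval/run_eval.py may differ, and the toy runs below are invented, so this is not expected to reproduce the numbers in the table.

```python
def first_hit_rank(retrieved, relevant, k):
    """1-based rank of the first relevant id within the top k, else None."""
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return rank
    return None


def evaluate(runs, k=5):
    """runs: list of (retrieved_ids, relevant_id_set) pairs, one per query."""
    ranks = [first_hit_rank(retrieved, relevant, k) for retrieved, relevant in runs]
    n = len(ranks)
    hits = [r for r in ranks if r is not None]
    return {
        "recall": len(hits) / n,                      # queries with any hit in top k
        "p_at_1": sum(1 for r in hits if r == 1) / n,  # hit at rank 1
        "mrr": sum(1.0 / r for r in hits) / n,         # mean reciprocal rank
    }


# Toy data: four queries with first-hit ranks 1, 2, miss, 1.
toy_runs = [
    (["a", "b", "c"], {"a"}),
    (["x", "b", "c"], {"b"}),
    (["x", "y", "z"], {"q"}),
    (["m", "n", "o"], {"m"}),
]
metrics = evaluate(toy_runs, k=5)
print(metrics)
```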

Deploy config: HYBRID_SEARCH=true + RERANK_URL=http://llama-rerank:8080.

Some surprises worth knowing:

- Dense embedding alone is the weakest config. Variety codes (DKC62-08RIB), gene names (Rps3a), and trait codes (XF) have no semantic neighbors, so nomic-embed-text returns noise on them.
- Hybrid alone is WORSE than BM25 alone. RRF dilutes BM25's strong ranking with dense noise. Don't ship without rerank.
- BM25-alone (95% recall, 5 ms) is an excellent fallback when the rerank sidecar is unavailable. The variety-code prefilter in search_docs does heavy lifting.
- Anti-hallucination queries pass on every retriever: Pioneer fallback + not-in-corpus product checks hold across all configs.
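The dilution effect is easy to reproduce with reciprocal rank fusion on toy rankings. The sketch below assumes the standard RRF formula, score = sum of 1 / (k + rank), with the conventional k = 60; the constant and tie-breaking used in this repo are not stated here, and the document ids are invented.

```python
def rrf(rankings, k=60):
    """Fuse ranked id lists with reciprocal rank fusion (k=60 is an assumption)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# BM25 nails the exact variety code; dense never retrieves it at all.
bm25_ranking = ["dkc62-08rib", "doc-b", "doc-c"]
dense_ranking = ["doc-b", "doc-c", "doc-d"]

fused = rrf([bm25_ranking, dense_ranking])
print(fused)
```

Here the exact-match document is first for BM25 but third after fusion, because two mediocre documents collect votes from both lists. A cross-encoder rerank over the fused candidates is what restores the ordering.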

Quick start

git clone https://git.jpaul.io/justin/seed-mcp.git
cd seed-mcp
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Sample-scrape just to verify wiring:
python -m scrape.runner --source bayer_seeds --limit 3

# Full refresh (all registered sources; expect ~25 min for gh_plot_reports
# with 4 concurrent workers):
python -m scrape.runner --all --force

# Rebuild Chroma + BM25 from the corpus:
OLLAMA_URL=http://192.168.0.125:11434 PRODUCT_NAME=crop_seed \
  python -m rag.index --rebuild

# Run the eval harness:
RERANK_URL=http://localhost:18080 python -m eval.run_eval \
  --queries eval/queries.jsonl --k 5 \
  --output eval/results/baseline.md

# Local MCP server (stdio for Claude Desktop dev):
PRODUCT_NAME=crop_seed python -m docs_mcp.server --transport stdio

# Local HTTP server (matches production transport):
PRODUCT_NAME=crop_seed python -m docs_mcp.server \
  --transport streamable-http --port 8000

Repo layout

.
├── CLAUDE.md                      # Canonical agent guide. Read first.
├── PLAN.md                        # Template's 13-phase build guide.
├── README.md
├── requirements.txt
├── Dockerfile
├── sources.json                   # Source catalog (one entry per scraper)
├── deploy/docker-compose.yml      # Drop-in compose snippet for Drawbar
├── .gitea/workflows/
│   ├── refresh.yml                # Monthly cron: scrape + index + image push
│   └── image-only.yml             # On-demand code-only ship cycle
├── scrape/
│   ├── runner.py                  # `python -m scrape.runner --source <id>`
│   ├── changelog.py               # Reused from template
│   └── sources/
│       ├── bayer_seeds.py         # ~475 varieties across 3 brands
│       ├── golden_harvest.py      # ~139 varieties (post-discontinued filter)
│       ├── nk.py                  # 122 varieties (corn + soy)
│       ├── agripro.py             # 24 wheat varieties
│       ├── gh_plot_reports.py     # 4,299 cross-vendor yield trials
│       ├── agripro_trials.py      # 14 regional trial PDFs
│       └── becks_pfr.py           # stub — Sanity GROQ research corpus
├── rag/
│   ├── embeddings.py              # nomic-embed-text via Ollama
│   ├── chunk.py                   # one-chunk-per-variety + trial chunker
│   ├── index.py                   # Chroma + BM25 builder
│   └── bm25.py                    # FTS5 lexical index w/ seed-domain facets
├── docs_mcp/
│   ├── server.py                  # FastMCP — 6 tools, hybrid+rerank
│   ├── lessons.md                 # Curated knowledge layer (Pioneer fallback)
│   └── usage.py                   # TimedCall + JSONL telemetry
├── eval/
│   ├── queries.jsonl              # 21 golden queries
│   ├── retrievers.py              # dense / bm25 / hybrid / hybrid+rerank
│   ├── run_eval.py                # MRR / Recall@k / Precision@1
│   └── results/baseline.md        # Current deploy-config eval numbers
└── corpus/                        # Committed scrape output (CI-refreshed)
    ├── bayer_seeds/
    ├── golden_harvest/
    ├── nk/
    ├── agripro/
    ├── gh_plot_reports/
    └── agripro_trials/

Infrastructure

- Registry: pushes to 192.168.0.2:1234 (LAN, no CF body cap); deploys pull git.jpaul.io/justin/seed-mcp:latest (public, CF tunnel). Also tagged :<sha12> for rollback pinning and :corpus-YYYY.MM.DD for snapshot pinning.
- Embedder pool (CI): 3 GPU-pinned Ollama endpoints, weighted toward .0.125 (RTX 40-series, 242 embeds/sec):
  - .0.125:11434 ×4 (4090)
  - .0.2:11436 ×2 (GPU-pinned)
  - .0.2:11435 ×1 (GPU-pinned)
  - Do NOT use .0.2:11434 (not GPU-pinned) or localhost:11434 (works in dev, breaks in CI because the runner container has no Ollama on its loopback).
- Reranker: shared llama-rerank sidecar on trashpanda's Tesla P4 (jina-reranker-v2-base via llama.cpp). One container serves both seed-mcp and crop-chem-docs. Must be on the drawbar-backend_default Docker network; see deploy/docker-compose.yml for the network-attach gotcha that caused silent rerank degradation on chem-mcp prior to 2026-05-25.
- PRODUCT_NAME: crop_seed. Used in the Chroma collection name (crop_seed_docs), the BM25 db filename (bm25/crop_seed_docs.db), and the crop_seed_api_lessons tool name. Not seed_mcp, which would conflict with the container/service name.
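One simple way to honor the ×4 / ×2 / ×1 weights is a weighted round-robin over the endpoints. The sketch below is an assumption about how such weights could be applied, not the CI's actual scheduler; the endpoint labels are copied in their abbreviated form and the helper names are hypothetical.

```python
import itertools
from collections import Counter

# Weights as listed above; keys keep the abbreviated host labels as written.
POOL_WEIGHTS = {
    ".0.125:11434": 4,
    ".0.2:11436": 2,
    ".0.2:11435": 1,
}


def build_schedule(weights):
    """Expand {endpoint: weight} into one round of endpoint slots."""
    return [endpoint for endpoint, weight in weights.items() for _ in range(weight)]


def assign(batches, weights):
    """Pair each embedding batch with an endpoint, cycling the weighted round."""
    slots = itertools.cycle(build_schedule(weights))
    return [(batch, next(slots)) for batch in batches]


assignments = assign(range(14), POOL_WEIGHTS)
load = Counter(endpoint for _, endpoint in assignments)
print(load)
```

Over any whole number of rounds the load splits 4:2:1. A production scheduler would more likely interleave the slots or pull from a shared queue so the fastest GPU is never idle.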

Deploy mechanics

Watchtower handles auto-deploy. Every push to seed-mcp/main that touches docs_mcp/, rag/, scrape/, requirements.txt, Dockerfile, or sources.json triggers image-only.yml:

1. Checks out main with full corpus
2. Rebuilds Chroma + BM25 (~3 min on the GPU pool)
3. docker build + push three tags to the LAN registry
4. Links the package to the repo via Gitea API
5. Watchtower on trashpanda polls :latest every 5 min → pulls + recreates drawbar-backend-seed-mcp-1

Corpus refresh runs monthly via refresh.yml (1st of each month, 06:00 UTC): it re-scrapes all GREEN sources, commits any corpus diff, rebuilds the indexes, and ships a new image tagged :corpus-YYYY.MM.DD.
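The three image tags can be derived from the commit and the corpus date. This is a minimal sketch that assumes `<sha12>` means the first 12 characters of the commit SHA; the function name, the example SHA, and the example date are hypothetical, and the workflow files remain the source of truth.

```python
from datetime import date


def image_tags(repo, commit_sha, corpus_date):
    """Return the three tags for one build: latest, sha12, corpus snapshot."""
    return [
        f"{repo}:latest",                          # what Watchtower polls
        f"{repo}:{commit_sha[:12]}",               # rollback pinning (assumed prefix)
        f"{repo}:corpus-{corpus_date:%Y.%m.%d}",   # snapshot pinning
    ]


tags = image_tags(
    "git.jpaul.io/justin/seed-mcp",
    "0123456789abcdef0123456789abcdef01234567",  # made-up SHA
    date(2026, 6, 1),                            # made-up corpus date
)
print("\n".join(tags))
```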

See CLAUDE.md for canonical sidecar schemas, the reversed disease-scale gotcha (NK + AgriPro publish 1=best, vs Bayer/GH 9=best), and the scraper conventions.
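The reversed-scale gotcha matters whenever ratings from different vendors are compared. The sketch below flips 1=best ratings onto a 9=best scale; it assumes both conventions use an integer 1 to 9 range, which is not confirmed here, and the function name is hypothetical. CLAUDE.md remains the authority on each source's scale.

```python
# Source ids grouped by published scale direction, per the note above.
ONE_IS_BEST = {"nk", "agripro"}
NINE_IS_BEST = {"bayer_seeds", "golden_harvest"}


def to_nine_is_best(source, rating, scale_max=9):
    """Normalize a disease rating so that higher always means better.

    Assumes an integer 1..scale_max scale for every source (an assumption).
    """
    if not 1 <= rating <= scale_max:
        raise ValueError(f"rating {rating} outside 1..{scale_max}")
    if source in ONE_IS_BEST:
        return scale_max + 1 - rating  # 1 -> 9, 2 -> 8, ..., 9 -> 1
    if source in NINE_IS_BEST:
        return rating
    raise KeyError(f"scale direction unknown for source {source!r}")


print(to_nine_is_best("nk", 2))           # an NK "2" is a strong rating
print(to_nine_is_best("bayer_seeds", 8))  # a Bayer "8" is comparably strong
```

Raising on unknown sources is deliberate: silently passing a rating through would hide exactly the kind of direction mix-up this check exists to catch.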

Status

| Phase | Status |
| --- | --- |
| 0: scaffold | ✅ |
| 1: scrapers (bayer_seeds / golden_harvest / nk / agripro / gh_plot_reports / agripro_trials) | ✅ |
| 2: chunk + index | ✅ |
| 3: MCP tools (6) | ✅ |
| 4-5: Dockerfile + Gitea CI | ✅ |
| 6: reranker integration | ✅ (eval-validated; deploy uses hybrid+rerank) |
| 7: eval harness | ✅ (21 golden queries, baseline committed) |
| 8: hybrid search | ✅ (default ON) |
| 11: `crop_seed_api_lessons` curated layer | ✅ (Pioneer fallback + 7 other lessons) |
| 13: weekly_digest | not planned for seed-mcp |

Remaining work (deferred, not blocking):

- becks_pfr scraper (2,089 research docs via public Sanity GROQ)
- 2023 GH plot reports backfill (~3,619 more docs)
- NK yield-results endpoint reverse-engineer
- Channel Seed brand (~320 more Bayer varieties; separate brand under the same sitemap)
