# seed-mcp

MCP server over the public catalogs of major US row-crop seed vendors — corn, soybeans, wheat. Sibling project to [`crop-chem-docs`](https://git.jpaul.io/justin/crop-chem-docs) (pesticide labels), feeding the same Drawbar farm-advisor AI.

The server exposes per-variety records with **agronomic ratings**, **disease tolerance**, **trait stack**, **maturity**, and **regional notes** — so the advisor can answer questions like "which corn hybrid for sandy soil, drought-prone, RM ≤105 in northeast Iowa?" without rummaging through individual brand sites.

## Vendor coverage

| Vendor | Verdict | Varieties | Notes |
|---|---|---|---|
| Bayer seeds (DEKALB + Asgrow + WestBred) | 🟢 | ~475 | Same `cropscience.bayer.us` Next.js infra as crop-chem-docs |
| Golden Harvest (Syngenta) | 🟢 | ~175 | Sitemap + server-rendered HTML + Syngenta CDN PDFs |
| NK (Syngenta) | 🟢 | 29 | Shares PDF fetcher with Golden Harvest |
| AgriPro (Syngenta wheat) | 🟢 | 24 | Drupal Views, server-rendered |
| Beck's PFR | 🟡 | 2,089 | Public Sanity GROQ API (no auth) |
| Beck's products | 🟡 | 860 | Identity-only until SeedIQ XHR sniffed |
| Pioneer (Corteva) | 🔴 | — | ToS bans automation — curated fallback lesson instead |

## Quick start

```bash
git clone https://git.jpaul.io/justin/seed-mcp.git
cd seed-mcp
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Run one scraper
python -m scrape.runner --source bayer_seeds --force

# Rebuild indexes
python -m rag.index --rebuild

# Local MCP server (stdio for Claude Desktop dev)
python -m docs_mcp.server --transport stdio
```

## Tools exposed

| Tool | Purpose |
|---|---|
| `search_docs` | Hybrid + rerank variety search with crop / RM / trait / region filters |
| `get_page` | Full variety record by `(source, source_key)` |
| `list_versions` | Discover crops, brands, traits, RM/MG ranges, wheat classes |
| `corpus_status` | Counts + freshness; useful for health probes |
| `crop_seed_api_lessons` | Curated agronomy lessons — Pioneer fallback, disease-scale normalization, regional placement heuristics |

## Build phases

This is a clone of [`docs-mcp-template`](https://git.jpaul.io/justin/docs-mcp-template). The 13 phases in `PLAN.md` apply:

| Phase | Status |
|---|---|
| 0 — scaffold | done |
| 1 — first scraper (bayer_seeds) | next |
| 2 — chunk + index | pending |
| 3 — baseline MCP tools | template defaults |
| 4-5 — Dockerfile + CI | done (placeholders filled) |
| 6 — reranker | shares `llama-rerank` sidecar with crop-chem-docs |
| 7 — eval harness | pending (curate ~25 queries) |
| 8 — hybrid search | done (template) |
| 9 — diff_versions, list_cluster | optional |
| 11 — `crop_seed_api_lessons` curated layer | pending |

See `CLAUDE.md` for the canonical sidecar schema and the disease-scale-normalization gotcha (Golden Harvest is reversed).

## Infrastructure

- **Registry**: `git.jpaul.io/justin/seed-mcp:latest` (Watchtower) / `:corpus-YYYY.MM.DD` (production pin)
- **Embedder**: shared Ollama pool with crop-chem-docs (Gitea-host GPUs + Windows Ollama; CI never hits trashpanda's production Ollama)
- **Reranker**: shared `llama-rerank` sidecar on trashpanda's Tesla P4 (one container, both MCPs use it)
- **PRODUCT_NAME**: `crop_seed` (not `seed_mcp` — used in Chroma collection, BM25 db filename, and `crop_seed_api_lessons` tool)
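
The `:corpus-YYYY.MM.DD` production pin listed under Registry can be illustrated with a small sketch. This is not code from the repo: the helper name `corpus_tag` and the example date are hypothetical, and zero-padded month and day are assumed from the `MM.DD` pattern.

```python
from datetime import date

# Image path taken from the Registry entry in this README.
REGISTRY_IMAGE = "git.jpaul.io/justin/seed-mcp"


def corpus_tag(build_date: date) -> str:
    """Return the image reference pinned to a corpus build date.

    Hypothetical helper: assumes the tag is `corpus-` followed by the
    date as YYYY.MM.DD with zero-padded month and day.
    """
    return f"{REGISTRY_IMAGE}:corpus-{build_date.strftime('%Y.%m.%d')}"


if __name__ == "__main__":
    # Example date chosen for illustration only.
    print(corpus_tag(date(2025, 1, 7)))
```

Pinning production to a dated tag like this, rather than `:latest`, means a Watchtower pull of a fresh image does not silently swap the corpus under a running deployment.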