justin cd4a0f3148 CI fix + Drawbar-stack deploy pattern
CI was failing on the "Rebuild indexes from committed corpus" step
with httpx.ConnectError [Errno 111] — `localhost:11434` in the
OLLAMA_URL pool resolves to the Gitea Actions runner CONTAINER's
own localhost (no Ollama there), not the host. Fix: drop localhost
from CI's pool; it stays useful for dev runs from the workstation
where the TITAN X serves Ollama on the host loopback.

Final CI pool — 3 LAN endpoints, weighted to .0.125 (4090):
  .0.125:11434  ×4 (RTX 40-series, 242 embeds/sec)
  .0.2:11436    ×2 (GPU-pinned,    108 embeds/sec)
  .0.2:11435    ×1 (GPU-pinned,     72 embeds/sec)

deploy/docker-compose.yml — rewrite to match Drawbar's actual
parent-stack pattern, learned by inspecting how chem-mcp is
deployed on trashpanda:

  - Service name `seed-mcp` (matches chem-mcp's pattern). Reached
    via docker DNS as `seed-mcp:8080` from drawbar-backend-api.
  - Internal-only (no host port), expose 8080 only.
  - MCP_PORT=8080 inside container (chem-mcp uses 8080 too).
  - OLLAMA_URL via host.docker.internal:11434 (trashpanda's Ollama
    runs on the host). extra_hosts maps host-gateway.
  - RERANK_URL: http://llama-rerank:8080 — but llama-rerank is on
    the default `bridge` network, not drawbar-backend_default,
    so chem-mcp's reranker silently fails! Documented patch:
       docker network connect drawbar-backend_default llama-rerank
    Fixes rerank for BOTH chem-mcp (today: dense-only fallback)
    and the new seed-mcp.
  - Watchtower label set so CI pushes to :latest auto-deploy.

Documented llama-rerank service block as an alternative for
bringing the sidecar fully into the parent compose stack, with the
ubatch-size flag the seed corpus needs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 17:23:03 -04:00

seed-mcp

MCP server over the public catalogs of major US row-crop seed vendors — corn, soybeans, wheat. Sibling project to crop-chem-docs (pesticide labels), feeding the same Drawbar farm-advisor AI.

The server exposes per-variety records with agronomic ratings, disease tolerance, trait stack, maturity, and regional notes — so the advisor can answer questions like "which corn hybrid for sandy soil, drought-prone, RM ≤105 in northeast Iowa?" without rummaging through individual brand sites.

Vendor coverage

Vendor Verdict Varieties Notes
Bayer seeds (DEKALB + Asgrow + WestBred) 🟢 ~475 Same cropscience.bayer.us Next.js infra as crop-chem-docs
Golden Harvest (Syngenta) 🟢 ~175 Sitemap + server-rendered HTML + Syngenta CDN PDFs
NK (Syngenta) 🟢 29 Shares PDF fetcher with Golden Harvest
AgriPro (Syngenta wheat) 🟢 24 Drupal Views, server-rendered
Beck's PFR 🟡 2,089 Public Sanity GROQ API (no auth)
Beck's products 🟡 860 Identity-only until SeedIQ XHR sniffed
Pioneer (Corteva) 🔴 ToS bans automation — curated fallback lesson instead

Quick start

git clone https://git.jpaul.io/justin/seed-mcp.git
cd seed-mcp
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Run one scraper
python -m scrape.runner --source bayer_seeds --force

# Rebuild indexes
python -m rag.index --rebuild

# Local MCP server (stdio for Claude Desktop dev)
python -m docs_mcp.server --transport stdio

Tools exposed

Tool Purpose
search_docs Hybrid + rerank variety search with crop / RM / trait / region filters
get_page Full variety record by (source, source_key)
list_versions Discover crops, brands, traits, RM/MG ranges, wheat classes
corpus_status Counts + freshness; useful for health probes
crop_seed_api_lessons Curated agronomy lessons — Pioneer fallback, disease-scale normalization, regional placement heuristics

Build phases

This is a clone of docs-mcp-template. The 13 phases in PLAN.md apply:

Phase Status
0 — scaffold done
1 — first scraper (bayer_seeds) next
2 — chunk + index pending
3 — baseline MCP tools template defaults
4-5 — Dockerfile + CI done (placeholders filled)
6 — reranker shares llama-rerank sidecar with crop-chem-docs
7 — eval harness pending (curate ~25 queries)
8 — hybrid search done (template)
9 — diff_versions, list_cluster optional
11 — crop_seed_api_lessons curated layer pending

See CLAUDE.md for the canonical sidecar schema and the disease-scale-normalization gotcha (Golden Harvest is reversed).

Infrastructure

  • Registry: git.jpaul.io/justin/seed-mcp:latest (Watchtower) / :corpus-YYYY.MM.DD (production pin)
  • Embedder: shared Ollama pool with crop-chem-docs (Gitea-host GPUs + Windows Ollama; CI never hits trashpanda's production Ollama)
  • Reranker: shared llama-rerank sidecar on trashpanda's Tesla P4 (one container, both MCPs use it)
  • PRODUCT_NAME: crop_seed (not seed_mcp — used in Chroma collection, BM25 db filename, and crop_seed_api_lessons tool)
S
Description
MCP server over US row-crop seed/hybrid variety data (corn, soybeans, wheat). Sibling to crop-chem-docs. Feeds Drawbar farmer advisor.
Readme 23 MiB
Languages
Python 99.7%
Dockerfile 0.3%