A user flagged that Channel is expanding into their area. Re-walked
the cropscience.bayer.us sitemap and found 8 additional brand×crop
paths beyond the original DEKALB/Asgrow/WestBred triple. This patches
the scraper to walk all of them; the total Bayer variety count roughly
doubles from 475 to 931, and the corpus picks up first-ever
coverage of sorghum (36), cotton (30), canola (6), and silage as a
distinct crop (previously conflated with corn).
Net new varieties: 456
  Channel    corn=181 soy=67 silage=54 sorghum=18  (320)
  DEKALB     silage=82 sorghum=18 canola=6         (106)
  Deltapine  cotton=30                             (30)
scrape/sources/bayer_seeds.py
- Replace `BRANDS` (brand → 1 path) and `CROP_SUFFIX` (brand → 1
suffix) with a flatter `BRAND_PATHS` list of (brand, url_path,
crop, is_primary_for_brand) entries. Channel and DEKALB are now
multi-crop brands; the same scraper walks every brand×crop pair.
- source_key derivation: for a brand's PRIMARY crop, strip the
trailing `-<crop>` suffix (matches the existing deployed source
keys for DEKALB corn / Asgrow soy / WestBred wheat). For
SECONDARY crops, KEEP the suffix so that the same DEKALB SKU sold
as both grain corn and silage gets two distinct source_keys
(collision-safe and unambiguous for `lookup_variety`).
- New `--crop` CLI filter for incremental backfills.
- Log line shows brand + crop alongside source_key for visibility.
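A minimal sketch of the flattened table and the source_key rule described in the bullets above. The URL paths, the slug format, and the helper name `derive_source_key` are illustrative assumptions, not the scraper's actual values.

```python
from typing import NamedTuple


class BrandPath(NamedTuple):
    brand: str
    url_path: str  # hypothetical path, for illustration only
    crop: str
    is_primary_for_brand: bool


# Assumed sample entries; the real list covers every brand×crop pair.
BRAND_PATHS = [
    BrandPath("DEKALB", "/dekalb/corn", "corn", True),
    BrandPath("DEKALB", "/dekalb/silage", "silage", False),
    BrandPath("Asgrow", "/asgrow/soybeans", "soy", True),
]


def derive_source_key(slug: str, crop: str, is_primary: bool) -> str:
    """Strip the trailing -<crop> suffix for a brand's primary crop only."""
    suffix = f"-{crop}"
    if is_primary and slug.endswith(suffix):
        return slug[: -len(suffix)]
    # Secondary crops keep the suffix so the same SKU never collides.
    return slug
```

With this rule a hypothetical slug `dkc62-08rib-corn` maps to the legacy key `dkc62-08rib`, while `dkc62-08rib-silage` stays as-is.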
rag/chunk.py
- Channel + Deltapine pages use slightly different characteristics
group labels (DISEASE not DISEASE RATINGS, AGRONOMIC
CHARACTERISTICS not GROWTH/HARVEST, plus MATURITY / ADAPTATION /
HERBICIDES / OTHER). Fold them into the DISEASE / AGRONOMIC /
MANAGEMENT label sets so the chunker buckets them correctly
into the standard sections.
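One way the label folding could look. The set names and `bucket_for` are hypothetical, and the placement of MATURITY / ADAPTATION / HERBICIDES / OTHER is a guess; the commit only says they are folded into the three label sets.

```python
# Assumed grouping: only the DISEASE and AGRONOMIC CHARACTERISTICS
# equivalences are stated in the commit; the rest is illustrative.
DISEASE_LABELS = {"DISEASE RATINGS", "DISEASE"}
AGRONOMIC_LABELS = {
    "GROWTH",
    "HARVEST",
    "AGRONOMIC CHARACTERISTICS",
    "MATURITY",
    "ADAPTATION",
}
MANAGEMENT_LABELS = {"MANAGEMENT", "HERBICIDES", "OTHER"}


def bucket_for(label: str) -> str:
    """Map a vendor's characteristics-group label to a standard section."""
    key = label.strip().upper()
    if key in DISEASE_LABELS:
        return "disease"
    if key in AGRONOMIC_LABELS:
        return "agronomic"
    if key in MANAGEMENT_LABELS:
        return "management"
    return "unbucketed"  # surfaced separately so new labels are noticed
```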
Smoke-tested cross-brand × cross-crop queries against the rebuilt
index (5,529 chunks total) — all 6 sample queries surface the
right brand+crop at top-3:
  Channel corn 110 RM      → 210-25TRE BRAND
  Channel soy 2.5 MG IA    → 2622RXF BRAND
  Deltapine cotton XF      → DP 1820 B3XF BRAND
  Sorghum dryland Kansas   → 6B95 BRAND (Channel)
  Silage corn WI dairy     → DKC64-44RIB BRAND BLEND (silage variant)
  Canola Northern Plains   → DK401TL BRAND
Watchtower will pull the new image on the next push; deploy is
unchanged otherwise.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The scaffold-era README was out of sync with the shipped product:
- Vendor counts stale (recon estimates, not actual deployed counts)
- Trial data sources (gh_plot_reports + agripro_trials) entirely
unmentioned
- Tool list included `corpus_status` (doesn't exist) and omitted both
`lookup_variety` and `search_trials`
- Build-phase table showed everything as "pending" / "next" but
Phases 1-8 + 11 all shipped
Rewrite to reflect the deployed state:
- Corpus inventory: 760 variety records + 4,313 trial documents =
5,073 chunks across 6 sources
- All 6 MCP tools documented with their purpose
- Eval baseline table (hybrid+rerank wins 100%, P@1 90%, MRR 0.905)
with the surprising findings (dense alone is noise; hybrid w/o
rerank is WORSE than BM25 alone)
- Deploy mechanics: Watchtower chain, 4-GPU embedder pool, shared
llama-rerank sidecar with the network-attach gotcha
- Status table: ✅ on the phases that shipped, deferred work list
(becks_pfr, 2023 plot backfill, NK trials, Channel Seed brand)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI was failing on the "Rebuild indexes from committed corpus" step
with httpx.ConnectError [Errno 111] — `localhost:11434` in the
OLLAMA_URL pool resolves to the Gitea Actions runner CONTAINER's
own localhost (no Ollama there), not the host. Fix: drop localhost
from CI's pool; it stays useful for dev runs from the workstation
where the TITAN X serves Ollama on the host loopback.
Final CI pool — 3 LAN endpoints, weighted to .0.125 (4090):
.0.125:11434 ×4 (RTX 40-series, 242 embeds/sec)
.0.2:11436 ×2 (GPU-pinned, 108 embeds/sec)
.0.2:11435 ×1 (GPU-pinned, 72 embeds/sec)
deploy/docker-compose.yml — rewrite to match Drawbar's actual
parent-stack pattern, learned by inspecting how chem-mcp is
deployed on trashpanda:
- Service name `seed-mcp` (matches chem-mcp's pattern). Reached
via docker DNS as `seed-mcp:8080` from drawbar-backend-api.
- Internal-only (no host port), expose 8080 only.
- MCP_PORT=8080 inside container (chem-mcp uses 8080 too).
- OLLAMA_URL via host.docker.internal:11434 (trashpanda's Ollama
runs on the host). extra_hosts maps host-gateway.
- RERANK_URL: http://llama-rerank:8080 — but llama-rerank is on
the default `bridge` network, not drawbar-backend_default,
so chem-mcp's reranker silently fails! Documented patch:
docker network connect drawbar-backend_default llama-rerank
Fixes rerank for BOTH chem-mcp (today: dense-only fallback)
and the new seed-mcp.
- Watchtower label set so CI pushes to :latest auto-deploy.
Documented llama-rerank service block as an alternative for
bringing the sidecar fully into the parent compose stack, with the
ubatch-size flag the seed corpus needs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CORPUS: 4,299 GH plot reports added (3,797 written in this run + 502
from the earlier slow run). The remaining 319 of the 4,618
sitemap-listed URLs 404'd as discontinued. Combined with the prior
760 varieties + 14 AgriPro trials = 5,073 total chunks now indexed.
scrape/sources/gh_plot_reports.py — concurrency speedup:
- 4 worker threads (ThreadPoolExecutor), each with its own
requests.Session for connection-pool efficiency.
- Shared class-level rate limiter (at least 0.25 sec between ANY two
requests across all threads), which caps net throughput at
~4 req/sec, a modest load for a public site.
- Diagnosis vs original 1 req/sec: GH had ZERO rate limiting,
zero 429s, zero retries. The 1 sec self-throttle was just too
conservative. Bench:
1 worker / 1.0 sec throttle: ~0.4 plots/sec (190 min ETA)
4 workers / 0.25 sec throttle: ~3 plots/sec (~25 min actual)
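A sketch of the worker-pool-plus-shared-limiter pattern described in the bullets above, using only the standard library. The names `SharedRateLimiter` and `fetch_all` are hypothetical, and `fetch` is an injected callable; in the real scraper each thread would hold its own requests.Session.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SharedRateLimiter:
    """Spaces calls from ALL threads at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_slot = 0.0

    def wait(self) -> float:
        """Reserve the next free slot, sleep until it, return its timestamp."""
        with self._lock:
            slot = max(time.monotonic(), self._next_slot)
            self._next_slot = slot + self.min_interval
        delay = slot - time.monotonic()
        if delay > 0:
            time.sleep(delay)  # sleep outside the lock so others can reserve
        return slot


def fetch_all(urls, fetch, workers=4, min_interval=0.25):
    """Run fetch(url) on a thread pool, throttled by one shared limiter."""
    limiter = SharedRateLimiter(min_interval)

    def task(url):
        limiter.wait()
        return fetch(url)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, urls))  # results keep input order
```

Reserving the slot under the lock and sleeping outside it keeps the spacing global while letting the other workers queue up their own slots.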
rag/chunk.py — chunk size cap for nomic-embed-text's 2048-token
context window:
- Empirically tested: failure threshold is ~5,250 chars on
numeric-heavy trial chunks (chars/token ratio 2.4 vs 3.5 for
prose). Cap at 4,500 chars to be safely under at worst-case
2.2 chars/token.
- Applied to BOTH variety and trial chunks. Marked truncated
chunks with metadata.embed_truncated = True; FULL text stays
in the on-disk .md for get_page to return verbatim.
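A sketch of the cap, assuming a hypothetical helper `cap_for_embedding`; only the 4,500-char limit and the `embed_truncated` flag come from the commit. At the worst-case 2.2 chars/token, 4,500 chars is about 2,045 tokens, just under the 2048-token window.

```python
MAX_EMBED_CHARS = 4_500  # cap from the commit message


def cap_for_embedding(text: str, metadata: dict,
                      max_chars: int = MAX_EMBED_CHARS) -> tuple:
    """Return (text_to_embed, metadata); flag chunks that were cut."""
    meta = dict(metadata)  # copy so the caller's dict is not mutated
    if len(text) > max_chars:
        meta["embed_truncated"] = True
        return text[:max_chars], meta
    return text, meta


# Worst-case token estimate for a capped chunk (2.2 chars/token).
WORST_CASE_TOKENS = int(MAX_EMBED_CHARS / 2.2)
```

Only the embedded text is cut here; the full body is expected to live in the on-disk .md, as the bullet above says.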
.gitea/workflows/{refresh,image-only}.yml — OLLAMA_URL pool
restructured for the 4 GPU-pinned endpoints. Bench (50-chunk
batches on nomic-embed-text):
.0.125:11434 (RTX 40-series) 242 embeds/sec ← weight ×4
.0.2:11436 (GPU-pinned) 108 embeds/sec ← weight ×2
.0.2:11435 (GPU-pinned) 72 embeds/sec ← weight ×1
localhost (TITAN X) 37 embeds/sec ← weight ×1
Weighting is done by listing the URL multiple times in
OLLAMA_URL since the embedder uses round-robin. .0.2:11434 is
explicitly EXCLUDED — it isn't pinned to a specific GPU.
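A sketch of weighting by repetition under plain round-robin. The hostnames are placeholders, the comma delimiter is an assumption about how OLLAMA_URL is parsed, and `parse_pool` / `assign_batches` are hypothetical names.

```python
import itertools
from collections import Counter


def parse_pool(ollama_url: str) -> list:
    """Split a comma-separated OLLAMA_URL, keeping duplicate entries."""
    return [u.strip() for u in ollama_url.split(",") if u.strip()]


def assign_batches(pool: list, n_batches: int) -> Counter:
    """Count how many batches each endpoint receives under round-robin."""
    picker = itertools.cycle(pool)
    return Counter(next(picker) for _ in range(n_batches))


# Placeholder hosts mirroring the x4 / x2 / x1 / x1 weights above.
OLLAMA_URL = ",".join(
    ["http://gpu-a:11434"] * 4
    + ["http://gpu-b:11436"] * 2
    + ["http://gpu-b:11435", "http://localhost:11434"]
)
```

Over 16 batches this pool sends 8 to the fastest endpoint, 4 to the next, and 2 to each of the others.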
Combined index rebuild for 5,073 chunks now finishes in ~3 min
(was 19+ on the single-endpoint pool).
Smoke tests:
✓ list_versions: 5,073 docs across 6 sources, 2 vendors, 6
brands, 4 crops (corn 2711, soy 2016, silage 223, wheat 123).
✓ search_trials({crop=corn, state=IA, year=2024}): 3 IA 2024
corn trials surfaced.
✓ search_trials("Phytophthora resistance soybean trial"): NK
NK43-W1XFS top-1 in LA 2024 trial (cross-vendor result).
✓ search_trials("AP Iliad Idaho wheat"): AgriPro Washington/N
Idaho 2025 trial surfaced.
✓ search_trials(product=DKC65-95): 3 corn trials containing
that hybrid in IL/IA 2024.
✓ search_trials(product=NK1701): 3 corn trials in AR/MS 2024.
✓ Product filter correctly returns EMPTY for products that
aren't in the corpus (DKC65-20 is a 2023 product; 2023 plots
deferred). Anti-hallucination contract preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
.gitea/workflows/refresh.yml — add scrape steps for the new trial
sources (agripro_trials, gh_plot_reports) so the monthly cron
refreshes them alongside the variety sources. gh_plot_reports
is the heaviest single source (~4,600 docs @ 1 req/sec ≈ 70 min),
so it runs late in the job: a failure in an earlier step then
surfaces before the long scrape has consumed that time.
The commit-message count variables now also report the trial counts.
docs_mcp/lessons.md — new "trial-data" section telling the agent:
- The two surfaces (search_docs = identity, search_trials = perf)
are complementary; how to route a farmer question to each.
- What's indexed (GH plot reports cross-vendor, AgriPro regional
PDFs) vs what's not (Bayer per-variety trials, NK yield results,
Pioneer, university extension trials).
- Recommended workflow: search_trials → identify top performers →
lookup_variety on each to verify identity → don't fabricate.
- How to read a GH plot report (per-column headers vary by crop:
corn/soy use Yield/MST/Test Weight, silage uses Ton/Acre +
Milk + Beef columns).
- Single-data-point caveat: one plot is one cooperator's field;
look across multiple plots for a robust recommendation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This PR introduces TRIAL data — yield-performance results from real
field trials — as a SEPARATE data type alongside variety identity.
The two are complementary:
search_docs → "What's the disease resistance of DKC62-08RIB?"
(variety identity — what it IS)
search_trials → "Which corn hybrid won the IA 2024 trials?"
(performance data — how it PERFORMED)
scrape/sources/gh_plot_reports.py — Golden Harvest plot reports
- 4,618 expected (2024+2025; 2023 deferred to a backfill pass).
- URL: /<crop>/plot-report/<state>/<year>/<plot_id>
- Cross-vendor: each plot lists products from multiple brands
(NK / DEKALB / Golden Harvest / Enogen / Pioneer / Channel) side
by side at one cooperator's field — the kind of independent
comparison data Bayer doesn't publish itself.
- Generic per-column metrics dict (Yield/MST/Test Weight/$/Ac for
corn+soy, Ton/Acre + Milk + Beef columns for silage).
- Politeness: 1 req/sec, retries on 429/5xx, no redirect-follow.
scrape/sources/agripro_trials.py — AgriPro regional trial PDFs
- 14 unique PDFs (38 sitemap links deduped) at /trials-data
- pdfplumber text extraction, region/year detection from filename
- Verbatim PDF text preserved in chunk body so variety + yield
number adjacency drives retrieval (AP Iliad's Aberdeen ID yield
matches a query about "AP Iliad Idaho yield")
rag/chunk.py — chunks_from_trial() dispatching by source
- Plot reports: identity preamble + Top-5 by primary metric + full
ranking table. Metric labels chosen from the data (corn/soy use
"Yield", silage uses "Ton/Acre").
- AgriPro PDFs: identity preamble + verbatim trial body inline so
per-location yields surface for region+variety queries.
- Variety chunks get data_type="variety" metadata; trial chunks get
data_type="trial". Single Chroma collection; the tool router
filters by data_type rather than maintaining two collections.
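A sketch of the "Top-5 by primary metric" step for plot reports. The row shape (`product` plus a `metrics` dict) and the function names are assumptions; only the "Yield" and "Ton/Acre" labels come from the commit.

```python
def primary_metric(rows: list) -> str:
    """Pick the ranking column from the data itself."""
    for name in ("Yield", "Ton/Acre"):
        if any(name in row["metrics"] for row in rows):
            return name
    raise ValueError("no known primary metric in plot rows")


def top_n(rows: list, n: int = 5) -> tuple:
    """Return (metric_label, best n rows), highest value first."""
    metric = primary_metric(rows)
    ranked = sorted(
        (row for row in rows if metric in row["metrics"]),
        key=lambda row: row["metrics"][metric],
        reverse=True,
    )
    return metric, ranked[:n]
```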
rag/index.py — dispatch by sidecar's data_type field
rag/bm25.py — new filter columns (data_type, year, state)
docs_mcp/server.py — sixth MCP tool: search_trials(crop?, state?,
year?, product?, k=10)
- Filters trial chunks via where={"data_type": "trial", ...}
- Optional product substring post-filter for "DKC62-08RIB Iowa 2024"
style searches
- search_docs now defaults to data_type="variety" so trial chunks
don't bleed into variety identity queries
- Tool docstring routes the agent: "use lookup_variety to verify
identity details on any trial winner you surface"
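A sketch of the metadata filter plus the product post-filter. The function names and the hit shape are hypothetical. It assumes Chroma's `where` wants a single condition or an `$and` list when several fields are combined, which is my understanding of its filter syntax rather than something the commit states.

```python
from typing import Optional


def build_trial_where(crop: Optional[str] = None,
                      state: Optional[str] = None,
                      year: Optional[int] = None) -> dict:
    """Build a metadata filter that always restricts to trial chunks."""
    clauses = [{"data_type": "trial"}]
    if crop:
        clauses.append({"crop": crop})
    if state:
        clauses.append({"state": state})
    if year is not None:
        clauses.append({"year": int(year)})
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


def filter_by_product(hits: list, product: Optional[str]) -> list:
    """Keep only hits whose text mentions the product (case-insensitive)."""
    if not product:
        return list(hits)
    needle = product.lower()
    return [hit for hit in hits if needle in hit["text"].lower()]
```

Because the post-filter only drops hits, a product that is absent from the corpus yields an empty list rather than a nearest-neighbor guess.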
NK trial endpoint (/NKSeeds/wsProxy.asmx/GetPlotResult) is documented
as deferred — the ASMX-SOAP shape returned empty XML on initial
probe. Bayer per-variety yield data is not publicly indexed at all
— documented in the trial-scope note (DEKALB/Asgrow trial data flows
through Channel reps, not the web). AgRevival research books exist
as 10 large annual PDFs but are deferred (low ROI per parse).
Initial corpus shipped in this PR: 14 AgriPro trial PDFs. The 4,618
Golden Harvest plot reports are being scraped in the background and
will be added in a follow-up corpus-snapshot PR (~70 min ETA).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agripro (24 varieties)
- Drupal Views form scrape via /search-agripro-brand-varieties with
explicit GET params (sidesteps the AJAX-only-on-load default that
returns an empty form skeleton).
- Per-variety parse: <h1>, .field--node--variety-type--variety,
.field--node--tag-line--variety, .field--node--body, plus the
three rated sections (Agronomics / Grain / Disease) with their
<div class="row"><div class="label">label</div><div>value</div>
pairs.
- Wheat-class distribution: 12 HRS, 7 SWW, 3 HRW, 1 HWS, 1 Barley
— provides the Northern Plains HRS coverage WestBred lacks.
nk (122 varieties — recon's "29" was outdated; the current NK seed
finder lists 41 corn + 81 soy)
- ASP.NET WebForms endpoint:
POST /NKSeeds/{Corn,Soy}ProductFinder.aspx/GetProducts returns
{"d": "<html>"} where the inner HTML is one <div class="sf-result">
per variety. BeautifulSoup tokenizes the whole blob.
- Per-card: product code (NK8005, NK008-P8XF), RM/MG from the
title <span>, "Brands Available" trait variants, marketing
positioning + bullet strengths, tech-sheet PDF URL.
- pdfplumber text extraction on the tech-sheet PDFs adds:
* corn disease ratings (Gray Leaf Spot, NCLB, Goss's Wilt,
Anthracnose, Tar Spot, Fusarium, etc.) where the PDF prints
"Label N" lines (text-extractable)
* soybean Phytophthora source genes (Rps1c, Rps3a, ...)
* soybean SCN race coverage
* soybean agronomic ratings (Emergence, Standability, Shatter
Tolerance, Green Stem) with text-extractable 1-9 values
* soybean soil-type adaptation (Best/Good/Fair/Poor) for drought
prone / high pH / poorly drained / etc.
- Agronomic rating BARS for corn (Emergence, Stalk Strength,
Drought) are not text-extractable; we record the labels with an
explicit "rated in PDF chart, see tech sheet" value so the agent
can direct the farmer at the source for those numbers.
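A sketch of unwrapping the `{"d": "<html>"}` envelope from the WebForms endpoint bullet above and counting the result cards. It swaps in the standard library's html.parser for BeautifulSoup to stay dependency-free; `count_result_cards` is a hypothetical name.

```python
import json
from html.parser import HTMLParser


class _CardCounter(HTMLParser):
    """Counts <div> elements whose class list includes sf-result."""

    def __init__(self) -> None:
        super().__init__()
        self.cards = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "sf-result" in classes:
            self.cards += 1


def count_result_cards(response_body: str) -> int:
    """Decode the JSON envelope, then count variety cards in the inner HTML."""
    inner_html = json.loads(response_body)["d"]
    parser = _CardCounter()
    parser.feed(inner_html)
    parser.close()
    return parser.cards
```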
Scale-direction correction in lessons.md:
- NK and AgriPro both use 1 = best, lower = more resistant — the
REVERSED convention vs Bayer / Golden Harvest. NK's tech-sheet
footer literally prints "1-9 Scale: 1 = Best, 9 = Worst".
AgriPro positioning on stripe-rust-resistant varieties (AP Iliad
with Stripe Rust 1, Eyespot 2) confirms the same direction.
- sources-not-yet-indexed section trimmed to just Beck's PFR +
Beck's products — everything else IS now in the corpus.
Cross-vendor coverage after this PR: 760 varieties.
  bayer_seeds     475  (DEKALB 288 / Asgrow 102 / WestBred 85)
  golden_harvest  139
  nk              122  (41 corn / 81 soy)
  agripro          24  (12 HRS / 7 SWW / 3 HRW / 1 HWS / 1 Barley)
Vendors: Bayer, Syngenta. Brands: 6. Crops: corn, soy, wheat (109
wheat now, up from 85).
requirements.txt: pdfplumber>=0.11 for NK tech-sheet parsing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deploy/docker-compose.yml — replace <product>/<registry> placeholders
with concrete values for Drawbar's stack:
- image: git.jpaul.io/justin/seed-mcp:latest (CF tunnel for pulls; CI
pushes via LAN 192.168.0.2:1234 to avoid 100 MB body cap)
- container_name: seed-mcp
- port 8001:8000 (8001 host-side to not collide with crop-chem-docs
on 8000)
- PRODUCT_NAME=crop_seed, hybrid search enabled, stateless HTTP
- llama-rerank shared with crop-chem-docs (NOT redefined here —
expected to already be in Drawbar's parent compose network)
- networks.drawbar-mcp external: true so seed-mcp joins the existing
cross-MCP shared network
.gitignore — corpus/ is now COMMITTED, not ignored. The monthly
refresh workflow scrapes and commits corpus changes; the image-only
workflow rebuilds indexes from the committed corpus. Allowing the
corpus to flow through git means the :corpus-YYYY.MM.DD image tag
pins to a specific seed-catalog snapshot. chroma/ and bm25/ remain
ignored — those are deterministically derived from corpus.
Initial committed snapshot: 614 varieties.
- bayer_seeds: 475 (DEKALB 288 + Asgrow 102 + WestBred 85)
- golden_harvest: 139 (Syngenta corn + soy; 36 sitemap URLs
302-redirected = discontinued)
rag/chunk.py: normalize brand to uppercase and crop to lowercase in
Chroma metadata so cross-vendor brand-filter lookups don't break on
inconsistent casing (Bayer stores "DEKALB", Golden Harvest stores
"Golden Harvest"; before the fix, _build_where uppercased the
user-supplied brand, which matched the former but not the latter).
Sidecar JSON keeps the original casing for display.
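A sketch of the casing normalization, assuming a hypothetical helper `normalize_facets` applied to the metadata dict just before indexing.

```python
def normalize_facets(meta: dict) -> dict:
    """Uppercase brand and lowercase crop; leave everything else alone."""
    out = dict(meta)  # the sidecar keeps its original casing
    if isinstance(out.get("brand"), str):
        out["brand"] = out["brand"].strip().upper()
    if isinstance(out.get("crop"), str):
        out["crop"] = out["crop"].strip().lower()
    return out
```

A query-side filter that uppercases the user's brand then matches both "DEKALB" and "GOLDEN HARVEST".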
Stub scrapers (nk, agripro, becks_pfr, becks_products) — change
return code from 2 to 0 so the monthly-refresh CI workflow doesn't
fail on deferred sources. Real implementations will return 0 on
success / 1 on failure when they ship.
Smoke-tested cross-vendor retrieval against the 614-chunk index:
- list_versions shows both vendors with correct facet counts
- broad "corn hybrid 100 RM" query returns both DEKALB and Golden
Harvest hits in top 5
- brand='Golden Harvest' filter returns 3 GH-only varieties
- variety-code prefilter still works (E085Z5 → top hit on GH)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the fifth MCP tool — crop_seed_api_lessons(topic?) — backed by
docs_mcp/lessons.md, the ONLY source of opinionated content in the
server. Everything else (search_docs, get_page, lookup_variety)
returns verbatim from vendor catalogs; lessons.md fills the gaps
the corpus can't cover.
The Pioneer fallback is the critical anti-hallucination piece:
Pioneer's ToS bans automation, so the corpus has no Pioneer data.
Without this tool, an agent might surface Bayer/Asgrow chunks as
mediocre matches for a Pioneer query. The tool's docstring tells
the agent to call it on any Pioneer / P-series question; the
'pioneer' section says clearly:
"I don't have Pioneer's variety data indexed... please consult
Pioneer or an extension service."
"Do NOT invent Pioneer hybrid ratings."
Other lesson sections cover knowledge the agent needs to interpret
search_docs / get_page output correctly:
- rating-scales: Bayer 1-9, Golden Harvest 9-to-1, what
R/MR/S/Rps1c/R3 mean in soybean disease columns
- maturity-semantics: corn RM days vs soybean MG vs wheat class +
qualitative early/medium/late
- trait-glossary: SSRIB, VT2PRIB, XF, E3, Conkesta, Clearfield, etc.
- scn-resistance: race coverage + Peking vs PI 88788 source
- regional-listings: how to interpret Bayer's "local profiles"
- sources-not-yet-indexed: which vendors aren't in the corpus yet
- checking-your-work: always call lookup_variety before quoting
Lesson lookup prefers slug-match (returns just `rating-scales` for
topic="rating", not every section that mentions ratings); falls
back to body-match only when no slug matches.
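A sketch of the slug-first lookup, assuming the lessons are held as a slug-to-body dict; `find_lessons` is a hypothetical name and substring matching is an assumption about what "slug-match" means.

```python
from typing import Optional


def find_lessons(sections: dict, topic: Optional[str]) -> list:
    """Return matching slugs: slug matches first, body matches as fallback."""
    if not topic:
        return list(sections)  # no topic means the full index
    needle = topic.strip().lower()
    slug_hits = [slug for slug in sections if needle in slug.lower()]
    if slug_hits:
        return slug_hits
    return [slug for slug, body in sections.items() if needle in body.lower()]
```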
Smoke-tested with topic=pioneer, topic=rating, topic=trait,
topic=zzzzzz (no match), and topic=None (full index = 10K chars,
8 sections).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 — Chunking and indexing
- rag/chunk.py: replace template chunker with seed-variety-specific
chunks_from_variety(). One chunk per variety (varieties are small
and named-rating retrieval signal is best kept together). Output
is rebuilt deterministically from the sidecar JSON: every value is
verbatim from the source, only framing language ("Disease ratings
(1-9, 9=best):") is template glue. Anti-hallucination contract:
same sidecar in → same chunk out, never a fabricated rating.
Metadata flattened to Chroma-safe primitives (str/int/float/bool):
source, source_key, vendor, brand, crop, product_name,
product_id, source_url, rm (corn), mg (soy), wheat_class,
release_year, trait_codes_csv, rating_scale.
- rag/index.py: walks corpus/<source>/<source_key>.json sidecars
via the new chunker. Default PRODUCT_NAME=crop_seed so the
Chroma collection is crop_seed_docs.
- rag/bm25.py: filterable columns updated to seed-domain facets
(source/vendor/brand/crop/source_key) instead of the template's
version/platform/product.
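A sketch of flattening a sidecar into Chroma-safe primitives. The sidecar key `trait_codes` and the helper name are assumptions; the metadata field names come from the list above.

```python
SCALAR_KEYS = (
    "source", "source_key", "vendor", "brand", "crop", "product_name",
    "product_id", "source_url", "rm", "mg", "wheat_class",
    "release_year", "rating_scale",
)


def flatten_metadata(sidecar: dict) -> dict:
    """Keep only str/int/float/bool values; join trait codes into a CSV."""
    meta = {
        key: sidecar[key]
        for key in SCALAR_KEYS
        if isinstance(sidecar.get(key), (str, int, float, bool))
    }
    codes = sidecar.get("trait_codes") or []  # hypothetical sidecar key
    if codes:
        meta["trait_codes_csv"] = ",".join(str(code) for code in codes)
    return meta
```

Missing or null fields (for example `mg` on a corn hybrid) are dropped rather than stored as None.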
Phase 3 — MCP server tools wired up
- search_docs: hybrid dense (Chroma) + BM25 (FTS5) retrieval with
RRF fusion. Optional filters: crop, brand, vendor, source.
Variety-code prefilter pins exact source_key / product_name /
hybrid_prefix matches at the top — dense embeddings have no
semantic neighbor for tokens like "DKC62-08RIB" and RRF can let
noise float to #1 without this pin. Each response carries the
variety's source URL inline so the agent can cite.
- get_page(source, source_key): emits a structured ratings header
(verbatim from sidecar, table per characteristics group, vendor
positioning, regional listings) followed by the raw indexed body.
This is the canonical fact-check surface.
- list_versions(): facet discovery — distinct sources, vendors,
brands, crops across the corpus.
- lookup_variety(source_key, source?): returns the raw sidecar JSON
for one variety. The agent should call this BEFORE quoting any
specific rating value to a farmer — guaranteed verbatim.
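A sketch of reciprocal rank fusion with an exact-match pin, as described for search_docs. The constant k=60 is the value commonly used for RRF, not one the commit states, and `rrf_fuse` / `pin_exact` are hypothetical names.

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def pin_exact(fused: list, exact_ids: list) -> list:
    """Put exact variety-code matches first, then the fused order."""
    head = list(dict.fromkeys(exact_ids))  # de-duplicate, keep order
    pinned = set(head)
    return head + [doc_id for doc_id in fused if doc_id not in pinned]
```

The pin runs after fusion, so an exact code match wins even when the dense ranking never surfaced it.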
Smoke tests against 475 indexed Bayer varieties:
- list_versions returns 475 varieties, 1 source, 1 vendor, 3 brands,
3 crops with correct per-brand counts (288/102/85).
- Semantic ag queries find the right candidates: short-season
drought-tolerant corn → DKC44-97RIB at RM 94 (in 90-95 band);
SCN+MG3 soybean → Asgrow XF varieties with explicit SCN R3 ratings;
Phytophthora Rps3a soy → AG07XF4 (right gene); stripe-rust
wheat → WestBred WB1376CLP (Yellow Rust 2 = best).
- Variety-code lookups work via prefilter: DKC62-08RIB, AG29XF4,
WB6430 all return as #1 hit. BM25 confirms ranking unambiguously
(top-1 score -13.2 vs -8.5 for #2 on "DKC62-08RIB ratings").
- Server boots cleanly in stdio AND streamable-http modes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace stub with working scraper for all three Bayer seed brands.
Discovery uses the public sitemap-dynamic.xml (475 varieties:
288 DEKALB corn + 102 Asgrow soy + 85 WestBred wheat — matches recon).
Per-variety detail comes from the page's __NEXT_DATA__ JSON island.
Each variety writes corpus/bayer_seeds/<source_key>.{md,json} with:
- Identity (brand, crop, hybridLabel, productId, releaseYear)
- Maturity routed per crop (RM for corn, MG for soy, qualitative for wheat)
- Trait stack (code + full name)
- Positioning + strengths narrative
- Characteristics groups (DISEASE RATINGS, GROWTH, MANAGEMENT, HARVEST,
etc.) preserved verbatim from source so the chunker can re-bucket
into canonical disease/agronomic flats per CLAUDE.md schema
- Regional seed-guide listings with agronomist contacts
- _scale_direction tag (Bayer = "1-9 (9 = best)") for chunker
Smoke-tested all three brands (--limit 2 each, plus --product, --force,
and scrape.runner dispatch). Politeness: 1 req/sec, retries on 429/5xx
with Retry-After honored.
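A sketch of the retry decision only, with no network code. `retry_delay` is a hypothetical helper, and the exponential backoff used when Retry-After is absent is an assumption, as are the base and cap values.

```python
from typing import Optional


def retry_delay(status: int, retry_after: Optional[str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retrying, or None if not retryable."""
    if status != 429 and not 500 <= status < 600:
        return None
    if retry_after is not None:
        try:
            # Retry-After in delta-seconds form, clamped to [0, cap].
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # the HTTP-date form is not handled in this sketch
    return min(cap, base * 2 ** attempt)
```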
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>