Co-authored-by: claude <claude@jpaul.io> Co-committed-by: claude <claude@jpaul.io>
# crop_seed API lessons

Curated knowledge that **does not live in the scraped corpus** but
that an agent needs to interpret `search_docs` / `get_page` results
correctly. This file is the source for `crop_seed_api_lessons(topic)`.

Each section starts with a `## <slug>` heading; the tool returns
sections whose slug matches the topic (substring match) or all
sections when `topic` is None.

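The slug-matching rule above can be sketched as follows. This is a minimal illustration, not the server's implementation: the helper name `select_sections`, the case-insensitive comparison, and dropping any text before the first `## ` heading are all assumptions.

```python
import re
from typing import Optional

# Hypothetical helper: split a lessons file on "## <slug>" headings and
# return the sections whose slug contains `topic` (substring match).
SECTION_RE = re.compile(r"^## +(\S.*?)[ \t]*$", re.MULTILINE)


def select_sections(markdown: str, topic: Optional[str] = None) -> dict[str, str]:
    matches = list(SECTION_RE.finditer(markdown))
    sections: dict[str, str] = {}
    for i, m in enumerate(matches):
        # A section runs from its heading to the next heading (or end of file).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections[m.group(1)] = markdown[m.start():end].strip()
    if topic is None:
        return sections  # no topic: every section
    needle = topic.lower()  # case-insensitive matching is an assumption
    return {slug: body for slug, body in sections.items() if needle in slug.lower()}


sample = "# title\n\nintro\n\n## pioneer\n\ntext A\n\n---\n\n## rating-scales\n\ntext B\n"
print(list(select_sections(sample, "rating")))
print(list(select_sections(sample)))
```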
This file is the *only* place the MCP injects opinionated content.
Everything else returned by the server comes verbatim from a vendor
catalog. Lessons here should be factual, well-cited, and conservative
about what the corpus actually contains.

---

## pioneer

**Pioneer (Corteva) is NOT in the seed-mcp corpus.** The MCP intentionally
does not scrape pioneer.com. Per Pioneer's published Terms of Service:

> "you shall not use any manual or automated software, devices or
> other processes (including but not limited to spiders, robots,
> scrapers, crawlers, avatars, data mining tools or the like) to
> 'scrape' or download data from the Services"

Pioneer additionally runs Imperva WAF on their catalog API and
login-walls dealer lookup, so a polite scrape isn't technically
feasible either.

**Agent behavior when a farmer asks about Pioneer / P-series hybrids:**

1. Tell the farmer plainly: "I don't have Pioneer's variety data
   indexed — Pioneer doesn't allow automated indexing of their
   catalog."
2. Recommend pioneer.com → *Find a Sales Rep* (or call 1-800-247-6803)
   for hybrid specs and local recommendations.
3. If the farmer wants an independent third-party rating for a
   specific Pioneer hybrid, mention that **university extension
   performance trials** (Iowa State, Illinois Crop Sciences, Purdue,
   Nebraska, Ohio State) publish multi-location yield data on
   Pioneer hybrids alongside competitors — useful if they want
   apples-to-apples data without going through Pioneer's own
   marketing.
4. **Do NOT invent Pioneer hybrid ratings.** If asked "what's the
   disease tolerance of P1142AM?", the only correct answer is
   "I don't have that data — please consult Pioneer or an
   extension service."

This is the canonical anti-hallucination policy for the seed-mcp.
There is no Pioneer data; there is no inference. Direct the farmer
to a primary source.

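As an illustration of the policy above, an agent wrapper could short-circuit before any tool call. This is a heuristic sketch only: the function name, the regex for P-series codes (a `P` followed by four digits and an optional letter suffix, modelled on the `P1142AM` example), and the wording of the reply are assumptions, not part of the MCP.

```python
import re
from typing import Optional

# Assumed heuristic: the word "Pioneer", or a code shaped like P1142AM.
PIONEER_RE = re.compile(r"\bpioneer\b|\bP\d{4}[A-Z]{0,4}\b", re.IGNORECASE)

REFERRAL = (
    "I don't have Pioneer's variety data indexed. Pioneer doesn't allow "
    "automated indexing of their catalog. Please use pioneer.com "
    "(Find a Sales Rep) or an extension service for hybrid specs."
)


def pioneer_referral(question: str) -> Optional[str]:
    """Return the referral text if the question is about Pioneer, else None."""
    return REFERRAL if PIONEER_RE.search(question) else None


# Never fall through to a generated rating when this returns text.
print(pioneer_referral("What's the disease tolerance of P1142AM?"))
print(pioneer_referral("Compare two DEKALB hybrids"))
```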
---

## rating-scales

Different vendors publish ratings using different conventions. The
chunker normalizes the *labels* in the chunk preamble but always
preserves the source's `_scale_direction` field in the sidecar.

**Bayer (DEKALB / Asgrow / WestBred)**: `1-9 (9 = best)`. A
GRAY LEAF SPOT rating of 8 means EXCELLENT tolerance. A rating of 2
means SUSCEPTIBLE.

**Syngenta Golden Harvest**: `9-to-1 (9 = best, 1 = worst)` —
this is the *direction* Golden Harvest publishes, but the *meaning*
of high numbers is the same: high = best. Where the chunker says
"normalize" for Golden Harvest, that just means we've already
re-stated it as `1-9 (9 = best)` in the chunk preamble; the source's
`_scale_direction` field still says `9-to-1` so you can detect the
provenance.

**Syngenta NK and AgriPro**: `1-9 (1 = best, lower = more resistant)`.
**REVERSED from Bayer and Golden Harvest.** NK's
tech-sheet PDFs literally print *"1-9 Scale: 1 = Best, 9 = Worst"*
in the footer; AgriPro's positioning on stripe-rust-resistant
varieties (e.g. AP Iliad with Stripe Rust 1, Eyespot 2) confirms
the same direction. On NK, this applies both to disease tolerance
AND to numeric agronomic ratings (Emergence, Standability, Shatter
Tolerance, Green Stem — all 1 = best). Cross-vendor comparisons
MUST consult the `_scale_direction` field in each side's sidecar
before drawing conclusions.

(Agronomic ratings on AgriPro are qualitative —
"Excellent / Very Good / Good / Fair" — and have no direction
issue. NK's soybean tech sheets ALSO publish soil-type adaptation
as Best/Good/Fair/Poor labels which are qualitative.)

**Beck's**: ratings live behind SeedIQ login; only identity-level
data is publicly available, so most disease/agronomic ratings are
absent from Beck's records in this corpus.

**ProHarvest Seeds**: **mixed scales** on one record. *Disease
Tolerance* is `1-9 numeric, 9 = best / most tolerant` (same direction
as Bayer — no flip; `NA` = not rated). *General Characteristics* and
*Agronomic Features* are qualitative (`Excellent / Very Good / Good / Average`)
with a few raw numerics (GDD pollination/black-layer, kernel
rows). *Soil Adaptability* uses `HR` (highly recommended) / `R`
(recommended). The single `_scale_direction` line on the record states
all three. ProHarvest is an Ebbert's-style independent brand, but its
ratings ARE parsed into structured groups, so they're retrievable.

**Latham Hi-Tech Seeds**: numeric ratings `~1-9 where LOWER = BETTER`
(1 = best / most tolerant / most resistant) — **REVERSED from Bayer,
same direction as NK / AgriPro**. There's no on-page legend; the
direction was derived empirically (top-rated stalks/roots cluster at
1.0–1.5, weak traits at 3.0–3.5). Categorical values pass through
verbatim: SCN source (`PI 88788`), Phytophthora gene (`Rps 1k`),
Anthracnose (`ASR`). `NA`/blank = not rated.

**Stine Seed**: **corn** is `1-9 numeric, 9 = Excellent / best`
(HIGHER = better, same as Bayer — read from the on-page legend:
9 Excellent … 5 Below Average). **Soybeans are QUALITATIVE** (vigor
Excellent/Very Good/Good; disease Resistant/Strong/Good/Susceptible
where Resistant/Strong = best), with SCN source + RPS gene passed
through, not a number. So a Stine corn "8" is strong but a Latham
"8" is weak — never compare the raw numbers across these two.

**Burrus Seed**: numeric ratings `1-10, 10 = best / most tolerant`
(HIGHER = better; observed range 4–10). Herbicide tolerances and Bt
insect-protection are `Yes/No` (verbatim). `NR`/blank/`0`/`-` = not
rated. Covers brands Burrus / Power Plus / DONMARIO.

**1st Choice Seeds**: `0-10, HIGHER = better` (0-4 Below Average,
5 Average, 6 Good, 7 Very Good, 8 Excellent, 9-10 Superior). Many
older corn hybrids publish only partial ratings (source gap); wheat
is identity-only.

**RobSeeCo** (Rob-See-Co + Innotech): `1-9, 9 = Best` (HIGHER =
better, same direction as Bayer / Stine-corn); `-` = not available.
Plant Height 9=Tall, Ear Height 9=High; Planting Rate L/ML/M/MH/H;
**Product Fit Geography A=All, C=Central, E=East, W=West, CW=Central+West**
(a placement code, not a rating). Soy disease uses letter codes
(R/MR/S) + an SCN source (e.g. Peking) + Rps gene. Sourced from the
seed-guide PDF, so it's identity + structured ratings but no live web
page per variety.

**⚠️ Direction is NOT consistent across the independents.** HIGHER =
better: Bayer, Golden Harvest, Stine(corn), ProHarvest(disease),
Burrus(1-10), 1st Choice(0-10), **RobSeeCo(1-9)**. LOWER = better
(1 = best): NK, AgriPro, **Latham**. Qualitative (no direction):
Stine(soy), ProHarvest(general/agronomic), AgriPro(agronomic),
Ebbert's. A raw numeric rating is meaningless without its
`_scale_direction`.

**Always check the chunk's "Rating scale" line or call
`lookup_variety(source_key)` and look at `_scale_direction` if you
are unsure.** Cross-vendor comparisons are valid AFTER you've
confirmed each side uses the same direction.

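To make the direction problem concrete, the sketch below maps a raw rating to a "fraction of best" on its own scale, so a Stine corn 8 and a Latham 8 land at opposite ends. The `Scale` dataclass and the `goodness` helper are illustrative assumptions: the real `_scale_direction` field is free text that has to be read per record, and the result is only a direction check, not proof that two vendors' scales are agronomically equivalent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    low: float             # numeric lower bound of the published scale
    high: float            # numeric upper bound of the published scale
    higher_is_better: bool


# Directions as described in this section; the Scale structure itself is
# an illustrative assumption, not the sidecar's `_scale_direction` format.
STINE_CORN = Scale(1, 9, higher_is_better=True)
LATHAM = Scale(1, 9, higher_is_better=False)
BURRUS = Scale(1, 10, higher_is_better=True)


def goodness(value: float, scale: Scale) -> float:
    """Map a raw rating to 0.0 (worst) .. 1.0 (best) on its own scale."""
    if not scale.low <= value <= scale.high:
        raise ValueError(f"{value} is outside {scale.low}-{scale.high}")
    fraction = (value - scale.low) / (scale.high - scale.low)
    return fraction if scale.higher_is_better else 1.0 - fraction


print(goodness(8, STINE_CORN))  # 0.875 (strong)
print(goodness(8, LATHAM))      # 0.125 (weak)
```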
**Non-numeric values** appear for some characteristics and should be
read literally:
- `R`, `MR`, `S` for soybean disease resistance = Resistant / Moderately
  Resistant / Susceptible (not 1-9).
- `Rps1c`, `Rps3a`, `Rps1k`, etc. = specific Phytophthora resistance
  gene present.
- `R1`, `R3` (under SOYBEAN CYST NEMATODE) = effective against
  SCN race 1 / race 3.
- `A`, `B`, `C` under HERBICIDE sensitivity = grade letters where A
  is most tolerant.

---

## maturity-semantics

Maturity is encoded differently per crop. Don't conflate the units.

**Corn — Relative maturity (RM days)**: integer roughly 75-120.
Lower = shorter season, suitable for higher latitudes / shorter
growing windows. 110 RM is a Central Iowa default; 85 RM suits
northern Minnesota or short-season silage; 115+ RM fits southern
Indiana / southern Illinois / Missouri Delta. The number is
**Pioneer-style RM days**, normalized across the industry.

**Soybeans — Maturity group (MG)**: float from 00 (zero-zero) to 9.0,
expressed with one decimal. A "3.5 MG" soybean is for central
Iowa. Northern North Dakota / Minnesota plant 0.0–1.5 MG. Mid-South
plants 5.0+. Each full MG step ≈ 7-10 days of additional season
(roughly one day per tenth).
Sidecar field: `maturity_group` (e.g. "3.5", "0.7").

**Wheat — Class + heading**: Winter / spring decision is separate
from "class" (HRW / HRS / SRW / SWW / SWS / durum):
- HRW = Hard Red Winter — Plains states bread wheat
- HRS = Hard Red Spring — Northern Plains, North Dakota, Montana
- SRW = Soft Red Winter — Eastern Corn Belt, Ohio Valley
- SWW = Soft White Winter — Pacific Northwest
- SWS = Soft White Spring — Pacific Northwest
- Durum — North Dakota / Montana, pasta wheat

Maturity is qualitative: Early / Medium-Early / Medium / Medium-Late / Late.

**WestBred's product page JSON does not always expose the wheat class
as a structured field** — sometimes it's only in the marketing
narrative (e.g. "WB1376CLP is a Soft White Winter Clearfield® Plus
Wheat variety"). Read `positioning_statement` carefully when the
sidecar's `wheat_class` is null.

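The `wheat_class` fallback described above might look like the following. It is a sketch under assumptions: the sidecar is treated as a plain dict with `wheat_class` and `positioning_statement` at the top level, and the helper name and the returned code strings are hypothetical.

```python
import re
from typing import Optional

# Class names and codes as listed above.
WHEAT_CLASSES = {
    "hard red winter": "HRW",
    "hard red spring": "HRS",
    "soft red winter": "SRW",
    "soft white winter": "SWW",
    "soft white spring": "SWS",
    "durum": "Durum",
}
CLASS_RE = re.compile("|".join(WHEAT_CLASSES), re.IGNORECASE)


def wheat_class(sidecar: dict) -> Optional[str]:
    """Prefer the structured field; fall back to the marketing narrative."""
    if sidecar.get("wheat_class"):
        return sidecar["wheat_class"]
    match = CLASS_RE.search(sidecar.get("positioning_statement") or "")
    return WHEAT_CLASSES[match.group(0).lower()] if match else None


record = {
    "wheat_class": None,
    "positioning_statement": (
        "WB1376CLP is a Soft White Winter Clearfield® Plus Wheat variety"
    ),
}
print(wheat_class(record))  # SWW
```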
---

## trait-glossary

Common trait codes that appear in `trait_stack`:

**Corn:**
- `SSRIB` — SmartStax® RIB Complete® corn blend (above + below-ground
  insect protection + Roundup Ready + LibertyLink, with refuge-in-bag)
- `VT2PRIB` — VT Double PRO® RIB Complete® (above-ground insect
  protection + Roundup Ready, refuge-in-bag)
- `VT4PRIB` — VT4 PRO® RIB Complete® (newer above-ground protection)
- `Trecepta` — Trecepta® (Trecepta + Roundup Ready + LibertyLink, for
  earworm + western bean cutworm pressure)
- `SmartStax PRO` — SmartStax® PRO® (RNAi corn rootworm)
- `PowerCore` — PowerCore® Refuge Advanced (older above-ground stack)
- `Conventional` — no biotech traits (organic / specialty channels)

**Soybeans:**
- `XF` — XtendFlex® (glyphosate + dicamba + glufosinate)
- `Xtend` — Roundup Ready 2 Xtend® (dicamba + glyphosate)
- `RR2Y` — Roundup Ready 2 Yield® (glyphosate only)
- `E3` — Enlist E3® (2,4-D + glyphosate + glufosinate)
- `LL/LL+GT27` — LibertyLink® / LibertyLink + GT27 (glufosinate +
  glyphosate + isoxaflutole)
- `Conkesta E3` — Bt-stack for caterpillar pressure (BR/AR markets)
- `SR` — SR® (sulfonylurea-tolerant, Asgrow-specific)

**Wheat:**
- `Clearfield` / `CLP` — Clearfield® / Clearfield® Plus (imazamox
  tolerance)
- `CoAXium` — CoAXium® (quizalofop tolerance) — note: AgriPro's
  catalog flag, NOT in the WestBred corpus.

Always render the full trait name (`trait_descriptions`) when telling
the farmer "this variety has X trait" — bare trait codes are
ambiguous in print.

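A small sketch of the "always render the full name" rule. Assumptions: `trait_stack` is taken to be a list of code strings, and the local `TRAIT_NAMES` dictionary (a subset copied from the glossary above) stands in for the sidecar's `trait_descriptions`, which should be the real source when present.

```python
# Full names copied from the glossary above (subset shown).
TRAIT_NAMES = {
    "SSRIB": "SmartStax® RIB Complete® corn blend",
    "VT2PRIB": "VT Double PRO® RIB Complete®",
    "XF": "XtendFlex®",
    "E3": "Enlist E3®",
    "CLP": "Clearfield® Plus",
}


def describe_traits(trait_stack: list[str]) -> str:
    """Render each code with its full name; flag codes we cannot expand."""
    parts = []
    for code in trait_stack:
        name = TRAIT_NAMES.get(code)
        parts.append(f"{name} ({code})" if name else f"{code} (name not in glossary)")
    return ", ".join(parts)


print(describe_traits(["VT2PRIB"]))
print(describe_traits(["XF", "ZZZ"]))  # "ZZZ" is a made-up code
```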
---

## scn-resistance

Soybean Cyst Nematode resistance ratings are critical for fields
with SCN pressure (most of the Corn Belt). Read carefully:

- `R3` under SOYBEAN CYST NEMATODE = Resistant to race 3 (the most
  common race nationally). Most "SCN-resistant" soybeans on the
  market are R3.
- `R1, R3` = Resistant to both race 1 AND race 3. Higher value;
  useful in long-rotation SCN fields where race shifts have occurred.
- `MR3` = Moderately Resistant to race 3. Some yield loss expected
  under high SCN pressure.
- `S` = Susceptible.
- Some Bayer Asgrow XF lines (e.g. AG29XF4) use **Peking-type SCN
  resistance**, which is genetically distinct from the more common
  PI 88788 source. Peking is more durable when SCN populations
  have eroded PI 88788 effectiveness. Look for "Peking type" in the
  positioning statement.

**Recommended workflow when a farmer asks about SCN:** call
`search_docs` with the user's MG range + "SCN-resistant", then
`lookup_variety` on the top 2-3 candidates to verify the exact race
coverage and resistance source.

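When verifying race coverage, the codes listed above can be read mechanically. The parser below is a sketch that assumes the value arrives as a comma-separated string such as `"R1, R3"`; the function name and return shape are hypothetical, and resistance sources like "Peking" or "PI 88788" are deliberately not handled because they come from other fields.

```python
import re
from typing import Optional

LEVELS = {"R": "Resistant", "MR": "Moderately Resistant", "S": "Susceptible"}
TOKEN_RE = re.compile(r"^(MR|R|S)(\d*)$")


def parse_scn(value: str) -> list[tuple[str, Optional[int]]]:
    """Parse "R3", "R1, R3", "MR3" or "S" into (level, race) pairs."""
    pairs = []
    for token in value.split(","):
        m = TOKEN_RE.match(token.strip().upper())
        if m is None:
            # Refuse to guess: unknown codes should be shown verbatim instead.
            raise ValueError(f"unrecognized SCN code: {token!r}")
        race = int(m.group(2)) if m.group(2) else None
        pairs.append((LEVELS[m.group(1)], race))
    return pairs


print(parse_scn("R1, R3"))
print(parse_scn("MR3"))
print(parse_scn("S"))
```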
---

## regional-listings

The `regional_recommendations` array in each sidecar is sourced from
Bayer's "local profiles" — varieties get assigned to regional Seed
Guide bundles (e.g. *"2026 Washington, Oregon, SEED GUIDE"*) with a
named regional agronomist contact. This is the closest signal we have
to *"is this variety recommended for the farmer's geography?"* but
note:

- A variety being absent from a regional listing **does not** mean
  it's unsuitable — Bayer's local agronomists curate these lists.
- Listings are vendor-side recommendations, not third-party trial
  data.
- When the farmer mentions a region, try filtering or scanning for
  varieties whose `regional_recommendations[].product_list_name`
  mentions that region.

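The region scan in the last bullet can be sketched as a plain substring filter. Only `regional_recommendations` and `product_list_name` come from this document; the function name, the `source_key` values, and the surrounding dict shape are illustrative assumptions. A miss only means "not on a curated list", never "unsuitable".

```python
def recommended_for(region: str, sidecars: list[dict]) -> list[dict]:
    """Keep sidecars with a regional list whose name mentions `region`."""
    needle = region.lower()
    hits = []
    for sidecar in sidecars:
        lists = sidecar.get("regional_recommendations") or []
        if any(needle in (entry.get("product_list_name") or "").lower()
               for entry in lists):
            hits.append(sidecar)
    return hits


# Made-up records for illustration.
varieties = [
    {"source_key": "example-a",
     "regional_recommendations": [
         {"product_list_name": "2026 Washington, Oregon, SEED GUIDE"}]},
    {"source_key": "example-b", "regional_recommendations": []},
]
print([v["source_key"] for v in recommended_for("Oregon", varieties)])
```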
Other vendors handle regional placement differently. Golden Harvest
publishes a separate "plot report" system per state/year/site;
NK publishes ratings as PDF tech sheets without regional flags.

---

## sources-not-yet-indexed

If `list_versions()` doesn't show a vendor in the `vendor` facet, the
corpus does not have it yet. Direct the farmer to that vendor's
public catalog or their seed dealer.

**Already indexed**: Bayer (DEKALB / Asgrow / WestBred), Syngenta
(Golden Harvest, NK, AgriPro).

**Not yet indexed**:

- **Beck's PFR (research)** — 2,089 head-to-head trial documents
  on the public Sanity GROQ API. Different shape from variety
  records — these are studies, not hybrids. Surfacing them would
  benefit a separate tool (e.g. `search_pfr_studies`) rather than
  share a corpus with variety identity.
- **Beck's products** — ~860 products. Identity-only (SeedIQ login
  gates the ratings).

---

## trial-data

The MCP exposes TWO complementary surfaces:

* **`search_docs`** — variety IDENTITY (what a hybrid IS):
  disease ratings, trait stack, maturity, vendor positioning.
* **`search_trials`** — variety PERFORMANCE (how it ACTUALLY did):
  ranked yield at specific cooperator fields and regions.

**Indexed trial sources**:

- **Golden Harvest plot reports** (~4,600 cross-vendor head-to-head
  trials, 2024+2025). Each trial = one cooperator's field at a
  specific state/year, comparing products from multiple brands
  (NK / DEKALB / Golden Harvest / Enogen / Pioneer / Channel, etc.)
  side by side. **This is the closest thing to independent
  comparison data the corpus has** — Bayer doesn't publish its own
  trial data, but GH publishes plots where DEKALB hybrids appear
  alongside their competitors.
- **AgriPro regional trial PDFs** (14 PDFs) — multi-year
  multi-location wheat performance for Northern Plains / Pacific
  Northwest / Plains regions. Variety + per-location yields
  preserved verbatim.
- **LG Seeds + AgriGold plot reports** (AgReliant) — additional
  cross-vendor corn/soy plots (same head-to-head structure as the
  GH reports).
- **ProHarvest Seeds plot reports** (corn + soy, 2024+2025) —
  per-cooperator harvest reports from an independent Corn Belt brand.
  Many are cross-vendor (ProHarvest / Apex vs Pioneer / DEKALB /
  Beck's / Merschman, etc.). Structured rank/yield/%H2O/test-weight
  tables where the PDF fits ProHarvest's template; foreign-format
  third-party reports are kept verbatim (`raw_text`) so the yields
  are still searchable. Image-only PDFs (no text layer) are skipped.
- **University-extension variety trials** (`illinois_vt_trials`,
  `iowa_icpt_trials`, `ohio_ocpt_trials`, 2024+2025) — **the
  independent third-party gold standard.** Land-grant programs (U of
  Illinois VT, Iowa State ICPT, Ohio OCPT) that test every *entered*
  brand side-by-side at the same sites with replication + LSD stats.
  The publisher is the university; the seed brands are in each row's
  `brand`. **This is where Pioneer / DEKALB / Channel / Brevant
  performance is legitimately available** (they enter these public
  trials even though we can't scrape their own sites). Caveat: a brand
  only appears where it *entered* — e.g. Brevant didn't enter Iowa
  ICPT, DEKALB/Channel didn't enter Illinois VT; absence in one
  program is a true negative, not missing data. Illinois adds wheat;
  Iowa/Ohio are corn+soy. (Purdue PCPP + other states deferred.)

**Recommended workflow when a farmer asks about performance**:

1. Call `search_trials(crop, state, year, ...)` to find trials
   from the relevant region/season.
2. Identify the top performers in the rankings.
3. Call `lookup_variety(source_key=...)` for each leading hybrid to
   verify identity (RM, traits, disease ratings) — confirm the
   variety actually fits the farmer's situation, not just that it
   won someone else's trial.
4. If the leading variety is from a brand whose trial data isn't
   directly published (e.g. DEKALB), the GH plot reports often
   show it competing — that's still the agent's best public
   third-party signal.

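Steps 1-3 of the workflow can be sketched with the two tools passed in as callables. Everything about the data shapes here is an assumption made for illustration: that `search_trials` returns a list of trials each holding `rows`, that rows carry `rank` and `source_key`, and that the stubs below resemble real responses. Only the tool names and the `source_key=` keyword come from this document.

```python
from typing import Callable


def shortlist(
    search_trials: Callable[..., list[dict]],
    lookup_variety: Callable[..., dict],
    crop: str,
    state: str,
    year: int,
    top_n: int = 3,
) -> list[dict]:
    """Find trials, take the best-ranked rows, then verify each identity."""
    rows = [row
            for trial in search_trials(crop=crop, state=state, year=year)
            for row in trial["rows"]]
    leaders = sorted(rows, key=lambda r: r["rank"])[:top_n]
    return [
        {"trial_row": row,
         "identity": lookup_variety(source_key=row["source_key"])}
        for row in leaders
    ]


# Stubs standing in for the real MCP tools (hypothetical response shapes).
def fake_search_trials(**_):
    return [{"rows": [{"rank": 2, "source_key": "b"},
                      {"rank": 1, "source_key": "a"}]}]


def fake_lookup_variety(source_key):
    return {"source_key": source_key, "relative_maturity": 110}


for item in shortlist(fake_search_trials, fake_lookup_variety,
                      "corn", "IA", 2025, top_n=1):
    print(item["identity"])
```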
**Trial data NOT in the corpus** (don't fabricate):

- **DEKALB / Asgrow / Channel** per-variety yield trials —
  Bayer keeps these in rep tools, not on the public catalog. The
  GH plot reports surface DEKALB/Asgrow performance indirectly,
  but per-variety dedicated trials aren't indexed.
- **NK yield results** — the data exists at
  `syngenta-us.com/nk/yield-results` but the ASMX endpoint is
  fiddly; not yet scraped. The variety identity is in the corpus
  (`search_docs` finds it), just not the per-region trial yields.
- **Pioneer trials** — ToS bans automation, so we have no Pioneer
  *identity* data and don't scrape Pioneer's own results. BUT
  Pioneer *performance* IS now available indirectly via the
  university-extension trials (and the GH/ProHarvest plots) where
  Pioneer entered — search those for Pioneer head-to-head yields;
  for Pioneer variety specs, direct the farmer to a dealer.
- **University extension trials outside IL / IA / OH** — Illinois,
  Iowa, and Ohio are indexed (`illinois_vt_trials` /
  `iowa_icpt_trials` / `ohio_ocpt_trials`, 2024+2025). Purdue PCPP
  and other states (NE / WI / MN / the Dakotas / Kansas wheat) are
  not yet indexed; they are a future enrichment.

**Reading a GH plot report**:

Each plot has a cooperator name (the farmer running the trial), a
state, a year, planting/harvest dates, population, row width, and a
ranked table of products. The columns vary by crop:

- **Corn / Soy**: `Rank | Brand | Product | Traits | Yield BU/Ac | %MST | Test Weight | Gross Revenue`
- **Silage**: `Rank | Brand | Product | Traits | Ton/Acre | Milk Per Acre | Milk Per Ton | Beef Per Acre | Beef Per Ton`

Rank 1 = top performer at that site/year. Note that a single plot
is one data point — for a robust recommendation, look across
multiple plots from the same region.

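One way to "look across multiple plots" is to average each product's yield over the plots it appears in and ignore products seen only once. This is a sketch under assumptions: the `rows`, `product`, and `yield_bu_ac` keys, the `min_plots` threshold, and the hybrid names are all made up, and a simple mean ignores site differences that a real comparison should account for.

```python
from collections import defaultdict
from statistics import mean


def average_yield(plots: list[dict], min_plots: int = 2) -> list[tuple[str, float, int]]:
    """Average bu/ac per product across plots, best first.

    Products seen in fewer than `min_plots` plots are dropped, since a
    single plot is one data point.
    """
    yields: dict[str, list[float]] = defaultdict(list)
    for plot in plots:
        for row in plot["rows"]:
            yields[row["product"]].append(row["yield_bu_ac"])
    summary = [
        (product, round(mean(values), 1), len(values))
        for product, values in yields.items()
        if len(values) >= min_plots
    ]
    return sorted(summary, key=lambda item: item[1], reverse=True)


plots = [
    {"rows": [{"product": "Hybrid A", "yield_bu_ac": 240.0},
              {"product": "Hybrid B", "yield_bu_ac": 232.0}]},
    {"rows": [{"product": "Hybrid A", "yield_bu_ac": 228.0},
              {"product": "Hybrid B", "yield_bu_ac": 230.0},
              {"product": "Hybrid C", "yield_bu_ac": 250.0}]},
]
print(average_yield(plots))
# [('Hybrid A', 234.0, 2), ('Hybrid B', 231.0, 2)]
```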
---

## checking-your-work

Before quoting a specific number to a farmer, **always** call
`lookup_variety(source_key=...)` to confirm. The chunk text inside a
`search_docs` response is a faithful render of the sidecar, but the
sidecar IS the source of truth. Quoting from the canonical sidecar
makes you robust against:

- Chunk-text formatting bugs (e.g. a rare unicode issue trimming a
  value).
- Future chunker changes (a re-index might rewrite the body).
- Cross-vendor scale-direction differences (the sidecar's
  `_scale_direction` lets you state the convention explicitly).

If `lookup_variety` returns "not found" but `search_docs` surfaced the
chunk, that's a bug — please report it. (In normal operation, every
chunk's `source_key` round-trips to a valid sidecar.)