Merge pull request 'Phase 11: crop_seed_api_lessons tool + Pioneer fallback' (#3) from api-lessons-pioneer-fallback into main
Image rebuild (skip scrape) / build (push) Failing after 7s
This commit was merged in pull request #3.
@@ -0,0 +1,259 @@
# crop_seed API lessons

Curated knowledge that **does not live in the scraped corpus** but
that an agent needs to interpret search_docs / get_page results
correctly. This file is the source for `crop_seed_api_lessons(topic)`.

Each section starts with a `## <slug>` heading; the tool returns
sections whose slug contains the topic (case-insensitive substring
match), falls back to sections whose body contains it when no slug
matches, and returns all sections when `topic` is None.

This file is the *only* place the MCP injects opinionated content.
Everything else returned by the server comes verbatim from a vendor
catalog. Lessons here should be factual, well-cited, and conservative
about what the corpus actually contains.
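
As a rough illustration of the lookup rule above, here is a standalone sketch (not the server code). The names `SAMPLE`, `split_sections`, and `find_sections` are hypothetical and exist only for this example; the sample text is a made-up miniature of this file.

```python
import re
from typing import Optional

# Hypothetical miniature of a lessons file, kept in escaped string literals.
SAMPLE = (
    "Intro text.\n"
    "## pioneer\n"
    "Not in the corpus.\n\n---\n\n"
    "## rating-scales\n"
    "Ratings run 1-9; see the pioneer note.\n"
)


def split_sections(text: str) -> list:
    """Split text on heading lines into (slug, body) pairs."""
    # re.split with one capture group yields
    # [preamble, slug1, body1, slug2, body2, ...].
    parts = re.split(r"(?m)^## (.+)$", text)
    sections = []
    for i in range(1, len(parts), 2):
        # Drop the horizontal rule that separates sections.
        body = re.sub(r"\n---\s*$", "", parts[i + 1]).strip()
        sections.append((parts[i].strip(), body))
    return sections


def find_sections(text: str, topic: Optional[str]) -> list:
    """Slug matches win; body matches are used only when no slug matches."""
    sections = split_sections(text)
    if not topic:
        return sections
    needle = topic.strip().lower()
    by_slug = [s for s in sections if needle in s[0].lower()]
    if by_slug:
        return by_slug
    return [s for s in sections if needle in s[1].lower()]
```

In this sketch, `find_sections(SAMPLE, "pioneer")` returns only the `pioneer` section even though the other section's body mentions the word, because slug matches take priority.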

---

## pioneer

**Pioneer (Corteva) is NOT in the seed-mcp corpus.** The MCP intentionally
does not scrape pioneer.com. Per Pioneer's published Terms of Service:

> "you shall not use any manual or automated software, devices or
> other processes (including but not limited to spiders, robots,
> scrapers, crawlers, avatars, data mining tools or the like) to
> 'scrape' or download data from the Services"

Pioneer additionally runs Imperva WAF on their catalog API and
login-walls dealer lookup, so a polite scrape isn't technically
feasible either.

**Agent behavior when a farmer asks about Pioneer / P-series hybrids:**

1. Tell the farmer plainly: "I don't have Pioneer's variety data
   indexed — Pioneer doesn't allow automated indexing of their
   catalog."
2. Recommend pioneer.com → *Find a Sales Rep* (or call 1-800-247-6803)
   for hybrid specs and local recommendations.
3. If the farmer wants an independent third-party rating for a
   specific Pioneer hybrid, mention that **university extension
   performance trials** (Iowa State, Illinois Crop Sciences, Purdue,
   Nebraska, Ohio State) publish multi-location yield data on
   Pioneer hybrids alongside competitors — useful if they want
   apples-to-apples data without going through Pioneer's own
   marketing.
4. **Do NOT invent Pioneer hybrid ratings.** If asked "what's the
   disease tolerance of P1142AM?", the only correct answer is
   "I don't have that data — please consult Pioneer or an
   extension service."

This is the canonical anti-hallucination policy for the seed-mcp.
There is no Pioneer data; there is no inference. Direct the farmer
to a primary source.
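
One way an agent-side guard for this policy might look is sketched below. Everything in it is an assumption made for illustration: the `P_SERIES` pattern is a guess at what P-series names look like (based only on the `P1142AM` example above), and `pioneer_fallback` and the message text are hypothetical, not part of the server.

```python
import re
from typing import Optional

# Assumption: P-series names look like "P" + four digits + optional
# uppercase trait suffix (e.g. P1142AM). Not an official naming spec.
P_SERIES = re.compile(r"\bP\d{4}[A-Z]*\b")

PIONEER_FALLBACK = (
    "I don't have Pioneer's variety data indexed. Please consult "
    "pioneer.com (Find a Sales Rep) or a university extension "
    "performance trial for that hybrid."
)


def pioneer_fallback(question: str) -> Optional[str]:
    """Return the canned fallback if the question mentions Pioneer, else None."""
    if "pioneer" in question.lower() or P_SERIES.search(question):
        return PIONEER_FALLBACK
    return None
```

The guard never returns a rating. It either hands back the fixed message or lets the normal search path run.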

---

## rating-scales

Different vendors publish ratings using different conventions. The
chunker normalizes the *labels* in the chunk preamble but always
preserves the source's `_scale_direction` field in the sidecar.

**Bayer (DEKALB / Asgrow / WestBred)**: `1-9 (9 = best)`. A
GRAY LEAF SPOT rating of 8 means EXCELLENT tolerance. A rating of 2
means SUSCEPTIBLE.

**Syngenta Golden Harvest**: `9-to-1 (9 = best, 1 = worst)` —
this is the *direction* Golden Harvest publishes, but the *meaning*
of high numbers is the same: high = best. Where the chunker says
"normalize" for Golden Harvest, that just means we've already
re-stated it as `1-9 (9 = best)` in the chunk preamble; the source's
`_scale_direction` field still says `9-to-1` so you can detect the
provenance.

**Syngenta NK / AgriPro**: `1-9 (9 = best)`. Same as Bayer.

**Beck's**: ratings live behind SeedIQ login; only identity-level
data is publicly available, so most disease/agronomic ratings are
absent from Beck's records in this corpus.

**Always check the chunk's "Rating scale" line or call
`lookup_variety(source_key)` and look at `_scale_direction` if you
are unsure.** Cross-vendor comparisons are valid AFTER you've
confirmed each side uses the same direction.

**Non-numeric values** appear for some characteristics and should be
read literally:
- `R`, `MR`, `S` for soybean disease resistance = Resistant / Moderately
  Resistant / Susceptible (not 1-9).
- `Rps1c`, `Rps3a`, `Rps1k`, etc. = specific Phytophthora resistance
  gene present.
- `R1`, `R3` (under SOYBEAN CYST NEMATODE) = effective against
  SCN race 1 / race 3.
- `A`, `B`, `C` under HERBICIDE sensitivity = grade letters where A
  is most tolerant.
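
The "confirm the direction before comparing" rule can be sketched roughly as follows. This assumes a sidecar is a plain dict; the `ratings` key and the exact string `"1-9"` are assumptions made for the example (only `"9-to-1"` is quoted above as a stored value), and both function names are hypothetical.

```python
# Assumed stored strings: the text above quotes "9-to-1" for Golden Harvest;
# "1-9" for the other vendors is a guess at what the field contains.
HIGH_IS_BEST = {"1-9", "9-to-1"}


def same_convention(sidecar_a: dict, sidecar_b: dict) -> bool:
    """True only when both sidecars declare a known high-is-best scale."""
    a = sidecar_a.get("_scale_direction")
    b = sidecar_b.get("_scale_direction")
    return a in HIGH_IS_BEST and b in HIGH_IS_BEST


def describe(sidecar: dict, characteristic: str) -> str:
    """State a rating together with the convention it was published under."""
    rating = sidecar["ratings"][characteristic]  # "ratings" key is hypothetical
    direction = sidecar.get("_scale_direction", "unknown")
    return f"{characteristic}: {rating} (scale published as {direction})"
```

A missing or unrecognized `_scale_direction` makes `same_convention` return False, which is the conservative default: do not compare until the convention is confirmed.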

---

## maturity-semantics

Maturity is encoded differently per crop. Don't conflate the units.

**Corn — Relative maturity (RM days)**: integer roughly 75-120.
Lower = shorter season, suitable for higher latitudes / shorter
growing windows. 110 RM is a Central Iowa default; 85 RM suits
northern Minnesota or short-season silage; 115+ RM fits southern
Indiana / southern Illinois / Missouri Delta. The number is
**Pioneer-style RM days**, normalized across the industry.

**Soybeans — Maturity group (MG)**: a value from 00 (zero-zero) to
9.0, expressed with one decimal. A "3.5 MG" soybean is for central
Iowa. Northern North Dakota / Minnesota plant 0.0–1.5 MG. Mid-South
plants 5.0+. Each full MG ≈ 7-10 days of additional season, so a
tenth of an MG is roughly one day.
Sidecar field: `maturity_group` (e.g. "3.5", "0.7").

**Wheat — Class + heading**: Winter / spring decision is separate
from "class" (HRW / HRS / SRW / SWW / SWS / durum):
- HRW = Hard Red Winter — Plains states bread wheat
- HRS = Hard Red Spring — Northern Plains, North Dakota, Montana
- SRW = Soft Red Winter — Eastern Corn Belt, Ohio Valley
- SWW = Soft White Winter — Pacific Northwest
- SWS = Soft White Spring — Pacific Northwest
- Durum — North Dakota / Montana, pasta wheat

Maturity is qualitative: Early / Medium-Early / Medium / Medium-Late / Late.

**WestBred's product page JSON does not always expose the wheat class
as a structured field** — sometimes it's only in the marketing
narrative (e.g. "WB1376CLP is a Soft White Winter Clearfield® Plus
Wheat variety"). Read `positioning_statement` carefully when the
sidecar's `wheat_class` is null.
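
The `positioning_statement` fallback could look something like the sketch below. It assumes a sidecar is a plain dict and that `wheat_class`, when present, already holds a usable value; the function name `resolve_wheat_class` and the `WHEAT_CLASSES` table are hypothetical.

```python
from typing import Optional

# Class names from the list above, mapped to their abbreviations.
WHEAT_CLASSES = {
    "hard red winter": "HRW",
    "hard red spring": "HRS",
    "soft red winter": "SRW",
    "soft white winter": "SWW",
    "soft white spring": "SWS",
    "durum": "Durum",
}


def resolve_wheat_class(sidecar: dict) -> Optional[str]:
    """Prefer the structured field; otherwise scan the marketing narrative."""
    structured = sidecar.get("wheat_class")
    if structured:
        return structured
    text = (sidecar.get("positioning_statement") or "").lower()
    for name, code in WHEAT_CLASSES.items():
        if name in text:
            return code
    # Nothing recognizable: report unknown rather than guess.
    return None
```

For the WB1376CLP sentence quoted above, this sketch returns `"SWW"`; when neither field helps it returns `None`, and the agent should say the class is not available.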

---

## trait-glossary

Common trait codes that appear in `trait_stack`:

**Corn:**
- `SSRIB` — SmartStax® RIB Complete® corn blend (above + below-ground
  insect protection + Roundup Ready + LibertyLink, with refuge-in-bag)
- `VT2PRIB` — VT Double PRO® RIB Complete® (above-ground insect
  protection + Roundup Ready, refuge-in-bag)
- `VT4PRIB` — VT4 PRO® RIB Complete® (newer above-ground protection)
- `Trecepta` — Trecepta® (Trecepta + Roundup Ready + LibertyLink, for
  earworm + western bean cutworm pressure)
- `SmartStax PRO` — SmartStax® PRO® (RNAi corn rootworm)
- `PowerCore` — PowerCore® Refuge Advanced (older above-ground stack)
- `Conventional` — no biotech traits (organic / specialty channels)

**Soybeans:**
- `XF` — XtendFlex® (Roundup Ready 2 Xtend + dicamba + glufosinate)
- `Xtend` — Roundup Ready 2 Xtend® (dicamba + glyphosate)
- `RR2Y` — Roundup Ready 2 Yield® (glyphosate only)
- `E3` — Enlist E3® (2,4-D + glyphosate + glufosinate)
- `LL/LL+GT27` — LibertyLink® / LibertyLink + GT27 (glufosinate +
  glyphosate + isoxaflutole)
- `Conkesta E3` — Bt-stack for caterpillar pressure (BR/AR markets)
- `SR` — SR® (sulfonylurea-tolerant, Asgrow-specific)

**Wheat:**
- `Clearfield` / `CLP` — Clearfield® / Clearfield® Plus (imazamox
  tolerance)
- `CoAXium` — CoAXium® (quizalofop tolerance) — note: AgriPro's
  catalog flag, NOT in the WestBred corpus.

Always render the full trait name (`trait_descriptions`) when telling
the farmer "this variety has X trait" — bare trait codes are
ambiguous in print.
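
A small sketch of rendering full trait names follows. The shapes are assumptions: it treats `trait_stack` as a list of codes and `trait_descriptions` as a dict from code to full name, which may not match the real sidecar layout. `render_traits` is a hypothetical helper.

```python
def render_traits(sidecar: dict) -> str:
    """Render each trait as 'Full name (CODE)', flagging codes with no name."""
    codes = sidecar.get("trait_stack") or []          # assumed: list of codes
    names = sidecar.get("trait_descriptions") or {}   # assumed: code -> name
    rendered = []
    for code in codes:
        if code in names:
            rendered.append(f"{names[code]} ({code})")
        else:
            # Do not guess a name that the sidecar does not provide.
            rendered.append(f"{code} (full name not in sidecar)")
    return ", ".join(rendered)
```

Codes without a description are passed through with an explicit note instead of being expanded from memory.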

---

## scn-resistance

Soybean Cyst Nematode resistance ratings are critical for fields
with SCN pressure (most of the Corn Belt). Read carefully:

- `R3` under SOYBEAN CYST NEMATODE = Resistant to race 3 (the most
  common race nationally). Most "SCN-resistant" soybeans on the
  market are R3.
- `R1, R3` = Resistant to both race 1 AND race 3. Higher value;
  useful in long-rotation SCN fields where race shifts have occurred.
- `MR3` = Moderately Resistant to race 3. Some yield loss expected
  under high SCN pressure.
- `S` = Susceptible.
- Some Bayer Asgrow XF lines (e.g. AG29XF4) use **Peking-type SCN
  resistance**, which is genetically distinct from the more common
  PI 88788 source. Peking is more durable when SCN populations
  have eroded PI 88788 effectiveness. Look for "Peking type" in the
  positioning statement.

**Recommended workflow when a farmer asks about SCN:** call
`search_docs` with the user's MG range + "SCN-resistant", then
`lookup_variety` on the top 2-3 candidates to verify the exact race
coverage and resistance source.
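
The codes listed above can be decoded mechanically. The sketch below assumes the value arrives as a string such as `"R1, R3"`; `parse_scn` is a hypothetical helper, and it raises on anything it does not recognize so an unknown code is never silently interpreted.

```python
import re
from typing import List, Optional, Tuple

LEVELS = {"R": "Resistant", "MR": "Moderately Resistant", "S": "Susceptible"}


def parse_scn(value: str) -> List[Tuple[str, Optional[int]]]:
    """Turn 'R1, R3' / 'MR3' / 'S' into (level, race) pairs."""
    parsed = []
    for token in value.split(","):
        # "MR" is tried before "R" so that "MR3" is not read as "R3".
        match = re.fullmatch(r"(MR|R|S)(\d+)?", token.strip())
        if match is None:
            raise ValueError(f"unrecognized SCN code: {token!r}")
        race = int(match.group(2)) if match.group(2) else None
        parsed.append((LEVELS[match.group(1)], race))
    return parsed
```

This covers race coverage only. The resistance source (Peking vs PI 88788) still has to be read from the positioning statement.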

---

## regional-listings

The `regional_recommendations` array in each sidecar is sourced from
Bayer's "local profiles" — varieties get assigned to regional Seed
Guide bundles (e.g. *"2026 Washington, Oregon, SEED GUIDE"*) with a
named regional agronomist contact. This is the closest signal we have
to *"is this variety recommended for the farmer's geography?"* but
note:

- A variety being absent from a regional listing **does not** mean
  it's unsuitable — Bayer's local agronomists curate these lists.
- Listings are vendor-side recommendations, not third-party trial
  data.
- When the farmer mentions a region, try filtering or scanning for
  varieties whose `regional_recommendations[].product_list_name`
  mentions that region.

Other vendors handle regional placement differently. Golden Harvest
publishes a separate "plot report" system per state/year/site;
NK publishes ratings as PDF tech sheets without regional flags.
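
The region scan described in the bullets above might be sketched like this, assuming each sidecar is a dict whose `regional_recommendations` is a list of dicts with a `product_list_name` string. `varieties_for_region` is a hypothetical helper, and the `name` key in the usage note exists only for illustration.

```python
from typing import List


def varieties_for_region(sidecars: List[dict], region: str) -> List[dict]:
    """Keep sidecars with at least one regional listing naming the region."""
    needle = region.strip().lower()
    hits = []
    for sidecar in sidecars:
        for rec in sidecar.get("regional_recommendations") or []:
            if needle in (rec.get("product_list_name") or "").lower():
                hits.append(sidecar)
                break  # one matching listing is enough
    return hits
```

For example, scanning for `"oregon"` keeps a sidecar listed under *"2026 Washington, Oregon, SEED GUIDE"* and drops one with an empty `regional_recommendations` array. An empty result means "not listed", not "unsuitable".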

---

## sources-not-yet-indexed

These vendors are planned but not yet in the corpus. Don't assume
their data is present:

- **Golden Harvest (Syngenta)** — ~175 varieties, sitemap-driven
  scrape pending.
- **NK (Syngenta)** — 29 varieties.
- **AgriPro (Syngenta wheat)** — 24 wheat varieties (HRW, HRS, HWS,
  SWW, SWS). The only wheat coverage we expect to have outside
  WestBred.
- **Beck's PFR (research)** — 2,089 head-to-head trial documents.
  Different shape from variety records — these are studies, not
  hybrids.
- **Beck's products** — 860 products. Identity-only (SeedIQ login
  gates the ratings).

If `list_versions()` doesn't show a vendor in the `vendor` facet, the
corpus does not have it yet. Direct the farmer to that vendor's
public catalog or their seed dealer.

---

## checking-your-work

Before quoting a specific number to a farmer, **always** call
`lookup_variety(source_key=...)` to confirm. The chunk text inside a
search_docs response is a faithful render of the sidecar, but the
sidecar IS the source of truth. Quoting from the canonical sidecar
makes you robust against:

- Chunk-text formatting bugs (e.g. a rare unicode issue trimming a
  value).
- Future chunker changes (a re-index might rewrite the body).
- Cross-vendor scale-direction differences (the sidecar's
  `_scale_direction` lets you state the convention explicitly).

If `lookup_variety` returns "not found" but `search_docs` surfaced the
chunk, that's a bug — please report it. (In normal operation, every
chunk's `source_key` round-trips to a valid sidecar.)
@@ -369,6 +369,40 @@ def _structured_ratings_block(sidecar: dict) -> str:
    return "\n".join(lines).rstrip() + "\n"


# ---------------------------------------------------------------------------
# Curated lessons — docs_mcp/lessons.md is the canonical source.
# ---------------------------------------------------------------------------
LESSONS_FILE = Path(__file__).resolve().parent / "lessons.md"
_lessons_cache: list[tuple[str, str]] | None = None


def _load_lessons() -> list[tuple[str, str]]:
    """Parse lessons.md into ``[(slug, body), ...]`` sections.

    Sections are delimited by ``## <slug>`` headings. The slug is the
    `<slug>` token (whitespace stripped); the body is everything between
    that heading and the next ``## `` heading (or EOF).
    """
    global _lessons_cache
    if _lessons_cache is not None:
        return _lessons_cache
    out: list[tuple[str, str]] = []
    if not LESSONS_FILE.exists():
        _lessons_cache = out
        return out
    text = LESSONS_FILE.read_text(encoding="utf-8")
    parts = re.split(r"(?m)^## (.+)$", text)
    # parts = [preamble, slug1, body1, slug2, body2, ...]
    for i in range(1, len(parts), 2):
        slug = parts[i].strip()
        body = parts[i + 1] if i + 1 < len(parts) else ""
        # Drop trailing horizontal rule that separates sections.
        body = re.sub(r"\n---\s*$", "", body).strip()
        out.append((slug, body))
    _lessons_cache = out
    return out


# ===========================================================================
# Tools
# ===========================================================================
@@ -711,6 +745,78 @@ def lookup_variety(
    return "\n".join(out)


@mcp.tool()
def crop_seed_api_lessons(
    topic: Annotated[
        str | None,
        Field(description=(
            "OPTIONAL topic — match against lesson section slugs or body "
            "(substring, case-insensitive). Known slugs: pioneer, "
            "rating-scales, maturity-semantics, trait-glossary, "
            "scn-resistance, regional-listings, sources-not-yet-indexed, "
            "checking-your-work. Omit for the full curated index."
        )),
    ] = None,
) -> str:
    """Curated knowledge that does NOT live in the scraped corpus —
    vendor scale-direction notes, trait glossary, maturity semantics,
    SCN resistance interpretation, the **Pioneer fallback policy**,
    and rules for fact-checking your work.

    Call this tool when:

    * The user asks about **Pioneer** or any P-series hybrid — Pioneer
      is intentionally NOT scraped (ToS bans it); the lesson tells you
      what to say instead.
    * You need to compare ratings across vendors — different vendors
      publish on different scale directions.
    * You're parsing a trait code or disease abbreviation you don't
      recognize.
    * Before quoting a specific rating value to a farmer — the
      ``checking-your-work`` lesson reminds you to call
      ``lookup_variety`` to confirm.

    This tool is **the only source of opinionated content** in the
    server. Everything else returned by search_docs / get_page /
    lookup_variety is verbatim from vendor catalogs.
    """
    with TimedCall("crop_seed_api_lessons", {"topic": topic}) as _call:
        sections = _load_lessons()
        if not sections:
            _call.set(sections_returned=0)
            return "_(no lessons file present — docs_mcp/lessons.md missing)_"

        if not topic:
            _call.set(sections_returned=len(sections))
            return "\n\n---\n\n".join(
                f"## {slug}\n\n{body}" for slug, body in sections
            )

        needle = topic.strip().lower()
        # Prefer slug matches (most specific). Fall back to body match
        # only when no slug matches — keeps a query like "rating" from
        # returning every section that happens to mention the word.
        slug_matches: list[tuple[str, str]] = []
        body_matches: list[tuple[str, str]] = []
        for slug, body in sections:
            if needle in slug.lower():
                slug_matches.append((slug, body))
            elif needle in body.lower():
                body_matches.append((slug, body))
        matched = slug_matches if slug_matches else body_matches

        _call.set(sections_returned=len(matched), topic=topic)
        if not matched:
            slugs = ", ".join(s for s, _ in sections)
            return (
                f"_(no lesson section matched topic '{topic}'. "
                f"Available slugs: {slugs}.)_"
            )
        return "\n\n---\n\n".join(
            f"## {slug}\n\n{body}" for slug, body in matched
        )


# ===========================================================================
# Entry point
# ===========================================================================