Add RobSeeCo (Rob-See-Co + Innotech): 130 corn/soy varieties from the seed-guide PDF

Independent regional brand (Elkhorn, NE; rolled up Federal Hybrids / Big Cob /
Kiser / Rupp's grain-forage). No structured web catalog — the lineup lives in
the 2026 Seed Guide PDF — so this is a PDF-extraction identity source.

- robseeco (130: 87 corn + 43 soy; Rob-See-Co 105 + Innotech 25). Downloads the
  guide (cached under var/, gitignored), dedups the duplicated pages, parses the
  corn (p5-8) + soy (p19-26) ratings tables. Rotated/vertical column headers
  reconstructed by clustering rotated words; cells mapped by x-center alignment;
  descriptive 2-col cards joined by code for trait_stack + strengths. Masters
  Choice silage + sorghum scoped out (row-crop core only).
- SCALE 1-9, 9=Best (higher=better, like Bayer/Stine-corn); column map verified
  against the card bullets (e.g. RC2500 "rapid drydown"->Drydown 8, "short
  plant"->Plant Height 5; RC4779 "industry-leading tar spot"->Tar Spot 7).
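
The x-center cell mapping described above can be sketched roughly as follows. This is a minimal illustration, not the fetcher's actual code: the `Word` type, the `assign_columns` helper, the `max_dist` tolerance, and all coordinates are hypothetical. Only the RC2500 ratings (Drydown 8, Plant Height 5) come from the guide.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A text fragment with its horizontal extent on the page (hypothetical shape)."""
    text: str
    x0: float
    x1: float

    @property
    def x_center(self) -> float:
        return (self.x0 + self.x1) / 2


def assign_columns(headers, cells, max_dist=12.0):
    """Map each cell to the header with the nearest x-center.

    Cells farther than max_dist from every header are dropped, so stray
    text is not forced into a ratings column.
    """
    row = {}
    for cell in cells:
        nearest = min(headers, key=lambda h: abs(h.x_center - cell.x_center))
        if abs(nearest.x_center - cell.x_center) <= max_dist:
            row[nearest.text] = cell.text
    return row


# Illustrative coordinates only (assumed, not measured from the PDF).
headers = [Word("Drydown", 100, 110), Word("Plant Height", 130, 140)]
cells = [Word("8", 103, 108), Word("5", 133, 138), Word("note", 300, 330)]
row = assign_columns(headers, cells)
```

Here `row` holds the two ratings keyed by column name, and the far-right `note` fragment is discarded because no header lies within the tolerance.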

Validation: all 130 chunk via rag.chunk.chunks_from_variety (0 errors), 0
duplicate keys, 0 out-of-range ratings (misalignment check), RM/MG all sane.
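
A rough sketch of the duplicate-key and out-of-range checks, assuming each record is a dict with a `key` and a `ratings` mapping. That record shape, the `validate` helper, and the `EXAMPLE-1` record are hypothetical; the real pipeline goes through `rag.chunk.chunks_from_variety`, which is not reproduced here.

```python
from collections import Counter

SCALE_MIN, SCALE_MAX = 1, 9  # RobSeeCo scale: 1-9, 9 = best


def validate(records):
    """Return (duplicate_keys, out_of_range) for a list of variety records."""
    key_counts = Counter(r["key"] for r in records)
    duplicate_keys = sorted(k for k, n in key_counts.items() if n > 1)
    # A rating outside 1-9 usually means a cell landed in the wrong column
    # or two adjacent cells were merged during extraction.
    out_of_range = [
        (r["key"], name, value)
        for r in records
        for name, value in r["ratings"].items()
        if not SCALE_MIN <= value <= SCALE_MAX
    ]
    return duplicate_keys, out_of_range


records = [
    {"key": "RC2500", "ratings": {"Drydown": 8, "Plant Height": 5}},
    # Hypothetical bad record: two digits merged by a misaligned column.
    {"key": "EXAMPLE-1", "ratings": {"Drydown": 85}},
]
duplicate_keys, out_of_range = validate(records)
```

A clean run, as reported above, corresponds to both returned lists being empty.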

Access: robseeco.com's robots.txt is permissive (Squarespace AI-block toggle
off), the ToS has no scraping clause, and the PDF is on a public CDN.

Docs: sources.json + README/CLAUDE inventory (2,398 variety records) +
rating-scales lesson (added RobSeeCo to the higher=better group, plus the
cross-vendor direction warning).
2026-06-09 23:29:11 -04:00
parent 84ad2b1de6
commit 2425a79f0c
265 changed files with 23133 additions and 6 deletions
@@ -34,6 +34,7 @@ and the `crop_seed_api_lessons` tool).
| LG Seeds (AgReliant) | 🟢 | 170 | `lgseeds.com` JSON XHR (+ `lg_plot_reports` trials) |
| Golden Harvest (Syngenta) | 🟢 | 139 | sitemap.xml + server-rendered HTML + Syngenta CDN PDFs (+ `gh_plot_reports` trials) |
| NK (Syngenta) | 🟢 | 122 | static HTML + Syngenta CDN PDFs (shares fetcher with Golden Harvest) |
| **RobSeeCo** (independent, NE) | 🟢 | **130** | **PDF-extraction** of the 2026 Seed Guide (Squarespace; no web catalog). Rob-See-Co + Innotech corn/soy. Scale 1-9 (9=best). Pages duplicated → dedup |
| **ProHarvest Seeds** (independent, IL) | 🟢 | **119** | WordPress REST API (`/wp/v2/seed` + `/seed/<slug>/` detail pages) (+ `proharvest_plots` trials) |
| AgriGold (AgReliant) | 🟢 | 111 | `agrigold.com` server-rendered HTML (+ `agrigold_plot_reports` trials) |
| **1st Choice Seeds** (independent, IN) | 🟢 | **78** | WordPress (CPTs not in REST); per-crop sitemap → detail HTML. Scale 0-10 higher=better. corn/soy/wheat |