Add RobSeeCo (Rob-See-Co + Innotech): 130 corn/soy varieties from the seed-guide PDF

Independent regional brand (Elkhorn, NE; rolled up Federal Hybrids / Big Cob /
Kiser / Rupp's grain-forage). No structured web catalog — the lineup lives in
the 2026 Seed Guide PDF — so this is a PDF-extraction identity source.

- robseeco (130: 87 corn + 43 soy; Rob-See-Co 105 + Innotech 25). Downloads the
  guide (cached under var/, gitignored), dedupes the duplicated pages, and
  parses the corn (p5-8) + soy (p19-26) ratings tables. Rotated/vertical column
  headers are reconstructed by clustering rotated words; cells are mapped to
  columns by x-center alignment; the descriptive 2-col cards are joined on
  variety code to supply trait_stack + strengths. Masters Choice silage +
  sorghum scoped out (row-crop core only).
- SCALE 1-9, 9=Best (higher=better, like Bayer/Stine-corn); column map verified
  against the card bullets (e.g. RC2500 "rapid drydown"->Drydown 8, "short
  plant"->Plant Height 5; RC4779 "industry-leading tar spot"->Tar Spot 7).
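
The x-center mapping described above can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the function name, the `tolerance` parameter, and all coordinates are hypothetical, and only the RC2500 values (Drydown 8, Plant Height 5) come from this commit.

```python
def x_center(x0: float, x1: float) -> float:
    """Horizontal midpoint of a word's bounding box."""
    return (x0 + x1) / 2.0


def map_cells_to_columns(headers, cells, tolerance=6.0):
    """Assign each cell to the header whose x-center is nearest.

    headers: dict of column name -> header x-center (hypothetical values)
    cells:   list of (x0, x1, text) boxes from a single table row
    Cells farther than `tolerance` from every header are dropped, so stray
    text is not forced into a ratings column.
    """
    row = {}
    for x0, x1, text in cells:
        cx = x_center(x0, x1)
        name, hx = min(headers.items(), key=lambda kv: abs(kv[1] - cx))
        if abs(hx - cx) <= tolerance:
            row[name] = text
    return row


# Hypothetical coordinates; the ratings mirror the RC2500 example above.
headers = {"Drydown": 210.0, "Plant Height": 232.0, "Tar Spot": 254.0}
cells = [(207.0, 212.0, "8"), (229.5, 235.0, "5"), (300.0, 305.0, "x")]
row = map_cells_to_columns(headers, cells)
```

Here the third cell sits well outside every column and is discarded rather than assigned to the nearest header, which is the kind of guard that keeps a shifted row from silently filling the wrong column.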

Validation: all 130 records chunk via rag.chunk.chunks_from_variety (0 errors),
0 duplicate keys, 0 out-of-range ratings (the misalignment check), RM/MG all
sane.
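
A check along these lines would catch duplicate keys and ratings outside the 1-9 scale. It is only a sketch under assumptions: the record shape (`key`, `ratings`) and the `validate` helper are hypothetical and are not taken from the repository.

```python
from collections import Counter

RATING_MIN, RATING_MAX = 1, 9  # scale stated in this commit: 1-9, 9 = best


def validate(records):
    """Return (duplicate_keys, out_of_range) for a list of variety records.

    Each record is assumed to look like
    {"key": str, "ratings": {column_name: int}}.
    An out-of-range value usually means a cell landed in the wrong column.
    """
    counts = Counter(r["key"] for r in records)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    out_of_range = [
        (r["key"], col, val)
        for r in records
        for col, val in r["ratings"].items()
        if not (RATING_MIN <= val <= RATING_MAX)
    ]
    return duplicates, out_of_range


records = [
    {"key": "RC2500", "ratings": {"Drydown": 8, "Plant Height": 5}},
    {"key": "RC4779", "ratings": {"Tar Spot": 7}},
    {"key": "RC-BAD", "ratings": {"Drydown": 12}},  # hypothetical misaligned row
]
dups, bad = validate(records)
```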

robseeco.com robots.txt is permissive (Squarespace AI-block toggle off; no ToS
scrape clause; PDF on a public CDN). Docs updated: sources.json, README/CLAUDE
inventory (2,398 variety records), and the rating-scales lesson (added RobSeeCo
to the higher=better group + the cross-vendor direction warning).
2026-06-09 23:29:11 -04:00
parent 84ad2b1de6
commit 2425a79f0c
265 changed files with 23133 additions and 6 deletions
+3 -2
@@ -10,9 +10,9 @@ vendors — **variety identity** (what each hybrid IS) plus **yield-trial data**
 ## What's in the corpus
-**~9,050 indexed records** (one chunk each) across two complementary surfaces:
+**~9,200 indexed records** (one chunk each) across two complementary surfaces:
-### Variety identity — 2,268 records
+### Variety identity — 2,398 records
 | Source | Count | Vendor | Brand |
 |---|---|---|---|
@@ -21,6 +21,7 @@ vendors — **variety identity** (what each hybrid IS) plus **yield-trial data**
 | `stine` | 217 | Stine Seed Company | Stine (corn / soy) — **largest US independent, Adel IA** |
 | `lg_seeds` | 170 | AgReliant | LG Seeds (corn / soy / sorghum) |
 | `golden_harvest` | 139 | Syngenta | Golden Harvest (corn / soy) |
+| `robseeco` | 130 | RobSeeCo | Rob-See-Co / Innotech (corn / soy) — **independent, Elkhorn NE; from the seed-guide PDF** |
 | `nk` | 122 | Syngenta | NK (corn / soy) |
 | `proharvest` | 119 | ProHarvest Seeds | ProHarvest / Apex (corn / soy / wheat) — **independent Corn Belt brand** |
 | `agrigold` | 111 | AgReliant | AgriGold (corn / soy) |