Add university-extension trials: Illinois VT + Iowa ICPT + Ohio OCPT (+123 cross-vendor trial docs) #19
Adds the university-extension variety trials as cross-vendor `data_type=trial` sources: the legitimate, independent path to Pioneer / DEKALB / Brevant / Channel performance the corpus can't scrape directly. Land-grant programs test every entered brand side-by-side at the same sites with replication + LSD stats.

- `illinois_vt_trials`
- `iowa_icpt_trials`
- `ohio_ocpt_trials`

+123 trial docs / 6,713 ranked entries. 91 distinct seed brands, with the majors we couldn't catalog directly now independently present: DEKALB 395, Golden Harvest 249, Channel 241, NK 212, Xitavo 135, LG 103, Pioneer 88, Asgrow 59. (A brand appears only where it entered a program: Brevant absent from Iowa, DEKALB/Channel absent from Illinois. These are verified true negatives, not parse gaps.)

Chunker: added a gated `include_region` to `_render_gh_plot_chunk`; the three university sources route through it so the region/district is in the embedded chunk (there are many same-state/year tables) and the chunk is framed as "variety trial (cross-vendor, independent third-party)". Existing plot sources (gh/lg/agrigold/proharvest) verified unchanged (no region, "plot report" wording).

Hard parts handled:

- Iowa's year/district is an ASP.NET viewstate POSTBACK (no GET URLs).
- Ohio's PDF has per-site column groups split by the header's Yield-token count + x-coordinate footnote bucketing, with a site-count sanity gate (0 skips/fallbacks at baseline).
- Illinois uses header-anchored XLSX cell mapping + a self-locating metadata block.
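
For reviewers, the gated `include_region` behaviour from the Chunker note can be pictured roughly as below. This is only a sketch under assumptions: `render_trial_chunk`, the doc field names (`state`, `year`, `region`, `entries`, `brand`, `hybrid`) and the output layout are hypothetical stand-ins, not the actual `_render_gh_plot_chunk` signature or format. Only the flag name, the two framing strings and the `"Yield"` key come from this PR.

```python
def render_trial_chunk(doc: dict, include_region: bool = False) -> str:
    """Hypothetical stand-in for a gated plot/trial chunk renderer."""
    # Gate: only callers that opt in get the trial framing and the region line.
    # With the default (False) the output keeps the "plot report" wording.
    kind = (
        "variety trial (cross-vendor, independent third-party)"
        if include_region
        else "plot report"
    )
    lines = [f"{doc['state']} {doc['year']} {kind}"]
    if include_region and doc.get("region"):
        lines.append(f"Region: {doc['region']}")
    for entry in doc["entries"]:
        lines.append(f"{entry['brand']} {entry['hybrid']}: Yield {entry['Yield']}")
    return "\n".join(lines)


# Made-up placeholder document, not real trial data.
sample = {
    "state": "IA",
    "year": 2025,
    "region": "North",
    "entries": [{"brand": "BrandA", "hybrid": "H1", "Yield": 200.0}],
}
print(render_trial_chunk(sample))                       # existing plot sources
print(render_trial_chunk(sample, include_region=True))  # university sources
```

Defaulting the flag to off is what lets the existing plot sources stay byte-for-byte unchanged while only the three university sources opt in.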

Validation: all 123 chunk via `chunks_from_trial` (0 errors), 0 out-of-range yields, 0 dup keys; `"Yield"` is the canonical metric key throughout.

Legality: all three are public land-grant extension data (published for farmers); no anti-scraping clauses. Attribution recorded per `tos_note` (UIUC VT / Iowa ICPT-ISU / Ohio OCPT-OSU CFAES).

`openpyxl` added (Illinois XLSX; scrape-time only, not imported by the image-only CI rebuild). 2024+2025 baseline; older years + Purdue deferred behind `--include-old`.

Docs: README/CLAUDE inventory (now 2,398 variety + 6,910 trial) + lessons trial-data/Pioneer entries updated. CI rebuilds the index from the committed corpus.
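
The out-of-range and duplicate-key checks mentioned under Validation could look something like the sketch below. It is illustrative only: `validate_trial_docs`, the `key` / `entries` / `hybrid` field names and the yield bounds are assumptions made for the example, since the PR does not state the real range or document schema. Only the `"Yield"` metric key is taken from the description.

```python
from collections import Counter

# Assumed plausibility bounds; the real range used by the PR is not stated.
YIELD_RANGE = (0.0, 400.0)


def validate_trial_docs(docs, yield_range=YIELD_RANGE):
    """Return (out_of_range, duplicate_keys) for a list of trial docs."""
    lo, hi = yield_range
    out_of_range = []
    key_counts = Counter()
    for doc in docs:
        key_counts[doc["key"]] += 1
        for entry in doc["entries"]:
            value = entry["Yield"]  # "Yield" is the canonical metric key
            if not (lo <= value <= hi):
                out_of_range.append((doc["key"], entry.get("hybrid"), value))
    # A key seen more than once would collide in the index.
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return out_of_range, duplicates
```

A clean baseline in this sketch corresponds to both returned lists being empty.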