Merge pull request 'Trial-data scrapers + search_trials MCP tool (cross-vendor yield trials)' (#7) from trial-data-scrapers into main
Image rebuild (skip scrape) / build (push) Failing after 12s
Image rebuild (skip scrape) / build (push) Failing after 12s
This commit was merged in pull request #7.
This commit is contained in:
@@ -0,0 +1,44 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-2024-pnw-combined",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2024 Pacific Northwest Combined Summary, Three-Year Data",
|
||||
"filename": "2024%20PNW%20Combined.pdf",
|
||||
"region": "Pacific Northwest",
|
||||
"wheat_class_section": null,
|
||||
"year": 2024,
|
||||
"years_covered": [
|
||||
2024
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Olympia",
|
||||
"AP Exceed",
|
||||
"SY Ovation",
|
||||
"SY Dayton",
|
||||
"SY Assure",
|
||||
"AP Iliad",
|
||||
"LCS Shine",
|
||||
"LCS Artdeco",
|
||||
"Norwest Duet",
|
||||
"LCS Hulk",
|
||||
"PNW Hailey",
|
||||
"LCS Sonic",
|
||||
"Norwest Tandem",
|
||||
"UI Magic CL+",
|
||||
"LCS Blackjack",
|
||||
"LCS Drive",
|
||||
"LCS Jefe",
|
||||
"LCS Kamiack"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2024-09/2024%20PNW%20Combined.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2024-09/2024%20PNW%20Combined.pdf"
|
||||
],
|
||||
"page_text_chars": 2613,
|
||||
"fetched_at": "2026-05-25T19:11:04.196638+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,54 @@
|
||||
# 2024 Pacific Northwest Combined Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Pacific Northwest
|
||||
- **Year:** 2024
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2024-09/2024%20PNW%20Combined.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Olympia, AP Exceed, SY Ovation, SY Dayton, SY Assure, AP Iliad, LCS Shine, LCS Artdeco, Norwest Duet, LCS Hulk, PNW Hailey, LCS Sonic, Norwest Tandem, UI Magic CL+, LCS Blackjack, LCS Drive, LCS Jefe, LCS Kamiack
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2024 Pacific Northwest Combined Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2022-2024
|
||||
3-Yr Combined 2-Yr Combined Combined Moses Lake, Walla Walla, Aberdeen, Craigmont, Twin Falls,
|
||||
Variety (2022-2024) (2023-2024) (2024) WA WA ID ID ID
|
||||
Soft White Yield TWT Yield TWT Yield TWT Yield Yield Yield Yield Yield
|
||||
Winter Wheat Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A Bu/A Bu/A
|
||||
AP Olympia 153.6 63.7 146.5 63.5 154.7 65.4 201.6 189.5 113.8 124.5 144.2
|
||||
AP Exceed 150.5 63.2 147.2 63.2 156.6 65.2 188.0 191.2 124.2 124.7 154.8
|
||||
SY Ovation 148.5 63.2 146.3 63.1 148.9 64.8 186.4 179.4 117.1 120.1 141.3
|
||||
SY Dayton 146.4 63.2 143.1 63.2 155.4 65.1 192.6 189.4 115.4 128.9 150.6
|
||||
SY Assure 141.6 64.0 138.7 63.9 141.3 65.4 144.6 172.6 119.7 133.4 136.1
|
||||
AP Iliad 144.7 63.1 150.7 65.1 173.2 175.3 116.3 135.4 153.2
|
||||
LCS Shine 150.5 62.9 146.0 62.8 156.1 64.4 182.0 169.0 137.3 125.8 166.5
|
||||
LCS Artdeco 149.4 62.5 145.9 62.6 158.9 64.2 197.9 183.8 139.9 122.3 150.3
|
||||
Norwest Duet 148.5 62.3 148.7 62.0 162.1 64.3 220.0 196.5 129.3 123.9 140.8
|
||||
LCS Hulk 147.9 64.0 145.1 63.6 156.3 65.3 204.1 174.1 133.5 123.5 146.4
|
||||
PNW Hailey 147.3 63.9 142.0 63.9 152.1 65.4 188.6 192.7 112.1 126.1 141.0
|
||||
LCS Sonic 146.8 62.8 143.7 62.7 150.3 64.7 190.3 186.6 124.2 125.6 125.0
|
||||
Norwest Tandem 144.2 62.4 143.2 62.3 156.1 64.6 196.1 169.8 137.5 130.1 146.8
|
||||
UI Magic CL+ 143.4 63.7 139.4 63.4 150.5 65.4 179.5 183.6 124.2 123.2 142.1
|
||||
LCS Blackjack 147.7 60.5 162.0 63.1 203.8 198.0 122.3 131.9 154.0
|
||||
LCS Drive 141.7 61.3 153.0 63.8 181.4 181.0 125.8 122.9 153.8
|
||||
LCS Jefe 158.0 65.1 208.3 191.0 113.4 123.0 154.3
|
||||
LCS Kamiack 151.1 65.1 181.1 178.9 122.9 133.2 139.6
|
||||
Mean General 148.7 63.2 146.4 62.9 156.0 64.8 192.3 186.6 126.6 127.6 147.1
|
||||
LSD General (5%) EE 7.8 0.7 9.2 1.0 13.0 1.3 18.5 ns 4.5 8.9 ns
|
||||
CV (Effective) 6.5 1.8 6.5 1.9 5.4 1.5 4.7 6.7 1.8 3.4 6.7
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
NS = Non Significant
|
||||
Locations
|
||||
2022—Colfax, WA; Aberdeen, Genesee, and Twin Falls, ID
|
||||
2023—Moses Lake and Walla Walla, WA; Craigmont, Genesee, and Twin Falls, ID
|
||||
2024—Moses Lake and Walla Walla, WA; Aberdeen, Craigmont, and Twin Falls, ID
|
||||
© 2023 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. 8-30-24
|
||||
```
|
||||
@@ -0,0 +1,35 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-2025-np-perf-data-sd-web",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Performance Summary, Syngenta Data",
|
||||
"filename": "2025%20NP%20Perf%20Data%20SD%20web.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"SY Valda",
|
||||
"AP Dagr",
|
||||
"AP Iconic",
|
||||
"AP Elevate",
|
||||
"AP Murdock",
|
||||
"AP Gunsmoke CL2",
|
||||
"SY Ingmar",
|
||||
"AP Revolution",
|
||||
"LCS Trigger"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20SD%20web.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20SD%20web.pdf"
|
||||
],
|
||||
"page_text_chars": 5882,
|
||||
"fetched_at": "2026-05-25T19:11:13.388036+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,104 @@
|
||||
# 2025 Performance Summary, Syngenta Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20SD%20web.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** SY Valda, AP Dagr, AP Iconic, AP Elevate, AP Murdock, AP Gunsmoke CL2, SY Ingmar, AP Revolution, LCS Trigger
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Performance Summary, Syngenta Data
|
||||
South Dakota
|
||||
2025 Yield bu/ac
|
||||
South Dakota Protein Test Wt. Heading Height
|
||||
Variety Avg. Agar Miller Northville Selby % lbs/bu 1-9 1-9
|
||||
AgriPro HY141 59.4 35.8 57.6 72.0 72.1 15.9 57.1 5 6
|
||||
SY Valda 56.8 35.3 63.3 61.2 67.3 16.0 58.2 5 5
|
||||
AP Dagr 56.4 29.7 54.5 69.8 71.5 16.0 55.6 6 4
|
||||
AgriPro HY155 56.2 31.1 57.3 65.6 70.9 16.9 57.6 5 6
|
||||
AgriPro HY162 54.8 30.4 52.2 68.2 68.5 16.2 56.7 5 6
|
||||
AP Iconic 54.1 29.0 52.0 68.9 66.4 17.0 56.3 5 6
|
||||
AP Elevate 53.3 27.1 51.1 71.5 63.8 16.8 56.2 6 4
|
||||
AP Murdock 48.5 29.8 46.6 51.0 66.7 17.5 57.1 4 4
|
||||
AP Gunsmoke CL2 47.6 30.1 44.8 64.5 51.2 17.8 55.7 5 5
|
||||
SY Ingmar 44.3 23.1 40.5 59.3 54.3 17.8 57.2 5 5
|
||||
AP Revolution 43.0 31.5 35.7 38.9 66.1 17.3 57.9 4 4
|
||||
LCS Trigger 63.5 34.0 67.6 77.8 74.6 15.2 58.1 6 6
|
||||
ND Stampede 60.2 47.3 54.3 66.5 72.6 17.1 58.2 5 5
|
||||
Brawn-SD 58.8 42.1 55.4 69.0 68.9 15.4 59.5 5 NA
|
||||
WB9641 57.0 32.2 58.1 71.6 66.0 16.0 56.8 6 5
|
||||
MN-Torgy 56.2 34.0 54.0 67.4 69.4 17.9 58.9 6 6
|
||||
WB9645 55.9 37.5 51.3 67.2 67.6 16.1 56.8 7 6
|
||||
Ascend-SD 54.7 33.2 51.1 66.3 68.3 16.9 57.7 6 8
|
||||
Driver 52.0 30.9 44.4 70.2 62.5 16.7 57.4 5 7
|
||||
WB9642 49.4 34.7 30.0 65.8 67.0 16.4 59.1 6 4
|
||||
WB9590 42.6 41.3 25.8 50.4 52.9 17.9 57.1 4 4
|
||||
Mean 53.7 33.7 50.3 64.4 66.3 16.7 57.4
|
||||
LSD (5%) 8.4 7.1 9.9 5.8 1.0 2.4
|
||||
CV (%) 9.7 9.9 9.0 13.0 4.3 2.6 1.2
|
||||
No. of Locs. 4 2 2
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= early; Height: 1 = short; Lodging: 1 = no lodging; Disease 1 = tolerant
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025 crop year. Specific conditions may cause
|
||||
variations within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific
|
||||
evaluations. Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific
|
||||
evaluation inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly
|
||||
100% hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
|
||||
Three-Year Performance Summary, Syngenta Data (2023-2025)
|
||||
South Dakota
|
||||
Yield Average bu/ac Economic Return1 Agronomics and Disease
|
||||
Protein Test Wt. Gross Heading Height Lodging BLS FHB
|
||||
Variety 3-yr 2-yr 2025 % lb/bu $/A Rank 1-9 1-9 1-9 1-9 1-9
|
||||
AgriPro HY141 65.0 65.7 59.4 15.4 60.2 370.0 3 5 6 6 5 4
|
||||
AP Dagr 63.1 62.5 56.4 15.6 59.8 364.2 10 6 4 5 4 5
|
||||
AgriPro HY162 62.7 61.4 54.8 15.7 59.9 364.5 9 5 6 5 5 4
|
||||
AP Iconic 62.2 62.1 54.1 16.2 60.1 368.4 5 5 6 3 4 4
|
||||
SY Valda 62.2 62.5 56.8 16.1 60.5 368.3 6 5 5 5 4 4
|
||||
AgriPro HY155 61.8 61.5 56.2 16.5 60.3 365.8 7 5 6 5 5 4
|
||||
AP Elevate 61.6 60.9 53.3 16.3 60.2 364.9 8 6 4 3 4 4
|
||||
AP Gunsmoke CL2 56.9 55.6 47.6 16.8 60.0 336.9 12 5 5 3 5 4
|
||||
AP Murdock 53.1 54.7 48.5 16.9 60.2 314.3 13 4 4 4 4 4
|
||||
AP Revolution 52.8 53.2 43.0 16.9 60.7 312.9 14 4 4 4 3 3
|
||||
SY Ingmar 52.4 51.7 44.3 17.2 60.7 309.9 15 5 5 3 3 3
|
||||
LCS Trigger 71.9 70.9 63.5 14.3 60.9 376.2 2 6 6 NA 4 NA
|
||||
Brawn-SD 64.5 62.1 58.8 15.5 61.7 369.3 4 5 NA NA 4 5
|
||||
Ascend-SD 64.4 62.5 54.7 16.7 60.8 381.2 1 6 8 7 3 4
|
||||
Driver 60.5 58.5 52.0 16.3 60.5 358.1 11 5 7 NA NA NA
|
||||
ND Stampede 60.2 5 5 5 4 5
|
||||
WB9641 57.0 6 5 4 6 5
|
||||
MN-Torgy 56.2 6 6 5 3 3
|
||||
WB9645 55.9 7 6 5 6 5
|
||||
WB9642 49.4 6 4 6 6 4
|
||||
WB9590 42.6 4 4 2 6 6
|
||||
Mean 61.0 60.4 53.7 16.2 60.4
|
||||
LSD (5%) 4.3 4.5 8.4 0.7 0.4
|
||||
CV (%) 7.7 7.9 9.74 0.8 2.2
|
||||
No. of Locs. 10 8 4 6 6
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= early; Height: 1 = short; Lodging: 1 = no lodging; Disease 1 = tolerant
|
||||
2025 Locations: Agar, Miller, Northville, and Selby SD
|
||||
2024 Locations: Agar, Roscoe, Northville, and Selby, SD
|
||||
2023 Locations: Selby, and Webster, SD
|
||||
1 Economic return calculated by using the three-year yield average multiplied by the average grain price ($5.12/bu). (+) 8 cents per 1/5th premium over 14% protein, (-) 10 cents
|
||||
per 1/5 discount under 14% protein
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025crop year. Specific conditions may cause variations
|
||||
within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific evaluations.
|
||||
Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific evaluation
|
||||
inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly 100%
|
||||
hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
```
|
||||
@@ -0,0 +1,36 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-2025-np-perf-data-web-east",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Performance Summary, Syngenta Data",
|
||||
"filename": "2025%20NP%20Perf%20Data%20web%20East.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Iconic",
|
||||
"AP Elevate",
|
||||
"SY Valda",
|
||||
"AP Murdock",
|
||||
"AP Smith",
|
||||
"AP Dagr",
|
||||
"AP Gunsmoke CL2",
|
||||
"SY",
|
||||
"SY Ingmar",
|
||||
"LCS Boom"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20East.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20East.pdf"
|
||||
],
|
||||
"page_text_chars": 7194,
|
||||
"fetched_at": "2026-05-25T19:11:10.521992+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,114 @@
|
||||
# 2025 Performance Summary, Syngenta Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20East.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Iconic, AP Elevate, SY Valda, AP Murdock, AP Smith, AP Dagr, AP Gunsmoke CL2, SY, SY Ingmar, LCS Boom
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Performance Summary, Syngenta Data
|
||||
Eastern North Dakota and Minnesota
|
||||
2025 Yield bu/ac
|
||||
Test
|
||||
Minnesota North Dakota
|
||||
Prot. Wt.
|
||||
Variety Avg. Crookston Glyndon Warren Wolverton Cando Casselton Langdon McVille Park River Thompson % lbs/bu
|
||||
AgriPro HY141 90.8 87.2 75.8 85.7 90.7 88.4 103.5 108.6 92.7 82.0 93.9 13.5 59.5
|
||||
AgriPro HY155 90.8 93.7 74.7 93.3 88.3 84.6 91.8 101.3 99.1 85.3 96.1 14.0 60.2
|
||||
AgriPro HY162 86.7 85.8 73.1 88.9 92.5 75.8 86.3 103.8 90.1 82.3 88.6 13.5 60.2
|
||||
AP Iconic 83.5 92.2 69.0 75.1 76.9 86.8 85.5 104.1 86.6 73.6 85.5 13.5 58.7
|
||||
AP Elevate 81.9 84.3 66.1 74.2 74.9 78.1 81.2 104.7 93.9 73.2 87.9 13.5 59.7
|
||||
SY Valda 81.2 74.6 71.6 94.7 73.2 81.0 82.8 91.0 82.5 72.9 87.3 13.9 58.4
|
||||
AP Murdock 80.2 90.0 64.7 85.2 78.6 75.7 82.3 96.2 87.4 55.3 87.1 13.5 59.5
|
||||
AP Smith 80.2 80.8 64.9 83.0 70.3 71.3 85.0 91.5 95.1 71.6 88.6 14.4 59.1
|
||||
AP Dagr 79.0 79.2 68.1 90.5 71.2 75.5 82.1 88.1 79.5 59.0 97.0 13.2 57.6
|
||||
AP Gunsmoke CL2 77.4 81.7 63.1 79.7 67.8 82.3 83.7 93.5 88.1 49.9 83.9 13.3 58.1
|
||||
SY 611 CL2 76.7 78.0 63.1 73.2 71.1 74.4 77.3 100.8 83.3 62.7 82.8 14.5 58.9
|
||||
SY Ingmar 74.8 70.0 62.1 80.8 66.6 69.4 85.0 89.0 73.1 65.0 86.7 14.5 59.1
|
||||
ND Stampede 87.1 86.8 69.0 90.3 88.7 77.2 91.0 100.5 100.9 76.2 90.4 15.1 58.8
|
||||
WB9645 85.7 96.2 72.7 92.2 73.9 74.2 95.1 103.0 99.2 69.1 81.4 13.3 58.9
|
||||
Ascend-SD 85.5 95.6 69.0 88.8 90.5 79.7 88.3 93.1 90.6 78.0 81.7 14.1 60.5
|
||||
WB9641 85.5 74.7 64.2 93.8 75.0 88.1 92.1 104.8 98.3 65.4 98.9 12.7 58.6
|
||||
WB9590 84.8 81.3 63.8 88.1 76.9 77.0 86.2 103.1 93.0 77.0 101.4 14.8 59.5
|
||||
WB9606 83.8 87.4 70.8 90.6 69.7 78.8 96.4 95.5 94.2 75.3 79.0 12.9 60.5
|
||||
Faller 82.5 88.9 62.2 92.1 69.4 83.8 83.0 98.6 87.6 66.5 92.6 14.0 58.9
|
||||
WB9642 81.9 83.7 65.7 92.0 77.8 81.1 80.8 91.9 85.2 73.2 88.0 13.6 60.9
|
||||
MN-Torgy 80.9 89.4 65.1 84.6 80.9 82.0 76.0 92.6 94.5 70.3 73.7 14.6 60.5
|
||||
MN-Rothsay 80.8 82.2 65.8 91.1 69.7 82.2 80.5 96.1 91.1 60.7 88.3 13.9 58.8
|
||||
WB9719 75.9 84.6 60.3 88.8 63.8 72.7 79.1 97.8 78.4 42.9 90.7 13.3 57.5
|
||||
LCS Boom 74.3 84.0 52.5 77.5 67.6 79.3 76.5 91.7 91.0 46.4 76.9 14.2 60.6
|
||||
ND Thresher 73.3 81.9 56.3 86.9 71.9 73.7 70.5 86.7 70.2 56.2 78.9 14.4 56.6
|
||||
Mean 82.2 84.8 66.4 86.9 76.6 79.2 85.2 97.5 89.2 68.3 87.6 13.8 59.2
|
||||
LSD (5%) 4.8 11.5 4.7 13.2 13.1 11.0 — 10.6 10.3 9.2 11.4 0.5 0.7
|
||||
CV (%) 6.5 6.6 4.3 7.3 8.2 8.4 12.1 6.6 7.0 7.8 6.4 6.9 2.1
|
||||
No. of Locs. 10 8 10
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= Early, Height: 1 = Short
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025 crop year. Specific conditions may cause variations
|
||||
within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific evaluations.
|
||||
Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific evaluation
|
||||
inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly 100%
|
||||
hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
|
||||
Three-Year Performance Summary, Syngenta Data (2023-2025)
|
||||
Eastern North Dakota and Minnesota
|
||||
Yield Average bu/ac Economic Return1 Agronomics and Disease
|
||||
Protein Test Wt. Heading Height Lodging BLS FHB
|
||||
Variety 3-yr 2-yr 2025 % lb/bu Gross $/A Rank 1-9 1-9 1-9 1-9 1-9
|
||||
AgriPro HY155 92.4 91.5 90.8 14.4 60.2 488.8 1 5 6 5 5 4
|
||||
AgriPro HY141 91.7 90.6 90.8 13.8 59.8 461.9 5 5 6 6 5 4
|
||||
AgriPro HY162 91.0 89.3 86.7 13.6 60.3 450.0 8 5 6 5 5 4
|
||||
AP Iconic 87.5 87.1 83.5 14.0 59.4 446.3 9 5 6 3 4 4
|
||||
AP Elevate 85.9 86.0 81.9 14.5 60.1 458.5 6 6 4 3 4 4
|
||||
AP Dagr 85.7 82.6 79.0 13.7 58.7 426.2 17 6 4 5 4 5
|
||||
SY Valda 85.5 84.5 81.2 14.2 58.8 446.2 10 5 5 5 4 4
|
||||
AP Smith 82.6 82.0 80.2 14.8 59.6 450.3 7 6 4 2 3 4
|
||||
AP Murdock 82.4 82.4 80.2 14.3 60.0 431.5 16 4 4 4 4 4
|
||||
AP Gunsmoke CL2 81.2 80.9 77.4 14.6 58.9 434.8 15 5 5 3 5 4
|
||||
SY 611 CL2 80.7 79.5 76.7 14.9 59.3 443.4 11 5 4 4 4 3
|
||||
SY Ingmar 79.8 79.7 74.8 15.0 59.7 440.1 13 5 5 3 3 3
|
||||
WB9606 87.7 85.8 83.8 13.8 60.9 438.9 14 5 6 5 6 5
|
||||
MN-Rothsay 86.3 84.3 80.8 14.8 59.9 468.5 3 7 3 3 4 5
|
||||
WB9590 85.5 84.4 84.8 15.1 59.3 476.8 2 4 4 2 6 6
|
||||
Faller 85.0 82.9 82.5 14.2 59.3 443.3 12 6 7 7 3 3
|
||||
MN-Torgy 84.0 82.4 80.9 15.1 60.9 466.3 4 6 6 5 3 3
|
||||
WB9719 81.6 77.9 75.9 14.0 58.6 417.1 18 6 5 2 5 6
|
||||
Ascend-SD 84.3 85.5 6 8 7 3 4
|
||||
ND Thresher 74.4 73.3 6 5 7 3 5
|
||||
ND Stampede 87.1 5 5 5 4 5
|
||||
WB9645 85.7 7 6 5 6 5
|
||||
WB9641 85.5 6 5 4 6 5
|
||||
WB9642 81.9 6 4 6 6 4
|
||||
LCS Boom 74.3 3 5 4 6 4
|
||||
Mean 85.4 83.6 82.5 14.6 59.7
|
||||
LSD (5%) 3.3 4.0 4.8 0.5 0.8
|
||||
CV (%) 6.7 6.9 6.5 2.8 2.1
|
||||
No. of Locs. 23 16 10 22 21
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= early; Height: 1 = short; Disease: 1 = no disease
|
||||
2025 Locations: Crookston, Glyndon, Warren, and Wolverton MN; Cando, Casselton, Langdon, McVille, Park River, and Thompson ND
|
||||
2024 Locations: Casselton, Drayton, Langdon, and Park River, ND; Crookston and Warren, MN
|
||||
2023 Locations: Casselton, McVille, Park River, and Valley City, ND; Crookston, Glyndon, and Warren, MN
|
||||
1 Economic return calculated by using the three-year yield average multiplied by the average grain price ($5.12/bu). (+) 8 cents per 1/5th premium over 14% protein, (-) 10 cents
|
||||
per 1/5 discount under 14% protein.
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025 crop year. Specific conditions may cause
|
||||
variations within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific
|
||||
evaluations. Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific
|
||||
evaluation inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly 100%
|
||||
hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
```
|
||||
@@ -0,0 +1,36 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-2025-np-perf-data-web-west",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Performance Summary, Syngenta Data",
|
||||
"filename": "2025%20NP%20Perf%20Data%20web%20west.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Iconic",
|
||||
"AP Elevate",
|
||||
"SY",
|
||||
"SY Valda",
|
||||
"AP Smith",
|
||||
"AP Dagr",
|
||||
"AP Gunsmoke CL2",
|
||||
"AP Murdock",
|
||||
"SY Ingmar",
|
||||
"LCS Boom"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20west.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20west.pdf"
|
||||
],
|
||||
"page_text_chars": 6380,
|
||||
"fetched_at": "2026-05-25T19:11:11.464402+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,113 @@
|
||||
# 2025 Performance Summary, Syngenta Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-11/2025%20NP%20Perf%20Data%20web%20west.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Iconic, AP Elevate, SY, SY Valda, AP Smith, AP Dagr, AP Gunsmoke CL2, AP Murdock, SY Ingmar, LCS Boom
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Performance Summary, Syngenta Data
|
||||
Western North Dakota
|
||||
2025 Yield bu/ac
|
||||
North Dakota
|
||||
Prot. Test Wt.
|
||||
Variety Avg. Berthold Coleharbor Harvey New Leipzig Steele Velva % lbs/bu
|
||||
AgriPro HY162 78.9 45.9 110.4 89.1 67.3 61.1 99.3 14.6 60.2
|
||||
AgriPro HY155 77.5 45.4 110.9 85.1 65.2 60.3 98.0 15.3 59.7
|
||||
AP Iconic 75.8 50.2 100.4 88.9 67.4 55.2 92.7 14.5 59.9
|
||||
AgriPro HY141 75.1 46.8 101.8 75.8 68.4 59.1 98.6 14.3 60.0
|
||||
AP Elevate 74.6 52.8 104.9 81.6 57.1 61.4 89.6 14.9 60.7
|
||||
SY 611 CL2 73.3 54.0 98.7 74.3 66.3 55.9 90.5 15.5 60.3
|
||||
SY Valda 72.5 56.7 94.3 78.3 62.1 52.0 91.7 14.5 58.9
|
||||
AP Smith 72.1 56.9 98.2 81.9 46.9 58.8 90.1 15.2 59.9
|
||||
AP Dagr 70.7 59.2 94.3 63.9 60.9 53.3 92.4 14.4 59.6
|
||||
AP Gunsmoke CL2 70.5 51.2 92.5 77.9 61.0 53.5 87.0 15.5 60.5
|
||||
AP Murdock 69.2 50.7 97.5 69.7 64.7 48.1 84.2 15.0 60.8
|
||||
SY Ingmar 65.8 49.0 95.8 68.5 57.3 42.7 81.4 15.5 58.7
|
||||
ND Stampede 78.1 51.0 107.3 86.8 62.0 68.5 93.2 16.1 60.3
|
||||
WB9641 76.8 58.8 104.9 87.2 51.6 62.8 95.4 14.1 59.8
|
||||
Faller 76.5 47.1 98.6 85.1 63.6 68.1 96.6 15.2 60.4
|
||||
WB9645 75.6 51.3 100.9 76.0 63.4 58.9 102.8 14.1 60.6
|
||||
MN-Torgy 75.1 58.5 93.0 83.0 68.5 63.2 84.6 15.8 61.8
|
||||
LCS Boom 75.0 49.2 100.7 87.7 70.8 54.1 87.7 15.0 62.5
|
||||
WB9606 74.0 49.1 109.1 84.1 52.7 55.5 93.3 14.4 60.6
|
||||
Ascend-SD 73.0 50.7 98.9 84.4 50.3 68.4 85.0 15.1 62.1
|
||||
WB9590 72.2 52.6 104.4 74.8 59.4 49.8 92.1 15.6 60.4
|
||||
WB9642 71.8 48.0 96.3 66.9 72.3 57.1 90.4 14.9 60.4
|
||||
WB9719 71.7 50.4 98.6 71.1 69.1 53.2 87.9 15.0 59.7
|
||||
MN-Rothsay 71.0 48.0 100.7 89.1 36.5 61.4 90.5 13.5 60.4
|
||||
ND Thresher 68.4 49.1 93.5 68.8 58.3 55.6 85.0 15.8 59.2
|
||||
Mean 73.7 51.3 100.8 79.1 61.3 57.8 91.7 14.9 60.3
|
||||
LSD (5%) 6.6 10.4 12.5 8.9 — 6.2 1.1 1.6
|
||||
CV (%) 7.7 11.8 6.3 9.6 8.7 — 3.2 5.1 2.3
|
||||
No. of Locs. 6 5 6
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= Early, Height: 1 = Short
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025 crop year. Specific conditions may cause variations
|
||||
within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific evaluations.
|
||||
Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific evaluation
|
||||
inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly 100%
|
||||
hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
|
||||
Three-Year Performance Summary, Syngenta Data (2023-2025)
|
||||
Western North Dakota
|
||||
Yield Average bu/ac Economic Return1 Agronomics and Disease
|
||||
Protein Test Wt. Heading Height Lodging BLS FHB
|
||||
Variety 3-yr 2-yr 2025 % lb/bu Gross $/A Rank 1-9 1-9 1-9 1-9 1-9
|
||||
AgriPro HY162 84.9 81.3 78.9 14.2 60.0 440.5 4 5 6 5 5 4
|
||||
AP Iconic 83.4 78.5 75.8 14.3 59.9 438.3 7 5 6 3 4 4
|
||||
AgriPro HY141 83.3 79.6 75.1 14.1 59.9 430.5 11 5 6 6 5 4
|
||||
AgriPro HY155 82.5 79.4 77.5 14.8 59.4 449.9 2 5 6 5 5 4
|
||||
AP Elevate 80.4 76.9 74.6 14.8 60.3 438.6 6 6 4 3 4 4
|
||||
AP Dagr 79.8 73.4 70.7 14.1 59.5 412.3 16 6 4 5 4 5
|
||||
SY Valda 79.5 73.4 72.5 14.3 59.1 416.3 15 5 5 5 4 4
|
||||
SY 611 CL2 78.2 73.5 73.3 15.1 59.7 435.7 8 5 4 4 4 3
|
||||
AP Smith 77.9 74.0 72.1 15.0 59.8 431.3 10 6 4 2 3 4
|
||||
AP Gunsmoke CL2 76.9 72.8 70.5 15.4 60.1 435.7 9 5 5 3 5 4
|
||||
AP Murdock 76.6 74.1 69.2 14.9 60.5 418.6 14 4 4 4 4 4
|
||||
SY Ingmar 74.7 71.4 65.8 15.4 59.5 424.5 13 5 5 3 3 3
|
||||
WB9606 82.9 78.6 74.0 14.1 60.2 429.1 12 5 6 5 6 5
|
||||
Faller 81.9 76.9 76.5 14.7 59.8 443.0 3 6 7 7 3 3
|
||||
WB9719 80.6 75.7 71.7 14.9 59.4 440.1 5 6 5 2 5 6
|
||||
WB9590 78.4 75.7 72.2 15.7 59.9 453.6 1 4 4 2 6 6
|
||||
MN-Torgy 77.2 75.1 6 6 5 3 3
|
||||
Ascend-SD 75.0 73.0 6 8 7 3 4
|
||||
MN-Rothsay 74.0 71.0 7 3 3 4 5
|
||||
ND Thresher 69.2 68.4 6 5 7 3 5
|
||||
ND Stampede 78.1 5 5 5 4 5
|
||||
WB9641 76.8 6 5 4 6 5
|
||||
WB9645 75.6 7 6 5 6 5
|
||||
LCS Boom 75.0 3 5 4 6 4
|
||||
WB9642 71.8 6 4 6 6 4
|
||||
Mean 80.1 75.5 73.7 14.7 59.8
|
||||
LSD (5%) 4.3 5.7 6.6 0.5 1.2
|
||||
CV (%) 6.9 8.1 7.7 5.9 2.2
|
||||
No. of Locs. 13 9 6 8 9
|
||||
Numbers in bold type are in the top yielding group and considered statistically similar.
|
||||
Numerical ratings: Heading: 1= early; Height: 1 = short; Disease: 1 = no disease
|
||||
2025 Locations: Berthold, Coleharbor, Harvey, New Leipzig, Steele, and Velva ND
|
||||
2024 Locations: Coleharbor, Harvey, and Velva, ND
|
||||
2023 Locations: Berthold, Coleharbor, New Leipzig, and Velva, ND
|
||||
1 Economic return calculated by using the three-year yield average multiplied by the average grain price ($5.12/bu). (+) 8 cents per 1/5th premium over 14% protein, (-) 10 cents
|
||||
per 1/5 discount under 14% protein.
|
||||
These agronomic assessments are made by Syngenta scientists and reflect each variety’s relative performance within these characteristics through the 2025 crop year. Specific conditions may cause
|
||||
variations within those characteristics. These relative protection values are based on current pest and disease populations. These have been known to shift periodically and may cause changes in specific
|
||||
evaluations. Resistance to many other diseases and pests is sensitive to environmental conditions, plant development stages and the presence and intensity of other diseases which may result in specific
|
||||
evaluation inconsistencies. This chart is updated annually to reflect the most current trends.
|
||||
AgriPro hybrid wheat seed sold commercially contains 75-95% hybrid seed, as required by the Federal Seed Act. Plot trial data for AgriPro hybrids represents performance using seed lots with nearly 100%
|
||||
hybrid seed.
|
||||
© 2025 Syngenta. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company. Some or all of the varieties may be protected under one or more of the following: Plant Variety
|
||||
Protection, United States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. NP - 10/2025
|
||||
```
|
||||
@@ -0,0 +1,33 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-central-plains-dryland-2025-r1",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Central Plains Dryland Summary, Three-Year Data",
|
||||
"filename": "Central%20Plains%20Dryland%202025%20r1.pdf",
|
||||
"region": "Central Plains",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"SY Wolverine",
|
||||
"AP Roadrunner",
|
||||
"AP Sunbird",
|
||||
"AP Bigfoot",
|
||||
"AP Prolific",
|
||||
"SY Monument",
|
||||
"LCS Atomic AX"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-07/Central%20Plains%20Dryland%202025%20r1.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-07/Central%20Plains%20Dryland%202025%20r1.pdf"
|
||||
],
|
||||
"page_text_chars": 2530,
|
||||
"fetched_at": "2026-05-25T19:11:09.410687+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,56 @@
|
||||
# 2025 Central Plains Dryland Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Central Plains
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-07/Central%20Plains%20Dryland%202025%20r1.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** SY Wolverine, AP Roadrunner, AP Sunbird, AP Bigfoot, AP Prolific, SY Monument, LCS Atomic AX
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Central Plains Dryland Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Belleville, Junction City, Pratt, Salina,
|
||||
Variety (2023-2025) (2024-2025) (2025) KS* KS KS KS
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A Bu/A
|
||||
AP24 AX 63.9 58.8 70.7 58.9 73.8 55.3 64.8 94.7 59.0 76.7
|
||||
SY Wolverine 62.0 60.6 66.2 60.6 66.1 56.9 58.2 86.2 67.9 51.9
|
||||
AP Roadrunner 61.3 57.8 64.9 57.4 61.3 53.9 54.1 79.7 52.9 58.5
|
||||
AP Sunbird 61.2 61.0 67.2 61.3 66.6 57.9 59.0 86.7 54.3 66.5
|
||||
AP Bigfoot 60.0 61.3 65.2 61.5 64.7 59.0 53.1 86.6 53.3 65.9
|
||||
AP Prolific 59.6 60.4 64.1 60.4 61.8 58.2 46.0 86.2 45.8 69.2
|
||||
SY Monument 59.2 59.6 62.5 59.8 56.8 55.9 48.2 71.6 48.9 58.6
|
||||
Bob Dole 56.4 59.1 60.2 58.9 57.7 55.2 47.0 68.3 50.9 64.6
|
||||
Showdown 62.9 60.1 69.2 60.1 65.2 55.7 56.4 83.0 54.7 66.6
|
||||
Rockstar 61.7 58.8 67.4 59.1 65.5 55.8 54.0 83.8 55.9 68.3
|
||||
KS Providence 61.1 60.4 66.3 60.3 65.2 57.2 52.5 88.9 59.5 59.7
|
||||
WB4401 60.4 60.3 65.5 60.2 61.5 56.1 51.7 78.8 57.0 58.5
|
||||
LCS Atomic AX 59.4 61.7 65.2 61.8 65.1 59.2 52.3 88.0 50.0 70.1
|
||||
WB4523 58.2 59.5 62.8 59.6 57.5 56.2 40.6 68.2 60.0 61.2
|
||||
WB4422 69.6 59.7 71.0 56.3 62.5 87.3 63.6 70.6
|
||||
KS Bill Snyder 73.1 59.0 59.3 94.6 57.4 81.2
|
||||
WB4699 64.0 55.3 51.0 84.2 56.9 64.1
|
||||
High Cotton 63.8 58.2 43.8 90.9 59.4 61.2
|
||||
Polansky Goldenhawk 63.0 56.4 52.7 86.5 55.9 56.8
|
||||
Doublestop CLP 60.7 57.7 53.9 74.0 51.3 63.4
|
||||
Mean General 60.6 60.0 65.9 60.0 63.7 56.7 51.7 83.6 54.3 65.4
|
||||
LSD General (5%) EE 3.9 1.1 4.6 1.2 8.3 2.7 9.5 10.7 8.1 9.9
|
||||
CV (Effective) 9.0 2.4 8.8 2.5 9.2 3.3 11.2 7.8 9.1 9.2
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
* Location was affected by a Wheat Streak Mosaic Virus infestation, which resulted in reduced yield of susceptible varieties.
|
||||
Locations
|
||||
2023 — Belleville, Conway Springs, Junction City, and Salina, KS; Carrier, OK
|
||||
2024 — Belleville, Conway Springs, Junction City, Palco, Pratt, and Salina, KS; Carrier, OK
|
||||
2025 — Belleville, Junction City, Pratt, and Salina, KS
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,32 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-montana-2025-web",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Montana Summary, Three-Year Data",
|
||||
"filename": "Montana%202025%20web.pdf",
|
||||
"region": "Montana",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Solid",
|
||||
"SY Monument",
|
||||
"AP Sunbird",
|
||||
"SY",
|
||||
"LCS Steel AX",
|
||||
"LCS Julep"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-08/Montana%202025%20web.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-08/Montana%202025%20web.pdf"
|
||||
],
|
||||
"page_text_chars": 1922,
|
||||
"fetched_at": "2026-05-25T19:11:12.271213+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,53 @@
|
||||
# 2025 Montana Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Montana
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-08/Montana%202025%20web.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Solid, SY Monument, AP Sunbird, SY, LCS Steel AX, LCS Julep
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Montana Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Fort Benton, Conrad, Billings, Sawfly
|
||||
Variety (2023-2025) (2024-2025) (2025) MT MT MT Protein Damage
|
||||
Hard Red Yield Yield Yield TWT Yield Yield Yield % 1-9
|
||||
Winter Wheat Bu/A Bu/A Bu/A Lb/Bu Bu/A Bu/A Bu/A
|
||||
AP24 AX 63.4 62.1 71.8 61.3 59.5 47.0 109.0 12.2 5
|
||||
AP Solid 63.4 61.9 71.6 63.9 58.7 43.0 113.1 13.4 4
|
||||
SY Monument 62.5 62.5 72.9 61.6 58.9 42.4 117.3 12.1 5
|
||||
AP Sunbird 61.5 61.8 70.7 61.1 61.7 38.9 111.4 12.6 6
|
||||
AP18 AX 59.0 58.3 68.5 62.9 56.3 41.2 108.1 12.7 6
|
||||
SY 517 CL2 53.4 53.1 63.4 56.3 55.4 33.7 101.2 13.9 5
|
||||
Keldin 65.6 63.5 75.0 61.9 69.5 39.4 115.9 12.9 5
|
||||
Bobcat 63.5 63.2 73.2 62.5 68.5 47.8 103.4 13.1 2
|
||||
MT WarCat 61.4 59.5 69.7 62.9 58.6 42.3 108.4 13.2 3
|
||||
Warhorse 57.9 55.3 64.5 54.8 57.1 32.8 103.6 14 2
|
||||
WB4523 63.8 76.9 62.0 69.8 42.4 118.6 11.6 4
|
||||
WB4483 60.0 68.4 54.8 61.0 35.9 108.1 13.8 5
|
||||
StandClear CLP 59.2 68.2 62.6 61.2 39.7 103.8 13.2 5
|
||||
Scorpio 56.9 68.1 53.6 61.9 34.2 108.2 12.5 6
|
||||
WB4733 CLP 56.3 64.5 62.3 55.2 38.3 99.9 13.9 4
|
||||
DG Ramsay 76.9 62.0 73.4 37.5 119.9 13 5
|
||||
LCS Steel AX 75.2 62.7 62.0 44.4 119.2 12.2 5
|
||||
4739AX 73.9 62.8 67.6 43.2 110.8 13.3 4
|
||||
WB4510CLP 72.3 63.2 64.1 34.5 118.4 12.2 6
|
||||
LCS Julep 69.9 63.2 62.8 39.1 107.9 13.4 5
|
||||
Sawfly: Locations
|
||||
1-2 = Excellent 2023 — Conrad and Fort Benton, MT
|
||||
3-4 = Very Good 2024 — Billings, Chester, Conrad, and Fort Benton, MT
|
||||
5 = Good 2025 — Billings, Conrad, and Fort Benton, MT
|
||||
6-7 = Fair
|
||||
8-9 = Poor
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,33 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-ne-colorado-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Northeast Colorado Dryland Summary, Three-Year Data",
|
||||
"filename": "NE%20Colorado%202025.pdf",
|
||||
"region": "NE Colorado",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Sunbird",
|
||||
"AP Bigfoot",
|
||||
"SY Wolverine",
|
||||
"AP Solid",
|
||||
"AP Roadrunner",
|
||||
"SY Monument",
|
||||
"WB-Grainfield"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-07/NE%20Colorado%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-07/NE%20Colorado%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 2389,
|
||||
"fetched_at": "2026-05-25T19:11:01.256427+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,58 @@
|
||||
# 2025 Northeast Colorado Dryland Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** NE Colorado
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-07/NE%20Colorado%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Sunbird, AP Bigfoot, SY Wolverine, AP Solid, AP Roadrunner, SY Monument, WB-Grainfield
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Northeast Colorado Dryland Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Crook, Julesburg, Yuma,
|
||||
Variety (2023-2025) (2024-2025) (2025) CO CO* CO
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A
|
||||
AP Sunbird 81.5 59.5 82.7 60.0 86.5 57.9 61.6 102.9 95.1
|
||||
AP24 AX 79.2 56.8 79.8 57.1 86.7 54.9 63.7 97.9 98.5
|
||||
AP18 AX 78.5 57.2 78.7 57.4 85.0 55.2 59.5 101.2 94.3
|
||||
AP Bigfoot 76.5 59.0 76.6 58.9 79.0 56.6 60.4 86.6 90.0
|
||||
SY Wolverine 75.5 59.1 77.1 59.7 80.5 57.2 61.9 91.0 88.5
|
||||
AP Solid 73.3 59.2 73.9 59.4 76.9 56.4 62.4 86.0 82.4
|
||||
AP Roadrunner 72.3 56.8 73.6 56.7 80.1 54.6 63.9 90.5 86.0
|
||||
SY Monument 67.5 58.2 65.4 58.6 70.3 56.7 62.4 67.7 80.7
|
||||
AG Golden 78.3 56.4 79.0 56.7 86.1 54.9 71.2 93.6 93.4
|
||||
Langin 78.2 57.4 81.4 57.7 86.7 55.8 62.6 95.5 102.0
|
||||
WB4422 77.3 59.5 78.4 59.9 83.2 57.7 65.8 90.7 92.9
|
||||
KS Dallas 76.3 59.7 72.9 59.5 76.0 57.4 55.0 82.3 90.8
|
||||
WB4595 75.8 60.2 76.9 60.2 81.4 57.3 63.7 90.1 90.4
|
||||
High Country 75.4 59.2 75.1 59.1 79.2 57.0 60.4 89.6 87.7
|
||||
Amplify SF 73.7 58.8 72.1 59.0 73.8 56.5 61.6 77.7 82.2
|
||||
KS Hamilton 72.2 58.5 71.1 58.5 76.3 56.1 54.5 78.0 96.3
|
||||
TAM 115 65.5 60.4 63.8 60.2 70.5 58.3 49.8 85.0 76.6
|
||||
Kivari AX 78.1 57.7 83.3 55.9 63.3 87.1 99.5
|
||||
WB-Grainfield 86.3 57.3 59.0 99.8 100.0
|
||||
Canvas 85.1 55.9 67.4 94.4 93.7
|
||||
KS Bill Snyder 80.4 56.9 63.2 92.9 85.3
|
||||
KS Mako 78.4 57.2 58.1 84.6 92.3
|
||||
Mean General 75.2 58.6 75.5 58.7 79.5 56.4 60.4 88.8 89.4
|
||||
LSD General (5%) EE 5.5 1.2 6.2 1.5 9.6 0.0 8.3 16.0 13.0
|
||||
CV (Effective) 8.1 1.9 8.6 2.0 9.9 2.4 8.3 11.0 8.9
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
* Location was affected by a Wheat Streak Mosaic Virus infestation, which resulted in reduced yield of susceptible varieties.
|
||||
Locations
|
||||
2023 — Julesburg and Yuma, CO; Colby, KS
|
||||
2024 — Crook and Julesburg, CO; Ingalls, KS
|
||||
2025 — Crook, Julesburg, and Yuma, CO
|
||||
© 2024 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,33 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-plains-irrigated-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Plains Irrigated Summary, Three-Year Data",
|
||||
"filename": "Plains%20Irrigated%202025.pdf",
|
||||
"region": "Plains Irrigated",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"SY Wolverine",
|
||||
"AP Sunbird",
|
||||
"AP Prolific",
|
||||
"AP Bigfoot",
|
||||
"SY Grit",
|
||||
"AP Roadrunner",
|
||||
"SY Monument"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-07/Plains%20Irrigated%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-07/Plains%20Irrigated%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 2164,
|
||||
"fetched_at": "2026-05-25T19:11:03.219401+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,52 @@
|
||||
# 2025 Plains Irrigated Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Plains Irrigated
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-07/Plains%20Irrigated%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** SY Wolverine, AP Sunbird, AP Prolific, AP Bigfoot, SY Grit, AP Roadrunner, SY Monument
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Plains Irrigated Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Hugoton, Scott City, Imperial, Dalhart,
|
||||
Variety (2023-2025) (2024-2025) (2025) KS KS NE TX
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A Bu/A
|
||||
SY Wolverine 95.0 57.6 94.8 58.6 94.5 59.9 99.9 102.4 95.4 80.3
|
||||
AP Sunbird 93.9 58.8 94.7 59.1 89.5 59.7 74.8 107.8 96.8 78.5
|
||||
AP Prolific 93.6 59.8 92.6 60.1 92.3 61.3 104.9 95.7 92.6 76.0
|
||||
AP Bigfoot 90.9 59.1 90.5 59.2 87.3 59.7 67.6 111.3 91.7 78.7
|
||||
SY Grit 89.2 58.1 90.9 58.7 91.8 58.9 96.4 94.8 96.9 79.0
|
||||
AP Roadrunner 88.7 56.7 89.3 57.2 81.7 57.1 57.6 108.2 83.8 77.1
|
||||
SY Monument 83.8 57.1 83.0 57.7 79.3 59.5 77.3 79.6 83.7 76.6
|
||||
WB4422 94.9 59.2 95.8 60.0 90.9 61.1 107.0 98.5 89.0 69.1
|
||||
TAM 114 91.7 60.7 93.6 61.2 89.1 61.7 77.6 107.9 92.5 78.3
|
||||
Canvas 89.6 58.0 92.2 59.2 84.5 59.4 68.1 101.9 86.1 81.7
|
||||
WB4792 89.3 56.8 92.1 58.0 88.3 57.9 94.6 104.7 73.0 80.8
|
||||
Epoch 88.9 58.4 88.5 58.9 88.6 59.6 101.9 97.4 87.0 68.0
|
||||
Langin 88.5 58.8 90.8 59.4 84.9 60.1 62.9 99.3 91.9 85.5
|
||||
TAM 115 76.3 59.2 76.2 59.7 76.5 62.0 63.8 93.4 75.4 73.5
|
||||
WB4523 93.5 58.4 90.9 111.5 93.4 78.1
|
||||
WB4303 93.1 57.4 102.9 93.8 95.7 80.2
|
||||
KS Mako 90.1 60.7 88.9 101.2 96.8 73.3
|
||||
Mean General 88.9 57.8 90.0 58.5 87.5 59.2 85.9 99.3 87.6 77.1
|
||||
LSD General (5%) EE 8.0 1.8 9.2 1.9 13.2 2.3 18.8 9.4 8.8 8.6
|
||||
CV (Effective) 10.2 3.5 10.6 3.0 8.5 2.6 13.4 5.8 6.1 6.8
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
Locations
|
||||
2023 — Ingalls, KS; Imperial, NE
|
||||
2024 — Dalhart, TX; Hugoton and Ingalls, KS; Imperial, NE
|
||||
2025 — Hugoton and Scott City, KS; Imperial, NE; Dalhart, TX
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,34 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-sc-ks-nc-ok-2024-0",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2024 South-Central Kansas, North-Central Oklahoma Summary, Three-Year Data",
|
||||
"filename": "SC%20KS%20NC%20OK%202024_0.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2024,
|
||||
"years_covered": [
|
||||
2024
|
||||
],
|
||||
"varieties_found": [
|
||||
"SY Monument",
|
||||
"AP Prolific",
|
||||
"AP Roadrunner",
|
||||
"SY Wolverine",
|
||||
"AP Sunbird",
|
||||
"AP EverRock",
|
||||
"AP Bigfoot",
|
||||
"LCS Atomic AX"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2024-07/SC%20KS%20NC%20OK%202024_0.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2024-07/SC%20KS%20NC%20OK%202024_0.pdf"
|
||||
],
|
||||
"page_text_chars": 2157,
|
||||
"fetched_at": "2026-05-25T19:11:08.283651+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,54 @@
|
||||
# 2024 South-Central Kansas, North-Central Oklahoma Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2024
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2024-07/SC%20KS%20NC%20OK%202024_0.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** SY Monument, AP Prolific, AP Roadrunner, SY Wolverine, AP Sunbird, AP EverRock, AP Bigfoot, LCS Atomic AX
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2024 South-Central Kansas, North-Central Oklahoma Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2022-2024
|
||||
3-Yr Combined 2-Yr Combined Combined Conway Pratt, Carrier,
|
||||
Variety (2022-2024) (2023-2024) (2024) Springs, KS KS OK
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A
|
||||
AP24 AX 57.6 62.7 54.1 62.4 62.6 63.7 75.2 36.5 76.2
|
||||
SY Monument 57.5 63.7 56.9 63.7 63.2 64.8 63.8 46.9 79.0
|
||||
AP Prolific 57.3 64.1 55.7 63.7 61.8 64.7 61.5 42.5 81.4
|
||||
AP Roadrunner 56.4 61.4 54.0 60.6 59.0 61.0 63.8 34.8 78.3
|
||||
SY Wolverine 55.8 64.5 52.8 64.2 58.7 65.6 56.6 44.6 75.0
|
||||
AP18 AX 55.4 63.0 52.8 62.9 62.2 64.2 70.7 37.3 78.5
|
||||
AP Sunbird 53.7 64.0 52.0 64.1 62.0 66.1 69.5 45.5 70.9
|
||||
Bob Dole 53.6 62.9 53.5 62.5 60.6 63.6 62.0 40.1 79.7
|
||||
AP EverRock 52.4 63.6 50.8 63.2 58.4 64.6 60.0 37.5 77.7
|
||||
AP Bigfoot 52.0 63.9 51.3 63.9 61.0 65.6 61.7 45.3 76.0
|
||||
WB4401 53.6 63.7 52.3 64.4 59.7 65.9 63.2 31.7 84.1
|
||||
Showdown 59.7 64.4 69.7 65.6 77.0 47.2 85.1
|
||||
Rockstar 56.5 62.4 64.2 63.1 62.9 49.3 80.4
|
||||
KS Providence 56.0 63.4 64.4 64.4 71.1 38.7 83.4
|
||||
LCS Atomic AX 54.3 64.8 62.3 66.2 65.8 42.9 78.2
|
||||
WB4523 52.9 63.7 60.5 65.0 56.2 39.9 85.4
|
||||
KS Hatchett 52.9 63.5 60.8 64.5 65.8 39.5 77.0
|
||||
WB4422 61.3 64.6 63.5 43.5 77.0
|
||||
KS Mako 59.0 65.5 58.7 49.9 68.3
|
||||
Mean General 55.0 63.4 53.8 63.5 59.7 64.0 62.2 41.2 75.7
|
||||
LSD General (5%) EE NS 1.4 NS 1.9 8.5 3.7 7.7 8.6 9.3
|
||||
CV (Effective) 7.9 1.7 8.7 2.0 8.8 2.2 7.6 12.8 7.5
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
NS = Non Significant
|
||||
Locations
|
||||
2022 — Conway Springs and Partridge, KS; Carrier, OK
|
||||
2023 — Conway Springs, KS; Carrier, OK
|
||||
2024 — Conway Springs and Pratt, KS; Carrier, OK
|
||||
© 2024 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,36 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-south-dakota-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 South Dakota Winter Wheat Summary, Three-Year Data",
|
||||
"filename": "South%20Dakota%202025.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Prolific",
|
||||
"AP Sunbird",
|
||||
"AP Bigfoot",
|
||||
"SY",
|
||||
"SY Wolverine",
|
||||
"SY WOLF",
|
||||
"SY Monument",
|
||||
"AP Clair",
|
||||
"AP Solid",
|
||||
"LCS Helix AX"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-08/South%20Dakota%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-08/South%20Dakota%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 1916,
|
||||
"fetched_at": "2026-05-25T19:11:14.252920+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,52 @@
|
||||
# 2025 South Dakota Winter Wheat Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-08/South%20Dakota%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Prolific, AP Sunbird, AP Bigfoot, SY, SY Wolverine, SY WOLF, SY Monument, AP Clair, AP Solid, LCS Helix AX
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 South Dakota Winter Wheat Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Agar, Presho,
|
||||
Variety (2023-2025) (2024-2025) (2025) SD SD
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A
|
||||
AP18 AX 57.3 59.3 64.7 58.6 57.2 54.2 52.5 61.9
|
||||
AP24 AX 56.8 59.0 64.2 57.8 61.2 53.9 59.9 62.5
|
||||
AP Prolific 56.1 59.5 62.9 58.8 58.6 54.6 56.1 61.1
|
||||
AP Sunbird 55.9 59.5 58.5 59.1 58.4 56.1 56.8 59.9
|
||||
AP Bigfoot 55.6 59.6 62.9 58.9 64.1 55.9 62.2 66.1
|
||||
SY 517 CL2 52.5 60.8 56.6 60.3 59.9 57.8 60.0 59.7
|
||||
SY Wolverine 52.3 58.5 56.9 58.5 53.5 55.4 47.5 59.5
|
||||
SY WOLF 51.4 59.1 55.5 57.9 53.5 53.7 49.4 57.6
|
||||
SY Monument 50.1 57.3 56.1 55.9 56.6 51.9 54.7 58.6
|
||||
AP Clair 49.2 59.7 57.8 59.5 52.9 56.2 50.6 55.1
|
||||
AP Solid 54.7 56.8 47.8 61.7
|
||||
WB4422 58.2 59.9 63.8 58.4 65.3 55.5 63.5 67.1
|
||||
LCS Helix AX 57.2 61.1 66.0 60.6 65.2 57.9 65.0 65.3
|
||||
Winner 56.5 60.5 66.7 59.7 65.9 57.0 68.7 63.1
|
||||
SD Andes 55.7 60.6 60.9 59.8 54.2 55.8 50.3 58.1
|
||||
SD Midland 54.6 59.8 59.4 58.7 54.4 54.1 49.7 59.0
|
||||
Kivari AX 54.3 56.8 50.7 53.2 44.5 56.9
|
||||
SD Pheasant 54.1 56.4 49.7 58.5
|
||||
Mean General 54.6 59.6 60.4 58.5 57.7 55.2 55.2 60.3
|
||||
LSD General (5%) EE 5.4 1.2 7.7 1.5 7.4 1.8 13.0 6.1
|
||||
CV (Effective) 10.3 1.8 8.7 1.9 11.2 2.0 14.3 6.2
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
Locations
|
||||
2023 — Hayes and Ideal, SD
|
||||
2024 — Hayes and Ideal, SD
|
||||
2025 — Agar and Presho, SD
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,40 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-southern-idaho-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Southern Idaho Summary, Three-Year Data",
|
||||
"filename": "Southern%20Idaho%202025.pdf",
|
||||
"region": "Southern Idaho",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Exceed",
|
||||
"AP Olympia",
|
||||
"SY Dayton",
|
||||
"SY Assure",
|
||||
"SY Ovation",
|
||||
"AP Iliad",
|
||||
"SY Raptor",
|
||||
"LCS Shine",
|
||||
"LCS Hulk",
|
||||
"LCS Artdeco",
|
||||
"Norwest Duet",
|
||||
"Norwest Tandem",
|
||||
"LCS Jefe",
|
||||
"LCS Kamiack"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-09/Southern%20Idaho%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-09/Southern%20Idaho%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 2095,
|
||||
"fetched_at": "2026-05-25T19:11:06.237182+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,49 @@
|
||||
# 2025 Southern Idaho Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Southern Idaho
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-09/Southern%20Idaho%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Exceed, AP Olympia, SY Dayton, SY Assure, SY Ovation, AP Iliad, SY Raptor, LCS Shine, LCS Hulk, LCS Artdeco, Norwest Duet, Norwest Tandem, LCS Jefe, LCS Kamiack
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Southern Idaho Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Blackfoot, Nampa, Rupert, Twin Falls,
|
||||
Variety (2023-2025) (2024-2025) (2025) ID ID ID ID
|
||||
Soft White Yield TWT Yield TWT Yield TWT Yield Yield Yield Yield
|
||||
Winter Wheat Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A Bu/A
|
||||
AP Exceed 166.9 62.2 165.3 62.6 177.8 62.1 179.0 176.4 141.9 213.7
|
||||
AP Olympia 164.9 61.0 163.8 61.4 181.2 60.8 183.6 163.9 156.7 220.5
|
||||
SY Dayton 161.5 61.2 160.4 61.4 174.9 60.5 176.4 177.4 146.4 199.5
|
||||
SY Assure 161.5 62.0 158.7 61.9 174.7 61.5 180.2 165.2 155.6 197.7
|
||||
SY Ovation 160.2 61.1 155.7 60.8 169.2 59.9 178.2 175.0 121.7 202.0
|
||||
AP Iliad 160.1 61.4 157.0 61.6 167.9 60.7 177.6 175.6 108.9 209.4
|
||||
SY Raptor 153.1 59.6 150.9 59.7 160.7 58.9 175.5 165.3 130.5 171.5
|
||||
LCS Shine 164.0 61.3 164.6 61.5 170.7 61.0 165.9 160.6 166.9 189.5
|
||||
LCS Hulk 161.7 62.6 159.9 62.8 171.1 62.8 164.2 181.5 146.9 192.0
|
||||
LCS Artdeco 157.6 61.5 156.5 61.5 164.2 60.9 164.6 182.7 131.1 178.3
|
||||
Norwest Duet 154.8 61.2 155.1 61.5 166.2 61.1 170.4 159.8 139.6 194.9
|
||||
Norwest Tandem 152.2 60.8 149.6 61.2 155.0 60.3 151.6 152.2 121.8 194.2
|
||||
LCS Jefe 165.5 62.6 180.8 61.8 174.2 160.8 169.7 218.5
|
||||
LCS Kamiack 159.8 61.9 174.9 61.5 169.3 176.8 143.5 209.9
|
||||
Mean General 160.9 61.4 159.4 61.6 170.9 61.0 171.0 169.4 141.4 201.8
|
||||
LSD General (5%) EE ns 1.1 ns 1.1 14.1 1.3 12.5 ns ns ns
|
||||
CV (Effective) 6.9 2.1 7.2 1.8 8.4 1.7 3.5 5.5 14.5 7.6
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
Locations
|
||||
2023 — Twin Falls, ID
|
||||
2024 — Blackfoot and Twin Falls, ID
|
||||
2025 — Blackfoot, Nampa, Rupert, and Twin Falls, ID
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,40 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-washington-n-idaho-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Washington/Northern Idaho Summary, Three-Year Data",
|
||||
"filename": "Washington%3AN%20Idaho%202025.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"SY Raptor",
|
||||
"AP Olympia",
|
||||
"AP Exceed",
|
||||
"SY Ovation",
|
||||
"SY Dayton",
|
||||
"AP Iliad",
|
||||
"SY Assure",
|
||||
"Norwest Duet",
|
||||
"LCS Artdeco",
|
||||
"LCS Shine",
|
||||
"LCS Hulk",
|
||||
"Norwest Tandem",
|
||||
"LCS Kamiack",
|
||||
"LCS Jefe"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-08/Washington%3AN%20Idaho%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-08/Washington%3AN%20Idaho%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 1892,
|
||||
"fetched_at": "2026-05-25T19:11:05.110770+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,48 @@
|
||||
# 2025 Washington/Northern Idaho Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-08/Washington%3AN%20Idaho%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** SY Raptor, AP Olympia, AP Exceed, SY Ovation, SY Dayton, AP Iliad, SY Assure, Norwest Duet, LCS Artdeco, LCS Shine, LCS Hulk, Norwest Tandem, LCS Kamiack, LCS Jefe
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Washington/Northern Idaho Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Genesee, Walla Walla,
|
||||
Variety (2023-2025) (2024-2025) (2025) ID WA
|
||||
Soft White Yield TWT Yield TWT Yield TWT Yield Yield
|
||||
Winter Wheat Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A
|
||||
SY Raptor 146.1 63.0 157.6 64.8 156.9 65.0 157.4 156.3
|
||||
AP Olympia 142.2 64.2 154.3 66.1 151.1 65.8 159.6 142.5
|
||||
AP Exceed 141.8 63.6 155.3 65.4 152.6 65.3 160.0 145.2
|
||||
SY Ovation 140.8 63.3 148.7 65.4 147.4 65.7 154.9 139.8
|
||||
SY Dayton 140.0 63.6 158.4 65.6 157.7 65.8 172.3 143.1
|
||||
AP Iliad 139.8 63.4 151.0 65.3 147.0 65.2 149.4 144.6
|
||||
SY Assure 138.7 64.3 152.0 66.3 150.9 66.0 149.6 152.1
|
||||
Norwest Duet 144.1 62.3 157.4 64.7 155.3 65.0 157.5 153.1
|
||||
LCS Artdeco 140.3 62.6 155.5 64.4 157.5 64.8 160.9 154.0
|
||||
LCS Shine 139.3 63.1 148.6 64.7 150.7 65.2 146.1 155.4
|
||||
LCS Hulk 138.4 63.9 152.4 65.8 156.6 65.9 167.2 146.0
|
||||
Norwest Tandem 137.0 62.7 152.9 65.0 157.3 65.3 166.1 148.5
|
||||
LCS Kamiack 165.0 66.5 174.3 66.2 188.8 159.7
|
||||
LCS Jefe 160.1 65.2 167.4 66.0 178.9 155.9
|
||||
Mean General 142.5 63.3 156.7 65.3 157.3 65.5 162.0 152.7
|
||||
LSD General (5%) EE 8.6 1.0 ns 1.4 ns ns ns ns
|
||||
CV (Effective) 6.1 1.6 5.4 1.1 7.5 1.1 8.4 6.5
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
Locations
|
||||
2023 — Craigmont and Genesee, ID; Moses Lake and Walla Walla, WA
|
||||
2024 — Craigmont, ID; Walla Walla, WA
|
||||
2025 — Genesee, ID; Walla Walla, WA
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,33 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-western-plains-dryland-2025-0",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Western Plains Dryland Summary, Three-Year Data",
|
||||
"filename": "Western%20Plains%20Dryland%202025_0.pdf",
|
||||
"region": "Western Plains",
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Sunbird",
|
||||
"AP Bigfoot",
|
||||
"SY Wolverine",
|
||||
"AP Roadrunner",
|
||||
"AP Solid",
|
||||
"SY Monument",
|
||||
"WB-Grainfield"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-07/Western%20Plains%20Dryland%202025_0.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-07/Western%20Plains%20Dryland%202025_0.pdf"
|
||||
],
|
||||
"page_text_chars": 2395,
|
||||
"fetched_at": "2026-05-25T19:11:02.169343+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,58 @@
|
||||
# 2025 Western Plains Dryland Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Region:** Western Plains
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-07/Western%20Plains%20Dryland%202025_0.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Sunbird, AP Bigfoot, SY Wolverine, AP Roadrunner, AP Solid, SY Monument, WB-Grainfield
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Western Plains Dryland Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Crook, Julesburg, Yuma,
|
||||
Variety (2023-2025) (2024-2025) (2025) CO CO* CO
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A
|
||||
AP18 AX 74.2 57.6 75.8 57.8 85.0 55.2 59.5 101.2 94.3
|
||||
AP Sunbird 74.1 59.5 77.2 59.8 86.5 57.9 61.6 102.9 95.1
|
||||
AP24 AX 73.2 56.7 74.0 56.8 86.7 54.9 63.7 97.9 98.5
|
||||
AP Bigfoot 71.2 58.9 73.0 58.9 79.0 56.6 60.4 86.6 90.0
|
||||
SY Wolverine 68.9 59.1 71.4 59.5 80.5 57.2 61.9 91.0 88.5
|
||||
AP Roadrunner 67.6 56.0 70.3 55.7 80.1 54.6 63.9 90.5 86.0
|
||||
AP Solid 66.4 58.4 67.7 58.4 76.9 56.4 62.4 86.0 82.4
|
||||
SY Monument 61.7 57.8 60.1 57.9 70.3 56.7 62.4 67.7 80.7
|
||||
Langin 72.9 57.3 77.3 57.6 86.7 55.8 62.6 95.5 102.0
|
||||
WB4422 70.6 59.1 73.8 59.4 83.2 57.7 65.8 90.7 92.9
|
||||
AG Golden 70.3 55.5 72.3 55.5 86.1 54.9 71.2 93.6 93.4
|
||||
High Country 69.4 59.1 72.0 58.9 79.2 57.0 60.4 89.6 87.7
|
||||
WB4595 69.2 59.8 72.2 59.6 81.4 57.3 63.7 90.1 90.4
|
||||
KS Dallas 68.3 59.1 66.5 58.7 76.0 57.4 55.0 82.3 90.8
|
||||
Amplify SF 66.8 57.9 66.4 57.8 73.8 56.5 61.6 77.7 82.2
|
||||
KS Hamilton 66.6 58.3 68.5 58.3 76.3 56.1 54.5 78.0 96.3
|
||||
TAM 115 60.7 60.1 60.1 59.8 70.5 58.3 49.8 85.0 76.6
|
||||
Kivari AX 73.7 57.0 83.3 55.9 63.3 87.1 99.5
|
||||
WB-Grainfield 86.3 57.3 59.0 99.8 100.0
|
||||
Canvas 85.1 55.9 67.4 94.4 93.7
|
||||
KS Bill Snyder 80.4 56.9 63.2 92.9 85.3
|
||||
KS Mako 78.4 57.2 58.1 84.6 92.3
|
||||
Mean General 69.1 58.3 71.0 58.2 79.5 56.4 60.4 88.8 89.4
|
||||
LSD General (5%) EE 5.1 1.4 6.2 1.7 9.6 ns 8.3 16.0 13.0
|
||||
CV (Effective) 9.0 2.4 9.4 2.5 9.9 2.4 8.3 11.0 8.9
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
* Location was affected by a Wheat Streak Mosaic Virus infestation, which resulted in reduced yield of susceptible varieties.
|
||||
Locations
|
||||
2023 — Julesburg and Yuma, CO; Colby, KS
|
||||
2024 — Crook and Julesburg, CO; Ingalls, KS
|
||||
2025 — Crook, Julesburg, and Yuma, CO
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
@@ -0,0 +1,33 @@
|
||||
{
|
||||
"source": "agripro_trials",
|
||||
"source_key": "agt-wheat-after-soy-2025",
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": "2025 Wheat Following Soybeans Summary, Three-Year Data",
|
||||
"filename": "Wheat%20after%20Soy%202025.pdf",
|
||||
"region": null,
|
||||
"wheat_class_section": null,
|
||||
"year": 2025,
|
||||
"years_covered": [
|
||||
2025
|
||||
],
|
||||
"varieties_found": [
|
||||
"AP Sunbird",
|
||||
"AP Roadrunner",
|
||||
"SY Wolverine",
|
||||
"AP Bigfoot",
|
||||
"AP Prolific",
|
||||
"SY Monument",
|
||||
"LCS Atomic AX"
|
||||
],
|
||||
"pdf_url": "https://agriprowheat.com/sites/default/files/2025-07/Wheat%20after%20Soy%202025.pdf",
|
||||
"source_urls": [
|
||||
"https://agriprowheat.com/trials-data",
|
||||
"https://agriprowheat.com/sites/default/files/2025-07/Wheat%20after%20Soy%202025.pdf"
|
||||
],
|
||||
"page_text_chars": 2367,
|
||||
"fetched_at": "2026-05-25T19:11:07.269403+00:00",
|
||||
"scraper_version": "0.1.0"
|
||||
}
|
||||
@@ -0,0 +1,55 @@
|
||||
# 2025 Wheat Following Soybeans Summary, Three-Year Data
|
||||
|
||||
- **Source:** AgriPro (Syngenta) regional trial PDF
|
||||
- **Vendor:** Syngenta
|
||||
- **Brand:** AgriPro
|
||||
- **Crop:** Wheat
|
||||
- **Data type:** trial
|
||||
- **Year:** 2025
|
||||
- **PDF:** https://agriprowheat.com/sites/default/files/2025-07/Wheat%20after%20Soy%202025.pdf
|
||||
- **Index page:** https://agriprowheat.com/trials-data
|
||||
- **Varieties listed:** AP Sunbird, AP Roadrunner, SY Wolverine, AP Bigfoot, AP Prolific, SY Monument, LCS Atomic AX
|
||||
|
||||
---
|
||||
|
||||
## Trial data (verbatim from PDF)
|
||||
|
||||
```
|
||||
2025 Wheat Following Soybeans Summary, Three-Year Data
|
||||
Syngenta Commercial Variety Wheat Performance Test, 2023-2025
|
||||
3-Yr Combined 2-Yr Combined Combined Belleville, Junction City, Salina,
|
||||
Variety (2023-2025) (2024-2025) (2025) KS* KS KS
|
||||
Hard Winter Wheat Yield TWT Yield TWT Yield TWT Yield Yield Yield
|
||||
Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Lb/Bu Bu/A Bu/A Bu/A
|
||||
AP24 AX 66.1 58.8 76.6 59.0 78.7 55.3 64.8 94.7 76.7
|
||||
AP Sunbird 62.3 61.0 70.7 60.9 70.7 57.7 59.0 86.7 66.5
|
||||
AP Roadrunner 62.2 57.5 68.5 56.9 64.1 53.4 54.1 79.7 58.5
|
||||
SY Wolverine 61.0 60.3 66.9 60.1 65.4 56.5 58.2 86.2 51.9
|
||||
AP Bigfoot 60.3 61.3 67.5 61.2 68.6 58.7 53.1 86.6 65.9
|
||||
AP Prolific 59.2 60.0 66.3 60.0 67.2 57.9 46.0 86.2 69.2
|
||||
SY Monument 58.4 59.3 63.9 59.8 59.5 56.1 48.2 71.6 58.6
|
||||
Bob Dole 56.7 58.5 63.0 58.1 60.0 54.6 47.0 68.3 64.6
|
||||
Showdown 62.8 59.6 71.9 59.6 68.7 55.6 56.4 83.0 66.6
|
||||
KS Providence 61.9 60.4 70.0 60.4 67.0 56.8 52.5 88.9 59.7
|
||||
Rockstar 61.0 58.1 70.0 58.5 68.7 55.6 54.0 83.8 68.3
|
||||
WB4401 59.5 59.5 66.7 59.2 63.0 55.1 51.7 78.8 58.5
|
||||
LCS Atomic AX 59.1 61.3 67.7 61.2 70.2 58.7 52.3 88.0 70.1
|
||||
WB4523 57.4 58.9 63.9 59.0 56.7 55.5 40.6 68.2 61.2
|
||||
WB4422 72.5 58.6 73.5 55.2 62.5 87.3 70.6
|
||||
KS Bill Snyder 78.4 58.8 59.3 94.6 81.2
|
||||
WB4699 66.4 55.2 51.0 84.2 64.1
|
||||
Polansky Goldenhawk 65.3 56.1 52.7 86.5 56.8
|
||||
High Cotton 65.3 57.4 43.8 90.9 61.2
|
||||
Doublestop CLP 63.8 56.9 53.9 74.0 63.4
|
||||
Mean General 60.6 59.7 68.4 59.5 66.9 56.3 51.7 83.6 65.4
|
||||
LSD General (5%) EE 4.6 1.4 5.6 1.7 9.2 0.0 9.5 10.7 9.9
|
||||
CV (Effective) 8.6 2.5 8.3 2.7 9.2 3.8 11.2 7.8 9.2
|
||||
Boldfaced numbers are within confidence interval at specific locations and combined years of yield data.
|
||||
* Location was affected by a Wheat Streak Mosaic Virus infestation, which resulted in reduced yield of susceptible varieties.
|
||||
Locations
|
||||
2023 — Belleville, Conway Springs, Junction City, and Salina, KS
|
||||
2024 — Belleville, Conway Springs, Junction City, and Salina, KS
|
||||
2025 — Belleville, Conway Springs, Junction City, and Salina, KS
|
||||
© 2025 Syngenta. All rights reserved. Reproduction expressly prohibited without written permission. Some or all of the varieties may be protected under one or more of the following: Plant Variety Protection, United
|
||||
States Plant Patents and/or Utility Patents and may not be propagated or reproduced without authorization. AgriPro® and the Syngenta logo are trademarks of a Syngenta Group Company.
|
||||
```
|
||||
+242
-1
@@ -201,9 +201,15 @@ def _build_where(
|
||||
vendor: str | None,
|
||||
source: str | None,
|
||||
source_key: str | None,
|
||||
*,
|
||||
data_type: str | None = None,
|
||||
state: str | None = None,
|
||||
year: int | None = None,
|
||||
) -> dict | None:
|
||||
"""Translate filter args into a Chroma `where` clause."""
|
||||
conds: list[dict] = []
|
||||
if data_type:
|
||||
conds.append({"data_type": data_type})
|
||||
if crop:
|
||||
conds.append({"crop": crop.lower()})
|
||||
if brand:
|
||||
@@ -214,6 +220,10 @@ def _build_where(
|
||||
conds.append({"source": source})
|
||||
if source_key:
|
||||
conds.append({"source_key": source_key})
|
||||
if state:
|
||||
conds.append({"state": state.upper() if len(state) <= 3 else state})
|
||||
if year:
|
||||
conds.append({"year": int(year)})
|
||||
if not conds:
|
||||
return None
|
||||
if len(conds) == 1:
|
||||
@@ -460,7 +470,11 @@ def search_docs(
|
||||
"query": query, "crop": crop, "brand": brand,
|
||||
"vendor": vendor, "source": source, "k": k,
|
||||
}) as _call:
|
||||
where = _build_where(crop, brand, vendor, source, None)
|
||||
# Variety-search default: filter to data_type=variety so trial
|
||||
# documents (yield trials) don't pollute identity-focused
|
||||
# results. To search trials, use search_trials().
|
||||
where = _build_where(crop, brand, vendor, source, None,
|
||||
data_type="variety")
|
||||
pool_size = max(k * 3, RERANK_POOL)
|
||||
|
||||
# Exact-code pre-filter. Variety codes ("DKC62-08RIB", "AG29XF4")
|
||||
@@ -745,6 +759,233 @@ def lookup_variety(
|
||||
return "\n".join(out)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
def search_trials(
|
||||
query: Annotated[str, Field(description=(
|
||||
"Natural-language query about yield trials. Mention crop, "
|
||||
"region or state, year, soil/conditions, and any specific "
|
||||
"variety codes you want compared. Examples: "
|
||||
"'best corn hybrid 2024 Iowa heavy clay'; "
|
||||
"'AP Iliad yield Idaho stripe rust'; "
|
||||
"'DKC65-20 vs NK1748 head to head Alabama 2023'."
|
||||
))],
|
||||
crop: Annotated[
|
||||
str | None,
|
||||
Field(description="OPTIONAL: corn, soybeans, silage, or wheat."),
|
||||
] = None,
|
||||
state: Annotated[
|
||||
str | None,
|
||||
Field(description=(
|
||||
"OPTIONAL state filter. 2-letter abbrev (IA, IL, NE...) "
|
||||
"for Golden Harvest plot reports; full or partial region "
|
||||
"name (e.g. 'Pacific Northwest', 'Montana') for AgriPro "
|
||||
"trial PDFs."
|
||||
)),
|
||||
] = None,
|
||||
year: Annotated[
|
||||
int | None,
|
||||
Field(description="OPTIONAL year filter (e.g. 2024).", ge=2010, le=2030),
|
||||
] = None,
|
||||
product: Annotated[
|
||||
str | None,
|
||||
Field(description=(
|
||||
"OPTIONAL variety/hybrid filter — substring match against "
|
||||
"the product field. Example: 'DKC62' surfaces trials "
|
||||
"containing any DKC62-* hybrid."
|
||||
)),
|
||||
] = None,
|
||||
k: Annotated[int, Field(description="Number of results to return.", ge=1, le=50)] = 10,
|
||||
) -> str:
|
||||
"""Search yield-trial data — head-to-head results from real field
|
||||
trials. SEPARATE from variety-identity search.
|
||||
|
||||
Use this when the user wants to know HOW PRODUCTS PERFORMED, not
|
||||
what they ARE. Trial data complements `search_docs`:
|
||||
|
||||
* `search_docs` answers: "What's the disease resistance profile
|
||||
of DKC62-08RIB?" (variety identity)
|
||||
* `search_trials` answers: "Which corn hybrid actually won the
|
||||
yield trials in central Iowa in 2024?" (performance data)
|
||||
|
||||
Data sources:
|
||||
|
||||
* **Golden Harvest plot reports** (4,000+ trials) — per-site
|
||||
head-to-head comparing products from MULTIPLE BRANDS at one
|
||||
cooperator's field. NK, DEKALB, Golden Harvest, sometimes
|
||||
others all compete at the same site. Cross-vendor data Bayer
|
||||
itself doesn't publish.
|
||||
* **AgriPro regional trial PDFs** (~14 PDFs) — multi-year
|
||||
multi-location wheat performance for Northern Plains / PNW /
|
||||
Plains regions.
|
||||
|
||||
A typical workflow: call this to identify top performers in a
|
||||
region/year, then call `lookup_variety(source_key=...)` on the
|
||||
leaders to verify identity details (RM, traits, disease ratings).
|
||||
"""
|
||||
with TimedCall("search_trials", {
|
||||
"query": query, "crop": crop, "state": state, "year": year,
|
||||
"product": product, "k": k,
|
||||
}) as _call:
|
||||
where = _build_where(
|
||||
crop, None, None, None, None,
|
||||
data_type="trial",
|
||||
state=state,
|
||||
year=year,
|
||||
)
|
||||
pool_size = max(k * 3, RERANK_POOL)
|
||||
|
||||
try:
|
||||
col = _collection()
|
||||
except Exception as exc: # noqa: BLE001
|
||||
_call.set(error_dense=str(exc), hits_returned=0)
|
||||
return (
|
||||
"_(retrieval unavailable — Chroma collection not found. "
|
||||
"Has the indexer run? `python -m rag.index --rebuild`.)_"
|
||||
)
|
||||
|
||||
# If a product filter is set, augment the query with the
|
||||
# product code so BM25 + dense both have signal.
|
||||
full_query = query
|
||||
if product:
|
||||
full_query = f"{query} {product}"
|
||||
|
||||
try:
|
||||
dense = col.query(
|
||||
query_texts=[full_query],
|
||||
n_results=pool_size,
|
||||
where=where,
|
||||
)
|
||||
except Exception as exc: # noqa: BLE001
|
||||
_call.set(error_dense=str(exc), hits_returned=0)
|
||||
return f"_(trial retrieval failed: {exc})_"
|
||||
|
||||
dense_ids: list[str] = (dense.get("ids") or [[]])[0]
|
||||
dense_docs: list[str] = (dense.get("documents") or [[]])[0]
|
||||
dense_metas: list[dict] = (dense.get("metadatas") or [[]])[0]
|
||||
dense_dists: list[float] = (dense.get("distances") or [[]])[0]
|
||||
|
||||
id_to_doc = dict(zip(dense_ids, dense_docs))
|
||||
id_to_meta = dict(zip(dense_ids, dense_metas))
|
||||
id_to_dist = dict(zip(dense_ids, dense_dists))
|
||||
|
||||
used_hybrid = False
|
||||
if HYBRID_SEARCH:
|
||||
bm25 = _bm25_index()
|
||||
if bm25 is not None:
|
||||
bm25_hits = bm25.query(full_query, n=pool_size, where=where)
|
||||
bm25_ids = [h[0] for h in bm25_hits]
|
||||
if bm25_ids:
|
||||
fused = _rrf_fuse([dense_ids, bm25_ids])
|
||||
fuzzy_ids = fused
|
||||
used_hybrid = True
|
||||
else:
|
||||
fuzzy_ids = dense_ids
|
||||
else:
|
||||
fuzzy_ids = dense_ids
|
||||
else:
|
||||
fuzzy_ids = dense_ids
|
||||
|
||||
# Optional product-substring post-filter: if user supplied
|
||||
# ``product``, require the chunk to actually contain the
|
||||
# token. This re-checks the bytes since BM25 only sees stems.
|
||||
if product:
|
||||
needle = product.lower()
|
||||
def _has_product(cid: str) -> bool:
|
||||
doc = id_to_doc.get(cid, "")
|
||||
if needle in doc.lower():
|
||||
return True
|
||||
# Not yet fetched — defer; the get-by-id below will fix.
|
||||
return cid not in id_to_doc
|
||||
|
||||
fuzzy_ids = [cid for cid in fuzzy_ids if _has_product(cid)]
|
||||
|
||||
final_ids: list[str] = []
|
||||
seen: set[str] = set()
|
||||
for cid in fuzzy_ids:
|
||||
if cid in seen:
|
||||
continue
|
||||
seen.add(cid)
|
||||
final_ids.append(cid)
|
||||
if len(final_ids) >= k:
|
||||
break
|
||||
|
||||
missing = [i for i in final_ids if i not in id_to_doc]
|
||||
if missing:
|
||||
try:
|
||||
extra = col.get(ids=missing, include=["documents", "metadatas"])
|
||||
for cid, doc, meta in zip(
|
||||
extra.get("ids") or [],
|
||||
extra.get("documents") or [],
|
||||
extra.get("metadatas") or [],
|
||||
):
|
||||
id_to_doc[cid] = doc
|
||||
id_to_meta[cid] = meta
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.warning("get-by-id for BM25-only hits failed: %s", exc)
|
||||
|
||||
# Apply product filter once we have docs from the get-by-id pass.
|
||||
if product:
|
||||
needle = product.lower()
|
||||
final_ids = [cid for cid in final_ids if needle in id_to_doc.get(cid, "").lower()]
|
||||
|
||||
_call.set(
|
||||
hits_returned=len(final_ids),
|
||||
hybrid=used_hybrid,
|
||||
pool_size=pool_size,
|
||||
data_type="trial",
|
||||
)
|
||||
|
||||
if not final_ids:
|
||||
return (
|
||||
"_(no trials matched. Try widening — drop the state, "
|
||||
"year, or product filter. `list_versions()` shows "
|
||||
"which trial sources are indexed.)_"
|
||||
)
|
||||
|
||||
blocks: list[str] = []
|
||||
for cid in final_ids:
|
||||
doc = id_to_doc.get(cid, "")
|
||||
meta = id_to_meta.get(cid, {})
|
||||
dist = id_to_dist.get(cid) if not used_hybrid else None
|
||||
blocks.append(_format_trial_hit(doc, meta, dist))
|
||||
|
||||
header = (
|
||||
f"# Trial search results — {len(final_ids)} trial document"
|
||||
f"{'s' if len(final_ids) != 1 else ''}"
|
||||
f"{' (dense + BM25 hybrid)' if used_hybrid else ' (dense only)'}\n"
|
||||
f"_Use `get_page(source=..., source_key=...)` to read the "
|
||||
f"full trial body. Use `lookup_variety(source_key=...)` on "
|
||||
f"any product code to verify its identity (RM, traits, "
|
||||
f"disease ratings)._\n\n---\n\n"
|
||||
)
|
||||
return header + "\n---\n\n".join(blocks)
|
||||
|
||||
|
||||
def _format_trial_hit(doc: str, meta: dict, distance: float | None = None) -> str:
|
||||
"""Trial-specific result header. Highlights crop/state/year and
|
||||
sources URL (vs variety hits which emphasize brand + product
|
||||
identity)."""
|
||||
src_url = meta.get("source_url") or ""
|
||||
src_key = meta.get("source_key") or ""
|
||||
src = meta.get("source") or ""
|
||||
crop = meta.get("crop") or ""
|
||||
state = meta.get("state") or ""
|
||||
year = meta.get("year") or ""
|
||||
region = meta.get("region") or ""
|
||||
|
||||
title_bits = [b for b in [crop.title(), region or state, str(year) if year else ""] if b]
|
||||
title = " · ".join(title_bits) if title_bits else src_key
|
||||
|
||||
header = (
|
||||
f"### Trial: {title} \n"
|
||||
f"`{src}::{src_key}` — {meta.get('vendor', '')} / {meta.get('brand', '')} \n"
|
||||
f"<{src_url}>"
|
||||
)
|
||||
if distance is not None:
|
||||
header += f" \n_(distance={distance:.4f})_"
|
||||
return f"{header}\n\n{doc.strip()}\n"
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
def crop_seed_api_lessons(
|
||||
topic: Annotated[
|
||||
|
||||
+18
-3
@@ -42,7 +42,12 @@ DEFAULT_DB_NAME = "crop_seed_docs.db"
|
||||
# Columns we expose as filterable metadata. Mirrors what
|
||||
# ``docs_mcp.server._build_where`` accepts so the same filter dict
|
||||
# works for both Chroma and BM25 without per-retriever translation.
|
||||
FILTER_COLUMNS = ("source", "vendor", "brand", "crop", "source_key", "ordinal")
|
||||
# data_type / year / state / region are trial-specific facets; variety
|
||||
# chunks leave them empty.
|
||||
FILTER_COLUMNS = (
|
||||
"source", "vendor", "brand", "crop", "source_key",
|
||||
"data_type", "year", "state", "ordinal",
|
||||
)
|
||||
|
||||
|
||||
# Allowlist tokenizer for free-text queries. FTS5's parser chokes on
|
||||
@@ -131,8 +136,9 @@ class BM25Index:
|
||||
con.executescript(self._schema_sql())
|
||||
con.executemany(
|
||||
"INSERT INTO chunks_meta "
|
||||
"(id, source, vendor, brand, crop, source_key, ordinal) "
|
||||
"VALUES (?, ?, ?, ?, ?, ?, ?)",
|
||||
"(id, source, vendor, brand, crop, source_key, "
|
||||
" data_type, year, state, ordinal) "
|
||||
"VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
|
||||
[
|
||||
(
|
||||
r["id"],
|
||||
@@ -141,6 +147,9 @@ class BM25Index:
|
||||
r["metadata"].get("brand") or "",
|
||||
r["metadata"].get("crop") or "",
|
||||
r["metadata"].get("source_key") or "",
|
||||
r["metadata"].get("data_type") or "variety",
|
||||
int(r["metadata"]["year"]) if isinstance(r["metadata"].get("year"), int) else None,
|
||||
r["metadata"].get("state") or "",
|
||||
int(r["metadata"].get("ordinal") or 0),
|
||||
)
|
||||
for r in records
|
||||
@@ -216,12 +225,18 @@ class BM25Index:
|
||||
brand TEXT,
|
||||
crop TEXT,
|
||||
source_key TEXT,
|
||||
data_type TEXT,
|
||||
year INTEGER,
|
||||
state TEXT,
|
||||
ordinal INTEGER
|
||||
);
|
||||
CREATE INDEX idx_meta_source ON chunks_meta(source);
|
||||
CREATE INDEX idx_meta_crop ON chunks_meta(crop);
|
||||
CREATE INDEX idx_meta_brand ON chunks_meta(brand);
|
||||
CREATE INDEX idx_meta_source_key ON chunks_meta(source_key);
|
||||
CREATE INDEX idx_meta_data_type ON chunks_meta(data_type);
|
||||
CREATE INDEX idx_meta_year ON chunks_meta(year);
|
||||
CREATE INDEX idx_meta_state ON chunks_meta(state);
|
||||
|
||||
CREATE VIRTUAL TABLE chunks_fts USING fts5(
|
||||
text,
|
||||
|
||||
+253
@@ -253,6 +253,7 @@ def _flat_metadata(sidecar: dict) -> dict:
|
||||
md: dict = {
|
||||
"source": sidecar.get("source") or "",
|
||||
"source_key": sidecar.get("source_key") or "",
|
||||
"data_type": "variety",
|
||||
"vendor": sidecar.get("vendor") or "",
|
||||
"brand": (sidecar.get("brand") or "").upper(),
|
||||
"crop": (sidecar.get("crop") or "").lower(),
|
||||
@@ -304,6 +305,258 @@ def chunks_from_variety(
|
||||
}
|
||||
|
||||
|
||||
# ===========================================================================
|
||||
# Trial chunker — for sidecars with data_type="trial"
|
||||
# ===========================================================================
|
||||
#
|
||||
# Trial documents are a different shape from variety identity:
|
||||
# - GH plot reports: per-site head-to-head yield comparison across brands
|
||||
# - AgriPro trial PDFs: regional multi-year multi-location summary
|
||||
#
|
||||
# Both produce ONE chunk per document with a preamble that emphasizes
|
||||
# the trial's location/year/top performers so the embedder gets clean
|
||||
# signal for queries like "best corn for sandy soil Iowa 2024".
|
||||
|
||||
|
||||
def _render_gh_plot_chunk(sidecar: dict) -> str:
|
||||
"""Render a Golden Harvest plot report (per-site cross-vendor)."""
|
||||
lines: list[str] = []
|
||||
crop = (sidecar.get("crop") or "").lower()
|
||||
crop_label = {"corn": "Corn", "soybeans": "Soybean", "silage": "Silage"}.get(crop, crop.title())
|
||||
state = sidecar.get("state") or sidecar.get("state_abbrev") or ""
|
||||
year = sidecar.get("year") or ""
|
||||
cooperator = sidecar.get("cooperator") or ""
|
||||
|
||||
lines.append(f"# {crop_label} yield trial — {state}, {year}")
|
||||
lines.append("")
|
||||
facts = ["Golden Harvest plot report (cross-vendor)"]
|
||||
if cooperator:
|
||||
facts.append(f"cooperator {cooperator}")
|
||||
if sidecar.get("planted_date"):
|
||||
facts.append(f"planted {sidecar['planted_date']}")
|
||||
if sidecar.get("harvested_date"):
|
||||
facts.append(f"harvested {sidecar['harvested_date']}")
|
||||
if sidecar.get("population_seeds_per_acre"):
|
||||
facts.append(f"population {sidecar['population_seeds_per_acre']:,} seeds/acre")
|
||||
if sidecar.get("row_width_in"):
|
||||
facts.append(f"{sidecar['row_width_in']}\" rows")
|
||||
lines.append(". ".join(facts) + ".")
|
||||
lines.append("")
|
||||
|
||||
results = sidecar.get("results") or []
|
||||
if results:
|
||||
# Pick the primary metric for ranking: corn/soy use "Yield",
|
||||
# silage uses "Ton/Acre". Find the first metric key with a
|
||||
# numeric value in the top result.
|
||||
def _primary(r: dict) -> tuple[str, float | None]:
|
||||
metrics = r.get("metrics") or {}
|
||||
# Back-compat: old sidecars had yield_bu_ac directly.
|
||||
if not metrics and r.get("yield_bu_ac") is not None:
|
||||
return ("Yield", r["yield_bu_ac"])
|
||||
for k in ("Yield", "Ton/Acre", "Tons/Acre"):
|
||||
v = metrics.get(k)
|
||||
if isinstance(v, (int, float)):
|
||||
return (k, v)
|
||||
for k, v in metrics.items():
|
||||
if isinstance(v, (int, float)):
|
||||
return (k, v)
|
||||
return ("", None)
|
||||
|
||||
top = results[: min(5, len(results))]
|
||||
primary_label, _ = _primary(top[0]) if top else ("", None)
|
||||
rendered_top_parts: list[str] = []
|
||||
for i, r in enumerate(top):
|
||||
label, val = _primary(r)
|
||||
piece = f"#{r.get('rank') or i+1} {r.get('brand','?')} {r.get('product','?')}"
|
||||
if r.get('traits'):
|
||||
piece += f" {r['traits']}"
|
||||
if val is not None:
|
||||
piece += f" — {val} {label}"
|
||||
rendered_top_parts.append(piece)
|
||||
if rendered_top_parts:
|
||||
lines.append(
|
||||
f"Top {len(top)} ({crop_label}, {state} {year}): "
|
||||
+ ", ".join(rendered_top_parts) + "."
|
||||
)
|
||||
lines.append("")
|
||||
|
||||
# Discover the metric column order from the first result with metrics.
|
||||
metric_keys: list[str] = []
|
||||
for r in results:
|
||||
metrics = r.get("metrics") or {}
|
||||
if metrics:
|
||||
metric_keys = list(metrics.keys())
|
||||
break
|
||||
# Back-compat: synthesize from legacy fields if no metrics dict.
|
||||
if not metric_keys and any(
|
||||
r.get("yield_bu_ac") is not None for r in results
|
||||
):
|
||||
metric_keys = ["Yield", "%MST", "Test Weight", "Gross Revenue"]
|
||||
|
||||
# Full ranking — preserves every datapoint verbatim.
|
||||
col_headers = ["rank", "brand", "product", "traits"] + metric_keys
|
||||
lines.append("Full ranking (" + " | ".join(col_headers) + "):")
|
||||
for r in results:
|
||||
row = [
|
||||
f"#{r.get('rank') or '-'}",
|
||||
r.get("brand") or "-",
|
||||
r.get("product") or "-",
|
||||
r.get("traits") or "-",
|
||||
]
|
||||
metrics = r.get("metrics") or {}
|
||||
# Back-compat shim
|
||||
if not metrics:
|
||||
metrics = {
|
||||
"Yield": r.get("yield_bu_ac"),
|
||||
"%MST": r.get("mst_pct"),
|
||||
"Test Weight": r.get("test_weight"),
|
||||
"Gross Revenue": r.get("gross_revenue_dol_ac"),
|
||||
}
|
||||
for k in metric_keys:
|
||||
v = metrics.get(k)
|
||||
if v is None:
|
||||
row.append("-")
|
||||
elif isinstance(v, (int, float)):
|
||||
if "Revenue" in k or "$" in k:
|
||||
row.append(f"${v:.2f}")
|
||||
else:
|
||||
row.append(str(v))
|
||||
else:
|
||||
row.append(str(v))
|
||||
lines.append(" " + " | ".join(row))
|
||||
lines.append("")
|
||||
|
||||
urls = sidecar.get("source_urls") or []
|
||||
if urls:
|
||||
lines.append(f"Source: {urls[0]}")
|
||||
return "\n".join(lines).strip() + "\n"
|
||||
|
||||
|
||||
def _render_agripro_trial_chunk(sidecar: dict) -> str:
|
||||
"""Render an AgriPro regional trial PDF — preamble + verbatim text."""
|
||||
lines: list[str] = []
|
||||
title = sidecar.get("title") or sidecar.get("filename") or sidecar.get("source_key", "")
|
||||
lines.append(f"# {title}")
|
||||
lines.append("")
|
||||
|
||||
facts = ["AgriPro / Syngenta regional wheat trial"]
|
||||
if sidecar.get("region"):
|
||||
facts.append(f"region {sidecar['region']}")
|
||||
if sidecar.get("wheat_class_section"):
|
||||
facts.append(f"class {sidecar['wheat_class_section']}")
|
||||
if sidecar.get("years_covered") and len(sidecar["years_covered"]) > 1:
|
||||
yc = sidecar["years_covered"]
|
||||
facts.append(f"years {yc[0]}–{yc[-1]}")
|
||||
elif sidecar.get("year"):
|
||||
facts.append(f"year {sidecar['year']}")
|
||||
lines.append(". ".join(facts) + ".")
|
||||
lines.append("")
|
||||
|
||||
varieties = sidecar.get("varieties_found") or []
|
||||
if varieties:
|
||||
lines.append("Varieties listed: " + ", ".join(varieties) + ".")
|
||||
lines.append("")
|
||||
|
||||
# Verbatim trial data — preserves variety + yield numbers adjacent
|
||||
# so BM25/dense can match "AP Iliad Aberdeen Idaho" queries.
|
||||
lines.append("Trial data (verbatim from PDF):")
|
||||
lines.append("")
|
||||
# The actual text was in the .md body but isn't in the sidecar
|
||||
# JSON. We render a brief marker; full text goes in the .md file
|
||||
# that get_page returns. For embedding signal, the title +
|
||||
# varieties + region is usually enough.
|
||||
# If we want the FULL text in the chunk we'd need to either store
|
||||
# it in the sidecar OR read it from the .md path at chunk time.
|
||||
# Read from the .md path:
|
||||
return "\n".join(lines).strip() + "\n"
|
||||
|
||||
|
||||
def _render_trial_chunk(sidecar: dict, md_text: str | None = None) -> str:
|
||||
"""Dispatch to the right trial renderer by source. Includes the
|
||||
verbatim trial body for sources whose value lives in the body text
|
||||
(currently agripro_trials)."""
|
||||
source = sidecar.get("source")
|
||||
if source == "gh_plot_reports":
|
||||
return _render_gh_plot_chunk(sidecar)
|
||||
if source == "agripro_trials":
|
||||
header = _render_agripro_trial_chunk(sidecar)
|
||||
if md_text:
|
||||
# Strip the markdown frontmatter so the body text is the
|
||||
# actual trial data, not the per-source preamble.
|
||||
body = md_text
|
||||
sep = "## Trial data (verbatim from PDF)"
|
||||
if sep in body:
|
||||
body = body.split(sep, 1)[1].strip()
|
||||
# Strip fence markers
|
||||
body = re.sub(r"```", "", body).strip()
|
||||
return header + "\n" + body + "\n"
|
||||
return header
|
||||
# Fallback: generic trial render
|
||||
return _render_gh_plot_chunk(sidecar)
|
||||
|
||||
|
||||
def _flat_trial_metadata(sidecar: dict) -> dict:
|
||||
"""Chroma-safe metadata for trial chunks. Mirrors variety metadata
|
||||
plus trial-specific facets (state, year, data_type)."""
|
||||
md: dict = {
|
||||
"source": sidecar.get("source") or "",
|
||||
"source_key": sidecar.get("source_key") or "",
|
||||
"data_type": sidecar.get("data_type") or "trial",
|
||||
"vendor": sidecar.get("vendor") or "",
|
||||
"brand": (sidecar.get("brand") or "").upper(),
|
||||
"crop": (sidecar.get("crop") or "").lower(),
|
||||
"source_url": (sidecar.get("source_urls") or [""])[0],
|
||||
}
|
||||
year = sidecar.get("year")
|
||||
if isinstance(year, int):
|
||||
md["year"] = year
|
||||
state = sidecar.get("state_abbrev") or sidecar.get("state")
|
||||
if state:
|
||||
md["state"] = state.upper() if len(state) <= 3 else state
|
||||
md["state_abbrev"] = (sidecar.get("state_abbrev") or "").upper()
|
||||
if sidecar.get("region"):
|
||||
md["region"] = sidecar["region"]
|
||||
if sidecar.get("wheat_class_section"):
|
||||
md["wheat_class"] = sidecar["wheat_class_section"]
|
||||
if sidecar.get("plot_id"):
|
||||
md["plot_id"] = sidecar["plot_id"]
|
||||
if isinstance(sidecar.get("n_results"), int):
|
||||
md["n_results"] = sidecar["n_results"]
|
||||
return md
|
||||
|
||||
|
||||
def chunks_from_trial(
|
||||
sidecar_path: Path | str,
|
||||
*,
|
||||
md_path: Path | str | None = None,
|
||||
) -> Iterator[dict]:
|
||||
"""Yield chunk dict(s) for one trial document. Emits exactly one
|
||||
chunk per trial.
|
||||
|
||||
Args:
|
||||
sidecar_path: path to the trial's JSON sidecar.
|
||||
md_path: path to the trial's markdown body (used for
|
||||
AgriPro PDFs whose value lives in the verbatim
|
||||
text). If omitted we infer it from sidecar_path.
|
||||
"""
|
||||
sc_path = Path(sidecar_path)
|
||||
sidecar = json.loads(sc_path.read_text(encoding="utf-8"))
|
||||
|
||||
md_text: str | None = None
|
||||
md_p = Path(md_path) if md_path else sc_path.with_suffix(".md")
|
||||
if md_p.exists():
|
||||
md_text = md_p.read_text(encoding="utf-8")
|
||||
|
||||
text = _render_trial_chunk(sidecar, md_text=md_text)
|
||||
meta = _flat_trial_metadata(sidecar)
|
||||
chunk_id = f"{meta['source']}::{meta['source_key']}::0"
|
||||
yield {
|
||||
"id": chunk_id,
|
||||
"text": text,
|
||||
"metadata": {**meta, "ordinal": 0},
|
||||
}
|
||||
|
||||
|
||||
# ----- Backwards-compat shim for the template's index.py -------------------
|
||||
#
|
||||
# The template's ``rag.index.page_records`` calls
|
||||
|
||||
+22
-3
@@ -12,6 +12,7 @@ Override via the PRODUCT_NAME env var.
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import time
|
||||
@@ -21,7 +22,7 @@ from typing import Iterator
|
||||
import chromadb
|
||||
from chromadb.config import Settings
|
||||
|
||||
from .chunk import chunks_from_variety
|
||||
from .chunk import chunks_from_variety, chunks_from_trial
|
||||
from .embeddings import embedding_function
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
@@ -37,7 +38,17 @@ COLLECTION = f"{PRODUCT_NAME}_docs"
|
||||
|
||||
def variety_records() -> Iterator[dict]:
|
||||
"""Walk ``corpus/<source>/<source_key>.json``, yield one chunk per
|
||||
variety."""
|
||||
document.
|
||||
|
||||
Dispatches by the sidecar's ``data_type`` field:
|
||||
- ``"trial"`` → chunks_from_trial (gh_plot_reports, agripro_trials)
|
||||
- anything else (or absent) → chunks_from_variety (default)
|
||||
|
||||
The output shape (id/text/metadata) is identical for both — only
|
||||
the chunk text composition and metadata keys differ. Chroma + BM25
|
||||
can index both into the same collection; downstream tools filter
|
||||
by the ``data_type`` metadata field.
|
||||
"""
|
||||
if not CORPUS.exists():
|
||||
log.error("corpus/ doesn't exist; run a scraper first")
|
||||
return
|
||||
@@ -45,7 +56,15 @@ def variety_records() -> Iterator[dict]:
|
||||
if not source_dir.is_dir() or source_dir.name.startswith("."):
|
||||
continue
|
||||
for sidecar_path in sorted(source_dir.glob("*.json")):
|
||||
yield from chunks_from_variety(sidecar_path)
|
||||
try:
|
||||
head = json.loads(sidecar_path.read_text(encoding="utf-8"))
|
||||
except (OSError, json.JSONDecodeError) as exc:
|
||||
log.warning("skipping unreadable sidecar %s: %s", sidecar_path, exc)
|
||||
continue
|
||||
if head.get("data_type") == "trial":
|
||||
yield from chunks_from_trial(sidecar_path)
|
||||
else:
|
||||
yield from chunks_from_variety(sidecar_path)
|
||||
|
||||
|
||||
def upsert_to_chroma(records: list[dict]) -> int:
|
||||
|
||||
@@ -0,0 +1,483 @@
|
||||
"""AgriPro trial-PDF scraper.
|
||||
|
||||
Source: ``agriprowheat.com/trials-data`` — a single page listing
|
||||
~38 PDF links to regional wheat trial summary documents. Each PDF
|
||||
is a multi-year multi-location performance test comparing AgriPro
|
||||
varieties against competitors (LCS, Norwest, PNW, UI, etc.).
|
||||
|
||||
Discovery: walk ``/trials-data``, collect every ``href="*.pdf"``.
|
||||
|
||||
Per-PDF content (parsed via pdfplumber):
|
||||
- First line: usually the title (e.g.
|
||||
"2024 Pacific Northwest Combined Summary, Three-Year Data")
|
||||
- A multi-column table with one row per variety. Columns vary by
|
||||
PDF but typically include: 3-yr combined yield, 2-yr combined,
|
||||
most-recent-year yield, plus per-location yields with location
|
||||
names in the header.
|
||||
- Footer notes: locations covered, LSD/CV statistical caveats,
|
||||
copyright.
|
||||
|
||||
Trial PDFs are stable text-extractable (no charts). We capture the
|
||||
full per-page text verbatim in the chunk body — preserving
|
||||
variety-name + yield-number adjacency for the embedder — plus
|
||||
metadata derived from the title (region, year, crop class). This is
|
||||
a deliberate trade-off: perfect table parsing across the PDF
|
||||
variants would be brittle; verbatim text preserves every data point
|
||||
and the embedder + BM25 between them can match queries like
|
||||
"AP Iliad yield Aberdeen Idaho" reliably.
|
||||
|
||||
Output:
|
||||
corpus/agripro_trials/<source_key>.md
|
||||
corpus/agripro_trials/<source_key>.json
|
||||
|
||||
source_key convention: ``agt-<slugified-filename-stem>`` lowercased,
|
||||
e.g. ``agt-2024-pnw-combined``.
|
||||
|
||||
CLI:
|
||||
python -m scrape.sources.agripro_trials --limit 5
|
||||
python -m scrape.sources.agripro_trials --force
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import io
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import random
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
import pdfplumber
|
||||
|
||||
SCRAPER_VERSION = "0.1.0"
|
||||
USER_AGENT = "seed-mcp-scraper/0.1 (+https://drawbar.example/contact)"
|
||||
BASE = "https://agriprowheat.com"
|
||||
LIST_URL = f"{BASE}/trials-data"
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parents[2]
|
||||
CORPUS_ROOT = Path(os.environ.get("CORPUS_ROOT") or REPO_ROOT / "corpus")
|
||||
CORPUS_DIR = CORPUS_ROOT / "agripro_trials"
|
||||
|
||||
REQ_INTERVAL_SEC = 1.0
|
||||
|
||||
log = logging.getLogger("scrape.agripro_trials")
|
||||
|
||||
# Region name patterns we recognize in PDF filenames / titles. The
|
||||
# value is a human-readable normalized region.
|
||||
REGION_PATTERNS = (
|
||||
(re.compile(r"\bPNW\b|Pacific Northwest", re.I), "Pacific Northwest"),
|
||||
(re.compile(r"\bNE Colorado\b|Northeast Colorado", re.I), "NE Colorado"),
|
||||
(re.compile(r"\bSC KS\b|South Central Kansas", re.I), "SC Kansas / N Central OK"),
|
||||
(re.compile(r"\bWestern Plains\b", re.I), "Western Plains"),
|
||||
(re.compile(r"\bCentral Plains\b", re.I), "Central Plains"),
|
||||
(re.compile(r"\bPlains Irrigated\b", re.I), "Plains Irrigated"),
|
||||
(re.compile(r"\bWashington[/:]?N? *Idaho\b", re.I), "WA / N. Idaho"),
|
||||
(re.compile(r"\bSouthern Idaho\b", re.I), "Southern Idaho"),
|
||||
(re.compile(r"\bMontana\b", re.I), "Montana"),
|
||||
(re.compile(r"\bNP Perf Data\b|Northern Plains", re.I), "Northern Plains"),
|
||||
(re.compile(r"\bWheat after Soy\b", re.I), "Wheat-after-Soy rotation"),
|
||||
)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- HTTP
|
||||
|
||||
|
||||
class RateLimitedSession:
|
||||
def __init__(self, interval: float = REQ_INTERVAL_SEC) -> None:
|
||||
self.s = requests.Session()
|
||||
self.s.headers["User-Agent"] = USER_AGENT
|
||||
self.interval = interval
|
||||
self._last = 0.0
|
||||
|
||||
def _wait(self) -> None:
|
||||
delta = time.monotonic() - self._last
|
||||
if delta < self.interval:
|
||||
time.sleep(self.interval - delta)
|
||||
self._last = time.monotonic()
|
||||
|
||||
def request(
|
||||
self,
|
||||
method: str,
|
||||
url: str,
|
||||
*,
|
||||
max_retries: int = 4,
|
||||
timeout: float = 60.0,
|
||||
**kw: Any,
|
||||
) -> requests.Response:
|
||||
last_exc: Exception | None = None
|
||||
for attempt in range(max_retries):
|
||||
self._wait()
|
||||
try:
|
||||
resp = self.s.request(method, url, timeout=timeout, **kw)
|
||||
except requests.RequestException as exc:
|
||||
last_exc = exc
|
||||
backoff = min(30.0, (2 ** attempt) + random.random())
|
||||
log.warning("network error on %s %s: %s — retry in %.1fs",
|
||||
method, url, exc, backoff)
|
||||
time.sleep(backoff)
|
||||
continue
|
||||
if resp.status_code == 429 or 500 <= resp.status_code < 600:
|
||||
ra = resp.headers.get("Retry-After")
|
||||
backoff = float(ra) if (ra and ra.isdigit()) else min(30.0, (2 ** attempt) + random.random())
|
||||
log.warning("HTTP %d on %s %s — retry in %.1fs",
|
||||
resp.status_code, method, url, backoff)
|
||||
time.sleep(backoff)
|
||||
continue
|
||||
return resp
|
||||
if last_exc:
|
||||
raise last_exc
|
||||
return resp # type: ignore[return-value]
|
||||
|
||||
def get(self, url: str, **kw: Any) -> requests.Response:
|
||||
return self.request("GET", url, **kw)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- model
|
||||
|
||||
|
||||
@dataclass
|
||||
class TrialPDF:
|
||||
source_key: str
|
||||
source_url: str
|
||||
pdf_url: str
|
||||
filename: str
|
||||
title: str | None = None
|
||||
year: int | None = None
|
||||
years_covered: list[int] = field(default_factory=list)
|
||||
region: str | None = None
|
||||
wheat_class_section: str | None = None # e.g. "Soft White Winter Wheat" — derived from PDF text
|
||||
page_text: str = ""
|
||||
varieties_found: list[str] = field(default_factory=list)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- discovery
|
||||
|
||||
|
||||
def discover_pdfs(http: RateLimitedSession) -> list[tuple[str, str, str, str]]:
|
||||
"""Return ``[(pdf_url, filename, section_heading, section_anchor), ...]``
|
||||
for every PDF on /trials-data.
|
||||
|
||||
De-duplicates by pdf_url — multiple section headings may link to
|
||||
the same PDF (e.g. a multi-state summary).
|
||||
"""
|
||||
log.info("fetching trials index %s", LIST_URL)
|
||||
r = http.get(LIST_URL)
|
||||
r.raise_for_status()
|
||||
soup = BeautifulSoup(r.text, "html.parser")
|
||||
seen: dict[str, tuple[str, str, str, str]] = {}
|
||||
for a in soup.find_all("a", href=re.compile(r"\.pdf(?:$|\?)", re.I)):
|
||||
href = a["href"]
|
||||
from urllib.parse import urljoin
|
||||
full = urljoin(LIST_URL, href)
|
||||
fn = href.rsplit("/", 1)[-1]
|
||||
# Section context — closest preceding h2/h3/h4
|
||||
section = ""
|
||||
parent = a.parent
|
||||
for _ in range(10):
|
||||
if parent is None:
|
||||
break
|
||||
head = parent.find_previous(["h2", "h3", "h4"])
|
||||
if head:
|
||||
section = head.get_text(strip=True)
|
||||
break
|
||||
parent = parent.parent
|
||||
if full not in seen:
|
||||
seen[full] = (full, fn, section, href)
|
||||
out = list(seen.values())
|
||||
log.info("trial PDFs found: %d (deduped from %d total links)",
|
||||
len(out),
|
||||
sum(1 for a in soup.find_all("a", href=re.compile(r"\.pdf", re.I))))
|
||||
return out
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- helpers
|
||||
|
||||
|
||||
def source_key_for(filename: str) -> str:
|
||||
"""``2024 PNW Combined.pdf`` → ``agt-2024-pnw-combined``."""
|
||||
from urllib.parse import unquote
|
||||
stem = unquote(filename).rsplit(".", 1)[0]
|
||||
slug = re.sub(r"[^a-zA-Z0-9]+", "-", stem).strip("-").lower()
|
||||
return f"agt-{slug}"
|
||||
|
||||
|
||||
def _detect_region(text: str) -> str | None:
|
||||
for pat, label in REGION_PATTERNS:
|
||||
if pat.search(text):
|
||||
return label
|
||||
return None
|
||||
|
||||
|
||||
def _detect_years(text: str) -> list[int]:
|
||||
"""Return sorted years found in the PDF title / first lines.
|
||||
Filters to 2010-2030 to ignore page numbers / table values."""
|
||||
years = sorted({
|
||||
int(y) for y in re.findall(r"\b(20[1-3]\d)\b", text[:600])
|
||||
})
|
||||
return years
|
||||
|
||||
|
||||
def _detect_wheat_class_section(text: str) -> str | None:
|
||||
"""The trial PDFs typically have a class label line like
|
||||
'Soft White Winter Wheat' near the top of the table."""
|
||||
for label in (
|
||||
"Hard Red Winter Wheat", "Hard Red Spring Wheat",
|
||||
"Hard White Spring Wheat", "Hard White Winter Wheat",
|
||||
"Soft White Winter Wheat", "Soft White Spring Wheat",
|
||||
"Soft Red Winter Wheat", "Durum",
|
||||
):
|
||||
if re.search(r"\b" + re.escape(label) + r"\b", text[:1500], re.I):
|
||||
return label
|
||||
return None
|
||||
|
||||
|
||||
# Variety name patterns we expect to see in AgriPro trial PDFs.
|
||||
# AgriPro varieties = AP <name>, SY <name>; competitors include
|
||||
# LCS <name>, UI <name>, PNW <name>, Norwest <name>.
|
||||
_VARIETY_LINE_RE = re.compile(
|
||||
r"^(?:AP|SY|LCS|UI|PNW|Norwest|WB|Stine|Pioneer)\b[A-Za-z0-9 \-+]*",
|
||||
)
|
||||
|
||||
|
||||
def _detect_varieties(text: str) -> list[str]:
|
||||
out: list[str] = []
|
||||
seen: set[str] = set()
|
||||
for line in text.splitlines():
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
m = _VARIETY_LINE_RE.match(line)
|
||||
if m:
|
||||
# Up to first run of digits / spaces — variety name only
|
||||
name_match = re.match(r"^([A-Za-z][A-Za-z0-9 \-+]*?)\s+\d", line)
|
||||
name = name_match.group(1).strip() if name_match else m.group(0).strip()
|
||||
# Trim trailing single tokens that are clearly stats
|
||||
if name and name not in seen and len(name) <= 40:
|
||||
seen.add(name)
|
||||
out.append(name)
|
||||
return out
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- detail
|
||||
|
||||
|
||||
def fetch_pdf_detail(
|
||||
http: RateLimitedSession,
|
||||
pdf_url: str,
|
||||
filename: str,
|
||||
) -> TrialPDF | None:
|
||||
"""Download + parse one trial PDF."""
|
||||
r = http.get(pdf_url)
|
||||
if r.status_code == 404:
|
||||
return None
|
||||
r.raise_for_status()
|
||||
try:
|
||||
with pdfplumber.open(io.BytesIO(r.content)) as pdf:
|
||||
pages_text = []
|
||||
for p in pdf.pages:
|
||||
t = p.extract_text() or ""
|
||||
pages_text.append(t)
|
||||
text = "\n\n".join(pages_text).strip()
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.warning("PDF parse failed for %s: %s", pdf_url, exc)
|
||||
return None
|
||||
|
||||
title = ""
|
||||
if text:
|
||||
# First non-empty line is usually the title.
|
||||
for line in text.splitlines():
|
||||
line = line.strip()
|
||||
if line:
|
||||
title = line
|
||||
break
|
||||
|
||||
region = _detect_region(filename) or _detect_region(title or "")
|
||||
years = _detect_years(title + "\n" + filename)
|
||||
wheat_class_section = _detect_wheat_class_section(text)
|
||||
varieties = _detect_varieties(text)
|
||||
|
||||
return TrialPDF(
|
||||
source_key=source_key_for(filename),
|
||||
source_url=LIST_URL,
|
||||
pdf_url=pdf_url,
|
||||
filename=filename,
|
||||
title=title or None,
|
||||
year=years[-1] if years else None,
|
||||
years_covered=years,
|
||||
region=region,
|
||||
wheat_class_section=wheat_class_section,
|
||||
page_text=text,
|
||||
varieties_found=varieties,
|
||||
)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- render
|
||||
|
||||
|
||||
def render_markdown(p: TrialPDF) -> str:
|
||||
head: list[str] = [
|
||||
f"# {p.title or p.filename}",
|
||||
"",
|
||||
"- **Source:** AgriPro (Syngenta) regional trial PDF",
|
||||
"- **Vendor:** Syngenta",
|
||||
"- **Brand:** AgriPro",
|
||||
"- **Crop:** Wheat",
|
||||
"- **Data type:** trial",
|
||||
]
|
||||
if p.region:
|
||||
head.append(f"- **Region:** {p.region}")
|
||||
if p.wheat_class_section:
|
||||
head.append(f"- **Wheat class:** {p.wheat_class_section}")
|
||||
if p.year:
|
||||
head.append(f"- **Year:** {p.year}")
|
||||
if p.years_covered and len(p.years_covered) > 1:
|
||||
head.append(f"- **Years covered:** {p.years_covered[0]}–{p.years_covered[-1]}")
|
||||
head.append(f"- **PDF:** {p.pdf_url}")
|
||||
head.append(f"- **Index page:** {p.source_url}")
|
||||
if p.varieties_found:
|
||||
head.append(
|
||||
f"- **Varieties listed:** {', '.join(p.varieties_found[:30])}"
|
||||
+ ("…" if len(p.varieties_found) > 30 else "")
|
||||
)
|
||||
head.append("")
|
||||
head.append("---")
|
||||
head.append("")
|
||||
head.append("## Trial data (verbatim from PDF)")
|
||||
head.append("")
|
||||
head.append("```")
|
||||
head.append(p.page_text)
|
||||
head.append("```")
|
||||
return "\n".join(head)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- write
|
||||
|
||||
|
||||
def write_pdf(prod: TrialPDF, body_md: str) -> None:
|
||||
CORPUS_DIR.mkdir(parents=True, exist_ok=True)
|
||||
md_path = CORPUS_DIR / f"{prod.source_key}.md"
|
||||
json_path = CORPUS_DIR / f"{prod.source_key}.json"
|
||||
|
||||
md_path.write_text(body_md, encoding="utf-8")
|
||||
sidecar = {
|
||||
"source": "agripro_trials",
|
||||
"source_key": prod.source_key,
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta",
|
||||
"brand": "AgriPro",
|
||||
"crop": "wheat",
|
||||
"title": prod.title,
|
||||
"filename": prod.filename,
|
||||
"region": prod.region,
|
||||
"wheat_class_section": prod.wheat_class_section,
|
||||
"year": prod.year,
|
||||
"years_covered": prod.years_covered,
|
||||
"varieties_found": prod.varieties_found,
|
||||
"pdf_url": prod.pdf_url,
|
||||
"source_urls": [prod.source_url, prod.pdf_url],
|
||||
"page_text_chars": len(prod.page_text),
|
||||
"fetched_at": datetime.now(timezone.utc).isoformat(),
|
||||
"scraper_version": SCRAPER_VERSION,
|
||||
}
|
||||
json_path.write_text(
|
||||
json.dumps(sidecar, indent=2, ensure_ascii=False) + "\n",
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- pipeline
|
||||
|
||||
|
||||
def process_pdf(
|
||||
http: RateLimitedSession,
|
||||
*,
|
||||
pdf_url: str,
|
||||
filename: str,
|
||||
force: bool,
|
||||
) -> tuple[str, TrialPDF | None]:
|
||||
sk = source_key_for(filename)
|
||||
md_path = CORPUS_DIR / f"{sk}.md"
|
||||
if md_path.exists() and not force:
|
||||
return "skipped", None
|
||||
try:
|
||||
prod = fetch_pdf_detail(http, pdf_url, filename)
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.error("PDF fetch/parse failed for %s: %s", pdf_url, exc)
|
||||
return "failed", None
|
||||
if prod is None:
|
||||
return "missing", None
|
||||
body = render_markdown(prod)
|
||||
write_pdf(prod, body)
|
||||
return "written", prod
|
||||
|
||||
|
||||
def run(*, limit: int | None, force: bool) -> int:
|
||||
CORPUS_DIR.mkdir(parents=True, exist_ok=True)
|
||||
http = RateLimitedSession()
|
||||
targets = discover_pdfs(http)
|
||||
|
||||
counts = {"written": 0, "skipped": 0, "missing": 0, "failed": 0}
|
||||
processed = 0
|
||||
for pdf_url, filename, _section, _href in targets:
|
||||
if limit is not None and processed >= limit:
|
||||
break
|
||||
processed += 1
|
||||
status, prod = process_pdf(
|
||||
http, pdf_url=pdf_url, filename=filename, force=force,
|
||||
)
|
||||
counts[status] = counts.get(status, 0) + 1
|
||||
log.info(
|
||||
"[%d/%d] %s %s | region=%s year=%s varieties=%d chars=%d",
|
||||
processed, len(targets),
|
||||
source_key_for(filename), status,
|
||||
(prod.region if prod else "-") or "-",
|
||||
prod.year if prod else "-",
|
||||
len(prod.varieties_found) if prod else 0,
|
||||
len(prod.page_text) if prod else 0,
|
||||
)
|
||||
|
||||
log.info(
|
||||
"done: processed=%d written=%d skipped=%d missing=%d failed=%d (of %d PDFs)",
|
||||
processed, counts["written"], counts["skipped"],
|
||||
counts["missing"], counts["failed"], len(targets),
|
||||
)
|
||||
return 0 if counts["failed"] == 0 else 1
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- CLI
|
||||
|
||||
|
||||
def _build_argparser() -> argparse.ArgumentParser:
|
||||
p = argparse.ArgumentParser(
|
||||
prog="scrape.sources.agripro_trials",
|
||||
description="Scrape AgriPro regional trial PDFs.",
|
||||
)
|
||||
p.add_argument("--limit", type=int, default=None,
|
||||
help="Stop after processing N PDFs (default: all).")
|
||||
p.add_argument("--force", action="store_true",
|
||||
help="Re-fetch even if the markdown file already exists.")
|
||||
p.add_argument("--log-level", default=os.environ.get("LOG_LEVEL", "INFO"))
|
||||
return p
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
args = _build_argparser().parse_args(argv)
|
||||
logging.basicConfig(
|
||||
level=args.log_level.upper(),
|
||||
format="%(asctime)s %(levelname)s %(name)s %(message)s",
|
||||
stream=sys.stderr,
|
||||
)
|
||||
return run(limit=args.limit, force=args.force)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
@@ -0,0 +1,781 @@
|
||||
"""Golden Harvest plot-report scraper — cross-vendor yield trials.
|
||||
|
||||
This is the FIRST source in the seed-mcp corpus with ``data_type:
|
||||
"trial"`` rather than the per-variety identity records all other
|
||||
scrapers emit. Each document is one head-to-head yield trial at a
|
||||
specific state/year/site, comparing products across brands (NK,
|
||||
DEKALB, Golden Harvest, sometimes Pioneer/Channel etc. listed as
|
||||
competitor entries) — i.e. **third-party-feeling cross-vendor data
|
||||
that Bayer doesn't publish itself**.
|
||||
|
||||
Source: ``goldenharvestseeds.com`` — same site as ``golden_harvest``
|
||||
variety scraper. ``/sitemap-ghs-hybrids.xml`` (already walked for
|
||||
the variety scraper) lists 8,237 plot reports across:
|
||||
|
||||
Year Corn Soy Silage Total
|
||||
2023 1,832 1,614 173 3,619
|
||||
2024 1,432 1,277 137 2,846
|
||||
2025 973 703 96 1,772
|
||||
|
||||
Initial scrape: 2024 + 2025 (4,618 reports). 2023 is older data
|
||||
that's still informative but lower priority. Defer 2023 to a later
|
||||
backfill pass via ``--include-2023``.
|
||||
|
||||
URL shape:
|
||||
/<crop>/plot-report/<state-abbrev>/<year>/<plot-id>
|
||||
e.g. /corn/plot-report/al/2023/2374765
|
||||
|
||||
Per-report data (server-rendered HTML):
|
||||
- Cooperator name (h1 area)
|
||||
- State (full name, e.g. "Alabama")
|
||||
- Planted date / Harvested date
|
||||
- Population (seeds/acre), Row Width
|
||||
- One <table> with columns:
|
||||
Rank | Brand | Product | Traits | Yield (BU/Acre) | %MST |
|
||||
Test Weight | Gross Revenue | Entry #
|
||||
|
||||
Each row in the results table can be from any seed brand — the
|
||||
trial is the test, not the catalog. Brand and product are the join
|
||||
keys back to the per-variety corpus (lookup_variety can pull the
|
||||
identity record if we have the same brand/product).
|
||||
|
||||
Output:
|
||||
corpus/gh_plot_reports/<source_key>.md LLM-visible body
|
||||
corpus/gh_plot_reports/<source_key>.json sidecar metadata
|
||||
|
||||
source_key convention: ``ghpr-<crop>-<state>-<year>-<plot_id>``
|
||||
e.g. ``ghpr-corn-al-2023-2374765``.
|
||||
|
||||
CLI:
|
||||
python -m scrape.sources.gh_plot_reports --limit 5
|
||||
python -m scrape.sources.gh_plot_reports --crop corn --state ia --year 2024
|
||||
python -m scrape.sources.gh_plot_reports --include-2023 --force
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import random
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
SCRAPER_VERSION = "0.1.0"
|
||||
USER_AGENT = "seed-mcp-scraper/0.1 (+https://drawbar.example/contact)"
|
||||
BASE = "https://www.goldenharvestseeds.com"
|
||||
SITEMAP_HYBRIDS = f"{BASE}/sitemap-ghs-hybrids.xml"
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parents[2]
|
||||
CORPUS_ROOT = Path(os.environ.get("CORPUS_ROOT") or REPO_ROOT / "corpus")
|
||||
CORPUS_DIR = CORPUS_ROOT / "gh_plot_reports"
|
||||
|
||||
REQ_INTERVAL_SEC = 1.0
|
||||
|
||||
log = logging.getLogger("scrape.gh_plot_reports")
|
||||
|
||||
# State name normalization: URL gives a 2-letter abbrev; sidecar keeps
|
||||
# both forms so search filters can use either.
|
||||
STATE_NAMES = {
|
||||
"al": "Alabama", "ak": "Alaska", "az": "Arizona", "ar": "Arkansas",
|
||||
"ca": "California", "co": "Colorado", "ct": "Connecticut",
|
||||
"de": "Delaware", "fl": "Florida", "ga": "Georgia", "hi": "Hawaii",
|
||||
"id": "Idaho", "il": "Illinois", "in": "Indiana", "ia": "Iowa",
|
||||
"ks": "Kansas", "ky": "Kentucky", "la": "Louisiana", "me": "Maine",
|
||||
"md": "Maryland", "ma": "Massachusetts", "mi": "Michigan",
|
||||
"mn": "Minnesota", "ms": "Mississippi", "mo": "Missouri",
|
||||
"mt": "Montana", "ne": "Nebraska", "nv": "Nevada", "nh": "New Hampshire",
|
||||
"nj": "New Jersey", "nm": "New Mexico", "ny": "New York",
|
||||
"nc": "North Carolina", "nd": "North Dakota", "oh": "Ohio",
|
||||
"ok": "Oklahoma", "or": "Oregon", "pa": "Pennsylvania",
|
||||
"ri": "Rhode Island", "sc": "South Carolina", "sd": "South Dakota",
|
||||
"tn": "Tennessee", "tx": "Texas", "ut": "Utah", "vt": "Vermont",
|
||||
"va": "Virginia", "wa": "Washington", "wv": "West Virginia",
|
||||
"wi": "Wisconsin", "wy": "Wyoming",
|
||||
}
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- HTTP
|
||||
|
||||
|
||||
class RateLimitedSession:
|
||||
def __init__(self, interval: float = REQ_INTERVAL_SEC) -> None:
|
||||
self.s = requests.Session()
|
||||
self.s.headers["User-Agent"] = USER_AGENT
|
||||
self.interval = interval
|
||||
self._last = 0.0
|
||||
|
||||
def _wait(self) -> None:
|
||||
delta = time.monotonic() - self._last
|
||||
if delta < self.interval:
|
||||
time.sleep(self.interval - delta)
|
||||
self._last = time.monotonic()
|
||||
|
||||
def request(
|
||||
self,
|
||||
method: str,
|
||||
url: str,
|
||||
*,
|
||||
max_retries: int = 4,
|
||||
timeout: float = 30.0,
|
||||
**kw: Any,
|
||||
) -> requests.Response:
|
||||
last_exc: Exception | None = None
|
||||
for attempt in range(max_retries):
|
||||
self._wait()
|
||||
try:
|
||||
resp = self.s.request(method, url, timeout=timeout, **kw)
|
||||
except requests.RequestException as exc:
|
||||
last_exc = exc
|
||||
backoff = min(30.0, (2 ** attempt) + random.random())
|
||||
log.warning("network error on %s %s: %s — retry in %.1fs",
|
||||
method, url, exc, backoff)
|
||||
time.sleep(backoff)
|
||||
continue
|
||||
if resp.status_code == 429 or 500 <= resp.status_code < 600:
|
||||
ra = resp.headers.get("Retry-After")
|
||||
backoff = float(ra) if (ra and ra.isdigit()) else min(30.0, (2 ** attempt) + random.random())
|
||||
log.warning("HTTP %d on %s %s — retry in %.1fs",
|
||||
resp.status_code, method, url, backoff)
|
||||
time.sleep(backoff)
|
||||
continue
|
||||
return resp
|
||||
if last_exc:
|
||||
raise last_exc
|
||||
return resp # type: ignore[return-value]
|
||||
|
||||
def get(self, url: str, **kw: Any) -> requests.Response:
|
||||
return self.request("GET", url, **kw)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- model
|
||||
|
||||
|
||||
@dataclass
|
||||
class TrialResult:
|
||||
rank: int | None = None
|
||||
brand: str = ""
|
||||
product: str = ""
|
||||
traits: str = ""
|
||||
# Generic per-column metrics — keyed by the header from the table
|
||||
# (e.g. "Yield" / "%MST" / "Ton/Acre" / "Milk Per Acre" /
|
||||
# "Beef Per Ton"). Corn + soy use Yield/MST/Test Weight/Gross
|
||||
# Revenue; silage uses Ton/Acre + Milk + Beef columns. Storing as
|
||||
# an open dict keeps the scraper robust across crop types.
|
||||
metrics: dict[str, float | str | None] = field(default_factory=dict)
|
||||
entry_num: int | None = None
|
||||
|
||||
# Convenience accessors — back-compat for the chunker that looks
|
||||
# up these specific keys.
|
||||
@property
|
||||
def yield_bu_ac(self) -> float | None:
|
||||
v = self.metrics.get("Yield")
|
||||
return v if isinstance(v, (int, float)) else None
|
||||
|
||||
@property
|
||||
def mst_pct(self) -> float | None:
|
||||
v = self.metrics.get("%MST")
|
||||
return v if isinstance(v, (int, float)) else None
|
||||
|
||||
@property
|
||||
def test_weight(self) -> float | None:
|
||||
v = self.metrics.get("Test Weight")
|
||||
return v if isinstance(v, (int, float)) else None
|
||||
|
||||
@property
|
||||
def gross_revenue_dol_ac(self) -> float | None:
|
||||
v = self.metrics.get("Gross Revenue")
|
||||
return v if isinstance(v, (int, float)) else None
|
||||
|
||||
@property
|
||||
def primary_metric(self) -> tuple[str, float | None]:
|
||||
"""The first numeric metric — used as the canonical 'yield'
|
||||
for ranking in the chunk preamble. Corn/soy: Yield (BU/Ac).
|
||||
Silage: Ton/Acre."""
|
||||
for k in ("Yield", "Ton/Acre", "Tons/Acre"):
|
||||
v = self.metrics.get(k)
|
||||
if isinstance(v, (int, float)):
|
||||
return (k, v)
|
||||
# Fallback to first numeric metric
|
||||
for k, v in self.metrics.items():
|
||||
if isinstance(v, (int, float)):
|
||||
return (k, v)
|
||||
return ("", None)
|
||||
|
||||
|
||||
@dataclass
|
||||
class PlotReport:
|
||||
source_key: str
|
||||
source_url: str
|
||||
crop: str # "corn" / "soybeans" / "silage"
|
||||
state_abbrev: str # "al"
|
||||
state_name: str # "Alabama"
|
||||
year: int
|
||||
plot_id: str
|
||||
|
||||
cooperator: str | None = None
|
||||
planted_date: str | None = None # ISO date
|
||||
harvested_date: str | None = None # ISO date
|
||||
population: int | None = None
|
||||
row_width: int | None = None
|
||||
|
||||
results: list[TrialResult] = field(default_factory=list)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- discovery
|
||||
|
||||
|
||||
_PLOT_URL_RE = re.compile(
|
||||
r".*?/(?P<crop>corn|soybean|silage)/plot-report/"
|
||||
r"(?P<state>[a-z]{2})/(?P<year>\d{4})/(?P<plot>\d+)"
|
||||
)
|
||||
|
||||
|
||||
def discover_plots(
|
||||
http: RateLimitedSession,
|
||||
*,
|
||||
crops: set[str],
|
||||
states: set[str] | None,
|
||||
years: set[int],
|
||||
) -> list[tuple[str, str, str, int, str]]:
|
||||
"""Walk the hybrids sitemap and return matching plot URLs as
|
||||
``[(url, crop, state, year, plot_id), ...]`` tuples. ``crop`` is
|
||||
normalized to the schema's terms (soybean → soybeans)."""
|
||||
log.info("fetching sitemap %s", SITEMAP_HYBRIDS)
|
||||
r = http.get(SITEMAP_HYBRIDS)
|
||||
r.raise_for_status()
|
||||
entries = re.findall(r"<loc>([^<]+)</loc>", r.text)
|
||||
log.info("sitemap parsed: %d total locs", len(entries))
|
||||
|
||||
out: list[tuple[str, str, str, int, str]] = []
|
||||
for url in entries:
|
||||
m = _PLOT_URL_RE.match(url)
|
||||
if not m:
|
||||
continue
|
||||
crop_url = m.group("crop")
|
||||
# Normalize "soybean" → "soybeans" to match the rest of the corpus.
|
||||
crop = "soybeans" if crop_url == "soybean" else crop_url
|
||||
state = m.group("state").lower()
|
||||
year = int(m.group("year"))
|
||||
plot = m.group("plot")
|
||||
if crops and crop not in crops:
|
||||
continue
|
||||
if states and state not in states:
|
||||
continue
|
||||
if years and year not in years:
|
||||
continue
|
||||
out.append((url, crop, state, year, plot))
|
||||
|
||||
log.info("after filters: %d plot URLs", len(out))
|
||||
return out
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- helpers
|
||||
|
||||
|
||||
def source_key_for(crop: str, state: str, year: int, plot_id: str) -> str:
|
||||
return f"ghpr-{crop}-{state}-{year}-{plot_id}"
|
||||
|
||||
|
||||
def _parse_date_mdy(s: str) -> str | None:
|
||||
"""``04/06/23`` → ``2023-04-06``. Two-digit years are assumed to
|
||||
be 20xx (sane for current-century trial data)."""
|
||||
s = (s or "").strip()
|
||||
m = re.match(r"^(\d{1,2})/(\d{1,2})/(\d{2,4})$", s)
|
||||
if not m:
|
||||
return None
|
||||
mo, dy, yr = m.group(1), m.group(2), m.group(3)
|
||||
if len(yr) == 2:
|
||||
yr = "20" + yr
|
||||
try:
|
||||
return f"{int(yr):04d}-{int(mo):02d}-{int(dy):02d}"
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
|
||||
def _parse_int(s: str | None) -> int | None:
|
||||
if not s:
|
||||
return None
|
||||
s = re.sub(r"[,$]", "", str(s).strip())
|
||||
try:
|
||||
return int(s)
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
|
||||
def _parse_float(s: str | None) -> float | None:
|
||||
if not s:
|
||||
return None
|
||||
s = re.sub(r"[,$]", "", str(s).strip())
|
||||
try:
|
||||
return float(s)
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- detail
|
||||
|
||||
|
||||
def fetch_plot_detail(
|
||||
http: RateLimitedSession,
|
||||
url: str,
|
||||
crop: str,
|
||||
state: str,
|
||||
year: int,
|
||||
plot_id: str,
|
||||
) -> PlotReport | None:
|
||||
"""Fetch one plot-report page and parse it."""
|
||||
r = http.get(url)
|
||||
if r.status_code == 404:
|
||||
return None
|
||||
r.raise_for_status()
|
||||
soup = BeautifulSoup(r.text, "html.parser")
|
||||
|
||||
prod = PlotReport(
|
||||
source_key=source_key_for(crop, state, year, plot_id),
|
||||
source_url=url,
|
||||
crop=crop,
|
||||
state_abbrev=state,
|
||||
state_name=STATE_NAMES.get(state, state.upper()),
|
||||
year=year,
|
||||
plot_id=plot_id,
|
||||
)
|
||||
|
||||
# Pull metadata from the header area. The page renders cooperator
|
||||
# name + state + key fields as text following the h1.
|
||||
h1 = soup.find("h1")
|
||||
if h1:
|
||||
# Walk up to a parent that includes the metadata strip
|
||||
container = h1.parent
|
||||
while container is not None and not container.find("table"):
|
||||
parent = container.parent
|
||||
if parent is None:
|
||||
break
|
||||
container = parent
|
||||
if container:
|
||||
text = container.get_text(" | ", strip=True)
|
||||
# Cooperator is usually the segment right after the H1.
|
||||
# Pattern: "Corn Plot Results | <Name> | <State> | Planted: | ..."
|
||||
parts = [p.strip() for p in text.split("|") if p.strip()]
|
||||
# Drop the title segment
|
||||
if parts and parts[0].lower().startswith(("corn plot", "soybean plot", "silage plot")):
|
||||
parts = parts[1:]
|
||||
if parts:
|
||||
# First segment that doesn't match a state name is the cooperator
|
||||
cand = parts[0]
|
||||
if cand and cand != prod.state_name and not cand.endswith(":"):
|
||||
prod.cooperator = cand
|
||||
|
||||
# Walk the page text for known labeled fields.
|
||||
page_text = soup.get_text(" ", strip=True)
|
||||
m = re.search(r"Planted:\s*(\d{1,2}/\d{1,2}/\d{2,4})", page_text)
|
||||
if m:
|
||||
prod.planted_date = _parse_date_mdy(m.group(1))
|
||||
m = re.search(r"Harvested:\s*(\d{1,2}/\d{1,2}/\d{2,4})", page_text)
|
||||
if m:
|
||||
prod.harvested_date = _parse_date_mdy(m.group(1))
|
||||
m = re.search(r"Population:\s*([\d,]+)", page_text)
|
||||
if m:
|
||||
prod.population = _parse_int(m.group(1))
|
||||
m = re.search(r"Row Width:\s*(\d+)", page_text)
|
||||
if m:
|
||||
prod.row_width = _parse_int(m.group(1))
|
||||
|
||||
# Parse the results table. The HTML uses ONE merged cell for
|
||||
# "Brand Product Traits" (despite the header containing all
|
||||
# three labels); subsequent cells are Yield, %MST, Test Weight,
|
||||
# Gross Revenue, Entry #. We split the merged cell using a
|
||||
# known-brand prefix match.
|
||||
table = soup.find("table")
|
||||
if not table:
|
||||
return prod
|
||||
rows = table.find_all("tr")
|
||||
if not rows:
|
||||
return prod
|
||||
|
||||
header_cells = [c.get_text(" ", strip=True) for c in rows[0].find_all(["th", "td"])]
|
||||
|
||||
def col_idx(*names: str) -> int | None:
|
||||
for n in names:
|
||||
for i, h in enumerate(header_cells):
|
||||
if n.lower() in h.lower():
|
||||
return i
|
||||
return None
|
||||
|
||||
# Position of the merged identity cell, by header containing "Brand".
|
||||
i_identity = col_idx("Brand")
|
||||
i_rank = col_idx("Rank")
|
||||
i_entry = col_idx("Entry")
|
||||
|
||||
# Build a list of (header, index) for the OTHER columns (the
|
||||
# metric columns). Skips Rank, Brand-merge-cell, and Entry #.
|
||||
metric_columns: list[tuple[str, int]] = []
|
||||
skip_idx = {i_identity, i_rank, i_entry}
|
||||
for i, h in enumerate(header_cells):
|
||||
if i in skip_idx:
|
||||
continue
|
||||
h_clean = h.strip()
|
||||
if h_clean:
|
||||
metric_columns.append((h_clean, i))
|
||||
|
||||
for row in rows[1:]:
|
||||
cells = [c.get_text(" ", strip=True) for c in row.find_all(["td", "th"])]
|
||||
if len(cells) < 2:
|
||||
continue
|
||||
def cell(i: int | None) -> str:
|
||||
return cells[i] if i is not None and 0 <= i < len(cells) else ""
|
||||
|
||||
identity = cell(i_identity).strip()
|
||||
if any(k in identity.lower() for k in ("plot average", "trial average", "average")):
|
||||
continue
|
||||
|
||||
brand, product, traits = _split_identity(identity)
|
||||
|
||||
# Collect every metric column verbatim. Numeric where parseable,
|
||||
# else preserve the raw string (e.g. "ns" for not-significant).
|
||||
metrics: dict[str, float | str | None] = {}
|
||||
for h, idx in metric_columns:
|
||||
raw = cell(idx).strip()
|
||||
if not raw or raw == "-":
|
||||
metrics[h] = None
|
||||
else:
|
||||
f = _parse_float(raw)
|
||||
metrics[h] = f if f is not None else raw
|
||||
|
||||
result = TrialResult(
|
||||
rank=_parse_int(cell(i_rank)),
|
||||
brand=brand,
|
||||
product=product,
|
||||
traits=traits,
|
||||
metrics=metrics,
|
||||
entry_num=_parse_int(cell(i_entry)),
|
||||
)
|
||||
has_data = result.brand or result.product or any(
|
||||
v is not None for v in metrics.values()
|
||||
)
|
||||
if has_data:
|
||||
prod.results.append(result)
|
||||
|
||||
return prod
|
||||
|
||||
|
||||
# Known seed brands that can appear in plot-report identity cells.
|
||||
# Sorted longest-first so multi-word brands match before sub-strings.
|
||||
_BRAND_NAMES = (
|
||||
"Golden Harvest", "WestBred", "AgriPro", "DEKALB", "Pioneer",
|
||||
"Channel", "Asgrow", "NK", "Becks", "Beck's", "Brevant",
|
||||
"Stine", "Renk", "Wyffels", "LG Seeds", "Croplan", "FS",
|
||||
"Local Choice", "Mycogen", "AgriGold", "Hoegemeyer",
|
||||
)
|
||||
_BRAND_RE = re.compile(
|
||||
r"^(?:" + "|".join(re.escape(b) for b in _BRAND_NAMES) + r")\b",
|
||||
re.I,
|
||||
)
|
||||
|
||||
|
||||
def _split_identity(identity: str) -> tuple[str, str, str]:
|
||||
"""Split a plot-report identity cell into ``(brand, product, traits)``.
|
||||
|
||||
The HTML emits one merged cell like "NK NK1748-3110 Agrisure ®"
|
||||
or "Golden Harvest G16Q82-DV DuracadeViptera™" or just
|
||||
"DEKALB DKC65-20". We:
|
||||
|
||||
1. Match the brand against a known-brand list at the start.
|
||||
2. The token immediately after the brand is the product.
|
||||
3. Anything remaining is the trait stack (free text).
|
||||
"""
|
||||
if not identity:
|
||||
return "", "", ""
|
||||
s = identity.strip()
|
||||
m = _BRAND_RE.match(s)
|
||||
if not m:
|
||||
# Unknown brand prefix — best-effort: first token is brand,
|
||||
# second is product, rest is traits.
|
||||
parts = s.split(maxsplit=2)
|
||||
if len(parts) == 1:
|
||||
return parts[0], "", ""
|
||||
if len(parts) == 2:
|
||||
return parts[0], parts[1], ""
|
||||
return parts[0], parts[1], parts[2]
|
||||
brand = m.group(0)
|
||||
rest = s[len(brand):].strip()
|
||||
parts = rest.split(maxsplit=1)
|
||||
product = parts[0] if parts else ""
|
||||
traits = parts[1].strip() if len(parts) > 1 else ""
|
||||
return brand, product, traits
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- render
|
||||
|
||||
|
||||
def render_markdown(p: PlotReport) -> str:
|
||||
crop_label = {
|
||||
"corn": "Corn", "soybeans": "Soybean", "silage": "Silage",
|
||||
}.get(p.crop, p.crop.title())
|
||||
|
||||
head: list[str] = [
|
||||
f"# {crop_label} yield trial — {p.state_name}, {p.year}",
|
||||
"",
|
||||
f"- **Source:** Golden Harvest plot report (cross-vendor head-to-head)",
|
||||
f"- **Crop:** {crop_label}",
|
||||
f"- **State:** {p.state_name} ({p.state_abbrev.upper()})",
|
||||
f"- **Year:** {p.year}",
|
||||
f"- **Plot ID:** {p.plot_id}",
|
||||
]
|
||||
if p.cooperator:
|
||||
head.append(f"- **Cooperator:** {p.cooperator}")
|
||||
if p.planted_date:
|
||||
head.append(f"- **Planted:** {p.planted_date}")
|
||||
if p.harvested_date:
|
||||
head.append(f"- **Harvested:** {p.harvested_date}")
|
||||
if p.population:
|
||||
head.append(f"- **Population:** {p.population:,} seeds/acre")
|
||||
if p.row_width:
|
||||
head.append(f"- **Row width:** {p.row_width}\"")
|
||||
head.append(f"- **URL:** {p.source_url}")
|
||||
head.append("")
|
||||
head.append("---")
|
||||
head.append("")
|
||||
|
||||
sections: list[str] = []
|
||||
if p.results:
|
||||
# Discover all metric columns present across results, in
|
||||
# first-seen order. This keeps corn (Yield/MST/...) and silage
|
||||
# (Ton/Acre/Milk/Beef) using their own header sets.
|
||||
metric_keys: list[str] = []
|
||||
seen_keys: set[str] = set()
|
||||
for r in p.results:
|
||||
for k in r.metrics.keys():
|
||||
if k not in seen_keys:
|
||||
seen_keys.add(k)
|
||||
metric_keys.append(k)
|
||||
|
||||
sections.append("## Results (top-down by rank)")
|
||||
sections.append("")
|
||||
header_cells = ["Rank", "Brand", "Product", "Traits"] + metric_keys
|
||||
sections.append("| " + " | ".join(header_cells) + " |")
|
||||
sections.append("|" + "|".join(["---"] * len(header_cells)) + "|")
|
||||
for r in p.results:
|
||||
row = [
|
||||
str(r.rank) if r.rank is not None else "-",
|
||||
r.brand or "-",
|
||||
r.product or "-",
|
||||
r.traits or "-",
|
||||
]
|
||||
for k in metric_keys:
|
||||
v = r.metrics.get(k)
|
||||
if v is None:
|
||||
row.append("-")
|
||||
elif isinstance(v, (int, float)):
|
||||
# Dollar columns rendered with $ prefix
|
||||
if "Revenue" in k or "$" in k:
|
||||
row.append(f"${v:.2f}")
|
||||
else:
|
||||
row.append(str(v))
|
||||
else:
|
||||
row.append(str(v))
|
||||
sections.append("| " + " | ".join(row) + " |")
|
||||
sections.append("")
|
||||
|
||||
# Compact text summary for embedder signal — uses the primary
|
||||
# metric (Yield for corn/soy, Ton/Acre for silage).
|
||||
top = p.results[: min(5, len(p.results))]
|
||||
if top:
|
||||
primary_label, _ = top[0].primary_metric
|
||||
if primary_label:
|
||||
summary = ", ".join(
|
||||
f"{r.product or '?'} ({r.brand or '?'}) {r.primary_metric[1]}"
|
||||
for r in top
|
||||
if r.primary_metric[1] is not None
|
||||
)
|
||||
if summary:
|
||||
sections.append(f"Top {len(top)} by {primary_label}: {summary}.")
|
||||
sections.append("")
|
||||
|
||||
return "\n".join(head) + "\n".join(sections)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- write
|
||||
|
||||
|
||||
def write_plot(prod: PlotReport, body_md: str) -> None:
|
||||
CORPUS_DIR.mkdir(parents=True, exist_ok=True)
|
||||
md_path = CORPUS_DIR / f"{prod.source_key}.md"
|
||||
json_path = CORPUS_DIR / f"{prod.source_key}.json"
|
||||
|
||||
md_path.write_text(body_md, encoding="utf-8")
|
||||
sidecar = {
|
||||
"source": "gh_plot_reports",
|
||||
"source_key": prod.source_key,
|
||||
"data_type": "trial",
|
||||
"vendor": "Syngenta", # Golden Harvest publishes the trial
|
||||
"brand": "Golden Harvest",
|
||||
"crop": prod.crop,
|
||||
"state": prod.state_name,
|
||||
"state_abbrev": prod.state_abbrev,
|
||||
"year": prod.year,
|
||||
"plot_id": prod.plot_id,
|
||||
"cooperator": prod.cooperator,
|
||||
"planted_date": prod.planted_date,
|
||||
"harvested_date": prod.harvested_date,
|
||||
"population_seeds_per_acre": prod.population,
|
||||
"row_width_in": prod.row_width,
|
||||
"results": [
|
||||
{
|
||||
"rank": r.rank,
|
||||
"brand": r.brand,
|
||||
"product": r.product,
|
||||
"traits": r.traits,
|
||||
# All per-column metrics verbatim. Corn/soy: Yield,
|
||||
# %MST, Test Weight, Gross Revenue. Silage: Ton/Acre,
|
||||
# Milk Per Acre, Milk Per Ton, Beef Per Acre, Beef Per
|
||||
# Ton. (Plus any other column the source publishes.)
|
||||
"metrics": r.metrics,
|
||||
"entry_num": r.entry_num,
|
||||
}
|
||||
for r in prod.results
|
||||
],
|
||||
"n_results": len(prod.results),
|
||||
"source_urls": [prod.source_url],
|
||||
"fetched_at": datetime.now(timezone.utc).isoformat(),
|
||||
"scraper_version": SCRAPER_VERSION,
|
||||
}
|
||||
json_path.write_text(
|
||||
json.dumps(sidecar, indent=2, ensure_ascii=False) + "\n",
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- pipeline
|
||||
|
||||
|
||||
def process_plot(
|
||||
http: RateLimitedSession,
|
||||
*,
|
||||
url: str,
|
||||
crop: str,
|
||||
state: str,
|
||||
year: int,
|
||||
plot_id: str,
|
||||
force: bool,
|
||||
) -> tuple[str, PlotReport | None]:
|
||||
sk = source_key_for(crop, state, year, plot_id)
|
||||
md_path = CORPUS_DIR / f"{sk}.md"
|
||||
if md_path.exists() and not force:
|
||||
return "skipped", None
|
||||
try:
|
||||
prod = fetch_plot_detail(http, url, crop, state, year, plot_id)
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.error("detail fetch failed for %s: %s", url, exc)
|
||||
return "failed", None
|
||||
if prod is None:
|
||||
return "missing", None
|
||||
body = render_markdown(prod)
|
||||
write_plot(prod, body)
|
||||
return "written", prod
|
||||
|
||||
|
||||
def run(
|
||||
*,
|
||||
limit: int | None,
|
||||
force: bool,
|
||||
only_crop: str | None,
|
||||
only_state: str | None,
|
||||
only_year: int | None,
|
||||
include_2023: bool,
|
||||
) -> int:
|
||||
CORPUS_DIR.mkdir(parents=True, exist_ok=True)
|
||||
http = RateLimitedSession()
|
||||
|
||||
crops = {only_crop} if only_crop else {"corn", "soybeans", "silage"}
|
||||
states = {only_state} if only_state else None
|
||||
if only_year:
|
||||
years = {only_year}
|
||||
elif include_2023:
|
||||
years = {2023, 2024, 2025}
|
||||
else:
|
||||
years = {2024, 2025}
|
||||
|
||||
targets = discover_plots(http, crops=crops, states=states, years=years)
|
||||
|
||||
counts = {"written": 0, "skipped": 0, "missing": 0, "failed": 0}
|
||||
processed = 0
|
||||
for url, crop, state, year, plot_id in targets:
|
||||
if limit is not None and processed >= limit:
|
||||
break
|
||||
processed += 1
|
||||
status, prod = process_plot(
|
||||
http, url=url, crop=crop, state=state, year=year,
|
||||
plot_id=plot_id, force=force,
|
||||
)
|
||||
counts[status] = counts.get(status, 0) + 1
|
||||
if prod is not None and processed <= 5 or processed % 100 == 0:
|
||||
log.info(
|
||||
"[%d/%s] %s %s | results=%d coop=%s",
|
||||
processed, str(limit) if limit else len(targets),
|
||||
source_key_for(crop, state, year, plot_id), status,
|
||||
len(prod.results) if prod else 0,
|
||||
(prod.cooperator if prod else "-") or "-",
|
||||
)
|
||||
|
||||
log.info(
|
||||
"done: processed=%d written=%d skipped=%d missing=%d failed=%d (of %d candidates)",
|
||||
processed, counts["written"], counts["skipped"],
|
||||
counts["missing"], counts["failed"], len(targets),
|
||||
)
|
||||
return 0 if counts["failed"] == 0 else 1
|
||||
|
||||
|
||||
# --------------------------------------------------------------------- CLI
|
||||
|
||||
|
||||
def _build_argparser() -> argparse.ArgumentParser:
|
||||
p = argparse.ArgumentParser(
|
||||
prog="scrape.sources.gh_plot_reports",
|
||||
description="Scrape Golden Harvest cross-vendor plot reports (yield trials).",
|
||||
)
|
||||
p.add_argument("--limit", type=int, default=None,
|
||||
help="Stop after processing N plots (default: all).")
|
||||
p.add_argument("--force", action="store_true",
|
||||
help="Re-fetch even if the markdown file already exists.")
|
||||
p.add_argument("--crop", default=None,
|
||||
choices=("corn", "soybeans", "silage"),
|
||||
help="Limit to one crop.")
|
||||
p.add_argument("--state", default=None,
|
||||
help="Limit to one state (2-letter abbrev: ia, il, ne, ...).")
|
||||
p.add_argument("--year", type=int, default=None, choices=(2023, 2024, 2025),
|
||||
help="Limit to one year.")
|
||||
p.add_argument("--include-2023", action="store_true",
|
||||
help="Include 2023 plot reports (default: 2024-2025 only).")
|
||||
p.add_argument("--log-level", default=os.environ.get("LOG_LEVEL", "INFO"))
|
||||
return p
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
args = _build_argparser().parse_args(argv)
|
||||
logging.basicConfig(
|
||||
level=args.log_level.upper(),
|
||||
format="%(asctime)s %(levelname)s %(name)s %(message)s",
|
||||
stream=sys.stderr,
|
||||
)
|
||||
return run(
|
||||
limit=args.limit,
|
||||
force=args.force,
|
||||
only_crop=args.crop,
|
||||
only_state=args.state.lower() if args.state else None,
|
||||
only_year=args.year,
|
||||
include_2023=args.include_2023,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
+85
-18
@@ -5,8 +5,16 @@
|
||||
{
|
||||
"name": "bayer_seeds",
|
||||
"vendor": "Bayer",
|
||||
"brands": ["DEKALB", "Asgrow", "WestBred"],
|
||||
"crops": ["corn", "soybeans", "wheat"],
|
||||
"brands": [
|
||||
"DEKALB",
|
||||
"Asgrow",
|
||||
"WestBred"
|
||||
],
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans",
|
||||
"wheat"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 475,
|
||||
"base_url": "https://cropscience.bayer.us",
|
||||
@@ -17,65 +25,124 @@
|
||||
{
|
||||
"name": "golden_harvest",
|
||||
"vendor": "Syngenta",
|
||||
"brands": ["Golden Harvest"],
|
||||
"crops": ["corn", "soybeans"],
|
||||
"brands": [
|
||||
"Golden Harvest"
|
||||
],
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 175,
|
||||
"base_url": "https://www.goldenharvestseeds.com",
|
||||
"scope_filter": "All sitemap-listed corn + soybean varieties.",
|
||||
"tos_check_date": "2026-05-25",
|
||||
"schema_notes": "Disease ratings published on 9-to-1 scale (9 = best). Normalize to 1-9 (9 = best) at chunk time to match Bayer/NK/AgriPro convention. Note original direction in chunk_0 preamble. Tech-sheet PDF URLs in the sitemap are stale (250331) — resolve live URL from product HTML, not sitemap entry."
|
||||
"schema_notes": "Disease ratings published on 9-to-1 scale (9 = best). Normalize to 1-9 (9 = best) at chunk time to match Bayer/NK/AgriPro convention. Note original direction in chunk_0 preamble. Tech-sheet PDF URLs in the sitemap are stale (250331) \u2014 resolve live URL from product HTML, not sitemap entry."
|
||||
},
|
||||
{
|
||||
"name": "nk",
|
||||
"vendor": "Syngenta",
|
||||
"brands": ["NK"],
|
||||
"crops": ["corn", "soybeans"],
|
||||
"brands": [
|
||||
"NK"
|
||||
],
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 29,
|
||||
"base_url": "https://www.syngenta-us.com",
|
||||
"pdf_cdn": "https://assets.syngentaebiz.com/pdf/techsheets/",
|
||||
"scope_filter": "All NK corn + soy varieties. No wheat (NK doesn't sell wheat in US).",
|
||||
"tos_check_date": "2026-05-24",
|
||||
"schema_notes": "Disease + agronomic ratings live in tech-sheet PDFs only — need pdfplumber. PDF URLs share format `<CODE>_YYMMDD.pdf` with Golden Harvest, so the same fetcher works for both."
|
||||
"schema_notes": "Disease + agronomic ratings live in tech-sheet PDFs only \u2014 need pdfplumber. PDF URLs share format `<CODE>_YYMMDD.pdf` with Golden Harvest, so the same fetcher works for both."
|
||||
},
|
||||
{
|
||||
"name": "agripro",
|
||||
"vendor": "Syngenta",
|
||||
"brands": ["AgriPro"],
|
||||
"crops": ["wheat", "barley"],
|
||||
"brands": [
|
||||
"AgriPro"
|
||||
],
|
||||
"crops": [
|
||||
"wheat",
|
||||
"barley"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 24,
|
||||
"base_url": "https://www.agriprowheat.com",
|
||||
"scope_filter": "All wheat classes (HRW/HRS/HWS/SWW/SWS) + barley. NO SRW — Syngenta's SRW lives at GrowProGenetics.com under a separate brand.",
|
||||
"scope_filter": "All wheat classes (HRW/HRS/HWS/SWW/SWS) + barley. NO SRW \u2014 Syngenta's SRW lives at GrowProGenetics.com under a separate brand.",
|
||||
"tos_check_date": "2026-05-24",
|
||||
"schema_notes": "Drupal Views form; server-rendered HTML. CoAXium trait flag is implicit in product family; Clearfield/CL2 trait IS in this catalog."
|
||||
},
|
||||
{
|
||||
"name": "becks_pfr",
|
||||
"vendor": "Beck's Hybrids",
|
||||
"brands": ["Beck's PFR"],
|
||||
"crops": ["corn", "soybeans", "wheat"],
|
||||
"brands": [
|
||||
"Beck's PFR"
|
||||
],
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans",
|
||||
"wheat"
|
||||
],
|
||||
"verdict": "yellow",
|
||||
"expected_count": 2089,
|
||||
"base_url": "https://www.beckshybrids.com",
|
||||
"api_base": "https://mc8v24rf.api.sanity.io",
|
||||
"scope_filter": "All Practical Farm Research publications since 2015. PFR is head-to-head agronomy trials — fungicide timing, planting-date studies, hybrid-by-population, etc.",
|
||||
"scope_filter": "All Practical Farm Research publications since 2015. PFR is head-to-head agronomy trials \u2014 fungicide timing, planting-date studies, hybrid-by-population, etc.",
|
||||
"tos_check_date": "2026-05-24",
|
||||
"schema_notes": "Public Sanity GROQ API, no auth required. Records have title/year/crop/key-findings/full-text. Treat PFR docs as a research corpus, not variety records — the chunk_0 includes the study's tl;dr finding."
|
||||
"schema_notes": "Public Sanity GROQ API, no auth required. Records have title/year/crop/key-findings/full-text. Treat PFR docs as a research corpus, not variety records \u2014 the chunk_0 includes the study's tl;dr finding."
|
||||
},
|
||||
{
|
||||
"name": "becks_products",
|
||||
"vendor": "Beck's Hybrids",
|
||||
"brands": ["Beck's"],
|
||||
"crops": ["corn", "soybeans", "wheat"],
|
||||
"brands": [
|
||||
"Beck's"
|
||||
],
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans",
|
||||
"wheat"
|
||||
],
|
||||
"verdict": "yellow",
|
||||
"expected_count": 860,
|
||||
"base_url": "https://www.beckshybrids.com",
|
||||
"api_base": "https://mc8v24rf.api.sanity.io",
|
||||
"scope_filter": "All Beck's product records — corn + soy + wheat. Identity + RM/MG only.",
|
||||
"scope_filter": "All Beck's product records \u2014 corn + soy + wheat. Identity + RM/MG only.",
|
||||
"tos_check_date": "2026-05-24",
|
||||
"schema_notes": "Sanity GROQ exposes identity (name, RM/MG, basic traits) but agronomic + disease ratings are SeedIQ-gated (requires browser cookie). Deferred until the SeedIQ XHR endpoint is captured from a logged-in browser session. Without ratings, products are reference-only; the MCP can confirm 'Beck's has hybrid X at RM 112 with Enlist trait' but not 'rate it against drought'."
|
||||
},
|
||||
{
|
||||
"name": "gh_plot_reports",
|
||||
"vendor": "Syngenta",
|
||||
"brand_aggregator": "Golden Harvest publishes",
|
||||
"crops": [
|
||||
"corn",
|
||||
"soybeans",
|
||||
"silage"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 4618,
|
||||
"base_url": "https://www.goldenharvestseeds.com",
|
||||
"scope_filter": "sitemap-listed plot reports 2024 and 2025 (4,618 reports). 2023 (3,619 reports) deferred to a future pass \u2014 most recent data is most relevant for current decisions.",
|
||||
"tos_check_date": "2026-05-25",
|
||||
"schema_notes": "Cross-vendor head-to-head yield trials at specific state/year/site. Each report lists products from multiple brands (NK, DEKALB, GH, etc.) with rank, yield, %MST, test weight, gross revenue. URL: /<crop>/plot-report/<state>/<year>/<id>. Same site/auth as golden_harvest variety scraper.",
|
||||
"data_type": "trial"
|
||||
},
|
||||
{
|
||||
"name": "agripro_trials",
|
||||
"vendor": "Syngenta",
|
||||
"brand_aggregator": "AgriPro publishes",
|
||||
"crops": [
|
||||
"wheat"
|
||||
],
|
||||
"verdict": "green",
|
||||
"expected_count": 38,
|
||||
"base_url": "https://agriprowheat.com",
|
||||
"scope_filter": "PDF trial summaries linked from /trials-data. Regional wheat performance (PNW, Western Plains, NE Colorado, etc.).",
|
||||
"tos_check_date": "2026-05-25",
|
||||
"schema_notes": "PDF tables of varieties tested per region per year. pdfplumber for table extraction.",
|
||||
"data_type": "trial"
|
||||
}
|
||||
],
|
||||
"_excluded_sources": [
|
||||
|
||||
Reference in New Issue
Block a user