epa_ppls: narrow row-crop filter to corn/soy/wheat only
App focus is corn, soybeans, and wheat. Dropping the broader
US-row-crops allowlist (cotton/rice/sorghum/milo/barley/oats/rye/
sunflower/peanut/sugar-beet/dry-bean/canola/alfalfa).

Empirical impact (random N=100 sample): broad list matched 17/100
products, narrow list matches 16/100. That is a relative reduction of
only about 6% (1 of 17 matches), because corn/soy/wheat dominate
ag-chem registrations so thoroughly that products registered for
cotton/sorghum/etc. are almost always co-registered for one of
corn/soy/wheat. One sampled product was dropped: a peanut-only
herbicide (2749-614).

Verified live: 524-475 Roundup + 524-591 Warrant kept (CORN/SOYBEAN
sites); 2749-614 AG36448 (PEANUTS only) correctly filtered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -83,23 +83,17 @@ MAX_RETRIES = 4
 # "OATS" naively matches "SHIPS, BOATS, SHIPHOLDS"; bare "RICE" matches
 # "LICORICE"; bare "RYE" matches "FRYER".
 #
-# Scope = the major US row + small-grain + oilseed + sugar/fiber crops the
-# farmer-advisor consumer cares about. Alfalfa included as a common rotation
-# crop; sweet/seed corn included alongside field corn.
+# Scope = the three crops the farmer-advisor consumer focuses on: corn,
+# soybeans, and wheat. Sweet/seed/pop corn included alongside field corn.
+# Empirically (random N=100 sample, 2026-05-23): this narrow allowlist
+# matches ~16% of all PPLS products and only loses ~6% of the broader
+# "all US row crops" hit set, because corn/soy/wheat dominate ag chemistry
+# registrations — almost every product registered for e.g. cotton or
+# sorghum is co-registered for at least one of corn/soy/wheat.
 ROW_CROP_KEYWORDS = (
     "CORN", "MAIZE", "POPCORN",
     "SOYBEAN", "SOYBEANS",
-    "COTTON",
     "WHEAT",
-    "RICE",
-    "SORGHUM", "MILO",
-    "BARLEY", "OATS", "RYE",
-    "SUNFLOWER", "SUNFLOWERS",
-    "PEANUT", "PEANUTS",
-    "SUGAR BEET", "SUGAR BEETS",
-    "DRY BEAN", "DRY BEANS", "FIELD BEAN", "FIELD BEANS",
-    "CANOLA", "RAPESEED",
-    "ALFALFA",
 )
 _ROW_CROP_PATTERNS = tuple(
     re.compile(rf"\b{re.escape(kw)}\b", re.IGNORECASE)