"""Golden Harvest scraper (Syngenta brand).

Discovery: ``https://www.goldenharvestseeds.com/sitemap.xml`` lists every
variety page. The pages are server-rendered HTML, so no headless browser is
required.

Tech-sheet PDFs live on the Syngenta CDN at
``assets.syngentaebiz.com/pdf/techsheets/_YYMMDD.pdf`` and use the same
fetcher pattern as NK.

Two gotchas:

1. **Sitemap PDF dates are stale** (the sitemap was generated 2025-03-31 and
   never updated). Resolve the LIVE PDF URL from the product HTML page, not
   from the sitemap entry.

2. **Disease scale is reversed.** Golden Harvest publishes ratings on a
   9-to-1 scale (1 = best, 9 = worst). Bayer/NK/AgriPro use 1-9 (9 = best).
   Normalize at chunk time so the corpus has a single direction. Record the
   original direction in the chunk_0 preamble:
   "Note: ratings normalized to 1-9 (9 = best). Golden Harvest publishes on
   a 9-to-1 scale natively."

Expected count: ~175 varieties (89 corn + 86 soy). No wheat.

Bonus dataset: ``/plot-report///`` holds ~7,800 regional yield trial records.
Out of scope for v1, but a high-value future ingest for regional placement
recommendations.

TODO: implement. Reuse the PDF-fetch helper that NK uses.
"""
from __future__ import annotations

import sys


def main(argv: list[str] | None = None) -> int:
    """Placeholder entry point: report that the scraper is unimplemented."""
    print(
        "golden_harvest: not implemented yet; see CLAUDE.md for the "
        "disease-scale-reversal gotcha and the live-PDF-URL-resolution "
        "requirement",
        file=sys.stderr,
    )
    # Non-zero exit code so callers can tell the scraper did not run.
    return 2


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
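

# ---------------------------------------------------------------------------
# Illustrative sketch only, not the final implementation.
#
# The two gotchas described in the module docstring could be handled roughly
# as shown here. Everything below is an assumption made for illustration:
#   * the helper names (normalize_rating, find_live_techsheet_url) are
#     hypothetical and do not exist elsewhere in this repo;
#   * the native Golden Harvest scale is assumed to run opposite to the
#     corpus scale (native 1 = best), as the "reversed" note implies;
#   * the live tech-sheet link is assumed to appear as an <a href="..."> on
#     the product page pointing at the assets.syngentaebiz.com CDN path.
# When implementing for real, these belong above main() and should reuse the
# PDF-fetch helper that NK uses.
# ---------------------------------------------------------------------------
import re
from html.parser import HTMLParser
from typing import List, Optional

SCALE_MIN, SCALE_MAX = 1, 9

# Preamble text quoted from the module docstring, recorded in chunk_0.
PREAMBLE_NOTE = (
    "Note: ratings normalized to 1-9 (9 = best). "
    "Golden Harvest publishes on a 9-to-1 scale natively."
)


def normalize_rating(native: int) -> int:
    """Flip a native rating onto the corpus direction (9 = best)."""
    if not SCALE_MIN <= native <= SCALE_MAX:
        raise ValueError(f"rating out of range 1-9: {native!r}")
    # Mirroring around the midpoint maps 1 -> 9, 5 -> 5, 9 -> 1.
    return SCALE_MIN + SCALE_MAX - native


# Matches absolute or protocol-relative links into the tech-sheet CDN path.
_TECHSHEET_RE = re.compile(
    r"(?:https?:)?//assets\.syngentaebiz\.com/pdf/techsheets/[^\s\"'<>]+\.pdf"
)


class _HrefCollector(HTMLParser):
    """Collect the href of every <a> tag in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def find_live_techsheet_url(product_html: str) -> Optional[str]:
    """Return the first tech-sheet PDF link found in the product page HTML.

    Reading the link from the product page, rather than from the sitemap
    entry, sidesteps the stale sitemap dates.
    """
    collector = _HrefCollector()
    collector.feed(product_html)
    for href in collector.hrefs:
        if _TECHSHEET_RE.fullmatch(href):
            return href
    return None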