Phase 4-5: deployable container + corpus snapshot + CI fixes

deploy/docker-compose.yml: replace <product>/<registry> placeholders with
concrete values for Drawbar's stack:
  - image: git.jpaul.io/justin/seed-mcp:latest (CF tunnel for pulls; CI
    pushes via LAN 192.168.0.2:1234 to avoid 100 MB body cap)
  - container_name: seed-mcp
  - port 8001:8000 (8001 host-side to not collide with crop-chem-docs on
    8000)
  - PRODUCT_NAME=crop_seed, hybrid search enabled, stateless HTTP
  - llama-rerank shared with crop-chem-docs (NOT redefined here; expected
    to already be in Drawbar's parent compose network)
  - networks.drawbar-mcp external: true so seed-mcp joins the existing
    cross-MCP shared network

.gitignore: corpus/ is now COMMITTED, not ignored. The monthly refresh
workflow scrapes and commits corpus changes; the image-only workflow
rebuilds indexes from the committed corpus. Allowing the corpus to flow
through git means the :corpus-YYYY.MM.DD image tag pins to a specific
seed-catalog snapshot. chroma/ and bm25/ remain ignored, since those are
deterministically derived from corpus.

Initial committed snapshot: 614 varieties.
  - bayer_seeds: 475 (DEKALB 288 + Asgrow 102 + WestBred 85)
  - golden_harvest: 139 (Syngenta corn + soy; 36 sitemap URLs
    302-redirected = discontinued)

rag/chunk.py: normalize brand and crop to uppercase and lowercase
respectively in Chroma metadata so cross-vendor brand-filter lookups don't
break on casing inconsistency (Bayer stores "DEKALB", Golden Harvest stores
"Golden Harvest"; _build_where uppercases the user-supplied brand, which
matched the former but not the latter pre-fix). Sidecar JSON keeps original
casing for display.

Stub scrapers (nk, agripro, becks_pfr, becks_products): change return code
from 2 to 0 so the monthly-refresh CI workflow doesn't fail on deferred
sources. Real implementations will return 0 on success / 1 on failure when
they ship.
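
The casing fix described for rag/chunk.py can be illustrated with a minimal sketch. The helper names below (normalize_metadata, build_where) and the "brand"/"crop" keys are hypothetical stand-ins, not the actual contents of rag/chunk.py or _build_where; the sketch only assumes that brand is stored uppercased, crop lowercased, and that the query side applies the same normalization.

```python
from __future__ import annotations


def normalize_metadata(record: dict) -> dict:
    """Return a copy of `record` with brand uppercased and crop lowercased.

    Hypothetical helper: the key names are assumptions for illustration.
    """
    out = dict(record)
    if isinstance(out.get("brand"), str):
        out["brand"] = out["brand"].strip().upper()
    if isinstance(out.get("crop"), str):
        out["crop"] = out["crop"].strip().lower()
    return out


def build_where(brand: str | None = None, crop: str | None = None) -> dict:
    """Build a Chroma-style equality filter with the same normalization.

    Because index-time and query-time casing agree, a user-supplied
    'Golden Harvest' matches metadata stored as 'GOLDEN HARVEST'.
    """
    clauses = []
    if brand:
        clauses.append({"brand": brand.strip().upper()})
    if crop:
        clauses.append({"crop": crop.strip().lower()})
    if not clauses:
        return {}
    # A single condition is passed as-is; several are combined with $and.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


if __name__ == "__main__":
    stored = normalize_metadata({"brand": "Golden Harvest", "crop": "Corn"})
    query = build_where(brand="golden harvest")
    print(stored, query)
```

Normalizing on both sides keeps the filter an exact-match comparison, which is why the display casing has to live elsewhere (the sidecar JSON in this commit).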

Smoke-tested cross-vendor retrieval against the 614-chunk index:
  - list_versions shows both vendors with correct facet counts
  - broad "corn hybrid 100 RM" query returns both DEKALB and Golden Harvest
    hits in top 5
  - brand='Golden Harvest' filter returns 3 GH-only varieties
  - variety-code prefilter still works (E085Z5 → top hit on GH)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -25,9 +25,12 @@ import sys


 def main(argv: list[str] | None = None) -> int:
-    print("agripro: not implemented yet — Drupal Views form, only wheat in the corpus, no SRW (separate brand)",
+    print("agripro: deferred — Drupal Views form, only wheat in the corpus, no SRW (separate brand). See reference_seed_vendor_recon.md.",
           file=sys.stderr)
-    return 2
+    # Return 0 so the monthly CI workflow doesn't fail when this
+    # source is listed but not yet implemented. Real implementation
+    # will return 0 on success / 1 on failure.
+    return 0


 if __name__ == "__main__":

@@ -36,9 +36,11 @@ import sys


 def main(argv: list[str] | None = None) -> int:
-    print("becks_pfr: not implemented yet — public Sanity GROQ at mc8v24rf.api.sanity.io, ~2089 research docs",
+    print("becks_pfr: deferred — public Sanity GROQ at mc8v24rf.api.sanity.io, ~2089 research docs",
           file=sys.stderr)
-    return 2
+    # Return 0 so the monthly CI workflow doesn't fail when this
+    # source is listed but not yet implemented.
+    return 0


 if __name__ == "__main__":

@@ -39,7 +39,10 @@ import sys
 def main(argv: list[str] | None = None) -> int:
     print("becks_products: deferred — SeedIQ XHR sniff required for ratings, run only if user has captured the endpoint",
           file=sys.stderr)
-    return 2
+    # Return 0 so the monthly CI workflow doesn't fail when this
+    # source is listed but not yet implemented (and may never be,
+    # if SeedIQ gates persist).
+    return 0


 if __name__ == "__main__":

@@ -26,9 +26,12 @@ import sys


 def main(argv: list[str] | None = None) -> int:
-    print("nk: not implemented yet — disease/agronomic ratings come from CDN tech-sheet PDFs only, use pdfplumber",
+    print("nk: deferred — disease/agronomic ratings come from CDN tech-sheet PDFs only, use pdfplumber. See reference_seed_vendor_recon.md.",
           file=sys.stderr)
-    return 2
+    # Return 0 so the monthly CI workflow doesn't fail when this
+    # source is listed but not yet implemented. Real implementation
+    # will return 0 on success / 1 on failure.
+    return 0


 if __name__ == "__main__":