CI was failing on the "Rebuild indexes from committed corpus" step
with httpx.ConnectError [Errno 111] (connection refused). Cause:
`localhost:11434` in the OLLAMA_URL pool resolves to the Gitea
Actions runner container's own loopback, where no Ollama is
listening, not to the host. Fix: drop localhost from CI's pool. It
stays useful for dev runs from the workstation, where the TITAN X
serves Ollama on the host loopback.
Final CI pool — 3 LAN endpoints, weighted to .0.125 (4090):
.0.125:11434 ×4 (RTX 40-series, 242 embeds/sec)
.0.2:11436 ×2 (GPU-pinned, 108 embeds/sec)
.0.2:11435 ×1 (GPU-pinned, 72 embeds/sec)
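A minimal sketch of how a weighted pool like this could be built,
assuming the client filters loopback entries in CI and expands
integer weights into a round-robin schedule. The hostnames are
placeholders (the real LAN addresses are abbreviated above), and
`drop_loopback` / `expand_weighted` are hypothetical helpers, not
functions from this repo:

```python
from urllib.parse import urlsplit

# Hosts that point back at the container itself when run inside CI.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}

# Placeholder hostnames; weights mirror the x4 / x2 / x1 split above.
CI_POOL = [
    ("http://host-125.example:11434", 4),
    ("http://host-2.example:11436", 2),
    ("http://host-2.example:11435", 1),
]


def drop_loopback(urls):
    """Remove endpoints whose host is the local loopback."""
    return [u for u in urls if urlsplit(u).hostname not in LOOPBACK_HOSTS]


def expand_weighted(pool):
    """Repeat each endpoint by its integer weight to form a schedule."""
    schedule = []
    for url, weight in pool:
        schedule.extend([url] * weight)
    return schedule


schedule = expand_weighted(CI_POOL)
```

Cycling through `schedule` sends 4 of every 7 requests to the first
endpoint, which roughly tracks the measured throughput ratios.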
deploy/docker-compose.yml — rewrite to match Drawbar's actual
parent-stack pattern, learned by inspecting how chem-mcp is
deployed on trashpanda:
- Service name `seed-mcp` (matches chem-mcp's pattern). Reached
via docker DNS as `seed-mcp:8080` from drawbar-backend-api.
- Internal-only (no host port), expose 8080 only.
- MCP_PORT=8080 inside container (chem-mcp uses 8080 too).
- OLLAMA_URL via host.docker.internal:11434 (trashpanda's Ollama
runs on the host). extra_hosts maps host-gateway.
- RERANK_URL: http://llama-rerank:8080. Caveat: llama-rerank is on
the default `bridge` network, not drawbar-backend_default, so it
is unreachable from that network and chem-mcp's reranker silently
fails. Documented patch:
docker network connect drawbar-backend_default llama-rerank
This fixes rerank for both chem-mcp (today: dense-only fallback)
and the new seed-mcp.
- Watchtower label set so CI pushes to :latest auto-deploy.
Documented llama-rerank service block as an alternative for
bringing the sidecar fully into the parent compose stack, with the
ubatch-size flag the seed corpus needs.
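For illustration, a sketch of a dense-only fallback that reports the
failure instead of hiding it. This is not chem-mcp's or seed-mcp's
actual code: `rerank_or_dense` and the injected `rerank` callable are
hypothetical, and a real HTTP client would catch its own transport
error type (httpx.ConnectError is not an OSError):

```python
from typing import Callable, Sequence


def rerank_or_dense(
    query: str,
    dense_hits: Sequence[str],
    rerank: Callable[[str, Sequence[str]], Sequence[float]],
):
    """Return (hits, reranked). Keep dense order if the reranker is down."""
    try:
        scores = rerank(query, dense_hits)
    except OSError as exc:  # DNS failure, connection refused, and similar
        # Log the fallback so a misrouted sidecar is visible in the logs.
        print(f"rerank unavailable, using dense order: {exc}")
        return list(dense_hits), False
    # Sort hit indexes by descending rerank score.
    order = sorted(range(len(dense_hits)), key=lambda i: scores[i], reverse=True)
    return [dense_hits[i] for i in order], True


def unreachable(query, docs):
    """Stand-in for a sidecar that cannot be reached over the network."""
    raise ConnectionRefusedError("llama-rerank not reachable")


def fake_scores(query, docs):
    """Stand-in scorer that prefers the second document."""
    return [0.1, 0.9]
```

Returning the `reranked` flag alongside the hits lets a caller or a
health check notice when every request is taking the fallback path.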
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deploy/docker-compose.yml — replace <product>/<registry> placeholders
with concrete values for Drawbar's stack:
- image: git.jpaul.io/justin/seed-mcp:latest (pulled through the CF
tunnel; CI pushes via LAN 192.168.0.2:1234 to avoid the 100 MB
body cap)
- container_name: seed-mcp
- ports 8001:8000 (host-side 8001 so it does not collide with
crop-chem-docs on 8000)
- PRODUCT_NAME=crop_seed, hybrid search enabled, stateless HTTP
- llama-rerank shared with crop-chem-docs (NOT redefined here —
expected to already be in Drawbar's parent compose network)
- networks.drawbar-mcp external: true so seed-mcp joins the existing
cross-MCP shared network
.gitignore — corpus/ is now COMMITTED, not ignored. The monthly
refresh workflow scrapes and commits corpus changes; the image-only
workflow rebuilds indexes from the committed corpus. Allowing the
corpus to flow through git means the :corpus-YYYY.MM.DD image tag
pins to a specific seed-catalog snapshot. chroma/ and bm25/ remain
ignored — those are deterministically derived from corpus.
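As a sketch of the tag format only: the helper below renders a date
as `corpus-YYYY.MM.DD`. Which date the workflow actually uses (commit
date, scrape date, build date) is not stated here, so `snapshot` is
an assumption, and `corpus_tag` is a hypothetical name:

```python
from datetime import date


def corpus_tag(snapshot: date) -> str:
    """Format a snapshot date as a zero-padded corpus-YYYY.MM.DD tag."""
    return snapshot.strftime("corpus-%Y.%m.%d")


example = corpus_tag(date(2025, 1, 5))  # "corpus-2025.01.05"
```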
Initial committed snapshot: 614 varieties.
- bayer_seeds: 475 (DEKALB 288 + Asgrow 102 + WestBred 85)
- golden_harvest: 139 (Syngenta corn + soy; 36 sitemap URLs
returned 302 redirects and are treated as discontinued)
rag/chunk.py — normalize brand and crop to uppercase/lowercase in
Chroma metadata (brand uppercased, crop lowercased) so cross-vendor
brand-filter lookups don't break on inconsistent casing. Bayer
stores "DEKALB" and Golden Harvest stores "Golden Harvest";
_build_where uppercases the user-supplied brand, which matched the
former but not the latter before this fix. Sidecar JSON keeps the
original casing for display.
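The idea can be sketched as follows, assuming metadata is a flat dict
with `brand` and `crop` keys. `normalize_meta` and `build_where` are
hypothetical stand-ins, not the actual code in rag/chunk.py or the
real _build_where; the filter shape follows Chroma's `where` syntax:

```python
def normalize_meta(meta: dict) -> dict:
    """Return a copy with brand uppercased and crop lowercased."""
    out = dict(meta)
    if "brand" in out:
        out["brand"] = out["brand"].upper()
    if "crop" in out:
        out["crop"] = out["crop"].lower()
    return out


def build_where(brand=None, crop=None):
    """Build a Chroma-style filter using the same casing rules."""
    clauses = []
    if brand:
        clauses.append({"brand": brand.upper()})
    if crop:
        clauses.append({"crop": crop.lower()})
    if not clauses:
        return None
    if len(clauses) == 1:
        return clauses[0]
    # Chroma's $and operator takes a list of two or more clauses.
    return {"$and": clauses}


stored = normalize_meta({"brand": "Golden Harvest", "crop": "Corn"})
where = build_where(brand="golden harvest")
```

Because the index side and the query side apply the same rule,
`where["brand"]` equals `stored["brand"]` whatever casing the vendor
or the user supplied.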
Stub scrapers (nk, agripro, becks_pfr, becks_products) — change
return code from 2 to 0 so the monthly-refresh CI workflow doesn't
fail on deferred sources. Real implementations will return 0 on
success / 1 on failure when they ship.
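A minimal sketch of the exit-code contract described above, assuming
each scraper's entry point passes the returned int to sys.exit. The
names `run_stub`, `EXIT_OK`, and `EXIT_FAILED` are hypothetical:

```python
import sys

EXIT_OK = 0      # success, or a deferred source that is skipped
EXIT_FAILED = 1  # reserved for real scrape failures once implemented


def run_stub(source: str) -> int:
    """Report that a source is deferred and return a passing exit code."""
    print(f"{source}: scraper not implemented yet, skipping", file=sys.stderr)
    return EXIT_OK


codes = [run_stub(s) for s in ("nk", "agripro", "becks_pfr", "becks_products")]
```

Writing the notice to stderr keeps the skip visible in the workflow
log even though the step passes.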
Smoke-tested cross-vendor retrieval against the 614-chunk index:
- list_versions shows both vendors with correct facet counts
- broad "corn hybrid 100 RM" query returns both DEKALB and Golden
Harvest hits in top 5
- brand='Golden Harvest' filter returns 3 GH-only varieties
- variety-code prefilter still works (E085Z5 → top hit on GH)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>