8766d73327
Image pushed to git.jpaul.io/justin/crop-chem-docs with three tags:
:latest — Watchtower auto-pull target
:a97107de4636 — commit-sha rollback pin
:corpus-2026.05.24 — corpus-snapshot pin (prod-recommended)
Drawbar compose snippet at deploy/drawbar-compose-snippet.md.
Wires the container against the existing infra:
- Ollama pool: 192.168.0.2:11434, 192.168.0.2:11435,
192.168.0.125:11434, 10.10.1.65:11434
- Reranker: http://10.10.1.65:8082
- HYBRID_SEARCH=true (production retrieval — BM25 + dense + rerank)
- Exposes streamable-HTTP MCP on port 8000
Pull path uses git.jpaul.io (public hostname, CF-fronted; pull
response bodies aren't capped). Push path uses 192.168.0.2:1234
(LAN endpoint, bypasses CF 100MB body cap). Same registry,
different URLs — per the template gotcha doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3.8 KiB
Drawbar deploy — crop-chem-docs MCP server snippet
Drop this into Drawbar's docker-compose.yml. Targets the existing
trashpanda infra: Ollama pool on the LAN, llama-rerank container
on Tesla P4, Cloudflare Tunnel out front.
Pre-reqs (one-time on the deploy host)
- Log in to the Gitea registry so the host can pull:
  docker login git.jpaul.io -u justin # PAT for password
- Ollama embed pool reachable from this host (already up):
  - 192.168.0.2:11434, 192.168.0.2:11435 (Gitea-host GPUs)
  - 192.168.0.125:11434 (Windows GPU)
- Reranker reachable (already up on trashpanda):
  http://10.10.1.65:8082
Compose service
services:
  crop-chem-docs:
    image: git.jpaul.io/justin/crop-chem-docs:latest
    # Or pin to an immutable tag for prod:
    # image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    container_name: crop-chem-docs
    restart: unless-stopped
    ports:
      - "8001:8000" # MCP server (streamable-http). Adjust host port.
    environment:
      # Embedder pool. Round-robined for parallel search.
      OLLAMA_URL: "http://192.168.0.2:11434,http://192.168.0.2:11435,http://192.168.0.125:11434,http://10.10.1.65:11434"
      # Reranker on trashpanda's Tesla P4.
      RERANK_URL: "http://10.10.1.65:8082"
      # Production retrieval: BM25 + dense fused, then reranked.
      HYBRID_SEARCH: "true"
      # Override docs URL shown to the LLM if needed (default is EPA PPLS portal).
      # PRODUCT_DOCS_URL: "https://..."
    labels:
      # Opts this container in to Watchtower, which auto-pulls :latest on update.
      # Watchtower already runs elsewhere on the host, so no extra service is
      # needed here; drop the label (or set it to "false") when pinning a tag.
      com.centurylinklabs.watchtower.enable: "true"
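The `OLLAMA_URL` value above is a single comma-separated string that the server round-robins across. How the server actually parses it is not shown in this document; the following is only a minimal sketch of that idea, and `parse_pool` and `next_endpoint` are hypothetical names used for illustration.

```python
from itertools import cycle


def parse_pool(value: str) -> list[str]:
    """Split a comma-separated URL list, dropping blanks and trailing slashes."""
    return [u.strip().rstrip("/") for u in value.split(",") if u.strip()]


# Same value as the compose snippet's OLLAMA_URL.
OLLAMA_URL = (
    "http://192.168.0.2:11434,http://192.168.0.2:11435,"
    "http://192.168.0.125:11434,http://10.10.1.65:11434"
)

pool = parse_pool(OLLAMA_URL)
next_endpoint = cycle(pool)  # endless round-robin iterator over the pool

# Five picks from a four-entry pool wrap around to the first endpoint.
picked = [next(next_endpoint) for _ in range(5)]
print(picked)
```

A plain cycle does not skip unhealthy endpoints; a real pool would likely need a health check or retry on top of this.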
Test from the host
# Probe MCP's HTTP transport. /sse is the path used by the older SSE
# transport; streamable-http servers commonly serve /mcp instead, so adjust
# to whichever endpoint your MCP client expects:
curl -s http://localhost:8001/sse

# Or exec into the container and run the stdio transport. Use -i without -t:
# stdin is redirected from /dev/null, so it is not a TTY.
docker exec -i crop-chem-docs \
  python -m docs_mcp.server --transport stdio < /dev/null
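For a probe that speaks the protocol rather than just opening a socket, the sketch below builds an MCP `initialize` request using only the standard library. The `/mcp` path, the `protocolVersion` string, and the `probe` client name are assumptions, not values taken from this repository; adjust them to what the server actually serves.

```python
import json
import urllib.request


def build_initialize_request(base_url: str, path: str = "/mcp") -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC `initialize` POST for an MCP server."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed; match your server
            "capabilities": {},
            "clientInfo": {"name": "probe", "version": "0.0.1"},  # hypothetical
        },
    }
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def probe(base_url: str) -> tuple[int, str]:
    """Send the request; return the HTTP status and response content type."""
    req = build_initialize_request(base_url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status, resp.headers.get("Content-Type", "")


request = build_initialize_request("http://localhost:8001")
print(request.get_method(), request.full_url)
```

Calling `probe("http://localhost:8001")` on the deploy host sends the request; a 200 with a JSON or event-stream content type suggests the transport is up.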
What the container exposes
| Tool | What it does |
|---|---|
| `search_docs` | Hybrid+rerank pesticide-label search with optional filters |
| `get_page` | Full label markdown + metadata by (source, source_key) |
| `list_versions` | Discover sources, product classes, signal words, registrants |
| `corpus_status` | Counts + freshness; useful for health probes |
| `crop_chem_api_lessons` | Curated agronomy/label-handling knowledge; call before recommending |
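The hybrid search behind `search_docs` fuses a BM25 ranking with a dense ranking before reranking. This document does not say which fusion method the server uses; reciprocal rank fusion (RRF) is one common choice, sketched here purely as an illustration. The function name, the `k=60` constant, and the document ids are all assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked id lists: each id scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]   # hypothetical lexical ranking
dense_hits = ["b", "d", "a"]  # hypothetical embedding ranking

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Ids that appear in both lists rise to the top; the fused candidates would then go to the reranker for final ordering.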
Versioning
Tags published by the Gitea Actions workflows:
| Tag | When | Use for |
|---|---|---|
| `:latest` | Every monthly refresh + every code push | Dev / Watchtower auto-pull |
| `:<sha12>` | Every build | Rollback pin |
| `:corpus-YYYY.MM.DD` | Every build | Pin to a specific corpus snapshot in prod |
The `:corpus-YYYY.MM.DD` tag is the right one for production: it
guarantees that the running container has a known, frozen corpus that
matches the labels you've validated against.
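The three tag shapes can be derived from a commit SHA and a corpus date. The workflow's real tagging step is not shown here, so this is only a sketch of the naming scheme; `image_tags` is a hypothetical helper, and the trailing zeros in the example SHA are padding, not the real commit hash.

```python
from datetime import date

IMAGE = "git.jpaul.io/justin/crop-chem-docs"


def image_tags(commit_sha: str, corpus_date: date) -> list[str]:
    """Return the :latest, :<sha12>, and :corpus-YYYY.MM.DD references."""
    return [
        f"{IMAGE}:latest",
        f"{IMAGE}:{commit_sha[:12]}",               # first 12 hex chars
        f"{IMAGE}:corpus-{corpus_date:%Y.%m.%d}",   # zero-padded date
    ]


# Known 12-char prefix from the pushed image, padded with zeros as a placeholder.
example_sha = "a97107de4636" + "0" * 28
tags = image_tags(example_sha, date(2026, 5, 24))
print("\n".join(tags))
```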
Updating the corpus
Two paths:
- Wait for the monthly cron: 1st @ 06:00 UTC, full re-scrape
  of Bayer + EPA PPLS, then reindex, then image push. Watchtower
  pulls the new `:latest` automatically.
- Trigger manually in Gitea Actions UI → `Monthly corpus refresh` →
  `Run workflow`. Optional `sources` input for single-source refresh
  (e.g., `bayer` only).
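To judge whether waiting for the cron is acceptable, it helps to know when the next scheduled run falls. The sketch below computes the next "1st of the month at 06:00 UTC" from a timezone-aware timestamp; `next_monthly_refresh` is a hypothetical helper, not part of the repository.

```python
from datetime import datetime, timezone


def next_monthly_refresh(now: datetime) -> datetime:
    """Next 1st-of-month 06:00 UTC strictly after `now` (must be tz-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(day=1, hour=6, minute=0, second=0, microsecond=0)
    if candidate > now:
        return candidate  # still before this month's run
    # Otherwise roll over to the 1st of the following month.
    if now.month == 12:
        return candidate.replace(year=now.year + 1, month=1)
    return candidate.replace(month=now.month + 1)


after_build = datetime(2026, 5, 24, 12, 0, tzinfo=timezone.utc)
print(next_monthly_refresh(after_build).isoformat())
```

If the gap is too long for a label change you need, the manual trigger is the faster path.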
Switching corpus scope
The row-crop filter (corn/soybeans/wheat) is in
scrape/sources/epa_ppls.py as ROW_CROP_KEYWORDS. Edit + push +
let the next workflow run pick it up. Same for the registrant
allowlist at scrape/sources/epa_registrant_allowlist.json.
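The actual contents of `ROW_CROP_KEYWORDS` and the matching logic in `epa_ppls.py` are not shown in this document. As a rough sketch of how such a keyword filter might behave, assuming whole-word, case-insensitive matching, with a keyword list and function name that are illustrative only:

```python
import re

# Illustrative values only; the real list lives in scrape/sources/epa_ppls.py.
ROW_CROP_KEYWORDS = ["corn", "soybean", "soybeans", "wheat"]

_ROW_CROP_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in ROW_CROP_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def mentions_row_crop(label_text: str) -> bool:
    """True if the label text names any row-crop keyword as a whole word."""
    return _ROW_CROP_RE.search(label_text) is not None


print(mentions_row_crop("For use on field corn and soybeans"))
print(mentions_row_crop("Turf and ornamentals only"))
```

Whole-word matching keeps a keyword such as "corn" from matching inside "acorn"; widening the scope would then be a matter of extending the list and letting the next workflow run re-scrape.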