deploy: Drawbar compose snippet — first image is published
Image pushed to git.jpaul.io/justin/crop-chem-docs with three tags:
:latest — Watchtower auto-pull target
:a97107de4636 — commit-sha rollback pin
:corpus-2026.05.24 — corpus-snapshot pin (prod-recommended)
Drawbar compose snippet at deploy/drawbar-compose-snippet.md.
Wires the container against the existing infra:
- Ollama pool: 192.168.0.2:11434, 192.168.0.2:11435,
192.168.0.125:11434, 10.10.1.65:11434
- Reranker: http://10.10.1.65:8082
- HYBRID_SEARCH=true (production retrieval — BM25 + dense + rerank)
- Exposes streamable-HTTP MCP on port 8000
Pull path uses git.jpaul.io (public hostname, CF-fronted; pull
response bodies aren't capped). Push path uses 192.168.0.2:1234
(LAN endpoint, bypasses CF 100MB body cap). Same registry,
different URLs — per the template gotcha doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,102 @@
# Drawbar deploy: `crop-chem-docs` MCP server snippet

Drop this into Drawbar's `docker-compose.yml`. Targets the existing
trashpanda infra: Ollama pool on the LAN, `llama-rerank` container
on Tesla P4, Cloudflare Tunnel out front.

## Pre-reqs (one-time on the deploy host)

1. **Log in to the Gitea registry** so the host can pull:
   ```bash
   docker login git.jpaul.io -u justin   # use a PAT as the password
   ```
2. **Ollama embed pool** reachable from this host (already up):
   - `192.168.0.2:11434`, `192.168.0.2:11435` (Gitea-host GPUs)
   - `192.168.0.125:11434` (Windows GPU)
   - `10.10.1.65:11434` (trashpanda)
3. **Reranker** reachable (already up on trashpanda):
   - `http://10.10.1.65:8082`

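Before starting the container, it can help to confirm that the pool and reranker endpoints actually answer. The following is a minimal sketch, assuming a successful TCP connect is a good enough proxy for "already up". The helper names `split_endpoint` and `is_reachable` are illustrative and not part of this repo.

```python
import socket
from urllib.parse import urlsplit

# Endpoints this deploy depends on (Ollama pool + reranker).
ENDPOINTS = [
    "192.168.0.2:11434",
    "192.168.0.2:11435",
    "192.168.0.125:11434",
    "10.10.1.65:11434",
    "http://10.10.1.65:8082",
]


def split_endpoint(endpoint: str) -> tuple[str, int]:
    """Return (host, port) for 'host:port' or 'http://host:port'."""
    # urlsplit only recognises the host when a '//' separator is present.
    if "://" not in endpoint:
        endpoint = "//" + endpoint
    parts = urlsplit(endpoint)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected host:port, got {endpoint!r}")
    return parts.hostname, parts.port


def is_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """True if a TCP connection to the endpoint succeeds within `timeout`."""
    host, port = split_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_reachable(ep)` for each entry in `ENDPOINTS` from the deploy host gives a quick pass/fail list. A TCP connect does not prove the model or reranker is loaded, only that something is listening.
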
## Compose service

```yaml
services:
  crop-chem-docs:
    image: git.jpaul.io/justin/crop-chem-docs:latest
    # Or pin to an immutable tag for prod:
    # image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    container_name: crop-chem-docs
    restart: unless-stopped
    ports:
      - "8001:8000"   # MCP server (streamable-http). Adjust host port.
    environment:
      # Embedder pool. Round-robined for parallel search.
      OLLAMA_URL: "http://192.168.0.2:11434,http://192.168.0.2:11435,http://192.168.0.125:11434,http://10.10.1.65:11434"
      # Reranker on trashpanda's Tesla P4.
      RERANK_URL: "http://10.10.1.65:8082"
      # Production retrieval: BM25 + dense fused, then reranked.
      HYBRID_SEARCH: "true"
      # Override docs URL shown to the LLM if needed (default is EPA PPLS portal).
      # PRODUCT_DOCS_URL: "https://..."
    labels:
      # Watchtower auto-pulls :latest on update.
      com.centurylinklabs.watchtower.enable: "true"

    # Optional: Watchtower already runs elsewhere, so to have it
    # auto-update this container too, just keep the label above
    # set to "true".
```

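The `OLLAMA_URL` value above is a single comma-separated string that the server round-robins across. How the server does this internally is not shown here, so the following is only a sketch of one way to parse and cycle such a pool. `parse_pool` and `round_robin` are hypothetical names, and the `http://localhost:11434` fallback is an assumption for running outside the container.

```python
import itertools
import os
from typing import Iterator


def parse_pool(raw: str) -> list[str]:
    """Split a comma-separated URL list, dropping blanks and trailing slashes."""
    return [u.strip().rstrip("/") for u in raw.split(",") if u.strip()]


def round_robin(urls: list[str]) -> Iterator[str]:
    """Return an endless iterator that visits pool members in order."""
    if not urls:
        raise ValueError("OLLAMA_URL is empty")
    return itertools.cycle(urls)


# Assumed fallback for local runs; the compose file always sets OLLAMA_URL.
pool = parse_pool(os.environ.get("OLLAMA_URL", "http://localhost:11434"))
picker = round_robin(pool)
print(next(picker))  # first pool member; the next call returns the second
```

A plain cycle ignores endpoint health and load, so a production pool would likely also need to skip members that time out.
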
## Test from the host

```bash
# Quick probe over MCP's HTTP transport (adjust if you have a different
# MCP client probe handy). The path depends on how the server mounts its
# transport: /sse is the legacy SSE path, while streamable-http servers
# commonly serve /mcp.
curl -s http://localhost:8001/sse

# Or exec into the container and run the stdio transport
# (-i only: -t fails when stdin is redirected from /dev/null):
docker exec -i crop-chem-docs \
  python -m docs_mcp.server --transport stdio < /dev/null
```

## What the container exposes

| Tool | What it does |
|---|---|
| `search_docs` | Hybrid + rerank pesticide-label search with optional filters |
| `get_page` | Full label markdown + metadata by `(source, source_key)` |
| `list_versions` | Discover sources, product classes, signal words, registrants |
| `corpus_status` | Counts + freshness; useful for health probes |
| `crop_chem_api_lessons` | Curated agronomy/label-handling knowledge; call before recommending |

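MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. As a rough illustration, the sketch below builds the request body for a `corpus_status` health probe. It assumes `corpus_status` accepts an empty argument object, which this document does not confirm, and `tool_call` is a hypothetical helper. It only constructs the body: a real client first performs the MCP `initialize` handshake and then sends requests over the chosen transport.

```python
import json
from typing import Any, Optional


def tool_call(
    name: str,
    arguments: Optional[dict[str, Any]] = None,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumption: corpus_status needs no arguments.
print(json.dumps(tool_call("corpus_status"), indent=2))
```

The same helper works for the other tools once their argument names are known, for example by listing the tool schemas with a `tools/list` request.
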
## Versioning

Tags published by the Gitea Actions workflows:

| Tag | When | Use for |
|---|---|---|
| `:latest` | Every monthly refresh + every code push | Dev / Watchtower auto-pull |
| `:<sha12>` | Every build | Rollback pin |
| `:corpus-YYYY.MM.DD` | Every build | Pin to a specific corpus snapshot in prod |

The `:corpus-YYYY.MM.DD` tag is the right one for production: it
guarantees the running container has a known, frozen corpus that
matches the labels you've validated against.

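To make the tag scheme concrete, here is a small sketch that derives the three tags for one build. It assumes `<sha12>` means the first 12 hex characters of the commit SHA and that the corpus date is formatted as `YYYY.MM.DD`. The function name `image_tags` is illustrative, and the actual workflow may compute these differently.

```python
from datetime import date


def image_tags(commit_sha: str, corpus_date: date) -> list[str]:
    """Return the tags for one build: latest, <sha12>, corpus-YYYY.MM.DD."""
    if len(commit_sha) < 12:
        raise ValueError("need at least 12 characters of the commit SHA")
    return [
        "latest",
        commit_sha[:12].lower(),             # assumed meaning of <sha12>
        f"corpus-{corpus_date:%Y.%m.%d}",    # zero-padded month and day
    ]


# The 12-char prefix is from the first published image; the rest of the
# SHA is zero padding for illustration only.
full_sha = "a97107de4636" + "0" * 28
print(image_tags(full_sha, date(2026, 5, 24)))
```

With those inputs the second and third tags match the ones published for the first image, `a97107de4636` and `corpus-2026.05.24`.
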
## Updating the corpus

Two paths:

1. **Wait for the monthly cron**: it runs on the 1st of each month at
   06:00 UTC and does a full re-scrape of Bayer + EPA PPLS, then a
   reindex, then an image push. Watchtower pulls the new `:latest`
   automatically.
2. **Trigger manually** in the Gitea Actions UI → `Monthly corpus refresh`
   → `Run workflow`. The optional `sources` input limits the run to a
   single source (e.g., `bayer` only).

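A schedule of "the 1st at 06:00 UTC" corresponds to the standard cron expression `0 6 1 * *`, though the workflow's actual expression is not shown here. To decide whether waiting is acceptable or a manual trigger is worth it, the next scheduled run can be computed as in this sketch. `next_monthly_refresh` is a hypothetical helper and expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def next_monthly_refresh(now: datetime) -> datetime:
    """Next 1st-of-month 06:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(day=1, hour=6, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's run has already passed, so roll over to next month.
        if now.month == 12:
            year, month = now.year + 1, 1
        else:
            year, month = now.year, now.month + 1
        candidate = candidate.replace(year=year, month=month)
    return candidate


print(next_monthly_refresh(datetime.now(timezone.utc)).isoformat())
```
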
## Switching corpus scope

The row-crop filter (corn/soybeans/wheat) lives in
`scrape/sources/epa_ppls.py` as `ROW_CROP_KEYWORDS`. Edit it, push, and
let the next workflow run pick it up. The same goes for the registrant
allowlist at `scrape/sources/epa_registrant_allowlist.json`.

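For orientation, a keyword filter of this kind might look like the sketch below. The real contents of `ROW_CROP_KEYWORDS` and the way `epa_ppls.py` applies it are not shown in this document, so the tuple values and the `mentions_row_crop` helper here are stand-ins, not the repo's code.

```python
import re

# Hypothetical stand-in for the tuple in scrape/sources/epa_ppls.py.
ROW_CROP_KEYWORDS = ("corn", "soybean", "wheat")


def mentions_row_crop(label_text: str, keywords=ROW_CROP_KEYWORDS) -> bool:
    """True if the text contains any keyword as a whole word (plural allowed)."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    # \b keeps "corn" from matching inside words such as "acorn".
    pattern = re.compile(rf"\b(?:{alternatives})s?\b", re.IGNORECASE)
    return pattern.search(label_text) is not None


print(mentions_row_crop("For use on field corn and SOYBEANS"))
print(mentions_row_crop("For use on citrus and almonds"))
```

Widening the scope under this assumption is a matter of adding entries to the tuple, for example `"sorghum"`, and letting the next refresh rebuild the corpus.
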