crop-chem-docs/deploy/drawbar-compose-snippet.md
justin c5ed5560fc
Image rebuild (skip scrape) / build (push) Failing after 1h41m9s
deploy: sensible Dockerfile defaults + simplified compose snippet
Dockerfile now sets OLLAMA_URL=http://ollama:11434 and
RERANK_URL=http://llama-rerank:8080 as image defaults, assuming the
MCP container shares a Docker network with services named `ollama`
and `llama-rerank` (typical compose pattern). Drawbar's stack
already runs both — no cross-host IPs to maintain, no off-stack
GPU dependencies. Stays inside the trashpanda compose.

deploy/drawbar-compose-snippet.md simplified: no environment
overrides needed for the common case. Override block shown only
for stacks with non-default service names. Pull tag updated to
:corpus-2026.05.24.

Per the new architecture call:
- MCP doesn't reach out to cross-host Ollama instances (192.168.0.2,
  192.168.0.125 etc.) at serve time — only at index-build time in CI.
- All serve-time dependencies are in the same Docker network as
  the consumer apps.

Code push touches Dockerfile → image-only.yml will rebuild + push.
Future-me note: the image-only.yml needs Ollama reachable from the
Gitea Actions runner for the reindex step; that still uses the LAN
endpoints (workflow env), which is correct since indexing is CI-side
not serve-side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 13:09:38 -04:00


# Drawbar deploy — `crop-chem-docs` MCP server snippet
Drop this into Drawbar's `docker-compose.yml`. Targets the existing
trashpanda stack: shared Docker network with `ollama` + `llama-rerank`
service containers, Cloudflare Tunnel out front.
## Pre-reqs (one-time on the deploy host)
1. **Log in to the Gitea registry** so the host can pull:
   ```bash
   docker login git.jpaul.io -u justin  # use a personal access token (PAT) as the password
   ```
2. **`ollama` and `llama-rerank` services** are already running in
   the same compose stack on the same Docker network. The MCP
   container resolves them by service name via Docker's embedded
   DNS, so there are no IPs to maintain.
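
To confirm the second pre-req before deploying, a small reachability check can help. This is a minimal sketch, not part of the repo: the `can_connect` and `check_all` helpers are hypothetical names, and the service names and ports are the image defaults quoted in this document. It only proves that the name resolves and the TCP port accepts connections, not that the service behind it is healthy.

```python
import socket

# Service names and ports taken from the image defaults quoted in this
# document; adjust them if your stack uses different names.
SERVICES = {"ollama": 11434, "llama-rerank": 8080}


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_all(services=SERVICES, timeout=2.0):
    """Map each 'host:port' to True/False depending on reachability."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in services.items()
    }
```

Service names only resolve from a container attached to the same Docker network, so call `check_all()` from inside such a container rather than from the host shell.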
## Compose service
```yaml
services:
  crop-chem-docs:
    image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    # :latest for dev / Watchtower auto-pull
    container_name: crop-chem-docs
    restart: unless-stopped
    ports:
      - "8001:8000"   # MCP server (streamable-http). Adjust host port.
    # No environment block needed; the image's defaults handle it:
    #   OLLAMA_URL=http://ollama:11434
    #   RERANK_URL=http://llama-rerank:8080
    #   HYBRID_SEARCH=true
    #   PRODUCT_NAME=crop_chem
    # Override here only if your services have different names.
    networks:
      - default       # or whichever shared network ollama/llama-rerank are on
    labels:
      com.centurylinklabs.watchtower.enable: "true"
```
If your stack uses non-default service names, add an `environment` block under the `crop-chem-docs` service:
```yaml
    environment:
      OLLAMA_URL: "http://<your-ollama-service>:11434"
      RERANK_URL: "http://<your-rerank-service>:8080"
```
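
The precedence between image defaults and compose overrides can be pictured as follows. This is an illustrative sketch only: how the real server reads its configuration is not shown in this document, and `IMAGE_DEFAULTS` and `effective_settings` are hypothetical names. The default values are the ones listed in the compose comments.

```python
import os

# Defaults as listed in the compose snippet's comments. The real server's
# config loading is not shown here; this only illustrates precedence.
IMAGE_DEFAULTS = {
    "OLLAMA_URL": "http://ollama:11434",
    "RERANK_URL": "http://llama-rerank:8080",
    "HYBRID_SEARCH": "true",
    "PRODUCT_NAME": "crop_chem",
}


def effective_settings(environ=None):
    """Return the defaults with any matching environment variables on top."""
    env = os.environ if environ is None else environ
    return {key: env.get(key, default) for key, default in IMAGE_DEFAULTS.items()}
```

With no overrides, `effective_settings({})` returns the defaults unchanged. Passing `{"OLLAMA_URL": "http://my-ollama:11434"}` replaces only that one key.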
## Test from the host
```bash
# Verify counts + indexes from inside the container:
docker exec crop-chem-docs python -c \
  "from docs_mcp.server import corpus_status; print(corpus_status())"
```
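
If you want to script this check (for a cron job or a monitoring hook, say), the same command can be wrapped in Python. This is a sketch under assumptions: `probe_command` and `run_probe` are hypothetical helpers, the container name is the `container_name` from the compose service, and the return format of `corpus_status()` is not documented here, so the output is passed through as plain text.

```python
import subprocess

CONTAINER = "crop-chem-docs"  # container_name from the compose service
PROBE = "from docs_mcp.server import corpus_status; print(corpus_status())"


def probe_command(container=CONTAINER):
    """Build the same `docker exec` invocation shown in the shell example."""
    return ["docker", "exec", container, "python", "-c", PROBE]


def run_probe(container=CONTAINER, timeout=60.0):
    """Run the probe; return (ok, text) where text is stdout or the error."""
    try:
        result = subprocess.run(
            probe_command(container),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # OSError covers a missing `docker` binary on the host.
        return False, str(exc)
    ok = result.returncode == 0
    return ok, (result.stdout if ok else result.stderr).strip()
```

Passing the command as a list avoids the shell quoting that the one-liner needs.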
## What the container exposes
| Tool | What it does |
|---|---|
| `search_docs` | Hybrid+rerank pesticide-label search with optional filters |
| `get_page` | Full label markdown + metadata by `(source, source_key)` |
| `list_versions` | Discover sources, product classes, signal words, registrants |
| `corpus_status` | Counts + freshness; useful for health probes |
| `crop_chem_api_lessons` | Curated agronomy / label-handling knowledge — call before recommending |
## Tag scheme
| Tag | When | Use for |
|---|---|---|
| `:latest` | Every monthly refresh + every code push | Dev / Watchtower auto-pull |
| `:<sha12>` | Every build | Rollback pin |
| `:corpus-YYYY.MM.DD` | Every build | **Production pin** (frozen corpus version) |
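
The three tags for a single build can be derived as in the sketch below. It is not the workflow's actual tagging code: `image_tags` is a hypothetical helper, and it assumes `<sha12>` means the first 12 characters of the commit SHA.

```python
from datetime import date


def image_tags(commit_sha, corpus_date):
    """Return the tags one build would carry under the scheme above."""
    if len(commit_sha) < 12:
        raise ValueError("need at least 12 characters of the commit SHA")
    return [
        "latest",
        commit_sha[:12],  # assumption: <sha12> = first 12 chars of the SHA
        f"corpus-{corpus_date:%Y.%m.%d}",
    ]
```

For example, `image_tags("0123456789abcdef0123", date(2026, 5, 24))` yields `["latest", "0123456789ab", "corpus-2026.05.24"]`, the last of which matches the production pin used in the compose service.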
## Updating the corpus
- **Monthly cron**: runs on the 1st of each month at 06:00 UTC. Full re-scrape of Bayer + EPA PPLS,
  reindex, image push. Watchtower pulls the new `:latest` automatically.
- **Manual**: Gitea Actions UI → `Monthly corpus refresh` → `Run workflow`.
  The optional `sources` input restricts the run to a single source (e.g., `bayer` only).
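
To see when the next scheduled refresh will land, the monthly schedule can be computed directly. This is a minimal sketch with a hypothetical helper name. In standard five-field cron syntax, "1st at 06:00 UTC" corresponds to `0 6 1 * *`; the workflow's actual schedule line is not shown in this document.

```python
from datetime import datetime, timezone


def next_monthly_refresh(now):
    """Next 1st-of-month 06:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(day=1, hour=6, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has passed; roll over to the 1st of next month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate
```

Called with `datetime(2026, 5, 24, 13, 9, tzinfo=timezone.utc)`, it returns 2026-06-01 06:00 UTC.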
## Switching corpus scope
The row-crop filter (corn/soybeans/wheat) lives in
`scrape/sources/epa_ppls.py` as `ROW_CROP_KEYWORDS`. Edit it, push, and
let the next workflow run pick up the change. The same applies to the
registrant allowlist at `scrape/sources/epa_registrant_allowlist.json`.
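
For orientation, a keyword filter of this kind typically looks like the sketch below. Everything here is hypothetical: the keyword values are guessed from the "corn/soybeans/wheat" description, `matches_row_crop` is an invented name, and the real matching logic in `epa_ppls.py` may differ.

```python
# Hypothetical contents; the real ROW_CROP_KEYWORDS in
# scrape/sources/epa_ppls.py may hold different values.
ROW_CROP_KEYWORDS = ("corn", "soybean", "wheat")


def matches_row_crop(label_text, keywords=ROW_CROP_KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    text = label_text.lower()
    return any(keyword in text for keyword in keywords)
```

Plain substring matching is simple but loose: "soybean" also matches "soybeans", and "corn" would match "acorn". If you widen the scope, check how the real filter handles word boundaries first.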