justin/crop-chem-docs

Commit Graph

Author	SHA1	Message	Date

justin

e5da4b21b0

deploy: add llama-rerank service to compose snippet

Drawbar's compose doesn't have a rerank service today — the
llama-rerank container I spun up earlier was a standalone
docker run, not a compose service. For Docker DNS resolution
(http://llama-rerank:8080) to work between MCP + reranker, both
need to be siblings in the same compose stack.

Added the llama-rerank service entry with:
- :server-cuda image (CUDA-built llama.cpp; the plain :server is
  CPU-only and 25× slower for our 50-doc rerank pool)
- -ngl 99 to offload all layers to GPU
- deploy.resources.reservations.devices block for compose v3 GPU
  passthrough (preferred over the older `runtime: nvidia` syntax)
- volume for the HuggingFace model cache so first-start GGUF
  download survives container recreates
- no host port mapping — internal-network-only
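A minimal sketch of what a service entry matching the bullets above might look like. The registry path, the `--reranking` and `-hf` arguments, the model repository, the `LLAMA_CACHE` location, and the volume name are assumptions for illustration; the actual entry lives in deploy/drawbar-compose-snippet.md and may differ.

```yaml
services:
  llama-rerank:
    # CUDA build of the llama.cpp server (registry path is an assumption).
    image: ghcr.io/ggml-org/llama.cpp:server-cuda
    # A fixed name like this would collide with a leftover standalone
    # container of the same name, hence the cleanup step.
    container_name: llama-rerank
    environment:
      # Assumed cache location; llama.cpp stores -hf downloads here.
      LLAMA_CACHE: /models
    command:
      - --host
      - 0.0.0.0
      - --port
      - "8080"
      - -ngl          # offload all layers to the GPU
      - "99"
      - --reranking   # assumed: serve the rerank endpoint
      - -hf
      - example-org/example-reranker-GGUF   # hypothetical model repo
    volumes:
      # Keeps the downloaded GGUF across container recreates.
      - rerank-models:/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped
    # No `ports:` section on purpose: the service is reachable only as
    # http://llama-rerank:8080 from containers on the same compose network.

volumes:
  rerank-models:
```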

Tesla P4 compatibility notes inline: Pascal (CC 6.1) is in the
:server-cuda image's compute-arch list (500-1200) so no special
handling beyond the standard compose entry.

Also: cleanup instruction to docker rm -f the standalone
llama-rerank from the earlier setup before bringing up compose
(name collision).

And: noted that if trashpanda's existing Ollama is a host-mode
process rather than a compose service, the MCP needs
host.docker.internal override (snippet included).
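A hedged sketch of that override, assuming the MCP service is called `crop-chem-docs` in the stack (a hypothetical name) and that Ollama listens on its default port 11434 on the host:

```yaml
services:
  crop-chem-docs:   # hypothetical service name for the MCP container
    image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    environment:
      # Point at the host process instead of a sibling `ollama` service.
      OLLAMA_URL: http://host.docker.internal:11434
    extra_hosts:
      # On Linux this mapping is needed for host.docker.internal to resolve.
      - "host.docker.internal:host-gateway"
```

For this to work, the host Ollama would also need to listen on an address reachable from the Docker bridge rather than only on 127.0.0.1.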

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 13:25:34 -04:00

justin

c5ed5560fc

deploy: sensible Dockerfile defaults + simplified compose snippet

Image rebuild (skip scrape) / build (push) Failing after 1h41m9s

Dockerfile now sets OLLAMA_URL=http://ollama:11434 and
RERANK_URL=http://llama-rerank:8080 as image defaults, assuming the
MCP container shares a Docker network with services named `ollama`
and `llama-rerank` (typical compose pattern). Drawbar's stack
already runs both — no cross-host IPs to maintain, no off-stack
GPU dependencies. Stays inside the trashpanda compose.

deploy/drawbar-compose-snippet.md simplified: no environment
overrides needed for the common case. Override block shown only
for stacks with non-default service names. Pull tag updated to
:corpus-2026.05.24.
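As an illustration of the two cases, assuming a hypothetical service name `crop-chem-docs` and hypothetical sibling names `my-ollama` and `my-reranker` for the non-default stack:

```yaml
services:
  crop-chem-docs:   # hypothetical service name
    image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    # Common case: no environment block. The image defaults already point
    # at http://ollama:11434 and http://llama-rerank:8080.
    #
    # Stacks with different service names would uncomment and adjust:
    # environment:
    #   OLLAMA_URL: http://my-ollama:11434
    #   RERANK_URL: http://my-reranker:8080
```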

Per the new architecture call:
- MCP doesn't reach out to cross-host Ollama instances (192.168.0.2,
  192.168.0.125 etc.) at serve time — only at index-build time in CI.
- All serve-time dependencies are in the same Docker network as
  the consumer apps.

Code push touches Dockerfile → image-only.yml will rebuild + push.
Future-me note: the image-only.yml needs Ollama reachable from the
Gitea Actions runner for the reindex step; that still uses the LAN
endpoints (workflow env), which is correct since indexing is CI-side
not serve-side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 13:09:38 -04:00

justin

8766d73327

deploy: Drawbar compose snippet — first image is published

Image pushed to git.jpaul.io/justin/crop-chem-docs with three tags:
  :latest             — Watchtower auto-pull target
  :a97107de4636       — commit-sha rollback pin
  :corpus-2026.05.24  — corpus-snapshot pin (prod-recommended)

Drawbar compose snippet at deploy/drawbar-compose-snippet.md.
Wires the container against the existing infra:
  - Ollama pool: 192.168.0.2:11434, 192.168.0.2:11435,
                 192.168.0.125:11434, 10.10.1.65:11434
  - Reranker:    http://10.10.1.65:8082
  - HYBRID_SEARCH=true (production retrieval — BM25 + dense + rerank)
  - Exposes streamable-HTTP MCP on port 8000
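A rough sketch of how that wiring might look in compose form. The service name and the use of `RERANK_URL` as the variable name at this commit are assumptions; the variable that carries the Ollama pool is not named in this message, so it is left out rather than guessed.

```yaml
services:
  crop-chem-docs:   # hypothetical service name
    # Corpus-snapshot pin; :latest or :a97107de4636 are the other tags.
    image: git.jpaul.io/justin/crop-chem-docs:corpus-2026.05.24
    environment:
      HYBRID_SEARCH: "true"                 # BM25 + dense + rerank
      RERANK_URL: http://10.10.1.65:8082    # variable name assumed
      # Ollama pool endpoints (192.168.0.2:11434, 192.168.0.2:11435,
      # 192.168.0.125:11434, 10.10.1.65:11434) go in whichever variable
      # the image reads; see deploy/drawbar-compose-snippet.md.
    ports:
      - "8000:8000"   # streamable-HTTP MCP
```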

Pull path uses git.jpaul.io (public hostname, CF-fronted; pull
response bodies aren't capped). Push path uses 192.168.0.2:1234
(LAN endpoint, bypasses CF 100MB body cap). Same registry,
different URLs — per the template gotcha doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 12:48:24 -04:00

3 Commits