ci: use zerto-docs's load-balanced Ollama GPU pool on the Gitea host

Match the OLLAMA_URLS pattern from zerto-docs-rag so every docs MCP build fans out across the same two GPU-pinned Ollama containers on 192.168.0.2 (:11435 Titan X, :11436 1080 Ti). The host's primary Ollama on :11434 is left alone for OpenWebUI.

rag.embeddings now reads OLLAMA_URLS (plural CSV) preferentially with fallback to OLLAMA_URL, defaulting to http://192.168.0.2:11434, the same shape as zerto's embeddings.py. The OllamaEmbeddings class already round-robins per batch, so both GPUs run in parallel during the chroma rebuild.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -29,10 +29,12 @@ env:
   # edit this. github.* is the Gitea-Actions inherited namespace.
   IMAGE: ${{ github.repository_owner }}/${{ github.event.repository.name }}
 
-  # Embedder. One URL per GPU; the indexer round-robins if you pass a
-  # comma-separated list. Adjust to wherever Ollama is reachable from
-  # the runner (gitea_default network can reach the host's bridge IP).
-  OLLAMA_URL: http://192.168.0.126:11434
+  # Two GPU-pinned Ollama containers on the Gitea host — same infra
+  # zerto-docs uses (deploy/ollama-rag.docker-compose.yml over there).
+  # :11435 owns the Titan X, :11436 owns the 1080 Ti; the indexer
+  # round-robins per batch so both cards run in parallel. The host's
+  # primary Ollama on :11434 is left alone for OpenWebUI etc.
+  OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436
   EMBED_MODEL: nomic-embed-text
 
   PRODUCT_NAME: hvm
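
The commit message describes the lookup order (OLLAMA_URLS first, then OLLAMA_URL, then a default) and per-batch round-robin across endpoints. The actual rag.embeddings code is not shown in this diff, so the following is only a minimal sketch of that behavior. The function names `resolve_ollama_urls` and `assign_batches` are hypothetical and are not taken from the repository; only the environment variable names and the default URL come from the commit message.

```python
import itertools
import os

# Default taken from the commit message.
DEFAULT_URL = "http://192.168.0.2:11434"


def resolve_ollama_urls(env=None):
    """Return the list of Ollama endpoints to use (hypothetical helper).

    Prefers OLLAMA_URLS (comma-separated), falls back to the single
    OLLAMA_URL, and finally to DEFAULT_URL.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_URLS") or env.get("OLLAMA_URL") or DEFAULT_URL
    # Tolerate spaces around commas and trailing slashes.
    urls = [u.strip().rstrip("/") for u in raw.split(",") if u.strip()]
    return urls or [DEFAULT_URL]


def assign_batches(batches, urls):
    """Pair each batch with an endpoint, cycling through urls in order."""
    return list(zip(itertools.cycle(urls), batches))


if __name__ == "__main__":
    workflow_env = {
        "OLLAMA_URLS": "http://192.168.0.2:11435,http://192.168.0.2:11436",
    }
    endpoints = resolve_ollama_urls(workflow_env)
    for url, batch in assign_batches(["batch-0", "batch-1", "batch-2"], endpoints):
        print(url, batch)
```

With the two URLs from the workflow, consecutive batches alternate between :11435 and :11436. Round-robin assignment alone only spreads the work; whether both GPUs are busy at the same time depends on the batches being dispatched concurrently, which this sketch does not attempt to show.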