ci: use zerto-docs's load-balanced Ollama GPU pool on the Gitea host

Match the OLLAMA_URLS pattern from zerto-docs-rag so every docs MCP
build fans out across the same two GPU-pinned Ollama containers on
192.168.0.2 (:11435 Titan X, :11436 1080 Ti). The host's primary
Ollama on :11434 is left alone for OpenWebUI.

rag.embeddings now prefers OLLAMA_URLS (a comma-separated list), falls
back to OLLAMA_URL, and defaults to http://192.168.0.2:11434, the same
shape as zerto's embeddings.py. The OllamaEmbeddings class already
round-robins per batch, so both GPUs run in parallel during the
chroma rebuild.
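
For reference, the lookup order and per-batch rotation described above
could be sketched roughly as follows. This is an illustration only, not
the code in rag.embeddings: resolve_ollama_urls and assign_batches are
hypothetical names, and the real OllamaEmbeddings class may structure
this differently.

```python
import itertools
import os

# Default taken from the commit message; used when neither variable is set.
DEFAULT_OLLAMA_URL = "http://192.168.0.2:11434"


def resolve_ollama_urls(env=None):
    """Return endpoint URLs: OLLAMA_URLS first, then OLLAMA_URL, then the default.

    Hypothetical helper; the name and signature are assumptions.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_URLS") or env.get("OLLAMA_URL") or DEFAULT_OLLAMA_URL
    # Split the CSV, dropping empty entries and stray whitespace.
    urls = [part.strip() for part in raw.split(",") if part.strip()]
    return urls or [DEFAULT_OLLAMA_URL]


def assign_batches(batches, urls):
    """Pair each batch with an endpoint, rotating through the endpoints in order."""
    return list(zip(itertools.cycle(urls), batches))


if __name__ == "__main__":
    ci_env = {"OLLAMA_URLS": "http://192.168.0.2:11435,http://192.168.0.2:11436"}
    for url, batch in assign_batches(["batch-0", "batch-1", "batch-2"],
                                     resolve_ollama_urls(ci_env)):
        print(url, batch)
```

With the two URLs from the workflow, consecutive batches alternate
between :11435 and :11436. A single-URL setup falls through to the same
code path with a one-element list.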

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 13:22:59 -04:00
parent fd376fab77
commit 6b11993688
3 changed files with 31 additions and 9 deletions
+4 -1
@@ -19,7 +19,10 @@ env:
REGISTRY_PUSH: 192.168.0.2:1234
REGISTRY_PULL: git.jpaul.io
IMAGE: ${{ github.repository_owner }}/${{ github.event.repository.name }}
-OLLAMA_URL: http://192.168.0.126:11434
+# Two GPU-pinned Ollama containers on the Gitea host — same infra
+# zerto-docs uses. :11435 = Titan X, :11436 = 1080 Ti. Indexer
+# round-robins per batch.
+OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436
EMBED_MODEL: nomic-embed-text
PRODUCT_NAME: hvm