Model providers: OpenAI/xAI/Ollama + run several at once (registry)

Extends the #215 abstraction:
- OpenAICompatibleLLMProvider / OpenAICompatibleEmbeddingProvider — one impl (via
  the official openai SDK) covers OpenAI, xAI (api.x.ai/v1), Ollama
  (…:11434/v1), OpenRouter, etc.; they differ only by base_url, key, and model.
- Registry factory: build_llm_providers() / build_embedding_providers() return
  every provider whose credentials are configured, so you can run several
  concurrently. get_llm_provider(name)/get_embedding_provider(name) select by
  name, falling back to default_*_provider, then Null.
- Per-provider env config (ANTHROPIC_*, OPENAI_*, XAI_*, OLLAMA_*) +
  DEFAULT_LLM_PROVIDER / DEFAULT_EMBEDDING_PROVIDER; documented in .env.example.
  Defaults keep AI off (empty registry).

Embeddings now have real backends (OpenAI/Ollama), still separate from the LLM
since Anthropic offers no embeddings endpoint. Tests cover multi-provider
selection, default resolution, disabled-without-credentials, and null fail-loud.
Full suite 87 passed.

Relates to #215.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
This commit is contained in:
2026-06-09 18:39:19 -04:00
parent 9187c0a791
commit de50f2c803
7 changed files with 245 additions and 50 deletions
+27 -5
View File
@@ -61,12 +61,34 @@ class Settings(BaseSettings):
smtp_from: str = "Provenance <no-reply@provenance.local>"
# --- Model providers (AI assistant + match-ranking embeddings) ---
# Separate because Anthropic has no embeddings endpoint; either can be off.
model_provider: str = "null" # null | anthropic
anthropic_api_key: str | None = None
llm_model: str = "claude-opus-4-8"
# Configure as many as you like; each is enabled when its credentials are
# present. `default_*_provider` picks which one is used by default. LLM and
# embeddings are independent (Anthropic has no embeddings endpoint).
default_llm_provider: str = "null" # null | anthropic | openai | xai | ollama
default_embedding_provider: str = "null" # null | openai | ollama
llm_max_tokens: int = 4096
embedding_provider: str = "null" # null | (future: ollama, voyage, …)
embedding_dimensions: int = 1536 # must match the embedding model + pgvector column
# Anthropic (LLM only)
anthropic_api_key: str | None = None
anthropic_model: str = "claude-opus-4-8"
# OpenAI (LLM + embeddings)
openai_api_key: str | None = None
openai_base_url: str = "https://api.openai.com/v1"
openai_model: str = "gpt-4o"
openai_embedding_model: str = "text-embedding-3-small"
# xAI / Grok — OpenAI-compatible (LLM)
xai_api_key: str | None = None
xai_base_url: str = "https://api.x.ai/v1"
xai_model: str = "grok-2-latest" # set to your account's current Grok model
# Ollama — local, OpenAI-compatible, no key (LLM + embeddings)
ollama_enabled: bool = False
ollama_base_url: str = "http://localhost:11434/v1"
ollama_model: str = "llama3.1"
ollama_embedding_model: str = "nomic-embed-text"
@lru_cache