Model providers: OpenAI/xAI/Ollama + run several at once (registry)

Extends the #215 abstraction:

- OpenAICompatibleLLMProvider / OpenAICompatibleEmbeddingProvider: one impl
  (via the official openai SDK) covers OpenAI, xAI (api.x.ai/v1), Ollama
  (…:11434/v1), OpenRouter, etc.; they differ only by base_url, key, and model.
- Registry factory: build_llm_providers() / build_embedding_providers() return
  every provider whose credentials are configured, so you can run several
  concurrently. get_llm_provider(name) / get_embedding_provider(name) select by
  name, falling back to default_*_provider, then Null.
- Per-provider env config (ANTHROPIC_*, OPENAI_*, XAI_*, OLLAMA_*) +
  DEFAULT_LLM_PROVIDER / DEFAULT_EMBEDDING_PROVIDER; documented in .env.example.
  Defaults keep AI off (empty registry).

Embeddings now have real backends (OpenAI/Ollama), still separate from the LLM
since Anthropic offers no embeddings endpoint.

Tests cover multi-provider selection, default resolution,
disabled-without-credentials, and null fail-loud. Full suite 87 passed.

Relates to #215.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 18:39:19 -04:00
parent 9187c0a791
commit de50f2c803
7 changed files with 245 additions and 50 deletions
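
The registry behaviour described in the commit message (build every provider whose credentials are configured, select by name, fall back to the default, then to Null) can be sketched roughly as below. This is an illustration under assumptions, not the code in this commit: `ProviderConfig`, the dict-based `settings` argument, and the extra `providers` / `default` parameters are hypothetical stand-ins, and only the three OpenAI-compatible LLM providers are modelled. The setting names and default URLs and models are taken from the diff.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical record: OpenAI-compatible providers differ only by these fields."""
    name: str
    base_url: str
    model: str
    api_key: Optional[str] = None


class NullLLMProvider:
    """Hypothetical null provider that fails loudly when it is actually used."""
    name = "null"

    def complete(self, prompt: str) -> str:
        raise RuntimeError("No LLM provider is configured")


def build_llm_providers(settings: dict) -> dict[str, ProviderConfig]:
    """Return every provider whose credentials (or enable flag) are present."""
    providers: dict[str, ProviderConfig] = {}
    if settings.get("openai_api_key"):
        providers["openai"] = ProviderConfig(
            "openai",
            settings.get("openai_base_url", "https://api.openai.com/v1"),
            settings.get("openai_model", "gpt-4o"),
            settings["openai_api_key"],
        )
    if settings.get("xai_api_key"):
        providers["xai"] = ProviderConfig(
            "xai",
            settings.get("xai_base_url", "https://api.x.ai/v1"),
            settings.get("xai_model", "grok-2-latest"),
            settings["xai_api_key"],
        )
    # Ollama is local and needs no key, so it is gated by an explicit flag.
    if settings.get("ollama_enabled"):
        providers["ollama"] = ProviderConfig(
            "ollama",
            settings.get("ollama_base_url", "http://localhost:11434/v1"),
            settings.get("ollama_model", "llama3.1"),
        )
    return providers


def get_llm_provider(providers: dict, name: Optional[str] = None, default: str = "null"):
    """Select by name, then fall back to the default provider, then to Null."""
    for candidate in (name, default):
        if candidate in providers:
            return providers[candidate]
    return NullLLMProvider()
```

With nothing configured, `build_llm_providers({})` returns an empty dict and every lookup resolves to the null provider, which matches the "defaults keep AI off" behaviour the message describes.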
@@ -61,12 +61,34 @@ class Settings(BaseSettings):
    smtp_from: str = "Provenance <no-reply@provenance.local>"

    # --- Model providers (AI assistant + match-ranking embeddings) ---
-    # Separate because Anthropic has no embeddings endpoint; either can be off.
-    model_provider: str = "null"  # null | anthropic
-    anthropic_api_key: str | None = None
-    llm_model: str = "claude-opus-4-8"
+    # Configure as many as you like; each is enabled when its credentials are
+    # present. `default_*_provider` picks which one is used by default. LLM and
+    # embeddings are independent (Anthropic has no embeddings endpoint).
+    default_llm_provider: str = "null"  # null | anthropic | openai | xai | ollama
+    default_embedding_provider: str = "null"  # null | openai | ollama
    llm_max_tokens: int = 4096
-    embedding_provider: str = "null"  # null | (future: ollama, voyage, …)
+    embedding_dimensions: int = 1536  # must match the embedding model + pgvector column
+
+    # Anthropic (LLM only)
+    anthropic_api_key: str | None = None
+    anthropic_model: str = "claude-opus-4-8"
+
+    # OpenAI (LLM + embeddings)
+    openai_api_key: str | None = None
+    openai_base_url: str = "https://api.openai.com/v1"
+    openai_model: str = "gpt-4o"
+    openai_embedding_model: str = "text-embedding-3-small"
+
+    # xAI / Grok — OpenAI-compatible (LLM)
+    xai_api_key: str | None = None
+    xai_base_url: str = "https://api.x.ai/v1"
+    xai_model: str = "grok-2-latest"  # set to your account's current Grok model
+
+    # Ollama — local, OpenAI-compatible, no key (LLM + embeddings)
+    ollama_enabled: bool = False
+    ollama_base_url: str = "http://localhost:11434/v1"
+    ollama_model: str = "llama3.1"
+    ollama_embedding_model: str = "nomic-embed-text"


@lru_cache