Files
provenance/backend/app/core/config.py
T
justin 330543f9ce Fix #215: pluggable LLM + embedding provider abstraction
Adds the vendor-agnostic seam the AI assistant + match-ranking plug into:
- LLMProvider / EmbeddingProvider ABCs (base.py). LLM and embeddings are
  SEPARATE abstractions — Anthropic has no embeddings endpoint, so each is
  configured independently and either can be off.
- NullLLMProvider / NullEmbeddingProvider — the default; fail loud with a clear
  "not configured" error so AI-off deployments don't silently no-op.
- AnthropicLLMProvider — first concrete LLM impl, via the official anthropic SDK
  (default model claude-opus-4-8). A local provider (e.g. Ollama) would be
  another subclass of the same interface.
- Factory in deps.py (get_llm_provider / get_embedding_provider) selects by
  env (MODEL_PROVIDER / EMBEDDING_PROVIDER); documented in .env.example.

Providers are read-only text/vector producers — they never touch the DB, so the
"AI never writes autonomously" invariant (CLAUDE.md #1) holds; writes will go
through ChangeProposal (#214).

Tests: provider selection (null default, anthropic when keyed, fallback without
key) + null providers raise. 81 passed.

Closes #215

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 12:51:01 -04:00

75 lines
2.7 KiB
Python

"""Application configuration.
Twelve-factor: everything is read from the environment. Defaults are
development-friendly; production supplies real values via the compose `.env`.
No secrets or endpoints are hard-coded.
"""
from functools import lru_cache
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
model_config = SettingsConfigDict(
env_file=".env",
env_file_encoding="utf-8",
extra="ignore",
)
app_name: str = "Provenance"
version: str = "0.0.0"
app_env: str = Field(default="development", description="development | production")
# SQLAlchemy async URL, e.g. postgresql+asyncpg://user:pass@host:5432/db
database_url: str = Field(
default="postgresql+asyncpg://provenance:provenance@localhost:5432/provenance",
)
# --- Auth / sessions ---
session_ttl_days: int = 30
token_ttl_hours: int = 24 # email-verify / password-reset token lifetime
cookie_name: str = "provenance_session"
cookie_secure: bool = True # set false for local http; true behind TLS
# Base URL used to build links in outbound email.
app_base_url: str = "http://localhost"
# --- Object storage (S3-compatible / MinIO) ---
s3_endpoint_url: str = "http://minio:9000"
s3_bucket: str = "provenance"
s3_access_key: str = "provenance"
s3_secret_key: str = "change-me-too"
s3_region: str = "us-east-1"
s3_presign_ttl: int = 3600 # seconds
# --- Worker ---
purge_interval_seconds: int = 3600 # how often to run the soft-delete purge
purge_after_days: int = 30 # soft-deleted rows older than this are purged
# --- Email (SMTP) ---
# When true, a user with no verified email gets no active session (login is
# refused and existing sessions stop resolving). Default false so self-hosts
# without SMTP — and accounts created before this gate existed — aren't
# locked out; operators turn it on once mail works and accounts are verified.
require_email_verification: bool = False
mailer: str = Field(default="console", description="console | smtp")
smtp_host: str | None = None
smtp_port: int = 587
smtp_username: str | None = None
smtp_password: str | None = None
smtp_from: str = "Provenance <no-reply@provenance.local>"
# --- Model providers (AI assistant + match-ranking embeddings) ---
# Separate because Anthropic has no embeddings endpoint; either can be off.
model_provider: str = "null" # null | anthropic
anthropic_api_key: str | None = None
llm_model: str = "claude-opus-4-8"
llm_max_tokens: int = 4096
embedding_provider: str = "null" # null | (future: ollama, voyage, …)
@lru_cache
def get_settings() -> Settings:
return Settings()