Guard against schema drift: fail readiness + log loudly when DB is behind code

Defense-in-depth for the deploy pipeline. Today a backend image shipped
ahead of an un-applied migration; the Tree model selected columns the DB
didn't have yet, so every trees query 500'd with an opaque
UndefinedColumnError and the UI showed no trees. The root cause (deploys
not running migrations) is fixed separately; this makes the *symptom*
impossible to miss.

- app/core/schema_version.py: compare the DB's stamped alembic head to
  the head(s) baked into the image's migration scripts. A DB with no
  alembic_version table (e.g. a create_all test DB) is treated as
  current, so this stays quiet outside real deployments. Uses
  to_regclass so a missing table never poisons the caller's transaction.
- /health/ready: returns 503 with an explicit "drift: db=… expected=…"
  message when the schema is behind, instead of reporting ready and
  serving 500s.
- Startup lifespan: logs CRITICAL on drift (advisory; never blocks
  startup).

Liveness (/health) is untouched, so a drifted container isn't killed
into a crash-loop; it's loudly degraded and self-heals once migrations
apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
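
The comparison in app/core/schema_version.py is not among the hunks in this commit view. As a rough sketch of the decision the message describes, the logic might reduce to something like the following. The name `evaluate_drift` is hypothetical, and plain set equality is an assumption standing in for "behind"; the real module may compare revisions differently. In a real deployment the inputs would presumably come from PostgreSQL (`to_regclass('alembic_version')` returns NULL for a missing table instead of raising) and from Alembic's `ScriptDirectory.get_heads()`.

```python
from __future__ import annotations


def evaluate_drift(
    db_heads: set[str] | None,
    expected_heads: set[str],
) -> tuple[bool, set[str], set[str]]:
    """Return (ok, db, expected), the shape unpacked by _check_schema_drift.

    db_heads is None when the database has no alembic_version table
    (for example a create_all test DB); that case is treated as current.
    ASSUMPTION: any other mismatch between the two sets counts as drift.
    """
    if db_heads is None:
        # No version table at all: stay quiet outside real deployments.
        return True, set(), expected_heads
    return db_heads == expected_heads, db_heads, expected_heads


# Hypothetical revision ids, for illustration only.
print(evaluate_drift(None, {"b2"}))    # no alembic_version table
print(evaluate_drift({"a1"}, {"b2"}))  # DB stamped at an older head
print(evaluate_drift({"b2"}, {"b2"}))  # DB matches the image
```

Note that set equality would also flag a database that is ahead of the image (for example after an image rollback). Whether that should count as drift is a design choice the commit message does not settle.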
@@ -7,6 +7,7 @@ engine is the single enforcement point for reads.

 import logging
 import sys
+from contextlib import asynccontextmanager

 from fastapi import FastAPI, Request
 from fastapi.responses import JSONResponse
@@ -14,6 +15,8 @@ from fastapi.responses import JSONResponse
 from app.api.health import router as health_router
 from app.api.v1 import api_router
 from app.core.config import get_settings
+from app.core.db import get_engine
+from app.core.schema_version import schema_is_current
 from app.services.exceptions import Conflict, Forbidden, NotFound


@@ -30,6 +33,32 @@ def _configure_logging() -> None:
     app_logger.propagate = False


+async def _check_schema_drift() -> None:
+    """On startup, shout if the DB schema is behind the code. The entrypoint
+    runs migrations when RUN_MIGRATIONS=1; this catches the case where that
+    didn't happen, so a half-applied deploy is obvious in the logs instead of a
+    silent storm of 500s. Never blocks startup — purely advisory."""
+    logger = logging.getLogger("provenance")
+    try:
+        async with get_engine().connect() as conn:
+            ok, db, expected = await schema_is_current(conn)
+        if not ok:
+            logger.critical(
+                "SCHEMA DRIFT: database is at %s but this build expects %s. "
+                "Run 'alembic upgrade head' — queries will fail until migrated.",
+                sorted(db) or ["none"],
+                sorted(expected),
+            )
+    except Exception as exc:  # noqa: BLE001 — advisory only; never block startup
+        logger.warning("schema drift check skipped: %s", exc)
+
+
+@asynccontextmanager
+async def _lifespan(app: FastAPI):
+    await _check_schema_drift()
+    yield
+
+
 def _register_error_handlers(app: FastAPI) -> None:
     @app.exception_handler(NotFound)
     async def _not_found(request: Request, exc: NotFound) -> JSONResponse:
@@ -51,6 +80,7 @@ def create_app() -> FastAPI:
         title=settings.app_name,
         version=settings.version,
         description="Provenance API — family and land provenance.",
+        lifespan=_lifespan,
     )
     app.include_router(health_router)
     app.include_router(api_router)
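
The /health/ready change described in the commit message is not among the hunks above. As an illustration only, the mapping from the `(ok, db, expected)` tuple to a readiness result could look roughly like this. The function name `readiness_response`, the response body keys, and the comma-joined formatting of the revision ids are all assumptions; the commit only states that a 503 with a "drift: db=… expected=…" message is returned. The `"none"` fallback mirrors the `sorted(db) or ["none"]` idiom in `_check_schema_drift`.

```python
from __future__ import annotations


def readiness_response(
    ok: bool,
    db: set[str],
    expected: set[str],
) -> tuple[int, dict[str, str]]:
    """Map a drift check result to (HTTP status, JSON-like body).

    ASSUMPTION: the body keys and value formatting are hypothetical.
    """
    if ok:
        return 200, {"status": "ready"}
    # Sort for stable output; an unstamped DB is reported as "none".
    db_label = ",".join(sorted(db)) or "none"
    expected_label = ",".join(sorted(expected))
    return 503, {
        "status": "not ready",
        "detail": f"drift: db={db_label} expected={expected_label}",
    }


# Hypothetical revision ids, for illustration only.
print(readiness_response(True, {"b2"}, {"b2"}))
print(readiness_response(False, {"a1"}, {"b2"}))
```

Keeping this mapping out of the liveness probe matches the intent stated in the message: an orchestrator stops routing traffic to a drifted container without restarting it.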