Files
justin 94b5caa7e5 Guard against schema drift: fail readiness + log loudly when DB is behind code
Defense-in-depth for the deploy pipeline. Today a backend image shipped ahead
of an un-applied migration; the Tree model selected columns the DB didn't have
yet, so every trees query 500'd with an opaque UndefinedColumnError and the UI
showed no trees. The root cause (deploys not running migrations) is fixed
separately; this makes the *symptom* impossible to miss.

- app/core/schema_version.py: compare the DB's stamped alembic head to the
  head(s) baked into the image's migration scripts. A DB with no alembic_version
  table (e.g. a create_all test DB) is treated as current, so this stays quiet
  outside real deployments. Uses to_regclass so a missing table never poisons
  the caller's transaction.
- /health/ready: returns 503 with an explicit "drift: db=… expected=…" message
  when the schema is behind, instead of reporting ready and serving 500s.
- Startup lifespan: logs CRITICAL on drift (advisory — never blocks startup).

Liveness (/health) is untouched, so a drifted container isn't killed into a
crash-loop — it's loudly degraded and self-heals once migrations apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 21:55:21 -04:00

43 lines
1.8 KiB
Python

"""Schema-drift guard: the DB-vs-code head check behind /health/ready and the
startup log. Regression cover for the outage where the backend image shipped
ahead of an un-applied migration and every trees query 500'd."""
from sqlalchemy import text
from app.core.schema_version import db_heads, expected_heads, schema_is_current
def test_expected_heads_is_a_single_known_head():
heads = expected_heads()
# Linear migration history → exactly one head, and it's a real revision id.
assert len(heads) == 1
assert all(h and isinstance(h, str) for h in heads)
async def test_schema_is_current_detects_drift(db_session):
conn = await db_session.connection()
# The test DB is built from create_all (no alembic_version table), so it is
# not Alembic-managed and the check stays quiet — treated as current.
await conn.execute(text("DROP TABLE IF EXISTS alembic_version"))
assert await db_heads(conn) is None
ok, _, _ = await schema_is_current(conn)
assert ok is True
# Stamp an old/wrong revision → drift detected.
await conn.execute(text("CREATE TABLE alembic_version (version_num varchar(32) NOT NULL)"))
await conn.execute(text("INSERT INTO alembic_version (version_num) VALUES ('0000deadbeef')"))
ok, db, expected = await schema_is_current(conn)
assert ok is False
assert db == frozenset({"0000deadbeef"})
# Stamp the code's real head → current again.
head = next(iter(expected))
await conn.execute(text("DELETE FROM alembic_version"))
await conn.execute(text("INSERT INTO alembic_version (version_num) VALUES (:h)"), {"h": head})
ok, _, _ = await schema_is_current(conn)
assert ok is True
# Leave no alembic_version behind for other tests.
await conn.execute(text("DROP TABLE IF EXISTS alembic_version"))