36 Commits

Author SHA1 Message Date
justin 7043532c3b Merge pull request 'Cleanup tool: mark deceased by a child's birth year' (#254) from cleanup-deceased-by-child into main
build-backend / build (push) Successful in 29s
build-frontend / build (push) Successful in 1m29s
2026-06-11 11:08:52 -04:00
justin 1340d1957f Cleanup tool: "mark deceased by a child's birth year" rule
Adds a preview/apply rule to the Cleanup tool for parents who have NO birth date
of their own (so the existing born-on-or-before rule can't reach them) but who
have a child born long ago — they're necessarily deceased. This is the gap that
left ~56 parents in the Paul tree as "unknown".

- cleanup_service.preview_deceased_by_child(year): parents of any child born
  on/before the cutoff, excluding already-deceased; returns child_birth_year.
- GET /trees/{id}/cleanup/deceased-by-child?born_on_or_before=1900. Apply reuses
  the existing POST .../cleanup/deceased (same audited mark-deceased path).
- Frontend: a new card in the Cleanup tool (year input → preview → select →
  apply), preview-first like the rest of the tool.

Test covers preview (finds the no-birthdate parent of a pre-cutoff child,
excludes modern-child parents), child_birth_year, apply, and re-preview drop.
Suite 106 passing.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 11:08:50 -04:00
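The preview rule from commit 1340d1957f can be sketched in memory as below. This is a minimal illustration, not the service code: the `Person` dataclass, the `parent_links` pairs, and the choice to report the earliest qualifying child's year are all assumptions. Only the function name and the `child_birth_year` field come from the commit message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Person:
    # Hypothetical in-memory stand-in for the real person model.
    id: str
    birth_year: Optional[int] = None
    deceased: bool = False


def preview_deceased_by_child(
    people: Dict[str, Person],
    parent_links: List[Tuple[str, str]],  # (parent_id, child_id) pairs
    cutoff_year: int,
) -> List[dict]:
    """Return parents of any child born on/before the cutoff, skipping the already-deceased."""
    hits: Dict[str, int] = {}
    for parent_id, child_id in parent_links:
        parent, child = people[parent_id], people[child_id]
        if parent.deceased or child.birth_year is None:
            continue
        if child.birth_year <= cutoff_year:
            # Assumption: report the earliest qualifying child's birth year.
            previous = hits.get(parent_id)
            if previous is None or child.birth_year < previous:
                hits[parent_id] = child.birth_year
    return [
        {"person_id": pid, "child_birth_year": year}
        for pid, year in sorted(hits.items())
    ]


people = {
    "p1": Person("p1"),                      # no birth date of their own
    "p2": Person("p2"),                      # parent of a modern child
    "p3": Person("p3", deceased=True),       # already marked deceased
    "c1": Person("c1", birth_year=1880),
    "c2": Person("c2", birth_year=1990),
}
links = [("p1", "c1"), ("p2", "c2"), ("p3", "c1")]
preview = preview_deceased_by_child(people, links, cutoff_year=1900)
```

Here only `p1` is returned: `p2`'s child was born after the cutoff and `p3` is already deceased.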
justin e24a7cfcc9 Merge pull request 'Tree cards: living/unset-sex people render gray, not blue' (#253) from living-and-unset-cards-gray into main
build-frontend / build (push) Successful in 1m28s
2026-06-11 10:37:27 -04:00
justin 07944e329e Tree cards: render unset-sex / redacted "Living person" in gray, not blue
The chart mapped gender as `=== "female" ? "F" : "M"`, so anything non-female —
including null — became "M" (blue). On the public site, redacted living people
(whose gender the privacy engine nulls) all showed blue regardless of real sex,
and anywhere a person's sex was simply unset they also showed blue (misleading).

Map male→"M", female→"F", and everything else→null, which family-chart renders
as `card-genderless`. So living/redacted people render gray (and never imply a
sex), and unset-sex people render gray instead of defaulting to male/blue.
Applied to both the member tree (tree/page.tsx) and the public chart
(public-tree-chart.tsx), which share chart.css. Also bumped the genderless color
from the library's washed-out `lightgray` to a warm mid-gray that matches the
muted male/female tones and the brand palette.

Privacy note: `_redact` already nulls gender, so this is purely the client color
mapping — no sex leak, just a correct neutral rendering.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 10:37:25 -04:00
justin a33a88e558 Merge pull request 'docs: note the spouse-layout fix is upstreamed' (#252) from docs-upstream-spouse-fix into main 2026-06-11 09:33:21 -04:00
justin fe8349819f docs: note the spouse-layout fix is upstreamed (donatso/family-chart#105)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 09:33:19 -04:00
justin e745fb5d4d Merge pull request 'Move cardToMiddle fix into the family-chart patch (+ document patches)' (#251) from family-chart-patch-cardtomiddle into main
build-frontend / build (push) Successful in 1m27s
2026-06-11 09:21:32 -04:00
justin e0573e6be2 Move cardToMiddle vertical-centering fix into the family-chart patch
Fold the fly-to vertical-centering fix into our patch-package patch (alongside
the existing spouse-layout fix) instead of compensating in app code, and revert
the in-app workaround so the two don't double-correct.

- patches/family-chart+0.9.0.patch: cardToMiddle now scales datum.y by the zoom
  k in both dist builds (.js + .esm.js), matching datum.x. Verified the patch
  applies cleanly (patch-package --error-on-fail).
- tree/page.tsx: the cardToMiddle caller passes raw y again (the patched library
  does the scaling now); pre-scaling here too would double-correct. Behavior is
  identical to the previous in-app fix — both center the node exactly.
- CLAUDE.md: documents the two family-chart patches, how to regenerate them, and
  that both should be upstreamed. The cardToMiddle fix is submitted upstream
  (donatso/family-chart#103, issue #102); the spouse-layout fix is a TODO.

The frontend Dockerfile already COPYs patches/ before npm ci, so the fix is in
the production build.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 09:21:30 -04:00
justin 3731d77d4b Merge pull request 'Fix fly-to vertical centering at non-1 zoom levels' (#250) from fix-fly-to-vertical-centering into main
build-frontend / build (push) Successful in 1m28s
2026-06-11 08:58:38 -04:00
justin bf1576252b Fix fly-to vertical centering at non-1 zoom levels
Clicking ×N sometimes flew to a blank area far below the tree. Cause:
family-chart's cardToMiddle scales datum.x by the zoom factor k but not datum.y
(`y = height/2 - datum.y`, missing the ·k), so vertical centering is only
correct at k=1 and drifts by datum.y·(k−1) at any other zoom — worse the deeper
the person sits. That's why it worked only when the view happened to be near 1:1.

Compensate by pre-multiplying the y we pass to cardToMiddle by the current
scale, cancelling the library's missing ·k. x was already correct.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 08:58:36 -04:00
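The drift described in commit bf1576252b can be checked with a few lines of arithmetic. This is a sketch in Python rather than the library's JavaScript; the function names are illustrative, and only the formula `y = height/2 - datum.y` and the missing factor `k` come from the commit message.

```python
def translate_y_unpatched(height: float, datum_y: float, k: float) -> float:
    # Mirrors the formula quoted in the commit: the zoom factor k is not applied.
    return height / 2 - datum_y


def translate_y_patched(height: float, datum_y: float, k: float) -> float:
    # The corrected form scales datum_y by k, matching how datum.x is handled.
    return height / 2 - datum_y * k


def on_screen_y(datum_y: float, k: float, translate_y: float) -> float:
    # Under a zoom transform, a point at datum_y renders at datum_y * k + translate_y.
    return datum_y * k + translate_y


height, datum_y, k = 800.0, 600.0, 0.5
center = height / 2

# Unpatched: the node lands datum_y * (k - 1) away from the vertical center.
drift = on_screen_y(datum_y, k, translate_y_unpatched(height, datum_y, k)) - center

# In-app workaround from this commit: pre-multiply the y passed in by k.
workaround_error = (
    on_screen_y(datum_y, k, translate_y_unpatched(height, datum_y * k, k)) - center
)

# Patched library: raw y goes in and the node is centered.
patched_error = on_screen_y(datum_y, k, translate_y_patched(height, datum_y, k)) - center
```

With these numbers the unpatched drift equals `datum_y * (k - 1)`, while both the workaround and the patched formula leave no vertical error. Applying both at once would scale by `k` twice, which is the double-correction the later commit warns about.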
justin 0ed6ba4505 Merge pull request 'Tree: clicking ×N flies to the person's other copy' (#249) from tree-fly-to-duplicate into main
build-frontend / build (push) Successful in 1m31s
2026-06-11 08:47:59 -04:00
justin ed263cf9a7 Tree: clicking ×N flies to the person's other copy (not just flashes)
On a large tree the duplicate's other copy is usually off-screen, so flashing
in place wasn't enough. Clicking the ×N badge now pans/zooms the view to center
the other copy and flashes it on arrival; clicking again cycles through the
remaining copies (for a person drawn 3+ times).

Uses family-chart's exported handlers: cardToMiddle centers a datum (read from
the target card_cont's bound x/y, falling back to its transform attr), keeping
the current zoom level via getCurrentZoom. Verified against the lib: the svg's
parent (f3Canvas) holds the zoom object, and cards are positioned by datum x/y —
same coordinate space cardToMiddle expects. Falls back to an in-place flash if
the zoom object isn't ready. Frontend only; supersedes the flash-only behavior.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 08:47:44 -04:00
justin f7666ad30b Merge pull request 'Tree: Legend by the pan/zoom hint + clickable ×N duplicate badges' (#248) from tree-legend-and-duplicate-flash into main
build-frontend / build (push) Successful in 1m28s
2026-06-11 08:32:53 -04:00
justin 690a6da659 Tree: a Legend by the pan/zoom hint, and clickable ×N duplicate badges
Two small tree-view aids prompted by "why do some people show ×2".

- Legend: a hover/focus "Legend" link next to the "drag to pan…" hint, explaining
  the ×N badge (a person drawn N times in the view because they connect through
  more than one line — a shared ancestor or an intermarriage), the gender card
  colors, and the pan/zoom/recenter controls.
- The ×N badge is now clearly clickable (cursor + hover state); clicking it
  flashes every copy of that person in the current view (a bronze outline pulse),
  so you can spot where else they appear. Implemented by delegating on the chart
  container and matching the d3-bound person id across cards; capture-phase +
  stopPropagation so a badge click flashes instead of recentering.

Frontend only. Honest follow-up: flashing finds copies that are on-screen; a true
"fly to an off-screen copy" needs d3-zoom transform work (the chart pans by
transform, not scroll) — a later enhancement.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 08:32:35 -04:00
justin e7115023e1 Merge pull request 'Person page: server-side search; stop loading the whole tree' (#247) from person-page-server-search into main
build-backend / build (push) Successful in 38s
build-frontend / build (push) Successful in 1m30s
2026-06-11 08:29:32 -04:00
justin 58400ffdf7 Person page: server-side search; stop loading the whole tree
The person page fetched the entire tree on every open — all persons (to build a
name map + power the relative pickers) and all events (to find partnership
events). On a 2k-person tree that's a ~230KB person list + ~600KB event list per
view. Now it loads only what the page shows:

Frontend:
- The relationship & spouse pickers use the backend's fuzzy pg_trgm search
  (debounced, typo-tolerant) instead of substring-filtering a preloaded array —
  better search, and no need to preload every person. PersonCombobox gained an
  `onSearch` server mode (client `people` mode still works).
- The page drops the all-persons and all-events fetches; it resolves just this
  person's relatives' names via GET /persons?ids=..., and reads partnership
  events from the per-person events endpoint.

Backend:
- GET /trees/{id}/persons?ids=a,b,c — batch by id (privacy-filtered, names
  batched), for relative-name display.
- list_events_for_person (member path) now also returns the person's partnership
  events, so the page needn't scan every event in the tree.

Adversarial review (frontend logic + backend/privacy) found no issues. Suite 105
passing.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 08:29:13 -04:00
justin 629bfa1367 Merge pull request 'Fix list_persons N+1 (the ~4s person-page load)' (#246) from fix-person-list-n-plus-one into main
build-backend / build (push) Successful in 37s
2026-06-11 08:00:47 -04:00
justin 1562febdcf Fix list_persons N+1 (the ~4s person-page load)
Opening any person page on a large tree took 4-5s on an idle server. Root cause:
list_persons looped over every person calling privacy.person_visibility (which
issues TWO get_membership_role queries per call) AND _attach_primary_name (one
name query per person). On the reporter's 2,324-person tree that's ~7,000
serialized DB round-trips per page load — the person page fetches the full
person list to build its name-lookup map.

Fix:
- Resolve the viewer's membership role ONCE. Members see the whole tree (full),
  so skip the per-person privacy engine entirely.
- Add _attach_primary_names: one batched names query (person_id IN (...),
  ordered the same as the single-person query so it picks the same name) instead
  of one per person.
- Apply the same batching to the non-member path, search_persons, the deleted-
  persons list, and public_view_service.list_public_persons.

Member-path list_persons goes from ~3·N queries to ~3 total. Other tree-wide
list endpoints (events/relationships/media/citations) were already flat selects.

Adds a regression test that asserts list_persons issues a constant number of
queries (not proportional to person count). Suite: 103 passing.

Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-11 08:00:30 -04:00
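The batching idea behind `_attach_primary_names` in commit 1562febdcf can be illustrated with the standard-library `sqlite3` module. This is a sketch under stated assumptions: the `names` table, its columns, and the ordering rule are hypothetical, and the real service runs against Postgres through its own repository layer.

```python
import sqlite3
from typing import Dict, List


def attach_primary_names(conn: sqlite3.Connection, person_ids: List[int]) -> Dict[int, str]:
    """Fetch one name per person with a single IN (...) query (hypothetical schema)."""
    if not person_ids:
        return {}
    placeholders = ",".join("?" for _ in person_ids)
    rows = conn.execute(
        "SELECT person_id, full_name FROM names "
        f"WHERE person_id IN ({placeholders}) "
        "ORDER BY person_id, is_primary DESC, id",
        list(person_ids),
    )
    names: Dict[int, str] = {}
    for person_id, full_name in rows:
        # The first row per person wins, so the ordering decides which name is picked.
        names.setdefault(person_id, full_name)
    return names


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names ("
    "id INTEGER PRIMARY KEY, person_id INTEGER, full_name TEXT, is_primary INTEGER)"
)
conn.executemany(
    "INSERT INTO names (person_id, full_name, is_primary) VALUES (?, ?, ?)",
    [(1, "Ada P.", 0), (1, "Ada Paul", 1), (2, "Ben Paul", 1)],
)
conn.commit()

# Record every statement SQLite runs while resolving three people.
statements: List[str] = []
conn.set_trace_callback(statements.append)
names = attach_primary_names(conn, [1, 2, 3])
conn.set_trace_callback(None)
select_count = sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))
```

The number of `SELECT` statements stays at one regardless of how many ids are passed, which is the property the commit's regression test asserts for `list_persons`.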
justin 265f5f4e7a Merge pull request 'Close citation/source living-person leak; add on-demand tree purge' (#245) from citation-redaction-and-tree-purge into main
build-backend / build (push) Successful in 32s
build-frontend / build (push) Successful in 1m26s
2026-06-10 22:39:15 -04:00
justin a6179037c2 Close citation/source living-person leak; add on-demand tree purge
Two changes.

1. Privacy fix (NN#2/NN#3) — the citation and source list endpoints gated only
   on can_view_tree, so a non-member on a public/unlisted/site_members tree could
   enumerate citations and sources tied to a redacted living person, leaking that
   the person exists and has sourced facts (and possibly their name via a source
   title). #46 closed this for events/media/names/relationships but not
   citations/sources. Now citation_service.list_citations and
   source_service.{list_sources,get_source} delegate non-member reads to
   public_view_service, mirroring the #46 pattern:
   - citations: shown only when the cited fact resolves to FULL-visibility
     person(s) — covers the person_id, name_id, event_id (person or both-partner),
     and relationship_id (both-partner) target paths.
   - sources: shown only when they back at least one visible citation; a withheld
     source 404s (don't reveal it exists).
   Tests cover all four citation target types + source withholding + member-sees-all.

2. On-demand tree purge — owners can permanently delete a soft-deleted tree now
   instead of waiting out the 30-day auto-purge window. POST /trees/{id}/purge
   (owner-only): the tree must already be in the trash, and the caller retypes its
   name to confirm. Media objects are deleted from storage, then a single
   DELETE on trees cascades all tree-owned rows via the tree_id ON DELETE CASCADE;
   the audit entry survives (tree_id SET NULL). Frontend adds a "Delete forever"
   button to the Recently-deleted list. No migration.

Suite: 102 passing.
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-10 22:38:59 -04:00
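The non-member visibility rule for citations and sources in commit a6179037c2 can be sketched as two small predicates. The data shapes here are assumptions made for illustration: each citation is reduced to the person ids its cited fact resolves to, and `visibility` maps a person id to a level string. The real logic lives in `public_view_service` and is not shown in the commit message.

```python
from typing import Dict, List


def citation_visible(target_person_ids: List[str], visibility: Dict[str, str]) -> bool:
    """A citation is shown only if every person behind the cited fact is fully visible."""
    return bool(target_person_ids) and all(
        visibility.get(person_id) == "full" for person_id in target_person_ids
    )


def source_visible(source_id: str, citations: List[dict], visibility: Dict[str, str]) -> bool:
    """A source is shown only if it backs at least one visible citation."""
    return any(
        c["source_id"] == source_id and citation_visible(c["person_ids"], visibility)
        for c in citations
    )


visibility = {"ann": "full", "bob": "redacted"}
citations = [
    {"source_id": "s1", "person_ids": ["ann"]},          # person-level fact
    {"source_id": "s2", "person_ids": ["ann", "bob"]},   # both-partner fact
    {"source_id": "s2", "person_ids": ["bob"]},
]
```

In this example `s1` is visible, while `s2` is withheld because every citation it backs involves a redacted person. Per the commit, a withheld source returns 404 so that its existence is not revealed.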
justin 7ed3ddd448 Merge pull request 'docs: bring all documentation current with shipped work' (#244) from docs-catch-up into main 2026-06-10 21:05:46 -04:00
justin 447daf7fa8 docs: bring all documentation current with shipped work
A multi-agent audit of every doc against the code surfaced ~50 stale/missing
items (the roadmap/status docs and the backlog had fallen behind the code).
This catches them up:

- CLAUDE.md: phase status was ~3 phases stale ("Phase 1 is next" while Phase 1 +
  chunks of 2 & 4 shipped). Rewrote the status list; added a model-provider
  tech-stack entry; updated repo-layout (integrations objectstore/models,
  deploy backup.sh/dev compose).
- ARCHITECTURE.md: §6 privacy engine described 3 visibility levels — corrected to
  the shipped 4 (adds site_members); documented per-tree AI policy on Tree,
  LLMProvider/EmbeddingProvider split + registry, ChangeProposal origin/status/
  operations, verified-email session gate, instance-owner role, schema-drift
  guard, and the env_file config model.
- PRD.md: 4-level visibility in US-040/§5.5, instance-owner role (§5.1/§5.11),
  per-tree AI policy (§5.8), §8 sequencing annotated with shipped status, header
  date/status bumped.
- README.md: 4-level privacy; softened "Full GEDCOM 7" to the 5.5.1/7 common
  subset; noted backups + instance-owner admin; moved property/land to an
  explicit "where it's headed" (no property models exist yet).
- BACKLOG.md: flipped ~15 shipped-but-open rows to Have (ChangeProposal, provider
  abstraction, GEDCOM citation export, membership management, operator backup,
  email-verification gate, per-tree AI policy, instance owner, the whole
  visibility/public-viewing/child-resource-redaction cluster #41-#51/#46), and
  reconciled the executive summary, "current defects" list, quick wins, and
  differentiators. Left genuinely-open items (citation/source redaction, sitemap,
  per-tree noindex, scoped-token API) accurately open.
- .env.example: dropped "SMTP wired in a later phase"; documented the worker
  purge knobs, S3_PRESIGN_TTL, COOKIE_NAME; removed a stray duplicate line.
- design/: tree-visibility.md and change-proposal.md marked Shipped; corrected
  the redaction approach (reuses member schemas, not a separate PublicPersonRead)
  and the apply() rollback claim (v1 is not cross-op transactional), and marked
  rate-limiting/sitemap/noindex as deferred.

No code changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-10 21:05:29 -04:00
justin 0388b9b99f Merge pull request 'compose: drive app config from .env (env_file, blanket passthrough)' (#243) from compose-env-file into main 2026-06-10 08:46:16 -04:00
justin 00f403defa compose: drive backend/worker/migrate config from .env (env_file)
Replace the per-setting environment allow-list with `env_file: .env` on the
three app-image services, so every setting in app/core/config.py is configurable
from .env with no compose edit. This kills the recurring trap where a documented
env var (OWNER_EMAIL, the AI keys, SMTP, APP_BASE_URL) silently didn't reach the
app because it wasn't on the hand-maintained list.

`env_file` is `required: false` so local/CI without a .env still works (falls
back to ${VAR:-default} interpolation + code defaults). The small `environment:`
block that remains is only for values that must NOT come from .env:
  - RUN_MIGRATIONS=1 (backend) — a deploy flag, not an app setting.
  - DATABASE_URL — pinned to the compose-internal host, because the code default
    points at localhost (wrong inside the network). environment wins over
    env_file, so this is a safety net if .env ever omits it.

Trade-off (accepted, see comment): env_file also injects infra secrets
(POSTGRES_*, MINIO_*, CLOUDFLARE_TUNNEL_TOKEN) into the app process env; the app
ignores unknown vars (pydantic extra="ignore").

Verified on prod: DATABASE_URL resolves to postgres:5432, RUN_MIGRATIONS=1 and
OWNER_EMAIL intact, COOKIE_SECURE=true (no posture change), health 200, trees
200. The earlier explicit AI/SMTP/OWNER passthrough is now subsumed by this.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-10 08:46:00 -04:00
justin 519f1c31b5 Merge pull request 'compose: forward AI provider + mailer/SMTP env to the backend' (#242) from compose-ai-smtp-passthrough into main 2026-06-10 08:39:04 -04:00
justin 3a1395b6af compose: forward AI provider + mailer/SMTP env to the backend
Follow-up to the OWNER_EMAIL passthrough. The backend service env block is an
explicit allow-list, so the documented model-provider keys (ANTHROPIC_*,
OPENAI_*, XAI_*, OLLAMA_*, DEFAULT_*_PROVIDER, LLM_MAX_TOKENS,
EMBEDDING_DIMENSIONS) and mailer settings (MAILER, SMTP_*, APP_BASE_URL,
REQUIRE_EMAIL_VERIFICATION) never reached the container — setting them in .env
was a no-op. The AI assistant/policy and the SMTP mailer run in the backend, so
forward them here.

Side fix: APP_BASE_URL was likewise dropped, so outbound email links used the
code default http://localhost instead of the configured domain. Now forwarded
(verified live: backend reports APP_BASE_URL=https://provenance.paul.farm).

Worker is left as-is (it consumes neither today); it'll need the model vars when
embedding/matching jobs land. Alternative to this growing allow-list is
`env_file: .env` on the service — deferred to avoid forwarding unrelated secrets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-10 08:38:49 -04:00
justin 2712ae469b Merge pull request 'compose: forward OWNER_EMAIL to the backend container' (#241) from compose-forward-owner-email into main 2026-06-09 23:22:59 -04:00
justin 88beb9650f compose: forward OWNER_EMAIL to the backend container
The instance-owner feature reads OWNER_EMAIL, but the backend service's
environment block is an explicit allow-list that didn't include it — so setting
it in .env never reached the app (is_instance_owner always saw "" → no owner).
Add the passthrough.

NOTE: the same allow-list omits the AI provider keys (ANTHROPIC_API_KEY,
OPENAI_*, XAI_*, OLLAMA_*) and SMTP settings, so those documented env vars also
don't currently reach the backend on this deployment. Worth a follow-up
(forward them explicitly, or switch the service to env_file) so .env actually
drives all configuration per the twelve-factor rule.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 23:22:48 -04:00
justin 15504ba6e1 Merge pull request 'Instance owner/operator role (env-declared via OWNER_EMAIL)' (#240) from instance-owner into main
build-backend / build (push) Successful in 29s
build-frontend / build (push) Successful in 1m29s
2026-06-09 23:17:08 -04:00
justin c5631d3eab Add an instance owner/operator role (env-declared via OWNER_EMAIL)
Provenance had no system-level owner: ownership was only per-tree
(TreeMembership), so a self-hosted instance had no operator account and no
instance-admin surface. This adds one, declared by environment per the project's
twelve-factor rule.

- OWNER_EMAIL (comma-separated): the account(s) named here are instance owners.
  Derived at request time — no DB column, no migration, can't drift from the env,
  survives DB resets. is_instance_owner()/InstanceOwner dependency in api/deps.py.
- Ownership requires a VERIFIED email (independent of REQUIRE_EMAIL_VERIFICATION).
  Registration is open, so without this an attacker could seize the role by
  registering the owner address first; verification ties it to inbox control.
- GET /api/v1/admin/instance (owner-only): operational status — version, env,
  user/tree counts, configured AI providers. Deliberately exposes no tree data
  or PII: instance ownership is an operator role, NOT a privacy-engine bypass.
- /users/me reports is_instance_owner; frontend gains an owner-only /admin page
  and a conditional sidebar link (server-enforced, not just client-hidden).

Found-and-fixed by an adversarial security review before merge: the
verified-email land-grab (above) and a frontend null-deref where the admin page
crashed on 401/5xx instead of failing closed.

Docs: .env.example + ARCHITECTURE (notes the not-a-privacy-bypass boundary and
the verified-email requirement). Tests: owner matching, the land-grab guard,
/users/me, and owner-only /admin. Suite 96 passing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 23:16:45 -04:00
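The owner check from commit c5631d3eab can be sketched as a pure function. The signature below is an assumption (the real `is_instance_owner()` lives in `api/deps.py` and its arguments are not shown), and the case-insensitive comparison is also an assumption.

```python
from typing import Set


def parse_owner_emails(raw: str) -> Set[str]:
    """Split the comma-separated OWNER_EMAIL value into a normalized set."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_instance_owner(email: str, email_verified: bool, owner_email_env: str) -> bool:
    """Ownership requires a verified address that is listed in OWNER_EMAIL."""
    if not email_verified:
        # Guards against the land-grab: registering the owner address is not enough.
        return False
    return email.strip().lower() in parse_owner_emails(owner_email_env)


owners = "ops@example.org, admin@example.org"
```

Because the result is derived from the environment on each call, there is no stored flag that could drift from the configuration. An empty `OWNER_EMAIL` yields no owners.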
justin 6fbad3106d Merge pull request 'Guard against schema drift (readiness 503 + loud startup log)' (#239) from schema-drift-guard into main
build-backend / build (push) Successful in 32s
2026-06-09 21:56:08 -04:00
justin 94b5caa7e5 Guard against schema drift: fail readiness + log loudly when DB is behind code
Defense-in-depth for the deploy pipeline. Today a backend image shipped ahead
of an un-applied migration; the Tree model selected columns the DB didn't have
yet, so every trees query 500'd with an opaque UndefinedColumnError and the UI
showed no trees. The root cause (deploys not running migrations) is fixed
separately; this makes the *symptom* impossible to miss.

- app/core/schema_version.py: compare the DB's stamped alembic head to the
  head(s) baked into the image's migration scripts. A DB with no alembic_version
  table (e.g. a create_all test DB) is treated as current, so this stays quiet
  outside real deployments. Uses to_regclass so a missing table never poisons
  the caller's transaction.
- /health/ready: returns 503 with an explicit "drift: db=… expected=…" message
  when the schema is behind, instead of reporting ready and serving 500s.
- Startup lifespan: logs CRITICAL on drift (advisory — never blocks startup).

Liveness (/health) is untouched, so a drifted container isn't killed into a
crash-loop — it's loudly degraded and self-heals once migrations apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 21:55:21 -04:00
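The readiness behavior in commit 94b5caa7e5 can be sketched without a database. The function names, the use of `None` to mean "no `alembic_version` table", and the plain set-equality comparison are assumptions. The real check in `app/core/schema_version.py` reads the stamped head from Postgres.

```python
from typing import Iterable, Optional, Tuple


def describe_drift(db_heads: Optional[Iterable[str]], code_heads: Iterable[str]) -> Optional[str]:
    """Return a drift message, or None when the schema is considered current."""
    if db_heads is None:
        # No alembic_version table (e.g. a create_all test DB): treated as current.
        return None
    db, expected = set(db_heads), set(code_heads)
    if db == expected:
        return None
    return f"drift: db={','.join(sorted(db)) or 'none'} expected={','.join(sorted(expected))}"


def readiness(db_heads: Optional[Iterable[str]], code_heads: Iterable[str]) -> Tuple[int, dict]:
    """Map the drift check onto an HTTP status for a /health/ready style endpoint."""
    message = describe_drift(db_heads, code_heads)
    if message is not None:
        return 503, {"status": "not ready", "detail": message}
    return 200, {"status": "ready"}
```

A liveness endpoint would not call this check at all, which matches the commit's point that a drifted container should be reported as degraded rather than restarted.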
justin f8fa23c1f6 Merge pull request 'Per-tree AI model policy (owner-only admin view)' (#238) from ai-model-policy into main
build-backend / build (push) Successful in 32s
build-frontend / build (push) Successful in 1m27s
2026-06-09 20:53:07 -04:00
justin c6b1e72130 Per-tree AI model policy (owner-only admin view)
The operator decides which model providers exist (env / registry — Anthropic,
OpenAI, x.AI, Ollama, several at once). The *tree owner* decides who uses which:

- Members' assistant -> one configured provider (or none)
- Recommender (association/connection finder) -> one configured provider (or none)
- Owner -> may use any configured provider

Backend: two nullable columns on `trees` (ai_member_provider,
ai_recommender_provider) + migration; `configured_llm_providers()` exposes the
registry as {name, model} with no secrets; owner-gated GET/PATCH
/trees/{id}/ai validate names against the configured set. Frontend: owner-only
"AI models" page with a dropdown per role, graceful 403 for non-owners, and a
sidebar link.

Per-model-within-a-provider selection is a follow-up; today each provider maps
to its single configured model.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 20:52:30 -04:00
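The validation step for `PATCH /trees/{id}/ai` in commit c6b1e72130 can be sketched as follows. The column names come from the commit message; the function name, the dict-shaped patch, and raising `ValueError` are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional

POLICY_FIELDS = ("ai_member_provider", "ai_recommender_provider")


def validate_ai_policy(patch: Dict[str, Optional[str]], configured: Iterable[str]) -> Dict[str, Optional[str]]:
    """Accept only known policy fields whose value is None or a configured provider name."""
    allowed = set(configured)
    cleaned: Dict[str, Optional[str]] = {}
    for field, value in patch.items():
        if field not in POLICY_FIELDS:
            raise ValueError(f"unknown policy field: {field}")
        if value is not None and value not in allowed:
            raise ValueError(f"provider not configured: {value}")
        cleaned[field] = value  # None means "no provider for this role"
    return cleaned


configured = ["anthropic", "ollama"]
accepted = validate_ai_policy(
    {"ai_member_provider": "ollama", "ai_recommender_provider": None}, configured
)
```

Checking against the configured set on every write keeps a tree from pointing at a provider the operator has not enabled.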
justin ceafb299d6 Merge pull request 'Model providers: OpenAI/xAI/Ollama + run several at once' (#237) from multi-provider-openai-xai-ollama into main
build-backend / build (push) Successful in 32s
2026-06-09 18:39:20 -04:00
justin de50f2c803 Model providers: OpenAI/xAI/Ollama + run several at once (registry)
Extends the #215 abstraction:
- OpenAICompatibleLLMProvider / OpenAICompatibleEmbeddingProvider — one impl (via
  the official openai SDK) covers OpenAI, xAI (api.x.ai/v1), Ollama
  (…:11434/v1), OpenRouter, etc.; they differ only by base_url, key, and model.
- Registry factory: build_llm_providers() / build_embedding_providers() return
  every provider whose credentials are configured, so you can run several
  concurrently. get_llm_provider(name)/get_embedding_provider(name) select by
  name, falling back to default_*_provider, then Null.
- Per-provider env config (ANTHROPIC_*, OPENAI_*, XAI_*, OLLAMA_*) +
  DEFAULT_LLM_PROVIDER / DEFAULT_EMBEDDING_PROVIDER; documented in .env.example.
  Defaults keep AI off (empty registry).

Embeddings now have real backends (OpenAI/Ollama), still separate from the LLM
since Anthropic offers no embeddings endpoint. Tests cover multi-provider
selection, default resolution, disabled-without-credentials, and null fail-loud.
Full suite 87 passed.

Relates to #215.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
2026-06-09 18:39:19 -04:00
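The registry and fallback order from commit de50f2c803 can be sketched with plain classes. The function names `build_llm_providers` and `get_llm_provider` come from the commit message, but their signatures here are assumptions, as are the credential variable names other than `ANTHROPIC_API_KEY`, the `*_MODEL` variables, and the provider classes. No network calls are made.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


class NullLLMProvider:
    name = "null"

    def complete(self, prompt: str) -> str:
        # Fail loudly rather than silently returning nothing.
        raise RuntimeError("No LLM provider is configured")


@dataclass(frozen=True)
class ConfiguredProvider:
    # Hypothetical stand-in for a real provider implementation.
    name: str
    model: str


# Assumed variable names: a provider counts as configured when its variable is set.
CREDENTIAL_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "ollama": "OLLAMA_BASE_URL",
}


def build_llm_providers(env: Mapping[str, str]) -> Dict[str, ConfiguredProvider]:
    """Return every provider whose credentials are present; empty means AI is off."""
    registry: Dict[str, ConfiguredProvider] = {}
    for name, var in CREDENTIAL_VARS.items():
        if env.get(var):
            model = env.get(f"{name.upper()}_MODEL", "unspecified")
            registry[name] = ConfiguredProvider(name=name, model=model)
    return registry


def get_llm_provider(registry, name: Optional[str] = None, default: Optional[str] = None):
    """Select by name, then fall back to the default, then to the Null provider."""
    for candidate in (name, default):
        if candidate and candidate in registry:
            return registry[candidate]
    return NullLLMProvider()


env = {"OPENAI_API_KEY": "sk-placeholder", "OLLAMA_BASE_URL": "http://localhost:11434/v1"}
registry = build_llm_providers(env)
```

With this environment the registry holds `openai` and `ollama`. Asking for an unconfigured name falls through to the default, and an empty registry yields the Null provider, whose `complete` raises.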
62 changed files with 3189 additions and 263 deletions
+30 -5
@@ -30,6 +30,7 @@ These are product invariants, not preferences. Do not violate them, and flag any
- **Object storage:** S3-compatible (MinIO for self-host).
- **Edge:** Caddy reverse proxy; optional Cloudflare Tunnel (preferred ingress, never required).
- **Email:** operator-configured SMTP.
- **Model providers:** pluggable `LLMProvider` + `EmbeddingProvider` abstraction (ABCs) with Null / Anthropic / OpenAI-compatible (OpenAI, xAI, Ollama) implementations; an operator configures one or more via env and they're selectable by name through a registry (per-tree AI policy + `default_llm_provider`/`default_embedding_provider`).
- **CI/CD:** Gitea Actions build per-component images. **Push** to the LAN registry `192.168.0.2:1234` (plain HTTP, bypasses Cloudflare's body limit); **pull** via the public `git.jpaul.io` FQDN. Servers pull to deploy — no host build. Mirrors the drawbar setup; see [[gitea-lan-push-fqdn-pull]].
Pick libraries consistent with this stack. If you introduce a significant dependency or a new service, note it in ARCHITECTURE.md in the same change.
@@ -39,17 +40,24 @@ Pick libraries consistent with this stack. If you introduce a significant depend
```
/ # docs and project meta (this file, README, LICENSE, COC, CONTRIBUTING)
/docs # PRD.md, ARCHITECTURE.md
/backend # FastAPI service (uv-managed). app/{api/v1, services (+ privacy engine), repositories, models, schemas, integrations (auth/mailer), core}; migrations/ = Alembic
/deploy # docker-compose.yml, Caddyfile, .env.example — the self-host stack
/backend # FastAPI service (uv-managed). app/{api/v1, services (+ privacy engine), repositories, models, schemas, integrations (auth, mailer, objectstore, models = pluggable LLM/embedding providers), core}; migrations/ = Alembic
/deploy # docker-compose.yml (+ docker-compose.dev.yml), Caddyfile, .env.example, backup.sh + BACKUP.md (one-command pg_dump + MinIO backup) — the self-host stack
/.gitea/workflows # Gitea Actions CI (build images → Gitea registry)
/frontend # Next.js (App Router, TS, Tailwind, shadcn-style UI). app/ pages, lib/api generated OpenAPI client, components/ui
```
Phase 0 is landing **deploy-first**: the compose stack (Postgres + MinIO + Caddy + a minimal FastAPI backend exposing `/health` and `/health/ready`) and CI come before the real data model and the frontend. Backend dependencies are managed with **uv**; migrations use **Alembic**. The core data model (ARCHITECTURE §5), **local auth** (Argon2 passwords, backend-issued sessions, email verify/reset behind the `AuthProvider` interface; API auth via Bearer header or HttpOnly cookie), and the **Next.js frontend scaffold** (Tailwind + shadcn-style UI, generated OpenAPI client, auth + tree/person views) have all landed — **Phase 0 is complete and running on the live deployment.** Phase 1 (core tree features — media, soft-delete recovery, richer CRUD) is next; OIDC/social auth is Phase 5. Keep this section current as the tree grows.
Phase 0 landed **deploy-first**: the compose stack (Postgres + MinIO + Caddy + FastAPI backend) and CI before the data model and frontend. Backend deps use **uv**; migrations use **Alembic**. Status (keep current as the tree grows):
- **Phase 0 — Foundation: complete** and running live (core data model, local auth behind `AuthProvider`, Next.js frontend).
- **Phase 1 — Core tree: complete.** Media (upload/serve), soft-delete + recovery UI, full CRUD across entities, and the 4-level tree visibility/privacy model (#41-#51).
- **Phase 2 — substantially landed.** GEDCOM import (preview→apply, duplicate-aware) and export (citation-preserving, #232); fuzzy name search (pg_trgm) + the public `/explore` directory. Living-person protection is still hardening.
- **Phase 4 — AI assistant foundations landed.** Pluggable `LLMProvider`/`EmbeddingProvider` abstraction + multi-provider registry (Anthropic/OpenAI/xAI/Ollama, #235/#237), the **ChangeProposal** propose-then-confirm flow (#236), and per-tree AI model policy (#238). The assistant's *tool surface that emits proposals* is the remaining piece.
- Also shipped: tree membership management (#233), an **instance owner/operator** role (`OWNER_EMAIL`, #240), a schema-drift readiness guard (#239), and a one-command operator backup (#234).
- **Not built yet:** Phase 3 (Property — parcels/deeds/chain-of-title; no property models exist), Phase 5 (OIDC/social auth — only the `AuthProvider` ABC exists), and cross-tree hints (last; needs multiple populated trees + the embedding provider).
## Where to start
The roadmap is phased in PRD §8. Build in dependency order. **Phase 0 — Foundation is complete** and running on the live deployment; **Phase 1 (core tree features) is the current target.** For reference, Phase 0 covered:
The roadmap is phased in PRD §8. Build in dependency order. **Phases 0 and 1 are complete**, Phase 2 is substantially done, and Phase 4's AI foundations have shipped (see the status list above). The biggest unbuilt areas are **Phase 3 (Property)** and **Phase 5 (OIDC/social auth)** — likely current targets. For reference, Phase 0 covered:
1. Backend skeleton (FastAPI, async, layered) + Postgres + migrations
2. Core data model from ARCHITECTURE §5 — start with User, Tree, TreeMembership, Person, Name, Relationship, Event, Place, Source, Citation, AuditEntry, soft-delete support
@@ -58,7 +66,7 @@ The roadmap is phased in PRD §8. Build in dependency order. **Phase 0 — Found
5. The deploy stack: `compose` for app + postgres + objectstore, Caddy config, env-driven settings
6. CI/CD: Gitea Actions building images to the registry
Don't get ahead of the phases. GEDCOM lands before the assistant (so AI writes target a stable model); property follows a tested people graph; hints come last because they need multiple populated trees. If you think the order is wrong, raise it rather than reordering silently.
Don't get ahead of the phases. GEDCOM and the assistant's propose-diff foundation (provider abstraction + ChangeProposal approval flow) have shipped; the remaining dependency-ordered work is **Property** (Phase 3, on top of the tested people graph), then richer collaboration/audit UI, with **cross-tree hints last** (they need multiple populated trees and the embedding provider). If you think the order is wrong, raise it rather than reordering silently.
## Conventions
@@ -69,6 +77,23 @@ Don't get ahead of the phases. GEDCOM lands before the assistant (so AI writes t
- **Privacy/assistant/hint code gets extra care** — these are the areas where bugs do real harm. Prefer a design note before a large change.
- **No secrets in the repo.** Config via env; provide `.env.example` with placeholders.
## Patched dependencies (family-chart)
The tree view uses **family-chart** (d3-based). Two adjustments live in the repo:
- **CSS is vendored** at `frontend/app/trees/[id]/tree/chart.css` — the package blocks its CSS subpath export, so we copy it in.
- **The library is patched** via `patch-package` (`frontend/patches/family-chart+0.9.0.patch`, applied by the `postinstall` hook; the backend/frontend Dockerfiles `COPY patches` before install). Both hunks touch `dist/family-chart.js` **and** `dist/family-chart.esm.js` (the app loads the `esm` build). Current fixes:
1. **Spouse-centering layout** (`setupSpouses` / `sortChildrenWithSpouses`) — center a person between two spouses with children under the correct pair.
2. **`cardToMiddle` vertical centering** — the lib scaled `datum.x` by the zoom factor `k` but not `datum.y`, so "fly to a node" drifted vertically at any zoom ≠ 1; we add the missing `* k`.
To change a patch: edit the file(s) under `node_modules/family-chart/dist/`, then `cd frontend && npx patch-package family-chart` to regenerate, and verify with `npx patch-package --error-on-fail`.
**Upstreamed.** Both are general library bugfixes, not app-specific, and are submitted upstream:
- `cardToMiddle` vertical centering — **donatso/family-chart#103** (issue **#102**).
- Multi-spouse centered layout — **donatso/family-chart#105** (issue **#104**).
If either is merged + released, bump `family-chart`, drop the corresponding patch hunk, **and** remove any in-app compensation (e.g. the `cardToMiddle` caller in `tree/page.tsx` passes raw `y` precisely because the patch fixes it — pre-scaling there too would double-correct). Until then, keep the patch.
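The vertical drift is easiest to see as arithmetic. A minimal Python sketch, assuming the usual d3-style zoom transform in which a data point `(x, y)` is drawn at `(x * k + tx, y * k + ty)`; the function names and viewport numbers are illustrative and are not family-chart's API:

```python
def center_translation(x, y, k, width, height):
    """Translation (tx, ty) that puts data point (x, y) at the viewport center
    under zoom factor k, assuming screen = data * k + translation."""
    return width / 2 - x * k, height / 2 - y * k


def unscaled_y_translation(x, y, k, width, height):
    """Same as above, but y is not multiplied by k (the behavior the patch fixes)."""
    return width / 2 - x * k, height / 2 - y


def screen_position(x, y, k, tx, ty):
    """Where the data point lands on screen for a given transform."""
    return x * k + tx, y * k + ty
```

At `k == 1` the two translations agree, which is consistent with the drift only showing up at zoom ≠ 1.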
## License & contribution terms
Provenance is **source-available** under **BUSL-1.1** (see [LICENSE](LICENSE)): free for personal/family/non-commercial use, no third-party commercial hosting, and each release converts to **AGPL-3.0** four years after it ships. The DCO sign-off keeps the licensing chain clean so the maintainer can manage that conversion and a possible future hosted offering. Don't add code under an incompatible license, and don't vendor dependencies whose licenses conflict with eventual AGPL distribution.
+5 -4
@@ -19,13 +19,14 @@ Every fact links to its source. Every claim can be traced. Nothing is just asser
## What it does
- **Build a tree that holds up.** People, relationships, events, and places — with every fact linked to the document, photo, or record it came from.
- **Trace the land, not just the family.** Properties are first-class. Record ownership events (grants, deeds, inheritances, sales), reconstruct chain-of-title, and tie parcels to the people who held them.
- **Bring your own archive.** Scans, PDFs, photos, audio recordings — first-class citizens, not afterthoughts.
- **A research assistant that proposes, never overwrites.** The built-in AI assistant searches legal sources, lays out what it found, and waits for your approval before anything touches your data. You can point it at the major model providers or a self-hosted model — your keys, your choice.
- **Standards over silos.** Full GEDCOM 7 import and export. Migrate in, migrate out.
- **Privacy you control.** Public, unlisted, or private per tree; any individual can be hidden; living people are protected by default.
- **Standards over silos.** GEDCOM import and export (5.5.1 / 7 common subset) — duplicate-aware import, citation-preserving export. Migrate in, migrate out.
- **Privacy you control.** Public, members-only (any signed-in user on your instance), unlisted, or private per tree; any individual can be hidden; living people are protected by default.
- **Find your people.** When another user's tree overlaps with yours, Provenance can surface an anonymous "possible match" — and only connects you if you both say yes.
- **Run it your way.** Container-native. Self-host behind Caddy and, if you like, a Cloudflare Tunnel. Multi-tenant, so your whole extended family — or a whole community of strangers — can coexist on one deployment.
- **Run it your way.** Container-native. Self-host behind Caddy and, if you like, a Cloudflare Tunnel. Multi-tenant, so your whole extended family — or a whole community of strangers — can coexist on one deployment. One-command backups (Postgres + object storage) and an instance-owner admin role keep operations in your hands.
**Where it's headed — trace the land, not just the family.** The same source-backed treatment for *property*: parcels, deeds, and ownership events, reconstructing chain-of-title and tying land to the people who held it. The people side ships today; the land half is on the roadmap, not yet built — but it's why Provenance exists, not an afterthought.
## Who it's for
+99 -13
@@ -54,6 +54,36 @@ async def get_current_user_or_none(request: Request, session: SessionDep) -> Use
CurrentUserOrNone = Annotated[User | None, Depends(get_current_user_or_none)]
def is_instance_owner(user: User) -> bool:
    """Whether this account is an instance owner/operator — i.e. its email is
    named in OWNER_EMAIL *and* that email has been verified. Instance ownership
    is an operational/config role; it does NOT bypass the privacy engine or grant
    access to others' tree data.

    The verified-email requirement is load-bearing: registration is open and (by
    default) doesn't require verification, so without it an attacker could claim
    the owner email by registering it before the operator does — a land-grab to
    the highest role with no proof of inbox control. Requiring verification ties
    ownership to actual control of the named inbox regardless of the global
    REQUIRE_EMAIL_VERIFICATION setting. (Self-hosts without SMTP can verify via
    the link the console mailer prints to the operator-controlled logs.)"""
    owners = get_settings().owner_emails()
    return (
        bool(owners)
        and user.email_verified_at is not None
        and user.email.strip().lower() in owners
    )


async def require_instance_owner(current: CurrentUser) -> User:
    if not is_instance_owner(current):
        raise HTTPException(status.HTTP_403_FORBIDDEN, "instance owner only")
    return current


InstanceOwner = Annotated[User, Depends(require_instance_owner)]
def get_mailer() -> Mailer:
settings = get_settings()
if settings.mailer == "smtp" and settings.smtp_host:
@@ -71,26 +101,82 @@ def get_objectstore() -> ObjectStore:
ObjectStoreDep = Annotated[ObjectStore, Depends(get_objectstore)]
def get_llm_provider() -> LLMProvider:
settings = get_settings()
if settings.model_provider == "anthropic" and settings.anthropic_api_key:
from app.integrations.models.anthropic_provider import AnthropicLLMProvider
def build_llm_providers() -> dict[str, LLMProvider]:
"""Every LLM provider whose credentials are configured, keyed by name. Run
several at once; pick one with get_llm_provider(name)."""
from app.integrations.models.anthropic_provider import AnthropicLLMProvider
from app.integrations.models.openai_compat import OpenAICompatibleLLMProvider
return AnthropicLLMProvider(
api_key=settings.anthropic_api_key,
model=settings.llm_model,
max_tokens=settings.llm_max_tokens,
s = get_settings()
providers: dict[str, LLMProvider] = {}
if s.anthropic_api_key:
providers["anthropic"] = AnthropicLLMProvider(
api_key=s.anthropic_api_key, model=s.anthropic_model, max_tokens=s.llm_max_tokens
)
return NullLLMProvider()
if s.openai_api_key:
providers["openai"] = OpenAICompatibleLLMProvider(
api_key=s.openai_api_key, base_url=s.openai_base_url, model=s.openai_model,
max_tokens=s.llm_max_tokens,
)
if s.xai_api_key:
providers["xai"] = OpenAICompatibleLLMProvider(
api_key=s.xai_api_key, base_url=s.xai_base_url, model=s.xai_model,
max_tokens=s.llm_max_tokens,
)
if s.ollama_enabled:
providers["ollama"] = OpenAICompatibleLLMProvider(
api_key=None, base_url=s.ollama_base_url, model=s.ollama_model,
max_tokens=s.llm_max_tokens,
)
return providers
def configured_llm_providers() -> list[dict]:
"""Configured LLM providers as {name, model} — for the AI admin view (no
secrets). Mirrors build_llm_providers() without constructing clients."""
s = get_settings()
out: list[dict] = []
if s.anthropic_api_key:
out.append({"name": "anthropic", "model": s.anthropic_model})
if s.openai_api_key:
out.append({"name": "openai", "model": s.openai_model})
if s.xai_api_key:
out.append({"name": "xai", "model": s.xai_model})
if s.ollama_enabled:
out.append({"name": "ollama", "model": s.ollama_model})
return out
def get_llm_provider(name: str | None = None) -> LLMProvider:
"""The named LLM provider, or the configured default, or Null if unconfigured."""
providers = build_llm_providers()
return providers.get(name or get_settings().default_llm_provider) or NullLLMProvider()
LLMProviderDep = Annotated[LLMProvider, Depends(get_llm_provider)]
def get_embedding_provider() -> EmbeddingProvider:
# Only the null provider exists today; concrete embedders (Ollama/Voyage)
# implement the same interface and are selected here by settings.embedding_provider.
return NullEmbeddingProvider()
def build_embedding_providers() -> dict[str, EmbeddingProvider]:
from app.integrations.models.openai_compat import OpenAICompatibleEmbeddingProvider
s = get_settings()
providers: dict[str, EmbeddingProvider] = {}
if s.openai_api_key:
providers["openai"] = OpenAICompatibleEmbeddingProvider(
api_key=s.openai_api_key, base_url=s.openai_base_url,
model=s.openai_embedding_model, dimensions=s.embedding_dimensions,
)
if s.ollama_enabled:
providers["ollama"] = OpenAICompatibleEmbeddingProvider(
api_key=None, base_url=s.ollama_base_url,
model=s.ollama_embedding_model, dimensions=s.embedding_dimensions,
)
return providers
def get_embedding_provider(name: str | None = None) -> EmbeddingProvider:
providers = build_embedding_providers()
return providers.get(name or get_settings().default_embedding_provider) or NullEmbeddingProvider()
EmbeddingProviderDep = Annotated[EmbeddingProvider, Depends(get_embedding_provider)]
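The registry pattern in this file (build every configured provider, then look one up by name with a null fallback) can be sketched without any SDKs. This is an illustration only: `NullProvider`, `FakeProvider`, and the plain `dict` standing in for `Settings` are hypothetical, not the project's classes.

```python
from dataclasses import dataclass
from typing import Optional


class NullProvider:
    """Hypothetical stand-in returned when no usable provider is configured."""
    name = "null"


@dataclass
class FakeProvider:
    """Hypothetical stand-in for a real provider client."""
    name: str
    model: str


def build_providers(settings: dict) -> dict:
    """One entry per provider whose credentials/flag are present."""
    providers = {}
    if settings.get("anthropic_api_key"):
        providers["anthropic"] = FakeProvider("anthropic", settings["anthropic_model"])
    if settings.get("ollama_enabled"):
        providers["ollama"] = FakeProvider("ollama", settings["ollama_model"])
    return providers


def get_provider(settings: dict, name: Optional[str] = None):
    """Named provider, else the configured default, else the null provider."""
    providers = build_providers(settings)
    wanted = name or settings.get("default_llm_provider", "null")
    return providers.get(wanted) or NullProvider()
```

One consequence of the fallback: a default that names a provider without credentials resolves to the null provider rather than raising, so a misconfigured default fails quietly.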
+14 -2
@@ -12,6 +12,7 @@ from sqlalchemy import text
from app.core.config import get_settings
from app.core.db import get_engine
from app.core.schema_version import schema_is_current
router = APIRouter(tags=["health"])
@@ -33,9 +34,20 @@ async def ready(response: Response) -> dict:
try:
async with get_engine().connect() as conn:
await conn.execute(text("SELECT 1"))
checks["database"] = "ok"
checks["database"] = "ok"
# Schema drift = code ahead of the DB; queries would 500. Fail
# readiness loudly rather than serve a broken surface.
ok, db, expected = await schema_is_current(conn)
if not ok:
checks["schema"] = (
f"drift: db={sorted(db) or ['none']} expected={sorted(expected)} "
"— run 'alembic upgrade head'"
)
response.status_code = status.HTTP_503_SERVICE_UNAVAILABLE
return {"status": "not ready", "checks": checks}
checks["schema"] = "ok"
return {"status": "ready", "checks": checks}
except Exception as exc: # noqa: BLE001 — surface any failure as "not ready"
checks["database"] = "error"
checks.setdefault("database", "error")
response.status_code = status.HTTP_503_SERVICE_UNAVAILABLE
return {"status": "not ready", "checks": checks, "detail": str(exc)}
+4
@@ -3,6 +3,8 @@
from fastapi import APIRouter
from app.api.v1 import (
admin,
ai,
auth,
citations,
cleanup,
@@ -36,3 +38,5 @@ api_router.include_router(cleanup.router)
api_router.include_router(public.router)
api_router.include_router(members.router)
api_router.include_router(proposals.router)
api_router.include_router(ai.router)
api_router.include_router(admin.router)
+38
@@ -0,0 +1,38 @@
"""Instance-admin surface — owner-only (OWNER_EMAIL). Operational status and
instance-wide configuration. Deliberately exposes no tree contents or PII:
instance ownership is an operator role, not a privacy bypass."""
from sqlalchemy import func, select
from fastapi import APIRouter

from app.api.deps import InstanceOwner, SessionDep, configured_llm_providers
from app.core.config import get_settings
from app.models.tree import Tree
from app.models.user import User
from app.schemas.admin import InstanceStatus
from app.schemas.ai_policy import ConfiguredProvider

router = APIRouter(prefix="/admin", tags=["admin"])


@router.get("/instance", response_model=InstanceStatus)
async def instance_status(owner: InstanceOwner, session: SessionDep) -> InstanceStatus:
    """Operator dashboard data. Requires the caller to be an instance owner."""
    s = get_settings()
    user_count = await session.scalar(
        select(func.count()).select_from(User).where(User.deleted_at.is_(None))
    )
    tree_count = await session.scalar(
        select(func.count()).select_from(Tree).where(Tree.deleted_at.is_(None))
    )
    return InstanceStatus(
        version=s.version,
        env=s.app_env,
        owner_emails=sorted(s.owner_emails()),
        require_email_verification=s.require_email_verification,
        user_count=user_count or 0,
        tree_count=tree_count or 0,
        default_llm_provider=s.default_llm_provider,
        ai_providers=[ConfiguredProvider(**p) for p in configured_llm_providers()],
    )
+34
@@ -0,0 +1,34 @@
"""Per-tree AI model policy — owner-only admin view."""
import uuid

from fastapi import APIRouter

from app.api.deps import CurrentUser, SessionDep
from app.schemas.ai_policy import TreeAiPolicyRead, TreeAiPolicyUpdate
from app.services import ai_policy_service, tree_service

router = APIRouter(prefix="/trees", tags=["ai"])


@router.get("/{tree_id}/ai", response_model=TreeAiPolicyRead)
async def get_ai_policy(
    tree_id: uuid.UUID, session: SessionDep, current: CurrentUser
) -> TreeAiPolicyRead:
    tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
    return TreeAiPolicyRead(**await ai_policy_service.get_policy(session, actor=current, tree=tree))


@router.patch("/{tree_id}/ai", response_model=TreeAiPolicyRead)
async def update_ai_policy(
    tree_id: uuid.UUID, data: TreeAiPolicyUpdate, session: SessionDep, current: CurrentUser
) -> TreeAiPolicyRead:
    tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
    policy = await ai_policy_service.update_policy(
        session,
        actor=current,
        tree=tree,
        member_provider=data.member_provider,
        recommender_provider=data.recommender_provider,
    )
    return TreeAiPolicyRead(**policy)
+19
@@ -6,6 +6,7 @@ from app.api.deps import CurrentUser, SessionDep
from app.schemas.cleanup import (
CleanupResult,
DeceasedApply,
DeceasedByChildCandidate,
DeceasedCandidate,
GenderApply,
GenderProposal,
@@ -31,6 +32,24 @@ async def preview_deceased(
return [DeceasedCandidate(**r) for r in rows]
@router.get(
"/{tree_id}/cleanup/deceased-by-child", response_model=list[DeceasedByChildCandidate]
)
async def preview_deceased_by_child(
tree_id: uuid.UUID,
session: SessionDep,
current: CurrentUser,
born_on_or_before: int = 1900,
) -> list[DeceasedByChildCandidate]:
"""People with a child born on/before the cutoff — necessarily deceased even
when their own birth date is missing. Apply via POST .../cleanup/deceased."""
tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
rows = await cleanup_service.preview_deceased_by_child(
session, actor=current, tree=tree, year=born_on_or_before
)
return [DeceasedByChildCandidate(**r) for r in rows]
@router.post("/{tree_id}/cleanup/deceased", response_model=CleanupResult)
async def apply_deceased(
tree_id: uuid.UUID, data: DeceasedApply, session: SessionDep, current: CurrentUser
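The "deceased by a child's birth year" rule can be sketched in memory. This is a rough illustration, not the service's query: the `Person` shape is hypothetical, and reporting the earliest qualifying child's year is an assumption, since the diff only says the preview returns `child_birth_year`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    """Hypothetical in-memory record; the real model lives in the database."""
    id: str
    deceased: bool = False
    birth_year: Optional[int] = None
    parent_ids: list = field(default_factory=list)


def deceased_by_child(people: list, cutoff: int) -> dict:
    """Map parent id -> birth year of a child born on/before the cutoff.

    Parents already marked deceased are excluded. When several children
    qualify, the earliest year is kept (an assumption for this sketch).
    """
    by_id = {p.id: p for p in people}
    found = {}
    for child in people:
        if child.birth_year is None or child.birth_year > cutoff:
            continue
        for pid in child.parent_ids:
            parent = by_id.get(pid)
            if parent is None or parent.deceased:
                continue
            found[pid] = min(found.get(pid, child.birth_year), child.birth_year)
    return found
```

The parent's own birth date never enters the check, which is the point of the rule: it reaches people the born-on-or-before rule cannot.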
+11 -2
@@ -1,6 +1,6 @@
import uuid
from fastapi import APIRouter, status
from fastapi import APIRouter, HTTPException, status
from app.api.deps import CurrentUser, SessionDep
from app.schemas.person import PersonCreate, PersonRead, PersonUpdate
@@ -41,9 +41,18 @@ async def list_persons(
current: CurrentUser,
deleted: bool = False,
q: str | None = None,
ids: str | None = None,
) -> list[PersonRead]:
tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
if q:
if ids is not None:
try:
id_list = [uuid.UUID(x) for x in ids.split(",") if x.strip()]
except ValueError as exc:
raise HTTPException(status.HTTP_422_UNPROCESSABLE_ENTITY, "invalid ids") from exc
persons = await person_service.list_persons_by_ids(
session, viewer_id=current.id, tree=tree, ids=id_list
)
elif q:
persons = await person_service.search_persons(
session, viewer_id=current.id, tree=tree, query=q
)
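The `ids` parsing above is small enough to check in isolation. A minimal sketch that mirrors the list comprehension in the handler; `parse_ids` is a hypothetical helper name, and the handler maps the `ValueError` to a 422 itself:

```python
import uuid


def parse_ids(raw: str) -> list:
    """Split a comma-separated string into UUIDs, skipping blank items.

    Raises ValueError if any non-blank item is not a valid UUID.
    """
    return [uuid.UUID(x) for x in raw.split(",") if x.strip()]
```

Note that the blank-item filter uses `x.strip()` but the conversion receives the unstripped item, so a value with a space after the comma (`"a, b"`) is rejected rather than trimmed.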
+17 -2
@@ -2,8 +2,8 @@ import uuid
from fastapi import APIRouter, status
from app.api.deps import CurrentUser, SessionDep
from app.schemas.tree import TreeCreate, TreeRead, TreeUpdate
from app.api.deps import CurrentUser, ObjectStoreDep, SessionDep
from app.schemas.tree import TreeCreate, TreePurge, TreeRead, TreeUpdate
from app.services import tree_service
router = APIRouter(prefix="/trees", tags=["trees"])
@@ -57,3 +57,18 @@ async def delete_tree(tree_id: uuid.UUID, session: SessionDep, current: CurrentU
async def restore_tree(tree_id: uuid.UUID, session: SessionDep, current: CurrentUser) -> TreeRead:
tree = await tree_service.restore_tree(session, actor=current, tree_id=tree_id)
return TreeRead.model_validate(tree)
@router.post("/{tree_id}/purge", status_code=status.HTTP_204_NO_CONTENT)
async def purge_tree(
tree_id: uuid.UUID,
data: TreePurge,
session: SessionDep,
current: CurrentUser,
store: ObjectStoreDep,
) -> None:
"""Permanently delete a soft-deleted tree and all its data — irreversible.
Owner-only; the tree must be in the trash and `confirm_name` must match."""
await tree_service.purge_tree(
session, store, actor=current, tree_id=tree_id, confirm_name=data.confirm_name
)
+9 -3
@@ -1,15 +1,21 @@
from fastapi import APIRouter, File, Form, Response, UploadFile
from app.api.deps import CurrentUser, ObjectStoreDep, SessionDep
from app.api.deps import CurrentUser, ObjectStoreDep, SessionDep, is_instance_owner
from app.schemas.user import UserRead, UserSelfPersonUpdate
from app.services import account_service, user_service
router = APIRouter(prefix="/users", tags=["users"])
def _me(user) -> UserRead:
out = UserRead.model_validate(user)
out.is_instance_owner = is_instance_owner(user)
return out
@router.get("/me", response_model=UserRead)
async def read_me(current: CurrentUser) -> UserRead:
return UserRead.model_validate(current)
return _me(current)
@router.patch("/me/self-person", response_model=UserRead)
@@ -20,7 +26,7 @@ async def set_self_person(
user = await user_service.set_self_person(
session, user=current, person_id=data.self_person_id
)
return UserRead.model_validate(user)
return _me(user)
@router.get("/me/export")
+39 -5
@@ -22,6 +22,18 @@ class Settings(BaseSettings):
version: str = "0.0.0"
app_env: str = Field(default="development", description="development | production")
# --- Instance owner / operator ---
# Email(s) of the instance owner(s) — the operator(s) who run this server.
# The matching account(s) get instance-admin rights (instance-wide settings;
# see /api/v1/admin). Comma-separated for several. Empty = no designated
# owner (the instance has no operator account). Derived at request time, so
# changing it takes effect immediately with no migration or DB state.
owner_email: str = ""
def owner_emails(self) -> frozenset[str]:
"""Normalized (lowercased, trimmed) owner emails; empty if none set."""
return frozenset(e.strip().lower() for e in self.owner_email.split(",") if e.strip())
# SQLAlchemy async URL, e.g. postgresql+asyncpg://user:pass@host:5432/db
database_url: str = Field(
default="postgresql+asyncpg://provenance:provenance@localhost:5432/provenance",
@@ -61,12 +73,34 @@ class Settings(BaseSettings):
smtp_from: str = "Provenance <no-reply@provenance.local>"
# --- Model providers (AI assistant + match-ranking embeddings) ---
# Separate because Anthropic has no embeddings endpoint; either can be off.
model_provider: str = "null" # null | anthropic
anthropic_api_key: str | None = None
llm_model: str = "claude-opus-4-8"
# Configure as many as you like; each is enabled when its credentials are
# present. `default_*_provider` picks which one is used by default. LLM and
# embeddings are independent (Anthropic has no embeddings endpoint).
default_llm_provider: str = "null" # null | anthropic | openai | xai | ollama
default_embedding_provider: str = "null" # null | openai | ollama
llm_max_tokens: int = 4096
embedding_provider: str = "null" # null | (future: ollama, voyage, …)
embedding_dimensions: int = 1536 # must match the embedding model + pgvector column
# Anthropic (LLM only)
anthropic_api_key: str | None = None
anthropic_model: str = "claude-opus-4-8"
# OpenAI (LLM + embeddings)
openai_api_key: str | None = None
openai_base_url: str = "https://api.openai.com/v1"
openai_model: str = "gpt-4o"
openai_embedding_model: str = "text-embedding-3-small"
# xAI / Grok — OpenAI-compatible (LLM)
xai_api_key: str | None = None
xai_base_url: str = "https://api.x.ai/v1"
xai_model: str = "grok-2-latest" # set to your account's current Grok model
# Ollama — local, OpenAI-compatible, no key (LLM + embeddings)
ollama_enabled: bool = False
ollama_base_url: str = "http://localhost:11434/v1"
ollama_model: str = "llama3.1"
ollama_embedding_model: str = "nomic-embed-text"
@lru_cache
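How the `owner_email` setting turns into an ownership decision can be sketched with plain functions. This follows the normalization in `owner_emails()` and the verified-email requirement described in `is_instance_owner`; the free-function signatures here are illustrative, not the project's API.

```python
from datetime import datetime, timezone
from typing import Optional


def owner_emails(raw: str) -> frozenset:
    """Normalize a comma-separated OWNER_EMAIL value (lowercased, trimmed)."""
    return frozenset(e.strip().lower() for e in raw.split(",") if e.strip())


def is_owner(email: str, verified_at: Optional[datetime], raw_owner_email: str) -> bool:
    """True only when owners are configured, the email is verified, and it matches."""
    owners = owner_emails(raw_owner_email)
    return bool(owners) and verified_at is not None and email.strip().lower() in owners
```

Because the check is derived from settings on each call, an unverified account holding the owner address gets nothing until verification completes.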
+59
@@ -0,0 +1,59 @@
"""Schema-drift detection — a safety net for the deploy pipeline.
If a deploy ships code whose models reference a column a migration hasn't added
yet (the code is ahead of the DB), every query against that table 500s with an
opaque ``UndefinedColumnError``. That is exactly the failure that took the tree
list down once: the backend image advanced but ``alembic upgrade head`` hadn't
run on the server.
The real prevention is auto-migrate on deploy (the entrypoint runs
``alembic upgrade head`` when ``RUN_MIGRATIONS=1``). This module is defense in
depth: it makes the drift *loud and explicit* — a readiness failure and a
CRITICAL startup log — instead of a silent storm of 500s, so a half-applied
deploy is obvious within seconds.
"""
from functools import lru_cache
from pathlib import Path

from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncConnection

# app/core/schema_version.py -> backend/ (parents: core, app, backend)
_MIGRATIONS_DIR = Path(__file__).resolve().parents[2] / "migrations"


@lru_cache
def expected_heads() -> frozenset[str]:
    """Revision head(s) baked into this image's migration scripts. Static for a
    given build, so cache it."""
    from alembic.config import Config
    from alembic.script import ScriptDirectory

    cfg = Config()
    cfg.set_main_option("script_location", str(_MIGRATIONS_DIR))
    return frozenset(ScriptDirectory.from_config(cfg).get_heads())


async def db_heads(conn: AsyncConnection) -> frozenset[str] | None:
    """Revision(s) the database is stamped at, or ``None`` when the DB is not
    Alembic-managed (no ``alembic_version`` table — e.g. a test DB built straight
    from ``create_all``). ``to_regclass`` returns NULL rather than erroring when
    the table is absent, so this never poisons the caller's transaction."""
    if await conn.scalar(text("SELECT to_regclass('public.alembic_version')")) is None:
        return None
    result = await conn.execute(text("SELECT version_num FROM alembic_version"))
    return frozenset(row[0] for row in result)


async def schema_is_current(
    conn: AsyncConnection,
) -> tuple[bool, frozenset[str], frozenset[str]]:
    """``(ok, db, expected)``. ``ok`` is True when the DB is stamped at the
    code's head(s). A DB with no ``alembic_version`` table is treated as current
    (not Alembic-managed → nothing to compare), so this stays quiet in tests."""
    expected = expected_heads()
    current = await db_heads(conn)
    if current is None:
        return True, frozenset(), expected
    return current == expected, current, expected
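The comparison at the heart of the drift check is a set equality with one escape hatch. A database-free sketch of that decision and of the readiness message format; `schema_status` and `drift_message` are hypothetical names, and the revision ids in any call are made up:

```python
def schema_status(db_heads, expected):
    """(ok, db, expected): ok when the stamped heads equal the expected heads.

    db_heads is None when the database has no alembic_version table, which is
    treated as current because there is nothing to compare.
    """
    if db_heads is None:
        return True, frozenset(), expected
    return db_heads == expected, db_heads, expected


def drift_message(db, expected):
    """Human-readable drift summary in the style used by the readiness check."""
    return f"drift: db={sorted(db) or ['none']} expected={sorted(expected)}"
```

An empty-but-present `alembic_version` table is different from a missing one: it yields an empty set, which does not equal the expected heads and so counts as drift.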
@@ -0,0 +1,40 @@
"""OpenAI-compatible providers (one implementation, many vendors).
OpenAI, xAI (api.x.ai/v1), Ollama (…:11434/v1), OpenRouter, Together, vLLM, etc.
all speak the OpenAI Chat Completions / Embeddings API — they differ only by
base URL, key, and model name. So a single class, parameterized by those, plugs
in every one of them via the official `openai` SDK.
"""
from openai import AsyncOpenAI

from app.integrations.models.base import EmbeddingProvider, LLMProvider


class OpenAICompatibleLLMProvider(LLMProvider):
    def __init__(self, *, api_key: str | None, base_url: str, model: str, max_tokens: int = 4096) -> None:
        # Local backends (Ollama) ignore the key but the SDK requires a non-empty one.
        self._client = AsyncOpenAI(api_key=api_key or "not-needed", base_url=base_url)
        self._model = model
        self._max_tokens = max_tokens

    async def complete(self, *, prompt: str, system: str | None = None) -> str:
        messages: list[dict] = []
        if system:
            messages.append({"role": "system", "content": system})
        messages.append({"role": "user", "content": prompt})
        resp = await self._client.chat.completions.create(
            model=self._model, max_tokens=self._max_tokens, messages=messages
        )
        return resp.choices[0].message.content or ""


class OpenAICompatibleEmbeddingProvider(EmbeddingProvider):
    def __init__(self, *, api_key: str | None, base_url: str, model: str, dimensions: int) -> None:
        self._client = AsyncOpenAI(api_key=api_key or "not-needed", base_url=base_url)
        self._model = model
        self.dimensions = dimensions

    async def embed(self, texts: list[str]) -> list[list[float]]:
        resp = await self._client.embeddings.create(model=self._model, input=texts)
        return [d.embedding for d in resp.data]
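The request payload that `complete()` assembles is the same for every OpenAI-compatible vendor, which is what lets one class serve them all. A small sketch of just the message-building step, pulled out as a hypothetical `build_messages` helper so it can be checked without a network call:

```python
from typing import Optional


def build_messages(prompt: str, system: Optional[str] = None) -> list:
    """Chat Completions message list: optional system turn, then the user turn."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages
```

An empty system string is treated the same as no system prompt, matching the truthiness check in the provider.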
+30
@@ -7,6 +7,7 @@ engine is the single enforcement point for reads.
import logging
import sys
from contextlib import asynccontextmanager
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
@@ -14,6 +15,8 @@ from fastapi.responses import JSONResponse
from app.api.health import router as health_router
from app.api.v1 import api_router
from app.core.config import get_settings
from app.core.db import get_engine
from app.core.schema_version import schema_is_current
from app.services.exceptions import Conflict, Forbidden, NotFound
@@ -30,6 +33,32 @@ def _configure_logging() -> None:
app_logger.propagate = False
async def _check_schema_drift() -> None:
    """On startup, shout if the DB schema is behind the code. The entrypoint
    runs migrations when RUN_MIGRATIONS=1; this catches the case where that
    didn't happen, so a half-applied deploy is obvious in the logs instead of a
    silent storm of 500s. Never blocks startup — purely advisory."""
    logger = logging.getLogger("provenance")
    try:
        async with get_engine().connect() as conn:
            ok, db, expected = await schema_is_current(conn)
        if not ok:
            logger.critical(
                "SCHEMA DRIFT: database is at %s but this build expects %s. "
                "Run 'alembic upgrade head' — queries will fail until migrated.",
                sorted(db) or ["none"],
                sorted(expected),
            )
    except Exception as exc:  # noqa: BLE001 — advisory only; never block startup
        logger.warning("schema drift check skipped: %s", exc)


@asynccontextmanager
async def _lifespan(app: FastAPI):
    await _check_schema_drift()
    yield
def _register_error_handlers(app: FastAPI) -> None:
@app.exception_handler(NotFound)
async def _not_found(request: Request, exc: NotFound) -> JSONResponse:
@@ -51,6 +80,7 @@ def create_app() -> FastAPI:
title=settings.app_name,
version=settings.version,
description="Provenance API — family and land provenance.",
lifespan=_lifespan,
)
app.include_router(health_router)
app.include_router(api_router)
+5
@@ -36,6 +36,11 @@ class Tree(Base, UUIDPrimaryKey, Timestamps, SoftDelete):
use_alter=True,
)
)
# Per-tree AI model policy (owner-configured). The names reference configured
# providers from the registry; null = that role has no model. The owner may
# use any configured provider; these limit members + the recommender.
ai_member_provider: Mapped[str | None] = mapped_column(String(32))
ai_recommender_provider: Mapped[str | None] = mapped_column(String(32))
class TreeMembership(Base, UUIDPrimaryKey, Timestamps):
+20
@@ -0,0 +1,20 @@
"""Instance-admin schemas. Operator-facing, owner-only — operational status and
config, never tree data or PII (instance ownership doesn't bypass privacy)."""
from pydantic import BaseModel

from app.schemas.ai_policy import ConfiguredProvider


class InstanceStatus(BaseModel):
    version: str
    env: str
    # Operator account(s) — the email(s) named in OWNER_EMAIL.
    owner_emails: list[str]
    require_email_verification: bool
    # Aggregate, non-identifying counts (live rows only).
    user_count: int
    tree_count: int
    # Instance-wide AI configuration (no secrets).
    default_llm_provider: str
    ai_providers: list[ConfiguredProvider]
+22
@@ -0,0 +1,22 @@
from pydantic import BaseModel


class ConfiguredProvider(BaseModel):
    name: str
    model: str


class TreeAiPolicyRead(BaseModel):
    # The model non-owners' assistant uses (null = none).
    member_provider: str | None
    # The model the association/recommendation engine uses (null = none).
    recommender_provider: str | None
    # Providers the operator has configured (from env). The owner may use any of
    # these; the two settings above restrict members and the recommender to one.
    configured_providers: list[ConfiguredProvider]
    default_provider: str


class TreeAiPolicyUpdate(BaseModel):
    member_provider: str | None = None
    recommender_provider: str | None = None
+6
@@ -9,6 +9,12 @@ class DeceasedCandidate(BaseModel):
birth_year: int
class DeceasedByChildCandidate(BaseModel):
person_id: uuid.UUID
name: str
child_birth_year: int
class DeceasedApply(BaseModel):
person_ids: list[uuid.UUID]
+5
@@ -19,6 +19,11 @@ class TreeUpdate(BaseModel):
home_person_id: uuid.UUID | None = None
class TreePurge(BaseModel):
# Retype the tree's name to confirm a permanent, irreversible delete.
confirm_name: str
class TreeRead(BaseModel):
model_config = ConfigDict(from_attributes=True)
+3
@@ -21,6 +21,9 @@ class UserRead(BaseModel):
email_verified_at: datetime | None
self_person_id: uuid.UUID | None = None
created_at: datetime
# Operational role, not a DB column: true when this account's email is named
# in OWNER_EMAIL and that email is verified. Set by the API layer (see users.read_me).
is_instance_owner: bool = False
class UserSelfPersonUpdate(BaseModel):
+77
@@ -0,0 +1,77 @@
"""Per-tree AI model policy — owner-only. Assigns which configured provider
members and the recommender use; the owner may use any configured provider.
The operator decides which providers exist (env / registry); the tree owner
decides who uses which. See app/api/deps.py for the registry.
"""
import uuid

from sqlalchemy.ext.asyncio import AsyncSession

from app.api.deps import configured_llm_providers
from app.models.enums import MembershipRole
from app.models.tree import Tree
from app.models.user import User
from app.services import privacy
from app.services.exceptions import Forbidden


async def _require_owner(session: AsyncSession, *, actor: User, tree: Tree) -> None:
    role = await privacy.get_membership_role(session, actor.id, tree.id)
    if role is not MembershipRole.owner:
        raise Forbidden("only the tree owner can configure AI")


def _names() -> set[str]:
    return {p["name"] for p in configured_llm_providers()}


async def get_policy(session: AsyncSession, *, actor: User, tree: Tree) -> dict:
    await _require_owner(session, actor=actor, tree=tree)
    from app.core.config import get_settings

    return {
        "member_provider": tree.ai_member_provider,
        "recommender_provider": tree.ai_recommender_provider,
        "configured_providers": configured_llm_providers(),
        "default_provider": get_settings().default_llm_provider,
    }


async def update_policy(
    session: AsyncSession,
    *,
    actor: User,
    tree: Tree,
    member_provider: str | None,
    recommender_provider: str | None,
) -> dict:
    await _require_owner(session, actor=actor, tree=tree)
    valid = _names()
    for value in (member_provider, recommender_provider):
        if value is not None and value not in valid:
            raise Forbidden(f"'{value}' is not a configured provider")
    tree.ai_member_provider = member_provider
    tree.ai_recommender_provider = recommender_provider
    await session.commit()
    await session.refresh(tree)
    return await get_policy(session, actor=actor, tree=tree)


# --- Resolution helpers (for the future assistant / recommender) -------------


def provider_name_for_member(tree: Tree) -> str | None:
    """Provider an ordinary member's assistant should use, if any."""
    return tree.ai_member_provider


def provider_name_for_recommender(tree: Tree) -> str | None:
    return tree.ai_recommender_provider
def provider_name_for_owner(tree: Tree, requested: str | None = None) -> str | None:
"""The owner may use any configured provider: the requested one if it is
configured, otherwise the member provider."""
if requested and requested in _names():
return requested
return tree.ai_member_provider # fall back to the member model
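The owner-resolution helper above reduces to a small fallback rule. The following is a minimal sketch of that rule only, assuming a hypothetical `TreePolicy` dataclass in place of the real `Tree` model and a plain `configured` set in place of `configured_llm_providers()`; neither stand-in is part of this change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreePolicy:
    """Hypothetical stand-in for the two policy columns on Tree."""

    ai_member_provider: Optional[str] = None
    ai_recommender_provider: Optional[str] = None


def resolve_owner_provider(
    policy: TreePolicy, configured: set, requested: Optional[str] = None
) -> Optional[str]:
    # The owner may pick any provider, as long as it is actually configured.
    if requested and requested in configured:
        return requested
    # Otherwise fall back to whatever ordinary members use (possibly None).
    return policy.ai_member_provider
```

Note that an unconfigured `requested` name is silently ignored here, mirroring the helper in the diff; a caller that wants a hard error would have to validate the name first, as `update_policy` does.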
+9
@@ -105,6 +105,15 @@ async def list_citations(
indicators in a single round-trip."""
if not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
raise Forbidden("not permitted to view this tree")
# Non-members get only citations whose cited fact resolves to a full-
# visibility person — a citation on a redacted living person's fact would
# otherwise leak that the person has that sourced fact.
if await privacy.get_membership_role(session, viewer_id, tree.id) is None:
from app.services import public_view_service
return await public_view_service.list_public_citations(
session, viewer_id=viewer_id, tree=tree
)
stmt = (
select(Citation)
.where(Citation.tree_id == tree.id, Citation.deleted_at.is_(None))
+45
@@ -133,6 +133,51 @@ async def apply_deceased(
return len(persons)
# ---- 1b. Mark deceased by a CHILD's birth year -------------------------------------
# For parents whose own birth date is missing (so the birth-year rule can't reach
# them) but who have a child born long ago — they're necessarily deceased. Applies
# through the same apply_deceased() path.
async def preview_deceased_by_child(
session: AsyncSession, *, actor: User, tree: Tree, year: int
) -> list[dict]:
await _require_editor(session, actor=actor, tree=tree)
names = await _primary_name_by_person(session, tree.id)
years = await _birth_year_by_person(session, tree.id)
rels = (
await session.execute(
select(Relationship).where(
Relationship.tree_id == tree.id,
Relationship.deleted_at.is_(None),
Relationship.type == RelationshipType.parent_child,
)
)
).scalars().all()
# parent id -> earliest child birth year, among children born on/before `year`.
earliest_child: dict[uuid.UUID, int] = {}
for r in rels:
cy = years.get(r.person_to_id) # the child's birth year
if cy is None or cy > year:
continue
if r.person_from_id not in earliest_child or cy < earliest_child[r.person_from_id]:
earliest_child[r.person_from_id] = cy
persons = {p.id: p for p in await _persons(session, tree.id)}
out: list[dict] = []
for parent_id, cy in earliest_child.items():
p = persons.get(parent_id)
if p is None or p.is_living is False: # gone or already deceased
continue
out.append(
{
"person_id": str(parent_id),
"name": _display(names.get(parent_id)),
"child_birth_year": cy,
}
)
out.sort(key=lambda r: r["child_birth_year"])
return out
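The core of the preview is the "earliest qualifying child per parent" reduction. A minimal sketch of that step on plain data, assuming string ids and a hypothetical `earliest_qualifying_child` helper (the real function works on `Relationship` rows and UUIDs):

```python
def earliest_qualifying_child(
    parent_child_pairs: list,
    birth_year: dict,
    cutoff: int,
) -> dict:
    """Map each parent to the earliest birth year among children born on/before cutoff."""
    earliest: dict = {}
    for parent, child in parent_child_pairs:
        year = birth_year.get(child)
        # Children with no known birth year, or born after the cutoff, prove nothing.
        if year is None or year > cutoff:
            continue
        if parent not in earliest or year < earliest[parent]:
            earliest[parent] = year
    return earliest
```

Parents who never appear in the result either have no dated children or only children born after the cutoff, which is why the "modern child" parent in the test is not flagged.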
# ---- 2. Re-derive gender from a source GEDCOM (matches by name) ----------------------
async def preview_gender(
+21 -2
@@ -4,9 +4,10 @@ engine. Every event has exactly one subject — a Person or a partnership."""
import uuid
from datetime import date
from sqlalchemy import select
from sqlalchemy import or_, select
from sqlalchemy.ext.asyncio import AsyncSession
from app.models.enums import RelationshipType
from app.models.event import Event
from app.models.person import Person
from app.models.place import Place
@@ -124,12 +125,30 @@ async def list_events_for_person(
return await public_view_service.list_public_person_events(
session, viewer_id=viewer_id, tree=tree, person_id=person_id
)
# Member view: this person's own events PLUS their partnership events (which
# live on the relationship and show on both partners). Returning both here
# means the person page doesn't have to load every event in the tree.
partner_rel_ids = (
select(Relationship.id)
.where(
Relationship.tree_id == tree.id,
Relationship.type == RelationshipType.partnership,
Relationship.deleted_at.is_(None),
or_(
Relationship.person_from_id == person_id,
Relationship.person_to_id == person_id,
),
)
)
stmt = (
select(Event)
.where(
Event.tree_id == tree.id,
Event.person_id == person_id,
Event.deleted_at.is_(None),
or_(
Event.person_id == person_id,
Event.relationship_id.in_(partner_rel_ids),
),
)
.order_by(Event.date_start.nulls_last(), Event.created_at)
)
+86 -5
View File
@@ -45,6 +45,29 @@ async def _attach_primary_name(session: AsyncSession, person: Person) -> None:
person.primary_name = _format_name(name) if name is not None else None
async def _attach_primary_names(session: AsyncSession, persons: list[Person]) -> None:
"""Batch version of ``_attach_primary_name`` — ONE query for the whole list
instead of one per person (the difference between 1 and N queries when
rendering a 2k-person tree). The global order (is_primary desc, sort_order)
matches the single-person query, so the first row seen per person is the same
name ``_attach_primary_name`` would pick."""
if not persons:
return
rows = (
await session.execute(
select(Name)
.where(Name.person_id.in_([p.id for p in persons]), Name.deleted_at.is_(None))
.order_by(Name.is_primary.desc(), Name.sort_order)
)
).scalars().all()
best: dict[uuid.UUID, Name] = {}
for n in rows:
best.setdefault(n.person_id, n)
for p in persons:
n = best.get(p.id)
p.primary_name = _format_name(n) if n is not None else None
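The batch helper relies on a "first row per key wins" pass over globally ordered rows. A minimal sketch of that idea in plain Python, assuming a hypothetical `NameRow` dataclass and an in-memory sort in place of the SQL `ORDER BY is_primary DESC, sort_order`:

```python
from dataclasses import dataclass


@dataclass
class NameRow:
    """Hypothetical stand-in for a Name row."""

    person_id: str
    is_primary: bool
    sort_order: int
    text: str


def pick_primary(rows: list) -> dict:
    # Primary names first (False sorts before True, so negate), then by sort_order.
    ordered = sorted(rows, key=lambda n: (not n.is_primary, n.sort_order))
    best: dict = {}
    for n in ordered:
        # setdefault keeps the first row seen per person and ignores later ones.
        best.setdefault(n.person_id, n)
    return best
```

Because the ordering is global rather than per person, the first row encountered for each `person_id` is the same row a per-person query with the same ordering would return first.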
async def create_person(
session: AsyncSession,
*,
@@ -336,15 +359,18 @@ async def list_deleted_persons(
.order_by(Person.deleted_at.desc())
)
persons = list((await session.execute(stmt)).scalars().all())
for person in persons:
await _attach_primary_name(session, person)
await _attach_primary_names(session, persons)
return persons
async def list_persons(
session: AsyncSession, *, viewer_id: uuid.UUID, tree: Tree
) -> list[Person]:
if not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
# Resolve the viewer's role ONCE. Members see the whole tree (full), so we
# skip the per-person privacy engine entirely and batch the name fetch — the
# difference between ~3 queries and ~3·N queries on a 2k-person tree.
role = await privacy.get_membership_role(session, viewer_id, tree.id)
if role is None and not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
raise Forbidden("not permitted to view this tree")
stmt = (
@@ -354,7 +380,15 @@ async def list_persons(
)
persons = list((await session.execute(stmt)).scalars().all())
if role is not None:
await _attach_primary_names(session, persons)
return persons
# Non-member on a viewable (public/unlisted/site_members) tree: redact per
# person. Names are batched for the non-redacted ones; redacted ones already
# have their display name overwritten by _redact.
visible: list[Person] = []
full: list[Person] = []
for person in persons:
vis = await privacy.person_visibility(
session, user_id=viewer_id, tree=tree, person=person
@@ -364,8 +398,50 @@ async def list_persons(
if vis == Visibility.redacted:
_redact(person)
else:
await _attach_primary_name(session, person)
full.append(person)
visible.append(person)
await _attach_primary_names(session, full)
return visible
async def list_persons_by_ids(
session: AsyncSession, *, viewer_id: uuid.UUID, tree: Tree, ids: list[uuid.UUID]
) -> list[Person]:
"""Only the persons with the given ids (privacy-filtered, names batched). Lets a
page show the names of someone's relatives without loading the whole tree."""
role = await privacy.get_membership_role(session, viewer_id, tree.id)
if role is None and not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
raise Forbidden("not permitted to view this tree")
if not ids:
return []
persons = list(
(
await session.execute(
select(Person).where(
Person.id.in_(ids),
Person.tree_id == tree.id,
Person.deleted_at.is_(None),
)
)
).scalars().all()
)
if role is not None:
await _attach_primary_names(session, persons)
return persons
visible: list[Person] = []
full: list[Person] = []
for person in persons:
vis = await privacy.person_visibility(
session, user_id=viewer_id, tree=tree, person=person
)
if vis == Visibility.hidden:
continue
if vis == Visibility.redacted:
_redact(person)
else:
full.append(person)
visible.append(person)
await _attach_primary_names(session, full)
return visible
@@ -406,7 +482,11 @@ async def search_persons(
.order_by(sub.c.score.desc())
)
persons = list((await session.execute(stmt)).scalars().all())
if await privacy.get_membership_role(session, viewer_id, tree.id) is not None:
await _attach_primary_names(session, persons)
return persons
out: list[Person] = []
full: list[Person] = []
for person in persons:
vis = await privacy.person_visibility(
session, user_id=viewer_id, tree=tree, person=person
@@ -416,6 +496,7 @@ async def search_persons(
if vis == Visibility.redacted:
_redact(person)
else:
await _attach_primary_name(session, person)
full.append(person)
out.append(person)
await _attach_primary_names(session, full)
return out
+100 -2
@@ -12,6 +12,8 @@ person's real name, dates, alternate names, or media. The rules:
living partner's timeline otherwise).
- names : only for FULL-visibility persons.
- media : NOT exposed yet (deferred — see docs/design/tree-visibility.md).
- citations : only when the cited fact resolves to FULL person(s).
- sources : only when they back at least one visible citation.
A tree that isn't viewable raises NotFound (never Forbidden) so the public
surface can't be used to probe whether a private tree exists.
@@ -27,10 +29,15 @@ from app.models.event import Event
from app.models.media import Media
from app.models.person import Name, Person
from app.models.relationship import Relationship
from app.models.source import Citation, Source
from app.models.tree import Tree
from app.services import privacy
from app.services.exceptions import NotFound
from app.services.person_service import _attach_primary_name, _redact
from app.services.person_service import (
_attach_primary_name,
_attach_primary_names,
_redact,
)
from app.services.privacy import Visibility
@@ -75,6 +82,7 @@ async def list_public_persons(
session: AsyncSession, *, viewer_id: uuid.UUID | None, tree: Tree
) -> list[Person]:
out: list[Person] = []
full: list[Person] = []
for p in await _persons(session, tree):
vis = await privacy.person_visibility(session, user_id=viewer_id, tree=tree, person=p)
if vis == Visibility.hidden:
@@ -82,8 +90,9 @@ async def list_public_persons(
if vis == Visibility.redacted:
_redact(p)
else:
await _attach_primary_name(session, p)
full.append(p)
out.append(p)
await _attach_primary_names(session, full) # one query, not one per person
return out
@@ -296,6 +305,95 @@ async def can_view_media(
return vis == Visibility.full
async def _full_person_ids(
session: AsyncSession, *, viewer_id: uuid.UUID | None, tree: Tree
) -> set[uuid.UUID]:
persons = await _persons(session, tree)
vis = await _visibility_map(session, viewer_id=viewer_id, tree=tree, persons=persons)
return {pid for pid, v in vis.items() if v == Visibility.full}
async def list_public_citations(
session: AsyncSession, *, viewer_id: uuid.UUID | None, tree: Tree
) -> list[Citation]:
"""Only citations whose cited fact resolves to FULL-visibility person(s). A
citation on a redacted/hidden person's fact (or a partnership where either
partner isn't full) is dropped — its existence plus page/detail would leak
that the person has that sourced fact. Mirrors the events/names rule (FULL
only)."""
full = await _full_person_ids(session, viewer_id=viewer_id, tree=tree)
async def _by_id(model):
rows = (
await session.execute(
select(model).where(model.tree_id == tree.id, model.deleted_at.is_(None))
)
).scalars().all()
return {r.id: r for r in rows}
names = await _by_id(Name)
rels = await _by_id(Relationship)
events = await _by_id(Event)
def target_is_full(c: Citation) -> bool:
if c.person_id is not None:
return c.person_id in full
if c.name_id is not None:
n = names.get(c.name_id)
return n is not None and n.person_id in full
if c.event_id is not None:
e = events.get(c.event_id)
if e is None:
return False
if e.person_id is not None:
return e.person_id in full
if e.relationship_id is not None:
r = rels.get(e.relationship_id)
return r is not None and r.person_from_id in full and r.person_to_id in full
return False
if c.relationship_id is not None:
r = rels.get(c.relationship_id)
return r is not None and r.person_from_id in full and r.person_to_id in full
return False
citations = (
await session.execute(
select(Citation)
.where(Citation.tree_id == tree.id, Citation.deleted_at.is_(None))
.order_by(Citation.created_at)
)
).scalars().all()
return [c for c in citations if target_is_full(c)]
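The citation rule above resolves each citation to the person(s) it is ultimately about and keeps it only if all of them are fully visible. A minimal sketch on plain dictionaries, assuming hypothetical lookup tables (`names` maps name id to owner, `events` maps event id to its subject, `rels` maps relationship id to a pair of partner ids) in place of the ORM rows:

```python
from typing import Optional


def _both_partners_full(partners: Optional[tuple], full: set) -> bool:
    # A relationship-level fact is visible only if BOTH partners are fully visible.
    return partners is not None and partners[0] in full and partners[1] in full


def citation_is_visible(c: dict, *, full: set, names: dict, events: dict, rels: dict) -> bool:
    if c.get("person_id") is not None:
        return c["person_id"] in full
    if c.get("name_id") is not None:
        owner = names.get(c["name_id"])
        return owner is not None and owner in full
    if c.get("event_id") is not None:
        e = events.get(c["event_id"])
        if e is None:
            return False
        if e.get("person_id") is not None:
            return e["person_id"] in full
        return _both_partners_full(rels.get(e.get("relationship_id")), full)
    if c.get("relationship_id") is not None:
        return _both_partners_full(rels.get(c["relationship_id"]), full)
    # A citation with no resolvable target is dropped rather than shown.
    return False
```

The default is deny: anything that cannot be traced to fully visible persons is treated as not visible, which matches the conservative stance of the diff.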
async def list_public_sources(
session: AsyncSession, *, viewer_id: uuid.UUID | None, tree: Tree
) -> list[Source]:
"""Only sources backing at least one visible citation. A source used solely
for a redacted/hidden person's facts is withheld — its title or notes could
name that living person."""
visible = await list_public_citations(session, viewer_id=viewer_id, tree=tree)
cited = {c.source_id for c in visible}
sources = (
await session.execute(
select(Source)
.where(Source.tree_id == tree.id, Source.deleted_at.is_(None))
.order_by(Source.title)
)
).scalars().all()
return [s for s in sources if s.id in cited]
async def get_public_source(
session: AsyncSession, *, viewer_id: uuid.UUID | None, tree: Tree, source_id: uuid.UUID
) -> Source:
for s in await list_public_sources(session, viewer_id=viewer_id, tree=tree):
if s.id == source_id:
return s
# 404 (not 403): don't reveal that a withheld source exists.
raise NotFound("source not found")
async def list_public_trees(
session: AsyncSession,
*,
+14
@@ -61,6 +61,14 @@ async def create_source(
async def list_sources(session: AsyncSession, *, viewer_id: uuid.UUID, tree: Tree) -> list[Source]:
if not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
raise Forbidden("not permitted to view this tree")
# Non-members see only sources backing a visible citation (see citation
# redaction) — a source used solely for a redacted person could name them.
if await privacy.get_membership_role(session, viewer_id, tree.id) is None:
from app.services import public_view_service
return await public_view_service.list_public_sources(
session, viewer_id=viewer_id, tree=tree
)
stmt = (
select(Source)
.where(Source.tree_id == tree.id, Source.deleted_at.is_(None))
@@ -74,6 +82,12 @@ async def get_source(
) -> Source:
if not await privacy.can_view_tree(session, user_id=viewer_id, tree=tree):
raise Forbidden("not permitted to view this tree")
if await privacy.get_membership_role(session, viewer_id, tree.id) is None:
from app.services import public_view_service
return await public_view_service.get_public_source(
session, viewer_id=viewer_id, tree=tree, source_id=source_id
)
source = (
await session.execute(
select(Source).where(
+48 -2
@@ -5,16 +5,18 @@ authorization basis) and an audit entry. Reads go through the privacy engine.
import uuid
from datetime import UTC, datetime
from sqlalchemy import select
from sqlalchemy import delete, select
from sqlalchemy.ext.asyncio import AsyncSession
from app.integrations.objectstore.base import ObjectStore
from app.models.enums import MembershipRole, TreeVisibility
from app.models.media import Media
from app.models.tree import Tree, TreeMembership
from app.models.user import User
from app.repositories.base import BaseRepository
from app.services import privacy
from app.services.audit import record_audit
from app.services.exceptions import Forbidden, NotFound
from app.services.exceptions import Conflict, Forbidden, NotFound
async def create_tree(
@@ -128,6 +130,50 @@ async def restore_tree(session: AsyncSession, *, actor: User, tree_id: uuid.UUID
return tree
async def purge_tree(
session: AsyncSession,
store: ObjectStore,
*,
actor: User,
tree_id: uuid.UUID,
confirm_name: str,
) -> None:
"""Permanently delete a soft-deleted tree and ALL its data — irreversible.
Owner-only. The tree must already be in the trash (soft-deleted) and the
caller must retype its name. Tree-owned rows are removed by the `tree_id`
ON DELETE CASCADE; we delete the media objects from storage first (the DB
cascade drops the rows but not the bytes). Audit entries survive with their
`tree_id` nulled (ON DELETE SET NULL), so the purge stays in the log."""
tree = await _owned_tree(session, actor=actor, tree_id=tree_id)
if tree.deleted_at is None:
raise Conflict("delete the tree first, then purge it from the trash")
if confirm_name.strip() != (tree.name or "").strip():
raise Forbidden("tree name confirmation does not match")
keys = list(
(
await session.execute(select(Media.storage_key).where(Media.tree_id == tree.id))
).scalars().all()
)
for key in keys:
try:
await store.delete_object(key=key)
except Exception: # noqa: BLE001 — best-effort; a missing object must not block the purge
pass
record_audit(
session,
action="purge",
entity_type="Tree",
entity_id=tree.id,
tree_id=tree.id,
actor_user_id=actor.id,
before={"name": tree.name},
)
await session.execute(delete(Tree).where(Tree.id == tree.id))
await session.commit()
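The two guards at the top of `purge_tree` can be read in isolation from the storage and database work. A minimal sketch of just those checks, assuming locally defined `Conflict` and `Forbidden` exceptions and a hypothetical `check_purge_allowed` helper (the real code raises the service-layer exceptions inline):

```python
from datetime import datetime, timezone
from typing import Optional


class Conflict(Exception):
    """Stand-in for app.services.exceptions.Conflict."""


class Forbidden(Exception):
    """Stand-in for app.services.exceptions.Forbidden."""


def check_purge_allowed(
    deleted_at: Optional[datetime], tree_name: Optional[str], confirm_name: str
) -> None:
    # Purge is a second step: the tree must already be in the trash.
    if deleted_at is None:
        raise Conflict("delete the tree first, then purge it from the trash")
    # The caller must retype the name; surrounding whitespace is ignored.
    if confirm_name.strip() != (tree_name or "").strip():
        raise Forbidden("tree name confirmation does not match")
```

The comparison is case-sensitive, as in the diff, so "paul tree" does not confirm a tree named "Paul Tree".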
async def list_deleted_trees_for_user(session: AsyncSession, *, user: User) -> list[Tree]:
stmt = (
select(Tree)
@@ -0,0 +1,26 @@
"""tree AI model policy (ai_member_provider, ai_recommender_provider)
Revision ID: b2c3d4e5f6a7
Revises: a1b2c3d4e5f6
Create Date: 2026-06-09
"""
from collections.abc import Sequence
import sqlalchemy as sa
from alembic import op
revision: str = "b2c3d4e5f6a7"
down_revision: str | None = "a1b2c3d4e5f6"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None
def upgrade() -> None:
op.add_column("trees", sa.Column("ai_member_provider", sa.String(length=32), nullable=True))
op.add_column("trees", sa.Column("ai_recommender_provider", sa.String(length=32), nullable=True))
def downgrade() -> None:
op.drop_column("trees", "ai_recommender_provider")
op.drop_column("trees", "ai_member_provider")
+1
@@ -15,6 +15,7 @@ dependencies = [
"boto3>=1.35",
"python-multipart>=0.0.12",
"anthropic>=0.108.0",
"openai>=2.41.0",
]
[dependency-groups]
+62
@@ -0,0 +1,62 @@
"""Per-tree AI model policy: owner-only, validated against configured providers."""
from app.core.config import get_settings
from tests.conftest import auth, register
async def test_ai_policy_is_owner_only(client):
owner = auth(await register(client, "ai-o@ex.com"))
editor = auth(await register(client, "ai-x@ex.com"))
tid = (await client.post("/api/v1/trees", json={"name": "T"}, headers=owner)).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/members", json={"email": "ai-x@ex.com", "role": "editor"}, headers=owner
)
g = await client.get(f"/api/v1/trees/{tid}/ai", headers=owner)
assert g.status_code == 200
assert g.json()["member_provider"] is None
assert g.json()["configured_providers"] == [] # nothing configured in tests
# An editor (not owner) can neither view nor change the policy.
assert (await client.get(f"/api/v1/trees/{tid}/ai", headers=editor)).status_code == 403
assert (
await client.patch(
f"/api/v1/trees/{tid}/ai",
json={"member_provider": None, "recommender_provider": None},
headers=editor,
)
).status_code == 403
async def test_ai_policy_set_and_validate(client, monkeypatch):
monkeypatch.setattr(get_settings(), "anthropic_api_key", "sk-ant-test")
owner = auth(await register(client, "ai-set@ex.com"))
tid = (await client.post("/api/v1/trees", json={"name": "T"}, headers=owner)).json()["id"]
g = (await client.get(f"/api/v1/trees/{tid}/ai", headers=owner)).json()
assert {p["name"] for p in g["configured_providers"]} == {"anthropic"}
# Assign the member + recommender model.
p = await client.patch(
f"/api/v1/trees/{tid}/ai",
json={"member_provider": "anthropic", "recommender_provider": "anthropic"},
headers=owner,
)
assert p.status_code == 200 and p.json()["member_provider"] == "anthropic"
# A provider that isn't configured is rejected.
assert (
await client.patch(
f"/api/v1/trees/{tid}/ai",
json={"member_provider": "openai", "recommender_provider": None},
headers=owner,
)
).status_code == 403
# Clearing is allowed.
c = await client.patch(
f"/api/v1/trees/{tid}/ai",
json={"member_provider": None, "recommender_provider": None},
headers=owner,
)
assert c.status_code == 200 and c.json()["member_provider"] is None
@@ -106,6 +106,142 @@ async def test_authed_nonmember_does_not_see_living_pii(client):
).status_code == 200
async def _setup_sources(client):
owner = auth(await register(client, "anmcs-owner@ex.com"))
tid = (
await client.post(
"/api/v1/trees", json={"name": "PubCS", "visibility": "public"}, headers=owner
)
).json()["id"]
old = (
await client.post(
f"/api/v1/trees/{tid}/persons",
json={"given": "Oldcs", "surname": "Gonecs", "is_living": False},
headers=owner,
)
).json()["id"]
young = (
await client.post(
f"/api/v1/trees/{tid}/persons",
json={"given": "Youngcs", "surname": "Csleaksurname", "is_living": True},
headers=owner,
)
).json()["id"]
for pid, year in ((old, "1851"), (young, "2004")):
await client.post(
f"/api/v1/trees/{tid}/events",
json={"event_type": "birth", "person_id": pid, "date_value": year},
headers=owner,
)
s_old = (
await client.post(
f"/api/v1/trees/{tid}/sources", json={"title": "Oldsource record"}, headers=owner
)
).json()["id"]
s_young = (
await client.post(
f"/api/v1/trees/{tid}/sources",
json={"title": "Youngsource Csleaktitle"}, # title names the living person
headers=owner,
)
).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/citations",
json={"source_id": s_old, "person_id": old, "page": "p.1"},
headers=owner,
)
await client.post(
f"/api/v1/trees/{tid}/citations",
json={"source_id": s_young, "person_id": young, "page": "p.2"},
headers=owner,
)
return owner, tid, old, young, s_old, s_young
async def test_authed_nonmember_citation_source_redaction(client):
"""A non-member must not see citations on a redacted living person's facts,
nor sources used only for them."""
owner, tid, old, young, s_old, s_young = await _setup_sources(client)
stranger = auth(await register(client, "anmcs-stranger@ex.com"))
cites = (await client.get(f"/api/v1/trees/{tid}/citations", headers=stranger)).json()
cited = {c.get("person_id") for c in cites}
assert old in cited
assert young not in cited # living person's citation dropped
srcs = (await client.get(f"/api/v1/trees/{tid}/sources", headers=stranger))
src_ids = {s["id"] for s in srcs.json()}
assert s_old in src_ids
assert s_young not in src_ids # source used only for the living person withheld
assert "Csleaktitle" not in srcs.text # its title (which names them) must not leak
# The withheld source 404s — don't reveal it exists; the visible one is fine.
assert (
await client.get(f"/api/v1/trees/{tid}/sources/{s_young}", headers=stranger)
).status_code == 404
assert (
await client.get(f"/api/v1/trees/{tid}/sources/{s_old}", headers=stranger)
).status_code == 200
# Members still see everything.
mc = {c.get("person_id") for c in (await client.get(f"/api/v1/trees/{tid}/citations", headers=owner)).json()}
assert {old, young} <= mc
ms = {s["id"] for s in (await client.get(f"/api/v1/trees/{tid}/sources", headers=owner)).json()}
assert {s_old, s_young} <= ms
async def test_citation_redaction_via_indirect_targets(client):
"""Citations targeting a living person *indirectly* (via their event or name,
not person_id) must also be dropped for non-members."""
owner = auth(await register(client, "anmind-owner@ex.com"))
tid = (
await client.post(
"/api/v1/trees", json={"name": "PubInd", "visibility": "public"}, headers=owner
)
).json()["id"]
young = (
await client.post(
f"/api/v1/trees/{tid}/persons",
json={"given": "Youngind", "surname": "Indsurname", "is_living": True},
headers=owner,
)
).json()["id"]
ev = (
await client.post(
f"/api/v1/trees/{tid}/events",
json={"event_type": "birth", "person_id": young, "date_value": "2005"},
headers=owner,
)
).json()["id"]
nm = (
await client.post(
f"/api/v1/trees/{tid}/persons/{young}/names",
json={"name_type": "alias", "given": "Indalias"},
headers=owner,
)
).json()["id"]
s_ev = (await client.post(f"/api/v1/trees/{tid}/sources", json={"title": "EvSrc"}, headers=owner)).json()["id"]
s_nm = (await client.post(f"/api/v1/trees/{tid}/sources", json={"title": "NmSrc"}, headers=owner)).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/citations", json={"source_id": s_ev, "event_id": ev}, headers=owner
)
await client.post(
f"/api/v1/trees/{tid}/citations", json={"source_id": s_nm, "name_id": nm}, headers=owner
)
stranger = auth(await register(client, "anmind-stranger@ex.com"))
cites = (await client.get(f"/api/v1/trees/{tid}/citations", headers=stranger)).json()
# Neither the event-citation nor the name-citation may surface.
assert not any(c.get("event_id") == ev for c in cites)
assert not any(c.get("name_id") == nm for c in cites)
src_ids = {s["id"] for s in (await client.get(f"/api/v1/trees/{tid}/sources", headers=stranger)).json()}
assert s_ev not in src_ids and s_nm not in src_ids
# Owner (member) sees both citations and both sources.
mc = (await client.get(f"/api/v1/trees/{tid}/citations", headers=owner)).json()
assert any(c.get("event_id") == ev for c in mc) and any(c.get("name_id") == nm for c in mc)
async def test_member_still_sees_everything(client):
owner, tid, old, young, om, ym = await _setup(client)
+47
@@ -51,6 +51,53 @@ async def test_deceased_preview_and_apply(client):
assert old not in [r["person_id"] for r in prev2]
async def test_deceased_by_child_preview_and_apply(client):
h, tid = await _tree(client, "cl-decchild@example.com")
# Parent with NO birth date (the gap the birth-year rule can't reach).
parent = await _person(client, h, tid, "Gesche", "Frerking")
child = await _person(client, h, tid, "Kindt", "Frerking")
await _birth(client, h, tid, child, 1880) # child born before the cutoff
await client.post(
f"/api/v1/trees/{tid}/relationships",
json={"type": "parent_child", "person_from_id": parent, "person_to_id": child},
headers=h,
)
# A parent of a modern child must NOT be flagged.
p_modern = await _person(client, h, tid, "Modern", "Parent")
c_modern = await _person(client, h, tid, "Kid", "Parent")
await _birth(client, h, tid, c_modern, 1990)
await client.post(
f"/api/v1/trees/{tid}/relationships",
json={"type": "parent_child", "person_from_id": p_modern, "person_to_id": c_modern},
headers=h,
)
prev = (
await client.get(
f"/api/v1/trees/{tid}/cleanup/deceased-by-child?born_on_or_before=1900", headers=h
)
).json()
ids = [r["person_id"] for r in prev]
assert parent in ids and p_modern not in ids
assert next(r for r in prev if r["person_id"] == parent)["child_birth_year"] == 1880
# Apply through the shared deceased endpoint.
r = await client.post(
f"/api/v1/trees/{tid}/cleanup/deceased", json={"person_ids": [parent]}, headers=h
)
assert r.status_code == 200 and r.json()["updated"] == 1
assert (
await client.get(f"/api/v1/trees/{tid}/persons/{parent}", headers=h)
).json()["is_living"] is False
# Re-preview drops the now-deceased parent.
prev2 = (
await client.get(
f"/api/v1/trees/{tid}/cleanup/deceased-by-child?born_on_or_before=1900", headers=h
)
).json()
assert parent not in [r["person_id"] for r in prev2]
async def test_gender_from_spouse_preview_and_apply(client):
h, tid = await _tree(client, "cl-spouse@example.com")
husband = (
+72
@@ -0,0 +1,72 @@
"""Instance owner (OWNER_EMAIL): the operator account + the owner-only /admin
surface. Ownership is derived from the env at request time (no DB column) and
requires a *verified* email so the owner address can't be land-grabbed by
whoever registers it first."""
from datetime import datetime, timezone
from sqlalchemy import text
from app.api.deps import is_instance_owner
from app.core.config import get_settings
from app.models.user import User
from tests.conftest import auth, register
VERIFIED = datetime(2020, 1, 1, tzinfo=timezone.utc)
def test_is_instance_owner_matches_case_insensitively(monkeypatch):
monkeypatch.setattr(get_settings(), "owner_email", "Owner@Example.com, second@ex.com")
assert is_instance_owner(User(email="owner@example.com", email_verified_at=VERIFIED)) is True
assert is_instance_owner(User(email="SECOND@ex.com", email_verified_at=VERIFIED)) is True
assert is_instance_owner(User(email="nope@ex.com", email_verified_at=VERIFIED)) is False
def test_unverified_owner_email_is_not_owner(monkeypatch):
"""The land-grab guard: a matching email with no verification is NOT owner."""
monkeypatch.setattr(get_settings(), "owner_email", "boss@ex.com")
assert is_instance_owner(User(email="boss@ex.com", email_verified_at=None)) is False
assert is_instance_owner(User(email="boss@ex.com", email_verified_at=VERIFIED)) is True
def test_no_owner_when_unset(monkeypatch):
monkeypatch.setattr(get_settings(), "owner_email", "")
# An empty OWNER_EMAIL designates no owner, and must never match a user whose
# email is itself empty (both normalize to the empty string).
assert is_instance_owner(User(email="anyone@ex.com", email_verified_at=VERIFIED)) is False
assert is_instance_owner(User(email="", email_verified_at=VERIFIED)) is False
monkeypatch.setattr(get_settings(), "owner_email", " , ")
assert is_instance_owner(User(email="", email_verified_at=VERIFIED)) is False
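The behavior these tests pin down (comma-separated list, case-insensitive match, verified email required, empty entries ignored) can be sketched as follows. This is an illustration only: the real `is_instance_owner` in `app/api/deps.py` is not shown in this diff, and `parse_owner_emails` and `looks_like_owner` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_owner_emails(raw: str) -> set:
    # Split on commas, trim whitespace, lowercase, and drop empty entries
    # so that "" or " , " designates no owner at all.
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def looks_like_owner(email: str, verified_at: Optional[datetime], raw_owner_email: str) -> bool:
    # Land-grab guard: an unverified address is never the owner.
    if verified_at is None:
        return False
    normalized = email.strip().lower()
    # An empty user email must not match, even against an empty setting.
    return bool(normalized) and normalized in parse_owner_emails(raw_owner_email)
```

Dropping empty entries during parsing is what keeps an unset `OWNER_EMAIL` from matching a user whose email is also empty.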
async def _verify(db_session, email: str) -> None:
await db_session.execute(
text("UPDATE users SET email_verified_at = now() WHERE email = :e"), {"e": email}
)
await db_session.commit()
async def test_me_reports_instance_owner(client, db_session, monkeypatch):
monkeypatch.setattr(get_settings(), "owner_email", "boss@ex.com")
boss = auth(await register(client, "boss@ex.com"))
other = auth(await register(client, "peon@ex.com"))
await _verify(db_session, "boss@ex.com")
assert (await client.get("/api/v1/users/me", headers=boss)).json()["is_instance_owner"] is True
assert (await client.get("/api/v1/users/me", headers=other)).json()["is_instance_owner"] is False
async def test_admin_instance_is_owner_only(client, db_session, monkeypatch):
monkeypatch.setattr(get_settings(), "owner_email", "boss@ex.com")
boss = auth(await register(client, "boss@ex.com"))
other = auth(await register(client, "peon@ex.com"))
await _verify(db_session, "boss@ex.com")
assert (await client.get("/api/v1/admin/instance")).status_code == 401 # anon
assert (await client.get("/api/v1/admin/instance", headers=other)).status_code == 403 # non-owner
r = await client.get("/api/v1/admin/instance", headers=boss)
assert r.status_code == 200
body = r.json()
assert body["owner_emails"] == ["boss@ex.com"]
assert body["user_count"] >= 2
assert "ai_providers" in body and "default_llm_provider" in body
+63 -22
@@ -1,43 +1,84 @@
"""Model-provider selection + the null-provider fail-loud behavior.
No network: we only assert which provider the factory returns and that the null
providers raise a clear error. (Live LLM/embedding calls aren't unit-tested.)
"""Model-provider registry: configure several vendors at once, select by name,
default selection, and the null fail-loud behavior. No network: we only assert
which provider the factory returns and that null providers raise.
"""
import pytest
from app.api.deps import get_embedding_provider, get_llm_provider
from app.api.deps import (
build_embedding_providers,
build_llm_providers,
get_embedding_provider,
get_llm_provider,
)
from app.core.config import get_settings
from app.integrations.models.anthropic_provider import AnthropicLLMProvider
from app.integrations.models.base import ModelProviderNotConfigured
from app.integrations.models.null import NullEmbeddingProvider, NullLLMProvider
from app.integrations.models.openai_compat import (
OpenAICompatibleEmbeddingProvider,
OpenAICompatibleLLMProvider,
)
async def test_default_llm_is_null_and_fails_loud(monkeypatch):
monkeypatch.setattr(get_settings(), "model_provider", "null")
def _reset(monkeypatch):
s = get_settings()
for attr, val in {
"default_llm_provider": "null",
"default_embedding_provider": "null",
"anthropic_api_key": None,
"openai_api_key": None,
"xai_api_key": None,
"ollama_enabled": False,
}.items():
monkeypatch.setattr(s, attr, val)
return s
async def test_default_is_null_and_fails_loud(monkeypatch):
_reset(monkeypatch)
provider = get_llm_provider()
assert isinstance(provider, NullLLMProvider)
with pytest.raises(ModelProviderNotConfigured):
await provider.complete(prompt="hello")
assert isinstance(get_embedding_provider(), NullEmbeddingProvider)
async def test_anthropic_selected_when_configured(monkeypatch):
s = get_settings()
monkeypatch.setattr(s, "model_provider", "anthropic")
monkeypatch.setattr(s, "anthropic_api_key", "sk-ant-test-key")
monkeypatch.setattr(s, "llm_model", "claude-opus-4-8")
assert isinstance(get_llm_provider(), AnthropicLLMProvider) # no network call
async def test_multiple_llm_providers_at_once(monkeypatch):
s = _reset(monkeypatch)
monkeypatch.setattr(s, "anthropic_api_key", "sk-ant-x")
monkeypatch.setattr(s, "openai_api_key", "sk-openai-x")
monkeypatch.setattr(s, "xai_api_key", "xai-x")
monkeypatch.setattr(s, "ollama_enabled", True)
monkeypatch.setattr(s, "default_llm_provider", "anthropic")
registry = build_llm_providers()
assert set(registry) == {"anthropic", "openai", "xai", "ollama"}
# Select any by name.
assert isinstance(get_llm_provider("anthropic"), AnthropicLLMProvider)
assert isinstance(get_llm_provider("openai"), OpenAICompatibleLLMProvider)
assert isinstance(get_llm_provider("xai"), OpenAICompatibleLLMProvider)
assert isinstance(get_llm_provider("ollama"), OpenAICompatibleLLMProvider)
# Default resolves to the configured default.
assert isinstance(get_llm_provider(), AnthropicLLMProvider)
# Unknown name → null.
assert isinstance(get_llm_provider("nope"), NullLLMProvider)
async def test_anthropic_without_key_falls_back_to_null(monkeypatch):
s = get_settings()
monkeypatch.setattr(s, "model_provider", "anthropic")
monkeypatch.setattr(s, "anthropic_api_key", None)
async def test_provider_disabled_without_credentials(monkeypatch):
s = _reset(monkeypatch)
monkeypatch.setattr(s, "default_llm_provider", "openai") # default names openai…
# …but no openai key → registry empty → null fallback.
assert build_llm_providers() == {}
assert isinstance(get_llm_provider(), NullLLMProvider)
async def test_embedding_default_is_null_and_fails_loud():
provider = get_embedding_provider()
assert isinstance(provider, NullEmbeddingProvider)
with pytest.raises(ModelProviderNotConfigured):
await provider.embed(["text"])
async def test_embedding_providers(monkeypatch):
s = _reset(monkeypatch)
monkeypatch.setattr(s, "openai_api_key", "sk-openai-x")
monkeypatch.setattr(s, "ollama_enabled", True)
monkeypatch.setattr(s, "default_embedding_provider", "openai")
registry = build_embedding_providers()
assert set(registry) == {"openai", "ollama"}
assert isinstance(get_embedding_provider(), OpenAICompatibleEmbeddingProvider)
assert isinstance(get_embedding_provider("ollama"), OpenAICompatibleEmbeddingProvider)
@@ -0,0 +1,39 @@
"""Regression guard: list_persons must batch — a constant number of queries,
not one (or three) per person. A 2k-person tree took ~4s before this was fixed."""
import sqlalchemy as sa
from tests.conftest import auth, register
async def test_list_persons_does_not_n_plus_one(client, engine):
owner = auth(await register(client, "perf-owner@ex.com"))
tid = (await client.post("/api/v1/trees", json={"name": "Perf"}, headers=owner)).json()["id"]
n = 25
for i in range(n):
await client.post(
f"/api/v1/trees/{tid}/persons",
json={"given": f"P{i}", "surname": "X"},
headers=owner,
)
selects = 0
def _count(conn, cursor, statement, params, context, executemany):
nonlocal selects
if statement.lstrip().upper().startswith("SELECT"):
selects += 1
sa.event.listen(engine.sync_engine, "before_cursor_execute", _count)
try:
resp = await client.get(f"/api/v1/trees/{tid}/persons", headers=owner)
finally:
sa.event.remove(engine.sync_engine, "before_cursor_execute", _count)
assert resp.status_code == 200
body = resp.json()
assert len(body) == n
assert all(p["primary_name"] for p in body) # names still resolve correctly
# Batched: a small constant (auth, role, persons, one names query, …) — NOT
# proportional to n. The old per-person path was ~3·n SELECTs.
assert 0 < selects < n, f"expected a constant query count, got {selects} for {n} people"
+60
@@ -0,0 +1,60 @@
"""Backing the trimmed person-page fetch: batch persons by id (for relative-name
display) and partnership events on the per-person events endpoint (so the page
doesn't load every event in the tree)."""
from tests.conftest import auth, register
async def _tree(client, h):
return (await client.post("/api/v1/trees", json={"name": "T"}, headers=h)).json()["id"]
async def test_list_persons_by_ids(client):
h = auth(await register(client, "ids@ex.com"))
tid = await _tree(client, h)
a = (await client.post(f"/api/v1/trees/{tid}/persons", json={"given": "Aaa"}, headers=h)).json()["id"]
b = (await client.post(f"/api/v1/trees/{tid}/persons", json={"given": "Bbb"}, headers=h)).json()["id"]
c = (await client.post(f"/api/v1/trees/{tid}/persons", json={"given": "Ccc"}, headers=h)).json()["id"]
r = await client.get(f"/api/v1/trees/{tid}/persons", params={"ids": f"{a},{c}"}, headers=h)
assert r.status_code == 200
assert {p["id"] for p in r.json()} == {a, c} # only the requested, not b
assert all(p["primary_name"] for p in r.json()) # names resolved
assert (
await client.get(f"/api/v1/trees/{tid}/persons", params={"ids": "nope"}, headers=h)
).status_code == 422
assert (
await client.get(f"/api/v1/trees/{tid}/persons", params={"ids": ""}, headers=h)
).json() == []
async def test_person_events_include_partnership(client):
h = auth(await register(client, "pev@ex.com"))
tid = await _tree(client, h)
p1 = (await client.post(f"/api/v1/trees/{tid}/persons", json={"given": "P1"}, headers=h)).json()["id"]
p2 = (await client.post(f"/api/v1/trees/{tid}/persons", json={"given": "P2"}, headers=h)).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/events",
json={"event_type": "birth", "person_id": p1, "date_value": "1900"},
headers=h,
)
rel = (
await client.post(
f"/api/v1/trees/{tid}/relationships",
json={"type": "partnership", "person_from_id": p1, "person_to_id": p2},
headers=h,
)
).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/events",
json={"event_type": "marriage", "relationship_id": rel, "date_value": "1925"},
headers=h,
)
# P1's events: own birth + the partnership marriage, in one call.
e1 = {e["event_type"] for e in (await client.get(f"/api/v1/trees/{tid}/persons/{p1}/events", headers=h)).json()}
assert {"birth", "marriage"} <= e1
# The marriage shows on BOTH partners' pages.
e2 = {e["event_type"] for e in (await client.get(f"/api/v1/trees/{tid}/persons/{p2}/events", headers=h)).json()}
assert "marriage" in e2
+42
@@ -0,0 +1,42 @@
"""Schema-drift guard: the DB-vs-code head check behind /health/ready and the
startup log. Regression cover for the outage where the backend image shipped
ahead of an un-applied migration and every trees query 500'd."""
from sqlalchemy import text
from app.core.schema_version import db_heads, expected_heads, schema_is_current
def test_expected_heads_is_a_single_known_head():
heads = expected_heads()
# Linear migration history → exactly one head, and it's a real revision id.
assert len(heads) == 1
assert all(h and isinstance(h, str) for h in heads)
async def test_schema_is_current_detects_drift(db_session):
conn = await db_session.connection()
# The test DB is built from create_all (no alembic_version table), so it is
# not Alembic-managed and the check stays quiet — treated as current.
await conn.execute(text("DROP TABLE IF EXISTS alembic_version"))
assert await db_heads(conn) is None
ok, _, _ = await schema_is_current(conn)
assert ok is True
# Stamp an old/wrong revision → drift detected.
await conn.execute(text("CREATE TABLE alembic_version (version_num varchar(32) NOT NULL)"))
await conn.execute(text("INSERT INTO alembic_version (version_num) VALUES ('0000deadbeef')"))
ok, db, expected = await schema_is_current(conn)
assert ok is False
assert db == frozenset({"0000deadbeef"})
# Stamp the code's real head → current again.
head = next(iter(expected))
await conn.execute(text("DELETE FROM alembic_version"))
await conn.execute(text("INSERT INTO alembic_version (version_num) VALUES (:h)"), {"h": head})
ok, _, _ = await schema_is_current(conn)
assert ok is True
# Leave no alembic_version behind for other tests.
await conn.execute(text("DROP TABLE IF EXISTS alembic_version"))
+78
@@ -0,0 +1,78 @@
"""On-demand purge of a soft-deleted tree: permanent, owner-only, name-confirmed,
and cascades to all tree data."""
import uuid
from sqlalchemy import func, select
from app.models.person import Person
from app.models.tree import Tree
from tests.conftest import auth, register
async def _tree_with_person(client, owner):
tid = (await client.post("/api/v1/trees", json={"name": "Purge Me"}, headers=owner)).json()["id"]
await client.post(
f"/api/v1/trees/{tid}/persons", json={"given": "Doomed", "surname": "Soul"}, headers=owner
)
return tid
async def test_purge_requires_soft_delete_first(client):
owner = auth(await register(client, "purge-a@ex.com"))
tid = await _tree_with_person(client, owner)
# A live tree can't be purged — it must be trashed first.
r = await client.post(
f"/api/v1/trees/{tid}/purge", json={"confirm_name": "Purge Me"}, headers=owner
)
assert r.status_code == 409
async def test_purge_name_must_match(client):
owner = auth(await register(client, "purge-b@ex.com"))
tid = await _tree_with_person(client, owner)
await client.delete(f"/api/v1/trees/{tid}", headers=owner) # soft-delete
r = await client.post(
f"/api/v1/trees/{tid}/purge", json={"confirm_name": "WRONG"}, headers=owner
)
assert r.status_code == 403
# Still in the trash — nothing destroyed.
deleted = (await client.get("/api/v1/trees", params={"deleted": True}, headers=owner)).json()
assert any(t["id"] == tid for t in deleted)
async def test_purge_owner_only(client):
owner = auth(await register(client, "purge-c@ex.com"))
other = auth(await register(client, "purge-c2@ex.com"))
tid = await _tree_with_person(client, owner)
await client.delete(f"/api/v1/trees/{tid}", headers=owner)
r = await client.post(
f"/api/v1/trees/{tid}/purge", json={"confirm_name": "Purge Me"}, headers=other
)
assert r.status_code in (403, 404)
async def test_purge_removes_tree_and_cascades(client, db_session):
owner = auth(await register(client, "purge-d@ex.com"))
tid = await _tree_with_person(client, owner)
await client.delete(f"/api/v1/trees/{tid}", headers=owner)
r = await client.post(
f"/api/v1/trees/{tid}/purge", json={"confirm_name": "Purge Me"}, headers=owner
)
assert r.status_code == 204
# Gone from the trash...
deleted = (await client.get("/api/v1/trees", params={"deleted": True}, headers=owner)).json()
assert not any(t["id"] == tid for t in deleted)
# ...and cascaded: no tree row, no person rows.
tuuid = uuid.UUID(tid)
assert (
await db_session.execute(select(func.count()).select_from(Tree).where(Tree.id == tuuid))
).scalar() == 0
assert (
await db_session.execute(
select(func.count()).select_from(Person).where(Person.tree_id == tuuid)
)
).scalar() == 0
+33
@@ -549,6 +549,25 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/70/bc/6f1c2f612465f5fa89b95bead1f44dcb607670fd42891d8fdcd5d039f4f4/markupsafe-3.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:32001d6a8fc98c8cb5c947787c5d08b0a50663d139f1305bac5885d98d9b40fa", size = 14146, upload-time = "2025-09-27T18:37:28.327Z" },
]
[[package]]
name = "openai"
version = "2.41.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
{ name = "distro" },
{ name = "httpx" },
{ name = "jiter" },
{ name = "pydantic" },
{ name = "sniffio" },
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/3c/a6/5815fe2e2aca74b36c650d1bd43b69827cee568073d0d2d9b6fc5aaac80c/openai-2.41.0.tar.gz", hash = "sha256:db5c362acd6604b84f076abbefa66826ea4b46ecba2954ed866e6a149a1352c0", size = 783525, upload-time = "2026-06-03T22:39:40.719Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/be/51/d82bb424e8aa372190c5233253a2ceb399a778747d18b42cff487411e663/openai-2.41.0-py3-none-any.whl", hash = "sha256:20cc7952e8501c7e5773dd2ef7be437bae9cb549044902e1041a83a54516e375", size = 1353378, upload-time = "2026-06-03T22:39:38.964Z" },
]
[[package]]
name = "packaging"
version = "26.2"
@@ -578,6 +597,7 @@ dependencies = [
{ name = "asyncpg" },
{ name = "boto3" },
{ name = "fastapi" },
{ name = "openai" },
{ name = "pydantic" },
{ name = "pydantic-settings" },
{ name = "python-multipart" },
@@ -601,6 +621,7 @@ requires-dist = [
{ name = "asyncpg", specifier = ">=0.30" },
{ name = "boto3", specifier = ">=1.35" },
{ name = "fastapi", specifier = ">=0.115" },
{ name = "openai", specifier = ">=2.41.0" },
{ name = "pydantic", specifier = ">=2.9" },
{ name = "pydantic-settings", specifier = ">=2.5" },
{ name = "python-multipart", specifier = ">=0.0.12" },
@@ -919,6 +940,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/1c/54/196d0c1db10af76baa4f64894448505d60d3cdf70ef92cbb35f46a4e4c71/starlette-1.2.1-py3-none-any.whl", hash = "sha256:4de0082d08c8f6764a85a54cf1120d6939507a19905c7768acad2a9f875d2b89", size = 73350, upload-time = "2026-05-31T01:07:50.09Z" },
]
[[package]]
name = "tqdm"
version = "4.68.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "colorama", marker = "sys_platform == 'win32'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/85/05/0d5260f1f1ca784f4a4a0def9cbe6affe587f5b4025328d446c3d67765f4/tqdm-4.68.2.tar.gz", hash = "sha256:89c230e8dbc67c7615c142487111222f878c77427ea09549960f62389e258add", size = 171923, upload-time = "2026-06-09T13:26:42.539Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/eb/75/1a0392bcc21c44dcdf87b3cf2d137e7829be2c083a1e38d44efca3d57a16/tqdm-4.68.2-py3-none-any.whl", hash = "sha256:d4240441fb5353290b87d6a85968c9decc131a99b8c7faa28269d829de669ede", size = 78578, upload-time = "2026-06-09T13:26:40.731Z" },
]
[[package]]
name = "typing-extensions"
version = "4.15.0"
+53 -13
@@ -4,6 +4,18 @@
# --- Core ---
APP_ENV=development
# Instance owner / operator. The account(s) whose email is named here get
# instance-admin rights (the owner-only /admin surface, instance-wide settings).
# Comma-separated for several owners. Leave empty for an instance with no
# designated operator. Derived at request time — no migration, takes effect on
# restart. Set this to YOUR account email on a real deployment.
#
# The named account must have a VERIFIED email to be recognized as owner — this
# stops someone from claiming the owner address by registering it before you do.
# Register this email and verify it (via SMTP, or the link the console mailer
# prints to the backend logs) — ideally before exposing registration publicly.
OWNER_EMAIL=
# --- Images (pulled from git.jpaul.io; CI pushes to the LAN registry) ---
# test-main = current main build; or pin a semver / test-sha-<sha> for rollback.
IMAGE_TAG=test-main
@@ -23,6 +35,8 @@ S3_BUCKET=provenance
S3_ACCESS_KEY=provenance
S3_SECRET_KEY=change-me-too
S3_REGION=us-east-1
# Presigned media URL lifetime in seconds.
S3_PRESIGN_TTL=3600
# --- Edge (Caddy) ---
# Local: ':80' (http://localhost). Production: 'provenance.example.com' for auto-HTTPS.
@@ -40,6 +54,8 @@ COMPOSE_PROFILES=
# --- Auth / sessions ---
SESSION_TTL_DAYS=30
TOKEN_TTL_HOURS=24
# Name of the session cookie.
COOKIE_NAME=provenance_session
# Set false for local http; true (default) behind TLS.
COOKIE_SECURE=false
# Base URL used to build links in outbound email.
@@ -50,23 +66,47 @@ MAILER=console
# until SMTP works and existing accounts are verified, or you will lock users out.
REQUIRE_EMAIL_VERIFICATION=false
# --- Email (SMTP) — wired in a later phase ---
# --- Email (SMTP) ---
# Active when MAILER=smtp (above) and SMTP_HOST is set.
SMTP_HOST=
SMTP_PORT=587
SMTP_USERNAME=
SMTP_PASSWORD=
SMTP_FROM=
# --- Model providers (AI assistant + embeddings; both optional, default off) ---
# LLM: 'null' disables AI features; 'anthropic' uses the Claude API.
MODEL_PROVIDER=null
ANTHROPIC_API_KEY=
LLM_MODEL=claude-opus-4-8
LLM_MAX_TOKENS=4096
# Embeddings are separate (Anthropic has no embeddings endpoint). 'null' for now.
EMBEDDING_PROVIDER=null
# --- Worker (soft-delete purge) ---
# How often the purge job runs, and how old a soft-deleted row must be before it
# is permanently removed (and its media objects cleaned up).
PURGE_INTERVAL_SECONDS=3600
PURGE_AFTER_DAYS=30
# --- Model providers — wired in Phase 4 (AI assistant). BYO key. ---
# ANTHROPIC_API_KEY=
# OPENAI_API_KEY=
# XAI_API_KEY=
# --- Model providers (AI assistant + embeddings) -----------------------------
# Configure as many as you like — each turns on when its key is set. The
# default_* vars pick which one is used by default; the app can also select any
# configured provider by name. LLM and embeddings are independent (Anthropic has
# no embeddings endpoint). Leave the defaults 'null' to keep AI off.
DEFAULT_LLM_PROVIDER=null # null | anthropic | openai | xai | ollama
DEFAULT_EMBEDDING_PROVIDER=null # null | openai | ollama
LLM_MAX_TOKENS=4096
EMBEDDING_DIMENSIONS=1536 # must match the embedding model + pgvector column
# Anthropic (LLM)
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL=claude-opus-4-8
# OpenAI (LLM + embeddings)
OPENAI_API_KEY=
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
# xAI / Grok — OpenAI-compatible (LLM)
XAI_API_KEY=
XAI_BASE_URL=https://api.x.ai/v1
XAI_MODEL=grok-2-latest # set to your account's current Grok model
# Ollama — local, OpenAI-compatible, no key (LLM + embeddings)
OLLAMA_ENABLED=false
OLLAMA_BASE_URL=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
+27 -17
@@ -51,8 +51,13 @@ services:
command: ["uv", "run", "--no-dev", "alembic", "upgrade", "head"]
labels:
com.centurylinklabs.watchtower.enable: "true"
# All app config comes from .env (twelve-factor) — no per-setting allow-list
# to maintain. The `environment:` block below only pins values that must NOT
# come from .env. See the backend service for the full rationale.
env_file:
- path: .env
required: false
environment:
APP_ENV: ${APP_ENV:-development}
DATABASE_URL: ${DATABASE_URL:-postgresql+asyncpg://provenance:provenance@postgres:5432/provenance}
depends_on:
postgres:
@@ -63,19 +68,24 @@ services:
image: git.jpaul.io/justin/provenance-backend:${IMAGE_TAG:-test-main}
labels:
com.centurylinklabs.watchtower.enable: "true"
# Twelve-factor: ALL application settings come straight from .env — owner,
# AI providers, mailer/SMTP, S3, sessions, everything in app/core/config.py.
# No per-setting allow-list to maintain, so a new setting in .env (and
# .env.example) reaches the app with no compose edit. The `environment:`
# block below is only for values that must NOT come from .env:
# - RUN_MIGRATIONS: backend-only flag, not an app setting.
# - DATABASE_URL: pinned to the compose-internal host as a safety net —
# the code default points at localhost, which is wrong inside the
# network. (.env normally sets it; this guards against it being absent.)
# `environment:` wins over `env_file`, so these always take effect.
# Trade-off (accepted): env_file also exposes infra secrets (POSTGRES_*,
# MINIO_*, CLOUDFLARE_TUNNEL_TOKEN) to the app process; the app ignores them.
env_file:
- path: .env
required: false
environment:
APP_ENV: ${APP_ENV:-development}
# Self-migrate on start so a Watchtower in-place image swap applies any new
# migrations (idempotent). The one-shot `migrate` service covers the same
# for `compose up`; the depends_on below serializes them so they never run
# alembic concurrently.
RUN_MIGRATIONS: "1"
DATABASE_URL: ${DATABASE_URL:-postgresql+asyncpg://provenance:provenance@postgres:5432/provenance}
S3_ENDPOINT_URL: ${S3_ENDPOINT_URL:-http://minio:9000}
S3_BUCKET: ${S3_BUCKET:-provenance}
S3_ACCESS_KEY: ${S3_ACCESS_KEY:-provenance}
S3_SECRET_KEY: ${S3_SECRET_KEY:-change-me-too}
S3_REGION: ${S3_REGION:-us-east-1}
depends_on:
postgres:
condition: service_healthy
@@ -102,14 +112,14 @@ services:
command: ["uv", "run", "--no-dev", "python", "-m", "app.worker"]
labels:
com.centurylinklabs.watchtower.enable: "true"
# Same .env-driven config as the backend (see its comment). The worker reads
# the model-provider settings too, so the upcoming embedding/matching jobs
# are configured the moment they land — no compose change needed.
env_file:
- path: .env
required: false
environment:
APP_ENV: ${APP_ENV:-development}
DATABASE_URL: ${DATABASE_URL:-postgresql+asyncpg://provenance:provenance@postgres:5432/provenance}
S3_ENDPOINT_URL: ${S3_ENDPOINT_URL:-http://minio:9000}
S3_BUCKET: ${S3_BUCKET:-provenance}
S3_ACCESS_KEY: ${S3_ACCESS_KEY:-provenance}
S3_SECRET_KEY: ${S3_SECRET_KEY:-change-me-too}
S3_REGION: ${S3_REGION:-us-east-1}
depends_on:
postgres:
condition: service_healthy
+17 -14
@@ -69,7 +69,7 @@ Layered, dependency pointing inward:
- **Service layer** — all domain logic and the only place writes happen. Enforces invariants (e.g., "a write must carry an actor for the audit log"). The privacy engine is invoked here on every read.
- **Repository layer** — data access over SQLAlchemy; no business rules.
- **Domain models** — the entities in §5.
- **Integrations** — adapters behind interfaces: `AuthProvider`, `ObjectStore`, `Mailer`, `ModelProvider`, `SourceConnector`, `Queue`. Swapping an implementation is a config change, not a code change.
- **Integrations** — adapters behind interfaces: `AuthProvider`, `ObjectStore`, `Mailer`, `LLMProvider` / `EmbeddingProvider` (two separate model abstractions), `SourceConnector`, `Queue`. Swapping an implementation is a config change, not a code change.
Async throughout (FastAPI + async SQLAlchemy). Anything that can be slow or can fail externally (model calls, scraping, large imports) goes to the worker, never inline in a request.
@@ -87,8 +87,9 @@ Core entities and the important relationships. (Illustrative, not final DDL.)
### Tenancy & identity
- **User** — a person with login. Auth method(s) are attached but identity is internal, so one user can link multiple providers.
- **Tree** — the top-level tenant boundary for genealogical data. Owned by a User; may have additional members.
- **TreeMembership** — (User, Tree, role) where role ∈ {owner, editor, viewer}. The basis for authorization.
- **Tree** — the top-level tenant boundary for genealogical data. Owned by a User; may have additional members. Carries a per-tree **AI model policy** (owner-configured): `ai_member_provider` and `ai_recommender_provider` name configured providers from the model-provider registry (null = no model for that role); the owner may use any configured provider, while these cap what members and the recommender may use. Set via the owner-only `GET`/`PATCH /trees/{id}/ai`.
- **TreeMembership** — (User, Tree, role) where role ∈ {owner, editor, viewer}. The basis for authorization *within a tree*.
- **Instance owner / operator** — orthogonal to tree roles. The account(s) whose email is named in the `OWNER_EMAIL` env var **and whose email is verified** are the instance's operator(s), with access to the owner-only `/api/v1/admin` surface (operational status, instance-wide config). Derived from the env at request time — no DB column, no migration, can't drift, survives DB resets. The verified-email requirement is deliberate: registration is open, so without it whoever registers the owner address first would seize the role — verification ties ownership to proven control of the inbox. Crucially this is **not** a privacy bypass: an instance owner gets operational/config rights, **not** read access to other users' private trees or living-person PII — those still resolve only through the privacy engine. (`is_instance_owner` in `api/deps.py`.)
### Genealogical core
- **Person** — belongs to a Tree. Has many **Name** records (with parts: given, surname, prefix/suffix, and a type such as birth/married/alias) to support variants and changes over time. Carries living/deceased status.
@@ -108,7 +109,7 @@ Core entities and the important relationships. (Illustrative, not final DDL.)
### Cross-cutting
- **AuditEntry** — append-only: actor (User *or* the assistant principal acting for a User), action, entity, before/after snapshot, timestamp. Immutable.
- **SoftDelete** — entities carry `deleted_at`; a scheduled worker purges rows older than 30 days. Recovery = clearing `deleted_at` within the window.
- **ChangeProposal** — a pending set of writes generated by the assistant (or potentially a collaborator suggestion later): a structured diff the user approves, edits, or rejects. Approved proposals are applied through the normal service layer (so they hit the privacy engine and the audit log like any other write).
- **ChangeProposal** — a pending set of writes: records an `origin` (`assistant` | `contributor` — collaborator suggestions are encoded today, not just a future idea), a `status` (pending/applied/rejected), a structured `operations` diff (JSONB list of `{op, entity_type, entity_id?, payload}`), a summary/rationale, and review/apply-error metadata. The user approves, edits, or rejects; approved proposals are applied through the normal service layer (so they hit the privacy engine and audit log like any other write). *Note: v1 apply is not cross-op transactional — see `docs/design/change-proposal.md`.*
## 6. Privacy engine
@@ -118,11 +119,12 @@ A single function conceptually:
visible(viewer, entity) -> { full | redacted | hidden }
```
Inputs: viewer's role on the entity's Tree (including "anonymous"), the Tree's visibility (public/unlisted/private), per-Person privacy override, and living-person status.
Inputs: viewer's role on the entity's Tree (including "anonymous"), the Tree's visibility (public / site_members / unlisted / private), per-Person privacy override, and living-person status.
Rules:
- **Tree private** → only members see anything.
- **Tree public/unlisted** → non-members get a read view, *but* every Person is run through the living-person check and per-person override first.
- **Tree site_members** → any authenticated account on this instance gets a read view (anonymous viewers get nothing), still per-person living/override filtered.
- **Tree unlisted / public** → non-members *including anonymous viewers* get a read view, *but* every Person is run through the living-person check and per-person override first. Unlisted is gated only by knowing the link (never listed or search-indexed); public is listed in `/explore` and indexable.
- **Living-person rule** — a Person with no death fact, whose birth is within a configurable recency window (default ~100 years; unknown birth treated as possibly-living), is redacted (name minimized, vitals/events/media hidden) for non-owners. Owners may override per Person.
- The engine is invoked in the **service layer**, so it covers API, server-rendered public pages, search results, and any data the assistant can read. There is intentionally no path that returns rows without passing through it.
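
The rules above can be sketched as a single pure function. This is an illustration under stated assumptions, not the project's implementation: the `Viewer` and `PersonView` dataclasses, their field names, the boolean `owner_override_public` stand-in for the per-person override, and the fixed 100-year window are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

RECENCY_YEARS = 100  # assumed default for the living-person window
READABLE_BY_NON_MEMBERS = {"public", "unlisted", "site_members"}


@dataclass
class Viewer:
    authenticated: bool
    role: Optional[str] = None  # "owner" | "editor" | "viewer"; None = non-member


@dataclass
class PersonView:
    tree_visibility: str  # "public" | "site_members" | "unlisted" | "private"
    birth_year: Optional[int] = None
    has_death_fact: bool = False
    owner_override_public: bool = False  # hypothetical per-person override


def is_possibly_living(p: PersonView, current_year: int) -> bool:
    """No death fact and a recent (or unknown) birth counts as possibly living."""
    if p.has_death_fact:
        return False
    if p.birth_year is None:
        return True
    return current_year - p.birth_year < RECENCY_YEARS


def visible(viewer: Viewer, p: PersonView, current_year: int) -> str:
    """Return "full", "redacted", or "hidden" for one viewer/person pair."""
    if viewer.role == "owner":
        return "full"
    if viewer.role is None:  # non-member: the tree's visibility gates access
        if p.tree_visibility not in READABLE_BY_NON_MEMBERS:
            return "hidden"  # private (or unrecognized) fails closed
        if p.tree_visibility == "site_members" and not viewer.authenticated:
            return "hidden"
    # Everyone except the owner passes through the living-person check.
    if is_possibly_living(p, current_year) and not p.owner_override_public:
        return "redacted"
    return "full"
```

Keeping the decision a pure function of its inputs makes each rule straightforward to test in isolation, independent of where the service layer calls it.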
@@ -130,7 +132,7 @@ Rules:
Three parts, deliberately separated:
1. **Model provider abstraction** (`ModelProvider`) — one interface over hosted models (Anthropic, OpenAI, xAI) and self-hosted/local models via an OpenAI-compatible endpoint or Ollama. Configurable per deployment; keys supplied by the operator (this deployment) or by the user (BYO-key deployments).
1. **Model provider abstraction** — two separate interfaces, `LLMProvider` and `EmbeddingProvider` (configured independently — e.g. Anthropic has no embeddings endpoint), over hosted models (Anthropic, OpenAI, xAI) and self-hosted/local models via an OpenAI-compatible endpoint or Ollama. An operator can configure **several providers at once** through a registry (`build_llm_providers()`/`configured_llm_providers()`), each selectable by name — the basis for the per-tree AI policy and the `default_llm_provider`/`default_embedding_provider` settings. Keys supplied by the operator (this deployment) or by the user (BYO-key deployments).
2. **Scoped tool surface** — the assistant can only act through a constrained set of tools that map to service-layer operations, **scoped to the user it is helping.** It is its own principal: it cannot exceed that user's rights, and every action is attributed to "assistant (on behalf of User X)" in the audit log. This is the MCP-style boundary referenced in the PRD — the assistant gets capabilities, not raw database access.
3. **Source connectors** (`SourceConnector`) — a plugin framework for *reading* external data: FamilySearch API, Find A Grave, WikiTree, BLM/GLO land patents, USGS maps, public-domain newspapers, public county records. Only legally permissible sources ship with the project; operators can add their own. Connectors are read-only and rate-limited, and run in the worker.
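
The registry idea in part 1 can be sketched roughly as follows. This is a simplified, synchronous stand-in, not the code in `api/deps.py`: the `Settings` dataclass, the `NamedLLM` and `NullLLM` classes, and the explicit `s` parameter are assumptions made so the example is self-contained. Only the behavior (each provider turns on when its credential is set, selection is by name, and anything unresolved falls back to a null provider that fails loudly) follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class ModelProviderNotConfigured(RuntimeError):
    """Raised when a null provider is asked to do real work."""


@dataclass
class Settings:  # hypothetical stand-in for the app's settings object
    default_llm_provider: str = "null"
    anthropic_api_key: Optional[str] = None
    openai_api_key: Optional[str] = None
    xai_api_key: Optional[str] = None
    ollama_enabled: bool = False


@dataclass
class NamedLLM:  # placeholder for a real vendor client
    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class NullLLM:
    def complete(self, prompt: str) -> str:
        raise ModelProviderNotConfigured("no LLM provider is configured")


def build_llm_providers(s: Settings) -> Dict[str, NamedLLM]:
    """Include a provider only when its credential (or enable flag) is set."""
    enabled = {
        "anthropic": bool(s.anthropic_api_key),
        "openai": bool(s.openai_api_key),
        "xai": bool(s.xai_api_key),
        "ollama": s.ollama_enabled,
    }
    return {name: NamedLLM(name) for name, on in enabled.items() if on}


def get_llm_provider(s: Settings, name: Optional[str] = None):
    """Resolve by name, else the configured default; unknown names get the null provider."""
    return build_llm_providers(s).get(name or s.default_llm_provider, NullLLM())
```

For example, with only `anthropic_api_key` set and `default_llm_provider="openai"`, `get_llm_provider(s)` returns the null provider, because the default names a provider that was never enabled.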
@@ -148,7 +150,8 @@ Three parts, deliberately separated:
- `AuthProvider` interface with implementations for **local** (password + email verification/reset), **OIDC** (validated against Authentik; expected to work with Keycloak, Auth0, etc.), and **social** (Google, Apple, Facebook).
- Operators enable any subset via config. This deployment will use Authentik (`auth.jpaul.io`) plus selected social providers; a bare self-hoster can run local-only.
- Sessions are backend-issued; the assistant principal is minted per-session and scoped to the acting user.
- *Status:* **local auth has landed** — Argon2id password hashing, opaque backend-issued sessions (only the token hash is stored; presented as a Bearer token or HttpOnly cookie), and email verification + password reset via the `Mailer` interface (console in dev, SMTP for operators). OIDC and social providers are Phase 5. Every write records an attributable actor in the audit log.
- *Status:* **local auth has landed** — Argon2id password hashing, opaque backend-issued sessions (only the token hash is stored; presented as a Bearer token or HttpOnly cookie), and email verification + password reset via the `Mailer` interface (console in dev, SMTP for operators). An opt-in gate (`REQUIRE_EMAIL_VERIFICATION`, default off so SMTP-less self-hosts and pre-existing accounts aren't locked out) refuses sessions for accounts without a verified email — login is denied and existing sessions stop resolving until the address is verified. OIDC and social providers are Phase 5. Every write records an attributable actor in the audit log.
- **Instance owner / operator** (orthogonal to the per-tree roles): the account(s) whose email is in `OWNER_EMAIL` *and* is verified are the instance operator(s), with the owner-only `/api/v1/admin` surface (operational status, instance-wide config). Derived from the env at request time — no DB column. It is an operator/config role, **not** a privacy bypass: it grants no read access to other users' private trees or living-person PII. (`is_instance_owner` in `api/deps.py`.)
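
A minimal sketch of the owner derivation described in the last bullet, assuming a comma-separated `OWNER_EMAIL` value. The function names and signatures here are hypothetical (the real check is `is_instance_owner` in `api/deps.py`), and the case-insensitive comparison is an assumption rather than documented behavior.

```python
from typing import Set


def parse_owner_emails(raw: str) -> Set[str]:
    """Split a comma-separated OWNER_EMAIL value, dropping blanks."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_instance_owner_sketch(email: str, email_verified: bool, owner_email_setting: str) -> bool:
    """Owner only if the address is listed AND the account has verified it."""
    if not email_verified:
        # Open registration: an unverified account must never gain the role.
        return False
    return email.strip().lower() in parse_owner_emails(owner_email_setting)
```

Because the result is computed from the environment on each request, nothing is stored that could drift from the configured value.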
## 10. Search
@@ -175,20 +178,20 @@ Jobs are idempotent and retryable; an external failure degrades gracefully rathe
- Tag scheme: `test-main` (current main), `test-sha-<long>` (rollback pins), the component version, and `latest` on `v*` tags.
- Servers **pull** new images to deploy — no build on the host. The deploy compose references `git.jpaul.io/justin/provenance-{backend,frontend}:${IMAGE_TAG:-test-main}`; `docker-compose.dev.yml` is a local-build override.
- **Caddy** terminates TLS and reverse-proxies frontend + backend. **Cloudflare Tunnel** is the preferred ingress (no open inbound ports) but is never required; a plain Caddy-on-a-public-host deployment is equally supported.
- **Configuration** is entirely environment-driven (twelve-factor). One `.env` plus the compose file is enough to stand up a deployment.
- **Migrations** run on backend start (or via an explicit job) so an image pull + restart is a complete upgrade.
- **Backups:** documented procedure for Postgres dump + object-store sync; restore is the inverse.
- **Configuration** is entirely environment-driven (twelve-factor). One `.env` plus the compose file is enough to stand up a deployment; the backend/worker/migrate services read it via `env_file`, so every setting in `app/core/config.py` is configurable without a compose edit.
- **Migrations** run on backend start (`RUN_MIGRATIONS=1`) and via a one-shot `migrate` compose service, so an image pull + restart is a complete upgrade. A **schema-drift guard** (defense in depth) makes a half-applied deploy loud rather than a silent storm of 500s: `/health/ready` returns 503 and startup logs a CRITICAL `SCHEMA DRIFT` line when the DB's `alembic_version` is behind the heads baked into the image (`app/core/schema_version.py`).
- **Backups:** a one-command operator script (`deploy/backup.sh`: `pg_dump` + MinIO object sync, see `deploy/BACKUP.md`) plus a per-account ZIP export; restore is the inverse.
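
The schema-drift guard in the Migrations bullet reduces to a set comparison between the revisions recorded in the database and the heads shipped in the image. The sketch below assumes both sets have already been loaded (in practice from the `alembic_version` table and the migration scripts in the image); the function names are hypothetical and not taken from `app/core/schema_version.py`.

```python
import logging
from typing import Iterable, Set

logger = logging.getLogger("schema_version")


def schema_drift(db_versions: Iterable[str], image_heads: Iterable[str]) -> Set[str]:
    """Return the image heads the database has not reached (empty = in sync)."""
    return set(image_heads) - set(db_versions)


def readiness_status(db_reachable: bool,
                     db_versions: Iterable[str],
                     image_heads: Iterable[str]) -> int:
    """HTTP status for a readiness probe: 200 only when Postgres is
    reachable and every head baked into the image is applied."""
    if not db_reachable:
        return 503
    missing = schema_drift(db_versions, image_heads)
    if missing:
        # Loud on purpose: a half-applied deploy should not fail silently.
        logger.critical("SCHEMA DRIFT: database is missing revisions %s", sorted(missing))
        return 503
    return 200
```

Note that this simplification also reports drift when the database is ahead of the image, whereas the text only describes the "behind" case; a real guard would need the revision graph to tell the two apart.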
**Repository layout (as scaffolded):**
```
/backend # FastAPI, uv-managed. app/{api/v1, services (+privacy), repositories, models, schemas, integrations (auth/mailer), core}; migrations/ = Alembic
/deploy # docker-compose.yml, Caddyfile, .env.example
/backend # FastAPI, uv-managed. app/{api/v1, services (+privacy), repositories, models, schemas, integrations (auth, mailer, objectstore, models = LLM/embedding providers), core}; migrations/ = Alembic
/deploy # docker-compose.yml (+ docker-compose.dev.yml), Caddyfile, .env.example, backup.sh + BACKUP.md
/.gitea/workflows # Gitea Actions: build images → Gitea registry
/frontend # Next.js (App Router, TS, Tailwind). app/ pages, lib/api (openapi-typescript client), components/ui, Dockerfile (standalone)
```
The compose stack runs `postgres` (pgvector image — includes `pgvector`; `pg_trgm` ships in contrib), `minio`, `backend`, and `caddy`. The **worker** container (same image as backend, worker mode) joins once queue-driven jobs exist. Phase 0 ships a minimal backend with `/health` (liveness) and `/health/ready` (Postgres reachability) to validate the deploy wiring before the data model lands.
The compose stack runs `postgres` (pgvector image — includes `pgvector`; `pg_trgm` ships in contrib), `minio`, a one-shot `migrate` job, `backend`, the **worker** (same image as backend, worker mode — runs the scheduled soft-delete purge), `caddy`, and an optional `cloudflared` tunnel. The backend exposes `/health` (liveness) and `/health/ready` (Postgres reachability + schema-drift check).
## 13. Observability
+35 -39
@@ -16,19 +16,18 @@
**Where Provenance is strong today.** The foundation is genuinely solid and, in several places, ahead of the OSS field:
- **Sources-first spine is real.** A reusable `Source` + per-fact `Citation` two-tier model with an `exactly_one_target` CHECK constraint, confidence enum, and full backend CRUD. This is the architectural thing webtrees/Gramps get right and most commercial tools bury. (Caveat: citations are silently dropped on GEDCOM *export*; see below.)
- **Privacy architecture is the right shape.** A single `privacy.py` engine, `TenantScoped` mixin on every row, living-person heuristic (`is_possibly_living`, unknown-birth-treated-as-living), and media served **through the backend rather than via raw S3 URLs**. The *shape* is correct; coverage is not yet complete (the media endpoint and several child resources don't yet apply `person_visibility` — see §2.4, §2.10).
- **Sources-first spine is real.** A reusable `Source` + per-fact `Citation` two-tier model with an `exactly_one_target` CHECK constraint, confidence enum, and full backend CRUD. This is the architectural thing webtrees/Gramps get right and most commercial tools bury.
- **Privacy architecture is the right shape — and coverage is now broad.** A single `privacy.py` engine, `TenantScoped` mixin on every row, living-person heuristic (`is_possibly_living`, unknown-birth-treated-as-living), and media served **through the backend rather than via raw S3 URLs**. Non-member reads of persons, events, media, names, and relationships all route through `person_visibility` (#46). The remaining gap is the `citation`/`source` list endpoints, which still gate only on `can_view_tree` — see §2.10.
- **Non-destructive by design.** Soft-delete with timed purge worker, immutable `AuditEntry` (before/after JSONB, `actor_type` ready for the assistant), GEDCOM merge that copies rather than overwrites, full account export/import.
- **Modeling maturity.** Typed parent/child qualifiers (biological/adoptive/step/foster/donor/guardian), typed alternate names with one-primary invariant, dual verbatim+normalized dates, duplicate-relationship guards, UUID surrogate keys.
- **Standards core.** GEDCOM 5.5.1 import/export is **functional** (with preview/merge-vs-create resolution UI), pg_trgm fuzzy name search, multi-tenant tree hosting with visibility tiers. Round-trip *fidelity* has four tracked gaps (citation links, custom tags, PLAC coords/hierarchy, non-UTF-8 encoding) — see §2.11.
- **Standards core.** GEDCOM 5.5.1 import/export is **functional** (with preview/merge-vs-create resolution UI), pg_trgm fuzzy name search, multi-tenant tree hosting with visibility tiers. Round-trip *fidelity* has three tracked gaps (custom tags, PLAC coords/hierarchy, non-UTF-8 encoding) — see §2.11.
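
The living-person heuristic named in the privacy bullet above (`is_possibly_living`, with an unknown birth treated as living) might look roughly like the following. Only the function name, the unknown-birth rule, and the 100-year constant come from the document; the `is_deceased` parameter, the year-only comparison, and the strict `<` boundary are assumptions for illustration.

```python
from datetime import date
from typing import Optional

# The document cites a hardcoded value of 100; everything else here is assumed.
LIVING_RECENCY_YEARS = 100


def is_possibly_living(birth_year: Optional[int],
                       is_deceased: bool = False,
                       today: Optional[date] = None) -> bool:
    """Conservative check: when in doubt, treat the person as living."""
    if is_deceased:
        return False
    if birth_year is None:
        # Unknown birth is treated as living so redaction fails closed.
        return True
    current_year = (today or date.today()).year
    return current_year - birth_year < LIVING_RECENCY_YEARS
```

Passing `today` explicitly keeps the function deterministic in tests. The fail-closed default is the reason parents with no birth date stay "unknown" until a rule such as the cleanup tool marks them deceased.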
**Documentation-vs-code gaps to correct now (per "docs travel with code").** Three repo claims are not yet true and should be edited in the same spirit they were written:
**Documentation-vs-code gaps to correct now (per "docs travel with code").** Two repo claims are not yet true and should be edited in the same spirit they were written:
- **ChangeProposal is documented as landed but does not exist.** CLAUDE.md states the core data model (ARCHITECTURE §5) landed / "Phase 0 complete," but `ChangeProposal` — part of §5 and the load-bearing AI invariant — has no model, migration, or schema. Either scope it out of the "landed" claim or build it; don't leave the docs asserting it.
- **pgvector is claimed as used; it is not.** Only `pg_trgm` is created. ARCHITECTURE references pgvector for match ranking.
- **i18n "from day one" is documented but unmet.** PRD §6 promises externalized strings; every label is a hardcoded literal.
These three doc edits are themselves trivial quick wins (see §3).
These two doc edits are themselves trivial quick wins (see §3).
**The biggest gaps vs commercial (Ancestry / MyHeritage / FamilySearch).** Provenance is not trying to be a record provider, and correctly so — but it is missing several things mainstream users treat as table stakes:
@@ -40,19 +39,16 @@ These three doc edits are themselves trivial quick wins (see §3).
**The biggest gaps vs OSS (GRAMPS / Gramps Web / webtrees).** These are where a privacy-first self-host product is expected to compete and currently trails:
- **Collaboration is plumbed but unreachable.** `TreeMembership` roles are enforced on every read/write, but there is **no API or UI to invite, grant, change, or revoke** a member — the tree is effectively single-user despite multi-user infrastructure. This also breaks the full-CRUD invariant (NN#8) and, because importance and the old Phase-6 schedule disagree, a minimal management slice is pulled forward (§2.9).
- **Living-person redaction is non-uniform.** Redaction is applied on person reads but **not** on the event/media/name/relationship/citation/source child-resource endpoints — a real PII leak on public/unlisted trees (NN#3, NN#2).
- **`site_members` visibility tier is silently broken** (defined, selectable in UI, never handled in `can_view_tree`).
- **Collaboration management is now reachable, but minimal.** `TreeMembership` roles are enforced on every read/write, and a list/add/change-role/remove API + UI now ship (§2.9), satisfying the full-CRUD invariant (NN#8). The remaining gap is the richer **email invite/grant flow** (pending-invite state, resend/expire), still scheduled for Phase 6.
- **Living-person redaction is now near-uniform.** Non-member reads of persons, events, media, names, and relationships all redact possibly-living people (#46); the `citation`/`source` list endpoints are the remaining hold-outs (they gate only on `can_view_tree`) — a narrowed PII gap on public/unlisted trees (NN#3, NN#2).
- **No place as a usable first-class entity** (model exists, created by GEDCOM, but no read/edit/delete — a create-only entity, which is a bug per NN#8).
- **No research log, to-do/task planner, kinship calculator, data-quality checker, or i18n/string externalization** (the last is a documented day-one commitment that is currently unmet).
**Security-priority correctness fixes (do these first, regardless of phase).** Three current defects are user-harm or trust issues, not roadmap items:
**Security-priority correctness fixes (do these first, regardless of phase).** The redaction fixes have all shipped (child resources in #46, and now citations/sources too), leaving one config switch:
1. **Media privacy leak (§2.4)**: `list_media`/`get_media`/`media_content` gate on `can_view_tree` but never `person_visibility`; non-owners can download photos of redacted living people on public/unlisted trees.
2. **Child-resource redaction gap (§2.10)** — event/media/name/relationship/citation/source endpoints don't apply living-person redaction.
3. **Registration issues a live session before verification (§2.10)**: `register` returns an authenticated session cookie + token (201) and `email_verified_at` is written but never read on any path; there is no env switch to gate self-registration. The *enforcement check* (read-side `email_verified_at`) is small; the approval-mode env switch is the larger piece.
1. **Self-registration approval-mode switch (§2.10)**: the read-side enforcement now exists (`REQUIRE_EMAIL_VERIFICATION` gates login/session on `email_verified_at`, #53). The remaining gap is the env switch to choose open vs admin-approval vs closed self-registration. *(The citation/source living-person leak is now closed: citation/source list endpoints apply `person_visibility` for non-members via `public_view_service`.)*
**Strategic posture.** The differentiators worth pressing — property chain-of-title, the ChangeProposal AI model, the anonymous mutual-consent hint system, and true self-host data ownership — are mostly still ahead on the roadmap. The near-term job is (a) close the **privacy/auth correctness** and **collaboration** gaps that the architecture already implies, (b) ship the **maps + reports + merge** table stakes, and (c) build the **connector/ModelProvider/ChangeProposal** spine that unlocks the entire back half of the roadmap.
**Strategic posture.** The differentiators worth pressing — property chain-of-title, the ChangeProposal AI model, the anonymous mutual-consent hint system, and true self-host data ownership — are mostly still ahead on the roadmap. The near-term job is (a) close the **privacy/auth correctness** and **collaboration** gaps that the architecture already implies, (b) ship the **maps + reports + merge** table stakes, and (c) finish the back-half spine — the **connector framework** plus wiring the now-landed **ChangeProposal/ModelProvider** into the assistant — that unlocks the entire back half of the roadmap.
---
@@ -129,11 +125,11 @@ Fuzzy trigram name search is **have**; everything that depends on connectors, em
### 2.4 Media & documents
Universal media attachment is **have**, but with a **confirmed privacy leak** and no asset-processing pipeline.
Universal media attachment is **have**; the earlier privacy leak is now **closed** (#46), and the remaining gaps are the asset-processing pipeline (EXIF strip, thumbnails).
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **Media privacy gating on serve paths** | `list_media`/`get_media`/`media_content` gate only on `can_view_tree`, never `person_visibility` — a non-owner can download photos of redacted living people on public/unlisted trees. | Have(leaky) | **Critical** | M | 1 | **Security-priority — fix first. Direct NN#3/NN#2 violation.** Check attached `person_id` visibility and redact/hide. |
| **Media privacy gating on serve paths** | `list_media`/`get_media`/`media_content` now apply `person_visibility` for non-members (#46): media is exposed only when linked to a FULL-visibility person (`list_public_media`/`can_view_media`), so living-person photos no longer leak on public/unlisted trees. | Have | **Critical** | M | 1 | **Resolved (NN#3/NN#2).** Serve paths check attached `person_id` visibility and 404 otherwise. |
| EXIF / GPS stripping on upload | Raw bytes stored verbatim; family photos leak GPS/home addresses/timestamps. | Planned | High | M | 1 | **Security-priority**, not cosmetic. Parse EXIF on ingest, strip/quarantine by default, allow override. |
| Thumbnail / preview generation | No image pipeline (no Pillow). Async, idempotent worker job. | Planned | High | L | 1 | Derived thumbnail must inherit parent privacy — no bypass path. |
| Image reference regions | Mark the rectangle of a census image that supports a Citation. | Missing | Med | M | later | Tenant-scoped, full CRUD; region→Citation preferred over region→Person. |
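
The media gating rule in the first row of the table above (non-members see media only when it is linked to a FULL-visibility person) can be illustrated as below. The `Visibility` enum, the `Media` tuple, and the choice to hide unlinked media from non-members are assumptions; `can_view_media` and `list_public_media` reuse the names mentioned in the row, but the signatures are invented for this sketch.

```python
from enum import Enum
from typing import Dict, Iterable, List, NamedTuple, Optional


class Visibility(Enum):
    FULL = "full"
    REDACTED = "redacted"


class Media(NamedTuple):
    id: str
    person_id: Optional[str]  # the person this media item is attached to


def can_view_media(media: Media,
                   visibility_by_person: Dict[str, Visibility],
                   is_member: bool) -> bool:
    """Members see everything; non-members only media of FULL-visibility people."""
    if is_member:
        return True
    if media.person_id is None:
        # Assumption: media with no linked person fails closed for non-members.
        return False
    return visibility_by_person.get(media.person_id) is Visibility.FULL


def list_public_media(items: Iterable[Media],
                      visibility_by_person: Dict[str, Visibility]) -> List[Media]:
    """Filter a listing down to what an anonymous or non-member viewer may see."""
    return [m for m in items if can_view_media(m, visibility_by_person, is_member=False)]
```

A serve path would call `can_view_media` before streaming bytes and answer 404 when it returns `False`, so a hidden item is indistinguishable from a missing one.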
@@ -224,14 +220,14 @@ The preview→approve **bulk cleanup** tool is a genuine **have** and a differen
### 2.9 Collaboration & sharing
Authorization is enforced everywhere, but the **management surface is entirely absent** — the most consequential gap relative to the multi-user product promise. Because the Critical items below previously sat at Phase 6 while their labels said "breaks NN#8," a minimal management slice is pulled forward to Phase 2; the richer invite/email UX stays at Phase 6.
Authorization is enforced everywhere, and a **minimal management surface now ships** — list/add/change-role/remove via `api/v1/members.py` plus a members page (#233). The remaining gap is the richer email invite/grant flow. The minimal slice landed at Phase 2 as planned; the invite/email UX stays at Phase 6.
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **Membership PATCH/DELETE + role change (minimal slice)** | Add/adjust/revoke a collaborator and change `role`: the substrate (mutable `role`) exists; only the endpoints are missing. Resolves the create-only NN#8 break without the full invite flow. | Partial | **Critical** | S–M | 2 | **Pulled forward**: a create-only entity shouldn't wait for Phase 6 (NN#8). Revocation routes through the single privacy point. |
| **Membership PATCH/DELETE + role change (minimal slice)** | Add/adjust/revoke a collaborator and change `role`: GET/PATCH/DELETE on `/trees/{id}/members` (`api/v1/members.py`) plus a frontend members page now ship (#233). Resolves the create-only NN#8 break without the full invite flow. | Have | **Critical** | S–M | 2 | Resolves the create-only NN#8 break. Revocation routes through the single privacy point. |
| Full invite/grant flow (email + UI) | Email-based invitations, pending-invite state, role-grant UI, resend/expire. Builds on the minimal slice. | Partial | High | L | 6 | Invitation email via configured SMTP (NN#7); membership changes through the one enforcement point. |
| **Read-only public tree share** | Visibility model already redacts living persons for anonymous viewers, but every endpoint requires `CurrentUser` — no optional-auth dep, no public route, no public page. | Partial | High | M | 2 | Highest-leverage near-term sharing feature; living-safe by construction via `person_visibility` (NN#2/#3). |
| SEO public profile pages (server-rendered) | Intent declared (`public` = search-indexable) but zero implementation; no sitemap/robots/meta. | Partial | Med | L | 2 | NN#2 explicitly names server-rendered public pages — must go through privacy engine, no direct row queries. |
| **Read-only public tree share** | Anonymous read surface shipped: optional-auth `CurrentUserOrNone` dep, `api/v1/public.py` + `public_view_service.py`, and server-rendered pages at `/p/[treeId]` (+ `/persons/[personId]`) and `/explore`. Living-safe by construction via `person_visibility`. | Have | High | M | 2 | Highest-leverage near-term sharing feature; living-safe by construction via `person_visibility` (NN#2/#3). |
| SEO public profile pages (server-rendered) | Server-rendered public pages (`/p/[treeId]`, `/explore`) and `robots.ts` now ship. Deferred follow-ups: a public-only `sitemap.ts` and per-tree `noindex,nofollow` meta for `unlisted`/`site_members` pages. | Partial | Med | L | 2 | NN#2 explicitly names server-rendered public pages — must go through privacy engine, no direct row queries. |
| **Notification / event-dispatch substrate** | Shared enabler seeded from `AuditEntry`: subscription + dispatch layer emitting privacy-filtered projections. Underpins watch/follow, mutual-consent match notices, comments, moderation, and in-app messaging. | Missing | High | L | 6 | **Privacy-filtered projections only — never raw before/after JSON** (NN#2/#3). |
| Comments / discussion threads | Per-profile discussion (target = person/event/source), threaded. | Missing | High | M | 6 | Comments on living persons redacted for non-members (NN#2/#3); rides the dispatch substrate. |
| In-app messaging (contact details hidden) | SMTP exists; no Message/Thread model. | Planned | High | L | 6 | Hide contact details; opens after mutual consent (NN#4); redact living-person content; rides dispatch substrate. |
@@ -253,10 +249,11 @@ The architecture is correct (single engine, tenant mixin, audit, soft-delete + p
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **Uniform living-person redaction across child resources** | `_redact` runs on person reads but **not** on event/media/name/relationship/citation/source endpoints; non-members can fetch a possibly-living person's events/photos/names directly. | Partial | **Critical** | M | 1–2 | **Security-priority. Core NN#3/NN#2 defect.** Apply `person_visibility` on every person-derived fact. |
| **Email-verification enforcement gate** | `email_verified_at` is written at `auth_service.py:154` but read on no path; `register` returns an authenticated session cookie + token (201) pre-verification. | Partial | **High** | S | 1–2 | **Security-priority near-quick-win**: add the read-side check (NN#7 trust path). The check is small; the registration-mode switch below is the larger piece. |
| **Uniform living-person redaction across child resources** | `person_visibility` now runs for non-members on the event, media, name, and relationship endpoints (#46) and the citation/source list endpoints, all delegating to `public_view_service`: citations show only when they resolve to FULL-visibility person(s); sources show only when they back a visible citation. | Have | High | S | 1–2 | **Resolved (NN#3/NN#2).** No child-resource path leaks a redacted living person's facts. |
| **Email-verification enforcement gate** | Read-side check now ships (#53): `REQUIRE_EMAIL_VERIFICATION` gates login/session on `email_verified_at` (`auth_service.py`). Opt-in (default off) so SMTP-less self-hosts still work. | Have | **High** | S | 1–2 | Read-side trust path now enforced (NN#7); the registration-mode switch below is the separate larger piece. |
| Self-registration mode gating (approve / open / closed) | No env switch to choose open vs admin-approval vs closed registration. | Partial | High | M | 2/5 | Twelve-factor registration control (NN#7); pairs with the verification gate above. |
| **Fix `site_members` visibility tier** | Defined + selectable in UI but `can_view_tree` only handles public/unlisted — fails closed unintuitively. | Partial | Critical | S | 1 | **Quick win.** Least-surprise; honor the tier the UI offers. |
| Instance owner / operator role | `OWNER_EMAIL`-declared operator (#240): `is_instance_owner` on `/users/me`, owner-only `GET /api/v1/admin/instance`, `/admin` UI. | Have | Med | S | 2/5 | Owner-only operational surface, twelve-factor via env (NN#7); reads stay through the service layer. |
| **Fix `site_members` visibility tier** | `can_view_tree` now handles `site_members` (`privacy.py:56`): any authenticated account gets a read view, anonymous is refused. | Have | Critical | S | 1 | Honors the tier the UI offers; reads still route through `person_visibility`. |
| Make `LIVING_RECENCY_YEARS` configurable | Hardcoded 100 at `privacy.py:23`. | Partial | High | S | 2 | **Quick win.** Twelve-factor (NN#7). |
| Privacy-stripped export (redact living) | GEDCOM + account export emit full tree; no "strip living" mode. | Missing | High | M | 2 | Reuse `person_visibility`/`_redact` (NN#3). Owner self-export is safe today; shareable variant is the gap. |
| Per-fact / per-field privacy + record flags | tentative/rejected/preferred/private flags on facts. | Missing | Med | L | later | If added, route through the single engine (NN#2). |
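
The `site_members` fix in the table above amounts to one extra branch in the tree-level gate. The following is a hedged sketch of that decision only: the signature and the `"private"` fall-through are assumptions, and the real `can_view_tree` in `privacy.py` may take different arguments.

```python
from typing import Optional


def can_view_tree(visibility: str, user_id: Optional[str], is_member: bool) -> bool:
    """Tree-level read gate. Person-level redaction still applies afterwards."""
    if is_member:
        return True
    if visibility in ("public", "unlisted"):
        return True
    if visibility == "site_members":
        # Any authenticated account gets a read view; anonymous is refused.
        return user_id is not None
    # Assumption: anything else (e.g. a private tier) is members-only.
    return False
```

Passing this gate only grants a read view of the tree; per the table, reads still route through `person_visibility`, so possibly-living people remain redacted for non-members.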
@@ -270,11 +267,11 @@ The architecture is correct (single engine, tenant mixin, audit, soft-delete + p
### 2.11 Import/export & standards
GEDCOM 5.5.1 import/export and full data-portability export are **have**, but fidelity gaps directly undercut the provenance thesis — and one is outright data loss.
GEDCOM 5.5.1 import/export and full data-portability export are **have**; the remaining fidelity gaps (custom tags, PLAC coords/hierarchy, non-UTF-8 encoding) still undercut the provenance thesis.
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **Citation links dropped on GEDCOM export** | Export never selects the Citation table — fact→source links, page, detail, confidence all dropped on export (they import fine). Re-importing your own export **destroys** the sources-first graph. | Partial | **Critical** | M | 2 | **Silent data loss on the product's signature data + destructive round-trip** (NN#5); breaks PRD US-013. |
| **Citation links on GEDCOM export** | Export now selects Citations and emits `SOUR`/`PAGE` per fact (#232), so fact→source links survive a Provenance→Provenance round-trip. (Citation detail/confidence beyond page still to round-trip.) | Have | **Critical** | M | 2 | Closes the silent data-loss / destructive round-trip on the product's signature data (NN#5); satisfies PRD US-013. |
| GEDCOM 7.0 import/export | Version hardcoded `5.5.1`; no v7 semantics, SCHMA, SUBM, or UID handling. | Partial | High | L | 2 | Stated differentiator (FamilySearch interop). |
| Custom/underscore tag preservation | `_MARNM` becomes `TYPE married`, other custom tags dropped — violates ≥99% round-trip goal. | Missing | High | L | 2 | Tension with provenance thesis (faithful record). |
| PLAC FORM hierarchy + MAP coordinate round-trip | Import reads only PLAC text; export emits flat PLAC. lat/long + hierarchy lost on round-trip. | Missing | High | M | 2–3 | Round-trip fidelity for the land/maps pillar. |
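
For the citation-export row above, GEDCOM 5.5.1 attaches a source citation to a fact as a `SOUR` pointer one level below the event tag, with `PAGE` one level below that. The sketch assumes a simplified `Citation` tuple and a hypothetical `emit_event` helper; it is not the project's exporter.

```python
from typing import List, NamedTuple, Optional, Sequence


class Citation(NamedTuple):
    source_xref: str             # pointer to a level-0 SOUR record, e.g. "S1"
    page: Optional[str] = None   # where within the source (GEDCOM PAGE)


def emit_event(tag: str,
               date: Optional[str],
               citations: Sequence[Citation],
               level: int = 1) -> List[str]:
    """Render one event with its per-fact citations as GEDCOM lines."""
    lines = [f"{level} {tag}"]
    if date:
        lines.append(f"{level + 1} DATE {date}")
    for c in citations:
        lines.append(f"{level + 1} SOUR @{c.source_xref}@")
        if c.page:
            lines.append(f"{level + 2} PAGE {c.page}")
    return lines


print("\n".join(emit_event("BIRT", "1 JAN 1900", [Citation("S1", "p. 12")])))
# 1 BIRT
# 2 DATE 1 JAN 1900
# 2 SOUR @S1@
# 3 PAGE p. 12
```

Detail and confidence, which the row lists as still to round-trip, would need further substructures under `SOUR` and are left out here.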
@@ -309,7 +306,7 @@ Internal REST + OpenAPI + generated TS client are **have**. The externalized dev
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| Public read-only API + scoped tokens (OAuth) | Bearer token is opaque session only; `TokenPurpose` lacks scopes; designed `public.py` never built. | Partial | High | L | 5–6 | Any scoped-token path routes through `person_visibility` + living-person redaction (NN#2/#3). |
| Public read-only API + scoped tokens (OAuth) | The unauthenticated public read surface (`public.py`) now ships (#41–#51), but for a *developer* API the bearer token is still opaque session only and `TokenPurpose` lacks scopes, with no scoped/OAuth token path. | Partial | High | L | 5–6 | Any scoped-token path routes through `person_visibility` + living-person redaction (NN#2/#3). |
| SourceConnector framework | Only AuthProvider/ObjectStore/Mailer base classes exist; no connector base/loader/registry. Gates AI, hints, property connectors. | Planned | Med | L | 4 | Read-only, rate-limited; findings via ChangeProposal (NN#1); legal sources only (NN#6). |
| Webhooks / change feeds | `AuditEntry` is the natural substrate (shares the notification dispatch layer, §2.9); no feed/webhook layer. | Missing | Med | L | 6 | Emit privacy-filtered, tenant-scoped projections — never raw before/after JSON (NN#2/#3). |
| CLI / scripting surface | No `[project.scripts]`, no Typer/Click; worker is a purge loop only. Self-hosters want bulk admin. | Missing | Med | M | 9 | Funnel reads through privacy.py, writes through audit; admin-scoped, no assistant-write path. |
@@ -328,7 +325,7 @@ Postgres + S3, multi-tenant isolation are **have**. Queue, observability, backup
| Real job queue (Postgres/Redis-backed) | Worker is a fixed-interval purge loop; GEDCOM import and account export run **inline in the request**. | Partial | High | L | 4 (pre-req) | Blocks NN#1 (assistant in worker) and NN#4 (hint matching in worker). Queue backend is an open question (PRD §11). |
| **Pagination on list endpoints + server-side tree loading** | List endpoints (`persons.py:37`, events, relationships) take no `limit/offset/skip`; the tree view loads the whole graph client-side. A *current* limitation against the 50k-person target. | Planned | High | M | 1–2 | **Split out from scale validation**: this is a correctness/functional gap now, not a Phase 9 task. |
| Scale validation (50k+ trees, P95<2s, load test) | No benchmark or load test exists. | Planned | High | L | 9 | Inline heavy ops risk partial writes — moving to the queue is what makes "failures never corrupt state" true. |
| **Operator backup: one-command `pg_dump` + MinIO sync** | Only a documented procedure + per-account ZIP exist; no scripted DB+object dump. For a self-host product this is day-one data-loss exposure. | Partial | Critical | M | 1–2 | **Pulled forward**: Critical importance contradicted the old Phase-9 slot. Restore must re-apply privacy state faithfully (NN#3); safety net for NN#8. |
| **Operator backup: one-command `pg_dump` + MinIO sync** | `deploy/backup.sh` + `deploy/BACKUP.md` now provide a scripted DB+object dump (#234). Remaining: scheduled/off-host/verified-restore tooling (row below). | Have | Critical | M | 1–2 | Restore must re-apply privacy state faithfully (NN#3); safety net for NN#8. |
| Scheduled / cloud automated backup + restore tooling | Cron-driven, off-host, verified-restore workflow. | Partial | High | L | 9 | Builds on the one-command slice above. |
| ARM64 build matrix | CI builds `linux/amd64` only; many self-hosters run ARM SBCs. | Partial | High | S | 1 | **Quick win.** Add arm64 + QEMU to buildx (NN#7 container-native). |
| Structured JSON logs + Prometheus metrics | Plain-text stdlib logging; no `/metrics`. | Partial | Med | M | 9 | Logs/metrics reference UUIDs, never names/PII (NN#3/#4). |
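
The pagination gap in the table above is about list endpoints accepting a bounded `limit` and `offset`. A minimal in-memory sketch of that contract follows; `MAX_LIMIT`, the default of 100, and the `(page, total)` return shape are assumptions, and a real endpoint would push the same bounds into SQL `LIMIT`/`OFFSET` rather than slicing a list.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

MAX_LIMIT = 500  # hypothetical server-side cap


def paginate(rows: Sequence[T], limit: int = 100, offset: int = 0) -> Tuple[List[T], int]:
    """Return one page plus the total count, clamping out-of-range inputs."""
    limit = max(1, min(limit, MAX_LIMIT))
    offset = max(0, offset)
    return list(rows[offset:offset + limit]), len(rows)
```

Returning the total alongside the page lets a client render "page N of M" without a second request; clamping rather than rejecting bad values is a design choice that could equally be a 422 response.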
@@ -361,12 +358,13 @@ The entire "land" half is **planned/missing** but fully specified. This is where
### 2.16 AI assistant — *defining differentiator*
Entirely **planned** — and note the docs-vs-code gap: ARCHITECTURE §5 lists `ChangeProposal` as part of the "landed" core model, but no model/migration/schema exists. The audit substrate (`actor_type=assistant`, before/after JSONB) is the right foundation; the ChangeProposal model and ModelProvider abstraction are the two critical-path pieces.
The spine has now **landed**: the `ChangeProposal` model/schema/service, its migration, the GET/POST API, and a review UI all ship, and the `LLMProvider`/`EmbeddingProvider` abstraction with null/Anthropic/OpenAI-compat (OpenAI/xAI/Ollama) providers + registry is in place. The audit substrate (`actor_type=assistant`, before/after JSONB) is the right foundation; the remaining work is wiring the assistant's tools to emit proposals and building the chatbot/RAG surface on top.
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **ChangeProposal (propose-then-confirm)** | The defining invariant. No `proposal.py`, no migration, no review UI yet — despite docs implying it landed. | Planned | **Critical** | L | 4 | **IS NN#1.** Enforce structurally: assistant tools return proposals; only user action applies one; application flows through the normal service layer (privacy + audit). ChangeProposal itself needs full CRUD (NN#8). Correct the docs to match reality. |
| Pluggable LLM + embedding provider | `ModelProvider` over Anthropic/OpenAI/xAI/Ollama; env placeholders exist, no interface code. | Planned | Critical | M | 4 | **Twelve-factor, no hard-coded keys/endpoints** (NN#7); the Ollama/self-hosted path is what makes the privacy-first promise real. |
| **ChangeProposal (propose-then-confirm)** | The defining invariant. Model/schema/service (`models/change_proposal.py`, `services/change_proposal_service.py`), migration `a1b2c3d4e5f6`, GET/POST `api/v1/proposals.py`, and a `/trees/[id]/proposals` review UI all ship. Remaining: wire assistant tools to emit proposals. | Have | **Critical** | L | 4 | **IS NN#1.** Enforce structurally: assistant tools return proposals; only user action applies one; application flows through the normal service layer (privacy + audit). ChangeProposal itself needs full CRUD (NN#8). |
| Pluggable LLM + embedding provider | `LLMProvider`/`EmbeddingProvider` ABCs (`integrations/models/base.py`) with null, Anthropic, and OpenAI-compat (OpenAI/xAI/Ollama) implementations + registry. | Have | Critical | M | 4 | **Twelve-factor, no hard-coded keys/endpoints** (NN#7); the Ollama/self-hosted path is what makes the privacy-first promise real. |
| Per-tree AI model policy | Owner-only per-tree model selection (`Tree.ai_member_provider`/`ai_recommender_provider`, GET/PATCH `/trees/{id}/ai`, `/trees/[id]/ai` UI) (#238). | Have | Med | S | 4 | Owner-only; selects which configured provider a tree uses — keys stay in env, twelve-factor (NN#7). |
| AI research-assistant chatbot (RAG over tree) | Marquee feature; needs ModelProvider + connector + retrieval through privacy engine. | Planned | High | XL | 4 | NN#1 propose-only, NN#2 privacy retrieval, NN#3 redaction. |
| Conversational / connector record search | Search legal sources via the assistant. | Planned | High | L | 4 | Legal sources (NN#6); findings = Source + Citation (NN#5). |
| Fact extraction from documents | Extracted facts map cleanly to ChangeProposal review. | Missing | Med | M | 4 | Canonical NN#1 use case; each fact carries a Citation (NN#5). |
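
The propose-then-confirm invariant in the ChangeProposal row above can be shown with a toy state machine: the assistant may only create a pending proposal, and only an explicit reviewer action applies it. The field names, status strings, and the in-memory `records` dict are assumptions for illustration and do not mirror `models/change_proposal.py`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ChangeProposal:
    target_id: str
    changes: Dict[str, str]
    proposed_by: str = "assistant"
    status: str = "pending"            # pending -> applied | rejected
    reviewed_by: Optional[str] = None


def propose(target_id: str, changes: Dict[str, str]) -> ChangeProposal:
    """What an assistant tool returns: a proposal, never a write."""
    return ChangeProposal(target_id, dict(changes))


def apply_proposal(proposal: ChangeProposal,
                   reviewer: str,
                   records: Dict[str, Dict[str, str]]) -> None:
    """Apply a pending proposal on behalf of a human reviewer."""
    if proposal.status != "pending":
        raise ValueError("proposal already resolved")
    if not reviewer:
        raise PermissionError("a human reviewer is required")
    # In the real system this step would go through the normal service
    # layer so that privacy checks and audit logging still run.
    records[proposal.target_id].update(proposal.changes)
    proposal.status = "applied"
    proposal.reviewed_by = reviewer
```

The structural point is that `propose` has no access to `records` at all, so an assistant tool built on it cannot mutate the tree even by mistake.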
@@ -399,8 +397,8 @@ A documented **day-one commitment** ("UI strings externalized from day one") tha
Ordered by leverage. All are S-effort or a thin slice of a larger item, and most close a stated invariant gap.
1. **Fix `site_members` visibility tier** (Privacy, Critical/S) — defined and selectable in the UI but never handled in `can_view_tree`; fails closed unintuitively.
2. **Email-verification enforcement gate** (Privacy/Auth, High/S) — add the read-side `email_verified_at` check so a freshly registered, unverified user doesn't get a live authenticated session. Security-priority; the registration-mode env switch (open/approve/closed) is the larger follow-on, not part of this quick win.
1. **Fix `site_members` visibility tier** (Privacy, Critical/S) — **done:** `can_view_tree` now handles `site_members` (`privacy.py:56`), giving any authenticated account a read view while refusing anonymous.
2. **Email-verification enforcement gate** (Privacy/Auth, High/S) — **done (#53):** the read-side `email_verified_at` check now ships behind `REQUIRE_EMAIL_VERIFICATION`, so a freshly registered, unverified user doesn't get a live authenticated session. The registration-mode env switch (open/approve/closed) is the larger follow-on (§2.10, M-effort — not a quick win).
3. **Citation confidence selector in the cite form** (Sources, High/S) — confidence is modeled and API-writable but unreachable in the UI; every UI citation is currently NULL. Honors NN#8 and the evidence-quality thesis.
4. **Source edit UI + expose all 8 fields** (Sources, High/S) — update API exists but there is no edit form and create exposes ~3 fields; a create-but-not-edit entity violates NN#8.
5. **Make `LIVING_RECENCY_YEARS` env-configurable** (Privacy, High/S) — hardcoded 100 at `privacy.py:23`; twelve-factor (NN#7).
@@ -411,11 +409,9 @@ Ordered by leverage. All are S-effort or a thin slice of a larger item, and most
10. **`GET /{tree}/citations/{id}` endpoint** (Sources, Med/S) — API symmetry (NN#8).
11. **Transcription/abstract fields on Source** (Sources, Med/S) — add `transcription_text` + `abstract_text`, distinct from `citation_text`; core to evidence analysis.
12. **Sort the merged person timeline** (Research workflow, Med/S) — `shownEvents.sort()` on `date_start`; currently appended unsorted.
13. **Doc corrections (docs-vs-code)** (Meta, trivial/S) — edit CLAUDE.md / ARCHITECTURE so the pgvector "used" claim, the i18n "from day one" claim, and the ChangeProposal "landed" claim match reality. The repo convention requires docs to travel with code.
13. **Doc corrections (docs-vs-code)** (Meta, trivial/S) — edit CLAUDE.md / ARCHITECTURE so the pgvector "used" claim and the i18n "from day one" claim match reality. The repo convention requires docs to travel with code.
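As a rough illustration of item 12 (sorting the merged person timeline on `date_start`), the sketch below assumes `date_start` is an ISO 8601 string or `null`, so that plain string comparison matches chronological order. The `TimelineEvent` shape, the `sortTimeline` name, and the choice to place undated events last are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical event shape; only `date_start` comes from the roadmap item.
interface TimelineEvent {
  id: string;
  date_start: string | null; // assumed ISO 8601, e.g. "1887-04-02"
}

// Returns a new array sorted by date_start ascending.
// Undated events sort last (an assumption); ties keep their input order
// because Array.prototype.sort is stable in ES2019 and later.
function sortTimeline(events: TimelineEvent[]): TimelineEvent[] {
  return [...events].sort((a, b) => {
    if (a.date_start === null && b.date_start === null) return 0;
    if (a.date_start === null) return 1;
    if (b.date_start === null) return -1;
    if (a.date_start < b.date_start) return -1;
    if (a.date_start > b.date_start) return 1;
    return 0;
  });
}
```

Copying before sorting avoids mutating state that a React component may still hold a reference to.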
> **Ships-with, not standalone:** *Revocable / adjustable access (membership PATCH/DELETE + role change)* is security-critical and S-effort, but it is the minimal slice of the membership work (§2.9) and ships **with** those endpoints — it is not independently shippable on its own.
>
> **Higher priority than any quick win, but M-effort (not quick):** the **media privacy leak** (§2.4), the **child-resource redaction gap** (§2.10), and pulling the **one-command operator backup** (§2.14) forward. Treat these as **security-/data-loss-priority Phase 12 fixes** regardless of the quick-win list.
> **Shipped this cycle:** the **media privacy leak** (§2.4) and the **child-resource redaction gap** (§2.10) are fully closed — person/event/media/name/relationship (#46) and citation/source endpoints all apply `person_visibility` for non-members. No residual living-person leak on the read surface.
---
@@ -425,10 +421,10 @@ Where to invest to make Provenance distinct rather than a webtrees clone. Each l
**1. Property chain-of-title (the "land" half).** No surveyed competitor models ownership as a typed, cited event chain tying parties across time, with gap-flagging and bidirectional owner↔person / parcel↔place traversal, fed by **legal** public sources (BLM/GLO patents, USGS, public county deeds). This is the single clearest "no one else does this" capability. Sequence: Property + OwnershipEvent + Citation-target (Phase 3) → chain-of-title view → BLM/GLO connector (Phase 8). The Citation extension is a quick win; the entity is the prerequisite for everything else in the category.
**2. The ChangeProposal AI model.** "The assistant never writes autonomously" is a *trust* differentiator in a market where users fear AI corrupting their research. Build it structurally — assistant tools return proposals; only an explicit human action applies one; application flows through the normal service layer so it always hits the privacy engine and audit log. The same approval queue moderates untrusted human-contributor edits (Collaboration §2.9), so design them together. The audit substrate is already in place; ChangeProposal + ModelProvider are the critical path — and the docs should stop asserting ChangeProposal has landed until it has.
**2. The ChangeProposal AI model.** "The assistant never writes autonomously" is a *trust* differentiator in a market where users fear AI corrupting their research. The structural spine has **landed** — the `ChangeProposal` model/API/review UI and the pluggable `LLMProvider`/`EmbeddingProvider` abstraction both ship — so the remaining work is wiring the assistant's tools to emit proposals (never mutating directly). Assistant tools return proposals; only an explicit human action applies one; application flows through the normal service layer so it always hits the privacy engine and audit log. The same approval queue moderates untrusted human-contributor edits (Collaboration §2.9), so design them together.
**3. Anonymous, mutual-consent cross-tree hints.** The privacy model already redacts living people for anonymous viewers, so a hint system that reveals *nothing identifying* until both sides opt in is achievable by construction — and is a categorically more trustworthy version of MyHeritage Smart Matches / Ancestry hints. Requires the matching engine (pgvector enablement + candidate generation, Phase 7), the notification/event-dispatch substrate (§2.9), and the messaging channel that opens only post-consent.
**4. True self-hosting + data ownership.** Full account export/import, soft-delete recovery, GEDCOM round-trip, env-driven everything, and (to-build) operator-grade scheduled backup + ARM support make Provenance the genealogy app you actually own. Two correctness items gate the promise: GEDCOM export must stop dropping citations (a Provenance→Provenance round-trip currently destroys the sources graph), and operator backup must move from "documented procedure" to a one-command dump. The Ollama/self-hosted ModelProvider path means even the AI assistant runs without tree data leaving the deployment — a promise no commercial competitor can make.
**4. True self-hosting + data ownership.** Full account export/import, soft-delete recovery (with owner-confirmed on-demand purge to delete a trashed tree immediately rather than waiting out the 30-day window), GEDCOM round-trip, env-driven everything, a one-command operator backup, and (to-build) scheduled off-host backup + ARM support make Provenance the genealogy app you actually own. The two correctness items that gated the promise have **landed**: GEDCOM export now preserves citations (the Provenance→Provenance round-trip keeps the sources graph), and operator backup moved from "documented procedure" to a one-command dump (`deploy/backup.sh`). What remains is scheduled/verified-restore tooling and ARM builds. The Ollama/self-hosted ModelProvider path means even the AI assistant runs without tree data leaving the deployment — a promise no commercial competitor can make.
**5. Sources-first as a felt experience.** The two-tier model is built; the differentiator is making it *visible and low-friction*: a guided Evidence-Explained citation builder, transcription/abstract fields, source-driven data entry (transcribe a document into the tree), per-fact confidence surfaced in the UI, and — critically — citations that **survive GEDCOM export**. These turn "every fact links to where it came from" from an architecture note into the product's personality.
**5. Sources-first as a felt experience.** The two-tier model is built, and citations now **survive GEDCOM export** (#232); the remaining differentiator is making sourcing *visible and low-friction*: a guided Evidence-Explained citation builder, transcription/abstract fields, source-driven data entry (transcribe a document into the tree), and per-fact confidence surfaced in the UI. These turn "every fact links to where it came from" from an architecture note into the product's personality.
+15 -9
@@ -1,8 +1,8 @@
# Provenance — Product Requirements Document
**Status:** Draft v0.1
**Status:** Draft v0.1 — now describes a partially-implemented system: Phase 0 complete, Phase 1 done, with early slices of later phases shipped.
**Owner:** Justin Paul
**Last updated:** 2026-06-06
**Last updated:** 2026-06-10
---
@@ -94,7 +94,7 @@ Acceptance criteria (AC) are written to be testable.
- **US-033** I view every property a person held, and every parcel ever recorded at a place. *AC:* both reverse lookups return correct sets.
### Privacy & sharing
- **US-040** I set a tree to public, unlisted, or private. *AC:* visibility enforced for anonymous and non-owner users.
- **US-040** I set a tree to one of four visibility levels — private, unlisted, site_members, or public. *AC:* visibility enforced for anonymous and non-owner users; at the **site_members** level the tree is visible to any authenticated instance user (signed in but not a member of the tree) and hidden from anonymous visitors.
- **US-041** I mark any individual private even within a public tree. *AC:* that person's details hidden from non-owners regardless of tree setting.
- **US-042** Living people are hidden from non-owners by default. *AC:* a person with no death fact and a plausibly-living birth date shows only minimal/no PII to non-owners; owner can override per person.
- **US-043** I add a co-owner to a tree. *AC:* co-owner can edit per role; action attributed to them in the audit log.
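The living-person criterion in US-042 can be sketched as a small predicate. This is a minimal illustration, assuming a year-granularity birth date, a 100-year default window, and an inclusive boundary; the `PersonFacts` shape and `isPossiblyLiving` name are hypothetical, and the treatment of an unknown birth date as possibly living follows §5.5.

```typescript
// Hypothetical projection of the facts the rule needs.
interface PersonFacts {
  birthYear: number | null; // null = no birth date recorded
  hasDeathFact: boolean;
}

const DEFAULT_RECENCY_YEARS = 100; // assumed default window

function isPossiblyLiving(
  person: PersonFacts,
  currentYear: number,
  recencyYears: number = DEFAULT_RECENCY_YEARS,
): boolean {
  // A recorded death fact always wins.
  if (person.hasDeathFact) return false;
  // Unknown birth is treated as possibly living (the protective default).
  if (person.birthYear === null) return true;
  // Inclusive boundary is an assumption of this sketch.
  return currentYear - person.birthYear <= recencyYears;
}
```

Defaulting to "possibly living" when data is missing keeps the rule protective: a gap in the record hides PII rather than exposing it.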
@@ -132,6 +132,7 @@ Acceptance criteria (AC) are written to be testable.
### 5.1 Identity & access
- Pluggable authentication: local password (with email verification and reset), social sign-in (Google, Apple, Facebook), and generic **OIDC** (validated against Authentik; should work with Keycloak, Authentik, Auth0, etc.). Operators enable any subset.
- Roles per tree: **owner**, **co-owner/editor**, **viewer**. Public/unlisted trees also have an implicit anonymous viewer.
- **Instance owner/operator:** an env-declared operator role (via `OWNER_EMAIL`, requiring a verified email), distinct from the per-tree roles. It is an operations/config role only and is **not** a privacy bypass — it grants no access to others' tree data or PII.
- The AI assistant acts as a distinct, scoped principal bound to the user it is helping — it can never exceed that user's rights, and its actions are separately attributable.
### 5.2 Data model (core entities)
@@ -155,6 +156,7 @@ Acceptance criteria (AC) are written to be testable.
### 5.5 Privacy engine
- Effective visibility = function(tree visibility, person override, living status, viewer role).
- Tree visibility has four levels: **private** (members only; default), **unlisted** (anyone with the link, not listed/indexed), **site_members** (any authenticated instance user), and **public** (anonymous + listed/indexable).
- Living-person rule: absent a death fact and within a configurable recency window (default ~100 years from birth, or unknown birth treated as possibly-living), non-owners see minimal or no PII.
- Public/link views must render through the same privacy engine — no bypass path.
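The four tree-visibility levels above reduce to a short decision function. The sketch below is illustrative only: the real check lives in the backend's `can_view_tree`, and the `Viewer` shape and `canViewTree` name here are assumptions. A `null` viewer stands for an anonymous request.

```typescript
type TreeVisibility = "private" | "unlisted" | "site_members" | "public";

// Hypothetical viewer shape: any non-null viewer is an authenticated account.
interface Viewer {
  isMember: boolean; // member of this specific tree
}

function canViewTree(visibility: TreeVisibility, viewer: Viewer | null): boolean {
  // Tree members can always read their own tree.
  if (viewer !== null && viewer.isMember) return true;
  switch (visibility) {
    case "public":
      return true; // anonymous + listed
    case "unlisted":
      return true; // anyone holding the link; simply not listed or indexed
    case "site_members":
      return viewer !== null; // any authenticated account, never anonymous
    case "private":
      return false; // members only
    default:
      return false; // unknown tier: fail closed
  }
}
```

This only answers whether the tree is readable at all; person-level redaction (living status, per-person overrides) still applies to whatever is returned.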
@@ -168,6 +170,7 @@ Acceptance criteria (AC) are written to be testable.
### 5.8 AI research assistant
- Provider-agnostic abstraction over hosted models (Anthropic, OpenAI, xAI) and self-hosted/local models (e.g., an OpenAI-compatible endpoint or Ollama).
- Operators register one or more model providers (env / registry); a tree owner then selects the active provider(s) for that tree via an owner-only AI settings surface.
- Tool-mediated access to the same CRUD operations a user has, scoped to that user, via a server with explicitly scoped capabilities (an MCP-style tool boundary).
- **Propose-then-confirm is mandatory.** The assistant drafts changes as diffs; nothing persists without explicit user approval.
- Source connectors are a **plugin framework**; the project ships only legal sources (e.g., FamilySearch API, Find A Grave, WikiTree, BLM/GLO land patents, USGS maps, public-domain newspapers, public county records). Operator-supplied scrapers can be added later.
@@ -181,6 +184,7 @@ Acceptance criteria (AC) are written to be testable.
### 5.11 Administration & operations
- All integration points (auth, SMTP, object storage, database, model providers, scrapers) are environment/config-driven.
- Health endpoints; structured logs; a documented backup/restore procedure; safe upgrade via image pull + migration.
- Owner-only operator surface: instance status and configuration (`GET /api/v1/admin/instance` and the `/admin` UI), scoped to the instance owner and exposing no tree contents or PII.
## 6. Non-functional requirements
@@ -206,17 +210,19 @@ Acceptance criteria (AC) are written to be testable.
Provenance ships continuously and is stood up in a live lab as it goes; there is no hard MVP/v2 line, but features land in dependency order so each tranche is usable.
- **Phase 0 — Foundation:** backend + DB schema; local auth + email verify; frontend scaffold; container images; CI/CD (Gitea Actions → Gitea registry → server pull); one-command compose deploy.
- **Phase 1 — Core tree:** people, relationships, events; sources & citations; media uploads; soft delete + recovery; tree-level privacy.
- **Phase 2 — Standards & polish:** GEDCOM 7 import/export; search with fuzzy names; living-person protection; person-level privacy override; onboarding + persona selector.
- **Phase 0 — Foundation:** *(shipped)* backend + DB schema; local auth + email verify; frontend scaffold; container images; CI/CD (Gitea Actions → Gitea registry → server pull); one-command compose deploy.
- **Phase 1 — Core tree:** *(shipped)* people, relationships, events; sources & citations; media uploads; soft delete + recovery; tree-level privacy (now four levels: private/unlisted/site_members/public).
- **Phase 2 — Standards & polish:** *(partly shipped — GEDCOM 7 import/export #232; fuzzy/trigram search)* GEDCOM 7 import/export; search with fuzzy names; living-person protection; person-level privacy override; onboarding + persona selector.
- **Phase 3 — Property:** property entity; ownership events; chain-of-title view; property-aware sources.
- **Phase 4 — AI assistant:** provider abstraction (hosted + local); scraper plugin framework; first connectors (FamilySearch, Find A Grave); propose-diff approval flow; assistant actions in audit log.
- **Phase 5 — Federated auth:** OIDC (Authentik), then Google/Apple/Facebook sign-in.
- **Phase 6 — Collaboration:** tree co-owners; audit-log UI; direct messaging; notifications.
- **Phase 4 — AI assistant:** *(partly shipped early — provider abstraction + multi-provider registry #235/#237; ChangeProposal propose-then-confirm #236)* provider abstraction (hosted + local); scraper plugin framework; first connectors (FamilySearch, Find A Grave); propose-diff approval flow; assistant actions in audit log.
- **Phase 5 — Federated auth:** *(not shipped — only the `AuthProvider` ABC exists)* OIDC (Authentik), then Google/Apple/Facebook sign-in.
- **Phase 6 — Collaboration:** *(tree membership #233 landed early)* tree co-owners; audit-log UI; direct messaging; notifications.
- **Phase 7 — Cross-tree hints:** async matching engine (embeddings-assisted); anonymous match notifications; mutual-consent reveal.
- **Phase 8 — Land sources:** BLM/GLO patents; USGS map integration; additional county-deed connectors (merge existing scrapers).
- **Phase 9 — Hardening & dogfooding** toward a possible hosted offering.
**Shipped ahead of sequence (operations & platform):** instance-owner/operator role (#240); operator backup tooling (#234); a schema-drift guard (#239). These landed early because the live lab deployment needed them. Note that despite their later issue numbers, **Phase 5 federated auth/OIDC is not yet shipped** — only the `AuthProvider` ABC is in place.
Rationale: enabling work (schema, auth, deploy, sources) precedes everything; GEDCOM lands before the assistant so AI writes target a stable model; property follows a well-tested people graph; hints come late because they require multiple populated trees.
## 9. Technical direction (summary)
+6 -2
@@ -1,6 +1,8 @@
# Design note: ChangeProposal (propose-then-confirm)
Status: **in progress**. Implements non-negotiable #1 (CLAUDE.md): *the AI
Status: **Shipped (#214/#236)** — model, service, API, and review UI landed; the
assistant producer and cross-op transactional apply remain as follow-ups (see
Out of scope). Implements non-negotiable #1 (CLAUDE.md): *the AI
assistant never writes autonomously.* Every assistant "write" emits a
**ChangeProposal** — a structured diff a human approves, edits, or rejects.
@@ -63,7 +65,9 @@ is a follow-up (it needs the services to accept a no-commit mode).
- `apply(session, *, actor, tree, proposal_id, edited_operations=None) -> ChangeProposal`
— editor-only. Optional `edited_operations` lets the reviewer tweak the diff
before applying ("edit" in approve/edit/reject). Dispatches each op through the
editing services; on any failure, rolls back and records `apply_error`.
editing services; on failure it records `apply_error` and leaves the proposal
pending — it does **not** roll back ops already committed by earlier dispatches
(v1 is not cross-op transactional; see Data model).
- `reject(session, *, actor, tree, proposal_id, note=None)` — editor-only.
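The revised `apply` semantics (record `apply_error`, leave the proposal pending, no cross-op rollback) can be illustrated with a small sketch. The types, the `applyProposal` name, and the injected `dispatch` callback are hypothetical stand-ins for the real Python service; only the behavior described above is taken from the note.

```typescript
type ProposalStatus = "pending" | "applied";

interface Operation {
  kind: string;
  payload: unknown;
}

interface Proposal {
  id: string;
  status: ProposalStatus;
  operations: Operation[];
  applyError: string | null;
}

// `dispatch` stands in for the editing services; each call commits on its own.
function applyProposal(
  proposal: Proposal,
  dispatch: (op: Operation) => void,
  editedOperations?: Operation[],
): Proposal {
  // The reviewer may tweak the diff before applying.
  const ops = editedOperations ?? proposal.operations;
  for (const op of ops) {
    try {
      dispatch(op);
    } catch (err) {
      // v1 behavior: earlier ops stay committed; the proposal stays pending.
      const message = err instanceof Error ? err.message : String(err);
      return { ...proposal, applyError: message };
    }
  }
  return { ...proposal, operations: ops, status: "applied", applyError: null };
}
```

Because earlier operations are not undone, a reviewer retrying a failed proposal should expect part of the diff to be in place already, which is why cross-op transactional apply is listed as a follow-up.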
## API
+33 -22
@@ -1,11 +1,11 @@
# Design note: tree visibility & the public viewing surface
Status: **proposed** (design only — no code yet). Owner: Justin. Created 2026-06-09.
Status: **Shipped (#41-#51)**. Owner: Justin. Created 2026-06-09.
This is a privacy-critical change (it creates the first anonymous read surface in
Provenance). Per CLAUDE.md, design before code. Implementation should land in
small, individually-reviewable PRs, with tests on the privacy engine and the
public read path before any anonymous endpoint is exposed.
This is a privacy-critical change (it created the first anonymous read surface in
Provenance). Per CLAUDE.md, it was designed before code and shipped in small,
individually-reviewable PRs, with tests on the privacy engine and the public read
path landing before any anonymous endpoint was exposed.
## 1. The model
@@ -74,13 +74,12 @@ logged-in non-member; `private` denies both.
## 4. The anonymous read path (the careful part)
**Recommendation: a dedicated read-only public API namespace**, not optional-auth
on the existing endpoints. Rationale: it is far easier to audit a small,
purpose-built surface that *always* funnels through `person_visibility` than to
weaken the membership checks on the authenticated endpoints and hope every branch
is covered.
**Shipped: a dedicated read-only public API namespace**, not optional-auth on the
existing endpoints. Rationale: it is far easier to audit a small, purpose-built
surface that *always* funnels through `person_visibility` than to weaken the
membership checks on the authenticated endpoints and hope every branch is covered.
- New router `app/api/v1/public.py`, mounted at `/api/v1/public`, with an
- Router `app/api/v1/public.py`, mounted at `/api/v1/public`, with an
**optional-auth** dependency `CurrentUserOrNone` (returns `User | None`; never
401s). Contrast with `CurrentUser` (`deps.py:30-36`) which hard-401s.
- Endpoints (read-only; no create/update/delete):
@@ -88,14 +87,20 @@ is covered.
lists `site_members` when the caller is authenticated. Paginated, search via
existing `pg_trgm`. Never lists `unlisted`/`private`.
- `GET /public/trees/{id}` — tree metadata if `can_view_tree(user_or_none)`.
- `GET /public/trees/{id}/persons`, `/persons/{pid}`, `/relationships`,
`/events`, `/media`, … — each filtered through `person_visibility`, returning
redacted projections (a `PublicPersonRead` that omits PII for redacted people:
no exact dates, no living-person names beyond "Living", etc.).
- **A redacted response schema**, distinct from the member `PersonRead`, so the
serializer physically cannot emit fields a non-member shouldn't see. Redaction
happens in the service, not the route.
- **Rate limiting** on the public namespace (per-IP) to blunt scraping/enumeration.
- `GET /public/trees/{id}/persons`, `/persons/{pid}`, `/persons/{pid}/names`,
`/relationships`, `/events` — each filtered through `person_visibility`.
(Media is not exposed on the public surface yet — deferred.)
- **Redaction happens in the service, before serialization** — this is the safety
guarantee. It did **not** ship as a separate `PublicPersonRead` schema (that
recommendation was not adopted): the public router **reuses the member read
schemas** (`PersonRead`, `RelationshipRead`, `EventRead`, `NameRead`), and only
the tree projection (`PublicTreeRead`) is distinct. Safety comes from
`public_view_service` resolving `person_visibility` and then **dropping hidden
rows and redacting possibly-living people** (`person_service._redact` rewrites
the name to "Living person", etc.) *before* a row is ever validated into a
schema. No route hands a raw row to the serializer.
- **Rate limiting** on the public namespace (per-IP) is **deferred** — it is not
implemented in the app and may be handled at the Caddy edge if needed.
- **Audit**: count public reads; do not log PII.
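The "redact in the service, before serialization" guarantee described in this section can be sketched as follows. This is an illustration under assumptions: the row shape, the three-valued `Visibility` result, and the fields cleared by `redact` are hypothetical; the note itself only confirms that hidden rows are dropped and that redaction rewrites the name to "Living person".

```typescript
type Visibility = "visible" | "redacted" | "hidden";

// Hypothetical row shape; the real models carry more fields.
interface PersonRow {
  id: string;
  name: string;
  birthDate: string | null;
}

// Rewrites identifying fields. Which fields besides the name are cleared
// is an assumption of this sketch.
function redact(row: PersonRow): PersonRow {
  return { id: row.id, name: "Living person", birthDate: null };
}

// Resolve visibility first, then drop or redact, and only then hand rows
// to the serializer. No raw row leaves this function.
function publicPersons(
  rows: PersonRow[],
  visibilityOf: (row: PersonRow) => Visibility,
): PersonRow[] {
  const out: PersonRow[] = [];
  for (const row of rows) {
    const visibility = visibilityOf(row);
    if (visibility === "hidden") continue;
    out.push(visibility === "redacted" ? redact(row) : { ...row });
  }
  return out;
}
```

Since the member read schemas are reused, the safety property rests entirely on every public route going through a function like this one rather than on the schema's field list.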
## 5. Frontend public pages
@@ -103,8 +108,12 @@ is covered.
- New **server-rendered** routes outside the authed app shell, e.g.
`/p/[treeId]` (tree), `/p/[treeId]/[personId]` (person), `/explore` (directory).
Server components fetch the `/api/v1/public/*` endpoints; no login redirect.
- `robots`: allow + sitemap for `public`; `noindex, nofollow` meta for `unlisted`
and `site_members`. Sitemap lists only `public` trees/persons.
- `robots`: ships a coarse `allow: ["/", "/p/"]` rule (`frontend/app/robots.ts`)
that keeps the authed app out of the index. Per-tree `noindex, nofollow` meta
for `unlisted`/`site_members` and a `public`-only **sitemap** did **not** ship —
both are **deferred** follow-ups (per-tree noindex needs server rendering;
meanwhile `unlisted`/`site_members` trees aren't linked or listed, so they
aren't crawl-discoverable).
- The directory `/explore` is anonymous for `public`; shows `site_members` trees
only to logged-in users.
- Reuse the tree/person view components where possible, fed by the redacted
@@ -131,7 +140,9 @@ anyone on the web. Living people stay hidden.") is worthwhile given the stakes.
output. No raw repository reads in the public router.
- Living-person protection holds regardless of tree visibility.
- Unlisted relies on UUID unguessability; never expose a sequential public id.
- `noindex` everything except `public`; sitemap is `public`-only.
- Per-tree `noindex` (everything except `public`) and a `public`-only sitemap are
**deferred** (see §5); today `robots.ts` keeps the authed app out of the index
and `unlisted`/`site_members` trees aren't linked or listed.
- Tests gate the merge: privacy-engine matrix + an integration test that hits the
public endpoints anonymously and asserts no living-person PII leaks.
+5
@@ -0,0 +1,5 @@
import { AppShell } from "@/components/app-shell";
export default function AdminLayout({ children }: { children: React.ReactNode }) {
return <AppShell>{children}</AppShell>;
}
+103
@@ -0,0 +1,103 @@
"use client";
import { useCallback, useEffect, useState } from "react";
import { api } from "@/lib/api/client";
import type { components } from "@/lib/api/schema";
import { Card, CardContent } from "@/components/ui/card";
type Instance = components["schemas"]["InstanceStatus"];
function Stat({ label, value }: { label: string; value: React.ReactNode }) {
return (
<div className="flex flex-col gap-1">
<div className="text-xs uppercase tracking-wider text-[var(--muted)]">{label}</div>
<div className="text-sm font-medium">{value}</div>
</div>
);
}
export default function AdminPage() {
const [instance, setInstance] = useState<Instance | null>(null);
const [forbidden, setForbidden] = useState(false);
const [ready, setReady] = useState(false);
const load = useCallback(async () => {
const { data, response } = await api.GET("/api/v1/admin/instance");
if (response.status === 403) setForbidden(true);
else if (data) setInstance(data);
setReady(true);
}, []);
useEffect(() => {
load();
}, [load]);
if (!ready) return <div className="p-6 text-sm text-[var(--muted)]">Loading…</div>;
// Fail closed on anything that isn't a successful owner load: 403 (not owner),
// 401 (not signed in), or any 5xx all land here rather than dereferencing null.
if (forbidden || !instance) {
return (
<div className="mx-auto max-w-2xl p-6">
<h1 className="text-xl font-semibold">Instance admin</h1>
<p className="mt-2 text-sm text-[var(--muted)]">
{forbidden
? "This area is for the instance owner only. Set OWNER_EMAIL in the server environment to your account email (and verify that email) to claim it."
: "Instance status is unavailable right now. Make sure you're signed in as the instance owner."}
</p>
</div>
);
}
const i = instance;
return (
<div className="mx-auto max-w-2xl p-6">
<h1 className="text-xl font-semibold">Instance admin</h1>
<p className="mt-1 text-sm text-[var(--muted)]">
Operational status for this deployment. You see this because your account is
named in <code>OWNER_EMAIL</code>. Instance ownership is an operator role; it
does not grant access to other people&apos;s private tree data.
</p>
<Card className="mt-6">
<CardContent className="grid grid-cols-2 gap-5 py-6 sm:grid-cols-3">
<Stat label="Version" value={i.version} />
<Stat label="Environment" value={i.env} />
<Stat label="Users" value={i.user_count} />
<Stat label="Trees" value={i.tree_count} />
<Stat
label="Email verification"
value={i.require_email_verification ? "required" : "off"}
/>
<Stat label="Owner(s)" value={i.owner_emails.join(", ") || "—"} />
</CardContent>
</Card>
<Card className="mt-4">
<CardContent className="flex flex-col gap-3 py-6">
<div className="text-sm font-medium">AI providers (instance-wide)</div>
{i.ai_providers.length === 0 ? (
<div className="text-sm text-[var(--muted)]">
None configured. Set provider credentials (Anthropic, OpenAI, x.AI, or
Ollama) in the server environment.
</div>
) : (
<ul className="flex flex-col gap-1 text-sm">
{i.ai_providers.map((p) => (
<li key={p.name} className="flex items-center justify-between">
<span className="font-medium">{p.name}</span>
<span className="text-[var(--muted)]">{p.model}</span>
</li>
))}
</ul>
)}
<div className="text-xs text-[var(--muted)]">
Default provider: {i.default_llm_provider}. Per-tree AI policy is set on
each tree&apos;s AI page.
</div>
</CardContent>
</Card>
</div>
);
}
+171
@@ -0,0 +1,171 @@
"use client";
import { useParams } from "next/navigation";
import { useCallback, useEffect, useState } from "react";
import { api } from "@/lib/api/client";
import type { components } from "@/lib/api/schema";
import { Button } from "@/components/ui/button";
import { Card, CardContent } from "@/components/ui/card";
type Policy = components["schemas"]["TreeAiPolicyRead"];
// `null`/"" means "no AI for this role". The <select> uses "" for that option
// and we translate to null on save.
const NONE = "";
export default function AiPolicyPage() {
const { id: treeId } = useParams<{ id: string }>();
const [policy, setPolicy] = useState<Policy | null>(null);
const [member, setMember] = useState<string>(NONE);
const [recommender, setRecommender] = useState<string>(NONE);
const [ready, setReady] = useState(false);
const [forbidden, setForbidden] = useState(false);
const [saving, setSaving] = useState(false);
const [msg, setMsg] = useState<string | null>(null);
const load = useCallback(async () => {
const { data, response } = await api.GET("/api/v1/trees/{tree_id}/ai", {
params: { path: { tree_id: treeId } },
});
if (response.status === 403) {
setForbidden(true);
setReady(true);
return;
}
if (data) {
setPolicy(data);
setMember(data.member_provider ?? NONE);
setRecommender(data.recommender_provider ?? NONE);
}
setReady(true);
}, [treeId]);
useEffect(() => {
load();
}, [load]);
async function save() {
setSaving(true);
setMsg(null);
const { data, error } = await api.PATCH("/api/v1/trees/{tree_id}/ai", {
params: { path: { tree_id: treeId } },
body: {
member_provider: member || null,
recommender_provider: recommender || null,
},
});
setSaving(false);
if (error || !data) {
setMsg("Couldn't save — pick a provider your operator has configured.");
return;
}
setPolicy(data);
setMember(data.member_provider ?? NONE);
setRecommender(data.recommender_provider ?? NONE);
setMsg("Saved.");
}
if (!ready) {
return <div className="p-6 text-sm text-[var(--muted)]">Loading…</div>;
}
if (forbidden) {
return (
<div className="mx-auto max-w-2xl p-6">
<h1 className="text-xl font-semibold">AI models</h1>
<p className="mt-2 text-sm text-[var(--muted)]">
Only the tree owner can configure which AI models this tree uses.
</p>
</div>
);
}
const providers = policy?.configured_providers ?? [];
const dirty =
member !== (policy?.member_provider ?? NONE) ||
recommender !== (policy?.recommender_provider ?? NONE);
const Select = ({
value,
onChange,
}: {
value: string;
onChange: (v: string) => void;
}) => (
<select
className="h-9 w-full max-w-xs rounded-md border border-[var(--border)] bg-[var(--surface)] px-2 text-sm"
value={value}
onChange={(e) => onChange(e.target.value)}
>
<option value={NONE}>No AI</option>
{providers.map((p) => (
<option key={p.name} value={p.name}>
{p.name} ({p.model})
</option>
))}
</select>
);
return (
<div className="mx-auto max-w-2xl p-6">
<h1 className="text-xl font-semibold">AI models</h1>
<p className="mt-1 text-sm text-[var(--muted)]">
Choose which configured model each role uses. As the owner you can use any
configured provider; these settings pin members and the recommender to one.
</p>
{providers.length === 0 ? (
<Card className="mt-6">
<CardContent className="py-6 text-sm text-[var(--muted)]">
No AI providers are configured on this deployment. Set provider
credentials in the server environment (Anthropic, OpenAI, x.AI, or
Ollama) and they&apos;ll appear here.
</CardContent>
</Card>
) : (
<>
<Card className="mt-6">
<CardContent className="flex flex-col gap-6 py-6">
<div className="flex flex-col gap-2">
<div>
<div className="text-sm font-medium">Members&apos; assistant</div>
<div className="text-xs text-[var(--muted)]">
The model non-owner members&apos; AI assistant uses.
</div>
</div>
<Select value={member} onChange={setMember} />
</div>
<div className="flex flex-col gap-2">
<div>
<div className="text-sm font-medium">Recommender</div>
<div className="text-xs text-[var(--muted)]">
The model that finds associations and suggests connections.
</div>
</div>
<Select value={recommender} onChange={setRecommender} />
</div>
<div className="flex items-center gap-3">
<Button onClick={save} disabled={!dirty || saving}>
{saving ? "Saving…" : "Save"}
</Button>
{msg && <span className="text-sm text-[var(--muted)]">{msg}</span>}
</div>
</CardContent>
</Card>
<div className="mt-4 text-xs text-[var(--muted)]">
<span className="font-medium">Configured providers:</span>{" "}
{providers.map((p) => `${p.name} (${p.model})`).join(", ")}.
{policy?.default_provider && (
<> Default: {policy.default_provider}.</>
)}{" "}
As the owner you can use all of them.
</div>
</>
)}
</div>
);
}
+77
@@ -10,6 +10,7 @@ import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
import { Input } from "@/components/ui/input";
type Deceased = components["schemas"]["DeceasedCandidate"];
type DeceasedByChild = components["schemas"]["DeceasedByChildCandidate"];
type GenderProp = components["schemas"]["GenderProposal"];
type NameIssue = components["schemas"]["NameIssue"];
type Person = components["schemas"]["PersonRead"];
@@ -31,6 +32,12 @@ export default function CleanupPage() {
const [decSel, setDecSel] = useState<Set<string>>(new Set());
const [decMsg, setDecMsg] = useState<string | null>(null);
// 1b) Deceased by a child's birth year (for parents with no birth date)
const [childYear, setChildYear] = useState(1900);
const [decByChild, setDecByChild] = useState<DeceasedByChild[] | null>(null);
const [dbcSel, setDbcSel] = useState<Set<string>>(new Set());
const [dbcMsg, setDbcMsg] = useState<string | null>(null);
// 2) Gender from source GEDCOM
const [gender, setGender] = useState<GenderProp[] | null>(null);
const [genSel, setGenSel] = useState<Set<string>>(new Set());
@@ -63,6 +70,23 @@ export default function CleanupPage() {
setDeceased(null);
}
async function previewDeceasedByChild() {
setDbcMsg(null);
const { data } = await api.GET("/api/v1/trees/{tree_id}/cleanup/deceased-by-child", {
params: { path: { tree_id: treeId }, query: { born_on_or_before: childYear } },
});
setDecByChild(data ?? []);
setDbcSel(new Set((data ?? []).map((d) => d.person_id)));
}
async function applyDeceasedByChild() {
const { data } = await api.POST("/api/v1/trees/{tree_id}/cleanup/deceased", {
params: { path: { tree_id: treeId } },
body: { person_ids: [...dbcSel] },
});
setDbcMsg(`Marked ${data?.updated ?? 0} people deceased.`);
setDecByChild(null);
}
async function previewGender(e: React.ChangeEvent<HTMLInputElement>) {
const file = e.target.files?.[0];
if (genFile.current) genFile.current.value = "";
@@ -231,6 +255,59 @@ export default function CleanupPage() {
</CardContent>
</Card>
{/* 1b) Deceased by a child's birth year */}
<Card>
<CardHeader>
<CardTitle className="text-base">Mark deceased by a child&apos;s birth year</CardTitle>
</CardHeader>
<CardContent className="space-y-3">
<p className="text-sm text-[var(--muted)]">
Catches parents who have <strong>no birth date of their own</strong> (so the rule
above can&apos;t reach them) but who have a child born long ago, so they&apos;re necessarily
deceased.
</p>
<div className="flex flex-wrap items-end gap-2">
<label className="flex flex-col gap-1 text-sm">
<span className="text-xs text-[var(--muted)]">Has a child born on or before</span>
<Input
type="number"
className="w-28"
value={childYear}
onChange={(e) => setChildYear(Number(e.target.value))}
/>
</label>
<Button variant="outline" onClick={previewDeceasedByChild}>
Preview
</Button>
</div>
{dbcMsg && <p className="text-sm text-bronze">{dbcMsg}</p>}
{decByChild && (
<div className="space-y-2">
<p className="text-sm text-[var(--muted)]">
{decByChild.length} people with a child born on or before {childYear} (not already marked
deceased).
</p>
<ul className="max-h-64 divide-y divide-[var(--border)] overflow-auto rounded-lg border border-[var(--border)]">
{decByChild.map((d) => (
<li key={d.person_id} className="flex items-center gap-3 px-3 py-1.5 text-sm">
<input
type="checkbox"
checked={dbcSel.has(d.person_id)}
onChange={() => toggle(dbcSel, d.person_id, setDbcSel)}
/>
<span className="flex-1">{d.name}</span>
<span className="text-xs text-[var(--muted)]">child b. {d.child_birth_year}</span>
</li>
))}
</ul>
{decByChild.length > 0 && (
<Button onClick={applyDeceasedByChild}>Mark {dbcSel.size} deceased</Button>
)}
</div>
)}
</CardContent>
</Card>
{/* 2) Gender from source */}
<Card>
<CardHeader>
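The preview rule behind the "deceased by a child's birth year" card can be sketched as a pure function. This is only an illustration under stated assumptions: the real logic lives in the backend `cleanup_service`, which this diff does not show. `PersonLite`, `ParentLink`, and `previewDeceasedByChildSketch` are hypothetical names, and reporting the earliest qualifying child year (when a parent has several) is an assumption too.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface PersonLite {
  id: string;
  name: string;
  birthYear: number | null;
  deceased: boolean;
}
interface ParentLink {
  parentId: string;
  childId: string;
}
interface Candidate {
  person_id: string;
  name: string;
  child_birth_year: number;
}

// Parents of any child born on/before the cutoff, excluding parents
// already marked deceased. The parent's own birth date is never consulted.
function previewDeceasedByChildSketch(
  people: PersonLite[],
  links: ParentLink[],
  cutoffYear: number,
): Candidate[] {
  const byId = new Map<string, PersonLite>();
  for (const p of people) byId.set(p.id, p);

  const found = new Map<string, Candidate>();
  for (const link of links) {
    const parent = byId.get(link.parentId);
    const child = byId.get(link.childId);
    if (!parent || !child || parent.deceased) continue;
    if (child.birthYear === null || child.birthYear > cutoffYear) continue;
    const prev = found.get(parent.id);
    // Assumption: keep the earliest qualifying child birth year.
    if (!prev || child.birthYear < prev.child_birth_year) {
      found.set(parent.id, {
        person_id: parent.id,
        name: parent.name,
        child_birth_year: child.birthYear,
      });
    }
  }
  return Array.from(found.values());
}
```

Because the apply step reuses the existing mark-deceased endpoint, a sketch like this only needs to produce candidates; nothing here writes data.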
@@ -135,7 +135,6 @@ export default function PersonDetailPage() {
const [evType, setEvType] = useState("birth");
const [evTypeOther, setEvTypeOther] = useState("");
const [evSpouse, setEvSpouse] = useState(""); // partner for a partnership event
const [allEvents, setAllEvents] = useState<Event[]>([]); // tree-wide, for partnership events
const [dateQual, setDateQual] = useState("exact");
const [dateDay, setDateDay] = useState("");
const [dateMonth, setDateMonth] = useState("");
@@ -189,8 +188,9 @@ export default function PersonDetailPage() {
return;
}
setPerson(p.data ?? null);
const [all, nm, mine, tr, ev, rl, src, cit, evAll, med] = await Promise.all([
api.GET("/api/v1/trees/{tree_id}/persons", { params: { path: { tree_id: treeId } } }),
// Person-scoped fetches only — the page no longer pulls the whole tree.
// /persons/{id}/events now includes this person's partnership events too.
const [nm, mine, tr, ev, rl, src, cit, med] = await Promise.all([
api.GET("/api/v1/trees/{tree_id}/persons/{person_id}/names", {
params: { path: { tree_id: treeId, person_id: personId } },
}),
@@ -204,22 +204,49 @@ export default function PersonDetailPage() {
}),
api.GET("/api/v1/trees/{tree_id}/sources", { params: { path: { tree_id: treeId } } }),
api.GET("/api/v1/trees/{tree_id}/citations", { params: { path: { tree_id: treeId } } }),
api.GET("/api/v1/trees/{tree_id}/events", { params: { path: { tree_id: treeId } } }),
api.GET("/api/v1/trees/{tree_id}/media", { params: { path: { tree_id: treeId } } }),
]);
setPeople(all.data ?? []);
setNames(nm.data ?? []);
setMe(mine.data ?? null);
setTree(tr.data ?? null);
setEvents(ev.data ?? []);
setAllEvents(evAll.data ?? []);
setMedia(med.data ?? []);
setRels(rl.data ?? []);
setSources(src.data ?? []);
setCitations(cit.data ?? []);
// Resolve the names of just this person's relatives (for display), by id —
// not the whole tree. The relationship/spouse pickers search on demand.
const relList = rl.data ?? [];
const relatedIds = Array.from(
new Set(
relList
.flatMap((r) => [r.person_from_id, r.person_to_id])
.filter((id): id is string => !!id && id !== personId),
),
);
if (relatedIds.length) {
const rel = await api.GET("/api/v1/trees/{tree_id}/persons", {
params: { path: { tree_id: treeId }, query: { ids: relatedIds.join(",") } },
});
setPeople(rel.data ?? []);
} else {
setPeople([]);
}
setReady(true);
}, [router, treeId, personId]);
// Server-side fuzzy search for the relative/spouse pickers — avoids loading
// every person just to search.
const searchPeople = useCallback(
async (query: string) => {
const r = await api.GET("/api/v1/trees/{tree_id}/persons", {
params: { path: { tree_id: treeId }, query: { q: query } },
});
return (r.data ?? []).filter((pp) => pp.id !== personId);
},
[treeId, personId],
);
useEffect(() => {
load();
}, [load]);
@@ -233,7 +260,6 @@ export default function PersonDetailPage() {
return (id: string) => m.get(id) ?? "source";
}, [sources]);
const others = people.filter((p) => p.id !== personId);
const parents = rels.filter((r) => r.type === "parent_child" && r.person_to_id === personId);
const children = rels.filter((r) => r.type === "parent_child" && r.person_from_id === personId);
const partners = rels.filter((r) => r.type === "partnership");
@@ -241,22 +267,18 @@ export default function PersonDetailPage() {
const eventCites = (id: string) => citations.filter((c) => c.event_id === id);
const personCites = citations.filter((c) => c.person_id === personId);
// Partnership events live on the relationship and show on both partners.
// Partnership events live on the relationship and show on both partners; the
// /persons/{id}/events endpoint now returns them alongside personal events.
const myPartnerRels = rels.filter(
(r) => r.type === "partnership" && (r.person_from_id === personId || r.person_to_id === personId),
);
const myPartnerRelIds = new Set(myPartnerRels.map((r) => r.id));
const relEvents = allEvents.filter(
(e) => e.relationship_id && myPartnerRelIds.has(e.relationship_id),
);
const spouseOfRelEvent = (relId: string | null | undefined) => {
const r = myPartnerRels.find((x) => x.id === relId);
if (!r) return null;
return r.person_from_id === personId ? r.person_to_id : r.person_from_id;
};
const isPartnershipType = (t: string) => PARTNERSHIP_EVENTS.includes(t);
// Personal events + this person's partnership events, shown together.
const shownEvents = [...events, ...relEvents];
const shownEvents = events;
async function addEvent(e: React.FormEvent) {
e.preventDefault();
@@ -1090,7 +1112,7 @@ export default function PersonDetailPage() {
<label className="flex flex-col gap-1">
<span className="text-xs text-[var(--muted)]">Spouse / partner</span>
<PersonCombobox
people={others}
onSearch={searchPeople}
value={evSpouse}
onChange={setEvSpouse}
placeholder="Search for a spouse…"
@@ -1158,36 +1180,32 @@ export default function PersonDetailPage() {
</div>
)}
{others.length === 0 ? (
<p className="text-sm text-[var(--muted)]">Add more people to the tree to link them.</p>
) : (
<form onSubmit={addRel} className="flex flex-wrap items-center gap-2">
<span className="text-sm text-[var(--muted)]">Add</span>
<select className={fieldCls} value={relKind} onChange={(e) => setRelKind(e.target.value as typeof relKind)}>
<option value="parent">parent</option>
<option value="child">child</option>
<option value="partner">partner</option>
<option value="sibling">sibling</option>
<form onSubmit={addRel} className="flex flex-wrap items-center gap-2">
<span className="text-sm text-[var(--muted)]">Add</span>
<select className={fieldCls} value={relKind} onChange={(e) => setRelKind(e.target.value as typeof relKind)}>
<option value="parent">parent</option>
<option value="child">child</option>
<option value="partner">partner</option>
<option value="sibling">sibling</option>
</select>
<PersonCombobox
onSearch={searchPeople}
value={relOther}
onChange={setRelOther}
onCreate={createRelativeAndGo}
placeholder="Search, or type a new name…"
/>
{(relKind === "parent" || relKind === "child") && (
<select className={fieldCls} value={relQual} onChange={(e) => setRelQual(e.target.value as Qualifier)}>
{QUALIFIERS.map((q) => (
<option key={q} value={q}>
{q}
</option>
))}
</select>
<PersonCombobox
people={others}
value={relOther}
onChange={setRelOther}
onCreate={createRelativeAndGo}
placeholder="Search, or type a new name…"
/>
{(relKind === "parent" || relKind === "child") && (
<select className={fieldCls} value={relQual} onChange={(e) => setRelQual(e.target.value as Qualifier)}>
{QUALIFIERS.map((q) => (
<option key={q} value={q}>
{q}
</option>
))}
</select>
)}
<Button type="submit">Link</Button>
</form>
)}
)}
<Button type="submit">Link</Button>
</form>
{relErr && <p className="text-sm text-red-600">{relErr}</p>}
</CardContent>
</Card>
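The "resolve only this person's relatives" step in `load()` can be isolated as a small pure helper. The sketch below is illustrative: `RelLite`, `relatedPersonIds`, and `idsQueryValue` are hypothetical names, and the comma-joined `ids` value simply mirrors the `relatedIds.join(",")` call in the diff.

```typescript
// Hypothetical minimal relationship shape; the real type comes from the API schema.
interface RelLite {
  person_from_id?: string | null;
  person_to_id?: string | null;
}

// Collect the distinct ids on either end of each relationship, skipping
// empty ids and the person whose page is being shown.
function relatedPersonIds(rels: RelLite[], selfId: string): string[] {
  const seen = new Set<string>();
  for (const r of rels) {
    for (const id of [r.person_from_id, r.person_to_id]) {
      if (id && id !== selfId) seen.add(id);
    }
  }
  return Array.from(seen);
}

// Build the `ids` query value, or undefined when there is nothing to fetch
// (the page sets an empty people list in that case instead of calling the API).
function idsQueryValue(ids: string[]): string | undefined {
  return ids.length > 0 ? ids.join(",") : undefined;
}
```

Keeping this step pure makes the deduplication and self-exclusion easy to check without mounting the page.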
+21 -2
@@ -1,7 +1,10 @@
.f3 {
--female-color: rgb(196, 138, 146);
--male-color: rgb(120, 159, 172);
--genderless-color: lightgray;
/* Warm mid-gray for unset-sex / redacted "Living person" cards: matches the
muted male/female tone weight and the brand palette, instead of the library's
washed-out lightgray. */
--genderless-color: rgb(156, 150, 143);
--background-color: rgb(33, 33, 33);
--text-color: #fff;
@@ -381,9 +384,25 @@
color: rgb(255, 251, 220);
background-color: rgba(255, 251, 220, 0);
border-radius: 50%;
padding: 2px;
padding: 2px 4px;
font-weight: 600;
cursor: pointer;
transition: color 0.2s ease-in-out, background-color 0.2s ease-in-out;
}
.f3 .f3-card-duplicate-tag:hover {
background-color: rgba(255, 251, 220, 0.9);
color: #000;
}
/* Click the ×N badge → every copy of that person flashes (see tree/page.tsx). */
@keyframes f3-card-flash {
0%, 100% { outline-color: rgba(160, 106, 66, 0); }
30%, 70% { outline-color: rgba(160, 106, 66, 1); }
}
.f3 .f3-card-flash .card-inner {
outline: 4px solid rgba(160, 106, 66, 1);
animation: f3-card-flash 0.55s ease-in-out 3;
}
.f3 .f3-card-duplicate-hover div.card-inner {
transform: translate(0, -2px);
+129 -6
@@ -36,6 +36,12 @@ export default function TreePage() {
const containerRef = useRef<HTMLDivElement>(null);
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const chartRef = useRef<any>(null);
// family-chart's pan/zoom helpers (cardToMiddle, getCurrentZoom), captured at
// render — used to fly to a duplicate's other copy.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const handlersRef = useRef<any>(null);
// Per-person cursor so repeated clicks on a ×N badge cycle through the copies.
const dupCycle = useRef<Map<string, number>>(new Map());
const [query, setQuery] = useState("");
const [people, setPeople] = useState<Person[]>([]);
@@ -179,7 +185,9 @@ export default function TreePage() {
"first name": fn || "Unnamed",
"last name": ln,
birthday: years.get(pp.id) ?? "",
gender: pp.gender === "female" ? "F" : "M",
// male → blue, female → pink, unset → genderless (gray). Unset sex no
// longer defaults to male/blue (which was misleading).
gender: pp.gender === "male" ? "M" : pp.gender === "female" ? "F" : null,
},
rels: {
spouses: ok(partnersOf(pp.id), pp.id),
@@ -189,6 +197,7 @@ export default function TreePage() {
};
});
const f3 = await import("family-chart");
handlersRef.current = f3.handlers;
if (cancelled || !containerRef.current) return;
try {
containerRef.current.innerHTML = "";
@@ -252,6 +261,85 @@ export default function TreePage() {
[mode],
);
// Click the "×N" duplicate badge to FLY to the person's other copy in the
// view (cycling through them on repeat clicks) and flash it on arrival. The
// same record is drawn in two places (a shared ancestor, or an intermarriage),
// and on a big tree the other copy is usually off-screen. Delegated on the
// container so it survives chart rebuilds; capture-phase + stopPropagation so a
// badge click flies instead of recentering.
useEffect(() => {
const el = containerRef.current;
if (!el) return;
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const data = (n: Element | null) => (n as any)?.__data__;
const idOf = (n: Element | null) => data(n)?.data?.id as string | undefined;
const xyOf = (cont: Element): { x: number; y: number } | null => {
const d = data(cont);
if (d && typeof d.x === "number" && typeof d.y === "number") return { x: d.x, y: d.y };
const m = /translate\(\s*([-\d.]+)[ ,]+([-\d.]+)/.exec(cont.getAttribute("transform") ?? "");
return m ? { x: parseFloat(m[1]), y: parseFloat(m[2]) } : null;
};
const flash = (cont: Element | null) => {
const card = cont?.querySelector(".card") as HTMLElement | null;
if (!card) return;
card.classList.remove("f3-card-flash");
void card.offsetWidth; // restart the animation on repeat clicks
card.classList.add("f3-card-flash");
window.setTimeout(() => card.classList.remove("f3-card-flash"), 1900);
};
function onClick(e: MouseEvent) {
const tag = (e.target as HTMLElement).closest?.(".f3-card-duplicate-tag");
if (!tag) return;
e.preventDefault();
e.stopPropagation();
const clicked = tag.closest(".card_cont");
const id = idOf(clicked);
if (!id) return;
const copies = Array.from(el!.querySelectorAll(".card_cont")).filter((c) => idOf(c) === id);
if (copies.length < 2) {
flash(clicked);
return;
}
// Advance from wherever we last landed (or the clicked card), skipping the
// clicked copy, so each click moves to the next other location.
const start = dupCycle.current.get(id) ?? copies.indexOf(clicked as Element);
let next = (start + 1) % copies.length;
if (copies[next] === clicked) next = (next + 1) % copies.length;
dupCycle.current.set(id, next);
const target = copies[next];
const handlers = handlersRef.current;
const svg = el!.querySelector("svg.main_svg") as SVGSVGElement | null;
const xy = xyOf(target);
let flew = false;
if (handlers?.cardToMiddle && svg && xy) {
try {
const rect = svg.getBoundingClientRect();
const scale = handlers.getCurrentZoom ? handlers.getCurrentZoom(svg).k : 1;
// cardToMiddle centers the datum at the current zoom. (Its vertical
// centering at non-1 zoom is fixed in our family-chart patch — see
// CLAUDE.md / upstream PR donatso/family-chart#103 — so we pass the
// raw y; do NOT pre-scale it here or it double-corrects.)
handlers.cardToMiddle({
datum: xy,
svg,
svg_dim: { width: rect.width, height: rect.height },
scale,
transition_time: 750,
});
flew = true;
} catch {
/* zoom not ready — fall back to flashing in place */
}
}
// Flash on arrival (after the fly), or immediately if we couldn't fly.
window.setTimeout(() => flash(target), flew ? 900 : 0);
}
el.addEventListener("click", onClick, true);
return () => el.removeEventListener("click", onClick, true);
}, []);
// Mirror the focused person into the URL (?focus=…) so navigating away and
// back — or sharing the link — keeps the tree centered where you left it.
// `replace` (not push) so each recenter doesn't pile up in browser history.
@@ -402,11 +490,46 @@ export default function TreePage() {
/>
)}
<p className="text-sm text-[var(--muted)]">
{mode === "fan"
? "Click an ancestor to recenter the fan."
: "Drag to pan · scroll to zoom · click a person to recenter."}
</p>
<div className="flex flex-wrap items-center gap-x-3 gap-y-1 text-sm text-[var(--muted)]">
<span>
{mode === "fan"
? "Click an ancestor to recenter the fan."
: "Drag to pan · scroll to zoom · click a person to recenter."}
</span>
{mode !== "fan" && (
<div className="group relative">
<button
type="button"
className="underline decoration-dotted underline-offset-2 hover:text-bronze focus-visible:text-bronze focus-visible:outline-none"
>
Legend
</button>
<div className="invisible absolute bottom-full left-0 z-30 mb-2 w-80 rounded-lg border border-[var(--border)] bg-[var(--surface)] p-3 text-xs text-[var(--foreground)] opacity-0 shadow-lg transition-opacity group-hover:visible group-hover:opacity-100 group-focus-within:visible group-focus-within:opacity-100">
<ul className="space-y-2">
<li>
<span className="font-semibold text-bronze">×N</span> on a card: this
person appears N times in the current view. The same record is drawn in
two places because they connect through more than one line (a shared
ancestor, or an intermarriage).{" "}
<span className="text-[var(--muted)]">Click the ×N to fly to the other copies (click again to cycle).</span>
</li>
<li className="flex flex-wrap items-center gap-x-2 gap-y-1">
<span className="inline-flex items-center gap-1">
<span className="inline-block h-2.5 w-2.5 rounded-sm" style={{ background: "rgb(120,159,172)" }} /> male
</span>
<span className="inline-flex items-center gap-1">
<span className="inline-block h-2.5 w-2.5 rounded-sm" style={{ background: "rgb(196,138,146)" }} /> female
</span>
<span className="inline-flex items-center gap-1">
<span className="inline-block h-2.5 w-2.5 rounded-sm" style={{ background: "rgb(156,150,143)" }} /> sex not set
</span>
</li>
<li>Drag to pan, scroll to zoom, and click any card to recenter the tree on that person.</li>
</ul>
</div>
</div>
)}
</div>
</div>
);
}
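The cycling rule used by the ×N badge handler (advance from the last landing spot, skip the clicked copy) can be expressed as a pure function. `nextCopyIndex` is a hypothetical name for this sketch; it mirrors the arithmetic in `onClick` and assumes at least two copies, which the handler checks before reaching this step.

```typescript
// count:        number of rendered copies of the person (assumed >= 2)
// clickedIndex: index of the copy whose badge was clicked
// lastIndex:    index landed on by the previous click, if any
function nextCopyIndex(
  count: number,
  clickedIndex: number,
  lastIndex: number | undefined,
): number {
  const start = lastIndex ?? clickedIndex;
  let next = (start + 1) % count;
  // Never "fly" to the card the user just clicked; move one further.
  if (next === clickedIndex) next = (next + 1) % count;
  return next;
}
```

With three copies and repeated clicks on copy 0, the sequence of targets is 1, 2, 1, 2, and so on; with two copies every click lands on the other one.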
+35 -2
@@ -53,6 +53,26 @@ export default function TreesPage() {
await api.POST("/api/v1/trees/{tree_id}/restore", { params: { path: { tree_id: id } } });
load();
}
async function purge(id: string, treeName: string) {
const typed = window.prompt(
`Permanently delete "${treeName}" and ALL its data (people, sources, media, …)?\n\n` +
"This CANNOT be undone. Type the tree name to confirm:",
);
if (typed == null) return; // cancelled
const { error, response } = await api.POST("/api/v1/trees/{tree_id}/purge", {
params: { path: { tree_id: id } },
body: { confirm_name: typed },
});
if (error) {
window.alert(
response.status === 403
? "The name didn't match — nothing was deleted."
: "Couldn't purge that tree.",
);
return;
}
load();
}
// Optimistic visibility change so the dropdown reflects the pick immediately.
async function setVisibility(id: string, visibility: NonNullable<Tree["visibility"]>) {
setTrees((cur) => cur.map((t) => (t.id === id ? { ...t, visibility } : t)));
@@ -139,15 +159,28 @@ export default function TreesPage() {
<h2 className="font-serif text-base font-semibold text-[var(--muted)]">
Recently deleted
</h2>
<p className="text-xs text-[var(--muted)]">
Restorable for 30 days, after which they&apos;re purged automatically. Use
Delete forever to purge one now.
</p>
<ul className="space-y-2">
{deleted.map((tree) => (
<li key={tree.id}>
<Card>
<CardContent className="flex items-center justify-between p-4">
<span className="text-[var(--muted)]">{tree.name}</span>
<CardContent className="flex items-center justify-between gap-2 p-4">
<span className="min-w-0 flex-1 truncate text-[var(--muted)]">{tree.name}</span>
<Button variant="outline" size="sm" onClick={() => restore(tree.id)}>
Restore
</Button>
<Button
variant="outline"
size="sm"
onClick={() => purge(tree.id, tree.name)}
className="border-bronze/40 text-bronze hover:bg-bronze/10"
title="Permanently delete this tree and all its data"
>
Delete forever
</Button>
</CardContent>
</Card>
</li>
+21 -1
@@ -4,6 +4,7 @@ import {
Archive,
ArrowDownUp,
BookText,
Bot,
ClipboardCheck,
Compass,
FolderTree,
@@ -11,6 +12,7 @@ import {
LogOut,
Network,
Settings,
ShieldCheck,
Sparkles,
UserPlus,
Users,
@@ -29,7 +31,11 @@ export function AppSidebar({ onNavigate }: { onNavigate?: () => void }) {
const segs = pathname.split("/").filter(Boolean); // ["trees", "<id>", ...]
const treeId = segs[0] === "trees" && segs[1] ? segs[1] : null;
const [treeName, setTreeName] = useState<string | null>(null);
const [me, setMe] = useState<{ display_name: string | null; email: string } | null>(null);
const [me, setMe] = useState<{
display_name: string | null;
email: string;
is_instance_owner?: boolean;
} | null>(null);
const [menuOpen, setMenuOpen] = useState(false);
const menuRef = useRef<HTMLDivElement>(null);
@@ -97,6 +103,14 @@ export function AppSidebar({ onNavigate }: { onNavigate?: () => void }) {
<Item href="/trees" label="Trees" icon={FolderTree} active={pathname === "/trees"} />
<Item href="/explore" label="Explore" icon={Compass} active={pathname === "/explore"} />
<Item href="/import" label="Import" icon={ArrowDownUp} active={pathname === "/import"} />
{me?.is_instance_owner && (
<Item
href="/admin"
label="Admin"
icon={ShieldCheck}
active={pathname.startsWith("/admin")}
/>
)}
{treeId && (
<div className="mt-5 flex flex-col gap-1">
@@ -151,6 +165,12 @@ export function AppSidebar({ onNavigate }: { onNavigate?: () => void }) {
icon={UserPlus}
active={pathname.startsWith(`/trees/${treeId}/members`)}
/>
<Item
href={`/trees/${treeId}/ai`}
label="AI models"
icon={Bot}
active={pathname.startsWith(`/trees/${treeId}/ai`)}
/>
<Item
href={`/trees/${treeId}/recovery`}
label="Recovery"
+63 -19
@@ -1,26 +1,30 @@
"use client";
import { useEffect, useMemo, useRef, useState } from "react";
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
import type { components } from "@/lib/api/schema";
type Person = components["schemas"]["PersonRead"];
/**
* A type-to-filter person picker. Shows a text input; as you type, a dropdown
* of matching people appears. Selecting one sets `value` (a person id) and
* fills the input with their name. Replaces a plain <select> when the list is
* long enough that scanning it by hand is painful.
* A type-to-pick person picker. Two modes:
* - client (`people`): filter a preloaded list in the browser.
* - server (`onSearch`): query the backend (debounced) as you type, the
* preferred mode for large trees, so the page doesn't
* have to preload every person just to search.
* Selecting one sets `value` (a person id) and fills the input with their name.
*/
export function PersonCombobox({
people,
onSearch,
value,
onChange,
onCreate,
placeholder = "Search for a person…",
className,
}: {
people: Person[];
people?: Person[];
onSearch?: (q: string) => Promise<Person[]>;
value: string;
onChange: (id: string) => void;
/** When set, the dropdown offers a "Create '<typed name>'" action. */
@@ -30,21 +34,27 @@ export function PersonCombobox({
}) {
const [query, setQuery] = useState("");
const [open, setOpen] = useState(false);
const [results, setResults] = useState<Person[]>([]);
const [loading, setLoading] = useState(false);
const wrapRef = useRef<HTMLDivElement>(null);
// Names we've seen (from the list or search results), so a selected value
// keeps displaying its name even in server mode.
const known = useRef<Map<string, string>>(new Map());
const nameOf = useMemo(
() => new Map(people.map((p) => [p.id, p.primary_name ?? "Unnamed"])),
[people],
);
const remember = useCallback((ps: Person[] | undefined) => {
for (const p of ps ?? []) known.current.set(p.id, p.primary_name ?? "Unnamed");
}, []);
useEffect(() => {
remember(people);
}, [people, remember]);
const nameOf = useCallback((id: string) => known.current.get(id) ?? "", []);
// Keep the input text in sync when the selection changes externally
// (e.g. cleared to "" after a successful add).
useEffect(() => {
if (!value) {
setQuery("");
} else if (!open) {
setQuery(nameOf.get(value) ?? "");
}
if (!value) setQuery("");
else if (!open) setQuery(nameOf(value));
}, [value, open, nameOf]);
// Close on outside click.
@@ -56,17 +66,48 @@ export function PersonCombobox({
return () => document.removeEventListener("mousedown", onDoc);
}, []);
// Server search, debounced. Stale responses are dropped via `cancelled`.
useEffect(() => {
if (!onSearch) return;
const q = query.trim();
if (!q) {
setResults([]);
setLoading(false);
return;
}
setLoading(true);
let cancelled = false;
const t = setTimeout(async () => {
try {
const r = await onSearch(q);
if (cancelled) return;
remember(r);
setResults(r);
} finally {
if (!cancelled) setLoading(false);
}
}, 160);
return () => {
cancelled = true;
clearTimeout(t);
};
}, [query, onSearch, remember]);
const matches = useMemo(() => {
if (onSearch) return results.slice(0, 10);
const q = query.trim().toLowerCase();
const pool = q
? people.filter((p) => (p.primary_name ?? "").toLowerCase().includes(q))
: people;
? (people ?? []).filter((p) => (p.primary_name ?? "").toLowerCase().includes(q))
: (people ?? []);
return pool.slice(0, 10);
}, [query, people]);
}, [query, results, people, onSearch]);
const base =
"h-9 w-56 rounded-md border border-[var(--border)] bg-[var(--surface)] px-2 text-sm placeholder:text-[var(--muted)] focus-visible:border-bronze focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-bronze/40";
const showDropdown =
open && (matches.length > 0 || loading || (onCreate && query.trim()));
return (
<div ref={wrapRef} className="relative">
<input
@@ -80,8 +121,11 @@ export function PersonCombobox({
if (value) onChange(""); // typing invalidates the prior pick
}}
/>
{open && (matches.length > 0 || (onCreate && query.trim())) && (
{showDropdown && (
<ul className="absolute z-30 mt-1 max-h-64 w-72 overflow-auto rounded-lg border border-[var(--border)] bg-[var(--surface)] shadow-lg">
{loading && matches.length === 0 && (
<li className="px-3 py-2 text-sm text-[var(--muted)]">Searching…</li>
)}
{matches.map((p) => (
<li key={p.id}>
<button
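The combobox drops stale search responses with a per-effect `cancelled` flag. Outside React, the same "latest request wins" idea can be sketched with a token counter. This is a different mechanism from the one in the diff, shown only to illustrate the principle; `createLatestGuard` and `searchLatest` are hypothetical names.

```typescript
// Each call to begin() invalidates every token handed out before it.
function createLatestGuard() {
  let current = 0;
  return {
    begin(): number {
      current += 1;
      return current;
    },
    isCurrent(token: number): boolean {
      return token === current;
    },
  };
}

// Run an async search and return its result only if no newer search
// started while it was in flight; otherwise return undefined.
async function searchLatest<T>(
  guard: ReturnType<typeof createLatestGuard>,
  run: () => Promise<T>,
): Promise<T | undefined> {
  const token = guard.begin();
  const result = await run();
  return guard.isCurrent(token) ? result : undefined;
}
```

Debouncing (the 160 ms timer in the component) reduces how many requests are sent; a guard like this only decides which response is allowed to update the results, so the two are complementary.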
+4 -1
@@ -120,7 +120,10 @@ export function PublicTreeChart({
"first name": fn || "Unnamed",
"last name": ln,
birthday: years.get(pp.id) ?? "",
gender: pp.gender === "female" ? "F" : "M",
// male → blue, female → pink, unset/redacted → genderless (gray).
// Redacted living people have null gender, so they render gray rather
// than defaulting to male/blue (and never imply a real person's sex).
gender: pp.gender === "male" ? "M" : pp.gender === "female" ? "F" : null,
},
rels: {
spouses: ok(partnersOf(pp.id), pp.id),
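The same gender mapping now appears inline in both `TreePage` and `PublicTreeChart`. As a sketch, it could be read as one small function; `toChartGender` and `ChartGender` are hypothetical names, not something this diff introduces.

```typescript
type ChartGender = "M" | "F" | null;

// male -> "M", female -> "F", anything else (null, undefined, or any other
// value) -> null, which the chart treats as genderless.
function toChartGender(gender: string | null | undefined): ChartGender {
  if (gender === "male") return "M";
  if (gender === "female") return "F";
  return null;
}
```

The key difference from the old `=== "female" ? "F" : "M"` expression is that there is no fallback to `"M"`: a redacted or unset value stays `null`.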
+299
@@ -293,6 +293,27 @@ export interface paths {
patch?: never;
trace?: never;
};
"/api/v1/trees/{tree_id}/purge": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Purge Tree
* @description Permanently delete a soft-deleted tree and all its data. Irreversible.
* Owner-only; the tree must be in the trash and `confirm_name` must match.
*/
post: operations["purge_tree_api_v1_trees__tree_id__purge_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/api/v1/trees/{tree_id}/persons": {
parameters: {
query?: never;
@@ -697,6 +718,27 @@ export interface paths {
patch?: never;
trace?: never;
};
"/api/v1/trees/{tree_id}/cleanup/deceased-by-child": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Preview Deceased By Child
* @description People with a child born on/before the cutoff, who are necessarily deceased even
* when their own birth date is missing. Apply via POST .../cleanup/deceased.
*/
get: operations["preview_deceased_by_child_api_v1_trees__tree_id__cleanup_deceased_by_child_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/api/v1/trees/{tree_id}/cleanup/gender/preview": {
parameters: {
query?: never;
@@ -1031,6 +1073,44 @@ export interface paths {
patch?: never;
trace?: never;
};
"/api/v1/trees/{tree_id}/ai": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/** Get Ai Policy */
get: operations["get_ai_policy_api_v1_trees__tree_id__ai_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
/** Update Ai Policy */
patch: operations["update_ai_policy_api_v1_trees__tree_id__ai_patch"];
trace?: never;
};
"/api/v1/admin/instance": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Instance Status
* @description Operator dashboard data. Requires the caller to be an instance owner.
*/
get: operations["instance_status_api_v1_admin_instance_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
}
export type webhooks = Record<string, never>;
export interface components {
@@ -1215,11 +1295,30 @@ export interface components {
/** Updated */
updated: number;
};
/** ConfiguredProvider */
ConfiguredProvider: {
/** Name */
name: string;
/** Model */
model: string;
};
/** DeceasedApply */
DeceasedApply: {
/** Person Ids */
person_ids: string[];
};
/** DeceasedByChildCandidate */
DeceasedByChildCandidate: {
/**
* Person Id
* Format: uuid
*/
person_id: string;
/** Name */
name: string;
/** Child Birth Year */
child_birth_year: number;
};
/** DeceasedCandidate */
DeceasedCandidate: {
/**
@@ -1393,6 +1492,25 @@ export interface components {
/** Unmapped Tags */
unmapped_tags: string[];
};
/** InstanceStatus */
InstanceStatus: {
/** Version */
version: string;
/** Env */
env: string;
/** Owner Emails */
owner_emails: string[];
/** Require Email Verification */
require_email_verification: boolean;
/** User Count */
user_count: number;
/** Tree Count */
tree_count: number;
/** Default Llm Provider */
default_llm_provider: string;
/** Ai Providers */
ai_providers: components["schemas"]["ConfiguredProvider"][];
};
/** LoginRequest */
LoginRequest: {
/** Email */
@@ -1886,6 +2004,24 @@ export interface components {
/** Token */
token: string;
};
/** TreeAiPolicyRead */
TreeAiPolicyRead: {
/** Member Provider */
member_provider: string | null;
/** Recommender Provider */
recommender_provider: string | null;
/** Configured Providers */
configured_providers: components["schemas"]["ConfiguredProvider"][];
/** Default Provider */
default_provider: string;
};
/** TreeAiPolicyUpdate */
TreeAiPolicyUpdate: {
/** Member Provider */
member_provider?: string | null;
/** Recommender Provider */
recommender_provider?: string | null;
};
/** TreeCreate */
TreeCreate: {
/** Name */
@@ -1895,6 +2031,11 @@ export interface components {
/** @default private */
visibility?: components["schemas"]["TreeVisibility"];
};
/** TreePurge */
TreePurge: {
/** Confirm Name */
confirm_name: string;
};
/** TreeRead */
TreeRead: {
/**
@@ -1955,6 +2096,11 @@ export interface components {
* Format: date-time
*/
created_at: string;
/**
* Is Instance Owner
* @default false
*/
is_instance_owner?: boolean;
};
/** UserSelfPersonUpdate */
UserSelfPersonUpdate: {
@@ -2568,11 +2714,45 @@ export interface operations {
};
};
};
purge_tree_api_v1_trees__tree_id__purge_post: {
parameters: {
query?: never;
header?: never;
path: {
tree_id: string;
};
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["TreePurge"];
};
};
responses: {
/** @description Successful Response */
204: {
headers: {
[name: string]: unknown;
};
content?: never;
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
list_persons_api_v1_trees__tree_id__persons_get: {
parameters: {
query?: {
deleted?: boolean;
q?: string | null;
ids?: string | null;
};
header?: never;
path: {
@@ -3866,6 +4046,39 @@ export interface operations {
};
};
};
preview_deceased_by_child_api_v1_trees__tree_id__cleanup_deceased_by_child_get: {
parameters: {
query?: {
born_on_or_before?: number;
};
header?: never;
path: {
tree_id: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["DeceasedByChildCandidate"][];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
preview_gender_api_v1_trees__tree_id__cleanup_gender_preview_post: {
parameters: {
query?: never;
@@ -4651,4 +4864,90 @@ export interface operations {
};
};
};
get_ai_policy_api_v1_trees__tree_id__ai_get: {
parameters: {
query?: never;
header?: never;
path: {
tree_id: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["TreeAiPolicyRead"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
update_ai_policy_api_v1_trees__tree_id__ai_patch: {
parameters: {
query?: never;
header?: never;
path: {
tree_id: string;
};
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["TreeAiPolicyUpdate"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["TreeAiPolicyRead"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
instance_status_api_v1_admin_instance_get: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["InstanceStatus"];
};
};
};
};
}
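For reviewers, a minimal sketch of how a client might call the new preview operation typed above. The helper names (`deceasedByChildUrl`, `previewDeceasedByChild`) and the `baseUrl` argument are hypothetical; only the path, the `born_on_or_before` query parameter, and the response fields come from the generated types. Authentication is left out because the schema shown here does not describe it.

```typescript
// Hypothetical client helper, not part of the generated file.
// The response shape mirrors components["schemas"]["DeceasedByChildCandidate"].
interface DeceasedByChildCandidate {
  person_id: string;
  name: string;
  child_birth_year: number;
}

// Build the preview URL. When the cutoff is omitted, the query string is left
// off so the server applies its own default (1900 per the OpenAPI schema).
function deceasedByChildUrl(
  baseUrl: string,
  treeId: string,
  bornOnOrBefore?: number,
): string {
  const path = `${baseUrl}/api/v1/trees/${encodeURIComponent(treeId)}/cleanup/deceased-by-child`;
  return bornOnOrBefore === undefined
    ? path
    : `${path}?born_on_or_before=${bornOnOrBefore}`;
}

// Fetch the preview list. Assumes a runtime with a global fetch
// (browsers, Node 18+); credentials handling is intentionally omitted.
async function previewDeceasedByChild(
  baseUrl: string,
  treeId: string,
  bornOnOrBefore?: number,
): Promise<DeceasedByChildCandidate[]> {
  const res = await fetch(deceasedByChildUrl(baseUrl, treeId, bornOnOrBefore));
  if (!res.ok) {
    throw new Error(`deceased-by-child preview failed: HTTP ${res.status}`);
  }
  return (await res.json()) as DeceasedByChildCandidate[];
}
```

The selected `person_id` values would then go to the existing `POST .../cleanup/deceased` apply path, as the commit message describes.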
+425 -1
@@ -710,6 +710,53 @@
}
}
},
"/api/v1/trees/{tree_id}/purge": {
"post": {
"tags": [
"trees"
],
"summary": "Purge Tree",
"description": "Permanently delete a soft-deleted tree and all its data \u2014 irreversible.\nOwner-only; the tree must be in the trash and `confirm_name` must match.",
"operationId": "purge_tree_api_v1_trees__tree_id__purge_post",
"parameters": [
{
"name": "tree_id",
"in": "path",
"required": true,
"schema": {
"type": "string",
"format": "uuid",
"title": "Tree Id"
}
}
],
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/TreePurge"
}
}
}
},
"responses": {
"204": {
"description": "Successful Response"
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/api/v1/trees/{tree_id}/persons": {
"post": {
"tags": [
@@ -804,6 +851,22 @@
],
"title": "Q"
}
},
{
"name": "ids",
"in": "query",
"required": false,
"schema": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Ids"
}
}
],
"responses": {
@@ -2810,6 +2873,64 @@
}
}
},
"/api/v1/trees/{tree_id}/cleanup/deceased-by-child": {
"get": {
"tags": [
"cleanup"
],
"summary": "Preview Deceased By Child",
"description": "People with a child born on/before the cutoff \u2014 necessarily deceased even\nwhen their own birth date is missing. Apply via POST .../cleanup/deceased.",
"operationId": "preview_deceased_by_child_api_v1_trees__tree_id__cleanup_deceased_by_child_get",
"parameters": [
{
"name": "tree_id",
"in": "path",
"required": true,
"schema": {
"type": "string",
"format": "uuid",
"title": "Tree Id"
}
},
{
"name": "born_on_or_before",
"in": "query",
"required": false,
"schema": {
"type": "integer",
"default": 1900,
"title": "Born On Or Before"
}
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"$ref": "#/components/schemas/DeceasedByChildCandidate"
},
"title": "Response Preview Deceased By Child Api V1 Trees Tree Id Cleanup Deceased By Child Get"
}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/api/v1/trees/{tree_id}/cleanup/gender/preview": {
"post": {
"tags": [
@@ -4093,6 +4214,122 @@
}
}
}
},
"/api/v1/trees/{tree_id}/ai": {
"get": {
"tags": [
"ai"
],
"summary": "Get Ai Policy",
"operationId": "get_ai_policy_api_v1_trees__tree_id__ai_get",
"parameters": [
{
"name": "tree_id",
"in": "path",
"required": true,
"schema": {
"type": "string",
"format": "uuid",
"title": "Tree Id"
}
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/TreeAiPolicyRead"
}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
},
"patch": {
"tags": [
"ai"
],
"summary": "Update Ai Policy",
"operationId": "update_ai_policy_api_v1_trees__tree_id__ai_patch",
"parameters": [
{
"name": "tree_id",
"in": "path",
"required": true,
"schema": {
"type": "string",
"format": "uuid",
"title": "Tree Id"
}
}
],
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/TreeAiPolicyUpdate"
}
}
}
},
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/TreeAiPolicyRead"
}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/api/v1/admin/instance": {
"get": {
"tags": [
"admin"
],
"summary": "Instance Status",
"description": "Operator dashboard data. Requires the caller to be an instance owner.",
"operationId": "instance_status_api_v1_admin_instance_get",
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/InstanceStatus"
}
}
}
}
}
}
}
},
"components": {
@@ -4683,6 +4920,24 @@
],
"title": "CleanupResult"
},
"ConfiguredProvider": {
"properties": {
"name": {
"type": "string",
"title": "Name"
},
"model": {
"type": "string",
"title": "Model"
}
},
"type": "object",
"required": [
"name",
"model"
],
"title": "ConfiguredProvider"
},
"DeceasedApply": {
"properties": {
"person_ids": {
@@ -4700,6 +4955,30 @@
],
"title": "DeceasedApply"
},
"DeceasedByChildCandidate": {
"properties": {
"person_id": {
"type": "string",
"format": "uuid",
"title": "Person Id"
},
"name": {
"type": "string",
"title": "Name"
},
"child_birth_year": {
"type": "integer",
"title": "Child Birth Year"
}
},
"type": "object",
"required": [
"person_id",
"name",
"child_birth_year"
],
"title": "DeceasedByChildCandidate"
},
"DeceasedCandidate": {
"properties": {
"person_id": {
@@ -5287,6 +5566,60 @@
],
"title": "ImportReport"
},
"InstanceStatus": {
"properties": {
"version": {
"type": "string",
"title": "Version"
},
"env": {
"type": "string",
"title": "Env"
},
"owner_emails": {
"items": {
"type": "string"
},
"type": "array",
"title": "Owner Emails"
},
"require_email_verification": {
"type": "boolean",
"title": "Require Email Verification"
},
"user_count": {
"type": "integer",
"title": "User Count"
},
"tree_count": {
"type": "integer",
"title": "Tree Count"
},
"default_llm_provider": {
"type": "string",
"title": "Default Llm Provider"
},
"ai_providers": {
"items": {
"$ref": "#/components/schemas/ConfiguredProvider"
},
"type": "array",
"title": "Ai Providers"
}
},
"type": "object",
"required": [
"version",
"env",
"owner_emails",
"require_email_verification",
"user_count",
"tree_count",
"default_llm_provider",
"ai_providers"
],
"title": "InstanceStatus"
},
"LoginRequest": {
"properties": {
"email": {
@@ -6812,6 +7145,79 @@
],
"title": "TokenRequest"
},
"TreeAiPolicyRead": {
"properties": {
"member_provider": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Member Provider"
},
"recommender_provider": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Recommender Provider"
},
"configured_providers": {
"items": {
"$ref": "#/components/schemas/ConfiguredProvider"
},
"type": "array",
"title": "Configured Providers"
},
"default_provider": {
"type": "string",
"title": "Default Provider"
}
},
"type": "object",
"required": [
"member_provider",
"recommender_provider",
"configured_providers",
"default_provider"
],
"title": "TreeAiPolicyRead"
},
"TreeAiPolicyUpdate": {
"properties": {
"member_provider": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Member Provider"
},
"recommender_provider": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Recommender Provider"
}
},
"type": "object",
"title": "TreeAiPolicyUpdate"
},
"TreeCreate": {
"properties": {
"name": {
@@ -6840,6 +7246,19 @@
],
"title": "TreeCreate"
},
"TreePurge": {
"properties": {
"confirm_name": {
"type": "string",
"title": "Confirm Name"
}
},
"type": "object",
"required": [
"confirm_name"
],
"title": "TreePurge"
},
"TreeRead": {
"properties": {
"id": {
@@ -7009,6 +7428,11 @@
"type": "string",
"format": "date-time",
"title": "Created At"
},
"is_instance_owner": {
"type": "boolean",
"title": "Is Instance Owner",
"default": false
}
},
"type": "object",
@@ -7081,4 +7505,4 @@
}
}
}
}
}
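A small sketch of the request the new purge operation expects, assuming a client that builds its own `fetch` arguments. `purgeRequest` and `baseUrl` are hypothetical names; the path, the `POST` method, and the `confirm_name` body field are taken from the schema above. Per the operation description, the server only accepts this for a tree that is already in the trash and whose name matches `confirm_name`.

```typescript
// Mirrors components.schemas.TreePurge from the OpenAPI document.
interface TreePurge {
  confirm_name: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: prepares (but does not send) the purge request.
// A successful call returns 204 with no body, so callers should check
// the status code rather than parse JSON.
function purgeRequest(
  baseUrl: string,
  treeId: string,
  confirmName: string,
): PreparedRequest {
  const body: TreePurge = { confirm_name: confirmName };
  return {
    url: `${baseUrl}/api/v1/trees/${encodeURIComponent(treeId)}/purge`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

Keeping the request construction separate from the network call makes the irreversible step easy to unit test without touching a real tree.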
+20 -2
@@ -1,5 +1,5 @@
diff --git a/node_modules/family-chart/dist/family-chart.esm.js b/node_modules/family-chart/dist/family-chart.esm.js
index 3867be0..560c99e 100644
index 3867be0..656fafa 100644
--- a/node_modules/family-chart/dist/family-chart.esm.js
+++ b/node_modules/family-chart/dist/family-chart.esm.js
@@ -10,10 +10,10 @@ function sortChildrenWithSpouses(children, datum, data) {
@@ -61,8 +61,17 @@ index 3867be0..560c99e 100644
if (!d.spouses)
d.spouses = [];
d.spouses.push(spouse);
@@ -1073,7 +1091,7 @@ function calculateTreeFit(svg_dim, tree_dim) {
return { k, x, y };
}
function cardToMiddle({ datum, svg, svg_dim, scale, transition_time }) {
- const k = scale || 1, x = svg_dim.width / 2 - datum.x * k, y = svg_dim.height / 2 - datum.y, t = { k, x: x / k, y: y / k };
+ const k = scale || 1, x = svg_dim.width / 2 - datum.x * k, y = svg_dim.height / 2 - datum.y * k, t = { k, x: x / k, y: y / k };
positionTree({ t, svg, transition_time });
}
function manualZoom({ amount, svg, transition_time = 500 }) {
diff --git a/node_modules/family-chart/dist/family-chart.js b/node_modules/family-chart/dist/family-chart.js
index 1c750d4..47efcc2 100644
index 1c750d4..edeb804 100644
--- a/node_modules/family-chart/dist/family-chart.js
+++ b/node_modules/family-chart/dist/family-chart.js
@@ -33,10 +33,9 @@
@@ -116,3 +125,12 @@ index 1c750d4..47efcc2 100644
if (!d.spouses)
d.spouses = [];
d.spouses.push(spouse);
@@ -1096,7 +1106,7 @@
return { k, x, y };
}
function cardToMiddle({ datum, svg, svg_dim, scale, transition_time }) {
- const k = scale || 1, x = svg_dim.width / 2 - datum.x * k, y = svg_dim.height / 2 - datum.y, t = { k, x: x / k, y: y / k };
+ const k = scale || 1, x = svg_dim.width / 2 - datum.x * k, y = svg_dim.height / 2 - datum.y * k, t = { k, x: x / k, y: y / k };
positionTree({ t, svg, transition_time });
}
function manualZoom({ amount, svg, transition_time = 500 }) {
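To illustrate why the `cardToMiddle` hunk multiplies `datum.y` by `k`, here is a standalone sketch of the centering arithmetic. It assumes the usual translate-then-scale convention (screen = translate + k * point); `centerOn` and `project` are hypothetical names, and the sketch does not reproduce how `positionTree` consumes its `t` argument.

```typescript
interface Point { x: number; y: number }
interface Dim { width: number; height: number }
interface Transform { k: number; x: number; y: number }

// Translation that puts `datum` at the middle of the viewport at scale k.
// Both axes must subtract the scaled coordinate; the unpatched code
// scaled only x, so y was off whenever k differed from 1.
function centerOn(datum: Point, dim: Dim, scale?: number): Transform {
  const k = scale || 1;
  return {
    k,
    x: dim.width / 2 - datum.x * k,
    y: dim.height / 2 - datum.y * k,
  };
}

// Where a tree-space point lands on screen under the assumed convention.
function project(t: Transform, p: Point): Point {
  return { x: t.x + t.k * p.x, y: t.y + t.k * p.y };
}

const datum: Point = { x: 100, y: 50 };
const dim: Dim = { width: 800, height: 600 };
const t = centerOn(datum, dim, 2);
const onScreen = project(t, datum); // { x: 400, y: 300 }, the viewport centre
```

With the unpatched formula, `y` would be `300 - 50 = 250`, and the card would land at `250 + 2 * 50 = 350`, which is 50 px below centre at `k = 2`.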