# Design note: tree visibility & the public viewing surface

Status: **Shipped (#41-#51)**. Owner: Justin. Created 2026-06-09.

This is a privacy-critical change (it created the first anonymous read surface in Provenance). Per CLAUDE.md, it was designed before code and shipped in small, individually-reviewable PRs, with tests on the privacy engine and the public read path landing before any anonymous endpoint was exposed.

## 1. The model

Visibility flattens **two axes**, *who may read* and *how discoverable*, into one ordered enum for the UI:

| Level | Anonymous (no login) | Any logged-in user | Tree members | In-app directory | Search-indexed |
|---|---|---|---|---|---|
| `public`: anyone on the web | ✅ view¹ | ✅ view¹ | ✅ full | ✅ listed to everyone | ✅ sitemap + indexable |
| `site_members`: Public – Site Members | ❌ | ✅ view¹ | ✅ full | ✅ listed to logged-in users | ❌ (`noindex`) |
| `unlisted`: anyone with the link | ✅ via direct link¹ | ✅ via link¹ | ✅ full | ❌ never listed | ❌ (`noindex`) |
| `private` | ❌ | ❌ | ✅ full | ❌ | ❌ |

¹ **Every non-member view passes through the privacy engine.** Living people are redacted, and per-person `private` hides / `public` reveals, exactly as `person_visibility()` already does (`backend/app/services/privacy.py:100-110`). This is the single enforcement point: no public code path may issue a raw query.

Decisions captured (2026-06-09):

- **Unlisted** = anyone with the link, no account required. The link must be **unguessable** (the tree UUID is already non-enumerable; do not add a public integer id). Unlisted trees are excluded from the directory and sitemap and served `noindex`.
- **Public** discovery for v1 includes **an in-app public browse/search**, not just search-engine indexing.
- **Public – Site Members** = *any* registered account on this instance (not an invite list; that is already tree membership / `private`).

## 2. Data model

`TreeVisibility` enum (`backend/app/models/enums.py`) gains a value:

```
public        # anyone on the web
site_members  # any authenticated user of this instance   <-- NEW
unlisted      # anyone with the link
private       # members only (default)
```

- Alembic migration to `ALTER TYPE tree_visibility ADD VALUE 'site_members'` (Postgres enum add-value cannot run inside a transaction with other DDL; use `op.execute` with autocommit, in a separate migration).
- Default stays `private`. Existing rows unchanged.
- `TreeRead`/`TreeUpdate`/`TreeCreate` schemas already carry the enum; they pick up the new value automatically. The OpenAPI client regen (`gen:api`) exposes it to the frontend.

## 3. Privacy engine

`can_view_tree()` today treats `public` and `unlisted` identically and ignores whether the viewer is anonymous vs authenticated (`privacy.py:44-49`). Replace the final line with explicit branching on viewer auth state:

```
if membership: return True            # members always
match tree.visibility:
  public, unlisted: return True       # anonymous OK (unlisted gated only by knowing the link)
  site_members:     return user_id is not None   # any logged-in account
  private:          return False
```

`person_visibility()` is unchanged: it already redacts living/private people for non-members. Add focused unit tests: anonymous + each visibility; living person redacted on public/unlisted; `site_members` denies anonymous but allows a logged-in non-member; `private` denies both.

## 4. The anonymous read path (the careful part)

**Shipped: a dedicated read-only public API namespace**, not optional-auth on the existing endpoints. Rationale: it is far easier to audit a small, purpose-built surface that *always* funnels through `person_visibility` than to weaken the membership checks on the authenticated endpoints and hope every branch is covered.

- Router `app/api/v1/public.py`, mounted at `/api/v1/public`, with an **optional-auth** dependency `CurrentUserOrNone` (returns `User | None`; never 401s).
  Contrast with `CurrentUser` (`deps.py:30-36`), which hard-401s.
- Endpoints (read-only; no create/update/delete):
  - `GET /public/trees`: the directory. Lists `public` to everyone; additionally lists `site_members` when the caller is authenticated. Paginated, search via existing `pg_trgm`. Never lists `unlisted`/`private`.
  - `GET /public/trees/{id}`: tree metadata if `can_view_tree(user_or_none)`.
  - `GET /public/trees/{id}/persons`, `/persons/{pid}`, `/persons/{pid}/names`, `/relationships`, `/events`: each filtered through `person_visibility`. (Media is not exposed on the public surface yet; it is deferred.)
- **Redaction happens in the service, before serialization**: this is the safety guarantee. It did **not** ship as a separate `PublicPersonRead` schema (that recommendation was not adopted): the public router **reuses the member read schemas** (`PersonRead`, `RelationshipRead`, `EventRead`, `NameRead`), and only the tree projection (`PublicTreeRead`) is distinct. Safety comes from `public_view_service` resolving `person_visibility` and then **dropping hidden rows and redacting possibly-living people** (`person_service._redact` rewrites the name to "Living person", etc.) *before* a row is ever validated into a schema. No route hands a raw row to the serializer.
- **Rate limiting** on the public namespace (per-IP) is **deferred**: it is not implemented in the app and may be handled at the Caddy edge if needed.
- **Audit**: count public reads; do not log PII.

## 5. Frontend public pages

- New **server-rendered** routes outside the authed app shell, e.g. `/p/[treeId]` (tree), `/p/[treeId]/[personId]` (person), `/explore` (directory). Server components fetch the `/api/v1/public/*` endpoints; no login redirect.
- `robots`: ships a coarse `allow: ["/", "/p/"]` rule (`frontend/app/robots.ts`) that keeps the authed app out of the index.
  Per-tree `noindex, nofollow` meta for `unlisted`/`site_members` and a `public`-only **sitemap** did **not** ship; both are **deferred** follow-ups (per-tree noindex needs server rendering; meanwhile `unlisted`/`site_members` trees aren't linked or listed, so they aren't crawl-discoverable).
- The directory `/explore` is anonymous for `public`; it shows `site_members` trees only to logged-in users.
- Reuse the tree/person view components where possible, fed by the redacted schema.

## 6. UI control

Update the visibility dropdown (`frontend/app/trees/page.tsx`, shipped in PR #41) from 3 to 4 options with helper text:

```
Private          — only you and people you invite
Public – Members — any signed-in user on this site
Unlisted         — anyone with the link (not listed or indexed)
Public           — anyone on the web; listed and search-indexable
```

A short confirmation when switching *to* `public` ("This makes the tree visible to anyone on the web. Living people stay hidden.") is worthwhile given the stakes.

## 7. Guardrails / invariants

- One enforcement point: every public response is built from `person_visibility` output. No raw repository reads in the public router.
- Living-person protection holds regardless of tree visibility.
- Unlisted relies on UUID unguessability; never expose a sequential public id.
- Per-tree `noindex` (everything except `public`) and a `public`-only sitemap are **deferred** (see §5); today `robots.ts` keeps the authed app out of the index and `unlisted`/`site_members` trees aren't linked or listed.
- Tests gate the merge: privacy-engine matrix + an integration test that hits the public endpoints anonymously and asserts no living-person PII leaks.

## 8. Suggested phasing (small PRs)

1. Enum value + migration + regen client (+ dropdown → 4 options). No behavior change yet for non-members.
2. Privacy-engine branching + unit tests.
3. Public read API namespace (optional-auth, redacted schema, rate limit) + tests.
4. Public frontend pages (`/p/...`) + robots/sitemap.
5. In-app `/explore` directory + search.

Steps 2–3 are the privacy-critical core and should be reviewed hardest.

## 9. Open questions

- Caching: public pages are cacheable for SEO, but cache keys must not blur the redacted-vs-member rendering. Likely: cache only the anonymous projection at the edge; never cache member responses.
- Do `site_members` trees appear in the sitemap for logged-in crawling? (Default: no, they stay `noindex`.)
- Per-tree opt-out of the directory even when `public`? (Probably unnecessary; `unlisted` already covers "reachable but not listed.")
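
For reference, the §3 branching and its test matrix can be sketched as a small, self-contained function. This is only an illustration under stated assumptions: the signature `can_view_tree(visibility, user_id, is_member)` and the locally defined `TreeVisibility` class are hypothetical stand-ins, since the real function in `privacy.py` and the real enum in `enums.py` are not reproduced in this note.

```python
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class TreeVisibility(str, Enum):
    # Local stand-in for the enum in backend/app/models/enums.py (assumed shape).
    public = "public"
    site_members = "site_members"
    unlisted = "unlisted"
    private = "private"


def can_view_tree(
    visibility: TreeVisibility,
    user_id: Optional[UUID],
    is_member: bool,
) -> bool:
    """Hypothetical signature: may this viewer read the tree at all?

    Person-level redaction is a separate step and is not modelled here.
    """
    if is_member:
        return True  # members always
    if visibility in (TreeVisibility.public, TreeVisibility.unlisted):
        return True  # anonymous OK; unlisted is gated only by knowing the link
    if visibility is TreeVisibility.site_members:
        return user_id is not None  # any logged-in account
    return False  # private, and any future value, fails closed


if __name__ == "__main__":
    logged_in = uuid4()
    # Print the non-member matrix: anonymous vs logged-in for each level.
    for level in TreeVisibility:
        anon = can_view_tree(level, None, is_member=False)
        user = can_view_tree(level, logged_in, is_member=False)
        print(f"{level.value:13} anonymous={anon!s:5} logged_in={user}")
```

Ending on `return False` rather than an explicit `private` branch is a deliberate choice in this sketch: a visibility value added later is denied to non-members until someone writes a rule for it.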
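
The "redact in the service, before serialization" guarantee from §4 can likewise be illustrated with a minimal sketch. Everything named here (`PersonRow`, `Visibility`, `person_visibility_sketch`, `redact`, `public_projection`) is hypothetical and simplified; the real logic lives in `public_view_service`, `person_visibility()` and `person_service._redact`, and per-person `public` overrides are deliberately left out. Clearing `birth_date` is an assumption standing in for the "etc." in §4.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Iterable, List, Optional


class Visibility(Enum):
    FULL = "full"          # show the row as stored
    REDACTED = "redacted"  # show a placeholder instead of PII
    HIDDEN = "hidden"      # drop the row entirely


@dataclass(frozen=True)
class PersonRow:
    # Hypothetical stand-in for a database row; not the real model.
    id: str
    name: str
    birth_date: Optional[str]
    is_living: bool
    is_private: bool


def person_visibility_sketch(row: PersonRow) -> Visibility:
    """Simplified non-member rules; the real person_visibility() has more cases."""
    if row.is_private:
        return Visibility.HIDDEN
    if row.is_living:
        return Visibility.REDACTED
    return Visibility.FULL


def redact(row: PersonRow) -> PersonRow:
    # Rewrites the name as described in section 4; clearing the date is assumed.
    return replace(row, name="Living person", birth_date=None)


def public_projection(rows: Iterable[PersonRow]) -> List[PersonRow]:
    """Drop hidden rows and redact living people before anything is serialized."""
    out: List[PersonRow] = []
    for row in rows:
        vis = person_visibility_sketch(row)
        if vis is Visibility.HIDDEN:
            continue
        out.append(redact(row) if vis is Visibility.REDACTED else row)
    return out
```

Because only the output of `public_projection` would ever be handed to a read schema, reusing the member schemas (as shipped) does not by itself widen what an anonymous caller can see.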