docs: bring all documentation current with shipped work
A multi-agent audit of every doc against the code surfaced ~50 stale/missing
items (the roadmap/status docs and the backlog had fallen behind the code).
This catches them up:
- CLAUDE.md: phase status was ~3 phases stale ("Phase 1 is next" while Phase 1 +
chunks of 2 & 4 shipped). Rewrote the status list; added a model-provider
tech-stack entry; updated repo-layout (integrations objectstore/models,
deploy backup.sh/dev compose).
- ARCHITECTURE.md: §6 privacy engine described 3 visibility levels — corrected to
the shipped 4 (adds site_members); documented per-tree AI policy on Tree,
LLMProvider/EmbeddingProvider split + registry, ChangeProposal origin/status/
operations, verified-email session gate, instance-owner role, schema-drift
guard, and the env_file config model.
- PRD.md: 4-level visibility in US-040/§5.5, instance-owner role (§5.1/§5.11),
per-tree AI policy (§5.8), §8 sequencing annotated with shipped status, header
date/status bumped.
- README.md: 4-level privacy; softened "Full GEDCOM 7" to the 5.5.1/7 common
subset; noted backups + instance-owner admin; moved property/land to an
explicit "where it's headed" (no property models exist yet).
- BACKLOG.md: flipped ~15 shipped-but-open rows to Have (ChangeProposal, provider
abstraction, GEDCOM citation export, membership management, operator backup,
email-verification gate, per-tree AI policy, instance owner, the whole
visibility/public-viewing/child-resource-redaction cluster #41-#51/#46), and
reconciled the executive summary, "current defects" list, quick wins, and
differentiators. Left genuinely-open items (citation/source redaction, sitemap,
per-tree noindex, scoped-token API) accurately open.
- .env.example: dropped "SMTP wired in a later phase"; documented the worker
purge knobs, S3_PRESIGN_TTL, COOKIE_NAME; removed a stray duplicate line.
- design/: tree-visibility.md and change-proposal.md marked Shipped; corrected
the redaction approach (reuses member schemas, not a separate PublicPersonRead)
and the apply() rollback claim (v1 is not cross-op transactional), and marked
rate-limiting/sitemap/noindex as deferred.
No code changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
@@ -1,6 +1,8 @@
 # Design note: ChangeProposal (propose-then-confirm)

-Status: **in progress**. Implements non-negotiable #1 (CLAUDE.md): *the AI
+Status: **Shipped (#214/#236)** — model, service, API, and review UI landed; the
+assistant producer and cross-op transactional apply remain as follow-ups (see
+Out of scope). Implements non-negotiable #1 (CLAUDE.md): *the AI
 assistant never writes autonomously.* Every assistant "write" emits a
 **ChangeProposal** — a structured diff a human approves, edits, or rejects.

@@ -63,7 +65,9 @@ is a follow-up (it needs the services to accept a no-commit mode).
 - `apply(session, *, actor, tree, proposal_id, edited_operations=None) -> ChangeProposal`
 — editor-only. Optional `edited_operations` lets the reviewer tweak the diff
 before applying ("edit" in approve/edit/reject). Dispatches each op through the
-editing services; on any failure, rolls back and records `apply_error`.
+editing services; on failure it records `apply_error` and leaves the proposal
+pending — it does **not** roll back ops already committed by earlier dispatches
+(v1 is not cross-op transactional; see Data model).
 - `reject(session, *, actor, tree, proposal_id, note=None)` — editor-only.

 ## API

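Reviewer note: the corrected `apply()` contract in the hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the `Proposal` dataclass, the `dispatch` callable, the `applied_count` field, and the `"applied"` status string are all hypothetical. Only `apply_error`, `edited_operations`, and the "stays pending, no rollback" behavior come from the design note.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Proposal:
    # Hypothetical stand-in for the ChangeProposal model.
    operations: list[dict]
    status: str = "pending"
    apply_error: Optional[str] = None
    applied_count: int = 0  # hypothetical bookkeeping, for illustration only


def apply_proposal(
    proposal: Proposal,
    dispatch: Callable[[dict], None],
    edited_operations: Optional[list[dict]] = None,
) -> Proposal:
    """Dispatch each op in order; every dispatch is assumed to commit on its own."""
    ops = edited_operations if edited_operations is not None else proposal.operations
    proposal.apply_error = None
    proposal.applied_count = 0
    for index, op in enumerate(ops):
        try:
            dispatch(op)
        except Exception as exc:
            # Record the failure and leave the proposal pending.
            # Ops dispatched before this one are NOT rolled back.
            proposal.apply_error = f"op {index}: {exc}"
            return proposal
        proposal.applied_count += 1
    proposal.status = "applied"  # hypothetical status name
    return proposal


# Usage: the second of three ops fails.
committed: list[str] = []


def fake_dispatch(op: dict) -> None:
    if op.get("fail"):
        raise ValueError("boom")
    committed.append(op["name"])


result = apply_proposal(
    Proposal(operations=[{"name": "a"}, {"name": "b", "fail": True}, {"name": "c"}]),
    fake_dispatch,
)
```

After this run the proposal is still pending with `apply_error` set, and op `a` remains committed. That partial state is what the follow-up on cross-op transactional apply would remove.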
@@ -1,11 +1,11 @@
 # Design note: tree visibility & the public viewing surface

-Status: **proposed** (design only — no code yet). Owner: Justin. Created 2026-06-09.
+Status: **Shipped (#41-#51)**. Owner: Justin. Created 2026-06-09.

-This is a privacy-critical change (it creates the first anonymous read surface in
-Provenance). Per CLAUDE.md, design before code. Implementation should land in
-small, individually-reviewable PRs, with tests on the privacy engine and the
-public read path before any anonymous endpoint is exposed.
+This is a privacy-critical change (it created the first anonymous read surface in
+Provenance). Per CLAUDE.md, it was designed before code and shipped in small,
+individually-reviewable PRs, with tests on the privacy engine and the public read
+path landing before any anonymous endpoint was exposed.

 ## 1. The model

@@ -74,13 +74,12 @@ logged-in non-member; `private` denies both.

 ## 4. The anonymous read path (the careful part)

-**Recommendation: a dedicated read-only public API namespace**, not optional-auth
-on the existing endpoints. Rationale: it is far easier to audit a small,
-purpose-built surface that *always* funnels through `person_visibility` than to
-weaken the membership checks on the authenticated endpoints and hope every branch
-is covered.
+**Shipped: a dedicated read-only public API namespace**, not optional-auth on the
+existing endpoints. Rationale: it is far easier to audit a small, purpose-built
+surface that *always* funnels through `person_visibility` than to weaken the
+membership checks on the authenticated endpoints and hope every branch is covered.

-- New router `app/api/v1/public.py`, mounted at `/api/v1/public`, with an
+- Router `app/api/v1/public.py`, mounted at `/api/v1/public`, with an
 **optional-auth** dependency `CurrentUserOrNone` (returns `User | None`; never
 401s). Contrast with `CurrentUser` (`deps.py:30-36`) which hard-401s.
 - Endpoints (read-only; no create/update/delete):
@@ -88,14 +87,20 @@ is covered.
 lists `site_members` when the caller is authenticated. Paginated, search via
 existing `pg_trgm`. Never lists `unlisted`/`private`.
 - `GET /public/trees/{id}` — tree metadata if `can_view_tree(user_or_none)`.
-- `GET /public/trees/{id}/persons`, `/persons/{pid}`, `/relationships`,
-`/events`, `/media`, … — each filtered through `person_visibility`, returning
-redacted projections (a `PublicPersonRead` that omits PII for redacted people:
-no exact dates, no living-person names beyond "Living", etc.).
-- **A redacted response schema**, distinct from the member `PersonRead`, so the
-serializer physically cannot emit fields a non-member shouldn't see. Redaction
-happens in the service, not the route.
-- **Rate limiting** on the public namespace (per-IP) to blunt scraping/enumeration.
+- `GET /public/trees/{id}/persons`, `/persons/{pid}`, `/persons/{pid}/names`,
+`/relationships`, `/events` — each filtered through `person_visibility`.
+(Media is not exposed on the public surface yet — deferred.)
+- **Redaction happens in the service, before serialization** — this is the safety
+guarantee. It did **not** ship as a separate `PublicPersonRead` schema (that
+recommendation was not adopted): the public router **reuses the member read
+schemas** (`PersonRead`, `RelationshipRead`, `EventRead`, `NameRead`), and only
+the tree projection (`PublicTreeRead`) is distinct. Safety comes from
+`public_view_service` resolving `person_visibility` and then **dropping hidden
+rows and redacting possibly-living people** (`person_service._redact` rewrites
+the name to "Living person", etc.) *before* a row is ever validated into a
+schema. No route hands a raw row to the serializer.
+- **Rate limiting** on the public namespace (per-IP) is **deferred** — it is not
+implemented in the app and may be handled at the Caddy edge if needed.
 - **Audit**: count public reads; do not log PII.

 ## 5. Frontend public pages
@@ -103,8 +108,12 @@ is covered.
 - New **server-rendered** routes outside the authed app shell, e.g.
 `/p/[treeId]` (tree), `/p/[treeId]/[personId]` (person), `/explore` (directory).
 Server components fetch the `/api/v1/public/*` endpoints; no login redirect.
-- `robots`: allow + sitemap for `public`; `noindex, nofollow` meta for `unlisted`
-and `site_members`. Sitemap lists only `public` trees/persons.
+- `robots`: ships a coarse `allow: ["/", "/p/"]` rule (`frontend/app/robots.ts`)
+that keeps the authed app out of the index. Per-tree `noindex, nofollow` meta
+for `unlisted`/`site_members` and a `public`-only **sitemap** did **not** ship —
+both are **deferred** follow-ups (per-tree noindex needs server rendering;
+meanwhile `unlisted`/`site_members` trees aren't linked or listed, so they
+aren't crawl-discoverable).
 - The directory `/explore` is anonymous for `public`; shows `site_members` trees
 only to logged-in users.
 - Reuse the tree/person view components where possible, fed by the redacted
@@ -131,7 +140,9 @@ anyone on the web. Living people stay hidden.") is worthwhile given the stakes.
 output. No raw repository reads in the public router.
 - Living-person protection holds regardless of tree visibility.
 - Unlisted relies on UUID unguessability; never expose a sequential public id.
-- `noindex` everything except `public`; sitemap is `public`-only.
+- Per-tree `noindex` (everything except `public`) and a `public`-only sitemap are
+**deferred** (see §5); today `robots.ts` keeps the authed app out of the index
+and `unlisted`/`site_members` trees aren't linked or listed.
 - Tests gate the merge: privacy-engine matrix + an integration test that hits the
 public endpoints anonymously and asserts no living-person PII leaks.

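Reviewer note: the "redact in the service, before serialization" guarantee described in the tree-visibility hunks can be sketched roughly as below. This is an assumption-laden illustration, not the shipped `public_view_service`: the `PersonRow` dataclass, the three decision strings, the `visibility` callable, and the clearing of `birth_date` are hypothetical. Only the "Living person" name rewrite and the drop-hidden / redact-possibly-living ordering come from the design note.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class PersonRow:
    # Hypothetical stand-in for a raw database row.
    id: str
    name: str
    birth_date: Optional[str]


# Hypothetical visibility outcomes; the real person_visibility may differ.
VISIBLE, REDACTED, HIDDEN = "visible", "redacted", "hidden"


def redact(row: PersonRow) -> PersonRow:
    """Return a copy with PII removed (date clearing is an assumption)."""
    return replace(row, name="Living person", birth_date=None)


def public_persons(
    rows: Iterable[PersonRow],
    visibility: Callable[[PersonRow], str],
) -> list[PersonRow]:
    """Drop hidden rows and redact the rest before anything is serialized."""
    out: list[PersonRow] = []
    for row in rows:
        decision = visibility(row)
        if decision == HIDDEN:
            continue
        # Fail closed: only an explicit VISIBLE decision passes a row through raw.
        out.append(row if decision == VISIBLE else redact(row))
    return out


# Usage with made-up rows and decisions.
rows = [
    PersonRow("1", "Ada Example", "1815-12-10"),
    PersonRow("2", "Bob Example", "1990-01-01"),
    PersonRow("3", "Cy Example", None),
]
decisions = {"1": VISIBLE, "2": REDACTED, "3": HIDDEN}
result = public_persons(rows, lambda r: decisions[r.id])
```

Because the filtering runs on rows before any schema validation, reusing the member read schemas does not widen what an anonymous caller can see, provided every public route goes through a function of this shape.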