Close citation/source living-person leak; add on-demand tree purge
Two changes.

1. Privacy fix (NN#2/NN#3): the citation and source list endpoints gated only on can_view_tree, so a non-member on a public/unlisted/site_members tree could enumerate citations and sources tied to a redacted living person, leaking that the person exists and has sourced facts (and possibly their name via a source title). #46 closed this for events/media/names/relationships but not citations/sources.

   Now citation_service.list_citations and source_service.{list_sources,get_source} delegate non-member reads to public_view_service, mirroring the #46 pattern:

   - citations: shown only when the cited fact resolves to FULL-visibility person(s). This covers the person_id, name_id, event_id (person or both-partner), and relationship_id (both-partner) target paths.
   - sources: shown only when they back at least one visible citation; a withheld source 404s (don't reveal it exists).

   Tests cover all four citation target types + source withholding + member-sees-all.

2. On-demand tree purge: owners can permanently delete a soft-deleted tree now instead of waiting out the 30-day auto-purge window. POST /trees/{id}/purge (owner-only): the tree must already be in the trash, and the caller retypes its name to confirm. Media objects are deleted from storage, then a single DELETE on trees cascades all tree-owned rows via the tree_id ON DELETE CASCADE; the audit entry survives (tree_id SET NULL). Frontend adds a "Delete forever" button to the Recently-deleted list.

No migration. Suite: 102 passing.

Signed-off-by: Justin Paul <justin@jpaul.me>
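The purge preconditions and ordering in item 2 can be sketched roughly as below. This is an illustration only, not the code in this commit: `Tree`, `FakeStore`, `PurgeError`, and `purge_tree` are hypothetical names, and the in-memory store stands in for the real database and object storage. In the real schema the child-row removal is done by `ON DELETE CASCADE` and the audit handling by `SET NULL`; the sketch imitates both by hand.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set


class PurgeError(Exception):
    """Raised when a purge request fails one of its preconditions."""


@dataclass
class Tree:
    id: int
    name: str
    owner_id: int
    deleted_at: Optional[datetime] = None  # set while the tree sits in the trash


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for the database and object storage."""
    trees: Dict[int, Tree] = field(default_factory=dict)
    media_keys: Dict[int, List[str]] = field(default_factory=dict)  # tree_id -> object keys
    storage: Set[str] = field(default_factory=set)                  # keys present in storage
    audit: List[dict] = field(default_factory=list)                 # audit entries


def purge_tree(store: FakeStore, tree_id: int, caller_id: int, confirm_name: str) -> None:
    tree = store.trees.get(tree_id)
    if tree is None:
        raise PurgeError("tree not found")
    if tree.owner_id != caller_id:
        raise PurgeError("only the owner may purge a tree")
    if tree.deleted_at is None:
        raise PurgeError("tree must already be in the trash")
    if confirm_name != tree.name:
        raise PurgeError("retyped name does not match")

    # Step 1: delete media objects from storage before the rows disappear.
    for key in store.media_keys.get(tree_id, []):
        store.storage.discard(key)

    # Step 2: one delete on the tree; dropping media_keys imitates the cascade.
    del store.trees[tree_id]
    store.media_keys.pop(tree_id, None)

    # Step 3: audit entries survive with their tree reference nulled (SET NULL).
    for entry in store.audit:
        if entry.get("tree_id") == tree_id:
            entry["tree_id"] = None
```

Deleting storage objects first means a failure partway through leaves rows that still point at the remaining objects, so the purge can be retried; whether the real implementation relies on that ordering for the same reason is an assumption.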
+5
-6
@@ -44,10 +44,9 @@ These two doc edits are themselves trivial quick wins (see §3).
- **No place as a usable first-class entity** (model exists, created by GEDCOM, but no read/edit/delete — a create-only entity, which is a bug per NN#8).
- **No research log, to-do/task planner, kinship calculator, data-quality checker, or i18n/string externalization** (the last is a documented day-one commitment that is currently unmet).
**Security-priority correctness fixes (do these first, regardless of phase).** Most of the original redaction defects shipped this cycle (#46); two items remain — one a narrowed PII gap, one a config switch:
**Security-priority correctness fixes (do these first, regardless of phase).** The redaction defects all shipped — child resources (#46) and now citations/sources too — leaving one config switch:
1. **Citation/source redaction gap (§2.10)** — `list_media`/`get_media`/`media_content`, plus the event/name/relationship endpoints, now apply `person_visibility` for non-members (#46), closing the media leak. The `citation`/`source` list endpoints still gate only on `can_view_tree`, so a non-member on a public/unlisted tree can still enumerate citations/sources tied to redacted living people — the remaining living-person leak.
2. **Self-registration approval-mode switch (§2.10)** — the read-side enforcement now exists: `REQUIRE_EMAIL_VERIFICATION` gates login/session on `email_verified_at` (#53). The remaining gap is the env switch to choose open vs admin-approval vs closed self-registration.
1. **Self-registration approval-mode switch (§2.10)** — the read-side enforcement now exists: `REQUIRE_EMAIL_VERIFICATION` gates login/session on `email_verified_at` (#53). The remaining gap is the env switch to choose open vs admin-approval vs closed self-registration. *(The citation/source living-person leak is now closed — citation/source list endpoints apply `person_visibility` for non-members via `public_view_service`.)*
**Strategic posture.** The differentiators worth pressing — property chain-of-title, the ChangeProposal AI model, the anonymous mutual-consent hint system, and true self-host data ownership — are mostly still ahead on the roadmap. The near-term job is (a) close the **privacy/auth correctness** and **collaboration** gaps that the architecture already implies, (b) ship the **maps + reports + merge** table stakes, and (c) finish the back-half spine — the **connector framework** plus wiring the now-landed **ChangeProposal/ModelProvider** into the assistant — that unlocks the entire back half of the roadmap.
@@ -250,7 +249,7 @@ The architecture is correct (single engine, tenant mixin, audit, soft-delete + p
| Item | Description | Status | Imp | Eff | Phase | Non-negotiable |
|---|---|---|---|---|---|---|
| **Uniform living-person redaction across child resources** | `person_visibility` now runs for non-members on the event, media, name, and relationship endpoints (#46), which delegate to `public_view_service`. Remaining: the `citation`/`source` list endpoints still gate only on `can_view_tree`, so citations tied to a redacted living person are still enumerable. | Partial | High | S | 1–2 | **Mostly resolved (NN#3/NN#2).** Apply `person_visibility` to the citation/source list paths to close the residual leak. |
| **Uniform living-person redaction across child resources** | `person_visibility` now runs for non-members on the event, media, name, and relationship endpoints (#46) and on the citation/source list endpoints, all delegating to `public_view_service`: citations are shown only when the cited fact resolves to FULL-visibility person(s); sources are shown only when they back a visible citation. | Have | High | S | 1–2 | **Resolved (NN#3/NN#2).** No child-resource path leaks a redacted living person's facts. |
| **Email-verification enforcement gate** | Read-side check now ships (#53): `REQUIRE_EMAIL_VERIFICATION` gates login/session on `email_verified_at` (`auth_service.py`). Opt-in (default off) so SMTP-less self-hosts still work. | Have | **High** | S | 1–2 | Read-side trust path now enforced (NN#7); the registration-mode switch below is the separate larger piece. |
| Self-registration mode gating (approve / open / closed) | No env switch to choose open vs admin-approval vs closed registration. | Partial | High | M | 2/5 | Twelve-factor registration control (NN#7); pairs with the verification gate above. |
| Instance owner / operator role | `OWNER_EMAIL`-declared operator (#240): `is_instance_owner` on `/users/me`, owner-only `GET /api/v1/admin/instance`, `/admin` UI. | Have | Med | S | 2/5 | Owner-only operational surface, twelve-factor via env (NN#7); reads stay through the service layer. |
@@ -412,7 +411,7 @@ Ordered by leverage. All are S-effort or a thin slice of a larger item, and most
12. **Sort the merged person timeline** (Research workflow, Med/S) — `shownEvents.sort()` on `date_start`; currently appended unsorted.
13. **Doc corrections (docs-vs-code)** (Meta, trivial/S) — edit CLAUDE.md / ARCHITECTURE so the pgvector "used" claim and the i18n "from day one" claim match reality. The repo convention requires docs to travel with code.
> **Mostly shipped this cycle (#46):** the **media privacy leak** (§2.4) and the broad **child-resource redaction gap** (§2.10) are now closed for the person/event/media/name/relationship endpoints. The narrowed remainder — applying `person_visibility` to the **citation/source list endpoints** — is an S-effort follow-up; treat it as a security-priority Phase 1–2 fix regardless of the quick-win list.
> **Shipped this cycle:** the **media privacy leak** (§2.4) and the **child-resource redaction gap** (§2.10) are fully closed — person/event/media/name/relationship (#46) and citation/source endpoints all apply `person_visibility` for non-members. No residual living-person leak on the read surface.
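The non-member rule described in the commit message (a citation is shown only when its cited fact resolves to FULL-visibility persons, and a source only when it backs a visible citation) might be sketched as follows. This is a minimal sketch under stated assumptions: `Visibility`, `Citation`, `list_citations`, `list_source_ids`, and `get_source_id` here are hypothetical illustrations, not the project's `citation_service`, `source_service`, or `public_view_service` code, and it assumes each citation has already been resolved to the person ids behind its target (one person, or both partners).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Set, Tuple


class Visibility(Enum):
    FULL = "full"
    REDACTED = "redacted"


@dataclass(frozen=True)
class Citation:
    id: int
    source_id: int
    # Persons the cited fact resolves to: one for a person, name, or personal
    # event; both partners for a relationship or a partner event (assumption).
    person_ids: Tuple[int, ...]


def list_citations(citations: Iterable[Citation],
                   visibility: Dict[int, Visibility],
                   is_member: bool) -> List[Citation]:
    """Members see everything; non-members see only fully visible facts."""
    if is_member:
        return list(citations)
    return [
        c for c in citations
        if c.person_ids
        and all(visibility.get(p) is Visibility.FULL for p in c.person_ids)
    ]


def list_source_ids(citations: Iterable[Citation],
                    visibility: Dict[int, Visibility],
                    is_member: bool) -> Set[int]:
    """Sources backing at least one visible citation.

    Simplification: sources with no citations at all are ignored here.
    """
    return {c.source_id for c in list_citations(citations, visibility, is_member)}


def get_source_id(source_id: int,
                  citations: Iterable[Citation],
                  visibility: Dict[int, Visibility],
                  is_member: bool) -> int:
    """Raise LookupError (a 404 at the API layer) for a withheld source."""
    if source_id not in list_source_ids(citations, visibility, is_member):
        raise LookupError("source not found")  # same error as a missing source
    return source_id
```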
---
@@ -426,6 +425,6 @@ Where to invest to make Provenance distinct rather than a webtrees clone. Each l
**3. Anonymous, mutual-consent cross-tree hints.** The privacy model already redacts living people for anonymous viewers, so a hint system that reveals *nothing identifying* until both sides opt in is achievable by construction — and is a categorically more trustworthy version of MyHeritage Smart Matches / Ancestry hints. Requires the matching engine (pgvector enablement + candidate generation, Phase 7), the notification/event-dispatch substrate (§2.9), and the messaging channel that opens only post-consent.
**4. True self-hosting + data ownership.** Full account export/import, soft-delete recovery, GEDCOM round-trip, env-driven everything, a one-command operator backup, and (to-build) scheduled off-host backup + ARM support make Provenance the genealogy app you actually own. The two correctness items that gated the promise have **landed**: GEDCOM export now preserves citations (the Provenance→Provenance round-trip keeps the sources graph), and operator backup moved from "documented procedure" to a one-command dump (`deploy/backup.sh`). What remains is scheduled/verified-restore tooling and ARM builds. The Ollama/self-hosted ModelProvider path means even the AI assistant runs without tree data leaving the deployment — a promise no commercial competitor can make.
**4. True self-hosting + data ownership.** Full account export/import, soft-delete recovery (with owner-confirmed on-demand purge to delete a trashed tree immediately rather than waiting out the 30-day window), GEDCOM round-trip, env-driven everything, a one-command operator backup, and (to-build) scheduled off-host backup + ARM support make Provenance the genealogy app you actually own. The two correctness items that gated the promise have **landed**: GEDCOM export now preserves citations (the Provenance→Provenance round-trip keeps the sources graph), and operator backup moved from "documented procedure" to a one-command dump (`deploy/backup.sh`). What remains is scheduled/verified-restore tooling and ARM builds. The Ollama/self-hosted ModelProvider path means even the AI assistant runs without tree data leaving the deployment — a promise no commercial competitor can make.
**5. Sources-first as a felt experience.** The two-tier model is built, and citations now **survive GEDCOM export** (#232); the remaining differentiator is making sourcing *visible and low-friction*: a guided Evidence-Explained citation builder, transcription/abstract fields, source-driven data entry (transcribe a document into the tree), and per-fact confidence surfaced in the UI. These turn "every fact links to where it came from" from an architecture note into the product's personality.