447daf7fa8
A multi-agent audit of every doc against the code surfaced ~50 stale/missing
items (the roadmap/status docs and the backlog had fallen behind the code).
This catches them up:
- CLAUDE.md: phase status was ~3 phases stale ("Phase 1 is next" while Phase 1 +
chunks of 2 & 4 shipped). Rewrote the status list; added a model-provider
tech-stack entry; updated repo-layout (integrations objectstore/models,
deploy backup.sh/dev compose).
- ARCHITECTURE.md: §6 privacy engine described 3 visibility levels — corrected to
the shipped 4 (adds site_members); documented per-tree AI policy on Tree,
LLMProvider/EmbeddingProvider split + registry, ChangeProposal origin/status/
operations, verified-email session gate, instance-owner role, schema-drift
guard, and the env_file config model.
- PRD.md: 4-level visibility in US-040/§5.5, instance-owner role (§5.1/§5.11),
per-tree AI policy (§5.8), §8 sequencing annotated with shipped status, header
date/status bumped.
- README.md: 4-level privacy; softened "Full GEDCOM 7" to the 5.5.1/7 common
subset; noted backups + instance-owner admin; moved property/land to an
explicit "where it's headed" (no property models exist yet).
- BACKLOG.md: flipped ~15 shipped-but-open rows to Have (ChangeProposal, provider
abstraction, GEDCOM citation export, membership management, operator backup,
email-verification gate, per-tree AI policy, instance owner, the whole
visibility/public-viewing/child-resource-redaction cluster #41-#51/#46), and
reconciled the executive summary, "current defects" list, quick wins, and
differentiators. Left genuinely-open items (citation/source redaction, sitemap,
per-tree noindex, scoped-token API) accurately open.
- .env.example: dropped "SMTP wired in a later phase"; documented the worker
purge knobs, S3_PRESIGN_TTL, COOKIE_NAME; removed a stray duplicate line.
- design/: tree-visibility.md and change-proposal.md marked Shipped; corrected
the redaction approach (reuses member schemas, not a separate PublicPersonRead)
and the apply() rollback claim (v1 is not cross-op transactional), and marked
rate-limiting/sitemap/noindex as deferred.
No code changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Justin Paul <justin@jpaul.me>
66 lines
5.0 KiB
Markdown
# Provenance

**Where it came from matters.**

Provenance is self-hostable software for tracing where you come from — your family *and* your land. Build a family tree, document every claim with real sources, reconstruct the chain of ownership behind a piece of property, and keep all of it in a format you control, on infrastructure you run.

Your history shouldn't live behind a subscription. Your data shouldn't be someone else's product. The story of where you came from belongs to you — and to whoever comes after.

---
## Why "Provenance"

Museums and collectors use the word for the chain of custody behind an object: where it came from, who held it, how it got here. A painting without provenance is just a painting. A painting *with* provenance is a story.

People and land work the same way. A name on a tree is just a name. A name with sources, photos, letters, and the small details of a life — that's a person. A parcel of farmland traced from its original federal patent through every deed and heir to the present day — that's a story too. Provenance treats both as facets of the same thing.

Every fact links to its source. Every claim can be traced. Nothing is just asserted; everything is shown.
## What it does

- **Build a tree that holds up.** People, relationships, events, and places — with every fact linked to the document, photo, or record it came from.
- **Bring your own archive.** Scans, PDFs, photos, audio recordings — first-class citizens, not afterthoughts.
- **A research assistant that proposes, never overwrites.** The built-in AI assistant searches legal sources, lays out what it found, and waits for your approval before anything touches your data. You can point it at the major model providers or a self-hosted model — your keys, your choice.
- **Standards over silos.** GEDCOM import and export (5.5.1 / 7 common subset) — duplicate-aware import, citation-preserving export. Migrate in, migrate out.
- **Privacy you control.** Public, members-only (any signed-in user on your instance), unlisted, or private per tree; any individual can be hidden; living people are protected by default.
- **Find your people.** When another user's tree overlaps with yours, Provenance can surface an anonymous "possible match" — and only connects you if you both say yes.
- **Run it your way.** Container-native. Self-host behind Caddy and, if you like, a Cloudflare Tunnel. Multi-tenant, so your whole extended family — or a whole community of strangers — can coexist on one deployment. One-command backups (Postgres + object storage) and an instance-owner admin role keep operations in your hands.

**Where it's headed — trace the land, not just the family.** The same source-backed treatment for *property*: parcels, deeds, and ownership events, reconstructing chain-of-title and tying land to the people who held it. The people side ships today; the land half is on the roadmap, not yet built — but it's why Provenance exists, not an afterthought.
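To make the per-tree privacy levels in the feature list above a little more concrete, here is a minimal Python sketch of how such a check could look. It is an illustration, not Provenance's actual code: `Visibility`, `Viewer`, `can_view_tree`, and `display_name` are hypothetical names, and two behaviours are assumptions rather than documented facts: that "unlisted" means "reachable only by direct link", and that tree members always see unredacted people.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    """The four per-tree levels named in the feature list."""
    PUBLIC = "public"
    SITE_MEMBERS = "site_members"  # any signed-in user on the instance
    UNLISTED = "unlisted"
    PRIVATE = "private"


@dataclass(frozen=True)
class Viewer:
    user_id: Optional[int]               # None means an anonymous visitor
    member_of: frozenset = frozenset()   # ids of trees this user belongs to


def can_view_tree(tree_id, visibility, viewer, via_direct_link=False):
    """Return True if the viewer may open the tree at all."""
    if tree_id in viewer.member_of:
        return True                      # members always see their own tree
    if visibility is Visibility.PUBLIC:
        return True
    if visibility is Visibility.SITE_MEMBERS:
        return viewer.user_id is not None
    if visibility is Visibility.UNLISTED:
        # Assumption: unlisted trees are viewable only via a direct link.
        return via_direct_link
    return False                         # PRIVATE: members only


def display_name(name, is_living, is_hidden, viewer_is_member):
    """Redact hidden or living people for anyone outside the tree."""
    if viewer_is_member:
        return name                      # assumption: members see everything
    if is_hidden or is_living:
        return "Private person"
    return name
```

In this sketch the tree-level check and the person-level redaction are deliberately separate, so a public tree can still protect living people by default.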
## Who it's for

- The person who became the keeper of the photos after a parent passed
- Farm and rural families tracing land back to the original patent
- Researchers who want their citations to actually mean something
- Adoptees and donor-conceived people piecing together a fuller picture
- Anyone who looked at the big genealogy subscriptions and thought *I don't want my family history to be someone else's recurring revenue*
## Principles

- **Your data is yours.** Open formats. Export anytime. Self-host anywhere.
- **Sources or it didn't happen.** Every fact can carry citations. The record holds what you know *and* how you know it.
- **The assistant serves you.** AI proposes; you decide. No autonomous writes, ever.
- **Honest about hard things.** Adoption, estrangement, complicated parentage, name changes, people who don't want to be on a tree — treated as normal, not edge cases.
- **No dark patterns.** No paywalled hints. No surprise upsells. No "you have new ancestors waiting" emails.
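The "AI proposes; you decide" principle can be pictured as a propose-then-approve flow. The Python sketch below is a simplified illustration under stated assumptions, not the shipped implementation: `TreeStore`, `ProposalStatus`, the `(person_id, field, value)` operation shape, and the `"assistant"` origin value are all hypothetical. The only rule it demonstrates is that a proposal changes nothing until a human approves it.

```python
from dataclasses import dataclass
from enum import Enum


class ProposalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ChangeProposal:
    origin: str        # who suggested it, e.g. "assistant" (illustrative value)
    operations: list   # hypothetical shape: (person_id, field, value) tuples
    status: ProposalStatus = ProposalStatus.PENDING


class TreeStore:
    """Toy in-memory store; real data would live in a database."""

    def __init__(self):
        self.people = {}      # person_id -> dict of facts
        self.proposals = []

    def propose(self, origin, operations):
        # Recording a proposal never touches self.people.
        proposal = ChangeProposal(origin=origin, operations=list(operations))
        self.proposals.append(proposal)
        return proposal

    def approve(self, proposal):
        if proposal.status is not ProposalStatus.PENDING:
            raise ValueError("only pending proposals can be approved")
        # Operations are applied one by one; this sketch has no rollback.
        for person_id, field_name, value in proposal.operations:
            self.people.setdefault(person_id, {})[field_name] = value
        proposal.status = ProposalStatus.APPROVED

    def reject(self, proposal):
        if proposal.status is not ProposalStatus.PENDING:
            raise ValueError("only pending proposals can be rejected")
        proposal.status = ProposalStatus.REJECTED
```

For example, calling `store.propose("assistant", [("p1", "birth_year", 1902)])` leaves `store.people` empty; the fact appears only after `store.approve(...)` is called on the returned proposal.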
## Licensing

Provenance is **source-available**, not open source (yet). It is licensed under the [Business Source License 1.1](LICENSE):

- **Free forever for personal, family, and non-commercial use** — self-host all you like.
- **Commercial hosting for a fee is not permitted** without a separate license from the author.
- **Each release converts to AGPL-3.0** (a true open-source license) four years after it ships.

In plain terms: run it for yourself, your family, or your community at no cost, forever. You just can't take this code and sell it as a hosted service — that's reserved for a possible future first-party offering. See [LICENSE](LICENSE) for the exact terms.
## Status

Early and moving fast. The product is being built in the open, commit by commit, and stood up in a live home lab as it goes. See [docs/PRD.md](docs/PRD.md) for the product requirements and roadmap.

If the principles above resonate, watch the repo, open an issue with your use case, or pitch in. See [CONTRIBUTING.md](CONTRIBUTING.md).

---

***Provenance.*** *Where it came from matters.*