provenance/backend/app/api/v1/gedcom.py
justin 5824e70895 GEDCOM: duplicate-aware import + typed name/attribute mapping
Duplicate detection (the "merge / skip / overwrite" the user asked for):
- New POST /gedcom/preview dry-runs the file and flags incoming people that
  resemble existing ones (name similarity via difflib + birth-year guard;
  high/medium score). No writes.
- /gedcom/import takes default_action (new|skip|merge|overwrite) + per-xref
  resolutions {xref: {action, target_id}}:
    new       create as a new person (current behavior)
    skip      link families to the existing person, copy nothing
    merge     attach the incoming names (as alternates), events, citations,
              and notes onto the existing person
    overwrite soft-delete the existing person, import the incoming one fresh
  Relationship creation is deduped so a merge can't double an edge.
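
The duplicate check described above (difflib name similarity plus a birth-year guard, bucketed into high/medium) could look roughly like the sketch below. The thresholds, the year tolerance, and the function names are assumptions for illustration; the commit message does not state the real values, and the actual logic lives in the service layer.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical cut-offs: the commit does not state the real thresholds.
HIGH_THRESHOLD = 0.92
MEDIUM_THRESHOLD = 0.80
MAX_BIRTH_YEAR_GAP = 2  # assumed tolerance for the birth-year guard


def name_similarity(a: str, b: str) -> float:
    """Return a case- and whitespace-insensitive similarity ratio in [0, 1]."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def duplicate_level(
    incoming_name: str,
    incoming_birth_year: Optional[int],
    existing_name: str,
    existing_birth_year: Optional[int],
) -> Optional[str]:
    """Return "high", "medium", or None when the pair does not look alike."""
    # Birth-year guard: when both years are known and far apart, the two
    # records are treated as different people regardless of the names.
    if (
        incoming_birth_year is not None
        and existing_birth_year is not None
        and abs(incoming_birth_year - existing_birth_year) > MAX_BIRTH_YEAR_GAP
    ):
        return None
    score = name_similarity(incoming_name, existing_name)
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return None
```

With these assumed values, identical names born a year apart are flagged "high", while identical names born fifty years apart are not flagged at all.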

Richer record mapping (covers the GEDCOM file in the user's repo):
- Multiple NAME records honor their TYPE; _MARNM (and NICK) import as typed
  alternate names — maiden stays primary, married becomes a "married" Name.
- RELI -> a "religion" event with the value in detail; OCCU/EDUC values too.
- NOTE -> person notes (and event notes); NOTE/RELI are no longer "unmapped".
- Export round-trips name TYPE.
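
As an illustration of the attribute mapping listed above, the sketch below turns (tag, value) pairs into events and notes. Only the "religion" event type is named in the commit message; the "occupation" and "education" type names, the `MappedPerson` container, and `map_records` are hypothetical stand-ins, not the project's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed tag -> event type names. Only "religion" appears in the commit
# message; the other two names are guesses for illustration.
ATTRIBUTE_EVENT_TYPES = {
    "RELI": "religion",
    "OCCU": "occupation",
    "EDUC": "education",
}


@dataclass
class MappedPerson:
    """Hypothetical container for the result of mapping one person."""

    events: List[Dict[str, str]] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    unmapped: List[str] = field(default_factory=list)


def map_records(records: List[Tuple[str, str]]) -> MappedPerson:
    """Map (tag, value) pairs onto events and notes, tracking unknown tags."""
    person = MappedPerson()
    for tag, value in records:
        if tag in ATTRIBUTE_EVENT_TYPES:
            # Attribute tags become an event with the value kept in "detail".
            person.events.append(
                {"type": ATTRIBUTE_EVENT_TYPES[tag], "detail": value}
            )
        elif tag == "NOTE":
            person.notes.append(value)
        else:
            person.unmapped.append(tag)
    return person
```

Tracking the tags that fall through to `unmapped` is what makes a claim like "0 unmapped tags" checkable against a real export.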

Verified against the user's 2185-person export: 0 unmapped tags. 48 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 10:35:55 -04:00


import json
import uuid

from fastapi import APIRouter, File, Form, Response, UploadFile

from app.api.deps import CurrentUser, SessionDep
from app.schemas.gedcom import ImportPreview, ImportReport
from app.services import gedcom, tree_service

router = APIRouter(prefix="/trees", tags=["gedcom"])


@router.post("/{tree_id}/gedcom/preview", response_model=ImportPreview)
async def preview_gedcom(
    tree_id: uuid.UUID,
    session: SessionDep,
    current: CurrentUser,
    file: UploadFile = File(...),
) -> ImportPreview:
    """Dry run: report counts and incoming people that look like duplicates of
    existing ones, so the user can choose how to resolve each before importing."""
    tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
    text = (await file.read()).decode("utf-8", errors="replace")
    report = await gedcom.preview_gedcom(session, actor=current, tree=tree, text=text)
    return ImportPreview(**report)


@router.post("/{tree_id}/gedcom/import", response_model=ImportReport)
async def import_gedcom(
    tree_id: uuid.UUID,
    session: SessionDep,
    current: CurrentUser,
    file: UploadFile = File(...),
    default_action: str = Form("new"),
    resolutions: str = Form("{}"),
) -> ImportReport:
    """Import a GEDCOM. ``default_action`` (new|skip|merge|overwrite) applies to
    incoming people that match an existing one; ``resolutions`` is a JSON object
    {xref: {action, target_id}} overriding it per record."""
    tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
    text = (await file.read()).decode("utf-8", errors="replace")
    # Malformed JSON falls back to "no overrides", so default_action applies
    # to every matching record.
    try:
        parsed = json.loads(resolutions or "{}")
    except json.JSONDecodeError:
        parsed = {}
    # Valid JSON that is not an object (a list, a number, null) gets the same
    # fallback, since the service expects an {xref: {...}} mapping.
    if not isinstance(parsed, dict):
        parsed = {}
    report = await gedcom.import_gedcom(
        session,
        actor=current,
        tree=tree,
        text=text,
        default_action=default_action,
        resolutions=parsed,
    )
    return ImportReport(**report)


@router.get("/{tree_id}/gedcom/export")
async def export_gedcom(
    tree_id: uuid.UUID, session: SessionDep, current: CurrentUser
) -> Response:
    tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
    text = await gedcom.export_gedcom(session, viewer_id=current.id, tree=tree)
    # Keep only ASCII letters, digits, spaces, hyphens, and underscores.
    # str.isalnum() is also true for non-ASCII letters, which cannot be
    # encoded into the latin-1 Content-Disposition header value.
    safe = "".join(
        c for c in tree.name if c.isascii() and (c.isalnum() or c in " -_")
    ).strip() or "tree"
    return Response(
        content=text,
        media_type="text/plain",
        headers={"Content-Disposition": f'attachment; filename="{safe}.ged"'},
    )
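
The per-record override described in the `import_gedcom` docstring could be consumed along the lines of the sketch below. `resolve_action` is a hypothetical helper, not the service's real function; the fallback to "new" for an unrecognised default, the xref `@I12@`, and the `target_id` value are all placeholders chosen for illustration.

```python
import json
from typing import Any, Dict, Optional, Tuple

VALID_ACTIONS = {"new", "skip", "merge", "overwrite"}


def resolve_action(
    xref: str, default_action: str, resolutions: Dict[str, Any]
) -> Tuple[str, Optional[str]]:
    """Pick the action (and optional target person id) for one incoming xref."""
    entry = resolutions.get(xref)
    # A well-formed per-record entry overrides the default action.
    if isinstance(entry, dict) and entry.get("action") in VALID_ACTIONS:
        return entry["action"], entry.get("target_id")
    # Assumption: an unrecognised default falls back to "new".
    action = default_action if default_action in VALID_ACTIONS else "new"
    return action, None


# What a client might send in the ``resolutions`` form field (placeholder ids).
form_value = json.dumps(
    {"@I12@": {"action": "merge", "target_id": "00000000-0000-0000-0000-000000000001"}}
)
```

Here `@I12@` would be merged into the named existing person, while every other matching record follows `default_action`.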