GEDCOM: duplicate-aware import + typed name/attribute mapping
Duplicate detection (the "merge / skip / overwrite" the user asked for):
- New POST /gedcom/preview dry-runs the file and flags incoming people that
resemble existing ones (name similarity via difflib + birth-year guard;
high/medium score). No writes.
- /gedcom/import takes default_action (new|skip|merge|overwrite) + per-xref
resolutions {xref: {action, target_id}}:
new create as a new person (current behavior)
skip link families to the existing person, copy nothing
merge attach the incoming names (as alternates), events, citations,
and notes onto the existing person
overwrite soft-delete the existing person, import the incoming one fresh
Relationship creation is deduped so a merge can't double an edge.
Richer record mapping (covers the user's repo's GEDCOM):
- Multiple NAME records honor their TYPE; _MARNM (and NICK) import as typed
alternate names — maiden stays primary, married becomes a "married" Name.
- RELI -> a "religion" event with the value in detail; OCCU/EDUC values too.
- NOTE -> person notes (and event notes); NOTE/RELI are no longer "unmapped".
- Export round-trips name TYPE.
Verified against the user's 2185-person export: 0 unmapped tags. 48 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -1,25 +1,56 @@
|
||||
import json
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, File, Response, UploadFile
|
||||
from fastapi import APIRouter, File, Form, Response, UploadFile
|
||||
|
||||
from app.api.deps import CurrentUser, SessionDep
|
||||
from app.schemas.gedcom import ImportReport
|
||||
from app.schemas.gedcom import ImportPreview, ImportReport
|
||||
from app.services import gedcom, tree_service
|
||||
|
||||
router = APIRouter(prefix="/trees", tags=["gedcom"])
|
||||
|
||||
|
||||
@router.post("/{tree_id}/gedcom/preview", response_model=ImportPreview)
|
||||
async def preview_gedcom(
|
||||
tree_id: uuid.UUID,
|
||||
session: SessionDep,
|
||||
current: CurrentUser,
|
||||
file: UploadFile = File(...),
|
||||
) -> ImportPreview:
|
||||
"""Dry run: report counts and incoming people that look like duplicates of
|
||||
existing ones, so the user can choose how to resolve each before importing."""
|
||||
tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
|
||||
text = (await file.read()).decode("utf-8", errors="replace")
|
||||
report = await gedcom.preview_gedcom(session, actor=current, tree=tree, text=text)
|
||||
return ImportPreview(**report)
|
||||
|
||||
|
||||
@router.post("/{tree_id}/gedcom/import", response_model=ImportReport)
|
||||
async def import_gedcom(
|
||||
tree_id: uuid.UUID,
|
||||
session: SessionDep,
|
||||
current: CurrentUser,
|
||||
file: UploadFile = File(...),
|
||||
default_action: str = Form("new"),
|
||||
resolutions: str = Form("{}"),
|
||||
) -> ImportReport:
|
||||
# NOTE: additive — records are created as new; existing people are not merged.
|
||||
"""Import a GEDCOM. ``default_action`` (new|skip|merge|overwrite) applies to
|
||||
incoming people that match an existing one; ``resolutions`` is a JSON object
|
||||
{xref: {action, target_id}} overriding it per record."""
|
||||
tree = await tree_service.get_tree(session, viewer_id=current.id, tree_id=tree_id)
|
||||
text = (await file.read()).decode("utf-8", errors="replace")
|
||||
report = await gedcom.import_gedcom(session, actor=current, tree=tree, text=text)
|
||||
try:
|
||||
parsed = json.loads(resolutions or "{}")
|
||||
except json.JSONDecodeError:
|
||||
parsed = {}
|
||||
report = await gedcom.import_gedcom(
|
||||
session,
|
||||
actor=current,
|
||||
tree=tree,
|
||||
text=text,
|
||||
default_action=default_action,
|
||||
resolutions=parsed,
|
||||
)
|
||||
return ImportReport(**report)
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user