justin 4b21ba53a2 gc: rewrite registry_gc.py against Gitea's actual API (+ UA fix)
Root cause of run #122's GC failure (turns out NOT a permission
issue, despite the 403):

  1. The template's URL was wrong: /api/v1/packages/{owner}/container/
     {name}/versions — Gitea interprets this as "look up a SINGLE
     version named 'versions'" and returns "package does not exist".
     The correct list endpoint is:
       GET /api/v1/packages/{owner}?type=container&q={name}
     which returns one entry per tag with {id, version, created_at}.

  2. Cloudflare in front of git.jpaul.io returns 403 to the default
     Python-urllib User-Agent; any other UA passes (curl, "requests",
     anything). That explains the 403 in CI (Python made the call, so
     Cloudflare rejected it before it reached Gitea) vs the 404 from
     my curl test (curl passed Cloudflare and hit Gitea's wrong-URL
     404). So both the URL AND the UA were broken.
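
For illustration, a minimal sketch of how the corrected requests could be built with urllib. The host, the User-Agent string, and the two API paths come from this commit message. The function names, the `token` parameter, and the `Authorization: token ...` header are assumptions, not the actual contents of registry_gc.py.

```python
import urllib.parse
import urllib.request

# Host and UA string are taken from this commit message; the helper
# names and token handling below are hypothetical.
BASE = "https://git.jpaul.io"
USER_AGENT = "crop-chem-docs-registry-gc/0.1"


def _request(method, path, query=None, token=None):
    """Build (but do not send) a request with an explicit User-Agent."""
    url = BASE + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    # Overriding the default Python-urllib UA is what avoids the 403.
    headers = {"User-Agent": USER_AGENT, "Accept": "application/json"}
    if token:
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers, method=method)


def list_request(owner, name, token=None):
    """List endpoint: one entry per tag, filtered by package name."""
    path = f"/api/v1/packages/{urllib.parse.quote(owner, safe='')}"
    return _request("GET", path, {"type": "container", "q": name}, token)


def delete_request(owner, name, tag, token=None):
    """Delete endpoint for a single tag."""
    # Assumption: each path segment is percent-encoded on its own.
    parts = [urllib.parse.quote(p, safe="") for p in (owner, name, tag)]
    path = "/api/v1/packages/{}/container/{}/{}".format(*parts)
    return _request("DELETE", path, token=token)
```

Building the request separately from sending it keeps the URL and header logic checkable without network access; passing the result to `urllib.request.urlopen` would perform the call.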

Fixes:
  - Set User-Agent to "crop-chem-docs-registry-gc/0.1" in api().
  - Correct URL for list (above) + DELETE
    /api/v1/packages/{owner}/container/{name}/{tag} for delete.
  - Cleaner keep policy with explicit reasons:
      always: :latest
      always: corpus-*  (production pins; Drawbar may have locked)
      keep:   --keep-latest most recent OTHER tags
      keep:   anything younger than --keep-days
      delete: everything else
  - --dry-run for safe testing.
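
The keep policy above can be sketched as a pure function over the list response. This is a hedged illustration, not the script's actual code: `plan`, `parse_ts`, and the shape of the returned mapping are hypothetical, and it assumes `created_at` is an ISO 8601 timestamp with a UTC offset or a trailing `Z`.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp; accept a trailing 'Z' on older Pythons."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def plan(versions, keep_latest, keep_days, now=None):
    """Map each tag to (action, reason) following the policy listed above."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)
    decisions = {}
    others = []
    for v in versions:
        tag = v["version"]
        if tag == "latest":
            decisions[tag] = ("keep", "always: latest")
        elif tag.startswith("corpus-"):
            decisions[tag] = ("keep", "always: corpus-* production pin")
        else:
            others.append(v)
    # Newest first, so the first keep_latest entries are the most recent.
    others.sort(key=lambda v: parse_ts(v["created_at"]), reverse=True)
    for rank, v in enumerate(others):
        tag = v["version"]
        if rank < keep_latest:
            decisions[tag] = ("keep", f"within {keep_latest} most recent")
        elif parse_ts(v["created_at"]) >= cutoff:
            decisions[tag] = ("keep", f"younger than {keep_days} days")
        else:
            decisions[tag] = ("delete", "no keep rule matched")
    return decisions
```

With a plan like this, a dry run only needs to print the decisions and skip the DELETE calls, so the categorization can be checked before anything is removed.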

A local dry run against the current 4 tags categorizes them correctly
and deletes nothing (4 < keep-latest=6).

Leaving continue-on-error: true in the workflows for one more
cycle. If tonight's run passes the GC step cleanly, follow-up
commit removes the safety net.

(The workflow's paths: filter excludes scripts/**, so this commit
doesn't trigger image-only.yml.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 17:42:46 -04:00