fix: stop ignoring corpus/ so the refresh workflow can commit it

PLAN.md's design has corpus/ committed and chroma/+bm25/ regenerated
at CI time. The scaffold's .gitignore over-ignored corpus/, which
meant refresh.yml's `git add bundles.json corpus` silently dropped
the corpus and the changed-detection logic always reported "no
changes — skipping reindex and image build". Net result: refresh
would scrape successfully and then ship nothing.

chroma/ and bm25/ stay ignored — those are rebuilt by
`python -m rag.index --rebuild` before docker build copies them.
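
A guard along these lines could make this failure loud instead of silent. It is a minimal sketch, not something refresh.yml is known to contain: `ignored_paths` and `assert_committable` are hypothetical helper names, and the only git behavior it relies on is the exit status of `git check-ignore -q` (0 if the path is ignored, 1 if it is not, 128 on a fatal error).

```python
import subprocess


def ignored_paths(paths, repo="."):
    """Return the subset of `paths` that git's ignore rules exclude.

    Hypothetical helper: it asks git itself rather than re-implementing
    .gitignore matching.
    """
    ignored = []
    for path in paths:
        # Exit status: 0 = ignored, 1 = not ignored, 128 = fatal error
        # (for example, `repo` is not a git repository).
        result = subprocess.run(
            ["git", "check-ignore", "-q", path],
            cwd=repo,
            capture_output=True,
        )
        if result.returncode == 0:
            ignored.append(path)
        elif result.returncode != 1:
            raise RuntimeError(result.stderr.decode(errors="replace").strip())
    return ignored


def assert_committable(paths, repo="."):
    """Abort if any path the workflow intends to commit is ignored."""
    ignored = ignored_paths(paths, repo)
    if ignored:
        raise SystemExit(f"paths are ignored by .gitignore: {ignored}")
```

Calling `assert_committable(["bundles.json", "corpus"])` before the `git add` step would, under these assumptions, stop the run as soon as an ignore rule covers a path that is supposed to be committed.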

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 13:38:04 -04:00
parent 6b11993688
commit 661b6311ab
+8 -2
@@ -2,11 +2,17 @@
 venv/
 .venv/
-# Regenerable from corpus + CI
-corpus/
+# Indexes are regenerated from corpus by `python -m rag.index --rebuild`
+# (run in CI before docker build). Don't commit them.
 chroma/
 bm25/
+# corpus/ IS committed — the weekly refresh workflow writes scraped
+# pages here and `git add bundles.json corpus`s them. The image-only
+# workflow then rebuilds indexes from the committed corpus without
+# re-scraping. Earlier the .gitignore silently ate `git add corpus`
+# and refresh.yml's commit step would always report "no changes".
 # Python detritus
 __pycache__/
 *.py[cod]