From 661b6311abd43132b377fbe5f5c0d9eb373470d7 Mon Sep 17 00:00:00 2001
From: Justin Paul
Date: Fri, 22 May 2026 13:38:04 -0400
Subject: [PATCH] fix: stop ignoring corpus/ so the refresh workflow can commit it
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PLAN.md's design has corpus/ committed and chroma/+bm25/ regenerated at
CI time. The scaffold's .gitignore over-ignored corpus/, which meant
refresh.yml's `git add bundles.json corpus` silently dropped the corpus
and the changed-detection logic always reported "no changes — skipping
reindex and image build". Net result: refresh would scrape successfully
and then ship nothing.

chroma/ and bm25/ stay ignored — those are rebuilt by
`python -m rag.index --rebuild` before docker build copies them.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitignore | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/.gitignore b/.gitignore
index fbc0883..d592ac8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,11 +2,17 @@
 venv/
 .venv/
 
-# Regenerable from corpus + CI
-corpus/
+# Indexes are regenerated from corpus by `python -m rag.index --rebuild`
+# (run in CI before docker build). Don't commit them.
 chroma/
 bm25/
 
+# corpus/ IS committed — the weekly refresh workflow writes scraped
+# pages here and `git add bundles.json corpus`s them. The image-only
+# workflow then rebuilds indexes from the committed corpus without
+# re-scraping. Earlier the .gitignore silently ate `git add corpus`
+# and refresh.yml's commit step would always report "no changes".
+
 # Python detritus
 __pycache__/
 *.py[cod]
-- 
2.52.0