README: rewrite for crop-chem-docs as a product (was template README) #1

Merged
justin merged 1 commit from readme-update into main 2026-05-25 17:51:21 -04:00

Replaces the never-customized template README with a crop-chem-docs-specific one.

The old README was the docs-mcp-template README verbatim (title "docs-mcp-template", template introduction prose); it was never updated after cloning. The new README covers:

  • Corpus: 4,159 indexed pages (91 Bayer + 4,068 EPA PPLS)
  • Tools: standard docs-mcp-template tools + crop_chem_api_lessons
  • Eval baseline from eval/results/with_rerank.md: hybrid+rerank MRR 0.672 vs BM25-only 0.544 vs hybrid-rrf-without-rerank 0.114. This is the same pattern seed-mcp found independently: dense alone is noise, and hybrid without rerank actively hurts.
  • Note on rerank in production: rerank was silently failing through 2026-05-25 due to the llama-rerank Docker network gotcha; fixed by attaching to the drawbar-backend_default network. Re-running the eval against the now-working rerank is on the follow-up list; expect deployed MRR to lift toward the lab number.
  • Repo layout, infrastructure, deploy mechanics, cross-link to seed-mcp.
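For readers unfamiliar with the two terms in the eval bullet above, here is a minimal sketch of mean reciprocal rank (MRR) and reciprocal rank fusion (RRF). It is not the eval harness from this repo: the function names, the input shapes, and the constant `k=60` (a commonly used RRF default) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set


def mean_reciprocal_rank(
    runs: Dict[str, Sequence[Hashable]],
    relevant: Dict[str, Set[Hashable]],
) -> float:
    """Average over queries of 1/rank of the first relevant hit (0 if none)."""
    if not runs:
        return 0.0
    total = 0.0
    for query, ranked in runs.items():
        wanted = relevant.get(query, set())
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in wanted:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(runs)


def rrf_fuse(rankings: List[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)) across lists."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Stable sort: ties keep the order in which documents were first seen.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data, not taken from the crop-chem-docs corpus.
bm25_hits = ["a", "b", "c"]
dense_hits = ["c", "a", "d"]
fused = rrf_fuse([bm25_hits, dense_hits])  # -> ['a', 'c', 'b', 'd']

toy_runs = {"q1": ["x", "y"], "q2": ["z"], "q3": ["m"]}
toy_relevant = {"q1": {"y"}, "q2": {"z"}, "q3": {"n"}}
toy_mrr = mean_reciprocal_rank(toy_runs, toy_relevant)  # -> 0.5
```

RRF only looks at rank positions, so a noisy dense ranking can pull irrelevant documents up the fused list. That is one plausible reading of why the un-reranked hybrid number sits below BM25-only, though the PR itself does not state the cause.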
justin added 1 commit 2026-05-25 17:51:18 -04:00
The README had never been customized after cloning the
docs-mcp-template — title said "docs-mcp-template" and it read as
the template's generic introduction with no mention of EPA PPLS,
the Bayer scraper, the ~4k label corpus, or the production deploy.

Replace with a crop-chem-docs-specific README that covers:

- Corpus inventory: 4,159 indexed pages (91 Bayer + 4,068 EPA PPLS)
- MCP tool catalog with crop_chem_api_lessons specifics
- Eval baseline from eval/results/with_rerank.md showing
  hybrid+rerank wins (MRR 0.672) over BM25-only (0.544) and that
  hybrid-without-rerank actively HURTS (0.114) — same pattern
  seed-mcp found independently
- Note that the deployed rerank was silently failing through
  2026-05-25 due to the llama-rerank Docker network gotcha;
  fixed and re-running eval is on the followup list
- Quick-start commands
- Repo layout reference
- Infrastructure: registry, embedder pool, shared llama-rerank
  sidecar, PRODUCT_NAME=crop_chem
- Cross-link to the sibling seed-mcp project

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
justin merged commit 4c9b087da8 into main 2026-05-25 17:51:21 -04:00

Reference: justin/crop-chem-docs#1