# docs-mcp-template

A reusable template for building hosted MCP servers over a product's public documentation. Distilled from one production build; everything product-specific has been factored out.

The end product is a streamable-HTTP MCP server with ~15 tools that any LLM client (Claude Desktop, Claude Code, Cursor, Copilot) can call to answer questions against the docs, surface what changed recently, and flag likely inconsistencies.

## What's here

- **[PLAN.md](PLAN.md)**: comprehensive build guide. Phased approach (13 phases, ~2–3 weeks of focused work for the full stack). Includes the design decisions, the gotchas, and a per-product customization checklist.
- **Scaffolded skeleton**: working FastMCP server with stub tools, Dockerfile, docker-compose, CI workflows, eval harness layout, usage logging. Everything you need to `git clone` and start filling in the product-specific bits.

## Quick start

```bash
git clone https://git.jpaul.io/justin/docs-mcp-template.git my-product-docs
cd my-product-docs
git remote remove origin   # detach from template

python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Read PLAN.md before doing anything else. Pay particular attention to
# Phase 1 (scraper); that's the most product-specific phase.

# Run the stub server (no corpus yet; this just verifies the wiring):
python -m docs_mcp.server --transport stdio
```

## Repo layout

```
.
├── PLAN.md                      # The build guide. Read first.
├── README.md
├── requirements.txt
├── Dockerfile
├── .gitignore
├── .gitea/workflows/
│   ├── refresh.yml              # Weekly scrape + index + image push
│   └── image-only.yml           # On-demand code-only ship
├── scrape/
│   ├── README.md                # Product-specific scraper goes here
│   └── changelog.py             # Reusable: --json, --history-out
├── rag/
│   ├── embeddings.py            # Ollama embedder, swappable
│   ├── chunk.py                 # Chunker; adjust per page format
│   ├── index.py                 # Builds Chroma + (optionally) BM25
│   └── bm25.py                  # SQLite FTS5 lexical index
├── docs_mcp/
│   ├── server.py                # FastMCP server with stub tools
│   └── usage.py                 # TimedCall + JSONL telemetry
├── eval/
│   ├── queries.jsonl.example    # Curate ~25 hand-labeled queries
│   ├── retrievers.py            # Retriever protocol + implementations
│   └── run_eval.py              # MRR / Recall@k / nDCG@k harness
├── scripts/
│   ├── usage_report.py          # Standalone log analyzer
│   └── registry_gc.py           # Container registry cleanup
└── deploy/
    └── docker-compose.yml       # Hosting stack: MCP + reranker + Watchtower
```

## What's product-specific (must implement)

- `scrape/`: the scraper itself. The template gives you the corpus layout contract and a working `changelog.py`; the actual extraction logic is yours.
- The corpus on disk (gitignored; rebuilt by CI).
- The reranker GGUF model and llama.cpp container (commented in `deploy/docker-compose.yml`).
- The reverse proxy / TLS layer in front of the public endpoint.
- The hand-curated knowledge surface (your product's API gotchas, example scripts, anything the LLM should know that the docs don't say).
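
The hand-labeled queries under `eval/` are also yours to curate, so it helps to know roughly how they get scored. The sketch below shows how MRR and Recall@k are conventionally computed. It is not the template's `run_eval.py`: the function names, the data shape (a ranked list of page IDs plus a set of relevant IDs per query), and the sample page IDs are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def reciprocal_rank(ranked: Sequence[str], relevant: set[str]) -> float:
    """Return 1/rank of the first relevant hit, or 0.0 if none was retrieved."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked: Sequence[str], relevant: set[str], k: int) -> float:
    """Return the fraction of relevant docs that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def mean_reciprocal_rank(
    runs: Iterable[tuple[Sequence[str], set[str]]],
) -> float:
    """Average the reciprocal rank over (ranked, relevant) pairs, one per query."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical retrieval results for two labeled queries.
runs = [
    # First relevant hit is at rank 3.
    (["install", "faq", "api-auth"], {"api-auth"}),
    # First relevant hit is at rank 1; one relevant page is never retrieved.
    (["changelog", "upgrade"], {"changelog", "upgrade", "migrate"}),
]
mrr = mean_reciprocal_rank(runs)
recall = recall_at_k(runs[1][0], runs[1][1], k=2)
```

With only ~25 queries, a single miss moves MRR noticeably, so differences between retrievers are easier to trust when several queries shift in the same direction.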
## What's NOT product-specific (works as-is)

- FastMCP server skeleton + tool decoration pattern
- Chroma + Ollama embedding pipeline
- BM25 / SQLite FTS5 lexical index
- Hybrid retrieval (RRF) + reranker integration
- Eval harness (Retriever protocol, MRR/Recall/nDCG)
- Usage logging (TimedCall, JSONL, daily rotation)
- CI workflow shape (weekly + on-demand, retry-on-race, three-tag image scheme)
- Registry GC script
- Standard tools: `search_docs`, `get_page`, `list_versions`, `diff_versions`, `bundle_changelog`, `weekly_digest`, `find_doc_inconsistencies`, etc.

## License

Internal template. Adjust before publishing.
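
For reference, the hybrid retrieval item in the "works as-is" list relies on reciprocal rank fusion (RRF), which merges several ranked lists, presumably the lexical and vector results here, before the reranker sees them. A minimal sketch of the technique follows. The function name `rrf_fuse`, the constant `k=60` (a value commonly used for RRF), and the sample page IDs are assumptions, not the template's actual code.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with reciprocal rank fusion.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Returns doc IDs sorted best-first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by doc ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 results from a lexical and a vector retriever.
bm25_hits = ["upgrade", "install", "faq"]
vector_hits = ["install", "api-auth", "upgrade"]
fused = rrf_fuse([bm25_hits, vector_hits])
```

RRF uses only rank positions, so the lexical and vector scores never need to be put on a common scale; pages that both retrievers rank highly rise to the top.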