# docs-mcp-template

A reusable template for building hosted MCP servers over a product's
public documentation. Distilled from one production build; everything
product-specific has been factored out.

The end product is a streamable-HTTP MCP server with ~15 tools that
any LLM client (Claude Desktop, Claude Code, Cursor, Copilot) can
call to answer questions against the docs, surface what changed
recently, and flag likely inconsistencies.

## What's here

- **[PLAN.md](PLAN.md)** — comprehensive build guide. Phased
  approach (13 phases, ~2–3 weeks of focused work for the full
  stack). Includes the design decisions, the gotchas, and a
  per-product customization checklist.
- **Scaffolded skeleton** — working FastMCP server with stub tools,
  Dockerfile, docker-compose, CI workflows, eval harness layout,
  usage logging. Everything you need to `git clone` and start
  filling in the product-specific bits.

## Quick start

```bash
git clone https://git.jpaul.io/justin/docs-mcp-template.git my-product-docs
cd my-product-docs
git remote remove origin  # detach from template
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Read PLAN.md before doing anything else. Pay particular attention to
# Phase 1 (scraper) — that's the most product-specific phase.

# Run the stub server (no corpus yet — just verifies the wiring):
python -m docs_mcp.server --transport stdio
```

## Repo layout

```
.
├── PLAN.md                        # The build guide. Read first.
├── README.md
├── requirements.txt
├── Dockerfile
├── .gitignore
├── .gitea/workflows/
│   ├── refresh.yml                # Weekly scrape + index + image push
│   └── image-only.yml             # On-demand code-only ship
├── scrape/
│   ├── README.md                  # Product-specific scraper goes here
│   └── changelog.py               # Reusable: --json, --history-out
├── rag/
│   ├── embeddings.py              # Ollama embedder, swappable
│   ├── chunk.py                   # Chunker — adjust per page format
│   ├── index.py                   # Builds Chroma + (optionally) BM25
│   └── bm25.py                    # SQLite FTS5 lexical index
├── docs_mcp/
│   ├── server.py                  # FastMCP server with stub tools
│   └── usage.py                   # TimedCall + JSONL telemetry
├── eval/
│   ├── queries.jsonl.example      # Curate ~25 hand-labeled queries
│   ├── retrievers.py              # Retriever protocol + implementations
│   └── run_eval.py                # MRR / Recall@k / nDCG@k harness
├── scripts/
│   ├── usage_report.py            # Standalone log analyzer
│   └── registry_gc.py             # Container registry cleanup
└── deploy/
    └── docker-compose.yml         # Hosting stack: MCP + reranker + Watchtower
```

## What's product-specific (must implement)

- `scrape/` — the scraper itself. The template gives you the corpus
  layout contract and a working `changelog.py`; the actual extraction
  logic is yours.
- The corpus on disk (gitignored; rebuilt by CI).
- The reranker GGUF model and llama.cpp container (commented in
  `deploy/docker-compose.yml`).
- The reverse proxy / TLS layer in front of the public endpoint.
- The hand-curated knowledge surface (your product's API gotchas,
  example scripts, anything the LLM should know that the docs
  don't say).

## What's NOT product-specific (works as-is)

- FastMCP server skeleton + tool decoration pattern
- Chroma + Ollama embedding pipeline
- BM25 / SQLite FTS5 lexical index
- Hybrid retrieval (RRF) + reranker integration
- Eval harness (Retriever protocol, MRR/Recall/nDCG)
- Usage logging (TimedCall, JSONL, daily rotation)
- CI workflow shape (weekly + on-demand, retry-on-race, three-tag
  image scheme)
- Registry GC script
- Standard tools: `search_docs`, `get_page`, `list_versions`,
  `diff_versions`, `bundle_changelog`, `weekly_digest`,
  `find_doc_inconsistencies`, etc.

## License

Internal template. Adjust before publishing.