docs: replace template README with HVM-specific content
The repo still shipped the docs-mcp-template README. Replaced it with HVM-specific content: tool table (11 tools), bundle catalog, retrieval eval numbers, architecture diagram, CI cadences, local dev recipes, and repo layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,104 +1,181 @@
|
|||||||
# docs-mcp-template
|
# hvm-docs
|
||||||
|
|
||||||
A reusable template for building hosted MCP servers over a product's
|
A hosted MCP server over the public documentation for **HPE Morpheus
|
||||||
public documentation. Distilled from one production build; everything
|
VM Essentials Software** (HVM) — the KVM-based hypervisor platform
|
||||||
product-specific has been factored out.
|
from HPE. Lets any MCP-aware client (Claude Desktop, Claude Code,
|
||||||
|
Cursor, Copilot, MetaMCP) answer questions against the User Manual,
|
||||||
|
Release Notes, and Deployment Guide; diff pages across 8.1.x
|
||||||
|
versions; surface what changed recently; and (when enabled) submit
|
||||||
|
documentation bugs back to HPE.
|
||||||
|
|
||||||
The end product is a streamable-HTTP MCP server with ~15 tools that
|
Live behind MetaMCP at `https://mcp.jpaul.io/metamcp/hvm-docs/mcp`
|
||||||
any LLM client (Claude Desktop, Claude Code, Cursor, Copilot) can
|
once deployed.
|
||||||
call to answer questions against the docs, surface what changed
|
|
||||||
recently, find inconsistencies, and (optionally) submit doc bugs
|
|
||||||
back upstream.
|
|
||||||
|
|
||||||
## What's here
|
## Tools
|
||||||
|
|
||||||
- **[PLAN.md](PLAN.md)** — comprehensive build guide. Phased
|
11 tools, registered over MCP streamable-HTTP:
|
||||||
approach (13 phases, ~2–3 weeks of focused work for the full
|
|
||||||
stack). Includes the design decisions, the gotchas, and a
|
|
||||||
per-product customization checklist.
|
|
||||||
- **Scaffolded skeleton** — working FastMCP server with stub tools,
|
|
||||||
Dockerfile, docker-compose, CI workflows, eval harness layout,
|
|
||||||
usage logging. Everything you need to `git clone` and start
|
|
||||||
filling in the product-specific bits.
|
|
||||||
|
|
||||||
## Quick start
|
| Tool | Use |
|
||||||
|
|---|---|
|
||||||
|
| `search_docs` | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when `RERANK_URL` is set |
|
||||||
|
| `get_page` | Full markdown of one page with metadata header + source URL |
|
||||||
|
| `list_versions` | Discover available versions, doc types, and bundle slugs |
|
||||||
|
| `list_cluster` | Cross-version peers of a page (synthesized from same-GUID overlap) |
|
||||||
|
| `diff_versions` | Unified diff of one topic between two bundles |
|
||||||
|
| `bundle_changelog` | Added / removed / churn-ranked changed pages between two bundles |
|
||||||
|
| `weekly_digest` | "What changed in the docs in the last N days" — reads CI-baked history.jsonl |
|
||||||
|
| `corpus_status` | Image build time, upstream Published date, total bundles/pages/chunks |
|
||||||
|
| `hvm_api_lessons` | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) |
|
||||||
|
| `find_doc_inconsistencies` | Scoped scan for cross-version drift + redirect-chain stub pages |
|
||||||
|
| `submit_doc_bug` | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with manual-fallback) |
|
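As a rough illustration of what the `diff_versions` row describes (a unified diff of one topic between two bundles), the sketch below uses Python's standard `difflib`. The function name `diff_pages`, the labels, and the sample page text are hypothetical; the actual tool implementation is not shown in this commit and may differ.

```python
import difflib


def diff_pages(old_md: str, new_md: str, old_label: str, new_label: str) -> str:
    """Return a unified diff of two markdown pages ('' when they are identical)."""
    lines = difflib.unified_diff(
        old_md.splitlines(),
        new_md.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs carry no newlines, so the headers should not either
    )
    return "\n".join(lines)


# Made-up page bodies, not real HVM documentation text.
old_page = "# Sample topic\nStep one.\nOld wording.\n"
new_page = "# Sample topic\nStep one.\nNew wording.\n"

print(diff_pages(old_page, new_page,
                 "hvm_user_manual_8_1_1/<GUID>.md",
                 "hvm_user_manual_8_1_2/<GUID>.md"))
```

Returning an empty string for identical pages gives a caller a cheap "no change" check before rendering anything.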
||||||
|
|
||||||
|
## Corpus
|
||||||
|
|
||||||
|
Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal):
|
||||||
|
|
||||||
|
| Bundle | docId | Pages |
|
||||||
|
|---|---|---|
|
||||||
|
| `hvm_user_manual_8_1_0` | `sd00007520en_us` | 374 |
|
||||||
|
| `hvm_user_manual_8_1_1` | `sd00007620en_us` | 376 |
|
||||||
|
| `hvm_user_manual_8_1_2` | `sd00007735en_us` | 376 |
|
||||||
|
| `hvm_release_notes_8_1_0` | `sd00007497en_us` | 1 |
|
||||||
|
| `hvm_release_notes_8_1_1` | `sd00007609en_us` | 1 |
|
||||||
|
| `hvm_release_notes_8_1_2` | `sd00007734en_us` | 1 |
|
||||||
|
| `hvm_deployment_guide` | `sd00007332en_us` | 32 |
|
||||||
|
|
||||||
|
Total: ~1,161 pages → 2,650 chunks in Chroma + same chunks indexed in
|
||||||
|
SQLite FTS5 (BM25).
|
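For readers unfamiliar with FTS5, here is a minimal sketch of BM25-ranked lookup over chunks using Python's built-in `sqlite3`. It assumes the local SQLite build ships with FTS5 enabled; the table name `chunks`, its columns, and the sample rows are hypothetical and are not taken from `rag/bm25.py`.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (chunk_id, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    # chunk_id is stored but not tokenized; only body is searchable.
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(chunk_id UNINDEXED, body)")
    conn.executemany("INSERT INTO chunks (chunk_id, body) VALUES (?, ?)", rows)
    return conn


def search(conn, query, k=5):
    """Return up to k (chunk_id, score) rows, best match first.

    FTS5's bm25() yields lower (more negative) values for better matches,
    so an ascending sort puts the best hit first.
    """
    return conn.execute(
        "SELECT chunk_id, bm25(chunks) AS score FROM chunks "
        "WHERE chunks MATCH ? ORDER BY score LIMIT ?",
        (query, k),
    ).fetchall()


# Made-up chunk text for illustration only.
conn = build_index([
    ("c1", "Create a backup schedule for a virtual machine"),
    ("c2", "Configure cluster networking"),
    ("c3", "Restore a virtual machine from backup"),
])
print(search(conn, "backup schedule"))
```

Raw user queries containing punctuation would need quoting or escaping before being passed to `MATCH`; that step is left out here.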
||||||
|
|
||||||
|
GUIDs are stable across HVM versions, so `topic_cluster` cross-version
|
||||||
|
peer mapping is free (no fuzzy matching needed).
|
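Because the corpus layout is `corpus/<bundle>/<GUID>.{md,json}`, the cross-version mapping described above can be sketched as a plain group-by on the file stem. The function name `cluster_by_guid` and the placeholder GUIDs are hypothetical; this is an illustration of the idea, not the repo's code.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def cluster_by_guid(paths):
    """Map each GUID (file stem) to the sorted list of bundles that contain it."""
    clusters = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        clusters[path.stem].append(path.parent.name)
    return {guid: sorted(bundles) for guid, bundles in clusters.items()}


# Placeholder GUIDs; real ones come from the scraped corpus.
paths = [
    "corpus/hvm_user_manual_8_1_0/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_1/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_2/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_2/GUID-BBBB.md",
]
peers = cluster_by_guid(paths)
print(peers["GUID-AAAA"])
```

A GUID that maps to a single bundle, like `GUID-BBBB` here, is a page that has no peer in the other versions.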
||||||
|
|
||||||
|
## Retrieval
|
||||||
|
|
||||||
|
Eval against 22 hand-curated golden queries — see
|
||||||
|
[`eval/results/baseline.md`](eval/results/baseline.md):
|
||||||
|
|
||||||
|
| Retriever | MRR | Recall@5 | nDCG@5 | latency |
|
||||||
|
|---|---:|---:|---:|---:|
|
||||||
|
| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms |
|
||||||
|
| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms |
|
||||||
|
| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms |
|
||||||
|
| **bm25 + jina-rerank** | **0.920** | **0.939** | **0.927** | 490 ms (CPU) / ~50 ms (GPU) |
|
||||||
|
|
||||||
|
HPE docs use controlled vocabulary, so lexical match dominates; the
|
||||||
|
cross-encoder cleans up the long tail. See PLAN.md Phase 7/8 for the
|
||||||
|
reasoning.
|
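For reference, the three metrics in the table can be computed per query as sketched below, assuming binary relevance (a page is either a correct answer or not). These are textbook definitions written for illustration; `eval/run_eval.py` may weight or aggregate differently. MRR is the mean of `reciprocal_rank` over all queries.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1/rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant set that appears in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-gain nDCG: DCG of the ranking divided by the best achievable DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


# Toy query: the single relevant page is ranked second.
ranked = ["page-a", "page-b", "page-c"]
relevant = {"page-b"}
print(reciprocal_rank(ranked, relevant),
      recall_at_k(ranked, relevant, 5),
      ndcg_at_k(ranked, relevant, 5))
```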
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
HPE Support DocPortal (sniff-the-API, no auth)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
scrape/ ──► corpus/<bundle>/<GUID>.{md,json} (committed)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
rag/index ──► chroma/ (dense, 768-dim nomic-embed-text)
|
||||||
|
──► bm25/ (SQLite FTS5)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
docs_mcp.server (FastMCP, streamable-HTTP)
|
||||||
|
│
|
||||||
|
├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
deploy/docker-compose.yml
|
||||||
|
│
|
||||||
|
├── MetaMCP gateway ── public at mcp.jpaul.io behind Cloudflare Tunnel
|
||||||
|
├── jina-rerank ── shared GPU sidecar (1080 Ti)
|
||||||
|
└── Watchtower ── auto-pulls :latest on weekly refresh
|
||||||
|
```
|
||||||
|
|
||||||
|
## CI (Gitea Actions on `git.jpaul.io`)
|
||||||
|
|
||||||
|
Two cadences:
|
||||||
|
|
||||||
|
- **`refresh.yml`** — weekly Monday 06:00 UTC cron + manual dispatch.
|
||||||
|
Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25,
|
||||||
|
builds & pushes image. ~5–8 min on the GPU pool.
|
||||||
|
- **`image-only.yml`** — manual dispatch. Skips scrape; rebuilds
|
||||||
|
indexes from committed corpus and ships a new image. ~3 min.
|
||||||
|
|
||||||
|
Image: `git.jpaul.io/justin/hvm-docs:latest` (Watchtower target),
|
||||||
|
plus rolling `:<sha7>` and `:YYYY.MM.DD` tags.
|
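The three-tag scheme can be sketched as a small helper. The function name `image_tags` and the example commit hash are hypothetical; the workflows themselves may compute the tags in shell rather than Python.

```python
from datetime import date


def image_tags(repo, commit_sha, build_date):
    """Return the :latest, :<sha7> and :YYYY.MM.DD tags for one build."""
    return [
        f"{repo}:latest",
        f"{repo}:{commit_sha[:7]}",
        f"{repo}:{build_date:%Y.%m.%d}",
    ]


# Example commit hash is a placeholder.
tags = image_tags("git.jpaul.io/justin/hvm-docs",
                  "0123456789abcdef0123456789abcdef01234567",
                  date(2026, 5, 22))
print(tags)
```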
||||||
|
|
||||||
|
Embeddings fan out across the two GPU-pinned Ollama containers on
|
||||||
|
the Gitea host (`192.168.0.2:11435` Titan X, `:11436` 1080 Ti) — same
|
||||||
|
infra zerto-docs uses; see `OLLAMA_URLS` in both workflows.
|
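One way to picture the fan-out is a round-robin assignment of embedding batches to endpoints, sketched below. It assumes `OLLAMA_URLS` is a comma-separated list, which this commit does not confirm; `parse_urls`, `assign_batches`, and the batch size are hypothetical, and no network call is made.

```python
from itertools import cycle


def parse_urls(raw):
    """Split a comma-separated URL list, dropping blanks and trailing slashes."""
    return [u.strip().rstrip("/") for u in raw.split(",") if u.strip()]


def assign_batches(texts, urls, batch_size=16):
    """Pair each batch of texts with an endpoint, cycling through the endpoints."""
    if not urls:
        raise ValueError("at least one embedding endpoint is required")
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # zip stops when the batches run out, so cycle() never loops forever.
    return list(zip(cycle(urls), batches))


# Assumed format: comma-separated base URLs.
urls = parse_urls("http://192.168.0.2:11435, http://192.168.0.2:11436")
plan = assign_batches([f"chunk {i}" for i in range(5)], urls, batch_size=2)
for url, batch in plan:
    print(url, len(batch))
```

Each `(url, batch)` pair could then be handed to a worker that posts the batch to that endpoint's embedding API.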
||||||
|
|
||||||
|
## Local dev
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://git.jpaul.io/justin/docs-mcp-template.git my-product-docs
|
|
||||||
cd my-product-docs
|
|
||||||
git remote remove origin # detach from template
|
|
||||||
python -m venv venv && source venv/bin/activate
|
python -m venv venv && source venv/bin/activate
|
||||||
pip install -r requirements.txt
|
pip install -r requirements.txt
|
||||||
|
|
||||||
# Read PLAN.md before doing anything else. Pay particular attention to
|
# (Optional) the CPU dev reranker — pulls PyTorch (~2 GB); skip if
|
||||||
# Phase 1 (scraper) — that's the most product-specific phase.
|
# you'll just be running stdio queries.
|
||||||
|
pip install -r requirements-rerank.txt
|
||||||
|
|
||||||
# Run the stub server (no corpus yet — just verifies the wiring):
|
# Build / refresh the corpus + indexes
|
||||||
|
python -m scrape.bundles
|
||||||
|
python -m scrape.runner --all --force --concurrency 6
|
||||||
|
python -m rag.index --rebuild
|
||||||
|
|
||||||
|
# Local stdio server (Claude Desktop dev)
|
||||||
python -m docs_mcp.server --transport stdio
|
python -m docs_mcp.server --transport stdio
|
||||||
|
|
||||||
|
# Local streamable-HTTP for integration testing
|
||||||
|
python -m docs_mcp.server --transport streamable-http --port 8000
|
||||||
|
|
||||||
|
# Run the eval harness (without reranker)
|
||||||
|
python -m eval.run_eval --k 5
|
||||||
|
|
||||||
|
# With the dev reranker
|
||||||
|
python -m scripts.rerank_server &
|
||||||
|
RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5
|
||||||
```
|
```
|
||||||
|
|
||||||
## Repo layout
|
## Repo layout
|
||||||
|
|
||||||
```
|
```
|
||||||
.
|
.
|
||||||
├── PLAN.md # The build guide. Read first.
|
├── PLAN.md # 13-phase build guide (template-shared)
|
||||||
├── README.md
|
├── CLAUDE.md # Claude Code guidance
|
||||||
├── requirements.txt
|
├── README.md # this file
|
||||||
├── Dockerfile
|
├── Dockerfile
|
||||||
├── .gitignore
|
├── requirements.txt # production deps
|
||||||
├── .gitea/workflows/
|
├── requirements-rerank.txt # dev CPU reranker only
|
||||||
│ ├── refresh.yml # Weekly scrape + index + image push
|
├── bundles.json # bundle catalog (committed)
|
||||||
│ └── image-only.yml # On-demand code-only ship
|
├── corpus/ # 1,161 scraped pages (committed)
|
||||||
|
├── .gitea/workflows/ # refresh.yml + image-only.yml
|
||||||
├── scrape/
|
├── scrape/
|
||||||
│ ├── README.md # Product-specific scraper goes here
|
│ ├── bundles.py # HVM bundle catalog + discovery
|
||||||
│ └── changelog.py # Reusable: --json, --history-out
|
│ ├── runner.py # TOC + single-doc page scraper
|
||||||
|
│ └── changelog.py # git-history → digest JSONL
|
||||||
├── rag/
|
├── rag/
|
||||||
│ ├── embeddings.py # Ollama embedder, swappable
|
│ ├── chunk.py # paragraph-aware splitter w/ 6 KB hard cap
|
||||||
│ ├── chunk.py # Chunker — adjust per page format
|
│ ├── embeddings.py # OLLAMA_URLS (zerto-style fan-out)
|
||||||
│ ├── index.py # Builds Chroma + (optionally) BM25
|
│ ├── index.py # builds Chroma + BM25
|
||||||
│ └── bm25.py # SQLite FTS5 lexical index
|
│ └── bm25.py # FTS5 lexical index
|
||||||
├── docs_mcp/
|
├── docs_mcp/
|
||||||
│ ├── server.py # FastMCP server with stub tools
|
│ ├── server.py # FastMCP + 11 tools
|
||||||
│ └── usage.py # TimedCall + JSONL telemetry
|
│ ├── usage.py # TimedCall JSONL telemetry
|
||||||
|
│ └── api_lessons.md # curated HVM operator gotchas
|
||||||
├── eval/
|
├── eval/
|
||||||
│ ├── queries.jsonl.example # Curate ~25 hand-labeled queries
|
│ ├── queries.jsonl # 22 hand-curated golden queries
|
||||||
│ ├── retrievers.py # Retriever protocol + implementations
|
│ ├── retrievers.py # Dense/BM25/Hybrid/Reranked
|
||||||
│ └── run_eval.py # MRR / Recall@k / nDCG@k harness
|
│ ├── run_eval.py # MRR / Recall@K / nDCG@K
|
||||||
|
│ └── results/baseline.md # committed eval results
|
||||||
├── scripts/
|
├── scripts/
|
||||||
│ ├── usage_report.py # Standalone log analyzer
|
│ ├── rerank_server.py # dev/CPU cross-encoder /v1/rerank
|
||||||
│ └── registry_gc.py # Container registry cleanup
|
│ ├── usage_report.py # log summarizer
|
||||||
|
│ └── registry_gc.py # Gitea container-registry cleanup
|
||||||
└── deploy/
|
└── deploy/
|
||||||
└── docker-compose.yml # Hosting stack: MCP + reranker + Watchtower
|
└── docker-compose.yml # production hosting (MCP + reranker + Watchtower)
|
||||||
```
|
```
|
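The layout describes `rag/chunk.py` as a paragraph-aware splitter with a 6 KB hard cap. A minimal sketch of that idea follows, assuming "6 KB" means 6 × 1024 UTF-8 bytes and that paragraphs are separated by blank lines; the function name `chunk_paragraphs` is hypothetical and the real chunker may handle headings or overlap differently.

```python
def chunk_paragraphs(text, max_bytes=6 * 1024):
    """Pack whole paragraphs into chunks of at most max_bytes UTF-8 bytes.

    A paragraph that is too large on its own is hard-split at the cap.
    Assumes max_bytes >= 4 so every split contains at least one character.
    """
    chunks = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
            continue
        if current:
            chunks.append(current)
        data = para.encode("utf-8")
        while len(data) > max_bytes:
            # errors="ignore" drops a multi-byte character cut at the boundary;
            # it is picked up again at the start of the next piece.
            piece = data[:max_bytes].decode("utf-8", errors="ignore")
            chunks.append(piece)
            data = data[len(piece.encode("utf-8")):]
        current = data.decode("utf-8")
    if current:
        chunks.append(current)
    return chunks


# Tiny cap so the behavior is visible on a short sample.
sample = "aaaa\n\nbbbb\n\n" + "c" * 25
print(chunk_paragraphs(sample, max_bytes=10))
```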
||||||
|
|
||||||
## What's product-specific (must implement)
|
|
||||||
|
|
||||||
- `scrape/` — the scraper itself. The template gives you the corpus
|
|
||||||
layout contract and a working `changelog.py`; the actual extraction
|
|
||||||
logic is yours.
|
|
||||||
- The corpus on disk (gitignored; rebuilt by CI).
|
|
||||||
- The reranker GGUF model and llama.cpp container (commented in
|
|
||||||
`deploy/docker-compose.yml`).
|
|
||||||
- The reverse proxy / TLS layer in front of the public endpoint.
|
|
||||||
- The hand-curated knowledge surface (your product's API gotchas,
|
|
||||||
example scripts, anything the LLM should know that the docs
|
|
||||||
don't say).
|
|
||||||
|
|
||||||
## What's NOT product-specific (works as-is)
|
|
||||||
|
|
||||||
- FastMCP server skeleton + tool decoration pattern
|
|
||||||
- Chroma + Ollama embedding pipeline
|
|
||||||
- BM25 / SQLite FTS5 lexical index
|
|
||||||
- Hybrid retrieval (RRF) + reranker integration
|
|
||||||
- Eval harness (Retriever protocol, MRR/Recall/nDCG)
|
|
||||||
- Usage logging (TimedCall, JSONL, daily rotation)
|
|
||||||
- CI workflow shape (weekly + on-demand, retry-on-race, three-tag
|
|
||||||
image scheme)
|
|
||||||
- Registry GC script
|
|
||||||
- Standard tools: `search_docs`, `get_page`, `list_versions`,
|
|
||||||
`diff_versions`, `bundle_changelog`, `weekly_digest`,
|
|
||||||
`find_doc_inconsistencies`, `submit_doc_bug`, etc.
|
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
Internal template. Adjust before publishing.
|
Internal — HVM is HPE's product; the docs MCP is a side project, not
|
||||||
|
HPE-sanctioned.
|
||||||
|
|||||||