# hvm-docs

A hosted MCP server over the public documentation for **HPE Morpheus VM Essentials Software** (HVM) — the KVM-based hypervisor platform from HPE. Lets any MCP-aware client (Claude Desktop, Claude Code, Cursor, Copilot, MetaMCP) answer questions against the User Manual, Release Notes, and Deployment Guide; diff pages across 8.1.x versions; surface what changed recently; and (when enabled) submit documentation bugs back to HPE.

Live behind MetaMCP at `https://mcp.jpaul.io/metamcp/hvm-docs/mcp` once deployed.

## Tools

11 tools, registered over MCP streamable-HTTP:

| Tool | Use |
|---|---|
| `search_docs` | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when `RERANK_URL` is set |
| `get_page` | Full markdown of one page with metadata header + source URL |
| `list_versions` | Discover available versions, doc types, and bundle slugs |
| `list_cluster` | Cross-version peers of a page (synthesized from same-GUID overlap) |
| `diff_versions` | Unified diff of one topic between two bundles |
| `bundle_changelog` | Added / removed / churn-ranked changed pages between two bundles |
| `weekly_digest` | "What changed in the docs in the last N days" — reads CI-baked history.jsonl |
| `corpus_status` | Image build time, upstream Published date, total bundles/pages/chunks |
| `hvm_api_lessons` | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) |
| `find_doc_inconsistencies` | Scoped scan for cross-version drift + redirect-chain stub pages |
| `submit_doc_bug` | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with manual-fallback) |

## Corpus

Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal):

| Bundle | docId | Pages |
|---|---|---|
| `hvm_user_manual_8_1_0` | `sd00007520en_us` | 374 |
| `hvm_user_manual_8_1_1` | `sd00007620en_us` | 376 |
| `hvm_user_manual_8_1_2` | `sd00007735en_us` | 376 |
| `hvm_release_notes_8_1_0` | `sd00007497en_us` | 1 |
| `hvm_release_notes_8_1_1` | `sd00007609en_us` | 1 |
| `hvm_release_notes_8_1_2` | `sd00007734en_us` | 1 |
| `hvm_deployment_guide` | `sd00007332en_us` | 32 |

Total: ~1,161 pages → 2,650 chunks in Chroma + same chunks indexed in SQLite FTS5 (BM25).

GUIDs are stable across HVM versions, so `topic_cluster` cross-version peer mapping is free (no fuzzy matching needed).

## Retrieval

Eval against 22 hand-curated golden queries — see [`eval/results/baseline.md`](eval/results/baseline.md):

| Retriever | MRR | Recall@5 | nDCG@5 | latency |
|---|---:|---:|---:|---:|
| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms |
| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms |
| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms |
| **bm25 + jina-rerank** | **0.920** | **0.939** | **0.927** | 490 ms (CPU) / ~50 ms (GPU) |

HPE docs use controlled vocabulary, so lexical match dominates; the cross-encoder cleans up the long tail. See PLAN.md Phase 7/8 for the reasoning.

## Architecture

```
HPE Support DocPortal (sniff-the-API, no auth)
        │
        ▼
   scrape/  ──►  corpus//.{md,json}  (committed)
        │
        ▼
   rag/index  ──►  chroma/  (dense, 768-dim nomic-embed-text)
              ──►  bm25/    (SQLite FTS5)
        │
        ▼
   docs_mcp.server  (FastMCP, streamable-HTTP)
        │
        ├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar)
        │
        ▼
   deploy/docker-compose.yml
        │
        ├── MetaMCP gateway ── public at mcp.jpaul.io behind Cloudflare Tunnel
        ├── jina-rerank     ── shared GPU sidecar (1080 Ti)
        └── Watchtower      ── auto-pulls :latest on weekly refresh
```

## CI (Gitea Actions on `git.jpaul.io`)

Two cadences:

- **`refresh.yml`** — weekly Monday 06:00 UTC cron + manual dispatch. Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25, builds & pushes image. ~5–8 min on the GPU pool.
- **`image-only.yml`** — manual dispatch. Skips scrape; rebuilds indexes from committed corpus and ships a new image. ~3 min.
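
As a rough illustration of the weekly cadence above, the sketch below computes the next Monday 06:00 UTC from a given time. The function name `next_weekly_refresh` is hypothetical and not part of this repo. The real schedule lives in `refresh.yml`, and the equivalent cron expression (`0 6 * * 1`) is an assumption inferred from the description, not copied from the workflow.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_refresh(now: datetime) -> datetime:
    """Return the next Monday 06:00 UTC strictly after `now`.

    Hypothetical helper for illustration; `now` must be timezone-aware.
    Assumes the schedule is equivalent to the cron expression `0 6 * * 1`.
    """
    now = now.astimezone(timezone.utc)
    # weekday() is 0 for Monday: step back to this week's Monday, then pin 06:00.
    candidate = (now - timedelta(days=now.weekday())).replace(
        hour=6, minute=0, second=0, microsecond=0
    )
    # If this week's slot has already passed (or is exactly now), use next week's.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_weekly_refresh(datetime.now(timezone.utc)).isoformat())
```

Something like this can help estimate how stale the deployed corpus may be between refreshes; `corpus_status` remains the authoritative source for the actual image build time.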

Image: `git.jpaul.io/justin/hvm-docs:latest` (Watchtower target), plus rolling `:` and `:YYYY.MM.DD` tags.

Embeddings fan out across the two GPU-pinned Ollama containers on the Gitea host (`192.168.0.2:11435` Titan X, `:11436` 1080 Ti) — same infra zerto-docs uses; see `OLLAMA_URLS` in both workflows.

## Local dev

```bash
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# (Optional) the CPU dev reranker — pulls PyTorch (~2 GB); skip if
# you'll just be running stdio queries.
pip install -r requirements-rerank.txt

# Build / refresh the corpus + indexes
python -m scrape.bundles
python -m scrape.runner --all --force --concurrency 6
python -m rag.index --rebuild

# Local stdio server (Claude Desktop dev)
python -m docs_mcp.server --transport stdio

# Local streamable-HTTP for integration testing
python -m docs_mcp.server --transport streamable-http --port 8000

# Run the eval harness (without reranker)
python -m eval.run_eval --k 5

# With the dev reranker
python -m scripts.rerank_server &
RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5
```

## Repo layout

```
.
├── PLAN.md                     # 13-phase build guide (template-shared)
├── CLAUDE.md                   # Claude Code guidance
├── README.md                   # this file
├── Dockerfile
├── requirements.txt            # production deps
├── requirements-rerank.txt     # dev CPU reranker only
├── bundles.json                # bundle catalog (committed)
├── corpus/                     # 1,161 scraped pages (committed)
├── .gitea/workflows/           # refresh.yml + image-only.yml
├── scrape/
│   ├── bundles.py              # HVM bundle catalog + discovery
│   ├── runner.py               # TOC + single-doc page scraper
│   └── changelog.py            # git-history → digest JSONL
├── rag/
│   ├── chunk.py                # paragraph-aware splitter w/ 6 KB hard cap
│   ├── embeddings.py           # OLLAMA_URLS (zerto-style fan-out)
│   ├── index.py                # builds Chroma + BM25
│   └── bm25.py                 # FTS5 lexical index
├── docs_mcp/
│   ├── server.py               # FastMCP + 11 tools
│   ├── usage.py                # TimedCall JSONL telemetry
│   └── api_lessons.md          # curated HVM operator gotchas
├── eval/
│   ├── queries.jsonl           # 22 hand-curated golden queries
│   ├── retrievers.py           # Dense/BM25/Hybrid/Reranked
│   ├── run_eval.py             # MRR / Recall@K / nDCG@K
│   └── results/baseline.md     # committed eval results
├── scripts/
│   ├── rerank_server.py        # dev/CPU cross-encoder /v1/rerank
│   ├── usage_report.py         # log summarizer
│   └── registry_gc.py          # Gitea container-registry cleanup
└── deploy/
    └── docker-compose.yml      # production hosting (MCP + reranker + Watchtower)
```

## License

Internal — HVM is HPE's product; the docs MCP is a side project, not HPE-sanctioned.