# hvm-docs

A hosted MCP server over the public documentation for **HPE Morpheus
VM Essentials Software** (HVM), the KVM-based hypervisor platform
from HPE. It lets any MCP-aware client (Claude Desktop, Claude Code,
Cursor, Copilot, MetaMCP) answer questions against the User Manual,
Release Notes, and Deployment Guide; diff pages across 8.1.x
versions; surface what changed recently; and (when enabled) submit
documentation bugs back to HPE.

Live behind MetaMCP at `https://mcp.jpaul.io/metamcp/hvm-docs/mcp`
once deployed.

## Tools

11 tools, registered over MCP streamable-HTTP:

| Tool | Use |
|---|---|
| `search_docs` | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when `RERANK_URL` is set |
| `get_page` | Full markdown of one page with metadata header + source URL |
| `list_versions` | Discover available versions, doc types, and bundle slugs |
| `list_cluster` | Cross-version peers of a page (synthesized from same-GUID overlap) |
| `diff_versions` | Unified diff of one topic between two bundles |
| `bundle_changelog` | Added / removed / churn-ranked changed pages between two bundles |
| `weekly_digest` | "What changed in the docs in the last N days"; reads the CI-baked `history.jsonl` |
| `corpus_status` | Image build time, upstream Published date, total bundles/pages/chunks |
| `hvm_api_lessons` | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) |
| `find_doc_inconsistencies` | Scoped scan for cross-version drift + redirect-chain stub pages |
| `submit_doc_bug` | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with a manual fallback) |

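As a rough illustration of how a client reaches these tools, the sketch below builds the JSON-RPC 2.0 `tools/call` message that MCP uses to invoke a tool. The argument names (`query`, `version`, `limit`) are assumptions made for illustration; the real input schema is whatever the server advertises through `tools/list`.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: check the schema returned by `tools/list`.
payload = build_tool_call(
    "search_docs",
    {"query": "cluster upgrade order", "version": "8.1.2", "limit": 5},
)
print(payload)
```

In practice an MCP client library handles the session handshake and the streamable-HTTP transport; this only shows the shape of the message that ends up on the wire.
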
## Corpus

Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal):

| Bundle | docId | Pages |
|---|---|---|
| `hvm_user_manual_8_1_0` | `sd00007520en_us` | 374 |
| `hvm_user_manual_8_1_1` | `sd00007620en_us` | 376 |
| `hvm_user_manual_8_1_2` | `sd00007735en_us` | 376 |
| `hvm_release_notes_8_1_0` | `sd00007497en_us` | 1 |
| `hvm_release_notes_8_1_1` | `sd00007609en_us` | 1 |
| `hvm_release_notes_8_1_2` | `sd00007734en_us` | 1 |
| `hvm_deployment_guide` | `sd00007332en_us` | 32 |

Total: 1,161 pages → 2,650 chunks in Chroma, with the same chunks indexed in
SQLite FTS5 (BM25).

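The lexical half of that setup can be sketched with Python's built-in `sqlite3` module, assuming the underlying SQLite build has FTS5 enabled. The table and column names (`chunks`, `bundle`, `guid`, `body`) and the sample rows are made up for illustration and are not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per chunk; `bundle` and `guid` are stored
# alongside the text but excluded from the full-text index.
conn.execute(
    "CREATE VIRTUAL TABLE chunks USING fts5(bundle UNINDEXED, guid UNINDEXED, body)"
)
conn.executemany(
    "INSERT INTO chunks (bundle, guid, body) VALUES (?, ?, ?)",
    [
        ("hvm_user_manual_8_1_2", "GUID-A", "Configure backups for a cluster before upgrading."),
        ("hvm_user_manual_8_1_2", "GUID-B", "Console keyboard layouts for virtual machines."),
        ("hvm_deployment_guide", "GUID-C", "Manager sizing guidance for small deployments."),
    ],
)


def bm25_search(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (bundle, guid) pairs, best match first.

    FTS5's bm25() yields lower (more negative) values for better matches,
    so an ascending sort puts the strongest hits first.
    """
    rows = conn.execute(
        "SELECT bundle, guid FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


print(bm25_search("backups"))
```

Version or bundle filters would become ordinary `WHERE` clauses on the unindexed columns in a layout like this one.
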
Page GUIDs are stable across HVM versions, so the `topic_cluster`
cross-version peer mapping comes for free: pages are matched by GUID,
with no fuzzy matching needed.

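A minimal sketch of what GUID-keyed peer mapping makes possible, using only the standard library. The page texts and GUIDs below are invented sample data, and the function names are hypothetical; they are not the server's actual `list_cluster` or `diff_versions` implementations.

```python
import difflib
from collections import defaultdict

# Made-up pages keyed by (bundle, GUID); real pages live under
# corpus/<bundle>/<GUID>.md.
pages = {
    ("hvm_user_manual_8_1_1", "GUID-1234"): "Step 1: Stop the manager.\nStep 2: Upgrade.\n",
    ("hvm_user_manual_8_1_2", "GUID-1234"): "Step 1: Back up the manager.\nStep 2: Upgrade.\n",
    ("hvm_user_manual_8_1_2", "GUID-9999"): "A page that exists only in 8.1.2.\n",
}


def cluster_by_guid(pages: dict) -> dict[str, list[str]]:
    """Group bundles by GUID; a GUID seen in 2+ bundles has cross-version peers."""
    clusters = defaultdict(list)
    for bundle, guid in pages:
        clusters[guid].append(bundle)
    return {guid: sorted(bundles) for guid, bundles in clusters.items()}


def diff_page(guid: str, old_bundle: str, new_bundle: str) -> str:
    """Unified diff of one GUID's text between two bundles."""
    old = pages[(old_bundle, guid)].splitlines(keepends=True)
    new = pages[(new_bundle, guid)].splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old, new, fromfile=old_bundle, tofile=new_bundle)
    )


print(cluster_by_guid(pages)["GUID-1234"])
print(diff_page("GUID-1234", "hvm_user_manual_8_1_1", "hvm_user_manual_8_1_2"))
```

Because the key is an exact identifier, finding a page's peers is a dictionary lookup rather than a similarity search.
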
## Retrieval

Evaluated against 22 hand-curated golden queries; see
[`eval/results/baseline.md`](eval/results/baseline.md):

| Retriever | MRR | Recall@5 | nDCG@5 | Latency |
|---|---:|---:|---:|---:|
| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms |
| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms |
| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms |
| **bm25 + jina-rerank** | **0.920** | **0.939** | **0.927** | 490 ms (CPU) / ~50 ms (GPU) |

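For readers unfamiliar with the column headers, here is a textbook sketch of the three metrics for a single query with binary relevance. This is an assumption about the general definitions, not a copy of `eval/run_eval.py`, which may differ in details such as graded relevance.

```python
import math


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant hit, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int = 5) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(
        1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0


# One query whose only relevant page was returned in second place.
ranked, relevant = ["a", "b", "c"], {"b"}
print(reciprocal_rank(ranked, relevant), recall_at_k(ranked, relevant), ndcg_at_k(ranked, relevant))
```

MRR is the mean of `reciprocal_rank` over all queries; the other two columns are averaged the same way.
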
HPE docs use a controlled vocabulary, so lexical matching dominates:
BM25 alone beats both dense and hybrid retrieval, and the
cross-encoder reranker cleans up the long tail. See `PLAN.md`
Phase 7/8 for the reasoning.

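The hybrid row in the table above combines the dense and BM25 rankings with Reciprocal Rank Fusion (RRF). A minimal sketch of RRF follows; the constant `k = 60` is the conventional default and an assumption here, since the value the project uses is not stated.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["p3", "p1", "p2"]  # made-up page ids
bm25 = ["p1", "p2", "p4"]
print(rrf_fuse([dense, bm25]))  # → ['p1', 'p2', 'p3', 'p4']
```

RRF weights every input list equally, so a weaker dense ranking can pull a strong BM25 ranking down. That would be consistent with the hybrid row scoring below BM25 alone, though the table by itself does not prove the cause.
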
## Architecture

```
HPE Support DocPortal (sniff-the-API, no auth)
        │
        ▼
scrape/ ──► corpus/<bundle>/<GUID>.{md,json} (committed)
        │
        ▼
rag/index ──► chroma/ (dense, 768-dim nomic-embed-text)
          ──► bm25/   (SQLite FTS5)
        │
        ▼
docs_mcp.server (FastMCP, streamable-HTTP)
        │
        ├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar)
        │
        ▼
deploy/docker-compose.yml
        │
        ├── MetaMCP gateway ── public at mcp.jpaul.io behind Cloudflare Tunnel
        ├── jina-rerank     ── shared GPU sidecar (1080 Ti)
        └── Watchtower      ── auto-pulls :latest on weekly refresh
```

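The `BM25 → reranker` step in the diagram is optional: reranking only happens when `RERANK_URL` is set. The sketch below shows one way such gating could look. The request and response shapes assume a Jina-style `/v1/rerank` endpoint, and `post` is a hypothetical injected HTTP helper, so none of this should be read as the server's actual code.

```python
import os
from typing import Callable


def maybe_rerank(
    query: str,
    hits: list[str],
    post: Callable[[str, dict], dict],
    top_n: int = 5,
) -> list[str]:
    """Rerank BM25 hits via a cross-encoder service when RERANK_URL is set.

    `post(url, payload)` is a hypothetical helper that sends JSON and
    returns the decoded JSON response.
    """
    base = os.environ.get("RERANK_URL")
    if not base:
        return hits[:top_n]  # no reranker configured: keep BM25 order
    payload = {"query": query, "documents": hits, "top_n": top_n}
    response = post(base.rstrip("/") + "/v1/rerank", payload)
    # Assumed response shape: {"results": [{"index": int, "relevance_score": float}, ...]}
    ordered = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [hits[r["index"]] for r in ordered[:top_n]]
```

Injecting `post` keeps the function testable without a running reranker and leaves the BM25-only path free of network calls.
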
## CI (Gitea Actions on `git.jpaul.io`)

Two cadences:

- **`refresh.yml`**: weekly cron (Monday 06:00 UTC) plus manual dispatch.
  Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25,
  then builds and pushes the image. ~5–8 min on the GPU pool.
- **`image-only.yml`**: manual dispatch. Skips the scrape; rebuilds
  indexes from the committed corpus and ships a new image. ~3 min.

Image: `git.jpaul.io/justin/hvm-docs:latest` (Watchtower target),
plus rolling `:<sha7>` and `:YYYY.MM.DD` tags.

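To make the three tag forms concrete, here is a small sketch that derives them from a commit SHA and a build date. The function name is hypothetical and the SHA is a placeholder; the workflows may well compute the tags in shell instead.

```python
from datetime import date


def image_tags(repo: str, commit_sha: str, build_date: date) -> list[str]:
    """Return the three tag forms: :latest, :<sha7>, and :YYYY.MM.DD."""
    return [
        f"{repo}:latest",
        f"{repo}:{commit_sha[:7]}",
        f"{repo}:{build_date:%Y.%m.%d}",
    ]


# Placeholder commit SHA, used only to show the output shape.
for tag in image_tags(
    "git.jpaul.io/justin/hvm-docs",
    "0123456789abcdef0123456789abcdef01234567",
    date(2026, 5, 25),
):
    print(tag)
```
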
Embeddings fan out across the two GPU-pinned Ollama containers on
the Gitea host (`192.168.0.2:11435` Titan X, `:11436` 1080 Ti), the
same infrastructure zerto-docs uses; see `OLLAMA_URLS` in both workflows.

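A fan-out of this kind can be sketched as round-robin assignment over a thread pool. `embed_one` is a hypothetical helper standing in for a call to one Ollama backend, and the round-robin policy is an assumption; `rag/embeddings.py` may distribute work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable


def fan_out_embed(
    texts: list[str],
    urls: list[str],
    embed_one: Callable[[str, str], list[float]],
) -> list[list[float]]:
    """Embed texts in parallel, assigning backends round-robin.

    `embed_one(url, text)` is a hypothetical helper that calls one
    backend and returns the embedding vector. `urls` must be non-empty.
    """
    assignments = list(zip(cycle(urls), texts))
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        # map() preserves input order, so vectors line up with `texts`.
        return list(pool.map(lambda pair: embed_one(*pair), assignments))


# Stand-in backend that returns a one-dimensional "embedding".
vectors = fan_out_embed(
    ["alpha", "be", "gam"],
    ["http://192.168.0.2:11435", "http://192.168.0.2:11436"],
    lambda url, text: [float(len(text))],
)
print(vectors)
```

One worker per backend keeps each GPU busy with a single request at a time, which is a simple way to avoid oversubscribing either card.
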
## Local dev

```bash
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# (Optional) the CPU dev reranker; pulls PyTorch (~2 GB). Skip it if
# you'll just be running stdio queries.
pip install -r requirements-rerank.txt

# Build / refresh the corpus + indexes
python -m scrape.bundles
python -m scrape.runner --all --force --concurrency 6
python -m rag.index --rebuild

# Local stdio server (Claude Desktop dev)
python -m docs_mcp.server --transport stdio

# Local streamable-HTTP for integration testing
python -m docs_mcp.server --transport streamable-http --port 8000

# Run the eval harness (without reranker)
python -m eval.run_eval --k 5

# With the dev reranker
python -m scripts.rerank_server &
RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5
```

## Repo layout

```
.
├── PLAN.md                    # 13-phase build guide (template-shared)
├── CLAUDE.md                  # Claude Code guidance
├── README.md                  # this file
├── Dockerfile
├── requirements.txt           # production deps
├── requirements-rerank.txt    # dev CPU reranker only
├── bundles.json               # bundle catalog (committed)
├── corpus/                    # 1,161 scraped pages (committed)
├── .gitea/workflows/          # refresh.yml + image-only.yml
├── scrape/
│   ├── bundles.py             # HVM bundle catalog + discovery
│   ├── runner.py              # TOC + single-doc page scraper
│   └── changelog.py           # git-history → digest JSONL
├── rag/
│   ├── chunk.py               # paragraph-aware splitter w/ 6 KB hard cap
│   ├── embeddings.py          # OLLAMA_URLS (zerto-style fan-out)
│   ├── index.py               # builds Chroma + BM25
│   └── bm25.py                # FTS5 lexical index
├── docs_mcp/
│   ├── server.py              # FastMCP + 11 tools
│   ├── usage.py               # TimedCall JSONL telemetry
│   └── api_lessons.md         # curated HVM operator gotchas
├── eval/
│   ├── queries.jsonl          # 22 hand-curated golden queries
│   ├── retrievers.py          # Dense/BM25/Hybrid/Reranked
│   ├── run_eval.py            # MRR / Recall@K / nDCG@K
│   └── results/baseline.md    # committed eval results
├── scripts/
│   ├── rerank_server.py       # dev/CPU cross-encoder /v1/rerank
│   ├── usage_report.py        # log summarizer
│   └── registry_gc.py         # Gitea container-registry cleanup
└── deploy/
    └── docker-compose.yml     # production hosting (MCP + reranker + Watchtower)
```

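The layout describes `rag/chunk.py` as a paragraph-aware splitter with a 6 KB hard cap. One way such a splitter could work is sketched below: pack whole paragraphs greedily, and cut a paragraph only when it exceeds the cap on its own. Measuring the cap in UTF-8 bytes is an assumption, as is every name in the block; this is not the project's implementation.

```python
MAX_BYTES = 6 * 1024  # hard cap per chunk, in UTF-8 bytes (assumed unit)


def chunk_markdown(text: str, max_bytes: int = MAX_BYTES) -> list[str]:
    """Pack whole paragraphs into chunks; hard-split any oversized paragraph."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate  # still fits: keep packing
            continue
        if current:
            chunks.append(current)
        # Start a new chunk; a paragraph over the cap is cut into
        # cap-sized pieces without splitting a multi-byte character.
        data = para.encode("utf-8")
        while len(data) > max_bytes:
            piece = data[:max_bytes].decode("utf-8", errors="ignore")
            chunks.append(piece)
            data = data[len(piece.encode("utf-8")):]
        current = data.decode("utf-8")
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\n" + "c" * 25
print(chunk_markdown(sample, max_bytes=10))
```

Keeping paragraphs intact where possible means most chunks end on a natural boundary, while the hard cap bounds the input size sent to the embedding model.
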
## License

Internal. HVM is HPE's product; this docs MCP server is a side project
and is not HPE-sanctioned.