hvm-docs

A hosted MCP server over the public documentation for HPE Morpheus VM Essentials Software (HVM), the KVM-based hypervisor platform from HPE. It lets any MCP-aware client (Claude Desktop, Claude Code, Cursor, Copilot, MetaMCP) answer questions against the User Manual, Release Notes, and Deployment Guide; diff pages across 8.1.x versions; surface what changed recently; and (when enabled) submit documentation bugs back to HPE.

Once deployed, the server is reachable behind MetaMCP at https://mcp.jpaul.io/metamcp/hvm-docs/mcp.

Tools

11 tools, registered over MCP streamable-HTTP:

| Tool | Use |
| --- | --- |
| search_docs | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when RERANK_URL is set |
| get_page | Full markdown of one page with metadata header + source URL |
| list_versions | Discover available versions, doc types, and bundle slugs |
| list_cluster | Cross-version peers of a page (synthesized from same-GUID overlap) |
| diff_versions | Unified diff of one topic between two bundles |
| bundle_changelog | Added / removed / churn-ranked changed pages between two bundles |
| weekly_digest | "What changed in the docs in the last N days"; reads CI-baked history.jsonl |
| corpus_status | Image build time, upstream Published date, total bundles/pages/chunks |
| hvm_api_lessons | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) |
| find_doc_inconsistencies | Scoped scan for cross-version drift + redirect-chain stub pages |
| submit_doc_bug | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with manual-fallback) |
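
To illustrate the kind of output diff_versions describes, here is a minimal sketch of a unified diff between two versions of one page, built on Python's standard difflib. The function name diff_pages and the sample page text are hypothetical, and the server's actual implementation may differ.

```python
import difflib


def diff_pages(old_md: str, new_md: str, old_label: str, new_label: str) -> str:
    """Return a unified diff between two markdown versions of one page."""
    diff = difflib.unified_diff(
        old_md.splitlines(),
        new_md.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical page text from two bundles (not real HVM documentation).
old = "Create a cluster\nChoose at least 2 hosts."
new = "Create a cluster\nChoose at least 3 hosts."
print(diff_pages(old, new, "hvm_user_manual_8_1_1", "hvm_user_manual_8_1_2"))
```

Identical inputs produce an empty string, which a caller could treat as "no change between these bundles".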

Corpus

Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal):

| Bundle | docId | Pages |
| --- | --- | --- |
| hvm_user_manual_8_1_0 | sd00007520en_us | 374 |
| hvm_user_manual_8_1_1 | sd00007620en_us | 376 |
| hvm_user_manual_8_1_2 | sd00007735en_us | 376 |
| hvm_release_notes_8_1_0 | sd00007497en_us | 1 |
| hvm_release_notes_8_1_1 | sd00007609en_us | 1 |
| hvm_release_notes_8_1_2 | sd00007734en_us | 1 |
| hvm_deployment_guide | sd00007332en_us | 32 |

Total: 1,161 pages → 2,650 chunks in Chroma, with the same chunks indexed in SQLite FTS5 (BM25).
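
As a rough sketch of how a BM25 lookup over an FTS5 table can work, the example below builds a tiny in-memory index and ranks matches with SQLite's built-in bm25() function. The table name chunks, its columns, and the sample rows are assumptions for illustration, not the schema used by rag/bm25.py, and it needs a Python build whose SQLite includes FTS5.

```python
import sqlite3

# In-memory database; requires an SQLite build with the FTS5 extension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(guid UNINDEXED, body)")
conn.executemany(
    "INSERT INTO chunks (guid, body) VALUES (?, ?)",
    [
        ("GUID-A", "Take a snapshot of a virtual machine before upgrading."),
        ("GUID-B", "Configure cluster networking and storage."),
        ("GUID-C", "Restore a virtual machine from a snapshot or a backup."),
    ],
)


def search(query: str, k: int = 5) -> list[tuple[str, float]]:
    """Return up to k (guid, score) pairs, best match first.

    FTS5's bm25() returns lower values for better matches, so ascending
    order puts the best hit first.
    """
    rows = conn.execute(
        "SELECT guid, bm25(chunks) AS score FROM chunks "
        "WHERE chunks MATCH ? ORDER BY score LIMIT ?",
        (query, k),
    )
    return rows.fetchall()


print(search("snapshot"))
```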

GUIDs are stable across HVM versions, so topic_cluster cross-version peer mapping is free (no fuzzy matching needed).
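
Because the same GUID identifies the same topic in every bundle, peer mapping reduces to a group-by. A minimal sketch, assuming pages are available as (bundle, GUID) pairs; the GUID values and the build_clusters helper are hypothetical:

```python
from collections import defaultdict

# Hypothetical (bundle, GUID) pairs; real GUIDs come from the scraped corpus.
pages = [
    ("hvm_user_manual_8_1_0", "GUID-1111"),
    ("hvm_user_manual_8_1_1", "GUID-1111"),
    ("hvm_user_manual_8_1_2", "GUID-1111"),
    ("hvm_user_manual_8_1_1", "GUID-2222"),
    ("hvm_user_manual_8_1_2", "GUID-2222"),
]


def build_clusters(pages):
    """Map each GUID to the sorted list of bundles that contain it."""
    clusters = defaultdict(list)
    for bundle, guid in pages:
        clusters[guid].append(bundle)
    return {guid: sorted(bundles) for guid, bundles in clusters.items()}


clusters = build_clusters(pages)
print(clusters["GUID-2222"])
```

In this sample, a page that first appears in 8.1.1 simply has a two-bundle cluster, with no title or content comparison involved.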

Retrieval

Evaluated against 22 hand-curated golden queries (see eval/results/baseline.md):

| Retriever | MRR | Recall@5 | nDCG@5 | Latency |
| --- | --- | --- | --- | --- |
| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms |
| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms |
| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms |
| bm25 + jina-rerank | 0.920 | 0.939 | 0.927 | 490 ms (CPU) / ~50 ms (GPU) |

HPE docs use controlled vocabulary, so lexical match dominates; the cross-encoder cleans up the long tail. See PLAN.md Phase 7/8 for the reasoning.
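
For reference, the hybrid row combines the dense and BM25 rankings with reciprocal rank fusion (RRF). The sketch below shows the standard formulation; the constant k = 60 is the commonly used default and an assumption here, as are the function name and the sample rankings.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused score ranks first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


dense = ["p3", "p1", "p2"]  # hypothetical dense ranking
bm25 = ["p1", "p2", "p4"]   # hypothetical BM25 ranking
print(rrf_fuse([dense, bm25]))
```

RRF weights both input lists equally, so a weaker dense ranking can pull a strong BM25 ranking down. That is one plausible reading of why hybrid lands between dense and BM25 in the table.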

Architecture

HPE Support DocPortal (sniff-the-API, no auth)
        │
        ▼
   scrape/        ──► corpus/<bundle>/<GUID>.{md,json}  (committed)
        │
        ▼
   rag/index      ──► chroma/  (dense, 768-dim nomic-embed-text)
                  ──► bm25/    (SQLite FTS5)
        │
        ▼
   docs_mcp.server (FastMCP, streamable-HTTP)
        │
        ├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar)
        │
        ▼
   deploy/docker-compose.yml
        │
        ├── MetaMCP gateway   ── public at mcp.jpaul.io behind Cloudflare Tunnel
        ├── jina-rerank       ── shared GPU sidecar (1080 Ti)
        └── Watchtower        ── auto-pulls :latest on weekly refresh

CI (Gitea Actions on git.jpaul.io)

Two cadences:

  • refresh.yml — weekly Monday 06:00 UTC cron + manual dispatch. Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25, builds & pushes image. ~58 min on the GPU pool.
  • image-only.yml — manual dispatch. Skips scrape; rebuilds indexes from committed corpus and ships a new image. ~3 min.

Image: git.jpaul.io/justin/hvm-docs:latest (Watchtower target), plus rolling :<sha7> and :YYYY.MM.DD tags.
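
A small sketch of how the three tags could be derived for one build, assuming :<sha7> means the first seven characters of the commit SHA and the date tag uses the build date. The image_tags helper and the sample values are hypothetical; the workflows may compute these differently.

```python
from datetime import date


def image_tags(commit_sha: str, build_date: date) -> list[str]:
    """Return the :latest, :<sha7>, and :YYYY.MM.DD tags for one build."""
    repo = "git.jpaul.io/justin/hvm-docs"
    return [
        f"{repo}:latest",
        f"{repo}:{commit_sha[:7]}",
        f"{repo}:{build_date.strftime('%Y.%m.%d')}",
    ]


# Hypothetical commit SHA and build date.
print(image_tags("0123456789abcdef0123456789abcdef01234567", date(2026, 5, 25)))
```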

Embeddings fan out across the two GPU-pinned Ollama containers on the Gitea host (192.168.0.2:11435 Titan X, :11436 1080 Ti) — same infra zerto-docs uses; see OLLAMA_URLS in both workflows.
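
The fan-out can be pictured as round-robin assignment of texts to endpoints with a small thread pool. This is a sketch under stated assumptions: fan_out_embed and the stand-in fake_embed are hypothetical names, and the real logic in rag/embeddings.py may batch or schedule work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable, Sequence


def fan_out_embed(
    texts: Sequence[str],
    urls: Sequence[str],
    embed_one: Callable[[str, str], list[float]],
) -> list[list[float]]:
    """Embed texts by assigning them round-robin to the given endpoints.

    embed_one(url, text) performs a single request. Results keep input
    order, and at most len(urls) requests are in flight at once.
    """
    if not urls:
        raise ValueError("at least one endpoint URL is required")
    assignments = list(zip(cycle(urls), texts))
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(lambda pair: embed_one(*pair), assignments))


# Stand-in embedder so the sketch runs offline; a real one would send an
# HTTP request to the Ollama endpoint at `url`.
def fake_embed(url: str, text: str) -> list[float]:
    return [float(len(url)), float(len(text))]


urls = ["http://192.168.0.2:11435", "http://192.168.0.2:11436"]
print(fan_out_embed(["alpha", "be", "gam"], urls, fake_embed))
```

Round-robin assignment does not strictly pin one worker to each GPU, but with equal-sized inputs it keeps both endpoints roughly equally loaded.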

Local dev

python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# (Optional) the CPU dev reranker — pulls PyTorch (~2 GB); skip if
# you'll just be running stdio queries.
pip install -r requirements-rerank.txt

# Build / refresh the corpus + indexes
python -m scrape.bundles
python -m scrape.runner --all --force --concurrency 6
python -m rag.index --rebuild

# Local stdio server (Claude Desktop dev)
python -m docs_mcp.server --transport stdio

# Local streamable-HTTP for integration testing
python -m docs_mcp.server --transport streamable-http --port 8000

# Run the eval harness (without reranker)
python -m eval.run_eval --k 5

# With the dev reranker
python -m scripts.rerank_server &
RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5
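
The harness reports MRR, Recall@K, and nDCG@K. As a reference for reading those numbers, here is a minimal sketch of the per-query metrics with binary relevance; MRR is the mean of reciprocal_rank over all queries. The function names and sample IDs are hypothetical, relevant is assumed to be non-empty, and eval/run_eval.py may differ in detail.

```python
import math


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the top k over the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(r + 1) for r in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal


# Hypothetical ranking and gold set for a single query.
ranked = ["p9", "p1", "p7", "p2", "p5"]
relevant = {"p1", "p2"}
print(reciprocal_rank(ranked, relevant), recall_at_k(ranked, relevant, 5))
print(round(ndcg_at_k(ranked, relevant, 5), 3))
```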

Repo layout

.
├── PLAN.md                       # 13-phase build guide (template-shared)
├── CLAUDE.md                     # Claude Code guidance
├── README.md                     # this file
├── Dockerfile
├── requirements.txt              # production deps
├── requirements-rerank.txt       # dev CPU reranker only
├── bundles.json                  # bundle catalog (committed)
├── corpus/                       # 1,161 scraped pages (committed)
├── .gitea/workflows/             # refresh.yml + image-only.yml
├── scrape/
│   ├── bundles.py                # HVM bundle catalog + discovery
│   ├── runner.py                 # TOC + single-doc page scraper
│   └── changelog.py              # git-history → digest JSONL
├── rag/
│   ├── chunk.py                  # paragraph-aware splitter w/ 6 KB hard cap
│   ├── embeddings.py             # OLLAMA_URLS (zerto-style fan-out)
│   ├── index.py                  # builds Chroma + BM25
│   └── bm25.py                   # FTS5 lexical index
├── docs_mcp/
│   ├── server.py                 # FastMCP + 11 tools
│   ├── usage.py                  # TimedCall JSONL telemetry
│   └── api_lessons.md            # curated HVM operator gotchas
├── eval/
│   ├── queries.jsonl             # 22 hand-curated golden queries
│   ├── retrievers.py             # Dense/BM25/Hybrid/Reranked
│   ├── run_eval.py               # MRR / Recall@K / nDCG@K
│   └── results/baseline.md       # committed eval results
├── scripts/
│   ├── rerank_server.py          # dev/CPU cross-encoder /v1/rerank
│   ├── usage_report.py           # log summarizer
│   └── registry_gc.py            # Gitea container-registry cleanup
└── deploy/
    └── docker-compose.yml        # production hosting (MCP + reranker + Watchtower)
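
The layout describes rag/chunk.py as a paragraph-aware splitter with a 6 KB hard cap. A minimal sketch of that idea follows, assuming paragraphs are separated by blank lines and the cap is measured in characters for simplicity; chunk_markdown is a hypothetical name and the real splitter may measure bytes or treat headings specially.

```python
def chunk_markdown(text: str, cap: int = 6 * 1024) -> list[str]:
    """Pack paragraphs into chunks of at most `cap` characters.

    Paragraphs are separated by blank lines. A single paragraph longer than
    the cap is cut into cap-sized pieces as a last resort.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split oversized paragraphs so no piece exceeds the cap.
        pieces = [para[i:i + cap] for i in range(0, len(para), cap)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= cap:
                current = candidate
            else:
                # Adding this piece would overflow, so close the chunk.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


print(chunk_markdown("aaa\n\nbbb\n\nccc", cap=8))
```

Packing whole paragraphs first keeps related sentences together, and the hard split only applies when a single paragraph cannot fit in one chunk.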

License

Internal — HVM is HPE's product; the docs MCP is a side project, not HPE-sanctioned.