# hvm-docs
A hosted MCP server over the public documentation for **HPE Morpheus
VM Essentials Software** (HVM) — the KVM-based hypervisor platform
from HPE. Lets any MCP-aware client (Claude Desktop, Claude Code,
Cursor, Copilot, MetaMCP) answer questions against the User Manual,
Release Notes, and Deployment Guide; diff pages across 8.1.x
versions; surface what changed recently; and (when enabled) submit
documentation bugs back to HPE.
Once deployed, the server is live behind MetaMCP at
`https://mcp.jpaul.io/metamcp/hvm-docs/mcp`.
## Tools
11 tools, registered over MCP streamable-HTTP:
| Tool | Use |
|---|---|
| `search_docs` | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when `RERANK_URL` is set |
| `get_page` | Full markdown of one page with metadata header + source URL |
| `list_versions` | Discover available versions, doc types, and bundle slugs |
| `list_cluster` | Cross-version peers of a page (synthesized from same-GUID overlap) |
| `diff_versions` | Unified diff of one topic between two bundles |
| `bundle_changelog` | Added / removed / churn-ranked changed pages between two bundles |
| `weekly_digest` | "What changed in the docs in the last N days"; reads the CI-baked `history.jsonl` |
| `corpus_status` | Image build time, upstream Published date, total bundles/pages/chunks |
| `hvm_api_lessons` | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) |
| `find_doc_inconsistencies` | Scoped scan for cross-version drift + redirect-chain stub pages |
| `submit_doc_bug` | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with manual-fallback) |
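
To illustrate the kind of output `diff_versions` describes, here is a minimal sketch of a unified diff between two versions of a page using Python's standard `difflib`. The function name `unified_page_diff` and the sample page text are made up for illustration; the server's actual implementation may differ.

```python
import difflib


def unified_page_diff(old_md: str, new_md: str, old_label: str, new_label: str) -> str:
    """Return a unified diff between two markdown page bodies.

    The labels are used as the from/to file names in the diff header,
    e.g. two bundle slugs. Identical inputs yield an empty string.
    """
    old_lines = old_md.splitlines(keepends=True)
    new_lines = new_md.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_label, tofile=new_label)
    )


# Hypothetical page bodies, not taken from the real corpus.
old_page = "# Upgrade\nStop the manager.\nUpgrade the cluster.\n"
new_page = "# Upgrade\nStop the manager.\nUpgrade the cluster nodes one at a time.\n"
print(unified_page_diff(old_page, new_page, "hvm_user_manual_8_1_1", "hvm_user_manual_8_1_2"))
```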
## Corpus
Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal):
| Bundle | docId | Pages |
|---|---|---|
| `hvm_user_manual_8_1_0` | `sd00007520en_us` | 374 |
| `hvm_user_manual_8_1_1` | `sd00007620en_us` | 376 |
| `hvm_user_manual_8_1_2` | `sd00007735en_us` | 376 |
| `hvm_release_notes_8_1_0` | `sd00007497en_us` | 1 |
| `hvm_release_notes_8_1_1` | `sd00007609en_us` | 1 |
| `hvm_release_notes_8_1_2` | `sd00007734en_us` | 1 |
| `hvm_deployment_guide` | `sd00007332en_us` | 32 |
Total: 1,161 pages → 2,650 chunks in Chroma, with the same chunks also
indexed in SQLite FTS5 (BM25).
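
As a rough sketch of what a BM25 lookup over SQLite FTS5 looks like, the example below builds a tiny in-memory index and ranks matches with FTS5's built-in `bm25()` function. The table name `chunks`, its columns, and the sample rows are assumptions for illustration, not the schema used in `rag/bm25.py`. It also assumes the local Python build ships SQLite with FTS5 enabled.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (page_id, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    # page_id is stored but not tokenized; only body is searchable.
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(page_id UNINDEXED, body)")
    conn.executemany("INSERT INTO chunks (page_id, body) VALUES (?, ?)", rows)
    return conn


def search(conn, query, k=5):
    """Return up to k (page_id, score) rows, best match first.

    FTS5's bm25() returns lower (more negative) values for better matches,
    and ORDER BY rank sorts by that same score in ascending order.
    """
    return conn.execute(
        "SELECT page_id, bm25(chunks) FROM chunks "
        "WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
        (query, k),
    ).fetchall()


# Hypothetical chunk text, not taken from the real corpus.
conn = build_index([
    ("p1", "upgrade the manager before the cluster"),
    ("p2", "configure console keyboard layouts"),
    ("p3", "backup setup for virtual machines"),
])
print(search(conn, "keyboard"))
```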
GUIDs are stable across HVM versions, so `topic_cluster` cross-version
peer mapping is free (no fuzzy matching needed).
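
A minimal sketch of why stable GUIDs make peer mapping cheap: given the `corpus/<bundle>/<GUID>.md` layout, grouping by file stem is enough. The helper names `cluster_by_guid` and `peers` and the GUID values are hypothetical; the real `list_cluster` logic may work differently.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def cluster_by_guid(page_paths):
    """Map each GUID (file stem) to {bundle: path} across all bundles."""
    clusters = defaultdict(dict)
    for raw in page_paths:
        path = PurePosixPath(raw)
        if path.suffix != ".md":
            continue  # skip the .json sidecar files
        clusters[path.stem][path.parent.name] = raw
    return dict(clusters)


def peers(clusters, guid, bundle):
    """List the other bundles that carry a page with the same GUID."""
    return sorted(b for b in clusters.get(guid, {}) if b != bundle)


# Hypothetical GUIDs for illustration only.
paths = [
    "corpus/hvm_user_manual_8_1_0/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_1/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_2/GUID-AAAA.md",
    "corpus/hvm_user_manual_8_1_2/GUID-BBBB.md",
    "corpus/hvm_user_manual_8_1_2/GUID-BBBB.json",
]
clusters = cluster_by_guid(paths)
print(peers(clusters, "GUID-AAAA", "hvm_user_manual_8_1_1"))
```

Because the join key is exact, a page with no peers (here `GUID-BBBB`) simply yields an empty list, which is also a cheap signal that a topic was added or removed between versions.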
## Retrieval
Evaluated against 22 hand-curated golden queries (see
[`eval/results/baseline.md`](eval/results/baseline.md)):
| Retriever | MRR | Recall@5 | nDCG@5 | Latency |
|---|---:|---:|---:|---:|
| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms |
| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms |
| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms |
| **BM25 + jina-rerank** | **0.920** | **0.939** | **0.927** | 490 ms (CPU) / ~50 ms (GPU) |
HPE docs use controlled vocabulary, so lexical match dominates; the
cross-encoder cleans up the long tail. See PLAN.md Phase 7/8 for the
reasoning.
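
For readers unfamiliar with the RRF step in the hybrid row, here is a small sketch of reciprocal rank fusion over two ranked lists. The constant `k=60` is the commonly used default and is an assumption here, as are the function name `rrf_fuse` and the sample IDs; the repo's `eval/retrievers.py` may use different settings.

```python
def rrf_fuse(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) for every document it contains,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_n]]


# Hypothetical rankings from a dense retriever and a BM25 retriever.
dense_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
print(rrf_fuse([dense_hits, bm25_hits]))
```

Since RRF weights both lists equally, a weak dense ranking can push correct BM25 hits down, which is one plausible reading of why the hybrid row trails BM25 alone in the table above.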
## Architecture
```
HPE Support DocPortal (sniff-the-API, no auth)
scrape/ ──► corpus/<bundle>/<GUID>.{md,json} (committed)
rag/index ──► chroma/ (dense, 768-dim nomic-embed-text)
──► bm25/ (SQLite FTS5)
docs_mcp.server (FastMCP, streamable-HTTP)
├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar)
deploy/docker-compose.yml
├── MetaMCP gateway ── public at mcp.jpaul.io behind Cloudflare Tunnel
├── jina-rerank ── shared GPU sidecar (1080 Ti)
└── Watchtower ── auto-pulls :latest on weekly refresh
```
## CI (Gitea Actions on `git.jpaul.io`)
Two cadences:
- **`refresh.yml`** — weekly Monday 06:00 UTC cron + manual dispatch.
Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25,
builds & pushes image. ~58 min on the GPU pool.
- **`image-only.yml`** — manual dispatch. Skips scrape; rebuilds
indexes from committed corpus and ships a new image. ~3 min.
Image: `git.jpaul.io/justin/hvm-docs:latest` (Watchtower target),
plus rolling `:<sha7>` and `:YYYY.MM.DD` tags.
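
The tag scheme above can be sketched as a small helper. The function name `image_tags` and the sample commit hash are hypothetical; the actual workflows may derive the tags in shell instead.

```python
from datetime import date


def image_tags(repo, commit_sha, build_date):
    """Return the three tags described above: latest, 7-char SHA, and date."""
    return [
        f"{repo}:latest",
        f"{repo}:{commit_sha[:7]}",
        f"{repo}:{build_date.strftime('%Y.%m.%d')}",
    ]


# Placeholder commit hash for illustration only.
for tag in image_tags("git.jpaul.io/justin/hvm-docs", "0123456789abcdef", date(2026, 5, 22)):
    print(tag)
```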
Embeddings fan out across the two GPU-pinned Ollama containers on
the Gitea host (`192.168.0.2:11435` Titan X, `:11436` 1080 Ti) — same
infra zerto-docs uses; see `OLLAMA_URLS` in both workflows.
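
A rough sketch of the fan-out idea: split the texts into batches and hand them to the Ollama endpoints round-robin. This assumes `OLLAMA_URLS` is a comma-separated list of base URLs, which is a guess about the format; the helper names and batch size are also hypothetical, and no network calls are made here.

```python
from itertools import cycle


def parse_urls(raw):
    """Split a comma-separated URL list, dropping blanks and whitespace."""
    return [url.strip() for url in raw.split(",") if url.strip()]


def assign_batches(texts, urls, batch_size=32):
    """Pair each batch of texts with an endpoint, cycling through the URLs."""
    if not urls:
        raise ValueError("at least one Ollama URL is required")
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    return list(zip(cycle(urls), batches))


# Assumed value; the http:// scheme is not stated in the workflows above.
urls = parse_urls("http://192.168.0.2:11435, http://192.168.0.2:11436")
for url, batch in assign_batches(["t1", "t2", "t3", "t4", "t5"], urls, batch_size=2):
    print(url, batch)
```

Each `(url, batch)` pair could then be submitted to a thread pool so both GPUs stay busy during an index rebuild.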
## Local dev
```bash
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# (Optional) the CPU dev reranker — pulls PyTorch (~2 GB); skip if
# you'll just be running stdio queries.
pip install -r requirements-rerank.txt
# Build / refresh the corpus + indexes
python -m scrape.bundles
python -m scrape.runner --all --force --concurrency 6
python -m rag.index --rebuild
# Local stdio server (Claude Desktop dev)
python -m docs_mcp.server --transport stdio
# Local streamable-HTTP for integration testing
python -m docs_mcp.server --transport streamable-http --port 8000
# Run the eval harness (without reranker)
python -m eval.run_eval --k 5
# With the dev reranker
python -m scripts.rerank_server &
RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5
```
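
The eval commands above report MRR, Recall@K, and nDCG@K. As a minimal sketch of how those metrics are commonly computed for a single query with binary relevance (an assumption; `eval/run_eval.py` may use graded relevance or different conventions), with hypothetical function names and page IDs:

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant hit, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant set that appears in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: DCG of the ranking over DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical result list and golden set for one query.
ranked = ["p3", "p1", "p9"]
relevant = {"p1", "p2"}
print(reciprocal_rank(ranked, relevant), recall_at_k(ranked, relevant, 5), ndcg_at_k(ranked, relevant, 5))
```

MRR is then the mean of `reciprocal_rank` over all golden queries, and the other two columns are averaged the same way.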
## Repo layout
```
.
├── PLAN.md # 13-phase build guide (template-shared)
├── CLAUDE.md # Claude Code guidance
├── README.md # this file
├── Dockerfile
├── requirements.txt # production deps
├── requirements-rerank.txt # dev CPU reranker only
├── bundles.json # bundle catalog (committed)
├── corpus/ # 1,161 scraped pages (committed)
├── .gitea/workflows/ # refresh.yml + image-only.yml
├── scrape/
│ ├── bundles.py # HVM bundle catalog + discovery
│ ├── runner.py # TOC + single-doc page scraper
│ └── changelog.py # git-history → digest JSONL
├── rag/
│ ├── chunk.py # paragraph-aware splitter w/ 6 KB hard cap
│ ├── embeddings.py # OLLAMA_URLS (zerto-style fan-out)
│ ├── index.py # builds Chroma + BM25
│ └── bm25.py # FTS5 lexical index
├── docs_mcp/
│ ├── server.py # FastMCP + 11 tools
│ ├── usage.py # TimedCall JSONL telemetry
│ └── api_lessons.md # curated HVM operator gotchas
├── eval/
│ ├── queries.jsonl # 22 hand-curated golden queries
│ ├── retrievers.py # Dense/BM25/Hybrid/Reranked
│ ├── run_eval.py # MRR / Recall@K / nDCG@K
│ └── results/baseline.md # committed eval results
├── scripts/
│ ├── rerank_server.py # dev/CPU cross-encoder /v1/rerank
│ ├── usage_report.py # log summarizer
│ └── registry_gc.py # Gitea container-registry cleanup
└── deploy/
└── docker-compose.yml # production hosting (MCP + reranker + Watchtower)
```
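
The `rag/chunk.py` entry above mentions a paragraph-aware splitter with a 6 KB hard cap. One way such a splitter could work is sketched below: pack whole paragraphs into a chunk until the cap would be exceeded, and cut a single oversized paragraph only as a last resort. Treating "6 KB" as 6 × 1024 UTF-8 bytes is an assumption, as are the function names; the real splitter may differ.

```python
MAX_BYTES = 6 * 1024  # assumed meaning of the "6 KB hard cap"


def _hard_split(text, max_bytes):
    """Cut one oversized paragraph into pieces of at most max_bytes bytes."""
    pieces, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if size + n > max_bytes and current:
            pieces.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        pieces.append("".join(current))
    return pieces


def chunk_markdown(text, max_bytes=MAX_BYTES):
    """Pack blank-line-separated paragraphs into chunks under the byte cap."""
    chunks, current, size = [], [], 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        para_size = len(para.encode("utf-8"))
        if para_size > max_bytes:
            # Flush what we have, then fall back to a hard cut.
            if current:
                chunks.append("\n\n".join(current))
                current, size = [], 0
            chunks.extend(_hard_split(para, max_bytes))
            continue
        added = para_size + (2 if current else 0)  # 2 bytes for the "\n\n" joiner
        if size + added > max_bytes:
            chunks.append("\n\n".join(current))
            current, size = [para], para_size
        else:
            current.append(para)
            size += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Counting bytes rather than characters keeps the cap meaningful for non-ASCII text, at the cost of a slower character-by-character fallback for oversized paragraphs.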
## License
Internal — HVM is HPE's product; the docs MCP is a side project, not
HPE-sanctioned.