From 582a0b9464a5ac46a97429584f271570ccbe50ac Mon Sep 17 00:00:00 2001 From: Justin Paul Date: Fri, 22 May 2026 14:04:14 -0400 Subject: [PATCH] docs: replace template README with HVM-specific content The repo still shipped the docs-mcp-template README. Replaced with HVM-specific: tool table (11 tools), bundle catalog, retrieval eval numbers, architecture diagram, CI cadences, local dev recipes, repo layout. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 227 ++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 152 insertions(+), 75 deletions(-) diff --git a/README.md b/README.md index 52cd328..27461d8 100644 --- a/README.md +++ b/README.md @@ -1,104 +1,181 @@ -# docs-mcp-template +# hvm-docs -A reusable template for building hosted MCP servers over a product's -public documentation. Distilled from one production build; everything -product-specific has been factored out. +A hosted MCP server over the public documentation for **HPE Morpheus +VM Essentials Software** (HVM) — the KVM-based hypervisor platform +from HPE. Lets any MCP-aware client (Claude Desktop, Claude Code, +Cursor, Copilot, MetaMCP) answer questions against the User Manual, +Release Notes, and Deployment Guide; diff pages across 8.1.x +versions; surface what changed recently; and (when enabled) submit +documentation bugs back to HPE. -The end product is a streamable-HTTP MCP server with ~15 tools that -any LLM client (Claude Desktop, Claude Code, Cursor, Copilot) can -call to answer questions against the docs, surface what changed -recently, find inconsistencies, and (optionally) submit doc bugs -back upstream. +Live behind MetaMCP at `https://mcp.jpaul.io/metamcp/hvm-docs/mcp` +once deployed. -## What's here +## Tools -- **[PLAN.md](PLAN.md)** — comprehensive build guide. Phased - approach (13 phases, ~2–3 weeks of focused work for the full - stack). Includes the design decisions, the gotchas, and a - per-product customization checklist. -- **Scaffolded skeleton** — working FastMCP server with stub tools, - Dockerfile, docker-compose, CI workflows, eval harness layout, - usage logging. Everything you need to `git clone` and start - filling in the product-specific bits. +11 tools, registered over MCP streamable-HTTP: -## Quick start +| Tool | Use | +|---|---| +| `search_docs` | BM25-default search with optional version / platform / bundle filters; cross-encoder reranked when `RERANK_URL` is set | +| `get_page` | Full markdown of one page with metadata header + source URL | +| `list_versions` | Discover available versions, doc types, and bundle slugs | +| `list_cluster` | Cross-version peers of a page (synthesized from same-GUID overlap) | +| `diff_versions` | Unified diff of one topic between two bundles | +| `bundle_changelog` | Added / removed / churn-ranked changed pages between two bundles | +| `weekly_digest` | "What changed in the docs in the last N days" — reads CI-baked history.jsonl | +| `corpus_status` | Image build time, upstream Published date, total bundles/pages/chunks | +| `hvm_api_lessons` | Curated operator gotchas (manager sizing, upgrade ordering, plugin/worker compat, console keyboards, backups setup) | +| `find_doc_inconsistencies` | Scoped scan for cross-version drift + redirect-chain stub pages | +| `submit_doc_bug` | Env-gated draft → confirm → submit workflow to HPE's docs feedback (endpoint TBD; currently refuses with manual-fallback) | + +## Corpus + +Confirmed bundles (scraped 2026-05-22 from HPE Support DocPortal): + +| Bundle | docId | Pages | +|---|---|---| +| `hvm_user_manual_8_1_0` | `sd00007520en_us` | 374 | +| `hvm_user_manual_8_1_1` | `sd00007620en_us` | 376 | +| `hvm_user_manual_8_1_2` | `sd00007735en_us` | 376 | +| `hvm_release_notes_8_1_0` | `sd00007497en_us` | 1 | +| `hvm_release_notes_8_1_1` | `sd00007609en_us` | 1 | +| `hvm_release_notes_8_1_2` | `sd00007734en_us` | 1 | +| `hvm_deployment_guide` | `sd00007332en_us` | 32 | + +Total: ~1,161 pages → 2,650 chunks in Chroma + same chunks indexed in +SQLite FTS5 (BM25). + +GUIDs are stable across HVM versions, so `topic_cluster` cross-version +peer mapping is free (no fuzzy matching needed). + +## Retrieval + +Eval against 22 hand-curated golden queries — see +[`eval/results/baseline.md`](eval/results/baseline.md): + +| Retriever | MRR | Recall@5 | nDCG@5 | latency | +|---|---:|---:|---:|---:| +| dense (Ollama nomic-embed-text) | 0.539 | 0.621 | 0.558 | 88 ms | +| BM25 (SQLite FTS5) | 0.880 | 0.909 | 0.883 | 3 ms | +| hybrid (dense + BM25 + RRF) | 0.692 | 0.818 | 0.713 | 69 ms | +| **bm25 + jina-rerank** | **0.920** | **0.939** | **0.927** | 490 ms (CPU) / ~50 ms (GPU) | + +HPE docs use controlled vocabulary, so lexical match dominates; the +cross-encoder cleans up the long tail. See PLAN.md Phase 7/8 for the +reasoning. + +## Architecture + +``` +HPE Support DocPortal (sniff-the-API, no auth) + │ + ▼ + scrape/ ──► corpus//.{md,json} (committed) + │ + ▼ + rag/index ──► chroma/ (dense, 768-dim nomic-embed-text) + ──► bm25/ (SQLite FTS5) + │ + ▼ + docs_mcp.server (FastMCP, streamable-HTTP) + │ + ├── BM25 → reranker (jina-reranker-v2-base GGUF, GPU sidecar) + │ + ▼ + deploy/docker-compose.yml + │ + ├── MetaMCP gateway ── public at mcp.jpaul.io behind Cloudflare Tunnel + ├── jina-rerank ── shared GPU sidecar (1080 Ti) + └── Watchtower ── auto-pulls :latest on weekly refresh +``` + +## CI (Gitea Actions on `git.jpaul.io`) + +Two cadences: + +- **`refresh.yml`** — weekly Monday 06:00 UTC cron + manual dispatch. + Re-scrapes upstream, commits corpus diffs, rebuilds Chroma + BM25, + builds & pushes image. ~5–8 min on the GPU pool. +- **`image-only.yml`** — manual dispatch. Skips scrape; rebuilds + indexes from committed corpus and ships a new image. ~3 min. + +Image: `git.jpaul.io/justin/hvm-docs:latest` (Watchtower target), +plus rolling `:` and `:YYYY.MM.DD` tags. + +Embeddings fan out across the two GPU-pinned Ollama containers on +the Gitea host (`192.168.0.2:11435` Titan X, `:11436` 1080 Ti) — same +infra zerto-docs uses; see `OLLAMA_URLS` in both workflows. + +## Local dev ```bash -git clone https://git.jpaul.io/justin/docs-mcp-template.git my-product-docs -cd my-product-docs -git remote remove origin # detach from template python -m venv venv && source venv/bin/activate pip install -r requirements.txt -# Read PLAN.md before doing anything else. Pay particular attention to -# Phase 1 (scraper) — that's the most product-specific phase. +# (Optional) the CPU dev reranker — pulls PyTorch (~2 GB); skip if +# you'll just be running stdio queries. +pip install -r requirements-rerank.txt -# Run the stub server (no corpus yet — just verifies the wiring): +# Build / refresh the corpus + indexes +python -m scrape.bundles +python -m scrape.runner --all --force --concurrency 6 +python -m rag.index --rebuild + +# Local stdio server (Claude Desktop dev) python -m docs_mcp.server --transport stdio + +# Local streamable-HTTP for integration testing +python -m docs_mcp.server --transport streamable-http --port 8000 + +# Run the eval harness (without reranker) +python -m eval.run_eval --k 5 + +# With the dev reranker +python -m scripts.rerank_server & +RERANK_URL=http://127.0.0.1:8001 python -m eval.run_eval --k 5 ``` ## Repo layout ``` . -├── PLAN.md # The build guide. Read first. -├── README.md -├── requirements.txt +├── PLAN.md # 13-phase build guide (template-shared) +├── CLAUDE.md # Claude Code guidance +├── README.md # this file ├── Dockerfile -├── .gitignore -├── .gitea/workflows/ -│ ├── refresh.yml # Weekly scrape + index + image push -│ └── image-only.yml # On-demand code-only ship +├── requirements.txt # production deps +├── requirements-rerank.txt # dev CPU reranker only +├── bundles.json # bundle catalog (committed) +├── corpus/ # 1,161 scraped pages (committed) +├── .gitea/workflows/ # refresh.yml + image-only.yml ├── scrape/ -│ ├── README.md # Product-specific scraper goes here -│ └── changelog.py # Reusable: --json, --history-out +│ ├── bundles.py # HVM bundle catalog + discovery +│ ├── runner.py # TOC + single-doc page scraper +│ └── changelog.py # git-history → digest JSONL ├── rag/ -│ ├── embeddings.py # Ollama embedder, swappable -│ ├── chunk.py # Chunker — adjust per page format -│ ├── index.py # Builds Chroma + (optionally) BM25 -│ └── bm25.py # SQLite FTS5 lexical index +│ ├── chunk.py # paragraph-aware splitter w/ 6 KB hard cap +│ ├── embeddings.py # OLLAMA_URLS (zerto-style fan-out) +│ ├── index.py # builds Chroma + BM25 +│ └── bm25.py # FTS5 lexical index ├── docs_mcp/ -│ ├── server.py # FastMCP server with stub tools -│ └── usage.py # TimedCall + JSONL telemetry +│ ├── server.py # FastMCP + 11 tools +│ ├── usage.py # TimedCall JSONL telemetry +│ └── api_lessons.md # curated HVM operator gotchas ├── eval/ -│ ├── queries.jsonl.example # Curate ~25 hand-labeled queries -│ ├── retrievers.py # Retriever protocol + implementations -│ └── run_eval.py # MRR / Recall@k / nDCG@k harness +│ ├── queries.jsonl # 22 hand-curated golden queries +│ ├── retrievers.py # Dense/BM25/Hybrid/Reranked +│ ├── run_eval.py # MRR / Recall@K / nDCG@K +│ └── results/baseline.md # committed eval results ├── scripts/ -│ ├── usage_report.py # Standalone log analyzer -│ └── registry_gc.py # Container registry cleanup +│ ├── rerank_server.py # dev/CPU cross-encoder /v1/rerank +│ ├── usage_report.py # log summarizer +│ └── registry_gc.py # Gitea container-registry cleanup └── deploy/ - └── docker-compose.yml # Hosting stack: MCP + reranker + Watchtower + └── docker-compose.yml # production hosting (MCP + reranker + Watchtower) ``` -## What's product-specific (must implement) - -- `scrape/` — the scraper itself. The template gives you the corpus - layout contract and a working `changelog.py`; the actual extraction - logic is yours. -- The corpus on disk (gitignored; rebuilt by CI). -- The reranker GGUF model and llama.cpp container (commented in - `deploy/docker-compose.yml`). -- The reverse proxy / TLS layer in front of the public endpoint. -- The hand-curated knowledge surface (your product's API gotchas, - example scripts, anything the LLM should know that the docs - don't say). - -## What's NOT product-specific (works as-is) - -- FastMCP server skeleton + tool decoration pattern -- Chroma + Ollama embedding pipeline -- BM25 / SQLite FTS5 lexical index -- Hybrid retrieval (RRF) + reranker integration -- Eval harness (Retriever protocol, MRR/Recall/nDCG) -- Usage logging (TimedCall, JSONL, daily rotation) -- CI workflow shape (weekly + on-demand, retry-on-race, three-tag - image scheme) -- Registry GC script -- Standard tools: `search_docs`, `get_page`, `list_versions`, - `diff_versions`, `bundle_changelog`, `weekly_digest`, - `find_doc_inconsistencies`, `submit_doc_bug`, etc. - ## License -Internal template. Adjust before publishing. +Internal — HVM is HPE's product; the docs MCP is a side project, not +HPE-sanctioned. -- 2.52.0