# scrape/quickspecs/
Static HTML fixtures for HPE QuickSpecs documents that the runner cannot
fetch directly: the www.hpe.com edge drops connections from datacenter IPs
with non-browser User-Agents (verified 2026-05-22 with curl, wget, and
Anthropic's WebFetch).
## Workflow
1. The operator opens `https://www.hpe.com/psnow/doc/<doc_id>` in a real
browser, then uses DevTools → Elements to copy the `<body>` HTML.
2. Save it at `scrape/quickspecs/<doc_id>.html`.
3. Add a bundle entry in `scrape/bundles.py` with `mode="html-file"`.
4. `python -m scrape.runner --bundle hvm_quickspecs --force` reads the
committed HTML and writes `corpus/hvm_quickspecs/<doc_id>.{md,json}`.
5. Re-index and ship.

QuickSpecs are updated only every few months (for example after an HPE
rebrand, a new SKU, or a feature change). When a new version is published,
refresh the local HTML file and re-run the scrape.
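
The path conventions in the workflow above can be sketched as follows. This is a minimal illustration, not the actual contents of `scrape/bundles.py`: the helper names (`fixture_path`, `output_paths`, `bundle_entry`) and the `doc_id` and `path` keys are hypothetical, and only the directory names and `mode="html-file"` come from this README.

```python
from __future__ import annotations

from pathlib import Path

# Directory names are taken from this README; everything else is illustrative.
FIXTURE_DIR = Path("scrape/quickspecs")
CORPUS_DIR = Path("corpus/hvm_quickspecs")


def fixture_path(doc_id: str) -> Path:
    """Where the operator saves the copied <body> HTML (step 2)."""
    return FIXTURE_DIR / f"{doc_id}.html"


def output_paths(doc_id: str) -> tuple[Path, Path]:
    """The .md and .json files the runner is expected to write (step 4)."""
    return CORPUS_DIR / f"{doc_id}.md", CORPUS_DIR / f"{doc_id}.json"


def bundle_entry(doc_id: str) -> dict:
    """Hypothetical shape of a bundle entry; the real schema may differ."""
    return {
        "doc_id": doc_id,                         # assumed key name
        "mode": "html-file",                      # value named in step 3
        "path": fixture_path(doc_id).as_posix(),  # assumed key name
    }
```

For example, `fixture_path("a50004260enw")` resolves to `scrape/quickspecs/a50004260enw.html`, matching the fixture listed under "Current fixtures".
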
## Current fixtures
- `a50004260enw.html` — HPE Morpheus VM Essentials Software QuickSpecs
(Version 4, 02-February-2026). SKUs: S5Q81AAE (1-yr), S5Q82AAE
(3-yr), S5Q83AAE (5-yr) — all "per Socket E-LTU" with Tech Care
Essentials included.
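
Before committing a new or refreshed fixture, a quick sanity check along these lines may help catch a truncated or mis-copied file. It is only a sketch: `check_fixtures` is a hypothetical helper, not part of the runner, and it assumes the operator copied the element's outer HTML so that the saved file contains a `<body` tag.

```python
from __future__ import annotations

from pathlib import Path


def check_fixtures(fixture_dir: Path) -> dict[str, bool]:
    """Map each doc ID (file stem) to whether its HTML contains a <body tag."""
    results: dict[str, bool] = {}
    for path in sorted(fixture_dir.glob("*.html")):
        # Tolerate odd encodings; only the presence of the tag matters here.
        html = path.read_text(encoding="utf-8", errors="replace")
        results[path.stem] = "<body" in html.lower()
    return results
```

Calling `check_fixtures(Path("scrape/quickspecs"))` from the repository root returns one entry per committed `.html` file; any `False` value suggests the fixture should be re-copied from the browser.
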