# scrape/quickspecs/

Static HTML fixtures for HPE QuickSpecs documents that the runner cannot
reach. The www.hpe.com edge drops connections from datacenter IPs with
non-browser User-Agents (verified 2026-05-22 with curl, wget, and
Anthropic's WebFetch).

## Workflow

1. Operator visits `https://www.hpe.com/psnow/doc/<doc_id>` in a real
   browser, opens DevTools → Elements, and copies the `<body>` HTML.
2. Save it as `scrape/quickspecs/<doc_id>.html`.
3. Add a bundle entry in `scrape/bundles.py` with `mode="html-file"`.
4. Run `python -m scrape.runner --bundle hvm_quickspecs --force`. It
   reads the committed HTML and writes
   `corpus/hvm_quickspecs/<doc_id>.{md,json}`.
5. Re-index and ship.

QuickSpecs only update every few months (HPE rebrand, new SKU added,
feature change). When a new version drops, refresh the local HTML
file and re-run the scrape.
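
Before running the scrape, it can help to confirm that the saved file is where the workflow above expects it and that the right node was copied. The following is a minimal sketch, not part of `scrape/`: `fixture_path` and `check_fixture` are hypothetical helper names, and the `<body` substring test is an assumed heuristic for "the operator copied the body element".

```python
from pathlib import Path

FIXTURE_DIR = Path("scrape/quickspecs")  # directory named in this README
PSNOW_URL = "https://www.hpe.com/psnow/doc/{doc_id}"


def fixture_path(doc_id: str, root: Path = FIXTURE_DIR) -> Path:
    """Return where the copied <body> HTML for doc_id should be saved."""
    return root / f"{doc_id}.html"


def check_fixture(doc_id: str, root: Path = FIXTURE_DIR) -> list[str]:
    """Return a list of problems with a saved fixture (empty means OK).

    Hypothetical pre-flight check; the real runner may validate differently.
    """
    path = fixture_path(doc_id, root)
    if not path.is_file():
        url = PSNOW_URL.format(doc_id=doc_id)
        return [f"missing {path}; copy it from {url}"]

    html = path.read_text(encoding="utf-8", errors="replace")
    problems = []
    if not html.strip():
        problems.append(f"{path} is empty")
    elif "<body" not in html.lower():
        # Assumed heuristic: the copied node should be the <body> element.
        problems.append(f"{path} has no <body> element")
    return problems
```

`check_fixture("a50004260enw")` returns an empty list when the file exists and contains a `<body>` element; otherwise each string describes one problem to fix before re-running the scrape.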

## Current fixtures

- `a50004260enw.html`: HPE Morpheus VM Essentials Software QuickSpecs
  (Version 4, 02-February-2026). SKUs: S5Q81AAE (1-yr), S5Q82AAE
  (3-yr), S5Q83AAE (5-yr), all "per Socket E-LTU" with Tech Care
  Essentials included.
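
After refreshing a fixture, a quick check that it still mentions the SKUs recorded above shows whether this list needs updating. This is a minimal sketch, assuming the SKU strings appear verbatim in the saved HTML; `EXPECTED_SKUS` and `missing_skus` are hypothetical names, not part of `scrape/`.

```python
from pathlib import Path

# SKUs recorded in this README, keyed by doc_id (hypothetical table).
EXPECTED_SKUS = {
    "a50004260enw": ("S5Q81AAE", "S5Q82AAE", "S5Q83AAE"),
}


def missing_skus(html: str, skus: tuple[str, ...]) -> list[str]:
    """Return the SKUs that do not appear verbatim in the fixture HTML."""
    return [sku for sku in skus if sku not in html]


if __name__ == "__main__":
    for doc_id, skus in EXPECTED_SKUS.items():
        path = Path("scrape/quickspecs") / f"{doc_id}.html"
        if not path.is_file():
            print(f"{doc_id}: fixture not found at {path}")
            continue
        html = path.read_text(encoding="utf-8", errors="replace")
        gone = missing_skus(html, skus)
        status = "ok" if not gone else "missing " + ", ".join(gone)
        print(f"{doc_id}: {status}")
```

A non-empty result after a refresh suggests the SKU list changed in the new QuickSpecs version, or that the copied HTML is incomplete.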