v0.1.2: Config Checkpoints dialog, descriptions, daily auto-snapshot, docs (#3)

* Documentation: install/upgrade/uninstall guides + recipes incl. Zerto

Adds a docs/ folder under the repo root with full operator documentation
aimed at sysadmins (not webhook developers). The Zerto pre/post script
recipe is the canonical "why does this exist" walkthrough; the GitHub
HMAC, AD password reset, and UI-on-desktop recipes round out common
patterns.

Pages:
- README.md (index)
- concepts.md (5-minute "what is a webhook" explainer)
- installation.md (interactive + silent install)
- upgrading.md (single-click upgrade flow + edge cases)
- uninstalling.md (clean removal + wiping ProgramData)
- runas-modes.md (Service / InteractiveUser / SpecificUser decision flow)
- service-account-and-ad.md (gMSA setup, delegated rights)
- network-and-security.md (bind addresses, allowlists, HTTPS, secret storage)
- troubleshooting.md (symptom -> first check, common errors)
- recipes/zerto-pre-post-scripts.md (canonical use case)
- recipes/github-style-hmac.md (GitHub / Stripe-shaped webhooks)
- recipes/ad-password-reset.md (gMSA-backed self-service reset)
- recipes/ui-on-desktop.md (InteractiveUser pattern)

Top-level README.md restructured to point at docs/ as the source of
truth, dropping the duplicated installation snippets.

Installer ships docs/ alongside the binaries so they're available
offline at C:\Program Files\WebhookServer\docs\. GUI Help menu gains
a "Documentation" item that opens the docs site in a browser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Config Checkpoints dialog + daily auto-checkpoint; drop installer GUI launch

Three fixes:

1. Config Checkpoints submenu replaced with a proper dialog. Lists
   checkpoints with timestamp/size/filename, has a "Take Checkpoint
   Now" button, and a "Roll Back" button that becomes enabled when a
   row is selected. The previous click-a-menu-entry-immediate-restore
   flow was too easy to fire by accident.

2. New CheckpointScheduler BackgroundService creates a checkpoint at
   midnight every day. Combined with the existing auto-on-save
   snapshots, this guarantees a daily rollback point even if the
   config wasn't edited that day. A new "create-checkpoint" admin op
   plus AdminPipeServer.CreateCheckpoint helper does the actual file
   copy; both manual (via the dialog) and the scheduler use it.

3. Installer: drop the post-install "Launch Webhook Server" wizard
   step. It tried to launch the GUI un-elevated, which fails because
   the GUI's manifest is requireAdministrator. The Start Menu shortcut
   handles elevation correctly, so the user can launch from there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Docs: replace AD-reset recipe with realistic Zerto failover walkthrough

The AD password reset endpoint was a poor fit for what people actually
need this server for. Replaced with a realistic Zerto post-failover
example that's much closer to the project's purpose:

- Update DNS A records for failed-over hostnames
- Wait for the VM to come up at the DR site
- PowerShell-remote into the VM and check / start critical services
- Notify Teams with the result

The flagship pattern is now: Zerto post-script (curl, fire-and-forget)
calls an Async webhook endpoint -> 202 in milliseconds -> Zerto's
failover sequence is never blocked. The server runs the actual work in
the background, with full output captured in the daily log.

A ready-to-use Zerto-side script ships at
scripts/examples/zerto-post-failover.ps1 - pure curl.exe (no
PowerShell modules), reads the bearer token from a file the ZVM
service account can read.

The installer now bundles scripts/examples/ alongside docs/ so the
example is also available locally at
C:\Program Files\WebhookServer\scripts\examples\.

Removed: docs/recipes/ad-password-reset.md.
Updated: docs/README.md, README.md, the recipe content itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Restore installer GUI launch (via shellexec) + checkpoint descriptions

Two follow-ups to the previous Config Checkpoints commit:

1. Bring back the post-install "Launch Webhook Server" checkbox in the
   installer. The previous attempt failed because Inno Setup's
   postinstall flag launches via CreateProcess, which cannot elevate
   and so fails against the GUI's requireAdministrator manifest.
   Adding the shellexec flag switches to ShellExecute, which DOES
   honor the manifest and triggers a clean UAC prompt - so the
   post-install GUI launch works as expected.

2. Each checkpoint now carries a description, stored in a sidecar
   .meta.json file next to the snapshot. Defaults:
     - Auto-on-save: "Before save"
     - Midnight scheduler: "Nightly auto-checkpoint"
     - Manual: opens a small dialog so the user can type a meaningful
       description (defaults to "Manual checkpoint" if blank)
   The dialog and pruning both clean up sidecars alongside snapshots.
   The Config Checkpoints grid grows a Description column between
   When and Size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.1.2: bump checkpoint retention 30 -> 90

Each checkpoint is a few KB of JSON plus a tiny sidecar; even at 90
entries on a config with hundreds of endpoints the on-disk footprint
is negligible (worst case ~20 MB). With daily auto-checkpoints plus
on-save snapshots, 30 entries could fill in a couple of weeks of
moderate use; 90 covers about three months of nightly checkpoints
alone, and proportionally less when the config is saved often.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 10:49:09 -04:00
committed by GitHub
parent 7d94535d5d
commit f00ee0cf3a
30 changed files with 1871 additions and 146 deletions
@@ -0,0 +1,122 @@
# Recipe: GitHub-style HMAC-signed webhook
GitHub, Stripe, Slack, Shopify, and most SaaS providers sign their outbound webhooks with HMAC. The receiver computes the same HMAC over the request body using a shared secret and rejects the request if the signatures don't match. Webhook Server has this built in — you just point a real GitHub webhook at your endpoint.
## What we're building
A webhook URL that GitHub calls on every push to a repo. The server runs a PowerShell script that pulls the latest commit and triggers a deployment. Authentication is HMAC-SHA256 over the request body, using the secret you configured in GitHub's webhook settings.
## On the GitHub side
In your repo: **Settings → Webhooks → Add webhook**.
| Field | Value |
|---|---|
| Payload URL | `https://hooks.contoso.com/hook/gh-deploy` (use HTTPS; the payload crosses the public internet) |
| Content type | `application/json` |
| Secret | Generate a long random string. Copy it for the next step. |
| SSL verification | Enable |
| Events | Just `push` |
Save. GitHub immediately delivers a `ping` event for testing. You'll see it in **Recent Deliveries** with whatever response code your server returns.
## The PowerShell deployment script
`C:\Scripts\gh-deploy.ps1`:
```powershell
[CmdletBinding()]
param()
$ErrorActionPreference = 'Stop'
$payload = $input | ConvertFrom-Json

# Read the event type from the X-GitHub-Event header, passed as an env var.
# ($event is a PowerShell automatic variable, so use a different name.)
$eventName = $env:WEBHOOK_HEADER_X_GITHUB_EVENT
if ($eventName -eq 'ping') {
    "got ping from $($payload.repository.full_name)"
    return
}
if ($eventName -ne 'push') {
    # Not a failure: report and exit cleanly, like the other "ignoring" paths.
    # (Write-Error would abort the run because $ErrorActionPreference is 'Stop'.)
    "ignoring $eventName event"
    return
}

$repo = $payload.repository.full_name
$branch = $payload.ref -replace '^refs/heads/', ''
$sha = $payload.after
if ($branch -ne 'main') {
    "ignoring push to $branch"
    return
}

$repoDir = "C:\Deploys\$($payload.repository.name)"
if (-not (Test-Path $repoDir)) {
    git clone "https://github.com/$repo.git" $repoDir
}

Push-Location $repoDir
try {
    git fetch --all
    git reset --hard $sha
    # ...your build/deploy steps here...
    & npm ci
    & npm run build
    Restart-Service MyAppService
}
finally {
    Pop-Location
}
"deployed $repo @ $sha"
```
## Configure the endpoint
**File → New endpoint**:
| Section | Setting | Value |
|---|---|---|
| Identity | Slug | `gh-deploy` |
| Auth | Mode | **HMAC** |
| Auth | HMAC secret | paste the GitHub-side secret |
| Auth | HMAC header | `X-Hub-Signature-256` *(GitHub's default)* |
| Allowed clients | | `140.82.112.0/20`, `192.30.252.0/22` *(two of GitHub's webhook IP ranges; the `hooks` array at [api.github.com/meta](https://api.github.com/meta) has the live list)* |
| Executor | Type | **Windows PowerShell** |
| Executor | Script path | `C:\Scripts\gh-deploy.ps1` |
| Data passing | JSON body to stdin | ✓ |
| Data passing | Headers/query as env vars | ✓ *(needed so `WEBHOOK_HEADER_X_GITHUB_EVENT` is set)* |
| Run as | Identity | **Service** (default) — assumes the deployment is local |
| Response | Mode | **Async** *(GitHub gives up on a delivery after 10 seconds; don't make it wait for the build)* |
| Response | Timeout (sec) | `600` |
Save.
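
The **Allowed clients** entries are CIDR ranges. As a rough illustration of what such a check amounts to, here is a minimal sketch in Python. It is not Webhook Server's actual code, and the function name `is_allowed` is hypothetical; the ranges are simply the two from the table.

```python
import ipaddress

# Example ranges copied from the table above; refresh them from
# https://api.github.com/meta before relying on them.
ALLOWED = [ipaddress.ip_network(cidr)
           for cidr in ("140.82.112.0/20", "192.30.252.0/22")]

def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowlisted CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED)

print(is_allowed("140.82.115.10"))  # inside 140.82.112.0/20
print(is_allowed("203.0.113.7"))    # documentation range, not allowlisted
```

An allowlist complements the signature check rather than replacing it: source addresses can change, while the HMAC proves knowledge of the secret.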
## What HMAC does for you here
GitHub computes HMAC-SHA256 over the raw request body, keyed with the shared secret, and sends the result as `sha256=<hex>` in `X-Hub-Signature-256`. Webhook Server computes the same HMAC, compares the two in constant time, and rejects the request (401) on mismatch.
This means:
- A request with a tampered body fails the check
- A captured request can be **replayed verbatim** (the signature is valid for that body). If that matters, GitHub also sends a unique `X-GitHub-Delivery` GUID with every delivery that you can deduplicate against
- The secret never travels over the network — only the digest does, so HTTPS is for confidentiality of the body, not the secret
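
To make the check concrete, here is a minimal Python sketch of GitHub-style verification. It illustrates the scheme and is not Webhook Server's implementation; `verify_signature` is a hypothetical name and the secret and body are placeholders.

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a GitHub-style `sha256=<hex>` signature over the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest takes the same time wherever the two values differ,
    # so an attacker cannot recover the signature byte by byte.
    return hmac.compare_digest(expected.encode(), header_value.encode())

secret = b"example-secret"            # placeholder, not a real secret
body = b'{"zen":"Keep it simple"}'    # must be the exact bytes received
good = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, good))           # untouched body
print(verify_signature(secret, body + b" ", good))    # tampered body
```

The signature covers bytes, not parsed JSON, so the receiver has to hash the body exactly as it arrived; re-serializing it first would change whitespace and break the match.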
## Adapting for Stripe, Slack, etc.
Same pattern, different headers and signing details. The four HMAC fields in the editor cover all common variants:
| Provider | Header | Prefix | Encoding | Algorithm |
|---|---|---|---|---|
| GitHub | `X-Hub-Signature-256` | `sha256=` | hex | SHA-256 |
| Stripe | `Stripe-Signature` | (none — but Stripe's format is multipart, see below) | hex | SHA-256 |
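| Slack | `X-Slack-Signature` | `v0=` (but Slack signs a timestamped string, see below) | hex | SHA-256 |
| Generic / custom | configurable | configurable | configurable | SHA-1 / SHA-256 / SHA-512 |
**Stripe** and **Slack** are special: both sign a timestamp together with the body. Stripe's `Stripe-Signature` header has the format `t=<timestamp>,v1=<sig>,v0=<sig>`, where `v1` is HMAC-SHA256 of `<timestamp>.<body>`; Slack's `X-Slack-Signature` is HMAC-SHA256 of `v0:<timestamp>:<body>`, with the timestamp sent in `X-Slack-Request-Timestamp`. Webhook Server's straight HMAC check over the body alone doesn't match either scheme. Workarounds: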
- Use **Bearer auth** on Stripe webhooks instead, since you already control the secret
- Or do unauthenticated + IP allowlist + a script-side signature check using their official validation library
For everything that's "GitHub-shaped" (signed body, raw HMAC), the built-in HMAC mode is the right pick.
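
For the script-side check mentioned above, the provider's official library is the safer choice. Purely to show what a signed-with-timestamp scheme involves, here is a Python sketch for the `t=<timestamp>,v1=<sig>` format. The function name and the 300-second tolerance are assumptions for illustration.

```python
import hashlib
import hmac
import time

def verify_timestamped(secret, body, header, tolerance_s=300, now=None):
    """Verify `t=<unix>,v1=<hex>` where v1 = HMAC-SHA256 of b"<t>.<body>"."""
    timestamp = None
    candidates = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)   # there may be more than one v1 entry
    try:
        sent_at = int(timestamp)
    except (TypeError, ValueError):
        return False                   # missing or malformed timestamp
    now = time.time() if now is None else now
    if abs(now - sent_at) > tolerance_s:
        return False                   # outside the window: treat as a replay
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected.encode(), c.encode())
               for c in candidates)

secret, body, ts = b"example-secret", b'{"id":"evt_1"}', "1700000000"
sig = hmac.new(secret, ts.encode() + b"." + body, hashlib.sha256).hexdigest()
header = f"t={ts},v1={sig}"
print(verify_timestamped(secret, body, header, now=1700000060))  # fresh
print(verify_timestamped(secret, body, header, now=1700009999))  # stale
```

Because the timestamp is part of the signed string, a captured request stops verifying once it falls outside the tolerance window, which is the replay protection that plain body-only HMAC lacks.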
@@ -0,0 +1,68 @@
# Recipe: Pop UI on the user's desktop
The classic "fire a hook from your phone, see a calculator window appear on your PC." Useful for:
- Triggering interactive installers / wizards
- Opening browser tabs to specific dashboards on demand
- Playing a sound / showing a toast notification
- Demos and party tricks
## Why this is non-trivial on Windows
The Webhook Server service runs as `LocalSystem` in **session 0**. Anything launched normally from a Service-mode endpoint also lands in session 0, which has no visible desktop — UI runs but nobody sees it. To put a window on the desktop of whoever is logged in at the keyboard, the service has to:
1. Find the active console session ID (`WTSGetActiveConsoleSessionId`)
2. Get a primary token for the user in that session (`WTSQueryUserToken`)
3. Spawn the new process with `CreateProcessAsUser` against that token, targeting `winsta0\default`
Webhook Server does all of this for you when the endpoint's **Run as** is set to **InteractiveUser**.
## Configure the endpoint
| Section | Setting | Value |
|---|---|---|
| Identity | Slug | `calc` |
| Identity | Description | "Pop calculator on the logged-in user's desktop" |
| Auth | Mode | None / Bearer — your call |
| Allowed clients | | restrict; this is interactive UI |
| Executor | Type | **Executable** |
| Executor | Executable path | `C:\Windows\System32\calc.exe` |
| Run as | Identity | **InteractiveUser** |
| Response | Mode | **Async** *(calc never exits on its own; sync would 30-second-timeout-kill it every time)* |
| Response | Fail on non-zero exit | unticked |
Save. Request `http://localhost:8080/hook/calc` on the machine itself (or swap `localhost` for the host name when calling from another device) and calc.exe pops up on your desktop.
## Limits
- **Service must run as LocalSystem.** Only SYSTEM has the `SeTcbPrivilege` required for `WTSQueryUserToken`. If you switched the service to a gMSA (e.g. for AD-write hooks), this mode stops working. Run two instances of Webhook Server on different ports if you need both.
- **Someone must be logged in** at the console. If the machine is sitting at the sign-in screen with no user signed in, the hook fails with `No active console session - is anyone logged in at the keyboard?`.
- **RDP sessions complicate things.** `WTSGetActiveConsoleSessionId` always returns the *console* session, not RDP sessions. If only RDP users are connected and no one is at the physical keyboard, this mode fails. (Enumerating sessions with `WTSEnumerateSessions` and calling `WTSQueryUserToken` on an RDP session ID could target RDP; that'd be a v0.x feature request.)
- **Multiple users logged in via fast-user-switching** — the hook lands in whichever session is currently active (the foreground desktop), not all of them.
## Variations
### Notification toast instead of a window
Use a PowerShell script that emits a Windows 10/11 toast via `BurntToast` (third-party module) or the built-in WinRT API:
```powershell
# requires: Install-Module BurntToast
New-BurntToastNotification -Text 'Webhook fired',$($input | Out-String)
```
Configure the endpoint as InteractiveUser + WindowsPowerShell + inline command. The toast appears as the logged-in user — same as if they fired it themselves.
### Open a URL in the user's default browser
```powershell
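$url = [uri]($input | ConvertFrom-Json).url
if ($url.Scheme -in 'http', 'https') { Start-Process $url.AbsoluteUri }   # open web links only, never arbitrary programs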
```
Body: `{ "url": "https://contoso.servicenow.com/incident/123" }`
This opens the URL in whatever the user has set as default. Handy for "page on-call → they reply on their phone with a link → URL opens on their workstation when they sit down."
### Run a setup wizard / installer that needs UI
Some installers refuse to run silently or have steps that require human input. Wrap them as InteractiveUser hooks so the operator can trigger them from a help-desk console without having to RDP in.
@@ -0,0 +1,243 @@
# Recipe: Zerto failover post-script → DNS update + service checks
This is the canonical reason Webhook Server exists.
When Zerto fails a VM over from production to DR, the VM boots fine — but **the things around it** often need attention: DNS records still point at the production IP, dependent services need to be checked, on-call needs a heads-up. Zerto pre/post scripts run on the **Zerto Virtual Manager**, not on a domain controller and not necessarily with admin rights to the things that need fixing. So you want a single webhook URL that the post-script hits, and a Windows host on the DR side that does the actual work with the right identity.
## What we're building
Zerto's post-recovery script (a short PowerShell file that shells out to `curl.exe`) calls `http://webhook.dr.contoso.local:8080/hook/post-failover` with a JSON body identifying the VPG and operation. The Webhook Server, running on a DR-side Windows host as a gMSA with delegated AD/DNS rights, runs PowerShell that:
1. Updates DNS A records to point the failed-over hostnames at their DR IPs
2. Waits for the failed-over VM to come up (ping + WinRM probe)
3. Connects to the VM via PowerShell remoting and starts/checks critical services
4. Sends a Teams notification with the result
The endpoint is **Async** so the Zerto script returns in milliseconds — no risk of timing out Zerto's failover sequence even if the actions take minutes. The script's full output ends up in the webhook log and (optionally) in an outbound callback.
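
The accept-now, work-later idea behind Async mode can be sketched in a few lines of Python. This only illustrates the pattern and is not Webhook Server's code; `accept_async` and the `results` store are hypothetical, while the `runId` and `accepted` fields mirror the response shown in the testing step of this recipe.

```python
import threading
import time
import uuid

def accept_async(work, results):
    """Start work() on a background thread and return a 202-style reply at once."""
    run_id = uuid.uuid4().hex

    def runner():
        # The caller has already been answered; the output is kept for the log.
        results[run_id] = work()

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return {"status": 202, "runId": run_id, "accepted": True}, thread

def slow_handler():
    time.sleep(0.2)            # stands in for minutes of DNS and service checks
    return "dns app01 -> 10.42.10.11"

results = {}
reply, worker = accept_async(slow_handler, results)
print(reply["status"])         # available immediately, before the work finishes
worker.join()
print(results[reply["runId"]])
```

The caller only ever sees the acceptance reply, which is why the handler's outcome has to be read from the log or an outbound callback rather than from the HTTP response.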
## Why curl and not Invoke-WebRequest?
Zerto's PowerShell runner is intentionally minimal — many environments run an older Windows on the ZVM and don't have full PowerShell modules installed. `curl.exe` ships with Windows 10 1803+ and Server 2019+ and works without any modules. Plus, calling an HTTP endpoint with `curl.exe` doesn't depend on the version of `Invoke-WebRequest` shipped with the host's PowerShell.
## 1. The Zerto post-script (client side)
A ready-to-use script ships in this repo at [`scripts/examples/zerto-post-failover.ps1`](../../scripts/examples/zerto-post-failover.ps1). Copy it to the ZVM, edit `$WebhookUrl` and the bearer-token path at the top, and wire it into the VPG:
> **VPG settings → Recovery → Scripts → Post-Recovery Script**
> Path: `C:\Scripts\zerto-post-failover.ps1`
> Parameters: *(leave empty)*
The script is ~50 lines and only depends on `curl.exe` + a token file readable by the ZVM service account.
The flow:
```
Zerto VPG failover starts
  |
  +-- VM is brought up at DR site
  |
  +-- Zerto post-script fires:
  |     curl POST http://webhook.dr/hook/post-failover (async, returns 202 in ~50ms)
  |
  +-- Zerto sees success, finishes the failover and reports done
  |
  (meanwhile, on the webhook server)
  |
  running PowerShell for several minutes:
    - update DNS
    - wait for VM ready
    - check services on VM
    - notify Teams
```
## 2. The server-side script (does the actual work)
Save this on the webhook host as `C:\Scripts\post-failover-handler.ps1`:
```powershell
[CmdletBinding()]
param()
$ErrorActionPreference = 'Stop'
$body = $input | ConvertFrom-Json

# ---------- environment specifics; edit for your site ----------
$dnsServer = 'dc01.contoso.local'
$forwardZone = 'contoso.local'
$teamsWebhook = 'https://contoso.webhook.office.com/...'
$drIpMap = @{
    'app01' = '10.42.10.11'
    'app02' = '10.42.10.12'
    'db01' = '10.42.10.21'
}
$serviceMap = @{
    'app01' = @('W3SVC','MyAppSvc')
    'app02' = @('W3SVC','MyAppSvc')
    'db01' = @('MSSQLSERVER','SQLAgent')
}
# ---------------------------------------------------------------

# Default the VM list to "all VMs we know about" if the post-script didn't
# tell us, so the same handler works without having to embed the VM list in
# every Zerto post-script.
$vms = if ($body.vms) { $body.vms } else { $drIpMap.Keys }
$summary = @()

foreach ($vm in $vms) {
    if (-not $drIpMap.ContainsKey($vm)) {
        $summary += "skip $vm (no DR IP mapping in handler)"
        continue
    }
    $ip = $drIpMap[$vm]

    # 1. DNS - delete + re-add the A record
    try {
        $existing = Get-DnsServerResourceRecord -ZoneName $forwardZone -Name $vm `
            -RRType A -ComputerName $dnsServer -ErrorAction SilentlyContinue
        if ($existing) {
            Remove-DnsServerResourceRecord -ZoneName $forwardZone -Name $vm `
                -RRType A -RecordData $existing.RecordData.IPv4Address `
                -ComputerName $dnsServer -Force
        }
        Add-DnsServerResourceRecordA -ZoneName $forwardZone -Name $vm `
            -IPv4Address $ip -ComputerName $dnsServer -TimeToLive 00:05:00
        $summary += "dns $vm -> $ip"
    } catch {
        $summary += "DNS! $vm $($_.Exception.Message)"
        continue
    }

    # 2. Wait for the VM to be reachable (up to 5 minutes)
    $deadline = (Get-Date).AddMinutes(5)
    $reachable = $false
    while ((Get-Date) -lt $deadline) {
        if (Test-Connection -ComputerName $ip -Count 1 -Quiet -ErrorAction SilentlyContinue) {
            try {
                # Quick WinRM probe; succeeds when the VM has finished booting
                Invoke-Command -ComputerName $ip -ScriptBlock { $true } -ErrorAction Stop | Out-Null
                $reachable = $true
                break
            } catch { Start-Sleep -Seconds 10 }
        } else {
            Start-Sleep -Seconds 10
        }
    }
    if (-not $reachable) {
        $summary += "wait! $vm not reachable after 5 minutes"
        continue
    }

    # 3. Check + start critical services on the VM
    if ($serviceMap.ContainsKey($vm)) {
        $svcReport = Invoke-Command -ComputerName $ip -ArgumentList @(,$serviceMap[$vm]) -ScriptBlock {
            param($services)
            $report = @()
            foreach ($s in $services) {
                $svc = Get-Service -Name $s -ErrorAction SilentlyContinue
                if (-not $svc) { $report += "$s : missing"; continue }
                if ($svc.Status -ne 'Running') {
                    Start-Service $s
                    Start-Sleep -Seconds 2
                    $svc.Refresh()
                }
                $report += "$s : $($svc.Status)"
            }
            return $report
        }
        $summary += "svc $vm : $($svcReport -join ', ')"
    } else {
        $summary += "svc $vm (no services configured)"
    }
}

# 4. Notify Teams
$teamsBody = @{
    text = "Webhook post-failover for VPG **$($body.vpg)**:`n" + ($summary -join "`n")
} | ConvertTo-Json
try {
    Invoke-RestMethod -Uri $teamsWebhook -Method POST -ContentType 'application/json' -Body $teamsBody | Out-Null
} catch {
    $summary += "teams! notification failed: $($_.Exception.Message)"
}

# Return the summary so it shows up in the webhook log + outbound callback
$summary -join "`n"
```
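Three things to call out:
- **PowerShell remoting to the VM** uses the gMSA's network identity (or whoever the service runs as). Make sure the gMSA / service account can `Invoke-Command` to the failed-over hosts; usually that means the account is a local admin on the target VMs, or you've configured constrained delegation.
- **Remoting by IP address** needs extra WinRM setup. The handler passes `$ip` to `Invoke-Command`, and Kerberos authentication does not work against a bare IP: WinRM accepts an IP target only over HTTPS, or when the address is in the client's TrustedHosts list and explicit credentials are supplied. If neither fits your environment, target the VM's host name instead once the DNS update has taken effect.
- **WinRM** must be enabled on the failed-over VMs for the remoting calls to work. `Enable-PSRemoting` is the simplest, but most prod environments configure WinRM via Group Policy.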
## 3. Configure the endpoint in the GUI
**File → New endpoint:**
| Section | Setting | Value |
|---|---|---|
| Identity | Slug | `post-failover` |
| Identity | Description | "Zerto post-recovery: DNS + service checks" |
| Auth | Mode | **Bearer** |
| Auth | Bearer secret | generate a 32-byte random string; copy it for the Zerto script's token file |
| Allowed clients | (one per line) | `10.0.0.0/8` *(your ZVM's network)* |
| Executor | Type | **Windows PowerShell** |
| Executor | Script path | `C:\Scripts\post-failover-handler.ps1` |
| Data passing | JSON body to stdin | ✓ |
| Run as | Identity | **Service** if the service runs under a gMSA with the right rights, otherwise **SpecificUser** with a delegated account |
| Response | Mode | **Async** ← critical: this is what makes the Zerto script non-blocking |
| Response | Timeout (sec) | `600` *(this is the cap on the long-running handler script, not the Zerto-facing response)* |
| Response | Fail on non-zero exit | unticked *(async hooks have no caller to receive a 502)* |
Save. Right-click the row → **Copy URL** to grab `http://webhook.dr.contoso.local:8080/hook/post-failover` and paste it into `$WebhookUrl` at the top of the Zerto-side script.
> **Why Bearer instead of HMAC?** Both work. Bearer is simpler — drop the token in a file on the ZVM that's readable by the ZVM service account and you're done. HMAC requires the Zerto-side script to compute a signature, which is doable but adds a few lines of code. Pick what fits your environment.
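
For a sense of what "a few lines of code" means on the sending side, here is a Python sketch of computing a body signature. The Zerto-side script itself is PowerShell, so this only illustrates the computation; `signed_headers` is a hypothetical name, and the `X-Hub-Signature-256` header with a `sha256=` prefix assumes the endpoint keeps the GitHub-style HMAC defaults.

```python
import hashlib
import hmac
import json

def signed_headers(secret: bytes, body: bytes,
                   header_name: str = "X-Hub-Signature-256") -> dict:
    """Build request headers carrying an HMAC-SHA256 signature of the body."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return {"Content-Type": "application/json", header_name: "sha256=" + digest}

# Serialize once, then sign and send those exact bytes.
body = json.dumps({"operation": "failover", "vpg": "SmokeTest"},
                  separators=(",", ":")).encode()
headers = signed_headers(b"example-secret", body)   # placeholder secret
print(headers["X-Hub-Signature-256"][:7])
```

The one rule the sender has to respect is to sign the same bytes it transmits; serializing the JSON a second time for the request body risks a mismatch.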
## 4. Wire up the bearer token
Place the bearer token in a file the ZVM service account can read (and nobody else):
```powershell
# on the ZVM, from elevated PowerShell
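$token = (New-Guid).ToString('N') # or paste the value from the GUI; it must match the endpoint's Bearer secret
$tokenPath = 'C:\ProgramData\Zerto\webhook-token.txt'
# ascii avoids the byte-order mark that Windows PowerShell's utf8 encoding prepends
$token | Out-File -LiteralPath $tokenPath -Encoding ascii -NoNewline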
icacls $tokenPath /inheritance:r /grant 'NT SERVICE\Zerto Online Services:R' 'BUILTIN\Administrators:F' /T
```
Adjust the account name in the `icacls` call to whatever identity Zerto runs as on your version. The script reads from this path automatically; no change needed in the script itself.
## 5. Test before going live
In a maintenance window, fire the webhook by hand:
```powershell
# from any machine that can reach the webhook server
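$body = @{
    operation = 'test'
    vpg = 'SmokeTest'
    timestamp = (Get-Date).ToUniversalTime().ToString('o')
} | ConvertTo-Json -Compress
# Send the body via stdin (--data-binary '@-'): Windows PowerShell 5.1 strips the
# embedded double quotes when JSON is passed to a native program as an argument.
$body | curl.exe --silent --show-error --max-time 10 -X POST `
    -H "Authorization: Bearer paste-the-token" `
    -H "Content-Type: application/json" `
    --data-binary '@-' `
    http://webhook.dr.contoso.local:8080/hook/post-failover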
```
You'll get back `{"runId":"…","accepted":true}` immediately. Open the Webhook Server GUI and watch the log panel — within 30 seconds or so you'll see lines for the run. Confirm DNS records updated, services on each VM ended in `Running`, and the Teams notification arrived.
## Variations
### Different actions for failover vs. failback
Pass an `operation` field in the body and branch on it. The Zerto-side script already sends `operation = 'failover'`. Add a separate post-failback script (or detect from `$env:ZertoOperationType`) that sends `operation = 'failback'` and have the handler revert DNS to production IPs.
### Per-VPG endpoints
If you want fine-grained access control or different actions per VPG, create one endpoint per VPG (`post-failover-app`, `post-failover-db`, …) and give each its own bearer token. The GUI handles dozens of endpoints fine.
### Audit trail to a SIEM
Each endpoint can have an outbound **Callback** URL. Configure it with your SIEM's HTTP collector + an HMAC secret, and every run produces a JSON record with runId, exit code, duration, stdout, and stderr — perfect for compliance.