webhook-server/docs/troubleshooting.md
justin f00ee0cf3a v0.1.2: Config Checkpoints dialog, descriptions, daily auto-snapshot, docs (#3)
2026-05-08 10:49:09 -04:00


Troubleshooting

This page indexes the most common ways things go wrong, where to look, and what to do.

Where to look first

| Symptom | First check |
| --- | --- |
| GUI shows "Disconnected" | Service running? `Get-Service WebhookServer` |
| Hook returns 404 | Slug typo, or endpoint disabled |
| Hook returns 401 | Auth header / signature mismatch |
| Hook returns 403 | IP allowlist denies the caller |
| Hook returns 200 but nothing happens | Response is the script's stdout; check exit code, stderr |
| Hook returns 502 | Script ran and exited non-zero. Body has stderr. |
| Hook returns 500 | Launch error (script not found, invalid path) |
| Hook hangs | Timeout reached, or script is waiting on stdin |
| Calc / UI doesn't appear despite InteractiveUser | See Run As modes, "Common pitfalls" |

Where the logs are

C:\ProgramData\WebhookServer\logs\webhook-YYYYMMDD.log — daily rolling, 14-day retention by default.

Every webhook run logs:

  • `[INF] Run <id> <slug> ok exit=0 dur=<ms>ms stdout=...` on success
  • `[WRN] Run <id> <slug> non-zero exit=<n> dur=<ms>ms stdout=... stderr=...` on script failure
  • `[WRN] Run <id> <slug> failed to launch: <reason>` on launch failure
  • `[WRN] Run <id> <slug> timed out after <s>s; process killed` on timeout
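
The three `[WRN]` shapes above can be pulled out of a log file with one regular expression. The following is a minimal sketch, not part of Webhook Server: `Get-FailedRun` is a hypothetical helper name, and it assumes run IDs and slugs contain no whitespace.

```powershell
# Hypothetical helper: list every failed run recorded in one log file.
# Assumption: run IDs and slugs contain no whitespace.
function Get-FailedRun {
    param([Parameter(Mandatory)][string]$Path)

    # Matches the non-zero exit, launch failure, and timeout line shapes.
    $pattern = '\[WRN\] Run (?<id>\S+) (?<slug>\S+) ' +
               '(?<reason>non-zero exit=-?\d+|failed to launch|timed out)'

    Select-String -LiteralPath $Path -Pattern $pattern | ForEach-Object {
        $g = $_.Matches[0].Groups
        [pscustomobject]@{
            RunId  = $g['id'].Value
            Slug   = $g['slug'].Value
            Reason = $g['reason'].Value
            Line   = $_.LineNumber
        }
    }
}
```

To scan today's log, pass a path such as `"C:\ProgramData\WebhookServer\logs\webhook-$(Get-Date -Format yyyyMMdd).log"`. If your log lines carry extra fields between the level tag and `Run`, adjust the pattern accordingly.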

The GUI's bottom panel auto-refreshes the same log file every 3 seconds. Tick the Auto-scroll checkbox to keep it pinned to the latest line.

Common issues

"Disconnected: Access to the path is denied" right after install

You launched the GUI without elevation. The admin pipe's ACL grants full control to SYSTEM and Administrators only, and under UAC a non-elevated process runs with a filtered standard-user token, which that ACL rejects.

Fix in v0.1.1+: nothing to do. The GUI's manifest is requireAdministrator, so Start Menu and shortcut launches elevate automatically.

Fix in v0.1.0: right-click the Start Menu shortcut → Run as administrator, or upgrade.

"Connection refused" hitting the hook URL

Three possibilities, in order of probability:

  1. Service stopped. Run Get-Service WebhookServer, then Start-Service WebhookServer if it is stopped.
  2. Wrong port. Default is 8080. Check Server → Settings → HTTP port in the GUI, or netstat -an | findstr :8080.
  3. Bound to a specific NIC and you're calling on another. Check Server → Settings → Listen on. If "Listen on all interfaces" is unchecked and you only ticked LAN IPs, calls to localhost may fail. Tick 127.0.0.1 too.
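
To tell these cases apart quickly, a plain TCP connect against each address you expect to work is often enough. This is a minimal sketch using the .NET `TcpClient` class; `Test-TcpPort` is a hypothetical helper name, and 8080 is simply the default port mentioned above.

```powershell
# Hypothetical helper: returns $true if a TCP connection to host:port succeeds.
function Test-TcpPort {
    param(
        [Parameter(Mandatory)][string]$HostName,
        [Parameter(Mandatory)][int]$Port,
        [int]$TimeoutMs = 2000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $task = $client.ConnectAsync($HostName, $Port)
        # Wait() throws if the connection was refused and returns $false on timeout.
        return ($task.Wait($TimeoutMs) -and $client.Connected)
    } catch {
        return $false
    } finally {
        $client.Close()
    }
}

# Check the loopback address on the default port.
Test-TcpPort -HostName 127.0.0.1 -Port 8080
```

If the loopback check fails but the same call against the host's LAN IP succeeds, the service is up and the listener is simply not bound to 127.0.0.1 (case 3).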

Hook works from localhost but not from another machine on the LAN

Windows Firewall. The installer doesn't add a firewall rule (intentional — you should choose your scope). Add one:

```powershell
# from elevated PowerShell on the webhook host
New-NetFirewallRule -DisplayName "Webhook Server HTTP 8080" -Direction Inbound `
    -Action Allow -Protocol TCP -LocalPort 8080 -Profile Domain,Private
```

Use -Profile Public only if you really mean it. Better: front the server with a reverse proxy and don't expose 8080 directly.

[WRN] Run … failed to launch: launch error: An error occurred trying to start process 'X'. Access is denied.

The most likely cause is SpecificUser mode failing in the psi.UserName launch path. This should be impossible in v0.1.1+, which calls LogonUser + CreateProcessAsUser directly. If you see it on what you believe is v0.1.1 or later, double-check the installed version: Get-Item "C:\Program Files\WebhookServer\WebhookServer.Service.exe" | % VersionInfo.

[WRN] Run … failed to launch: LogonUser (DOMAIN\user) failed

The credentials don't authenticate. Common causes:

  • Typo in the password (paste it back into the GUI to verify; the field is plaintext for an admin user)
  • Account locked / disabled / expired
  • The account is denied the logon type it needs. Check secpol.msc → Local Policies → User Rights Assignment → "Deny logon as a batch job" / "Deny logon locally"
  • For domain accounts: the host can't reach a DC

non-zero exit=-1073741502 (0xC0000142 STATUS_DLL_INIT_FAILED)

The new process couldn't initialize. With InteractiveUser mode this means we tried to open winsta0\default and the user's session token doesn't have access (e.g., no one's logged in). With SpecificUser this should not occur in v0.1.1+ — we deliberately don't set lpDesktop for that mode.
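
The log prints the exit code as a signed 32-bit decimal, while NTSTATUS values are usually documented in hex. A short sketch of the conversion, in case you need to look up a different code (`ConvertTo-NtStatusHex` is a hypothetical helper name):

```powershell
# Hypothetical helper: render a signed 32-bit exit code as an NTSTATUS-style hex string.
function ConvertTo-NtStatusHex {
    param([Parameter(Mandatory)][int]$ExitCode)
    # The X8 format prints the two's-complement bit pattern of a negative Int32.
    '0x{0:X8}' -f $ExitCode
}

ConvertTo-NtStatusHex -ExitCode (-1073741502)   # 0xC0000142
```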

Hook returns 502 with empty stdout/stderr

The script exited non-zero without printing anything. Set $ErrorActionPreference = 'Stop' at the top of the script so that any cmdlet failure becomes a terminating error with a clear message on stderr.
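
One way to structure a hook so that failures always leave a message behind is sketched below. `Invoke-HookBody` is a hypothetical wrapper, not something Webhook Server provides, and the missing-file lookup only stands in for real work.

```powershell
# Top of the hook script: promote non-terminating cmdlet errors to terminating ones.
$ErrorActionPreference = 'Stop'

# Hypothetical wrapper: runs the hook's work and returns the exit code to use.
function Invoke-HookBody {
    param([Parameter(Mandatory)][scriptblock]$Body)
    try {
        & $Body | Out-Host      # send the body's normal output to the host
        return 0
    } catch {
        # Put the failure reason on stderr so it shows up in the log and response.
        [Console]::Error.WriteLine("hook failed: $($_.Exception.Message)")
        return 1
    }
}

# Stand-in for real work: Get-Item on a missing path is normally non-terminating,
# but with 'Stop' it lands in the catch block above.
$missing = Join-Path ([IO.Path]::GetTempPath()) 'webhook-demo-no-such-file'
$code = Invoke-HookBody { Get-Item -LiteralPath $missing }
```

A real hook script would finish with `exit $code`, so the server sees the non-zero exit together with the stderr message.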

"ServiceState: ListenerSettingsChanged" → service restart

After saving Server Settings with a port or HTTPS change, the service stops itself so the SCM restarts it on the new bindings. The GUI briefly shows "Disconnected" then reconnects. If it doesn't reconnect within ~10 seconds:

```powershell
Get-Service WebhookServer | Format-List Status, StartType
```

If the status is Stopped, the SCM did not restart it: failure recovery only triggers on abnormal termination, and a clean stop does not qualify. Start it manually:

```powershell
Start-Service WebhookServer
```

GUI editor changes don't seem to take effect

After saving an endpoint, the service loads the new config in memory immediately — no restart needed. If a hook is mid-run when you save, that run finishes against the OLD config; the new config applies to subsequent runs.

If the GUI's grid still shows old values, hit any other endpoint or wait for the 3-second poll to refresh the display.

Tray icon doesn't appear

Check whether the GUI is running: Get-Process WebhookServer.Gui. If not, the tray icon doesn't exist (it's part of the GUI process). To have a persistent tray independent of the main window, leave the GUI running and minimize it — it'll hide-to-tray rather than truly close.

To run the GUI minimized at login: create a Windows shortcut to WebhookServer.Gui.exe, set "Run" to "Minimized" in the shortcut properties, and put it in your user's Startup folder (shell:startup). The auto-elevate manifest still takes effect.

Getting useful logs from a script

Inside your hook scripts, write to stderr for diagnostic info — Webhook Server logs stderr separately from stdout, and stderr is preserved even on success:

```powershell
[Console]::Error.WriteLine("processing item $i of $total")
```

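Or use Write-Error, which writes a non-terminating error as long as $ErrorActionPreference is left at its default (with 'Stop' it terminates the script):

```powershell
Write-Error "skipping bogus input"   # goes to stderr; non-terminating unless $ErrorActionPreference is 'Stop'
```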

The full stderr appears in the log line for the run, plus in the response body for sync calls.

Asking for help

If you're stuck, file an issue at:

https://github.com/recklessop/webhook-server/issues

Include:

  • Webhook Server version (Help → About, or the file version of the .exe)
  • Windows version (winver)
  • The slug + relevant bits of the endpoint config (NOT the secrets)
  • The log lines for the failing run (search for the runId)
  • What you expected vs. what happened
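
To gather the log lines for one run, something like the following sketch may help. `Get-RunLogLine` is a hypothetical helper name; it assumes the run ID appears as `Run <id> ` exactly as in the log formats listed earlier, and it searches every daily log still on disk.

```powershell
# Hypothetical helper: return every log line that mentions the given run ID.
function Get-RunLogLine {
    param(
        [Parameter(Mandatory)][string]$RunId,
        [string]$LogDir = 'C:\ProgramData\WebhookServer\logs'
    )
    # Assumption: each line for a run contains "Run <id> " followed by the slug.
    $needle = "Run $RunId "
    Get-ChildItem -LiteralPath $LogDir -Filter 'webhook-*.log' |
        Sort-Object Name |
        Select-String -SimpleMatch -Pattern $needle |
        ForEach-Object { $_.Line }
}
```

Review the output for secrets before pasting it into an issue, since stdout and stderr are logged verbatim.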