Monitor Webhook Performance
You can't fix what you can't see. Here's how to monitor your webhook infrastructure.
Why Monitoring Matters
Webhooks can fail silently. If you don't monitor deliveries, you often won't know about a failure until a customer complains. By then, you may have already lost data or revenue.
Key Metrics to Track
| Metric | Target | Alert When |
|---|---|---|
| Delivery success rate | >99.5% | <99% |
| P95 delivery latency | <2s | >5s |
| Consecutive failures per endpoint | 0 | >5 |
| Dead-letter queue (DLQ) depth | 0 | >100 |
| Retry rate | <5% | >10% |
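The "Alert When" column can be read as a set of simple threshold checks. The sketch below is one way to express them in Python; the `Snapshot` class and its field names are hypothetical and are not part of HookSniff, and rates are assumed to be stored as fractions rather than percentages.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical point-in-time metrics; field names are illustrative."""
    success_rate: float            # fraction of deliveries that succeeded (0.0 to 1.0)
    p95_latency_s: float           # P95 delivery latency in seconds
    max_consecutive_failures: int  # longest current failure streak on any endpoint
    dlq_depth: int                 # events waiting in the dead-letter queue
    retry_rate: float              # fraction of deliveries that needed a retry


def breached_alerts(s: Snapshot) -> List[str]:
    """Return the metrics that cross the "Alert When" thresholds in the table."""
    checks = [
        ("delivery_success_rate", s.success_rate < 0.99),
        ("p95_delivery_latency", s.p95_latency_s > 5.0),
        ("consecutive_failures", s.max_consecutive_failures > 5),
        ("dlq_depth", s.dlq_depth > 100),
        ("retry_rate", s.retry_rate > 0.10),
    ]
    return [name for name, breached in checks if breached]
```

A snapshot that meets every target returns an empty list, and values that sit exactly on a threshold (for example a success rate of exactly 99%) do not trigger, because the table uses strict comparisons.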
Built-in Monitoring
HookSniff provides several monitoring tools out of the box:
- Dashboard analytics: Real-time delivery stats, success rate charts, latency graphs
- Endpoint health: Per-endpoint health status with failure counts
- Delivery logs: Searchable, filterable log of every delivery attempt
- Alerts: Configure notifications for failure thresholds
- API endpoint: GET /v1/stats for programmatic access
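As a rough illustration of programmatic access, the sketch below calls GET /v1/stats using only the Python standard library. Only the path comes from this page. The base URL is a placeholder, bearer-token authentication is an assumption, and the shape of the JSON response is not documented here, so the code returns it undecoded beyond JSON parsing. Check the API reference for the real values.

```python
import json
import urllib.request

# Assumption: placeholder base URL; substitute the one for your HookSniff account.
BASE_URL = "https://api.example.com"


def build_stats_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /v1/stats request. Bearer authentication is an assumption."""
    return urllib.request.Request(
        url=f"{base_url}/v1/stats",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_stats(api_key: str, timeout_s: float = 10.0) -> dict:
    """Send the request and parse the JSON body; raises on HTTP or network errors."""
    request = build_stats_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the request easy to inspect or unit test without contacting the API.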
Grafana + OpenTelemetry
For advanced monitoring, HookSniff exports OpenTelemetry traces and Prometheus metrics. Connect to Grafana Cloud for custom dashboards and alerts.
Available metrics:
- hooksniff_deliveries_total: Total deliveries by status
- hooksniff_delivery_duration_seconds: Delivery latency histogram
- hooksniff_retries_total: Total retry attempts
- hooksniff_dlq_depth: Current DLQ size
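Because delivery latency is exported as a histogram, P95 is estimated from cumulative bucket counts rather than read directly. Prometheus exposes a histogram as `_bucket` series with an `le` (upper bound) label, ending in a `+Inf` bucket. The sketch below shows the general interpolation idea in Python, similar in spirit to PromQL's `histogram_quantile`; the function name and the bucket values in the usage note are made up for illustration and are not HookSniff's actual bucket layout.

```python
import math
from typing import List, Tuple


def quantile_from_buckets(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with (math.inf, total_count).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so no quantile
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Open-ended or empty bucket: fall back to the last finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, with buckets `[(0.5, 50), (1.0, 80), (2.0, 95), (5.0, 100), (math.inf, 100)]`, the 95th observation out of 100 sits at the top of the 1s to 2s bucket, so the estimate is about 2 seconds. The result is only as precise as the bucket boundaries allow.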
Setting Up Alerts
Configure alerts in the dashboard to get notified when things go wrong:
- Failure rate alert: Notify when success rate drops below 99%
- Endpoint down alert: Notify when an endpoint has 5+ consecutive failures
- DLQ alert: Notify when DLQ has 100+ unprocessed events
- Latency alert: Notify when P95 latency exceeds 5 seconds
See Dashboard Guide for alert configuration.
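If you also want to reproduce the endpoint down rule in your own tooling, for example from exported delivery logs, the logic is a per-endpoint counter that resets on success. The sketch below is a minimal Python version; the `FailureStreaks` class and the endpoint IDs are hypothetical, and the default threshold of 5 follows the "5+ consecutive failures" rule in the list above.

```python
from collections import defaultdict


class FailureStreaks:
    """Track consecutive delivery failures per endpoint (illustrative only)."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self._streaks = defaultdict(int)  # endpoint_id -> current failure streak

    def record(self, endpoint_id: str, succeeded: bool) -> bool:
        """Record one delivery attempt; return True if the endpoint should alert."""
        if succeeded:
            self._streaks[endpoint_id] = 0  # any success ends the streak
        else:
            self._streaks[endpoint_id] += 1
        return self._streaks[endpoint_id] >= self.threshold
```

Resetting on success is what separates this rule from the overall failure rate alert: an endpoint that fails intermittently never trips it, while one that is fully down trips it after five attempts.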