Reliability model
Beacon's operational robustness strategy for critical events.
explanation • updated 2026-03-15
Reliability goals
- Keep processing predictable during spikes and partial failures.
- Reduce double-execution incidents on mutable operations.
- Ensure fast recovery with explicit runbooks.
Protection layers
- Fast ingestion: receive and acknowledge events with minimal validation.
- Idempotent persistence: deduplicate each operation at both the technical and the business level.
- Async execution: domain-isolated workers with controlled retries.
- Actionable observability: metrics, alerts, and audit trail.
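As a rough illustration of the idempotent persistence layer, the sketch below deduplicates on two keys and answers duplicates with `409`. The names `IdempotentStore`, `delivery_id`, and `business_key` are hypothetical and do not come from Beacon; an in-memory dict stands in for whatever unique constraints the real store enforces.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentStore:
    """In-memory stand-in for a table with unique constraints (hypothetical)."""

    _by_delivery: dict = field(default_factory=dict)
    _by_business: dict = field(default_factory=dict)

    def persist(self, delivery_id: str, business_key: str, payload: dict) -> int:
        """Return 201 when the event is stored, 409 when it is a duplicate."""
        # Technical dedupe: the same delivery arrived twice (e.g. a sender retry).
        if delivery_id in self._by_delivery:
            return 409
        # Business dedupe: a new delivery targets an operation already recorded.
        if business_key in self._by_business:
            return 409
        self._by_delivery[delivery_id] = payload
        self._by_business[business_key] = payload
        return 201


store = IdempotentStore()
results = [
    store.persist("d-1", "order-42:capture", {"amount": 10}),
    store.persist("d-1", "order-42:capture", {"amount": 10}),  # redelivery
    store.persist("d-2", "order-42:capture", {"amount": 10}),  # same operation, new delivery
    store.persist("d-3", "order-43:capture", {"amount": 5}),
]
print(results)  # [201, 409, 409, 201]
```

In a real deployment the two checks and the write would need to be atomic, typically through database unique constraints rather than application-level lookups.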
Retry and backoff strategy
| Error class | Policy | Escalation |
|---|---|---|
| Transient | retry with exponential backoff | alert when threshold is exceeded |
| Logical/validation | no automatic retry | open manual action with context |
| External dependency degraded | limited retry + circuit breaker | trigger operational mitigation mode |
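The table can be read as a small decision function. The following is a minimal sketch under stated assumptions: the attempt limits, base delay, and cap are illustrative values (the document does not specify any), and `next_action` and `backoff_delay` are hypothetical names. Jitter, which is commonly added to exponential backoff, is left out to keep the example deterministic.

```python
import enum


class ErrorClass(enum.Enum):
    TRANSIENT = "transient"
    LOGICAL = "logical"
    DEPENDENCY_DEGRADED = "dependency_degraded"


# Assumed limits; the source does not state concrete values.
MAX_ATTEMPTS = {
    ErrorClass.TRANSIENT: 5,
    ErrorClass.LOGICAL: 1,
    ErrorClass.DEPENDENCY_DEGRADED: 2,
}
BASE_DELAY_S = 0.5
MAX_DELAY_S = 30.0


def backoff_delay(attempt: int) -> float:
    """Delay after failed attempt number `attempt` (1-based): base * 2^(attempt-1), capped."""
    return min(MAX_DELAY_S, BASE_DELAY_S * 2 ** (attempt - 1))


def next_action(error_class: ErrorClass, attempts_made: int, breaker_open: bool = False) -> tuple:
    """Return ("retry", delay_seconds) or ("escalate", reason)."""
    # Logical/validation errors are never retried automatically.
    if error_class is ErrorClass.LOGICAL:
        return ("escalate", "open manual action with context")
    # An open circuit breaker short-circuits retries against a degraded dependency.
    if error_class is ErrorClass.DEPENDENCY_DEGRADED and breaker_open:
        return ("escalate", "trigger operational mitigation mode")
    # Retry budget exhausted: escalate according to the error class.
    if attempts_made >= MAX_ATTEMPTS[error_class]:
        if error_class is ErrorClass.TRANSIENT:
            return ("escalate", "alert: retry threshold exceeded")
        return ("escalate", "trigger operational mitigation mode")
    return ("retry", backoff_delay(attempts_made))


print(next_action(ErrorClass.TRANSIENT, 3))
print(next_action(ErrorClass.LOGICAL, 1))
```

Keeping the policy in one pure function makes it straightforward to unit test each row of the table independently of the worker that performs the retries.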
Daily operating signals
- Backlog above baseline by section/event.
- Unexpected growth in idempotent `409` conflicts.
- Time-to-final-state above declared SLO.
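One way to turn these signals into a daily check is sketched below. The field names, the use of a p95 latency, and the 50% growth limit are all assumptions made for illustration; the document does not define how baselines or "unexpected growth" are measured.

```python
from dataclasses import dataclass


@dataclass
class DailySnapshot:
    """Hypothetical daily metrics for one section or event type."""

    backlog: int
    backlog_baseline: int
    conflicts_409: int
    conflicts_409_previous: int
    p95_time_to_final_s: float
    slo_time_to_final_s: float


def breached_signals(s: DailySnapshot, conflict_growth_limit: float = 0.5) -> list:
    """Return the names of the signals that exceed their limits."""
    breached = []
    if s.backlog > s.backlog_baseline:
        breached.append("backlog_above_baseline")
    # Relative growth versus the previous period; max(..., 1) avoids division by zero.
    previous = max(s.conflicts_409_previous, 1)
    growth = (s.conflicts_409 - s.conflicts_409_previous) / previous
    if growth > conflict_growth_limit:
        breached.append("409_conflict_growth")
    if s.p95_time_to_final_s > s.slo_time_to_final_s:
        breached.append("time_to_final_state_above_slo")
    return breached


snapshot = DailySnapshot(
    backlog=120,
    backlog_baseline=100,
    conflicts_409=30,
    conflicts_409_previous=10,
    p95_time_to_final_s=40.0,
    slo_time_to_final_s=60.0,
)
print(breached_signals(snapshot))  # ['backlog_above_baseline', '409_conflict_growth']
```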