The Fault Log is the operational timeline for detected issues in SDX. It shows active and resolved events so you can understand what is broken now, what has recovered, and how long each event lasted.

Prerequisites

  • You have access to the team or sites you want to investigate.
  • You know the approximate site, time range, severity, or fault type if you are researching a specific incident.

What a Fault Shows

Fault rows can include:
  • Created time
  • Resolved time
  • Message
  • Severity
  • Type
  • Cause
  • Active or resolved status
  • Duration

Faults are normalized by SDX services and can feed notifications, workflows, and real-time portal updates.
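
As a rough illustration of how the fields above relate to each other, the sketch below models a fault row as a Python dataclass. The class and field names are hypothetical and do not reflect an SDX schema or API. Measuring the duration of a still-active fault up to the current time is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FaultRow:
    # Hypothetical field names mirroring the columns listed above.
    created_at: datetime
    message: str
    severity: str
    fault_type: str
    cause: str
    resolved_at: Optional[datetime] = None  # None while the fault is still active

    @property
    def is_active(self) -> bool:
        """A fault is active until it has a resolved time."""
        return self.resolved_at is None

    def duration(self, now: datetime) -> timedelta:
        """Time from creation to resolution, or to `now` if still active (assumption)."""
        end = self.resolved_at if self.resolved_at is not None else now
        return end - self.created_at
```

Under this model, the active or resolved status and the duration are both derived from the created and resolved times rather than stored separately.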

Offline Detection

Managed routers send heartbeats about every 30 seconds. SDX declares a site offline after 10 missed heartbeats, which creates a roughly five-minute sensitivity window. The downtime period starts at the first missed heartbeat and clears when the next successful heartbeat is received.

WAN faults are more specific. A WAN interface can go offline or experience packet loss while the site itself remains reachable through another path.
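
The heartbeat arithmetic for site offline detection can be sketched in Python as follows. This is a simplified model and not SDX's implementation. The function names are hypothetical, the interval is treated as exactly 30 seconds, and a silence at least as long as the window is assumed to count as 10 missed heartbeats.

```python
from datetime import datetime, timedelta

# Values taken from the text above; the real interval is approximate.
HEARTBEAT_INTERVAL = timedelta(seconds=30)
MISSED_HEARTBEATS_FOR_OFFLINE = 10


def detection_window() -> timedelta:
    """Silence required before a site is declared offline (10 x 30 s = 5 min)."""
    return HEARTBEAT_INTERVAL * MISSED_HEARTBEATS_FOR_OFFLINE


def downtime(last_heartbeat: datetime, next_heartbeat: datetime):
    """Return (start, end, duration) of the outage, or None if no fault is raised.

    The outage starts at the first missed heartbeat, one interval after the
    last heartbeat received, and ends at the next successful heartbeat.
    """
    if next_heartbeat - last_heartbeat < detection_window():
        return None  # Too few heartbeats were missed to declare the site offline.
    start = last_heartbeat + HEARTBEAT_INTERVAL
    return start, next_heartbeat, next_heartbeat - start
```

For example, if the last heartbeat arrived at 12:00:00 and the next one at 12:07:00, this model reports downtime from 12:00:30 to 12:07:00, a duration of 6 minutes 30 seconds. A two-minute gap stays below the window and raises no fault.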

Investigate a Fault

  1. Open Monitoring and select Fault Logging.
  2. Choose whether to show resolved faults.
  3. Filter by cause, severity, type, site, or message text when available.
  4. Open the affected site to inspect current device and WAN state.
  5. Compare the event timestamp with recent policy, script, workflow, or user changes.
  6. Confirm recovery in the fault row and current dashboard state.
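
If you work with fault data outside the portal, steps 2, 3, and 5 above can be approximated with a short Python sketch. The dictionary keys (`resolved_at`, `severity`, `type`, `site`, `message`, `created_at`, `at`) and the 30-minute lookback are assumptions made for illustration. They are not an SDX export format or API.

```python
from datetime import datetime, timedelta


def filter_faults(faults, *, include_resolved=False, severity=None,
                  fault_type=None, site=None, text=None):
    """Keep faults that match every filter that was supplied."""
    selected = []
    for fault in faults:
        if not include_resolved and fault["resolved_at"] is not None:
            continue
        if severity is not None and fault["severity"] != severity:
            continue
        if fault_type is not None and fault["type"] != fault_type:
            continue
        if site is not None and fault["site"] != site:
            continue
        if text is not None and text.lower() not in fault["message"].lower():
            continue
        selected.append(fault)
    return selected


def changes_before(fault, changes, window=timedelta(minutes=30)):
    """Return changes recorded within `window` before the fault was created."""
    created = fault["created_at"]
    return [c for c in changes if timedelta(0) <= created - c["at"] <= window]
```

A change that lands shortly before a fault is only a candidate cause. Confirm it against the current device and WAN state before acting on it.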

Advanced Use Cases

Use unresolved faults for live operations and resolved faults for incident review. Resolved fault history is helpful when you need to prove whether an issue was isolated, recurring, or tied to a maintenance window.

Use fault types as workflow entry points. Workflows can start from site offline, site online, WAN offline, WAN online, WAN packet loss, and WAN packet loss resolved events.

Build notification groups before incidents happen. See Notifications for routing fault events to the right responders.
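
The six workflow trigger events named in this section form three pairs, each with an opening event and a resolving event. The Python sketch below uses that pairing to track which faults are still open in a stream of events. The event identifiers and the `(site, event)` tuple shape are hypothetical. They are not SDX event names or a payload format.

```python
# Hypothetical identifiers for the six workflow trigger events.
RESOLVING_EVENT = {
    "site_offline": "site_online",
    "wan_offline": "wan_online",
    "wan_packet_loss": "wan_packet_loss_resolved",
}

# Reverse lookup: resolving event -> the opening event it closes.
OPENING_EVENT = {resolved: opened for opened, resolved in RESOLVING_EVENT.items()}


def open_faults(events):
    """Return the set of (site, opening_event) pairs that have not been resolved.

    `events` is an iterable of (site, event) tuples in chronological order.
    """
    still_open = set()
    for site, event in events:
        if event in RESOLVING_EVENT:
            still_open.add((site, event))
        elif event in OPENING_EVENT:
            still_open.discard((site, OPENING_EVENT[event]))
    return still_open
```

A workflow that escalates on an opening event could use the same pairing to stand down when the matching resolving event arrives.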