Interpreting Common Faults
Site Offline (Heartbeat Failure)
This is a critical fault indicating that a site’s router has stopped communicating with the Altostrat platform. It is triggered after several consecutive missed heartbeat checks and signifies a full outage.
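To make the "consecutive missed heartbeat" rule concrete, here is a minimal Python sketch of how such a detector could behave. It is illustrative only and is not Altostrat's implementation: the `HeartbeatMonitor` class and the threshold of 3 are assumptions, since the text above only says "several" missed checks.

```python
from dataclasses import dataclass

# Hypothetical threshold: the documentation only says "several" missed checks.
MISSED_THRESHOLD = 3


@dataclass
class HeartbeatMonitor:
    """Tracks consecutive missed heartbeats for one site (illustrative model)."""
    threshold: int = MISSED_THRESHOLD
    missed: int = 0
    offline: bool = False

    def record(self, heartbeat_received: bool) -> bool:
        """Record one check. Return True only when this check raises a new fault."""
        if heartbeat_received:
            # Any successful heartbeat resets the counter and clears the outage.
            self.missed = 0
            self.offline = False
            return False
        self.missed += 1
        if self.missed >= self.threshold and not self.offline:
            self.offline = True
            return True
        return False


monitor = HeartbeatMonitor()
# Two misses followed by a success do not raise a fault; three in a row do.
checks = [True, False, False, True, False, False, False, False]
raised = [monitor.record(ok) for ok in checks]
```

Requiring consecutive misses, rather than any single miss, is what keeps one dropped check from being reported as a full outage.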
WAN Tunnel Offline
This fault means a specific WAN interface in your WAN Failover configuration has gone down. The site itself may still be online via a backup link, but this event indicates a problem with a specific ISP connection.
Site Rebooted
This event is logged whenever a managed router restarts, whether it was a planned reboot triggered by an admin or an unexpected one caused by a power outage or hardware issue.
Investigating Faults
You can view fault data from a high-level overview or dive deep into the history for a specific site.
The Recent Faults Dashboard
Your main SDX dashboard includes a Recent Faults widget that shows any unresolved faults or those that have occurred in the last 24 hours, giving you immediate visibility into the current health of your fleet.
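The widget's selection rule (unresolved, or within the last 24 hours) can be sketched in Python as follows. The `Fault` record and its field names are hypothetical, and the sketch assumes "occurred" refers to the fault's start time; it does not reflect the platform's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Fault:
    """Hypothetical fault record; field names are assumptions."""
    site: str
    started_at: datetime
    resolved_at: Optional[datetime] = None  # None means still unresolved


def recent_faults(faults: List[Fault], now: datetime,
                  window: timedelta = timedelta(hours=24)) -> List[Fault]:
    """Keep faults that are unresolved or that started inside the window."""
    cutoff = now - window
    return [f for f in faults if f.resolved_at is None or f.started_at >= cutoff]


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
faults = [
    # Old but still unresolved: shown.
    Fault("branch-a", now - timedelta(days=3)),
    # Resolved, but started two hours ago: shown.
    Fault("branch-b", now - timedelta(hours=2), now - timedelta(hours=1)),
    # Old and resolved: hidden.
    Fault("branch-c", now - timedelta(days=5), now - timedelta(days=4)),
]
shown = [f.site for f in recent_faults(faults, now)]
```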
The Site-Specific Fault Log
For a complete history, navigate to a specific site and click on the Faults tab. This provides a detailed, filterable log of every incident that has ever occurred for that site and its associated resources (like its WAN links).
Filtering to Find Answers
Use the powerful filter controls at the top of the Fault Log to answer specific troubleshooting questions.
How do I see all critical outages this week?
- Set the Date Range to “Last 7 days”.
- Set the Severity filter to “CRITICAL”.
- Set the Type filter to “site” to see only full site outages.
Has my primary ISP been reliable at the main office?
- Navigate to your “Main Office” site’s Fault Log.
- Set the Type filter to “wantunnel”. This will show you every time a specific WAN link has gone down or recovered.
Are there any unresolved issues right now?
- Navigate to the main Fault Log from the top-level monitoring menu.
- Set the Status filter to “unresolved”. This will give you a real-time list of all currently active incidents across your entire fleet.
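The three filter recipes above can also be read as simple predicates over a list of fault records. The Python sketch below works on in-memory sample data and is not an Altostrat API: the `Fault` fields, the `filter_faults` helper, and the "WARNING" severity value are all assumptions made for illustration. Only the values "CRITICAL", "site", "wantunnel", "unresolved", and "resolved" come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Fault:
    """Hypothetical fault record mirroring the Fault Log filters."""
    site: str
    fault_type: str   # "site" or "wantunnel"
    severity: str     # "CRITICAL"; "WARNING" is an assumed non-critical value
    status: str       # "unresolved" or "resolved"
    started_at: datetime


def filter_faults(faults: List[Fault], *, site: Optional[str] = None,
                  fault_type: Optional[str] = None,
                  severity: Optional[str] = None,
                  status: Optional[str] = None,
                  since: Optional[datetime] = None) -> List[Fault]:
    """Return faults matching every filter that is set; None means 'any'."""
    return [
        f for f in faults
        if (site is None or f.site == site)
        and (fault_type is None or f.fault_type == fault_type)
        and (severity is None or f.severity == severity)
        and (status is None or f.status == status)
        and (since is None or f.started_at >= since)
    ]


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
faults = [
    Fault("Main Office", "site", "CRITICAL", "resolved", now - timedelta(days=2)),
    Fault("Main Office", "wantunnel", "WARNING", "unresolved", now - timedelta(hours=1)),
    Fault("Warehouse", "site", "CRITICAL", "resolved", now - timedelta(days=30)),
    Fault("Warehouse", "wantunnel", "WARNING", "resolved", now - timedelta(days=3)),
]

# All critical site outages in the last 7 days.
critical_this_week = filter_faults(
    faults, fault_type="site", severity="CRITICAL", since=now - timedelta(days=7))
# WAN link history for the main office.
main_office_wan = filter_faults(faults, site="Main Office", fault_type="wantunnel")
# Everything still open across the fleet.
open_now = filter_faults(faults, status="unresolved")
```

Each filter narrows the result independently, which is why combining Date Range, Severity, and Type in the first recipe leaves only recent full-site outages.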
Managing and Acknowledging Faults
The Fault Log is also an interactive tool for incident management. When investigating a fault, you can click on it to:
- Add Comments: Leave notes for your team about your troubleshooting steps (e.g., “Contacted ISP, they are investigating a local outage.”). This creates a valuable timeline of the incident response.
- Manually Resolve: If an issue has been fixed but the recovery has not yet been detected automatically, you can manually mark the fault as “resolved”.
Best Practices
Automate with Notifications
Don’t rely on watching the dashboard. Create Notification Groups to automatically alert the right team members the moment a critical fault is detected.
Use Comments for Context
Encourage your team to add comments to active faults. This provides a clear audit trail of the troubleshooting process and helps with post-incident reviews.
Analyze Your Top Faults
Periodically check the Top Faulty Resources analytics view. This automatically identifies the “noisiest” or most problematic sites and links in your network, helping you prioritize preventative maintenance.
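As a rough illustration of what a "top faulty resources" ranking computes, the Python sketch below counts faults per resource in a sample log and lists the most frequent ones. The sample records and the `top_faulty_resources` helper are hypothetical; how the platform actually scores resources is not described here.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical fault log: one (site, resource) pair per recorded fault.
fault_log: List[Tuple[str, str]] = [
    ("Main Office", "wan1"),
    ("Warehouse", "wan2"),
    ("Main Office", "wan1"),
    ("Branch A", "site"),
    ("Warehouse", "wan2"),
    ("Main Office", "wan1"),
]


def top_faulty_resources(records: List[Tuple[str, str]], n: int = 3):
    """Rank resources by how many faults were logged against them."""
    return Counter(records).most_common(n)


top = top_faulty_resources(fault_log, n=2)
for (site, resource), count in top:
    print(f"{site} / {resource}: {count} faults")
```

A plain count is the simplest possible ranking; weighting by outage duration or severity would be a reasonable alternative if raw frequency overstates brief, harmless flaps.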

