The Fault Log is your real-time, historical record of every detected network issue, from a full site outage to a temporary link degradation. Altostrat’s monitoring services automatically detect and log these events as Faults, providing the foundational data that powers both our Notifications and Reporting engines. Mastering the Fault Log is the key to effective troubleshooting and maintaining a healthy, resilient network.

Interpreting Common Faults

Site Offline (Heartbeat Failure)

This is a critical fault indicating that a site’s router has stopped communicating with the Altostrat platform. It is triggered after several consecutive missed heartbeat checks and signifies a full outage.
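
To make the "consecutive missed heartbeats" rule concrete, here is a minimal sketch of how such a detector could work. It is not Altostrat's implementation: the `HeartbeatMonitor` class, its field names, and the threshold of 3 are all assumptions for illustration, since the text above only says "several" missed checks.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks consecutive missed heartbeats for one site (illustrative only)."""

    # Hypothetical threshold; the documentation only says "several" checks.
    missed_threshold: int = 3
    consecutive_missed: int = 0
    offline: bool = False

    def record_check(self, heartbeat_received: bool) -> bool:
        """Record one check and return True if the site counts as offline."""
        if heartbeat_received:
            # Any successful heartbeat resets the streak and clears the fault.
            self.consecutive_missed = 0
            self.offline = False
        else:
            self.consecutive_missed += 1
            if self.consecutive_missed >= self.missed_threshold:
                self.offline = True
        return self.offline


monitor = HeartbeatMonitor()
# Two misses, one recovery, then three misses in a row.
checks = [False, False, True, False, False, False]
states = [monitor.record_check(ok) for ok in checks]
print(states)  # [False, False, False, False, False, True]
```

Requiring a streak rather than a single miss means one dropped check does not raise a critical fault; only a sustained loss of contact does.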

WAN Tunnel Offline

This fault means a specific WAN interface in your WAN Failover configuration has gone down. The site itself may still be online via a backup link, but this event indicates a problem with a specific ISP connection.

Site Rebooted

This event is logged whenever a managed router restarts, whether it was a planned reboot triggered by an admin or an unexpected one caused by a power outage or hardware issue.

Investigating Faults

You can view fault data from a high-level overview or dive deep into the history for a specific site.

The Recent Faults Dashboard

Your main SDX dashboard includes a Recent Faults widget. It shows every fault that is still unresolved, plus any fault that occurred in the last 24 hours, giving you immediate visibility into the current health of your fleet.
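
The widget's selection rule can be sketched as a simple filter. This is an illustration only: the `Fault` record, its field names, and the reading of "occurred" as "started" are assumptions, not the platform's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Fault:
    """Hypothetical fault record; field names are assumed for illustration."""

    resource: str
    started_at: datetime
    resolved_at: Optional[datetime] = None  # None means still unresolved


def recent_faults(faults: List[Fault], now: datetime,
                  window: timedelta = timedelta(hours=24)) -> List[Fault]:
    """Keep faults that are unresolved or that started within the window."""
    cutoff = now - window
    return [f for f in faults
            if f.resolved_at is None or f.started_at >= cutoff]


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
faults = [
    # Old but still open: shown because it is unresolved.
    Fault("site-a", now - timedelta(days=5)),
    # Already resolved, but started 3 hours ago: shown because it is recent.
    Fault("site-b", now - timedelta(hours=3), now - timedelta(hours=2)),
    # Resolved days ago: not shown.
    Fault("site-c", now - timedelta(days=3), now - timedelta(days=3)),
]
shown = recent_faults(faults, now)
print([f.resource for f in shown])
```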

The Site-Specific Fault Log

For a complete history, navigate to a specific site and click on the Faults tab. This provides a detailed, filterable log of every incident that has ever occurred for that site and its associated resources (like its WAN links).

Filtering to Find Answers

Use the filter controls at the top of the Fault Log to answer specific troubleshooting questions. For example:

To find full site outages from the past week:
  1. Set the Date Range to “Last 7 days”.
  2. Set the Severity filter to CRITICAL.
  3. Set the Type filter to site to see only full site outages.

To review the WAN link history for a single site:
  1. Navigate to your “Main Office” site’s Fault Log.
  2. Set the Type filter to wantunnel. This shows every time a specific WAN link has gone down or recovered.

To list every active incident across your fleet:
  1. Navigate to the main Fault Log from the top-level monitoring menu.
  2. Set the Status filter to unresolved. This gives you a real-time list of all currently active incidents across your entire fleet.
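
The filter combinations above can be sketched as queries over a list of fault records. This is a hedged illustration, not an Altostrat API: `FaultRecord`, `filter_faults`, and the field names are hypothetical. The values `CRITICAL`, `site`, `wantunnel`, and `unresolved` come from the steps above, while `WARNING` and `resolved` are assumed placeholder values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FaultRecord:
    """Hypothetical fault record used only for this example."""

    site: str
    fault_type: str   # "site" or "wantunnel", as in the Type filter
    severity: str     # "CRITICAL" from the text; "WARNING" is assumed
    status: str       # "unresolved" from the text; "resolved" is assumed
    started_at: datetime


def filter_faults(faults, *, since=None, severity=None, fault_type=None,
                  status=None, site=None):
    """Return the faults that match every filter that is not None."""
    matches = []
    for fault in faults:
        if since is not None and fault.started_at < since:
            continue
        if severity is not None and fault.severity != severity:
            continue
        if fault_type is not None and fault.fault_type != fault_type:
            continue
        if status is not None and fault.status != status:
            continue
        if site is not None and fault.site != site:
            continue
        matches.append(fault)
    return matches


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
log = [
    FaultRecord("Main Office", "site", "CRITICAL", "resolved",
                now - timedelta(days=2)),
    FaultRecord("Main Office", "wantunnel", "WARNING", "resolved",
                now - timedelta(days=10)),
    FaultRecord("Branch", "wantunnel", "WARNING", "unresolved",
                now - timedelta(hours=1)),
    FaultRecord("Branch", "site", "CRITICAL", "resolved",
                now - timedelta(days=30)),
]

# Full site outages in the last 7 days.
outages = filter_faults(log, since=now - timedelta(days=7),
                        severity="CRITICAL", fault_type="site")
# WAN link history for one site.
wan_history = filter_faults(log, site="Main Office", fault_type="wantunnel")
# Every active incident across the fleet.
active = filter_faults(log, status="unresolved")

print(len(outages), len(wan_history), len(active))
```

Treating each filter as optional and combining them with a logical AND mirrors how stacked filter controls usually narrow a list, though the product's exact matching rules may differ.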

Managing and Acknowledging Faults

The Fault Log is also an interactive tool for incident management. When investigating a fault, you can click on it to:
  • Add Comments: Leave notes for your team about your troubleshooting steps (e.g., “Contacted ISP, they are investigating a local outage.”). This creates a valuable timeline of the incident response.
  • Manually Resolve: If an issue has been fixed but the platform has not yet detected the recovery, you can manually mark the fault as resolved.

Best Practices

Automate with Notifications

Don’t rely on watching the dashboard. Create Notification Groups to automatically alert the right team members the moment a critical fault is detected.

Use Comments for Context

Encourage your team to add comments to active faults. This provides a clear audit trail of the troubleshooting process and helps with post-incident reviews.

Analyze Your Top Faults

Periodically check the Top Faulty Resources analytics view. This automatically identifies the “noisiest” or most problematic sites and links in your network, helping you prioritize preventative maintenance.
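
As a rough illustration of what a "top faulty resources" ranking involves, the sketch below counts faults per resource and returns the most frequent ones. The function name and the resource labels are hypothetical, and the real analytics view may weight faults differently (for example by duration or severity).

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_faulty_resources(fault_resources: Iterable[str],
                         limit: int = 3) -> List[Tuple[str, int]]:
    """Rank resources by how many faults were logged against each one."""
    return Counter(fault_resources).most_common(limit)


# Each entry is the resource named in one logged fault (made-up sample data).
fault_resources = [
    "branch-wan1", "main-office", "branch-wan1",
    "warehouse", "branch-wan1", "main-office",
]
ranking = top_faulty_resources(fault_resources)
print(ranking)
```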