Change Log

Weekly Changelog: Device Monitoring, Performance Metrics Rollout & Enhanced Reliability

This week brings fully activated device monitoring (online/offline status), comprehensive performance metric collection (CPU, memory, etc.), and enhanced reliability for backups, temporary access, and device adoption across the platform.

-

What's New This Week

  • Comprehensive Device Monitoring (via mikrotik service): Real-time device online/offline status monitoring (has_pulse flag) is now fully activated and reflected across the platform. This provides instant visibility into your connected infrastructure's availability.
  • Detailed Performance Metrics (via metrics & mikrotik services): The metrics service, in conjunction with mikrotik heartbeat processing, now collects and stores detailed device performance metrics including CPU load, memory usage, uptime, and free disk space. This historical data is accessible for deeper insights into device health and performance trends.

Enhancements & Improvements

We've focused on improving the reliability and stability of core functions for services activated in previous weeks:

  • Enhanced Reliability for Transient Access (control-plane service): Improved the reliability of temporary access credentials (Winbox/SSH) and port forwards to ensure they expire correctly and are consistently applied/removed on management servers.
  • Improved Management Server Mapping (control-plane & mikrotik services): Enhanced the accuracy and reliability in mapping sites to their managing servers, ensuring commands and configurations reach the correct destinations.
  • Smoother Device Adoption (mikrotik & control-plane services): Improved the reliability of the process for adding new devices to the platform, including bootstrap script delivery and initial heartbeat processing.
  • Reliable Backups (backups & mikrotik services): Enhanced the reliability of automated backup generation triggered via SNS and processed by the mikrotik service, with configurations securely stored by the backups service in S3.

Bug Fixes

Addressed minor issues to improve overall system stability:

  • Performance Data Processing (within metrics service): Resolved minor issues related to the ingestion and processing of performance metric data from mikrotik service heartbeats.
  • Backup Data Access (within backups service): Fixed potential issues that, in rare cases, could prevent access to site-specific backup data, ensuring consistent availability of stored configurations.