These practices keep SDX predictable as your estate grows. They are written for operators managing many sites, not for a single demo router.

Name For Operations

  • Use site names that match how your team talks during incidents.
  • Include location, customer, or service context where it helps.
  • Keep policy and workflow names action-oriented, such as "Apply guest DNS filtering" or "Notify NOC on WAN packet loss".
  • Avoid names that only make sense to the creator.

Tag Early

Use tags to classify:
  • Region.
  • Customer or business unit.
  • Environment.
  • Site criticality.
  • ISP or access type.
  • Service ownership.
Tags make workflows, reports, searches, and incident response much easier later.
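As a rough illustration of why early tagging pays off, the sketch below selects sites by tag. It does not use any SDX API: the `Site` class, the `select_sites` helper, the site names, and the tag keys and values are all hypothetical and stand in for whatever inventory data you export or query.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical inventory record; not an SDX data type."""
    name: str
    tags: dict[str, str] = field(default_factory=dict)


def select_sites(sites, **required):
    """Return the sites whose tags match every required key/value pair."""
    return [
        site for site in sites
        if all(site.tags.get(key) == value for key, value in required.items())
    ]


# Example inventory with made-up names and tag values.
sites = [
    Site("jhb-acme-branch-01", {"region": "za-gp", "customer": "acme", "criticality": "high"}),
    Site("cpt-acme-branch-02", {"region": "za-wc", "customer": "acme", "criticality": "low"}),
    Site("jhb-lab-01", {"region": "za-gp", "environment": "lab", "criticality": "low"}),
]

# Pick low-criticality sites as candidates for an early test cohort.
low_risk = select_sites(sites, criticality="low")
```

With consistent tags, the same one-line selection can feed a report, a rollout cohort, or an incident search. Without them, each of those becomes a manual list.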

Keep Management Access Narrow

  • Use control plane policies to restrict router management services.
  • Keep 154.66.115.255/32 allowed through the management filter wherever SDX must manage the router.
  • Use transient access for short-lived operator access.
  • Avoid permanent broad WinBox or SSH exposure.
  • Recreate the management filter if site controls indicate the SDX rules have drifted.
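The checks above can be approximated offline with a small audit of a management allow-list. This is a minimal sketch using Python's standard `ipaddress` module. The `audit_allow_list` function, the `max_addresses` threshold, and the idea of exporting the allow-list as CIDR strings are assumptions for illustration; only the 154.66.115.255/32 address comes from the guidance above.

```python
import ipaddress

# Address taken from the guidance above; everything else here is illustrative.
SDX_MGMT = ipaddress.ip_network("154.66.115.255/32")


def audit_allow_list(allowed_cidrs, max_addresses=256):
    """Return findings for a management allow-list given as CIDR strings.

    Flags a missing SDX management address and any entry that covers
    more than max_addresses addresses (an arbitrary example threshold).
    """
    findings = []
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in allowed_cidrs]

    # subnet_of() raises TypeError across IP versions, so compare versions first.
    sdx_allowed = any(
        net.version == SDX_MGMT.version and SDX_MGMT.subnet_of(net)
        for net in networks
    )
    if not sdx_allowed:
        findings.append("SDX management address is not allowed")

    for net in networks:
        if net.num_addresses > max_addresses:
            findings.append(f"{net} is broader than {max_addresses} addresses")
    return findings
```

An empty result means the list keeps SDX reachable without a broad entry. A finding such as `0.0.0.0/0 is broader than 256 addresses` is the kind of permanent exposure the list above warns against.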

Test Before Fleet Changes

For scripts, policies, and workflows:
  1. Test on a non-critical site.
  2. Roll out to a small cohort.
  3. Watch outcomes and faults.
  4. Expand only after you understand the result.
This is slower than a one-click fleet change and much faster than recovering a broken fleet.
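The four steps above can be sketched as a staged rollout loop. This is an illustration of the pattern, not an SDX feature: `staged_rollout`, `apply_change`, `is_healthy`, and the cohort sizes are hypothetical names and values that you would replace with your own change mechanism and health checks.

```python
def staged_rollout(sites, apply_change, is_healthy, cohort_sizes=(1, 3)):
    """Apply a change in growing cohorts and stop at the first unhealthy one.

    cohort_sizes lists the sizes of the early cohorts; once they are used up,
    all remaining sites form the final cohort.
    Returns (applied_sites, halted).
    """
    applied = []
    remaining = list(sites)
    sizes = list(cohort_sizes)

    while remaining:
        size = sizes.pop(0) if sizes else len(remaining)
        cohort, remaining = remaining[:size], remaining[size:]

        for site in cohort:
            apply_change(site)
            applied.append(site)

        # Check the whole cohort before expanding to the next one.
        if not all(is_healthy(site) for site in cohort):
            return applied, True

    return applied, False
```

Note that every site in a failing cohort has already received the change by the time the rollout halts, which is why the first cohorts should be small and non-critical.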

Build Workflows Like Production Logic

  • Give each workflow a clear owner.
  • Store secrets in the vault.
  • Keep loops bounded.
  • Make actions idempotent.
  • Inspect the first production run after every change.
  • Use workflow chaining for repeated logic instead of copy-pasting large graphs.
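Two of these points, bounded loops and idempotent actions, are easiest to see in code. The sketch below is generic Python rather than SDX workflow syntax. The function names, the `dns_servers` key, and the use of `RuntimeError` to signal a transient failure are assumptions made for the example.

```python
def ensure_dns_servers(config, desired):
    """Idempotent action: write the DNS servers only if they differ.

    Returns True if the config was changed, False if it already matched.
    Running it twice leaves the same state as running it once.
    """
    if config.get("dns_servers") == list(desired):
        return False
    config["dns_servers"] = list(desired)
    return True


def retry_bounded(action, max_attempts=3):
    """Call action until it succeeds or max_attempts is reached.

    Returns (result, attempts_used). Re-raises the last RuntimeError
    when every attempt fails, so the loop can never run forever.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action(), attempt
        except RuntimeError:
            if attempt == max_attempts:
                raise
```

Because the action is idempotent, retrying it after a partial failure is safe. Because the retry is bounded, a persistent failure surfaces as an error instead of a stuck workflow.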

Keep Rollback Paths

  • Use configuration backups before major script or policy changes.
  • Document the intended state in the script or change description.
  • Keep old templates available until the replacement has run successfully.
  • Do not delete known-good workflow authorizations until dependent workflows have moved.

Monitor The Result

After changes, check:
  • Site status.
  • Fault log.
  • WAN health.
  • Workflow runs and node logs.
  • Scheduled script outcomes.
  • Captive portal sessions, if guest access was affected.
The change is not finished when you click save. It is finished when the managed sites report the expected state.
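One way to make "finished" concrete is to poll for the expected state after a change and report what still differs. The following is a minimal sketch under stated assumptions: `fetch_state` is a hypothetical callable that returns a site's reported state as a dictionary, and the keys in `expected` are invented for the example rather than taken from SDX.

```python
import time


def wait_for_expected_state(fetch_state, expected, attempts=5, delay_seconds=0.0):
    """Poll fetch_state() until it reports every expected key/value pair.

    Returns an empty dict once the state matches, otherwise a dict of
    {key: (expected_value, last_reported_value)} after the final attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    mismatches = {}
    for _ in range(attempts):
        reported = fetch_state()
        mismatches = {
            key: (value, reported.get(key))
            for key, value in expected.items()
            if reported.get(key) != value
        }
        if not mismatches:
            return {}
        time.sleep(delay_seconds)
    return mismatches
```

A non-empty result names exactly which checks failed, which is more useful in a change record than a generic "verification failed".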