Systems involved
| System | Role |
|---|---|
| Studio Procedures | Voice DR drill runbook. |
| Primary SBC + Secondary SBC | Active and standby. |
| FreeRADIUS primary + secondary | Active and standby. |
| Anycast / DNS routing | Where carrier-side traffic is steered. |
| TestCallin / SIPp | Synthetic call generation for validation. |
| Bandwidth / Telnyx | Verify carrier sees the new SBC IP and accepts. |
| Microsoft Teams #voice-ops | Drill war room. |
| Gmail / Outlook | Pre-drill customer notice (large enterprise tenants). |
| Studio Files | Compliance evidence pack. |
Walkthrough
Pre-drill announcement
72 hours before the drill, Copilot drafts the customer notice in Gmail or Outlook. The notice covers the drill window, the expected impact (none), what to do if real symptoms appear, and a single point of contact for the window. It goes to the contacts for large enterprise tenants.
Open the drill war room
At T-15 minutes, open the #voice-ops Teams thread with the drill checklist, the rollback path, the on-call names, and the validation criteria. The drill procedure starts.
Capture pre-drill state
Copilot snapshots: active calls per SBC, RADIUS auth rate, carrier-side reachability checks, the current Anycast announcement state. The snapshot is the baseline for validation.
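To make the idea of a baseline concrete, here is a minimal Python sketch of a snapshot record and a comparison against it. The field names, the `regressions` helper, and the 10% tolerance on auth rate are all illustrative assumptions, not Studio's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrillSnapshot:
    """Point-in-time state. Field names are hypothetical."""
    active_calls: dict        # SBC name -> active call count
    radius_auth_rate: float   # successful auths per second
    carrier_reachable: dict   # carrier name -> reachability check passed
    anycast_announced: dict   # SBC name -> prefix currently announced


def regressions(baseline, current, max_auth_rate_drop=0.10):
    """List ways `current` is worse than `baseline` (empty list means none).

    max_auth_rate_drop is an assumed tolerance, not a contractual value.
    """
    findings = []
    for carrier, was_ok in baseline.carrier_reachable.items():
        if was_ok and not current.carrier_reachable.get(carrier, False):
            findings.append(f"carrier no longer reachable: {carrier}")
    floor = baseline.radius_auth_rate * (1 - max_auth_rate_drop)
    if current.radius_auth_rate < floor:
        findings.append(
            f"RADIUS auth rate {current.radius_auth_rate} below floor {floor}"
        )
    return findings
```

Later validation steps could call `regressions(baseline, current)` and treat any non-empty result as a finding to record.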
Fail SBC to secondary
Through SSH and the routing connector, withdraw the primary SBC’s Anycast announcement. The secondary becomes preferred. Existing calls on the primary continue; new calls land on the secondary within 30 seconds.
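The 30-second expectation can be checked after the fact from call records. The sketch below assumes each new call is available as a `(started_at, sbc_name)` pair and that the SBCs are labelled `"primary"` and `"secondary"`; both are hypothetical conventions for illustration.

```python
from datetime import datetime, timedelta


def late_primary_calls(withdrawn_at, new_calls, cutover_seconds=30):
    """Return new calls that still landed on the primary after the deadline.

    withdrawn_at: when the primary's Anycast announcement was withdrawn.
    new_calls: iterable of (started_at, sbc_name) tuples for calls that
        started after the withdrawal (existing calls are not included).
    """
    deadline = withdrawn_at + timedelta(seconds=cutover_seconds)
    return [
        (started_at, sbc)
        for started_at, sbc in new_calls
        if sbc == "primary" and started_at > deadline
    ]
```

An empty result means every new call after the deadline landed on the secondary. Calls that reach the primary inside the first 30 seconds are tolerated by this check.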
Fail RADIUS to secondary
Stop the primary RADIUS. The secondary takes over. Auth rate stays inside SLA. The procedure captures the failover transition time.
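The source does not say how the transition time is measured. One plausible definition, sketched here as an assumption, is the gap between stopping the primary and the first accepted authentication on the secondary. The event tuple shape and server labels are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def radius_transition_seconds(primary_stopped_at, auth_events) -> Optional[float]:
    """Seconds from primary stop to first accepted auth on the secondary.

    auth_events: iterable of (timestamp, server, accepted) tuples.
    Returns None if the secondary never accepted an auth after the stop.
    """
    accepted_on_secondary = [
        ts
        for ts, server, accepted in auth_events
        if server == "secondary" and accepted and ts >= primary_stopped_at
    ]
    if not accepted_on_secondary:
        return None
    return (min(accepted_on_secondary) - primary_stopped_at).total_seconds()
```

A `None` result would mean the failover never completed, which is a drill failure rather than a slow transition.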
Validate inbound and outbound
Run a TestCallin sweep: inbound on five test DIDs, outbound to five test endpoints, each both with and without media. The sweep passes when every call connects and bidirectional audio is confirmed on every call that carries media.
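The sweep is a small test matrix, which the following sketch expands and evaluates. It does not call TestCallin or SIPp; the case and result dictionaries are a hypothetical shape, and it assumes signalling-only calls pass on connect alone while media calls also need audio in both directions.

```python
from itertools import product


def build_sweep(inbound_dids, outbound_endpoints):
    """Expand targets into one test case per (target, media mode)."""
    cases = []
    for direction, targets in (
        ("inbound", inbound_dids),
        ("outbound", outbound_endpoints),
    ):
        for target, with_media in product(targets, (True, False)):
            cases.append(
                {"direction": direction, "target": target, "with_media": with_media}
            )
    return cases


def sweep_failures(results):
    """Return the results that failed.

    Each result is a case dict plus 'connected' and, for media calls,
    'audio_both_ways'.
    """
    failed = []
    for r in results:
        audio_ok = (not r["with_media"]) or r.get("audio_both_ways", False)
        if not (r["connected"] and audio_ok):
            failed.append(r)
    return failed
```

With five DIDs and five endpoints, `build_sweep` yields 20 cases: ten inbound and ten outbound, half of each with media.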
Validate 911 routing
The single most important test. Place a 911 test call from a test endpoint with a known address. Confirm it routes to the test PSAP for the address, not to a stale primary-site path.
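The two conditions in this step, correct PSAP and no stale primary-site path, can be expressed as a check that returns explicit failure reasons. The record fields and site labels below are illustrative assumptions; how the routed PSAP and egress site are actually observed depends on the carrier and test tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmergencyTestResult:
    """Outcome of one 911 test call. Field names are hypothetical."""
    expected_psap: str  # test PSAP for the endpoint's known address
    routed_psap: str    # PSAP the call actually reached
    egress_site: str    # site the call left through: "primary" or "secondary"


def validate_911(result):
    """Return failure reasons; an empty list means the step passes."""
    failures = []
    if result.routed_psap != result.expected_psap:
        failures.append(
            f"routed to {result.routed_psap}, expected {result.expected_psap}"
        )
    if result.egress_site != "secondary":
        failures.append(f"call left via stale path: {result.egress_site}")
    return failures
```

Returning reasons rather than a boolean lets the failure text go straight into the drill record.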
Hold for the soak window
Stay on the secondary for the contracted soak window (often 2 hours). Watch every metric. Any metric outside SLA ends the drill, triggers rollback, and is documented.
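The soak rule amounts to scanning metric samples in time order and stopping at the first SLA breach. A minimal sketch follows; the metric names and limits used with it are placeholders, since the real SLA values come from the contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlaLimit:
    """Allowed range for one metric; either bound may be omitted."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return True
        if self.maximum is not None and value > self.maximum:
            return True
        return False


def first_violation(samples, limits):
    """Return (elapsed_seconds, metric, value) for the first breach, else None.

    samples: iterable of (elapsed_seconds, {metric_name: value}) in time order.
    limits: {metric_name: SlaLimit}; metrics without a limit are ignored.
    """
    for elapsed, metrics in samples:
        for name, value in metrics.items():
            limit = limits.get(name)
            if limit is not None and limit.violated_by(value):
                return (elapsed, name, value)
    return None
```

A non-`None` result is the trigger for rollback, and the returned tuple is the detail to document.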
Fail back
Reverse the procedure: re-announce the primary SBC, restart the primary RADIUS, and validate again. Capture the restoration time and the call-continuity status.
Where Studio earns its keep
- The drill runs as a procedure, so the next quarter’s drill is the same drill — not a rewrite from memory.
- 911 validation is a hard step, not an afterthought — the procedure does not pass without it.
- The evidence pack is the procedure’s output, not a separate document someone has to write the next week.
- The customer notice and the regulator evidence reference the same drill ID, so there’s a clean audit trail without manual cross-referencing.
Related
Procedures
Build the Voice DR drill once and run it every quarter.
Files and artifacts
The compliance evidence pack lives in the workspace.