Root Cause Overview
Root Cause Identification (RCI) is a technology integrated into AOS that automatically determines the root causes of complex network issues. RCI uses the AOS datastore for real-time network status and automatically correlates telemetry with the intent of each active blueprint. Root cause use cases include the following:
- Link broken
  - Symptoms: Both interfaces are operationally down; LLDP is missing on both sides; BGP peered across that link is operationally down.
- Link miscabled
  - Symptoms: LLDP indicates wrong neighbors; BGP peered across that link is operationally down.
- Operator shut interface
  - Symptoms: Both interfaces on the link are operationally down; the interface in question is administratively down; LLDP is missing on both sides; BGP peered across that link is operationally down.
- Disconnection between 2 devices
  - Symptoms: The union of the symptoms for link broken, link miscabled, and operator shut interface across all constituent links between a spine and a leaf. For instance, if there are 3 links between a spine and a leaf, 2 could be miscabled and 1 broken; together these result in a disconnection between that spine and that leaf.
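The symptom-to-cause mapping above can be sketched in a few lines of Python. This is only an illustration of the decision logic described in the list, not how AOS implements RCI; the `LinkSymptoms` fields and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkSymptoms:
    """Observed state of one spine-leaf link (hypothetical field names)."""
    both_oper_down: bool       # both interfaces are operationally down
    any_admin_down: bool       # an interface on the link is administratively down
    lldp_missing: bool         # LLDP is missing on both sides
    lldp_wrong_neighbor: bool  # LLDP reports an unexpected neighbor
    bgp_down: bool             # BGP peered across the link is down


def classify_link(s: LinkSymptoms) -> Optional[str]:
    """Map a symptom set to one of the link-level root causes, or None."""
    if not s.bgp_down:
        return None  # every listed root cause includes a down BGP session
    if s.lldp_wrong_neighbor:
        return "link miscabled"
    if s.both_oper_down and s.lldp_missing:
        # Admin-down separates an operator shutdown from a broken link.
        return "operator shut interface" if s.any_admin_down else "link broken"
    return None


def classify_pair(links: List[LinkSymptoms]) -> Optional[str]:
    """Report a disconnection only when every constituent link has a root cause."""
    causes = [classify_link(link) for link in links]
    if causes and all(cause is not None for cause in causes):
        return "disconnection between 2 devices"
    return None
```

With three links between a spine and a leaf, two miscabled and one broken, `classify_pair` reports a disconnection; if any one of the links is healthy, it reports nothing at the device-pair level.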
Enabling Root Cause Analysis
From the blueprint, navigate to Active > Root Causes.
Click Enable Root Cause Analysis.
Enter a Trigger Period or leave the default, and click Create to enable root cause analysis and return to the list view.
Viewing Root Cause Analysis
From the blueprint, navigate to Active > Root Causes and click the model name connectivity in the list.
Root cause analysis runs periodically. Each run produces zero or more root causes. Each root cause has a detection timestamp, a context, a human-readable description, and a list of the symptoms it caused. Each symptom, in turn, has its own context and human-readable description.
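As a rough mental model of that result structure, the following Python sketch mirrors the fields named above. The class names, field names, and the `summarize` helper are assumptions made for illustration; they are not the AOS API or its actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Symptom:
    context: Dict[str, str]  # e.g. which device or interface is affected
    description: str         # human-readable description


@dataclass
class RootCause:
    detected_at: datetime    # detection timestamp
    context: Dict[str, str]
    description: str         # human-readable description
    symptoms: List[Symptom] = field(default_factory=list)


def summarize(run: List[RootCause]) -> List[str]:
    """Render one line per root cause from a single analysis run."""
    return [
        f"{rc.detected_at.isoformat()} {rc.description} "
        f"({len(rc.symptoms)} symptoms)"
        for rc in run
    ]
```

A run that finds nothing is simply an empty list, which matches the "zero or more root causes" wording above.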
AOS version 3.0 added the RCI Connectivity Fault Model, which identifies miscabled leaf/spine links. To do so, AOS correlates neighbor-down faults on links whose interfaces are UP/UP with the intended LLDP neighbors.
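The correlation can be pictured as comparing intended LLDP neighbors with observed ones, restricted to ports that are up but whose neighbor session is down. The sketch below assumes that reading of the sentence above; the `find_miscabled` function and its inputs are hypothetical and do not reflect AOS internals.

```python
from typing import Dict, List, Set, Tuple

Port = Tuple[str, str]  # (device name, interface name)


def find_miscabled(
    intended: Dict[Port, Port],  # intended LLDP neighbor for each local port
    observed: Dict[Port, Port],  # LLDP neighbor actually seen on each port
    up_ports: Set[Port],         # ports whose link is UP/UP
    neighbor_down: Set[Port],    # ports whose peered neighbor session is down
) -> List[Port]:
    """Return local ports whose observed LLDP neighbor contradicts the intent."""
    miscabled = []
    for local, expected in intended.items():
        seen = observed.get(local)
        if seen is None or seen == expected:
            continue  # no LLDP data, or cabling matches the intent
        if local in up_ports and local in neighbor_down:
            miscabled.append(local)
    return sorted(miscabled)
```

For example, if two spine ports are cabled to each other's intended leaf, both ports are returned; once the observed neighbors match the intent, the result is empty.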