Root Causes

Root Cause Overview

Root Cause Identification (RCI) is a technology integrated into AOS that automatically determines the root causes of complex network issues. RCI uses the AOS datastore for real-time network status and automatically correlates telemetry with each active blueprint intent. Root cause use cases include the following:

Link broken
Symptoms: Both interfaces are operationally down, LLDP is missing on both sides, and the BGP session peered across that link is operationally down.
Link miscabled
Symptoms: LLDP indicates the wrong neighbors, and the BGP session peered across that link is operationally down.
Operator shut interface
Symptoms: Both interfaces on the link are operationally down, the interface in question is administratively down, LLDP is missing on both sides, and the BGP session peered across that link is operationally down.
Disconnection between two devices
Symptoms: The union of the symptoms for link broken, link miscabled, and operator shut interface across all constituent links between a spine and a leaf.

For instance, if there are three links between a spine and a leaf, two could be miscabled and one broken. Together, these faults result in a disconnection between that spine and that leaf.
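The symptom-to-root-cause mapping above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how AOS implements RCI: the `LinkObservation` fields, the `classify_link` and `is_disconnected` functions, and the rule ordering are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LinkFault(Enum):
    BROKEN = "link broken"
    MISCABLED = "link miscabled"
    OPERATOR_SHUT = "operator shut interface"


@dataclass
class LinkObservation:
    """Hypothetical per-link telemetry summary (field names are illustrative)."""
    a_oper_up: bool            # side A operationally up
    b_oper_up: bool            # side B operationally up
    a_admin_up: bool           # side A administratively up
    b_admin_up: bool           # side B administratively up
    lldp_seen: bool            # an LLDP neighbor is reported on the link
    lldp_matches_intent: bool  # reported neighbor equals the intended one
    bgp_up: bool               # BGP session across the link is up


def classify_link(obs: LinkObservation) -> Optional[LinkFault]:
    """Map the symptoms listed above to a single fault, or None if no rule matches."""
    if obs.bgp_up:
        return None  # every listed fault includes BGP being down
    both_down = not obs.a_oper_up and not obs.b_oper_up
    if both_down and not obs.lldp_seen:
        # An administratively down side distinguishes a shut from a break.
        if not (obs.a_admin_up and obs.b_admin_up):
            return LinkFault.OPERATOR_SHUT
        return LinkFault.BROKEN
    if obs.lldp_seen and not obs.lldp_matches_intent:
        return LinkFault.MISCABLED
    return None


def is_disconnected(links: List[LinkObservation]) -> bool:
    """A spine-leaf pair is disconnected when every constituent link is faulted."""
    return bool(links) and all(classify_link(link) is not None for link in links)
```

With three links where two are miscabled and one is broken, `is_disconnected` returns `True`, matching the example in the text.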

Enabling Root Cause Analysis

  1. From the blueprint, navigate to Active > Root Causes.

  2. Click Enable Root Cause Analysis.

  3. Enter a Trigger Period or leave the default, and click Create to enable root cause analysis and return to the list view.

Viewing Root Cause Analysis

From the blueprint, navigate to Active > Root Causes and click the model name connectivity in the list.

Root cause analysis runs periodically. Each run produces zero or more root causes. Each root cause has an associated detection timestamp, a context, a human-readable description, and a list of the symptoms it caused. Each symptom has its own associated context and human-readable description.
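The shape of a single analysis result, as described above, can be pictured as a small data model. The sketch below is an assumption made for illustration: the class names, field names, and the `summarize` helper are hypothetical and do not reflect the actual AOS schema or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class Symptom:
    context: Dict[str, str]  # e.g. which device or interface is affected (illustrative)
    description: str         # human-readable description


@dataclass
class RootCause:
    detected_at: datetime    # detection timestamp
    context: Dict[str, str]  # e.g. which link or device pair (illustrative)
    description: str         # human-readable description
    symptoms: List[Symptom] = field(default_factory=list)


def summarize(root_causes: List[RootCause]) -> List[str]:
    """Render one line per root cause; an empty run yields an empty list."""
    return [
        f"{rc.detected_at.isoformat()} {rc.description} ({len(rc.symptoms)} symptoms)"
        for rc in root_causes
    ]
```

A run that finds nothing is simply an empty list, which mirrors the "zero or more root causes" wording.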

AOS version 3.0 added the RCI Connectivity Fault Model, which identifies miscabled leaf/spine links. To do so, AOS correlates neighbor-down faults (where the interfaces are UP/UP) with the intended LLDP neighbors.
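The comparison against intended LLDP can be illustrated as follows. This is a minimal sketch under stated assumptions: the `Endpoint` type, the dictionary layout, and the `find_miscabled` function are hypothetical, and the sketch shows only the intent-versus-observation comparison, not the full correlation AOS performs.

```python
from typing import Dict, Tuple

# (device name, interface name); an illustrative representation only
Endpoint = Tuple[str, str]


def find_miscabled(
    intended: Dict[Endpoint, Endpoint],
    observed: Dict[Endpoint, Endpoint],
) -> Dict[Endpoint, Tuple[Endpoint, Endpoint]]:
    """Return local endpoints whose observed LLDP neighbor differs from intent.

    Endpoints with no observed neighbor are skipped: missing LLDP is a
    symptom of a broken or shut link rather than of miscabling.
    """
    miscabled = {}
    for local, expected in intended.items():
        actual = observed.get(local)
        if actual is not None and actual != expected:
            miscabled[local] = (expected, actual)
    return miscabled
```

For example, if `("spine1", "eth1")` is intended to face `("leaf1", "eth49")` but LLDP reports `("leaf2", "eth49")`, that endpoint is returned together with the expected and actual neighbors.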