Interface Flapping (Fabric Interfaces) Probe

The Interface Flapping (Fabric Interfaces) probe determines if fabric interfaces are flapping and raises anomalies accordingly.

Probe Overview

If the number of times that the operational state of a fabric interface changes is greater than a specified number (Threshold) over a specified amount of time (Duration), then the interface is flapping and an anomaly is raised. Also, if the percentage of flapping interfaces exceeds the specified percentage (Max Flapping Interfaces Percentage), then an anomaly is raised for that device.

When you instantiate the predefined Interface Flapping (Fabric Interfaces) probe, you can customize the parameters mentioned above or leave the default values as they are, as shown in the screenshot below.

The probe performs the following tasks in stages:

  1. Identify fabric interfaces (leaf interfaces facing spines) on all leafs (using the leaf fabric interface status processor).

  2. For each interface, count the number of times that the operational state changes and create a time series from it (using the leaf fabric interface status history processor).

  3. If the number of state changes during the specified duration is more than the specified upper limit, then generate an anomaly (using the leaf fabric interface flapping processor).

  4. If an anomaly was created based on the previous stage, create a time series for it (using the

  5. Calculate the percentage of interfaces on devices with the anomaly. If the percentage is higher than the specified threshold, then raise a device-level anomaly (so that the recent history of the anomaly being raised and cleared can be inspected).

  6. Create a time series for the anomaly so that recent history can be inspected. The last "Anomaly History Count" anomaly state changes are stored for observation.

Probe Processors

The following stages are used in the interface flapping probe for fabric interfaces:

See the sections below for detailed information about each stage.

Leaf Fabric Interface Status

We begin with the source processor (no inputs), which is a Service Collector configured to collect interface status telemetry for all fabric interfaces on leaf devices, as shown in the screenshot below.

This processor keeps track of when the operational state (up, down) of the interface changes. It collects and outputs this information as leaf interface status (leaf_if_status). Each interface is identified by its system ID and interface name. Operational state and a few other details are included in the output as shown in the list below:

  • System ID - ID of the leaf device, usually the serial number

  • Interface - name of the interface

  • Remote Interface - interface name on the other end (new in Apstra version 5.1.0)

  • Remote System Label - the device name on the other end (new in Apstra version 5.1.0)

  • Value - operational state of the interface (up, down)

  • Updated - when the state was last updated
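As a rough illustration of the fields listed above, one output item could be modeled as follows. This is only a sketch: the class name, attribute names, types, and sample values are assumptions made for illustration, not the product's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class LeafIfStatus:
    """Hypothetical in-memory shape of one leaf_if_status output item."""

    system_id: str       # ID of the leaf device, usually the serial number
    interface: str       # name of the interface
    value: str           # operational state: "up" or "down"
    updated: datetime    # when the state was last updated
    remote_interface: Optional[str] = None      # new in Apstra 5.1.0
    remote_system_label: Optional[str] = None   # new in Apstra 5.1.0

    @property
    def key(self) -> Tuple[str, str]:
        # Each item is identified by its system ID and interface name.
        return (self.system_id, self.interface)


# All values below are made up for the example.
sample = LeafIfStatus(
    system_id="SERIAL0001",
    interface="et-0/0/48",
    value="up",
    updated=datetime(2024, 1, 1, tzinfo=timezone.utc),
    remote_interface="et-0/0/1",
    remote_system_label="spine1",
)
```

The `key` property mirrors the statement that each interface is identified by its system ID and interface name.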

The following screenshot is an example of the details that are collected from this processor.

leaf fabric interface status history

The output from the previous stage (leaf_if_status) becomes the input for this one (leaf_fab_int_status_accumulate). The leaf fabric interface status history is a set of interface status time series (one for each spine-facing interface on each leaf). Each set member has the following keys to identify it: system_id (ID of the leaf system, usually the serial number), interface (name of the interface).

Purpose: create a recent-history time series for each interface status. In terms of the number of samples, the time series holds the smaller of: 1024 samples, or the samples collected during the last 'total_duration' seconds (facade parameter).
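The retention rule (at most 1024 samples, and nothing older than 'total_duration' seconds) can be sketched with a bounded queue. This is a minimal illustration, not the Accumulate processor's implementation; the class and method names are hypothetical, and timestamps are plain seconds.

```python
from collections import deque

MAX_SAMPLES = 1024  # sample cap stated in the text


class StatusHistory:
    """Keeps (timestamp, state) samples bounded by count and by age."""

    def __init__(self, total_duration: float, max_samples: int = MAX_SAMPLES):
        self.total_duration = total_duration
        # A deque with maxlen drops the oldest sample when it is full.
        self.samples = deque(maxlen=max_samples)

    def add(self, timestamp: float, state: str) -> None:
        self.samples.append((timestamp, state))
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Drop samples older than total_duration seconds.
        while self.samples and self.samples[0][0] < now - self.total_duration:
            self.samples.popleft()


history = StatusHistory(total_duration=60)
for t, state in [(0, "up"), (10, "down"), (50, "up"), (75, "down")]:
    history.add(t, state)
# After the sample at t=75, the samples at t=0 and t=10 fall outside the window.
```

Whichever limit is reached first wins, which matches "the smaller of" in the description above.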

For this stage, the Accumulate processor is configured to collect leaf fabric status history, as shown in the screenshot below. The screenshot shows the default values, with which the probe checks whether the status changes more than 5 times within one minute.

This processor collects and outputs leaf fabric interface status accumulate (leaf_fab_int_status_accumulate). Each interface is identified by its system ID and interface name. The output fields are the same as those from the previous processor, with the addition of Count, which counts the state transitions, as shown below:

  • System ID - ID of the leaf device, usually the serial number

  • Interface - name of the interface

  • Remote Interface - interface name on the other end (new in Apstra version 5.1.0)

  • Remote System Label - the device name on the other end (new in Apstra version 5.1.0)

  • Count - number of state transitions

  • Value - operational state of the interface (up, down)

  • Updated - when the state was last updated

leaf fabric interface flapping

leaf fabric interface flapping (Range)

Purpose: Count the number of state changes in leaf_fab_int_status_accumulate ("up" to "down" and "down" to "up"). If the count is higher than the 'threshold' facade parameter, return "true"; otherwise, return "false".

Input Stage: leaf_fab_int_status_accumulate

Output Stage: if_status_flapping

The output is a set of statuses (one for each spine-facing interface on each leaf), indicating whether the interface has been flapping. Each set member has the following keys to identify it: system_id (ID of the leaf system, usually the serial number), interface (name of the interface).
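The counting and comparison described for this stage can be sketched as below. The function names are hypothetical and the sample data is made up; the threshold of 5 is the default mentioned earlier on this page. This illustrates the logic only and is not the Range processor itself.

```python
from typing import Sequence


def count_state_changes(states: Sequence[str]) -> int:
    """Count transitions between consecutive samples ("up" to "down" or back)."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def is_flapping(states: Sequence[str], threshold: int) -> bool:
    """Return True when the number of state changes is higher than threshold."""
    return count_state_changes(states) > threshold


# Made-up histories keyed by (system_id, interface).
histories = {
    ("SERIAL0001", "et-0/0/48"): ["up", "down", "up", "down", "up", "down", "up"],
    ("SERIAL0001", "et-0/0/49"): ["up", "up", "down", "down", "up"],
}
if_status_flapping = {
    key: is_flapping(states, threshold=5) for key, states in histories.items()
}
```

The first interface changes state six times, which is higher than 5, so it is marked as flapping; the second changes only twice.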

For this stage, the Range processor is configured to detect leaf fabric interface flapping, as shown in the screenshot below.

percentage flapping per device interfaces

percentage flapping per device interfaces (MatchPercentage)

Input Stage: if_status_flapping

Output Stage: flapping_fab_int_perc

For this stage, the Match Percentage processor is configured to calculate the percentage of flapping interfaces per device, as shown in the screenshot below.
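As a rough illustration of the per-device percentage, the sketch below groups the flapping statuses by system ID. The function name and the input shape (a dictionary keyed by system ID and interface name) are assumptions for illustration, not the Match Percentage processor's interface.

```python
from collections import defaultdict
from typing import Dict, Tuple


def flapping_percentage_per_device(
    if_status_flapping: Dict[Tuple[str, str], bool],
) -> Dict[str, float]:
    """Return, per system_id, the percentage of interfaces marked as flapping."""
    totals = defaultdict(int)
    flapping = defaultdict(int)
    for (system_id, _interface), flag in if_status_flapping.items():
        totals[system_id] += 1
        if flag:
            flapping[system_id] += 1
    return {sid: 100.0 * flapping[sid] / totals[sid] for sid in totals}


# Made-up input: leaf1 has one of four interfaces flapping, leaf2 has none.
statuses = {
    ("leaf1", "et-0/0/48"): True,
    ("leaf1", "et-0/0/49"): False,
    ("leaf1", "et-0/0/50"): False,
    ("leaf1", "et-0/0/51"): False,
    ("leaf2", "et-0/0/48"): False,
    ("leaf2", "et-0/0/49"): False,
}
flapping_fab_int_perc = flapping_percentage_per_device(statuses)
```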

system anomalous flapping

system anomalous flapping (Range)

Input Stage: flapping_fab_int_perc

Output Stage: system_flapping

The output is a set of statuses for each leaf, indicating whether the leaf has a higher than acceptable percentage of flapping interfaces. Each set member has the following key to identify it: system_id (ID of the leaf system, usually the serial number).

For this stage, the Range processor is configured to detect system anomalous flapping, as shown in the screenshot below.
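The device-level check compares each leaf's flapping percentage with the Max Flapping Interfaces Percentage parameter. The sketch below assumes a dictionary of percentages keyed by system ID; the function name and the value 10 are made up for the example and are not documented defaults.

```python
from typing import Dict


def system_flapping(
    flapping_fab_int_perc: Dict[str, float],
    max_flapping_interfaces_percentage: float,
) -> Dict[str, bool]:
    """Mark a leaf as anomalous when its percentage exceeds the configured maximum."""
    return {
        system_id: percentage > max_flapping_interfaces_percentage
        for system_id, percentage in flapping_fab_int_perc.items()
    }


# Made-up percentages and a made-up maximum of 10 percent.
result = system_flapping({"leaf1": 25.0, "leaf2": 0.0}, 10)
```

With these values, only leaf1 would have a device-level anomaly raised.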