Predictions Overview
This page describes how the Predictive Analytics feature uses machine learning models to predict future behavior and outliers, and how it notifies administrators of potential issues in the network.
Predictions Overview (Beta)
Predictive Analytics uses machine learning (ML) techniques to forecast possible outliers in resource utilization patterns or equipment failures. This feature helps the administrator to proactively address issues and prevent possible outages.
The Apstra Edge device receives data about the network devices through the configured vCenter and the flow servers. The Edge device sends this data to DC Assurance where the ML models perform the following tasks:
- Aggregate data points received from the network
- Train the ML models using these data points and learn normal behavior to create a baseline
- Forecast future data points by using trained ML models on historical data
- Identify deviations in the forecasted data by using the outlier detection model trained on historical metric data
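The four tasks above can be pictured with a small, self-contained sketch. It is only an illustration under simplifying assumptions: a least-squares trend line stands in for the forecasting model, and a z-score threshold stands in for the outlier detection model. The function names, the bucket width, and the threshold of 3.0 are hypothetical and are not taken from the product.

```python
from statistics import mean, pstdev


def aggregate(samples, bucket_seconds):
    """Average (timestamp_seconds, value) samples into fixed-width buckets."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts - ts % bucket_seconds, []).append(value)
    return [mean(values) for _, values in sorted(buckets.items())]


def learn_baseline(series):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(series), pstdev(series)


def forecast(series, horizon):
    """Stand-in forecaster: extend a least-squares trend line (needs >= 2 points)."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def find_deviations(predicted, baseline, threshold=3.0):
    """Return indexes of forecast points far outside the learned baseline."""
    mu, sigma = baseline
    if sigma == 0:
        return [i for i, v in enumerate(predicted) if v != mu]
    return [i for i, v in enumerate(predicted) if abs(v - mu) / sigma > threshold]


# Hypothetical memory-utilization history (percent) that grows steadily.
history = [50.0 + i for i in range(12)]
predicted = forecast(history, horizon=24)
flagged = find_deviations(predicted, learn_baseline(history))
```

In this toy run, `flagged` lists the future intervals in which the projected value leaves the band that the baseline considers normal, which is the kind of early warning the feature is meant to give.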
Predictive Analytics uses the following ML algorithms:
| Function | Machine Learning Algorithm | Description |
|---|---|---|
| Forecasting | Light Gradient Boosting Machine (LightGBM) | An ensemble method based on decision trees that combines multiple weak models to produce a single strong prediction. Gradient boosting builds the model sequentially, with each new model focusing on the errors made by the previous ones. |
| Outlier Detection | Isolation Forest (iForest) | A tree-based method that builds an ensemble of random trees for outlier detection. Each tree recursively selects a random variable and a random split value for that variable to create subtrees, until the data points are isolated at the tree leaves. Outliers tend to be isolated after fewer splits than normal points. |
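To make the isolation idea concrete, here is a minimal pure-Python sketch of an isolation forest. It is a teaching aid, not the product's implementation: the function names, the tree count, the sample size, and the sample data are all assumptions, and a production system would normally use a library implementation. One simplification to note: a node becomes a leaf when the randomly chosen variable is constant in that node, even if another variable still varies.

```python
import math
import random


def avg_path(n):
    """Expected path length of an unsuccessful search among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on a random variable at a random value until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # simplification: stop if the chosen variable is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Number of splits needed to reach a leaf, adjusted for leaf size."""
    if "feature" not in node:
        return depth + avg_path(node["size"])
    side = "left" if point[node["feature"]] < node["split"] else "right"
    return path_length(point, node[side], depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Score each point in (0, 1]; values closer to 1 are more outlier-like."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = avg_path(sample_size)
    return [
        2 ** (-(sum(path_length(p, t) for t in trees) / n_trees) / norm)
        for p in points
    ]


# Hypothetical (CPU %, memory %) samples: a tight cluster plus one extreme point.
samples = [(40 + i % 5, 55 + (i // 5) % 5) for i in range(50)] + [(98, 97)]
scores = anomaly_scores(samples)
```

Because the extreme point is separated from the cluster by almost any random split, it reaches a leaf after very few splits and receives the highest score in `scores`.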
The Predictive Analytics feature identifies outliers in system health metrics such as CPU and memory utilization.
Predictive Analytics uses the data that it receives from the network to predict network behavior. After you enable the feature, it requires a certain number of data points to train its models on expected behavior and to recognize outliers. During the initial period after you enable the feature, the predictions about network behavior may not be fully accurate, because the data received so far is insufficient for the ML algorithms to make meaningful predictions. The accuracy of predictions improves over time as more data becomes available for analysis.
View Predicted Impacts
To view the predictive outliers in the network, navigate to Assurance > Predictions. Use the site drop-down to select a specific site.
The Predictions tab displays a list of outliers, the affected device, the severity level of the outlier, and the predicted time when the event might occur. It also displays the number of impacted clients and services in the selected site.
The Predictive Analytics feature forecasts outliers that might occur in a 24-hour period.
You can click the clients and services button to view and search the full list of impacted clients.
You can also use the Predictive Search option on the top right of the page to search for a specific service and view its historical, current, and predicted metrics.
If no predictive outliers are displayed, it could be for one of the following reasons:
- The network devices are behaving as expected.
- The Predictive Analytics feature is newly configured and not enough data is available yet to make reliable predictions.
- The stream receivers are not sending data from the Edge device to DC Assurance. In this case, run the pre-flight check for the Edge device and verify that the stream receivers are configured correctly. See Pre-Flight Checks for more information.
If the list of impacted clients and services for a predicted outlier is not displayed, it could be for one of the following reasons:
- There is no traffic flowing through the affected devices in the network.
- DC Assurance is not receiving traffic information from the Edge device. In this case, run the pre-flight check for the Edge device and verify that the flow servers are configured correctly. See Pre-Flight Checks for more information.
When you select an outlier on the Predictions tab and click View topology, the network topology is displayed showing the device and the services impacted by the predicted outlier.
View Topology and Device Data
The network topology displays the impacted services and clients for the selected outlier along with the traffic flow between the services and the network devices as shown in Figure 3.
When you select a device from the topology, the right pane displays graphs of the historical, current, and predicted CPU and memory usage data for that device. Click the Now button to scroll to the current metrics for the device on the plotted graph. You can also select the 15 min or the 1 hr option to display data aggregated in 15-minute or 1-hour intervals, respectively.
If an outlier is detected in the predicted data points, it is highlighted in purple on the graph. Mouse over the outlier to view details such as the type of outlier, the severity, the predicted start and end time of the outlier, and so on.
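As a rough illustration of how a predicted start and end time can be derived from individual predicted data points, the sketch below merges consecutive flagged points into time windows. This is an assumption about one reasonable approach, not a description of how the product computes these values. The function name `outlier_windows` and the sample data are hypothetical, and the points are assumed to be evenly spaced and sorted by time.

```python
from datetime import datetime, timedelta


def outlier_windows(points, interval):
    """Merge consecutive flagged points into (start, end) windows.

    points:   list of (timestamp, is_outlier) tuples, sorted and evenly spaced.
    interval: spacing between points, as a timedelta.
    """
    windows = []
    start = None
    for ts, flagged in points:
        if flagged and start is None:
            start = ts  # first flagged point opens a window
        elif not flagged and start is not None:
            windows.append((start, ts))  # first normal point closes it
            start = None
    if start is not None:
        # The series ended while still flagged: close after the last interval.
        windows.append((start, points[-1][0] + interval))
    return windows


# Hypothetical predicted points at 15-minute spacing.
step = timedelta(minutes=15)
t0 = datetime(2024, 1, 1, 0, 0)
flags = [False, True, True, False, False, True]
predicted_points = [(t0 + i * step, flag) for i, flag in enumerate(flags)]
windows = outlier_windows(predicted_points, step)
```

Each tuple in `windows` corresponds to one contiguous run of flagged points, which maps naturally onto a single highlighted region on a graph.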
Benefits of Predictions
The Predictive Analytics feature provides the following benefits:
- Provides an early warning to administrators about possible outliers.
- Helps understand the impact of predicted outliers.
- Enables administrators to proactively prevent potential outages.