Service Level Expectations Overview

Service level expectations (SLEs) define a benchmark for the performance of a network. An SLE consists of a set of attributes, known as classifiers, that provide information about the performance of a network. SLEs help network administrators understand whether the network is performing optimally and enable them to respond proactively to network events or performance issues.

Each classifier is assigned a value, and the SLE score of an organization, a site, or a device is the sum of the values of all the classifiers defined for that SLE. Shown as a value or as a percentage, the SLE score reflects the duration, in minutes, for which the SLE met or failed to meet the service levels. The SLE score is an indicator of how the network is performing. Administrators can analyze the SLE scores and look into events or problems that are impacting the end-user experience.
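
As a rough illustration of the arithmetic, the following sketch treats a success rate as the share of minutes in which a service level was met. The function name and the per-minute boolean samples are assumptions made for this example. They are not part of any Apstra Cloud Services API.

```python
def success_rate(minutes_met: list[bool]) -> float:
    """Return the percentage of sampled minutes in which the service level was met."""
    if not minutes_met:
        raise ValueError("at least one sample is required")
    return 100.0 * sum(minutes_met) / len(minutes_met)


# Hypothetical one-hour window: 57 minutes met the service level, 3 did not.
samples = [True] * 57 + [False] * 3
rate = success_rate(samples)
failed_minutes = len(samples) - sum(samples)
print(f"success rate: {rate:.1f}%, failed minutes: {failed_minutes}")
```

In this made-up window, the three failed minutes are what an administrator would investigate further.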

You can view SLE metrics at the organization level, the site level, or the device level.

To view information about SLE, click Monitor > Service Levels.

Apstra Cloud Services currently supports two SLEs: System Health SLE and Link Health SLE. These SLEs measure the time during which the SLE classifiers meet or do not meet the defined threshold, which corresponds to an improved or a degraded user experience. Whenever there is an impact on user experience, administrators can use these metrics to proactively intervene and resolve issues. As an administrator, you can define the threshold for each classifier, which makes it easy to measure the performance of the network. By observing the metrics provided by each classifier, you can determine whether the network is performing optimally.
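
One way to picture a threshold check is to flag each per-minute sample of a metric as meeting or not meeting the threshold. The sketch below is illustrative only: the CPU utilization samples, the 80 percent threshold, and the "at or below the threshold" rule are all assumptions, not documented product behavior.

```python
def minutes_meeting_threshold(samples: list[float], threshold: float) -> list[bool]:
    """Flag each per-minute sample as meeting the threshold (at or below it)."""
    return [value <= threshold for value in samples]


# Hypothetical per-minute CPU utilization, in percent.
cpu_percent = [35.0, 42.5, 91.0, 88.0, 40.0]
met = minutes_meeting_threshold(cpu_percent, threshold=80.0)
print(met)                      # one flag per minute
print(met.count(False), "minute(s) above the threshold")
```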

Figure 1: Service Levels Page. Displays the SLE score.

For each SLE, you can view the following:

  • SLE Score—SLE score for the organization, site, or the device during the selected period.

    • Click Success Rate to view the SLE score as a percentage, which is the percentage of time during which the user experience met the defined service level threshold. A success rate of less than 100 percent indicates failures in a site or a device. For example, a success rate of 99 percent indicates that the metric met the SLE goal 99 percent of the time and failed to meet the threshold 1 percent of the time.

    • Click Value to view the value of the metric.

    Note: The severity of an SLE metric can be in the range of 0.0 to 100.0, with 0.0 being the least severe and 100.0 being the most severe.
  • Timeline—The timeline graph indicates how the network performed during the selected time range. Select a classifier or a sub-classifier to view the performance of that classifier, plotted on a graph. You can click and drag to select a specific time range on the graph or select a different time range from the drop-down to view more detailed information about the SLE. Mouse over the graph to view information about the overall service level, time, SLE score for the classifiers, and so on, displayed in a pop-up.

  • Classifiers—Classifiers enable network administrators to perform a root cause analysis of the unsuccessful user experiences. A lower score for any of the classifiers can alert the administrators to address potential issues in the network.
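
To make the root cause idea concrete, the sketch below computes each classifier's share of the total failed minutes and picks the largest contributor. The failed-minute counts are invented for illustration, the classifier names are borrowed from the System Health SLE, and the function is a hypothetical helper rather than a product API.

```python
def classifier_contributions(failed_minutes: dict[str, int]) -> dict[str, float]:
    """Return each classifier's share of the total failed minutes, as a percentage."""
    total = sum(failed_minutes.values())
    if total == 0:
        return {name: 0.0 for name in failed_minutes}
    return {name: 100.0 * minutes / total for name, minutes in failed_minutes.items()}


# Hypothetical failed minutes per classifier during the selected period.
failed = {"Device Traffic": 2, "Config Deviation": 0, "Environment": 6, "Resources": 12}
shares = classifier_contributions(failed)
worst = max(shares, key=shares.get)
print(shares)
print("largest contributor:", worst)
```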

The following sections describe SLEs and the classifiers that contribute to the SLE score for each SLE:

System Health SLE

To access System Health SLE, click Monitor > Service Levels > System Health.

Figure 2: Classifiers and Sub-classifiers. Displays the classifiers, sub-classifiers, and their success rate score.

System Health SLE provides information about how the device and its components are performing. Factors that impact the system health SLE include configuration changes, traffic passing through the device, performance of the device components such as the fans and power modules, and the system resource utilization.

  • Device Traffic—Indicates that the device traffic affected the SLE score, leading to an impact on performance.

  • Config Deviation—Change in system configuration affecting the SLE score.

  • Environment—Problems with the system components affecting the SLE score. Has the following sub-classifiers:

    • Fan—Problems with the device's fan affecting the SLE score.

    • Power—Issues with the power module in the device affecting the SLE score.

    • Temp—Problems with system temperature affecting the SLE score.

  • Resources—System resources that affect the SLE score. Has the following sub-classifiers:

    • CPU—Problems with CPU utilization affecting the SLE score.

    • Disk—Problems with disk utilization impacting the SLE score.

    • Memory—Problems with memory utilization affecting the SLE score.
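
The classifier hierarchy listed above can be modeled as a small tree. The following sketch uses a plain dictionary and a hypothetical `flatten` helper to enumerate every classifier or sub-classifier path. The data structure is an assumption made for illustration and does not reflect how the product stores classifiers.

```python
# System Health classifiers mapped to their sub-classifiers (empty list if none).
SYSTEM_HEALTH = {
    "Device Traffic": [],
    "Config Deviation": [],
    "Environment": ["Fan", "Power", "Temp"],
    "Resources": ["CPU", "Disk", "Memory"],
}


def flatten(tree: dict[str, list[str]]):
    """Yield 'Classifier' or 'Classifier / Sub-classifier' for each leaf."""
    for classifier, subs in tree.items():
        if not subs:
            yield classifier
        else:
            for sub in subs:
                yield f"{classifier} / {sub}"


for path in flatten(SYSTEM_HEALTH):
    print(path)
```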

Link Health SLE

To access Link Health SLE, click Monitor > Service Levels > Link Health. Link Health SLE provides information about the interface states or connectivity issues that lead to a lower SLE score.

  • Down Interfaces—An interface that is down impacts the SLE score.

  • Bad Optics—Optics issues impact the SLE score.

  • Hot Cold Interfaces—Hot cold interfaces impact the SLE score. Has the following sub-classifiers:

    • Fabric Interfaces—Issues in fabric interfaces impact the SLE score.

    • Specific Interfaces—Issues in specific interfaces impact the SLE score.

  • Interface Flapping—Frequent changes in the state of an interface lower the SLE score. Has the following sub-classifiers:

    • Fabric Interfaces—Issues in fabric interfaces impact the SLE score.

    • Specific Interfaces—Issues in specific interfaces impact the SLE score.
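
Interface flapping is described above as a frequent change of interface state. A minimal sketch of that idea counts state transitions in a chronological history. The state strings, the sample history, and the threshold of three transitions are assumptions for this example and are not values used by Apstra.

```python
def count_flaps(states: list[str]) -> int:
    """Count state transitions in a chronological list of interface states."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


# Hypothetical state history for one interface, oldest sample first.
history = ["up", "down", "up", "up", "down", "up"]
flaps = count_flaps(history)
FLAP_THRESHOLD = 3  # hypothetical: transitions per window that count as flapping
is_flapping = flaps >= FLAP_THRESHOLD
print(flaps, is_flapping)
```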

Analyze SLE Scores

You can analyze the SLE score from the Root Cause Analysis page. The Root Cause Analysis page provides visualizations for distribution, timeline, and statistics for service level failures and enables administrators to understand the impact of these issues. To view the Root Cause Analysis page, click Monitor > Service Levels. Then click an SLE or a classifier for more detailed information.

Figure 3: Link Health SLE Classifiers in the Link Health SLE

Figure 3 shows how an issue with the Hot Cold Interfaces classifier lowers the Link Health SLE score.

  • Statistics—The Statistics tab displays the success rate of the SLE metric. Administrators can also view the distribution graph to understand the severity of the SLE with its impact duration.

  • Timeline—The Timeline tab provides a graph that plots the SLE trend during the selected time range. Select a classifier or a sub-classifier to view the timeline graph. Mouse over the graph to view more detailed information about the SLE score.

  • Distribution—Provides information about how a classifier impacts other devices in the data center.

  • Affected Items—Displays the list of devices or services that failed to meet the defined service levels and also the impact of a specific device or service.
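
The Affected Items view can be thought of as a filtered, sorted list. The sketch below keeps the devices whose success rate falls below a goal and lists the worst first. The device names, the success rates, and the default goal of 100 percent are invented for illustration.

```python
def affected_items(success_by_device: dict[str, float], goal: float = 100.0) -> list[tuple[str, float]]:
    """Return (device, success rate) pairs below the goal, worst first."""
    below = [(device, rate) for device, rate in success_by_device.items() if rate < goal]
    return sorted(below, key=lambda item: item[1])


# Hypothetical success rates, in percent, for three devices.
rates = {"leaf-1": 99.2, "leaf-2": 100.0, "spine-1": 94.5}
for device, rate in affected_items(rates):
    print(f"{device}: {rate}%")
```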

Probes to be Enabled in Apstra to View SLE Information

Table 1 lists the predefined probes to be enabled in Juniper Apstra. For more information, see Predefined Probes.

Table 1: Probes to be Enabled in Apstra

SLE: System Health
Probes:
  • Device System Health Probe

  • Drain Traffic Anomaly Probe

  • Device Environmental Checks

SLE: Link Health
Probes:
  • Optical Transceivers

  • Interface Flapping (Fabric Interfaces)

  • Interface Flapping (Specific Interfaces)

  • Hot/Cold Interface Counters (Fabric Interfaces)

  • Hot/Cold Interface Counters (Specific Interfaces)
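
Table 1 can be expressed as a lookup from each SLE to its required probes, which makes it easy to spot probes that have not been enabled yet. In the sketch below, the `enabled` set is hard-coded and hypothetical. How the list of enabled probes is obtained from Apstra is outside the scope of this example.

```python
# Probes from Table 1, keyed by SLE.
REQUIRED_PROBES = {
    "System Health": [
        "Device System Health Probe",
        "Drain Traffic Anomaly Probe",
        "Device Environmental Checks",
    ],
    "Link Health": [
        "Optical Transceivers",
        "Interface Flapping (Fabric Interfaces)",
        "Interface Flapping (Specific Interfaces)",
        "Hot/Cold Interface Counters (Fabric Interfaces)",
        "Hot/Cold Interface Counters (Specific Interfaces)",
    ],
}


def missing_probes(enabled: set[str]) -> dict[str, list[str]]:
    """Map each SLE to the required probes that are not in the enabled set."""
    return {
        sle: [probe for probe in probes if probe not in enabled]
        for sle, probes in REQUIRED_PROBES.items()
    }


# Hypothetical set of probes that are already enabled.
enabled = {"Device System Health Probe", "Optical Transceivers"}
for sle, probes in missing_probes(enabled).items():
    print(sle, "is missing:", probes)
```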