Wired SLEs

Use the wired service-level expectation (SLE) dashboard to assess the service levels for user-impacting factors such as throughput, connectivity, and switch health.

Overview

Juniper Mist™ cloud continuously collects network telemetry data and uses machine learning (ML) to analyze the end-user experience. This service efficiently collects and analyzes data across your entire network, whether you have hundreds or thousands of ports.

You can access this information through the Juniper Mist wired service-level expectation (SLE) dashboards, which help you assess the network's user experience and resolve any issues proactively. It's not merely a matter of devices or links being up or down—it's the quality of the client experience.

For the wired network, the two burning questions are:

  • Are clients able to connect?

  • Are clients able to pass traffic after connecting?

The wired SLE dashboards show the user experience of the wired clients on your network at any given point in time. You can use these interactive dashboards to measure and manage your network proactively by identifying user pain points before they grow into larger issues.

Finding the Wired SLEs Dashboard

To find the Wired SLEs dashboard, select Monitor > Service Levels from the left menu, and then click the Wired button.

Wired Button on the Monitor Page

Note:

The buttons appear only if you have the required subscriptions. For information about these requirements, see the Juniper Mist AI-Native Operations Guide.

Root Cause Analysis for the Wired Successful Connect SLE

After you click a classifier in the SLE block, you'll see the Root Cause Analysis page. Click classifiers and sub-classifiers to view timeline and scope information in the lower half of the screen.

Note:

The information in the lower half of the screen depends on what you've selected at the top.

Useful tabs in the lower half of the screen are:

  • Timeline—See exactly when the issues occurred.

  • Distribution—See which VLANs were affected.

  • Affected Items—See which interfaces and clients were affected and how much each one contributed to the overall impact. Also see the individual failure rate for each interface or client.

Let's look at an example for the Successful Connect SLE. By clicking options at the top of the page, you can drill down from the SLE to classifiers and sub-classifiers. The lower half of the page shows information relevant to these selections.

By selecting the Affected Items tab and then clicking the Interfaces option on the left, we see the interfaces that were unable to connect due to incorrect credentials.

Affected Items - Interfaces

By clicking the Clients tab on the left, we now see the affected clients.

Affected Items - Clients
Tip:
  • Overall Impact is the percentage that a client or interface contributed to all issues for the selected sub-classifier. For example, it can show if a client accounts for 20 percent or 90 percent of the issues.

  • Failure Rate is the impact of this issue on this interface or client. For example, it can show if an interface was unsuccessful on 20 percent or 90 percent of connection attempts.

  • To see more details, click the hyperlinks in the table to go to the Insights page, where you can see all client and switch events.
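
The difference between Overall Impact and Failure Rate can be illustrated with a small calculation. The following Python sketch is not how Juniper Mist computes these values internally; it only applies the definitions given in the tip above to made-up numbers. The `ItemStats` type, the interface names, and the attempt and failure counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    """Hypothetical per-interface (or per-client) counters for one sub-classifier."""
    name: str
    attempts: int   # total connection attempts in the selected time range
    failures: int   # attempts that failed for the selected sub-classifier


def overall_impact(item: ItemStats, all_items: list[ItemStats]) -> float:
    """Share of all failures (in percent) that this item contributed."""
    total_failures = sum(i.failures for i in all_items)
    return 100.0 * item.failures / total_failures if total_failures else 0.0


def failure_rate(item: ItemStats) -> float:
    """Share of this item's own attempts (in percent) that failed."""
    return 100.0 * item.failures / item.attempts if item.attempts else 0.0


# Made-up example data: one busy interface and one rarely used interface.
items = [
    ItemStats("ge-0/0/1", attempts=100, failures=20),
    ItemStats("ge-0/0/2", attempts=5, failures=5),
]

for item in items:
    print(
        f"{item.name}: overall impact {overall_impact(item, items):.0f}%, "
        f"failure rate {failure_rate(item):.0f}%"
    )
```

In this made-up data set, the busy interface contributes most of the failures (high Overall Impact) even though most of its attempts succeed, while the rarely used interface fails every time (high Failure Rate) but contributes little to the total. Sorting by each column answers a different question: where most of the pain comes from, versus which item is most broken.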

Wired Assurance: Day 2 - Wired SLEs Video Overview

One of the coolest features of Wired Assurance is the Service Level Expectations, or SLEs. SLEs were first introduced with Wi-Fi Assurance to help you understand the client experience. Now the SLE framework has been extended to Juniper EX switches.

You can see what the performance and experience for wired devices is, categorized into throughput, successful connect, and switch health. In the throughput SLE, there are classifiers such as congestion, interface anomalies, and storm control. This is where you can drill down to get an accurate sense of what is going on in the network.

The distribution table breaks it down by clients, VLANs, interfaces, and switches. You can also sort by failure rate or biggest overall impact. Double-click into affected items by switches, VLANs, interfaces, and clients.

Going over to the switch health SLE, we immediately see there are CPU issues. The EX4300 shows the overall impact at 74%. The screen shows CPU utilization spiking over 100%, mapped to a time graph to help you narrow in on the issue.

Wired SLEs measure wired experience with pre- and post-connection performance metrics to help you understand how the network experience is for your users, wired devices, and IoT endpoints.

Wired SLE Blocks

As shown in the following example, each SLE block provides valuable information.

  • At the left, you see that this SLE has an 89 percent success rate.

  • At the center, the timeline shows variations across the time period. You can hover your mouse pointer over any point to see the exact time and SLE outcome.

  • At the right, the classifiers show what percentage of issues were attributable to each root cause. In this example, 100 percent of the issues were attributed to Network.

Switch Health dashboard showing 89 percent overall success rate in green. Line graph highlights a 90 percent success rate over time. Metrics breakdown: Switch Unreachable 0 percent, Capacity 0 percent, Network 100 percent, System 0 percent.

If you click a classifier, you'll see more information on the Root Cause Analysis page. Most classifiers have sub-classifiers for greater insight into the exact causes of issues.

The following table provides more information about the wired SLEs and classifiers.

Table 1: Wired SLE Descriptions
SLE | SLE Description | Classifier | Classifier Description
Successful Connect

Juniper Mist monitors client connection attempts and identifies failures. The source of data is 802.1X events on the switch. This SLE helps you to assess the impact of these failures and to identify the root causes to address.

This SLE is available if you use 802.1X on the wired network to authenticate clients or if you have DHCP snooping configured.

You cannot set the threshold for this SLE. It's assumed that you want 100 percent successful connects and consider any unsuccessful connect as a critical issue to track.

DHCP

Client connections that fail to reach the bound state within a minute.

This classifier is available only when DHCP snooping is enabled in the port profile.

DHCP snooping might not always work well with endpoints that have static IPs.

Authentication

Events when a client failed to authenticate.

Sub-Classifiers:

  • RADIUS Server Reject VLAN—Couldn't authenticate to the specified VLAN.

  • Wrong Credentials—The credentials weren't valid.

  • RADIUS Server Unreachable—The RADIUS server was down.

Access Port Security

Client connection failures caused by access port security issues.

Based on the security features configured in your port profiles, this classifier is triggered as security events occur.

Sub-Classifiers:

  • BPDU-Guard—Detects connection failures because of the BPDU guard configuration on the switch port. This feature is important to prevent looping, as when a switch is connected to a switch. To enable this feature, go to the port profile, and enable STP Edge.
  • MAC Limit—Detects connection failures reported when a client exceeds the MAC limit configured on the switch port. For example, you might configure your port profile with a MAC limit of 2 if you have an outdoor security camera or public address system and want to prevent other devices from connecting to that port. If someone unplugs your camera and attempts to connect their own device, the MAC limit would be reached, and this event would be reflected by the MAC limit classifier.
  • Dynamic ARP Inspection—Identifies client connection failures when a port drops invalid Dynamic ARP Inspection packets. This security feature prevents people from snooping for someone else's ARP address to gain access. Requires enabling ARP Inspection in the DHCP Snooping section of the port profile.
  • Rogue DHCP Server—Identifies client connection failures caused by a rogue DHCP server event. This could be an event where an untrusted port drops traffic from DHCP servers to block unauthorized servers. Enabling this feature can prevent rogue devices from connecting. This classifier shows any such attempts that occur. Requires enabling DHCP snooping in the port profile.
Throughput

This SLE represents the ability of wired users to pass traffic without impedance.

You cannot set the threshold for this SLE. It's assumed that you want 100 percent of traffic to pass without impedance and consider any impedance as a critical issue to track.

Storm Control

Events when storm control level was exceeded and packets were dropped.

Available only if you've enabled Storm Control in the port profile (recommended).

Interface Anomalies

Events when devices were powered up but could not pass traffic.

Sub-Classifiers:

  • Cable Issues—This sub-classifier shows the user minutes affected by faulty cables in the network. Cable issues can cause a high failure rate on an interface or client device.

  • Negotiation Failed—This sub-classifier identifies bad user minutes caused by issues such as incomplete negotiation, duplex conflict, and latency.

  • MTU Mismatch—This sub-classifier identifies issues where MTU size is mismatched somewhere along the packet's path (any MTU mismatch along the path will result in discarded or fragmented packets). The information for this SLE comes from the switch; each input error or MTU error contributes to a bad user minute under this sub-classifier.

Switch Bandwidth

Juniper Mist™ measures the available bandwidth on your network based on the queued packets and dropped packets for each configured queue.

A pattern of low success rates can indicate a need for more wired bandwidth.

You can click the Settings button to set the percentage to use as the success threshold for this SLE. This percentage represents the total_DroppedPackets as a portion of total_QueuedPackets.

Congestion

Heavy congestion causing dropped packets (TxDrops) when the input queue (buffer) fills up. Triggered by considering these ratios:

  • TxDrops to TxPackets—Total transmitted bytes dropped to total packets transmitted.

  • Txbps to Link speed—Total bytes transmitted per second to link speed.

  • RxSpeed to Link speed—Total bytes received per second to link speed.

Congestion Uplink

High congestion on uplinks with these uplink port characteristics:

  • Has a switch or a router as an LLDP neighbor

  • Is a Spanning Tree Protocol (STP) root port

  • Has a higher number of transmitted and received packets compared to the other ports

  • Experiencing congestion due to aggregated Ethernet links and module ports

Bandwidth Headroom

High bandwidth usage.

Switch Health

Juniper Mist™ monitors your switches' operating temperature, power consumption, CPU, and memory usage. Monitoring switch health is crucial because issues such as high CPU usage can directly impact connected clients. For example, if CPU utilization spikes to 100 percent, the connected APs might lose connectivity, affecting the clients' experience.

Switch Unreachable

Poor switch-to-cloud connectivity. The switch might be down, or the connection might be severed.

Capacity

Usage exceeding 80 percent. High usage can indicate that the switch is dealing with more requests than it can optimally handle.

Sub-Classifiers indicate usages exceeding 90 percent of the relevant table capacity:

  • ARP Table
  • Route Table
  • MAC Table
Network

Lower than expected throughput due to uplink capacity limitations.

Based on the round-trip time (RTT) value of packets sent from the switch to the Mist cloud.

Sub-Classifiers:

  • WAN Latency—Based on the average value of RTT over a period of time.

  • WAN Jitter—Calculated by comparing the standard deviation of RTT within a small period with the overall deviation of RTT over a longer period.

System

Issues on the switch that can impact user experiences.

Sub-Classifiers:

  • CPU—Utilization above 90 percent

  • Memory—Utilization above 80 percent

  • Temp—Temperature above or below the specified operating range

  • Power—Consumption above 90 percent of the available power
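
To make the System sub-classifier thresholds above concrete, here is a minimal Python sketch that checks one switch sample against them. It is an illustration only, not the Juniper Mist implementation. The `SwitchSample` type, the function name, and the sample values are hypothetical, and the operating temperature range must be supplied by the caller because it depends on the switch model.

```python
from dataclasses import dataclass


@dataclass
class SwitchSample:
    """Hypothetical snapshot of switch health metrics."""
    cpu_pct: float      # CPU utilization, percent
    memory_pct: float   # memory utilization, percent
    temp_c: float       # temperature, degrees Celsius
    power_pct: float    # power consumption as a percent of available power


def system_sub_classifiers(
    sample: SwitchSample, temp_min_c: float, temp_max_c: float
) -> list[str]:
    """Return the System sub-classifiers that this sample would fall under.

    Thresholds follow the list above: CPU above 90 percent, memory above
    80 percent, temperature outside the operating range, and power above
    90 percent of the available power.
    """
    hits = []
    if sample.cpu_pct > 90:
        hits.append("CPU")
    if sample.memory_pct > 80:
        hits.append("Memory")
    if not (temp_min_c <= sample.temp_c <= temp_max_c):
        hits.append("Temp")
    if sample.power_pct > 90:
        hits.append("Power")
    return hits


# Made-up sample and an assumed operating range of 0 to 45 degrees Celsius.
sample = SwitchSample(cpu_pct=95.0, memory_pct=70.0, temp_c=50.0, power_pct=50.0)
print(system_sub_classifiers(sample, temp_min_c=0.0, temp_max_c=45.0))
```

A sample can match several sub-classifiers at once, which is why the function returns a list rather than a single label.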