Stateless Flow Latency Monitoring
This document describes what stateless flow latency monitoring is and how you can configure it for your system.
Stateless Flow Latency Monitoring Overview
Stateless flow latency monitoring enables you to troubleshoot network flows that have unusually high travel time between their input and output interfaces on your switch.
Stateless flow latency monitoring monitors packet residence time—the time a packet takes to travel from the input interface to the output interface on the switch. You configure a residence-time range comprising values that significantly exceed expected latency values. The software mirrors (copies) packets whose latency falls within your configured residence-time range to an external collector for analysis.
Stateless monitoring mirrors all packets that fall within the configured latency range to the collector.
Latency monitoring is supported for broadcast, unknown unicast, and multicast (BUM) traffic.
Use Feature Explorer to confirm platform and release support for specific features.
Review Platform-Specific Stateless Flow Latency Monitoring Behavior for notes related to your platform.
Benefits of Stateless Flow Latency Monitoring
- Helps you monitor packets that cross a desired latency threshold.
- AI-ML data center administrators might find this feature particularly useful for monitoring RDMA over Converged Ethernet version 2 (RoCEv2) flows.
How Does Stateless Flow Latency Monitoring Work?
The packet forwarding pipeline on the switch has three main stages:
- Ingress pipeline
- MMU
- Egress pipeline
Packet latency occurs at all three stages, but the latency from the ingress and egress pipelines is typically (1) minimal and (2) consistent. Any significant switch latency usually occurs at the MMU stage, primarily due to packet buffering and scheduling delays.
Residence time is measured as Tx timestamp - Rx timestamp. The Rx timestamp is inserted at the beginning of packet ingress. The Tx timestamp is inserted at the end of the MMU pipeline.
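The subtraction above can be illustrated with a minimal Python sketch. The function name and the use of plain integer nanosecond timestamps are assumptions made for illustration; the switch performs this measurement in hardware, and details such as timestamp width or wraparound are not covered here.

```python
def residence_time_ns(rx_timestamp_ns: int, tx_timestamp_ns: int) -> int:
    """Return the residence time (Tx timestamp - Rx timestamp) in nanoseconds.

    Hypothetical helper: assumes both timestamps come from the same clock
    and that the counter has not wrapped between Rx and Tx.
    """
    if tx_timestamp_ns < rx_timestamp_ns:
        raise ValueError("Tx timestamp precedes Rx timestamp")
    return tx_timestamp_ns - rx_timestamp_ns


# A packet stamped at ingress and again at the end of the MMU stage.
print(residence_time_ns(1_000_000, 1_004_200))  # 4200
```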
The stateless flow latency monitoring feature operates as follows:
It detects a packet flow that has a residence-time value in the range you configured.
The packet flow is emitted from the output interface.
The packets in this flow can be L2 or L3 packets.
You can control the exported data rate by configuring an optional sample rate. See Configure Stateless Flow Latency Monitoring.
It mirrors (copies) packets from that flow, encapsulates the packets in IPFIX format, and sends the mirrored packets to the collector address that you've assigned.
Mirroring occurs through the port-mirroring and firewall-filter parts of the configuration. For this setup, the match condition is residence-time and the action is port-mirror-instance. You can also assign an optional count action; see details in Configure Stateless Flow Latency Monitoring.
Residence time (in nanoseconds) and the egress queue number are included with the mirrored packet as part of IPFIX; these values can also help with the flow analysis.
Original packet headers are included in the IPFIX export. An original packet is truncated to the size of a single cell (208 bytes) and exported. The export includes the L2, L3, and L4 headers and, in the case of RoCEv2, the IB BTH. If the packet size is less than 208 bytes, the full packet is exported.
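The match-and-export behavior described above can be sketched in Python as follows. This is only an illustration: the function names and the dictionary layout are hypothetical, the range check is assumed to be inclusive at both ends, and the result is not a real IPFIX record encoding.

```python
CELL_SIZE_BYTES = 208  # single-cell size stated in the text above


def in_residence_range(residence_ns: int, minimum_ns: int, maximum_ns: int) -> bool:
    """Return True if the residence time falls in the configured range.

    Assumption: both ends of the range are inclusive.
    """
    return minimum_ns <= residence_ns <= maximum_ns


def build_export(packet: bytes, residence_ns: int, egress_queue: int) -> dict:
    """Model the data carried with a mirrored packet (not actual IPFIX encoding)."""
    return {
        "residence_time_ns": residence_ns,
        "egress_queue": egress_queue,
        # Packets longer than one cell are truncated; shorter ones go out whole.
        "packet": packet[:CELL_SIZE_BYTES],
    }


packet = bytes(1500)  # placeholder payload of 1500 zero bytes
if in_residence_range(4200, 1500, 10000):
    record = build_export(packet, residence_ns=4200, egress_queue=3)
    print(len(record["packet"]))  # 208
```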
Caveats and Limitations of the Stateless Flow Latency Monitoring Feature
Caveats
- If the starting value of the latency range you configure is less than the normal pipeline latency, all packets are mirrored to the collector. See Platform-Specific Stateless Flow Latency Monitoring Behavior for our recommendation of how to configure range values to capture just the packets you need to capture.
- If you enable this feature on multiple egress ports, when those ports are congested, an egress mirror port might get congested and tail drops could happen.
- If packets are dropped due to congestion at the MMU, those flows or packets are not eligible for latency monitoring.
- Hardware resources used for this feature such as TCAM, mirror MTP, and loopback profile are shared resources and are used in an FCFS manner. If the required hardware resources are unavailable, this feature doesn't work.
- A filter with a residence-time match can interface-bind only in the output direction. Also, the bindpoint cannot be under a VLAN.
- The collector device must have an IPv4 address.
- As egress filters (output direction) are not supported over VxLAN ports, flow latency monitoring is not supported on those ports.
- For packets that fall in the configured latency range, only port-mirror-instance and count actions are supported in the firewall filter configuration. The software doesn't do a commit check on those filter actions.
- On the collector port, queue transmit statistics for flow latency monitoring IPFIX packets do not behave as expected. Although these packets are sent through a multicast queue, they are counted under unicast queue 0 (best-effort). This discrepancy is due to a hardware limitation. However, drop counters are correctly incremented on the corresponding multicast queues.
- The maximum bandwidth for flow latency monitoring IPFIX traffic is capped at 100 Gbps per switch. This figure includes the additional internal loopback header overhead introduced between the first and second pass. The limitation arises because the EPRC bandwidth is set to 100 Gbps in the default profile.
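To get a rough feel for the 100 Gbps cap, the following Python sketch estimates export bandwidth from a mirrored packet rate. It is a back-of-the-envelope aid only: the function name is hypothetical, and the per-packet overhead (loopback header plus export encapsulation) is not given in this document, so it is a caller-supplied assumption.

```python
CELL_SIZE_BYTES = 208             # truncation size stated in this document
EXPORT_CAP_BPS = 100_000_000_000  # 100 Gbps cap per switch


def estimated_export_bps(mirrored_pps: int, packet_size_bytes: int,
                         overhead_bytes: int) -> int:
    """Estimate export bandwidth in bits per second.

    overhead_bytes is an assumed per-packet overhead; the real value
    depends on the platform and is not specified here.
    """
    exported_bytes = min(packet_size_bytes, CELL_SIZE_BYTES) + overhead_bytes
    return mirrored_pps * exported_bytes * 8


# 10 million mirrored packets per second, 1500-byte originals,
# and a purely illustrative 64 bytes of overhead per exported packet.
bps = estimated_export_bps(10_000_000, 1500, 64)
print(bps, bps > EXPORT_CAP_BPS)  # 21760000000 False
```

If an estimate like this approaches the cap, a narrower residence-time range or a sample rate would reduce the exported volume.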
How to View Your Monitoring Configuration and Outputs
Use the following show commands to view your monitoring configuration and to see information related to latency monitoring.
- show configuration | display set | match filter
Platform-Specific Stateless Flow Latency Monitoring Behavior
Use the following table to review platform-specific behaviors for your platforms.
| Platform | Difference |
|---|---|
| QFX Series | |
Configure Stateless Flow Latency Monitoring
Use the information in this task to set up your monitoring configuration.
To configure this feature, you configure the following elements, each shown with its CLI hierarchy:
| Element | CLI Hierarchy |
|---|---|
| mirror profile | |
| interface | |
| port-mirror instance | |
| firewall filter | |
Configure the mirror profile:
Configure the IPv4 address of the collector (the device that you have packets mirrored to for analysis). Configure the mode of latency monitoring as stateless.
user@switch# set forwarding-options mirror-profile profile-name collector ip-address ip-address
user@switch# set forwarding-options mirror-profile profile-name latency-monitoring stateless
Important: The collector IP address must be an IPv4 address.
Note: Ensure that the collector devices that you include in the configuration are capable of decoding the flow latency monitoring IPFIX export records based on the given IPFIX record format.
- The collector IP address can be any reachable IP address. Ensure that ARP has been resolved already for the collector IP address.
- (Optional) You can configure one, two, or three of these values:
  - The Layer 4 UDP port number on the collector device
  - Observation domain ID, a number that helps the collector identify an individual switch in a multiswitch topology
  - Sampling rate, a ratio of matched packets to mirrored packets; allows you to reduce the flow/packet export rate to the collector
user@switch# set forwarding-options mirror-profile profile-name collector l4-port number
user@switch# set forwarding-options mirror-profile profile-name observation-domain-id id-number
user@switch# set forwarding-options mirror-profile profile-name sample-rate rate
Note: When stateless flow latency monitoring is enabled on multiple switches in a topology, we recommend that you configure a unique observation domain ID on each switch. That ID can help the collector identify the switch that generated the latency monitoring record.
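As a rough illustration of how a sample rate reduces export volume, the Python sketch below assumes that a sample rate of N means one mirrored packet for every N matched packets. That interpretation, and the function name, are assumptions; this document does not describe how the hardware selects which packets to mirror.

```python
def expected_mirrored(matched_packets: int, sample_rate: int) -> int:
    """Estimate how many matched packets are mirrored.

    Assumption: sample_rate N means 1 mirrored packet per N matched packets.
    """
    if sample_rate < 1:
        raise ValueError("sample_rate must be at least 1")
    return matched_packets // sample_rate


# With a sample rate of 10, one million matched packets
# would yield about one hundred thousand mirrored packets.
print(expected_mirrored(1_000_000, 10))  # 100000
```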
Configure the port-mirror instance:
Use one of these families: inet, inet6, or ethernet-switching:
user@switch# set forwarding-options port-mirroring instance instance-name family family-name output mirror-profile profile-name
- (Optional) Change the default mirror queue:
user@switch# set forwarding-options port-mirroring instance instance-name family family-name output forwarding-class multicast-forwarding-class-name
Note: If you intend to use CoS-based traffic control, you might need to change the default mirror queue.
Configure the firewall filter:
Configure the firewall family, the filter name, the term name, and the residence-time range (in the format minimum-maximum, with both numbers representing nanoseconds):
user@switch# set firewall family family-name filter filter-name term term-name from residence-time residence-time-range
Important: See Platform-Specific Stateless Flow Latency Monitoring Behavior for our recommendation for the starting point of the residence-time range value for your switch.
Note: The residence-time range is a firewall filter match condition. The possible range values are 1–1000000000 nanoseconds. You specify the range values to match as input. Alternatively, you can specify a single residence-time value. In that case, monitoring is enabled from the configured residence-time value to the maximum supported residence-time value (1 second).
Configure the action as port-mirror-instance and provide an instance name:
user@switch# set firewall family family-name filter filter-name term term-name then port-mirror-instance instance-name
(Optional) If you want to get a count of how many packets match the residence time, configure the count action in addition to the port-mirror-instance action.
user@switch# set firewall family family-name filter filter-name term term-name then count counter-name
Note: You can apply a count action without applying the port-mirror-instance action if you want to get a count of latency-exceeded flows without mirroring those flows.
You can issue the show firewall command to see the counter value.
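The residence-time value rules in this step (a minimum-maximum range, or a single value that extends to the 1-second maximum) can be checked offline with a small Python sketch like the one below. The function name is hypothetical, and the validation reflects only the limits stated in this document, not the behavior of the switch CLI.

```python
from typing import Tuple

MIN_RESIDENCE_NS = 1
MAX_RESIDENCE_NS = 1_000_000_000  # 1 second, the maximum stated above


def parse_residence_time(value: str) -> Tuple[int, int]:
    """Parse 'minimum-maximum' or a single value into (low, high) nanoseconds."""
    if "-" in value:
        low_text, high_text = value.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        # A single value means "from this value up to the maximum".
        low, high = int(value), MAX_RESIDENCE_NS
    if not (MIN_RESIDENCE_NS <= low <= high <= MAX_RESIDENCE_NS):
        raise ValueError(f"residence-time out of range: {value!r}")
    return low, high


print(parse_residence_time("1500-10000"))  # (1500, 10000)
print(parse_residence_time("5000"))        # (5000, 1000000000)
```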
Apply the firewall filter, which references the port-mirroring instance, to the interface in the output direction:
user@switch# set interfaces interface-name unit logical-unit-number family family-name filter output filter-name
Sample Configuration
forwarding-options {
    mirror-profile latMon {
        collector ip-address 10.12.1.1/24;
        latency-monitoring stateless;
        sample-rate 10;
    }
    port-mirroring instance flowMon {
        family inet {
            output {
                mirror-profile latMon;
            }
        }
    }
}
firewall family inet {
    filter f1 {
        term t1 {
            from residence-time 1500-10000;
            then port-mirror-instance flowMon;
        }
    }
}
interfaces {
    et-0/0/10.0 {
        family inet {
            filter output f1;
        }
    }
}
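If you script your configuration, the required set commands from this task can be rendered from a handful of parameters. The Python sketch below is one possible way to do that; the function name and parameters are hypothetical, the command strings follow the templates in the steps above, and the example values follow the sample configuration, with the collector address passed without a prefix length as an assumption.

```python
from typing import List


def render_set_commands(profile: str, collector_ip: str, instance: str,
                        family: str, filter_name: str, term: str,
                        residence_range: str, interface: str,
                        unit: int) -> List[str]:
    """Render the required set commands for stateless flow latency monitoring."""
    filter_term = f"set firewall family {family} filter {filter_name} term {term}"
    return [
        f"set forwarding-options mirror-profile {profile} collector ip-address {collector_ip}",
        f"set forwarding-options mirror-profile {profile} latency-monitoring stateless",
        f"set forwarding-options port-mirroring instance {instance} "
        f"family {family} output mirror-profile {profile}",
        f"{filter_term} from residence-time {residence_range}",
        f"{filter_term} then port-mirror-instance {instance}",
        f"set interfaces {interface} unit {unit} family {family} filter output {filter_name}",
    ]


commands = render_set_commands("latMon", "10.12.1.1", "flowMon", "inet",
                               "f1", "t1", "1500-10000", "et-0/0/10", 0)
for command in commands:
    print(command)
```

Review the rendered commands against your platform and release before you load them; optional settings such as the sample rate and the count action are left out of this sketch.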