Additional Features Optimized for AI-ML Fabrics

For more information about features optimized for AI-ML fabrics, see the AI-ML Data Center Feature Guide.

  • Reactive Path Rebalancing (QFX5240)—Starting in Junos OS Evolved Release 23.4R2, QFX5240 devices support Reactive Path Rebalancing, an enhancement to the existing Flowlet mode of the Dynamic Load Balancing (DLB) feature. In Flowlet mode, you configure an inactivity interval. Traffic uses its assigned outgoing interface until a pause in the flow exceeds the inactivity interval. The quality of the current outgoing link can degrade over time while no pause within the flow ever exceeds the configured inactivity interval. Classic Flowlet mode does not reassign a flow to a different link within the inactivity interval, so it cannot take advantage of a better-quality link. Reactive Path Rebalancing addresses this limitation by letting you move traffic to a better-quality link while remaining in Flowlet mode.

    As per the existing DLB feature, each ECMP egress member link is assigned a quality band based on the traffic flowing through it. The quality band depends on the port load (the number of egress bytes transmitted) and the queue buffer (the number of bytes waiting to be transmitted from the egress port). You can customize these attributes based on the traffic pattern flowing through the ECMP group.

    Benefits of reactive path rebalancing are:

    • Optimal use of bandwidth

    • Scalability

    • Avoidance of load-balancing inefficiencies caused by long-lived flows

    You must configure DLB in Flowlet mode to use this feature. If you enable reactive path rebalancing, packet reordering can occur when a flow moves from one port to another.

    The following rules must be satisfied before a flow is reassigned to a higher-quality member (a configuration sketch follows this list):

    • An egress member port must be available whose quality band is equal to or greater than that of the current egress port.

    • The random value generated for the packet must be lower than the configured reassignment probability threshold. A lower probability threshold moves flows to a higher-quality member at a slower rate. For example, a probability threshold of 200 moves macro flows to a higher-quality member faster than a probability threshold of 50.
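
    For reference, a minimal configuration sketch follows. Only the forwarding-options enhanced-hash-key hierarchy is confirmed by this note; the flowlet statement names and the values shown are illustrative assumptions, so check enhanced-hash-key for the exact syntax:

    set forwarding-options enhanced-hash-key ecmp-dlb flowlet inactivity-interval 16
    set forwarding-options enhanced-hash-key ecmp-dlb flowlet reactive-path-rebalancing quality-delta 2
    set forwarding-options enhanced-hash-key ecmp-dlb flowlet reactive-path-rebalancing prob-threshold 50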

    Example

    Consider the topology shown in Figure 1, where a device has three ingress ports and two egress ports. The figure also shows the table entries that forward traffic to each egress port. All ingress and egress ports have the same speed.

    Figure 1: Reactive Path Rebalancing

    Reactive path rebalancing works with a quality delta of 2 as follows:

    1. Start stream 1 (DMAC 0x123) at a rate of 10 percent on ingress port et-0/0/0; it egresses out of et-0/0/10. Start stream 3 at a rate of 50 percent on ingress port et-0/0/1; it egresses out of et-0/0/11.

      Egress link utilization is now 10 percent on et-0/0/10 (quality band 6) and 50 percent on et-0/0/11 (quality band 5).

    2. Start stream 2 (DMAC 0x223) at a rate of 40 percent on ingress port et-0/0/2; it egresses out of et-0/0/11.

    The reactive path rebalancing algorithm kicks in when the difference in quality bands between ports et-0/0/10 and et-0/0/11 is equal to or greater than the configured delta of 2. The algorithm moves stream 3 from et-0/0/11 to the better-quality member link, which is et-0/0/10 in this case.

    After some time, et-0/0/10 shows a link utilization of 60 percent with a quality band of 5 as it egresses streams 1 and 3, and et-0/0/11 shows a link utilization of 40 percent with a quality band of 5 as it egresses stream 2. [See enhanced-hash-key and show forwarding-options enhanced-hash-key.]

  • Configurable FlowSet Table in DLB Flowlet Mode (QFX5000)—DLB uses the FlowSet table to determine the egress interface of flows. The FlowSet table holds a total of 32000 entries that are distributed among 128 DLB ECMP groups. By default, the entries are divided equally, allocating 256 entries per ECMP group. Starting in Junos OS Evolved Release 23.4R2, you can change the distribution of entries among the ECMP groups. If an ECMP group is allocated more FlowSet table entries, it can accommodate a larger number of flows, thereby achieving better flow distribution. [See Dynamic Load Balancing.]
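
    As an illustration, a sketch of reallocating FlowSet entries follows; the flowset-entries statement name and value are hypothetical (this note confirms only that the distribution is configurable), so see Dynamic Load Balancing for the exact statement. Note that allocating more entries per group reduces the number of ECMP groups the table can hold.

    set forwarding-options enhanced-hash-key ecmp-dlb flowset-entries 512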

  • PFC watchdog support (QFX5230-64CD, QFX5240-64OD, QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD switches support the PFC watchdog feature. The PFC watchdog monitors PFC-enabled ports for PFC pause storms. When a PFC-enabled port receives PFC pause frames for an extended period of time and the PFC watchdog does not detect flow-control frames on that port, the PFC watchdog mitigates the situation by disabling the queue where the PFC pause storm was detected for a configurable length of time called the recovery time. After the recovery time passes, the PFC watchdog re-enables the affected queue.

    You configure the PFC watchdog by including the pfc-watchdog statement at the [edit class-of-service congestion-notification-profile profile-name] hierarchy level. There are four PFC watchdog parameters that you can configure for QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD switches (a configuration sketch follows this list):

    • poll-interval—The interval at which PFC watchdog checks the status of PFC queues, which can be 1, 10, or 100 milliseconds.

    • detection—The number of polling intervals the PFC watchdog waits before it mitigates the stalled traffic, from 1-15 intervals.

    • watchdog-action—The action the PFC watchdog takes to mitigate a stalled traffic queue, either drop or forward all enqueued and newly arriving packets.

    • recovery—How long the PFC watchdog disables the affected queue before it restores PFC on the queue, from 100-1500 milliseconds with a default of 200 milliseconds.
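
    A configuration sketch using these parameters follows. The profile name cnp-ai is illustrative, and the exact nesting of the pfc-watchdog sub-statements is an assumption based on the parameter names above:

    set class-of-service congestion-notification-profile cnp-ai pfc-watchdog poll-interval 10
    set class-of-service congestion-notification-profile cnp-ai pfc-watchdog detection 3
    set class-of-service congestion-notification-profile cnp-ai pfc-watchdog watchdog-action drop
    set class-of-service congestion-notification-profile cnp-ai pfc-watchdog recovery 200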

    [See PFC Watchdog and congestion-notification-profile.]

  • Priority-based flow control X-ON threshold support (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—The priority-based flow control (PFC) X-ON threshold is the ingress port's priority group (PG) shared buffer limit. At this limit, the ingress port's peer resumes transmission of packets after a brief pause caused by the PFC message that this ingress port sent. You can fine-tune the X-ON threshold through the congestion notification profile (CNP).
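
    A sketch of tuning the threshold through a CNP follows. The profile name, code point, and threshold value are illustrative, and the placement of the xon statement under the input stanza is an assumption; see the xon reference below for the exact syntax:

    set class-of-service congestion-notification-profile cnp-ai input ieee-802.1 code-point 011 xon 1000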

    [See xon (Input Congestion Notification).]

  • Per-queue alpha support (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—You can globally tune the limit of buffers that each queue can consume from the shared pool based on the dynamic threshold setting called the alpha value. You can also fine-tune the alpha value on a per-queue basis through a scheduler.
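
    A sketch follows; the scheduler name, scheduler map, forwarding class, and alpha value are illustrative, with the buffer-dynamic-threshold statement taken from the reference below:

    set class-of-service schedulers ai-sched buffer-dynamic-threshold 7
    set class-of-service scheduler-maps ai-map forwarding-class best-effort scheduler ai-sched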

    [See buffer-dynamic-threshold.]

  • Support for increased shared buffer pool (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—By default, the QFX5230 switch allocates 73MB of the total 113MB of global buffer space to shared buffers, and the QFX5240 switch allocates 82MB of the total 165MB of global buffer space to shared buffers. These switches allocate the remaining buffer space to dedicated buffers (ingress and egress). You can decrease the global dedicated buffer space from the default value, effectively increasing the global shared buffer space to up to 106MB on the QFX5230 and 147MB on the QFX5240.

    You can also define a dedicated buffer profile to increase or decrease the dedicated buffer allocated to an individual port. This feature is particularly useful for decreasing dedicated buffer space on unused or down ports, thereby increasing dedicated buffer space available to active ports.
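
    As a purely hypothetical sketch of a dedicated buffer profile (the statement names here are assumptions; see the reference below for the exact syntax):

    set class-of-service dedicated-buffer-profile unused-ports buffer-size minimum
    set class-of-service interfaces et-0/0/30 dedicated-buffer-profile unused-ports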

    [See Configuring Ingress and Egress Dedicated Buffers.]

  • egress-quantization (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, you can modify the port load and port queue metrics from their default values so that when dynamic load balancing is enabled, those metrics are used to determine an optimal link. Use the new egress-quantization CLI statement to configure the desired ratio of the port load metric to the port queue metric based on the traffic pattern.
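
    For example (a sketch; only the egress-quantization statement name is confirmed by this note, so the hierarchy placement and weight statements shown are assumptions; see the reference below):

    set forwarding-options enhanced-hash-key ecmp-dlb egress-quantization port-load-weight 60
    set forwarding-options enhanced-hash-key ecmp-dlb egress-quantization port-queue-weight 40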

    [See egress-quantization.]

  • rdma-opcode firewall filter match condition (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, the rdma-opcode and rdma-opcode-except firewall filter match conditions enable matching on the InfiniBand Base Transport Header (BTH) opcode.
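
    A firewall filter sketch follows; the filter and counter names are illustrative, and the opcode value 10 (RDMA WRITE ONLY in the InfiniBand specification) is shown only as an example of a BTH opcode; see the rdma-opcode reference for the supported values:

    set firewall family inet filter roce-monitor term rdma-writes from rdma-opcode 10
    set firewall family inet filter roce-monitor term rdma-writes then count rdma-write-count
    set firewall family inet filter roce-monitor term rdma-writes then accept
    set firewall family inet filter roce-monitor term default then accept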

    [See rdma-opcode.]

  • BGP Support for Global Load Balancing in DC Fabric (QFX5240)—Starting in Junos OS Evolved Release 23.4R2, a route with multiple ECMP links is hashed onto several links for load balancing. In a DC fabric, hashing cannot ensure even load distribution over all ECMP links, which might result in congestion on certain links and underutilization on others. Dynamic load balancing helps to avoid congested links and thus mitigates local congestion. However, dynamic load balancing cannot address all congestion. For example, AI-ML traffic that has elephant flows and lacks entropy causes congestion in the fabric. In such cases, global load balancing (GLB) helps to mitigate the congestion.

    In a Clos network, congestion on the first two next hops impacts the load-balancing decisions of the local node and the previous-hop nodes, triggering global load balancing. If the route has only one next-next-hop node, a simple path quality profile is created. If the route has more than one next-next-hop node, a simple path quality profile is created for each next-next-hop node.

    To enable global load balancing, include the global-load-balancing statement at the [edit protocols bgp] hierarchy level. This statement is disabled by default.
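
    For example, to enable GLB on the device:

    set protocols bgp global-load-balancing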

  • Extended sFlow Functionality Support (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, we’ve extended the sFlow monitoring functionality to support the following features:

    • Export of sFlow sample packets through the mgmt_junos routing instance.

      By default, the management Ethernet interface (usually named fxp0 or em0 for Junos OS, or re0:mgmt-* or re1:mgmt-* for Junos OS Evolved) provides the out-of-band management network for the device. Out-of-band management traffic is not clearly separated from in-band protocol control traffic. Instead, all traffic passes through the default routing instance and shares the default inet.0 routing table.

      Once you deploy the mgmt_junos VRF instance, management traffic no longer shares a routing table (that is, the default routing table) with other control traffic or protocol traffic in the system. Traffic in the mgmt_junos VRF instance uses private IPv4 and IPv6 routing tables.

      We’ve introduced a new configuration option, “routing-instance”, at the [edit protocols sflow collector] hierarchy level to specify the routing instance name.

    • Export of sFlow sample packets via non-default VRF WAN ports.

    sFlow is a traffic monitoring protocol that supports VRFs. sFlow provides traffic sampling on configured ports based on sample rate and port information to a collector. An sFlow monitoring system consists of an sFlow agent embedded in the device and up to four external collectors. The sFlow agent performs packet sampling and gathers interface statistics, and then combines the information into UDP datagrams that are sent to the sFlow collectors.

    You can add collectors on a per-VRF basis so that collectors can be spread out across different VRFs. The sFlow forwarding port can belong to a non-default VRF, and captured sFlow packets carry the correct sample routing next-hop information.

    With this extended feature, an sFlow collector can be connected to the switch through the management network. The software forwarding infrastructure daemon (SFID) on the switch looks up the next-hop address for the specified collector IP address to determine whether the collector is reachable by way of the management network or data network.

    Use the “show sflow collector detail” command to display the additional fields “Routing Instance Name”, which indicates the VRF through which the collector is reachable, and “Routing Instance Id”, which corresponds to that VRF. A configuration sketch follows.
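
    In the sketch below, the collector address and sampling interface are illustrative, and the mgmt_junos instance is assumed to be enabled through the system management-instance statement:

    set system management-instance
    set protocols sflow interfaces et-0/0/1
    set protocols sflow collector 192.0.2.10 routing-instance mgmt_junos

    Verify the collector's routing instance with the show sflow collector detail command.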

    [See collector and show sflow collector.]

  • Per-queue accounting of explicit congestion notification (ECN) packets (QFX5130, QFX5220, QFX5230, QFX5240, QFX5700)—Starting in Junos OS Evolved Release 23.4R2, counters on ECN-enabled queues increment when the queues experience congestion or receive packets that encountered congestion on another device. You can view these per-queue ECN accounting statistics through the show interfaces queue command.

    [See Understanding CoS Explicit Congestion Notification and show interfaces queue.]

  • SNMP support for PFC, ECN, and CoS ingress packet drop accounting (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Junos OS Evolved Release 23.4R2 introduces SNMP support that helps account for packets that are dropped because of ingress port congestion. You can view and export the error counter data for explicit congestion notification (ECN), ingress drops, and priority-based flow control (PFC) using the following commands:

    • show snmp mib walk ifJnxTable

    • show snmp mib walk jnxCosPfcPriorityTable

    [See SNMP MIBs and Traps Supported by Junos OS and Junos OS Evolved and show snmp mib.]

  • Telemetry support for CoS ingress packet drop accounting (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Junos OS Evolved Release 23.4R2 supports streaming counters for packets that are dropped due to ingress port congestion. Use the native sensor /junos/system/linecard/interface/traffic to stream counters for priority flow control (PFC), explicit congestion notification (ECN), and ingress drops.

    Use the native sensor /junos/system/linecard/qmon-sw/ to stream the priority group (PG) buffer utilization. You can also stream counters for PFC, ECN, and ingress drops through OpenConfig using the sensor /interfaces/interface/.

    [See Guidelines for gRPC and gNMI Sensors (Junos Telemetry Interface).]

  • Remote port mirroring to IPv4/IPv6 address (GRE encapsulation) with DSCP, source-address, and rate-limiting parameters (QFX5230-64CD, QFX5240-64OD, and QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, you can configure DSCP, source-address, and rate-limiting parameters in your configuration for remote port mirroring to IPv4 or IPv6 addresses. You use remote port mirroring to copy packets entering a port or VLAN and send the copies to the IPv4 or IPv6 address of a device running an analyzer application on a remote network (sometimes referred to as “extended remote port mirroring”). The mirrored packets are GRE-encapsulated.

    You configure source-address or source-ipv6-address, dscp, and forwarding-class options—either in the analyzer configuration or the port-mirroring configuration—under these hierarchies, respectively:

    • [edit forwarding-options analyzer instance instance-name output]

    • [edit forwarding-options port-mirroring instance instance-name family inet|inet6 output]

    You configure the forwarding class and the shaping-rate option under the class-of-service hierarchy, as follows (a combined sketch follows this list):

    • set class-of-service forwarding-classes class class-name queue-num queue-number

    • set class-of-service interfaces interface-name scheduler-map map-name

    • set class-of-service scheduler-maps map-name forwarding-class class-name scheduler scheduler-name

    • set class-of-service schedulers scheduler-name shaping-rate rate
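
    Putting the pieces together, a sketch of an analyzer instance that mirrors to a remote IPv4 address follows. The instance name, addresses, and values are illustrative, and the output statement names follow the hierarchies listed above:

    set forwarding-options analyzer instance ai-mirror input ingress interface et-0/0/1.0
    set forwarding-options analyzer instance ai-mirror output ip-address 198.51.100.7
    set forwarding-options analyzer instance ai-mirror output source-address 192.0.2.1
    set forwarding-options analyzer instance ai-mirror output dscp 46
    set forwarding-options analyzer instance ai-mirror output forwarding-class mirror-fc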

    [See Port Mirroring and Analyzers.]

  • Strip and replace BGP private-AS path (QFX5230-64CD, QFX5240-64OD, QFX5240-64QD)—Starting in Junos OS Evolved Release 23.4R2, we have introduced the strip-as-path policy option, which removes the incoming autonomous system (AS) path as part of the import policy for a BGP session and replaces it with the receiving router's local AS number for the receiving session. Note that the local AS number might differ from the number configured under autonomous-system at the [edit routing-options] hierarchy level.

    If you need to normalize externally injected routes, you can use this policy option to strip the incoming AS path so that those routes can be used like routes that originate solely within the fabric. The new strip-as-path policy option has no impact on BGP export policy.

    You can configure the strip-as-path option in a policy-options then clause:

    set policy-options policy-statement do-strip term a then strip-as-path
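
    To apply the policy to a BGP session, reference it as an import policy (the group name here is illustrative):

    set protocols bgp group fabric-ebgp import do-strip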

    [See Autonomous Systems for BGP Sessions.]