Global Load Balancing (GLB)
Overview
This is an evolving feature for early adopters. More enhancements are planned in a future release.
Traffic in AI-ML data centers has lower entropy and larger data flows than traffic in other networks. Because hash-based load balancing does not always balance this type of traffic effectively, dynamic load balancing (DLB) is often used instead. However, DLB takes into account only the local link bandwidth utilization, so it can effectively mitigate traffic congestion only on the immediate next hop. Global load balancing (GLB) is an enhancement to DLB that has visibility into congestion at the next-to-next-hop (NNH) level. GLB load-balances large data flows more effectively by taking traffic congestion on remote links into account.
Classic load balancing mechanisms use a hashing algorithm to decide the egress interface through which to send traffic. These algorithms apply a hash function to the 5-tuple of the received packet (source and destination IP address, source and destination port, and protocol). However, they do not consider the real-time utilization of the links through which they send packets. Even in DLB, the decision is entirely local: the algorithm has no visibility into link utilization elsewhere in the network. If a node farther along the path is congested, that node might drop the packet.
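The following minimal sketch illustrates why a purely hash-based decision ignores congestion. It is an illustration only, not how any device implements hashing: the `FiveTuple` type, the `pick_egress_static` function, and the interface names are hypothetical, and SHA-256 stands in for whatever hash function the forwarding hardware actually uses.

```python
import hashlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    """The packet fields that classic hash-based load balancing looks at."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_egress_static(flow: FiveTuple, egress_links: Sequence[str]) -> str:
    """Pick an egress link from the 5-tuple hash alone.

    Link utilization is not an input, so a busy link is as likely to be
    chosen as an idle one. SHA-256 is a stand-in hash (assumption).
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(egress_links)
    return egress_links[index]


# Hypothetical uplinks and flow, for illustration only.
links = ["et-0/0/0", "et-0/0/1", "et-0/0/2", "et-0/0/3"]
flow = FiveTuple("10.0.0.1", "10.0.1.1", 17, 49152, 4791)
chosen = pick_egress_static(flow, links)
```

Because the result depends only on the 5-tuple, every packet of a given flow takes the same link for as long as the set of links is unchanged. With a small number of large flows, several of them can land on the same link while other links stay idle.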
GLB takes into account the utilization of remote links before deciding on the egress interface. Like DLB, when one multipath leg experiences congestion, GLB can offload traffic to alternative legs to mitigate the congestion. Unlike DLB, GLB can reroute traffic flows on leaf devices to avoid traffic congestion at the spine level.
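The difference between a local-only decision and one that sees the next-to-next hop can be sketched as follows. This is a simplified model under stated assumptions, not the actual GLB algorithm: the function names are hypothetical, utilization is expressed as a fraction between 0 and 1, and a path is scored by its more heavily loaded hop.

```python
from typing import Mapping


def pick_spine_dlb(local_util: Mapping[str, float]) -> str:
    """Local-only choice: the least-utilized leaf-to-spine uplink."""
    return min(local_util, key=lambda spine: local_util[spine])


def pick_spine_glb(local_util: Mapping[str, float],
                   remote_util: Mapping[str, float]) -> str:
    """NNH-aware choice (assumed scoring): minimize the worse of the
    local uplink and the spine's link toward the destination leaf."""
    return min(local_util,
               key=lambda spine: max(local_util[spine], remote_util[spine]))


# Hypothetical utilization, keyed by spine.
local = {"spine1": 0.20, "spine2": 0.35}   # source leaf -> spine uplinks
remote = {"spine1": 0.95, "spine2": 0.30}  # spine -> destination leaf links

dlb_choice = pick_spine_dlb(local)          # "spine1": congestion behind it is invisible
glb_choice = pick_spine_glb(local, remote)  # "spine2": avoids the congested remote link
```

In this example the local-only decision sends traffic toward spine1 because its uplink is the least busy, even though the link from spine1 to the destination leaf is nearly saturated. The NNH-aware decision selects spine2 instead.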
Benefits
- Reduces packet loss due to congestion and remote link failures
- Effectively load-balances large data flows in Clos topologies end-to-end to avoid congestion
- Is particularly useful in AI-ML deployments where large data flows increase the likelihood of traffic congestion
Configuration
Considerations
Keep the following in mind when configuring GLB:
- GLB is supported only in a 3-Clos (leaf-spine-leaf) topology.
- All the devices in the 3-Clos topology must support GLB before you can configure GLB.
- The 3-Clos topology can have a maximum of 64 leaf devices when it supports GLB.
- GLB supports only one link between the same pair of devices (for example, a spine device and leaf device).
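As a planning aid, the considerations above can be turned into an offline check of a proposed fabric design. The sketch below is hypothetical and is not a Junos tool or API: the `check_glb_topology` function and its inputs are assumptions. It models the topology as sets of device names plus a list of links, with each link given as a pair of device names.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

MAX_LEAVES = 64  # maximum number of leaf devices stated above


def check_glb_topology(leaves: Set[str],
                       spines: Set[str],
                       links: Iterable[Tuple[str, str]],
                       glb_capable: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means no violation found."""
    problems = []

    if len(leaves) > MAX_LEAVES:
        problems.append(
            f"{len(leaves)} leaf devices exceeds the maximum of {MAX_LEAVES}")

    # Every device in the fabric must support GLB.
    unsupported = sorted((leaves | spines) - glb_capable)
    if unsupported:
        problems.append(
            "devices without GLB support: " + ", ".join(unsupported))

    # Count links per unordered device pair.
    pair_counts = Counter(tuple(sorted(link)) for link in links)
    for (a, b), count in sorted(pair_counts.items()):
        leaf_spine = ((a in leaves and b in spines)
                      or (a in spines and b in leaves))
        if not leaf_spine:
            problems.append(f"link {a}-{b} is not a leaf-spine link")
        if count > 1:
            problems.append(
                f"{count} links between {a} and {b}; only one is supported")

    return problems
```

For example, calling `check_glb_topology({"leaf1", "leaf2"}, {"spine1"}, [("leaf1", "spine1"), ("leaf2", "spine1")], {"leaf1", "leaf2", "spine1"})` returns an empty list, while adding a second `("leaf1", "spine1")` link reports the parallel link.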
GLB does not support the following features:
- Integrated routing and bridging (IRB) interfaces between top-of-rack (ToR) and spine devices
- Multihomed servers
- GLB for overlay routes (IPv4 or IPv6)
- GLB for BGP routes learned in routing instances
Configure GLB
Platform Support
See Feature Explorer for platform and release support.