Help us improve your experience.

Let us know what you think.

Do you have time for a two-minute survey?

Examples: Configuring BGP Multipath

 

Understanding BGP Multipath

BGP multipath allows you to install multiple internal BGP paths and multiple external BGP paths to the forwarding table. Selecting multiple paths enables BGP to load-balance traffic across multiple links.

A path is considered a BGP equal-cost path (and is used for forwarding) if the BGP path selection process performs a tie-break after comparing the IGP cost to the next-hop. By default, all paths with the same neighboring AS, learned by a multipath-enabled BGP neighbor are considered in the multipath selection process.

BGP, typically selects only one best path for each prefix and installs that route in the forwarding table. When BGP multipath is enabled, the device selects multiple equal-cost BGP paths to reach a given destination, and all these paths are installed in the forwarding table. BGP advertises only the active path to its neighbors, unless add-path is in use.

The Junos OS BGP multipath feature supports the following applications:

  • Load balancing across multiple links between two routing devices belonging to different autonomous systems (ASs)

  • Load balancing across a common subnet or multiple subnets to different routing devices belonging to the same peer AS

  • Load balancing across multiple links between two routing devices belonging to different external confederation peers

  • Load balancing across a common subnet or multiple subnets to different routing devices belonging to external confederation peers

In a common scenario for load balancing, a customer is multihomed to multiple routers or switches in a point of presence (POP). The default behavior is to send all traffic across only one of the available links. Load balancing causes traffic to use two or more of the links.

BGP multipath does not apply to paths that share the same MED-plus-IGP cost, yet differ in IGP cost. Multipath path selection is based on the IGP cost metric, even if two paths have the same MED-plus-IGP cost.

Starting in Junos OS Release 18.1R1 BGP multipath is supported globally at [edit protocols bgp] hierarchy level. You can selectively disable multipath on some BGP groups and neighbors. Include disable at [edit protocols bgp group group-name multipath] hierarchy level to disable multipath option for a group or a specific BGP neighbor.

Starting in Junos OS Release 18.1R1, you can defer multipath calculation until all BGP routes are received. When multipath is enabled, BGP inserts the route into the multipath queue each time a new route is added or whenever an existing route changes. When multiple paths are received through BGP add-path feature, BGP might calculate one multipath route multiple times. Multipath calculation slows down the RIB (also known as the routing table) learning rate. To speed up RIB learning, multipath calculation can be either deferred until the BGP routes are received or you can lower the priority of the multipath build job as per your requirements until the BGP routes are resolved. To defer the multipath calculation configure defer-initial-multipath-build at [edit protocols bgp] hierarchy level. Alternatively, you can lower the BGP multipath build job priority using multipath-build-priority configuration statement at [edit protocols bgp] hierarchy level to speed up RIB learning.

Example: Load Balancing BGP Traffic

This example shows how to configure BGP to select multiple equal-cost external BGP (EBGP) or internal BGP (IBGP) paths as active paths.

Requirements

Before you begin:

  • Configure the device interfaces.

  • Configure an interior gateway protocol (IGP).

  • Configure BGP.

  • Configure a routing policy that exports routes (such as direct routes or IGP routes) from the routing table into BGP.

Overview

The following steps show how to configure per-packet load balancing:

  1. Define a load-balancing routing policy by including one or more policy-statement statements at the [edit policy-options] hierarchy level, defining an action of load-balance per-packet:

    Note

    To enable load-balancing among multiple EBGP paths and multiple IBGP paths , include the multipath statement globally at the [edit protocols bgp] hierarchy level. You cannot enable load-balancing of BGP traffic without including the multipath statement globally, or for a BGP group at the [edit protocols bgp group group-name hierarchy level, or for specific BGP neighbors at the [edit protocols bgp group group-name neighbor address] hierarchy level.

  2. Apply the policy to routes exported from the routing table to the forwarding table. To do this, include the forwarding-table and export statements:

    You cannot apply the export policy to VRF routing instances.

  3. Specify all next hops of that route, if more than one exists, when allocating a label corresponding to a route that is being advertised.

  4. Configure the forwarding-options hash key for MPLS to include the IP payload.

Note

On some platforms, you can increase the number of paths that are load balanced by using the chassis maximum-ecmp statement. With this statement, you can change the maximum number of equal-cost load-balanced paths to 32, 64, 128, 256, or 512 (the maximum number varies per platform—see maximum-ecmp.) Starting with Junos OS Release 19.1R1, you can specify a maximum number of 128 equal-cost paths on QFX10000 switches. Starting with Junos OS Release 19.2R1, you can specify a maximum number of 512 equal-cost paths on QFX10000 switches.—see Understanding Configuration of Up to 512 Equal-Cost Paths With Optional Consistent Load Balancing.

In this example, Device R1 is in AS 64500 and is connected to both Device R2 and Device R3, which are in AS 64501. This example shows the configuration on Device R1.

Topology

Figure 1 shows the topology used in this example.

Figure 1: BGP Load Balancing
BGP Load Balancing

Configuration

CLI Quick Configuration

To quickly configure this example, copy the following commands, paste them into a text file, remove any line breaks, change any details necessary to match your network configuration, and then copy and paste the commands into the CLI at the [edit] hierarchy level.

Step-by-Step Procedure

The following example requires that you navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode in the CLI User Guide.

To configure the BGP peer sessions:

  1. Configure the BGP group.
  2. Enable the BGP group to use multiple paths.Note

    To disable the default check requiring that paths accepted by BGP multipath must have the same neighboring autonomous system (AS), include the multiple-as option.

  3. Configure the load-balancing policy.
  4. Apply the load-balancing policy.
  5. Configure the local autonomous system (AS) number.

Results

From configuration mode, confirm your configuration by entering the show protocols, show policy-options, and show routing-options commands. If the output does not display the intended configuration, repeat the instructions in this example to correct the configuration.

If you are done configuring the device, enter commit from configuration mode.

Verification

Confirm that the configuration is working properly:

Verifying Routes

Purpose

Verify that routes are learned from both routers in the neighboring AS.

Action

From operational mode, run the show route command.

user@R1> show route 10.0.2.0
user@R1> show route 10.0.2.0 detail

Meaning

The active path, denoted with an asterisk (*), has two next hops: 10.0.1.1 and 10.0.0.2 to the 10.0.2.0 destination. The 10.0.1.1 next hop is copied from the inactive path to the active path.

Note

The show route detail command output designates one gateway as selected. This output is potentially confusing in the context of load balancing. The selected gateway is used for many purposes in addition to deciding which gateway to install into the kernel when Junos OS is not performing per-packet load-balancing. For instance, the ping mpls command uses the selected gateway when sending packets. Multicast protocols use the selected gateway in some cases to determine the upstream interface. Therefore, even when Junos OS is performing per-packet load-balancing by way of a forwarding-table policy, the selected gateway information is still required for other purposes. It is useful to display the selected gateway for troubleshooting purposes. Additionally, it is possible to use forwarding-table policy to override what is installed into the kernel (for example, by using the install-nexthop action). In this case, the next-hop gateway installed in the forwarding table might be a subset of the total gateways displayed in the show route command.

Verifying Forwarding

Purpose

Verify that both next hops are installed in the forwarding table.

Action

From operational mode, run the show route forwarding-table command.

user@R1> show route forwarding-table destination 10.0.2.0

Understanding Configuration of Up to 512 Equal-Cost Paths With Optional Consistent Load Balancing

You can configure the equal-cost multipath (ECMP) feature with up to 512 paths for external BGP peers. Having the ability to configure up to 512 ECMP next hops allows you to increase the number of direct BGP peer connections with your specified routing device, thus improving latency and optimizing data flow. You can optionally include consistent load balancing in that ECMP configuration. Consistent load balancing ensures that if an ECMP member (that is, a path) fails, only flows flowing through the failed member are redistributed to other active ECMP members. Consistent load balancing also ensures that if an ECMP member is added, redistribution of flows from existing EMCP members to the new ECMP member is minimal.

Guidelines and Limitations for Configuring from 256 to 512 Equal-Cost Paths, Optionally with Consistent Load Balancing

  • The feature applies only to single-hop external BGP peers. (This feature does not apply to MPLS routes.)

  • The device’s routing process (RPD) must support 64-bit mode; 32-bit RPD is not supported.

  • The feature applies only to unicast traffic.

  • Traffic distribution might not be even across all group members—it depends on the traffic pattern and on the organization of the hashing flow set table in hardware. Consistent hashing minimizes remapping of flows to destination links when members are added to or deleted from the group.

  • If you configure set forwarding-options enhanced-hash-key with one of the options hash-mode, inet, inet6, or layer2, some flows might change destination links, because the new hash parameters might generate new hash indexes for the flows, resulting in new destination links.

  • To achieve the best-possible hashing accuracy, this feature uses a cascaded topology to implement the next-hop structure for configurations of more than 128 next hops. Hashing accuracy is therefore somewhat lesser than it is for ECMP next-hop configurations of less than 128, which do not require a cascaded topology.

  • Existing flows on affected ECMP paths and new flows flowing over those affected ECMP paths might switch paths during local route repair, and traffic skewing might be noticeable. However, any such skewing is corrected during the subsequent global route repair.

  • When you increase the maximum-ecmp value, consistency hashing is lost during the next next-hop-change event for the route prefix.

  • If you add a new path to an existing ECMP group, some flows over unaffected paths might move to the newly added path.

  • Fast reroute (FRR) might not work with consistent hashing.

  • Perfect ECMP-like traffic distribution cannot be achieved. Paths that have more “buckets” than other paths have more traffic flows than paths with fewer buckets (a bucket is an entry in the load-balancing table’s distribution list that is mapped to an ECMP member index).

  • During network topology change events, consistent hashing is lost for network prefixes in some instances because those prefixes point to a new ECMP next hop that does not have all properties of the prefixes’ previous ECMP next hops.

  • If multiple network prefixes point to the same ECMP next hop and one or more of those prefixes is enabled with the consistent-hash statement, all network prefixes pointing to that same ECMP next hop display consistent–hashing behavior.

  • Consistent hashing is supported on the equal-cost BGP routes–based ECMP group only. When other protocols or static routes are configured that have priority over BGP routes, consistent hashing is not supported.

  • Consistent hashing might have limitations when the configuration is combined with configurations for the following features, because these features have tunnel terminations or traffic engineering that does not use hashing for selecting paths—GRE tunneling; BUM traffic; EVPN-VXLAN; and MPLS TE, autobandwidth.

Instructions for Configuring Up to 512 ECMP Next Hops, and Optionally Configuring Consistent Load Balancing

When you are ready to configure up to 512 next hops, use the following configuration instructions:

  1. Configure the maximum number of ECMP next hops—for example, configure 512 ECMP next hops:

  2. Creating a routing policy and enable per-packet load balancing, thus enabling ECMP globally on the system:

  3. Enable resiliency on selected prefixes by creating a separate routing policy to match incoming routes to one or more destination prefixes—for example:

  4. Apply an eBGP import policy (for example, “c-hash”) to the BGP group of external peers:

For more detail on configuring equal-cost paths, see Example: Load Balancing BGP Traffic, which appears earlier in this document.

(Optional) For more detail on configuring consistent load balancing (also known as consistent hashing), see Configuring Consistent Load Balancing for ECMP Groups.

Example: Configuring Single-Hop EBGP Peers to Accept Remote Next Hops

This example shows how to configure a single-hop external BGP (EBGP) peer to accept a remote next hop with which it does not share a common subnet.

Requirements

No special configuration beyond device initialization is required before you configure this example.

Overview

In some situations, it is necessary to configure a single-hop EBGP peer to accept a remote next hop with which it does not share a common subnet. The default behavior is for any next-hop address received from a single-hop EBGP peer that is not recognized as sharing a common subnet to be discarded. The ability to have a single-hop EBGP peer accept a remote next hop to which it is not directly connected also prevents you from having to configure the single-hop EBGP neighbor as a multihop session. When you configure a multihop session in this situation, all next-hop routes learned through this EBGP peer are labeled indirect even when they do share a common subnet. This situation breaks multipath functionality for routes that are recursively resolved over routes that include these next-hop addresses. Configuring the accept-remote-nexthop statement allows a single-hop EBGP peer to accept a remote next hop, which restores multipath functionality for routes that are resolved over these next-hop addresses. You can configure this statement at the global, group, and neighbor hierarchy levels for BGP. The statement is also supported on logical systems and the VPN routing and forwarding (VRF) routing instance type. Both the remote next-hop and the EBGP peer must support BGP route refresh as defined in RFC 2918, Route Refresh Capability in BGP-4. If the remote peer does not support BGP route refresh, the session is reset.

When you enable a single-hop EBGP peer to accept a remote next hop, you must also configure an import routing policy on the EBGP peer that specifies the remote next-hop address.

This example includes an import routing policy, agg_route, that enables a single-hop external BGP peer (Device R1) to accept the remote next-hop 1.1.10.10 for the route to the 1.1.230.0/23 network. At the [edit protocols bgp] hierarchy level, the example includes the import agg_route statement to apply the policy to the external BGP peer and includes the accept-remote-nexthop statement to enable the single-hop EBGP peer to accept the remote next hop.

Figure 2 shows the sample topology.

Figure 2: Topology for Accepting a Remote Next Hop
Topology for Accepting a Remote
Next Hop

Configuration

CLI Quick Configuration

To quickly configure this example, copy the following commands, paste them into a text file, remove any line breaks, change any details necessary to match your network configuration, and then copy and paste the commands into the CLI at the [edit] hierarchy level.

Device R0

Device R1

Device R2

Device R0

Step-by-Step Procedure

The following example requires you to navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode in the CLI User Guide.

To configure Device R0:

  1. Configure the interfaces.
  2. Configure EBGP.
  3. Enable multipath BGP between Device R0 and Device R1.
  4. Configure static routes to remote networks.

    These routes are not part of the topology. The purpose of these routes is to demonstrate the functionality in this example.
  5. Configure routing policies that accept the static routes.
  6. Export the agg_route and test_route policies from the routing table into BGP.
  7. Configure the autonomous system (AS) number.

Results

From configuration mode, confirm your configuration by entering the show interfaces, show policy-options, show protocols, and show routing-options commands. If the output does not display the intended configuration, repeat the instructions in this example to correct the configuration.

If you are done configuring the device, enter commit from configuration mode.

Configuring Device R1

Step-by-Step Procedure

The following example requires you to navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode in the CLI User Guide.

To configure Device R1:

  1. Configure the interfaces.
  2. Configure OSPF.
  3. Enable Device R1 to accept the remote next hop.
  4. Configure IBGP.
  5. Configure EBGP.
  6. Enable multipath BGP between Device R0 and Device R1.
  7. Configure a routing policy that enables a single-hop external BGP peer (Device R1) to accept the remote next-hop 1.1.10.10 for the route to the 1.1.230.0/23 network.
  8. Import the agg_route policy into the routing table on Device R1.
  9. Configure the autonomous system (AS) number.

Results

From configuration mode, confirm your configuration by entering the show interfaces, show policy-options, show protocols, and show routing-options commands. If the output does not display the intended configuration, repeat the instructions in this example to correct the configuration.

If you are done configuring the device, enter commit from configuration mode.

Configuring Device R2

Step-by-Step Procedure

The following example requires you to navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode in the CLI User Guide.

To configure Device R2:

  1. Configure the interfaces.
  2. Configure OSPF.
  3. Configure IBGP.
  4. Configure the autonomous system (AS) number.

Results

From configuration mode, confirm your configuration by entering the show interfaces, show protocols, and show routing-options commands. If the output does not display the intended configuration, repeat the instructions in this example to correct the configuration.

If you are done configuring the device, enter commit from configuration mode.

Verification

Confirm that the configuration is working properly.

Verifying That the Multipath Route with the Indirect Next Hop Is in the Routing Table

Purpose

Verify that Device R1 has a route to the 1.1.230.0/23 network.

Action

From operational mode, enter the show route 1.1.230.0 extensive command.

user@R1> show route 1.1.230.0 extensive

Meaning

The output shows that Device R1 has a route to the 1.1.230.0 network with the multipath feature enabled (Accepted Multipath). The output also shows that the route has an indirect next hop of 1.1.10.10.

Deactivating and Reactivating the accept-remote-nexthop Statement

Purpose

Make sure that the multipath route with the indirect next hop is removed from the routing table when you deactivate the accept-remote-nexthop statement.

Action

  1. From configuration mode, enter the deactivate protocols bgp accept-remote-nexthop command.

    user@R1# deactivate protocols bgp accept-remote-nexthop
    user@R1# commit
  2. From operational mode, enter the show route 1.1.230.0 command.

    user@R1> show route 1.1.230.0
  3. From configuration mode, reactivate the statement by entering the activate protocols bgp accept-remote-nexthop command.

    user@R1# activate protocols bgp accept-remote-nexthop
    user@R1# commit
  4. From operational mode, reenter the show route 1.1.230.0 command.

    user@R1> show route 1.1.230.0

Meaning

When the accept-remote-nexthop statement is deactivated, the multipath route to the 1.1.230.0 network is removed from the routing table .

Release History Table
Release
Description
Starting with Junos OS Release 19.2R1, you can specify a maximum number of 512 equal-cost paths on QFX10000 switches.
Starting with Junos OS Release 19.1R1, you can specify a maximum number of 128 equal-cost paths on QFX10000 switches.
Starting in Junos OS Release 18.1R1 BGP multipath is supported globally at [edit protocols bgp] hierarchy level. You can selectively disable multipath on some BGP groups and neighbors. Include disable at [edit protocols bgp group group-name multipath] hierarchy level to disable multipath option for a group or a specific BGP neighbor.
Starting in Junos OS Release 18.1R1, you can defer multipath calculation until all BGP routes are received. When multipath is enabled, BGP inserts the route into the multipath queue each time a new route is added or whenever an existing route changes. When multiple paths are received through BGP add-path feature, BGP might calculate one multipath route multiple times. Multipath calculation slows down the RIB (also known as the routing table) learning rate. To speed up RIB learning, multipath calculation can be either deferred until the BGP routes are received or you can lower the priority of the multipath build job as per your requirements until the BGP routes are resolved. To defer the multipath calculation configure defer-initial-multipath-build at [edit protocols bgp] hierarchy level. Alternatively, you can lower the BGP multipath build job priority using multipath-build-priority configuration statement at [edit protocols bgp] hierarchy level to speed up RIB learning.