Selecting the Best Path
BGP selects only one route to a destination as the best path. When multiple routes to a given destination exist, BGP must determine which of these routes is the best. BGP puts the best path in its routing table and advertises that path to its BGP neighbors.
If only one route exists to a particular destination, BGP installs that route. If multiple routes exist for a destination, BGP uses tie-breaking rules to decide which one of the routes to install in the BGP routing table.
BGP Path Decision Algorithm
BGP determines the best path to each destination for a BGP speaker by comparing path attributes according to the following selection sequence:
- Select a path with a reachable next hop.
- Select the path with the highest weight.
- If path weights are the same, select the path with the highest local preference value.
- Prefer locally originated routes (network routes, redistributed routes, or aggregated routes) over received routes.
- Select the route with the shortest AS-path length.
- If all paths have the same AS-path length, select the path based on origin: IGP is preferred over EGP; EGP is preferred over Incomplete.
- If the origins are the same, select the path with lowest MED value.
- If the paths have the same MED values, select the path learned by means of EBGP over one learned by means of IBGP.
- Select the path with the lowest IGP cost to the next hop.
- Select the path with the shortest route reflection cluster list. Routes without a cluster list are treated as having a cluster list of length 0.
- Select the path received from the peer with the lowest BGP router ID.
- Select the path that was learned from the neighbor with the lowest peer remote address.
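The selection sequence above behaves like a lexicographic comparison, which the following Python sketch illustrates. It is not router code: the Path class, its field names, and its default values are hypothetical, and for simplicity the sketch compares MED values across all paths, as if bgp always-compare-med were enabled.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

# Lower rank is preferred: IGP over EGP over Incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    """Hypothetical container for the attributes the decision process reads."""
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_cost: int = 0
    cluster_list_len: int = 0      # 0 when the route has no cluster list
    router_id: str = "0.0.0.0"
    peer_address: str = "0.0.0.0"


def preference_key(p):
    """Build a tuple that sorts the most preferred path first."""
    return (
        -p.weight,                  # highest weight
        -p.local_pref,              # highest local preference
        not p.locally_originated,   # locally originated first
        len(p.as_path),             # shortest AS path
        ORIGIN_RANK[p.origin],      # IGP < EGP < Incomplete
        p.med,                      # lowest MED (simplified, see lead-in)
        not p.is_ebgp,              # EBGP before IBGP
        p.igp_cost,                 # lowest IGP cost to the next hop
        p.cluster_list_len,         # shortest cluster list
        int(IPv4Address(p.router_id)),     # lowest router ID
        int(IPv4Address(p.peer_address)),  # lowest peer address
    )


def best_path(paths):
    """Return the preferred path, or None if no next hop is reachable."""
    usable = [p for p in paths if p.next_hop_reachable]
    return min(usable, key=preference_key) if usable else None
```

Because each step is consulted only when all earlier steps tie, a path with a higher weight wins even if its AS path is longer.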
The following sections discuss the attributes evaluated in the path decision process. Examples show how you might configure these attributes to influence routing decisions.
Configuring Next-Hop Processing
Routes sent by BGP speakers include the next-hop attribute. The next hop is the IP address of a node on the network that is closer to the advertised prefix. Routers that have traffic destined for the advertised prefix send the traffic to the next hop. The next hop can be the address of the BGP speaker sending the update or of a third-party node. The third-party node does not have to be a BGP speaker.
The next-hop attributes conform to the following rules:
- The next hop for EBGP sessions is the IP address of the peer that advertised the route.
- The next hop for IBGP sessions is one of the following:
  - If the route originated inside the AS, the next hop is the IP address of the peer that advertised the route.
  - If the route originated outside the AS (that is, it was injected into the AS by means of an EBGP session), the next hop is the IP address of the external BGP speaker that advertised the route.
- For routes advertised on multiaccess media (such as Frame Relay, ATM, or Ethernet), the next hop is the IP address of the originating router's interface that is connected to the medium.
Next Hops
If you use the neighbor remote-as command to configure the BGP neighbors, the next hop is passed according to the rules provided above when networks are advertised. Consider the network configuration shown in Figure 28. Router Jackson advertises 192.168.22.0/23 internally to router Memphis with a next hop of 10.2.2.1. Router Jackson advertises the same network externally to router Topeka with a next hop of 10.1.13.1.
Router Memphis advertises 172.24.160.0/19 with a next hop of 10.2.2.2 to router Jackson. Router Jackson advertises this same network externally to router Topeka with a next hop of 10.1.13.1.
Router Topeka advertises network 192.168.32.0/19 with a next hop of 10.1.13.2 to router Jackson. Because this network originates outside AS 604, router Jackson then internally advertises this network (192.168.32.0/19) to router Memphis with the same next hop, 10.1.13.2 (the IP address of the external BGP speaker that advertised the route).
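The next-hop choices that router Jackson makes in this example can be sketched in a few lines of Python. This is only an illustration of the EBGP and IBGP rules listed earlier: the function and parameter names are hypothetical, and the multiaccess (third-party next hop) rule is deliberately left out.

```python
def advertised_next_hop(session, local_address,
                        learned_via_ebgp=False, learned_next_hop=None):
    """Return the next hop a speaker places in an update it sends.

    session          -- "ebgp" or "ibgp": type of session the update is sent on
    local_address    -- the speaker's own address on that session
    learned_via_ebgp -- True if the route entered the AS over an EBGP session
    learned_next_hop -- next hop received with the route, if it was learned
    """
    if session == "ibgp" and learned_via_ebgp:
        # Routes from outside the AS keep the external speaker's address.
        return learned_next_hop
    # EBGP sessions, and IBGP sessions for routes originated inside the AS,
    # carry the advertising peer's own address.
    return local_address


# Router Jackson advertising its own 192.168.22.0/23:
to_memphis = advertised_next_hop("ibgp", "10.2.2.1")
to_topeka = advertised_next_hop("ebgp", "10.1.13.1")

# Router Jackson passing Topeka's 192.168.32.0/19 on to router Memphis:
relayed = advertised_next_hop("ibgp", "10.2.2.1",
                              learned_via_ebgp=True,
                              learned_next_hop="10.1.13.2")
```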
When router Memphis has traffic destined for 192.168.32.0/19, it must be able to reach the next hop by means of an IGP, because it has no direct connection to 10.1.13.2. Otherwise, router Memphis will drop packets destined for 192.168.32.0/19 because the next-hop address is not accessible. Router Memphis does a lookup in its IP routing table to determine how to reach 10.1.13.2:
The next hop is reachable through router Jackson, and the traffic can be forwarded.
The following commands configure the routers as shown in Figure 28:
host1(config)#router bgp 604
host1(config-router)#neighbor 10.1.13.2 remote-as 25
host1(config-router)#neighbor 10.2.2.2 remote-as 604
host1(config-router)#network 192.168.22.0 mask 255.255.254.0
host2(config)#router bgp 604
host2(config-router)#neighbor 10.2.2.1 remote-as 604
host2(config-router)#network 172.24.160.0 mask 255.255.224.0
host3(config)#router bgp 25
host3(config-router)#neighbor 10.1.13.1 remote-as 604
host3(config-router)#network 172.31.64.0 mask 255.255.192.0
Additional configuration is required for routers Biloxi, Memphis, and Jackson; the details depend on the IGP running in AS 604.
neighbor remote-as
- Use to add an entry to the BGP neighbor table.
- Specifying a neighbor with an AS number that matches the AS number specified in the router bgp command identifies the neighbor as internal to the local AS. Otherwise, the neighbor is treated as an external neighbor.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command unless it is overridden for a specific peer.
- This command takes effect immediately.
- Use the no version to remove an entry from the table.
Next-Hop-Self
In some circumstances, using a third-party next hop causes routing problems. These configurations typically involve nonbroadcast multiaccess (NBMA) media. To better understand this situation, first consider a broadcast multiaccess (BMA) media network, as shown in Figure 29.
Routers Toledo, Madrid, and Barcelona are all on the same Ethernet network, which has a prefix of 10.19.7.0/24. When router Toledo advertises prefix 192.168.22.0/23 to router Madrid, it sets the next-hop attribute to 10.19.7.5. Before router Madrid advertises this prefix to router Barcelona, it sees that its own IP address, 10.19.7.7, is on the same subnet as the next hop for the advertised prefix. If router Barcelona can reach router Madrid, then it should be able to reach router Toledo. Router Madrid therefore advertises 192.168.22.0/23 to router Barcelona with a next-hop attribute of 10.19.7.5.
Now consider Figure 30, which shows the same routers on a Frame Relay (NBMA) network.
Routers Toledo and Madrid are EBGP peers, as are routers Madrid and Barcelona. When router Toledo advertises prefix 192.168.22.0/23 to router Madrid, router Madrid makes the same comparison as in the BMA example, and leaves the next-hop attribute intact when it advertises the prefix to router Barcelona. However, router Barcelona will not be able to forward traffic to 192.168.22.0/23, because it does not have a direct PVC connection to router Toledo and cannot reach the next hop of 10.19.7.5.
You can use the neighbor next-hop-self command to correct this routing problem. If you use this command to configure router Madrid, the third-party next hop advertised by router Toledo is not advertised to router Barcelona. Instead, router Madrid advertises 192.168.22.0/23 with the next-hop attribute set to its own IP address, 10.19.7.7. Router Barcelona now forwards traffic destined for 192.168.22.0/23 to the next hop, 10.19.7.7. Router Madrid then passes the traffic along to router Toledo.
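The comparison that router Madrid makes, and the effect of next-hop-self on it, can be sketched as follows. This is a simplified model under stated assumptions: the function name and parameters are hypothetical, and the only test applied is whether the learned next hop lies in the subnet of the local interface.

```python
from ipaddress import ip_address, ip_interface


def next_hop_for_peer(local_interface, learned_next_hop, next_hop_self=False):
    """Pick the next hop to advertise for a route learned from a neighbor.

    local_interface  -- local address with prefix length, e.g. "10.19.7.7/24"
    learned_next_hop -- next hop received with the route
    next_hop_self    -- models the neighbor next-hop-self configuration
    """
    iface = ip_interface(local_interface)
    shares_subnet = ip_address(learned_next_hop) in iface.network
    if shares_subnet and not next_hop_self:
        # Third-party next hop: leave the attribute intact.
        return learned_next_hop
    # Otherwise the speaker reports itself as the next hop.
    return str(iface.ip)


# Router Madrid (10.19.7.7) relaying a route learned from router Toledo:
default_behavior = next_hop_for_peer("10.19.7.7/24", "10.19.7.5")
with_next_hop_self = next_hop_for_peer("10.19.7.7/24", "10.19.7.5",
                                       next_hop_self=True)
```

With the default behavior, router Barcelona receives 10.19.7.5, which it cannot reach on the NBMA network; with next-hop-self it receives 10.19.7.7.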
To disable third-party next-hop processing, configure router Madrid as follows:
host1(config)#router bgp 319
host1(config-router)#neighbor 10.19.7.8 remote-as 211
host1(config-router)#neighbor 10.19.7.8 next-hop-self
neighbor next-hop-self
- Use to prevent third-party next hops from being used on NBMA media such as Frame Relay. This command is useful in nonmeshed networks such as Frame Relay or where BGP neighbors may not have direct access to the same IP subnet.
- Forces the BGP speaker to report itself as the next hop for an advertised route it learned from a neighbor.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command. You cannot override the characteristic for a specific member of the peer group.
- New policy values are applied to all routes that are sent (outbound policy) or received (inbound policy) after you issue the command.
To apply the new policy to routes that are already present in the BGP routing table, you must use the clear ip bgp command to perform a soft clear or hard clear of the current BGP session.
Behavior is different for outbound policies configured for peer groups for which you have enabled Adj-RIBs-Out. If you change the outbound policy for such a peer group and want to fill the Adj-RIBs-Out table for that peer group with the results of the new policy, you must use the clear ip bgp peer-group command to perform a hard clear or outbound soft clear of the peer group. You cannot merely perform a hard clear or outbound soft clear for individual peer group members because that causes BGP to resend only the contents of the Adj-RIBs-Out table.
- Issuing this command automatically removes the neighbor next-hop-unchanged configuration (enabled or disabled) on the peer or peer group. Issuing the no or default version of this command has no effect on the neighbor next-hop-unchanged configuration.
- Use the no version to disable this feature (and therefore enable next-hop processing of BGP updates). Use the default version to remove the explicit configuration from the peer or peer group and reestablish inheritance of the feature configuration.
Assigning a Weight to a Route
You can assign a weight to a route when more than one route exists to the same destination. A weight indicates a preference for that particular route over the other routes to that destination. The higher the assigned weight, the more preferred the route. By default, the route weight is 32768 for paths originated by the router, and 0 for other paths.
In the configuration shown in Figure 31, routers Boston and NY both learn about network 192.68.5.0/24 from AS 200. Routers Boston and NY both propagate the route to router LA. Router LA now has two routes for reaching 192.68.5.0/24 and must decide which route to use. If you prefer that router LA direct traffic through router Boston, you can configure router LA so that the weight of routes coming from router Boston is higher (more preferred) than the weight of routes coming from router NY. Router LA subsequently prefers routes received from router Boston and therefore uses router Boston as the next hop to reach network 192.68.5.0/24.
You can set the weights of routes coming in from router Boston in any of three ways: with the neighbor weight command, with a route map, or with an AS-path access list.
Using the neighbor weight Command
The following commands assign a weight of 1000 to all routes router LA receives from AS 100 and assign a weight of 500 to all routes router LA receives from AS 300:
host1(config)#router bgp 400
host1(config-router)#neighbor 10.5.5.1 remote-as 100
host1(config-router)#neighbor 10.5.5.1 weight 1000
host1(config-router)#neighbor 10.72.4.2 remote-as 300
host1(config-router)#neighbor 10.72.4.2 weight 500
Router LA sends traffic through router Boston in preference to router NY.
Using a Route Map
A route map instance is a set of conditions with an assigned number. The number after the permit keyword designates an instance of a route map. For example, instance 10 of route map 10 begins with the following:
host1(config)#route-map 10 permit 10
In the following commands to configure router LA, route map 10 assigns a weight of 1000 to routes received from the neighbor in AS 100, and route map 20 assigns a weight of 500 to routes received from the neighbor in AS 300.
host1(config)#router bgp 400
host1(config-router)#neighbor 10.5.5.1 remote-as 100
host1(config-router)#neighbor 10.5.5.1 route-map 10 in
host1(config-router)#neighbor 10.72.4.2 remote-as 300
host1(config-router)#neighbor 10.72.4.2 route-map 20 in
host1(config-router)#exit
host1(config)#route-map 10
host1(config-route-map)#set weight 1000
host1(config-route-map)#route-map 20
host1(config-route-map)#set weight 500
See JUNOSe IP Services Configuration Guide, Chapter 1, Configuring Routing Policy for more information about using route maps.
Using an AS-Path Access List
The following commands assign weights to routes filtered by AS-path access lists on router LA:
host1(config)#router bgp 400
host1(config-router)#neighbor 10.5.5.1 remote-as 100
host1(config-router)#neighbor 10.5.5.1 filter-list 1 weight 1000
host1(config-router)#neighbor 10.72.4.2 remote-as 300
host1(config-router)#neighbor 10.72.4.2 filter-list 2 weight 500
host1(config-router)#exit
host1(config)#ip as-path access-list 1 permit ^100_
host1(config)#ip as-path access-list 2 permit ^300_
Access list 1 permits any route whose AS-path attribute begins with 100 (specified by ^). This permits routes that pass through router Boston, whether they originate in AS 100 (AS path = 100) or AS 200 (AS path = 100 200) or AS 300 (AS path = 100 200 300). Access list 2 permits any route whose AS-path attribute begins with 300. This permits routes that pass through router NY, whether they originate in AS 300 (AS path = 300) or AS 200 (AS path = 300 200) or AS 100 (AS path = 300 200 100).
The neighbor filter-list commands assign a weight attribute of 1000 to routes passing through router Boston and a weight attribute of 500 to routes passing through router NY. Regardless of the origin of the route, routes learned through router Boston are preferred.
ip as-path access-list
- Use to define a BGP access list; use the neighbor filter-list command to apply a specific access list.
- You can apply access list filters on inbound or outbound BGP routes, or both.
- You can permit or deny access for a route matching the condition(s) specified by the regular expression.
- If the regular expression matches the representation of the AS path of the route as an ASCII string, then the permit or deny condition applies.
- The AS path allows substring matching. For example, the regular expression 20 matches AS path = 20 and AS path = 100 200 300, because 20 is a substring of each path. To disable substring matching and constrain matching to only the specified attribute string, place the underscore (_) metacharacter on both sides of the string, for example _20_.
- The AS path does not contain the local AS number.
- Use the no version to remove a single access list entry if permit or deny and a path-expression are specified. Otherwise, the entire access list is removed.
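The substring and underscore behavior described above can be tried out with Python's re module. This sketch is an approximation, not the router's regular expression engine: it assumes that the underscore metacharacter matches the start or end of the AS-path string or a delimiter such as a space, and the helper name as_path_matches is hypothetical.

```python
import re

# Assumption: "_" stands for the start of the string, the end of the
# string, or a delimiter character (space, comma, brace, parenthesis).
UNDERSCORE = r"(?:^|$|[ ,{}()])"


def as_path_matches(expression, as_path):
    """Return True if the access-list expression matches the AS-path string."""
    pattern = expression.replace("_", UNDERSCORE)
    return re.search(pattern, as_path) is not None


# Substring matching: 20 is found inside 200.
substring_hit = as_path_matches("20", "100 200 300")

# Underscores on both sides constrain the match to the whole AS number.
exact_miss = as_path_matches("_20_", "100 200 300")
exact_hit = as_path_matches("_20_", "20")

# Access lists 1 and 2 from the router LA example.
via_boston = as_path_matches("^100_", "100 200 300")
via_ny = as_path_matches("^300_", "300 200 100")
```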
neighbor filter-list
- Use to apply an AS-path access list to advertisements inbound from or outbound to the specified neighbor, or to assign a weight to incoming routes that match the AS-path access list.
- You can specify an optional weight value with the weight keyword to assign a relative importance to incoming routes matching the AS-path access list.
- The name of the access list is a string of up to 32 characters.
- You can apply the filter to incoming or outgoing advertisements with the in or out keywords.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command unless it is overridden for a specific peer. However, you cannot configure a member of a peer group to override the inherited peer group characteristic for outbound policy.
- New policy values are applied to all routes that are sent (outbound policy) or received (inbound policy) after you issue the command.
To apply the new policy to routes that are already present in the BGP routing table, you must use the clear ip bgp command to perform a soft clear or hard clear of the current BGP session.
Behavior is different for outbound policies configured for peer groups for which you have enabled Adj-RIBs-Out. If you change the outbound policy for such a peer group and want to fill the Adj-RIBs-Out table for that peer group with the results of the new policy, you must use the clear ip bgp peer-group command to perform a hard clear or outbound soft clear of the peer group. You cannot merely perform a hard clear or outbound soft clear for individual peer group members because that causes BGP to resend only the contents of the Adj-RIBs-Out table.
neighbor weight
- Use to assign a weight to a neighbor connection.
- All routes learned from this neighbor will have the assigned weight initially.
- The route with the highest weight will be chosen as the preferred route when multiple routes are available to a particular network.
- The weights assigned with the set weight commands in a route map override the weights assigned with the neighbor weight and neighbor filter-list commands.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command unless it is overridden for a specific peer.
- New policy values are applied to all routes that are sent (outbound policy) or received (inbound policy) after you issue the command.
To apply the new policy to routes that are already present in the BGP routing table, you must use the clear ip bgp command to perform a soft clear or hard clear of the current BGP session.
Behavior is different for outbound policies configured for peer groups for which you have enabled Adj-RIBs-Out. If you change the outbound policy for such a peer group and want to fill the Adj-RIBs-Out table for that peer group with the results of the new policy, you must use the clear ip bgp peer-group command to perform a hard clear or outbound soft clear of the peer group. You cannot merely perform a hard clear or outbound soft clear for individual peer group members because that causes BGP to resend only the contents of the Adj-RIBs-Out table.
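The way these weight sources interact can be summarized in a short Python sketch. The source states only that a route map's set weight overrides the neighbor weight and neighbor filter-list values, and that the defaults are 32768 for locally originated paths and 0 for others. The ordering between filter-list and neighbor weight below is an assumption made for illustration, and the function name is hypothetical.

```python
LOCAL_ORIGIN_WEIGHT = 32768  # default for paths originated by the router


def effective_weight(locally_originated=False, neighbor_weight=None,
                     filter_list_weight=None, route_map_weight=None):
    """Resolve the weight of a route from the configured sources."""
    if route_map_weight is not None:
        # A set weight clause in a route map overrides the other sources.
        return route_map_weight
    if filter_list_weight is not None:
        # ASSUMPTION: a matching filter-list weight is taken before the
        # per-neighbor weight; the source does not state this ordering.
        return filter_list_weight
    if neighbor_weight is not None:
        return neighbor_weight
    return LOCAL_ORIGIN_WEIGHT if locally_originated else 0


# A route from router Boston with neighbor weight 1000 and no other policy:
from_boston = effective_weight(neighbor_weight=1000)
```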
See Access Lists for more information about using access lists.
Configuring the Local-Pref Attribute
The local-pref attribute specifies the preferred path among multiple paths to the same destination. The preferred path is the one with the higher preference value. Local preference is used only within an AS, to select an exit point.
To configure the local preference of a BGP path, you can do one of the following:
- Use the bgp default local-preference command to set the local-preference attribute.
- Use a route map to set the local-pref attribute.
Using the bgp default local-preference Command
In Figure 32, AS 873 receives updates for network 192.168.5.0/24 from AS 32 and AS 17.
The following commands configure router LA:
host1(config)#router bgp 873
host1(config-router)#neighbor 10.72.4.2 remote-as 32
host1(config-router)#neighbor 10.2.2.4 remote-as 873
host1(config-router)#bgp default local-preference 125
The following commands configure router SanJose:
host2(config)#router bgp 873
host2(config-router)#neighbor 10.5.5.1 remote-as 17
host2(config-router)#neighbor 10.2.2.3 remote-as 873
host2(config-router)#bgp default local-preference 200
Router LA sets the local preference for all updates from AS 32 to 125. Router SanJose sets the local preference for all updates from AS 17 to 200. Because router LA and router SanJose exchange local preference information within AS 873, they both recognize that routes to network 192.168.5.0/24 in AS 293 have a higher local preference when they come to AS 873 from AS 17 than when they come from AS 32. As a result, both router LA and router SanJose prefer to reach this network through router Boston in AS 17.
bgp default local-preference
- Use to change the default local preference value.
- Changes apply automatically whenever BGP subsequently runs the best-path decision process for a destination prefix; that is, whenever a best route is picked for a given prefix.
To force BGP to run the decision process on routes already received, you must use the clear ip bgp command to perform an inbound soft clear or hard clear of the current BGP session.
Using a Route Map to Set the Local Preference
When you use a route map to set the local preference, you have more flexibility: you can select the routes that receive a given local preference based on many criteria, including AS. In the previous section, all updates received by router SanJose were set to a local preference of 200.
Using a route map, you can assign a local preference specifically to routes that are received from AS 17 and that originated in AS 293.
The following commands configure router SanJose.
host2(config)#router bgp 873
host2(config-router)#neighbor 10.2.2.3 remote-as 873
host2(config-router)#neighbor 10.5.5.1 remote-as 17
host2(config-router)#neighbor 10.5.5.1 route-map 10 in
host2(config-router)#exit
host2(config)#ip as-path access-list 1 permit ^17 293$
host2(config)#route-map 10 permit 10
host2(config-route-map)#match as-path 1
host2(config-route-map)#set local-preference 200
host2(config-route-map)#exit
host2(config)#route-map 10 permit 20
Router SanJose sets the local-pref attribute to 200 for routes originating in AS 293 and passing last through AS 17. All other routes are accepted (as defined in instance 20 of the route map 10), but their local preference remains at the default value of 100, indicating a less-preferred path.
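The effect of route map 10 on router SanJose can be modeled in Python as an ordered list of permit instances. This is a rough sketch: the tuple layout and function name are hypothetical, only permit instances are modeled, and the rejection of routes that match no instance is an assumption inferred from the need for instance 20.

```python
import re

# Each instance: (sequence number, AS-path regex or None, local-pref or None).
# A regex of None means the instance has no match clause and matches every
# route; a local-pref of None means it has no set clause.
ROUTE_MAP_10 = [
    (10, r"^17 293$", 200),
    (20, None, None),
]


def apply_route_map(instances, as_path, local_pref=100):
    """Return the resulting local preference, or None if the route is rejected."""
    for _seq, regex, new_pref in sorted(instances, key=lambda inst: inst[0]):
        if regex is None or re.search(regex, as_path):
            # The first matching instance decides; later ones are not consulted.
            return local_pref if new_pref is None else new_pref
    # ASSUMPTION: a route matching no instance is not accepted.
    return None


preferred = apply_route_map(ROUTE_MAP_10, "17 293")
other = apply_route_map(ROUTE_MAP_10, "17 44")
```

Here preferred is 200 and other stays at the default of 100, which mirrors the behavior described for router SanJose.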
Understanding the Origin Attribute
BGP uses the origin attribute to describe how a route was learned at the origin, that is, the point where the route was injected into BGP. The origin of the route can be one of three values:
- IGP: Indicates that the route was learned by means of an IGP and, therefore, is internal to the originating AS. All routes advertised by the network command have an origin of IGP.
- EGP: Indicates that the route was learned by means of an EGP.
- Incomplete: Indicates that the origin of the route is unknown, that is, learned from something other than IGP or EGP. All routes redistributed into BGP by the redistribute command (such as static routes) have an origin of Incomplete.
Consider the sample topology shown in Figure 33. Because routers Albany and Boston are not directly connected, they learn the path to each other by means of an IGP (not illustrated).
The following commands configure router Boston:
host1(config)#ip route 172.31.125.100 255.255.255.252
host1(config)#router bgp 100
host1(config-router)#neighbor 10.2.25.1 remote-as 100
host1(config-router)#neighbor 10.4.4.1 remote-as 100
host1(config-router)#neighbor 10.3.3.1 remote-as 300
host1(config-router)#network 172.19.0.0
host1(config-router)#redistribute static
The following commands configure router NY:
host2(config)#router bgp 100
host2(config-router)#neighbor 10.4.4.1 remote-as 100
host2(config-router)#neighbor 10.2.25.2 remote-as 100
host2(config-router)#network 172.28.8.0 mask 255.255.248.0
The following commands configure router Albany:
host3(config)#router bgp 100
host3(config-router)#neighbor 10.4.4.2 remote-as 100
host3(config-router)#neighbor 10.2.25.2 remote-as 100
host3(config-router)#network 192.168.33.0 mask 255.255.255.0
The following commands configure router LA:
host4(config)#router bgp 300
host4(config-router)#neighbor 10.3.3.2 remote-as 100
host4(config-router)#network 192.168.204.0 mask 255.255.252.0
host4(config-router)#redistribute isis
Consider how route 172.21.10.0/23 is passed along to the routers in Figure 33:
- IS-IS injects route 172.21.10.0/23 from router Chicago into BGP on router LA. BGP sets the origin attribute to Incomplete (because it is a redistributed route) to indicate how BGP originally became aware of the route.
- Router Boston learns about route 172.21.10.0/23 by means of EBGP from router LA.
- Router NY learns about route 172.21.10.0/23 by means of IBGP from router Boston.
The value of the origin attribute for a given route remains the same, regardless of where you examine it. Table 21 shows this for all the routes known to routers NY and LA.
As a matter of routing policy, you can specify an origin for a route with a set origin clause in a redistribution route map. Changing the origin enables you to influence which of several routes for the same destination prefix is selected as the best route. In practice, changing the origin is rarely done.
Understanding the AS-Path Attribute
The AS-path attribute is a list of the ASs through which a route has passed. Whenever a BGP speaker advertises a route to a peer in another AS, it prepends its own AS number to the AS-path attribute. This enables network operators to track routes, and it also enables the detection and prevention of routing loops.
Consider the following sequence of events for the routers shown in Figure 34:
- Route 172.21.10.0/23 is injected into BGP by means of router London in AS 47.
- Suppose router London advertises that route to router Paris in AS 621. As received by router Paris, the AS-path attribute for route 172.21.10.0/23 is 47.
- Router Paris advertises the route to router Berlin in AS 11. As received by router Berlin, the AS-path attribute for route 172.21.10.0/23 is 621 47.
- Router Berlin advertises the route to router London in AS 47. As received by router London, the AS-path attribute for route 172.21.10.0/23 is 11 621 47.
A routing loop exists if router London accepts the route from router Berlin. Router London can choose not to accept the route from router Berlin because it recognizes from the AS-path attribute (11 621 47) that the route originated in its own AS 47.
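The sequence of events above, including the loop check made by router London, can be traced with a minimal Python sketch. The function names are hypothetical, and AS paths are modeled as plain tuples of AS numbers.

```python
def advertise_to_external_peer(as_path, local_as):
    """Return the AS path as seen by an external peer: local AS prepended."""
    return (local_as,) + tuple(as_path)


def accepts(as_path, local_as):
    """A speaker can reject any route whose AS path already holds its own AS."""
    return local_as not in as_path


# Route 172.21.10.0/23 starts in AS 47 with an empty AS path.
seen_by_paris = advertise_to_external_peer((), 47)
seen_by_berlin = advertise_to_external_peer(seen_by_paris, 621)
seen_by_london = advertise_to_external_peer(seen_by_berlin, 11)

# Router London (AS 47) finds its own AS in the path and can drop the route.
london_accepts = accepts(seen_by_london, 47)
```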
As a matter of routing policy, you can prepend additional AS numbers to the AS-path attribute for a route with a set as-path prepend clause in an outbound route map. Changing the AS path enables you to influence which of several routes for the same destination prefix is selected as the best route.
Configuring a Local AS
You can change the local AS of a BGP peer or peer group within the current address family with the neighbor local-as command. By using different local AS numbers for different peers, you can avoid or postpone AS renumbering in the event the ASs are merged.
neighbor local-as
- Use to assign a local AS to the given BGP peer or peer group.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command unless it is overridden for a specific peer.
- This command takes effect immediately and automatically bounces the BGP session.
- Use the no version for an individual peer to restore the value set for the peer group, if present, or set globally for BGP with the router bgp command. Use the no version for a peer group to restore the value set globally for BGP.
The following example commands change the local AS number for peer 10.4.4.2 from the global local AS of 100 to 32:
host1(config)#router bgp 100
host1(config-router)#address-family ipv4 unicast vrf boston
host1(config-router)#neighbor 10.4.4.2 remote-as 645
host1(config-router)#neighbor 10.4.4.2 local-as 32
Configuring the MED Attribute
If two ASs connect to each other in more than one place, one link or path might be a better choice to reach a particular prefix within or behind one of the ASs. The MED value is a metric expressing a degree of preference for a particular path. Lower MED values are preferred.
Whereas the Local Preference attribute is used only within an AS (to select an exit point), the MED attribute is exchanged between ASs. A router in one AS sends the MED to inform a router in another AS which path the second router should use to reach particular destinations. If you are the administrator of the second AS, you must therefore trust that the router in the first AS is providing information that is truly beneficial to your AS.
You configure the MED on the sending router by using the set metric command in an outbound route map. Unless configured otherwise, a receiving router compares MED attributes only for paths from external neighbors that are members of the same AS. If you want MED attributes from neighbors in different ASs to be compared, you must issue the bgp always-compare-med command.
In Figure 35, router London in AS 303 can reach 192.168.33.0/24 in AS 73 through router Paris or through router Nice to router Paris.
The following commands configure router London:
host1(config)#router bgp 303
host1(config-router)#neighbor 10.4.4.2 remote-as 73
host1(config-router)#neighbor 10.3.3.2 remote-as 73
host1(config-router)#neighbor 10.5.5.2 remote-as 4
host1(config-router)#network 122.28.8.0 mask 255.255.248.0
The following commands configure router Paris:
host2(config)#router bgp 73
host2(config-router)#neighbor 10.4.4.1 remote-as 303
host2(config-router)#neighbor 10.4.4.1 route-map 10 out
host2(config-router)#neighbor 10.2.25.1 remote-as 73
host2(config-router)#neighbor 10.6.6.1 remote-as 4
host2(config-router)#neighbor 10.6.6.1 route-map 10 out
host2(config-router)#network 192.168.33.0 mask 255.255.255.0
host2(config-router)#exit
host2(config)#route-map 10 permit 10
host2(config-route-map)#set metric 50
The following commands configure router Nice:
host3(config)#router bgp 73
host3(config-router)#neighbor 10.3.3.1 remote-as 303
host3(config-router)#neighbor 10.3.3.1 route-map 10 out
host3(config-router)#neighbor 10.2.25.2 remote-as 73
host3(config-router)#network 172.19.0.0
host3(config-router)#exit
host3(config)#route-map 10 permit 10
host3(config-route-map)#set metric 100
The following commands configure router Dublin:
host4(config)#router bgp 4
host4(config-router)#neighbor 10.5.5.1 remote-as 303
host4(config-router)#neighbor 10.5.5.1 route-map 10 out
host4(config-router)#neighbor 10.6.6.2 remote-as 73
host4(config-router)#network 172.14.27.0 mask 255.255.255.0
host4(config-router)#exit
host4(config)#route-map 10 permit 10
host4(config-route-map)#set metric 25
Router London receives updates regarding route 192.168.33.0/24 from both router Nice and router Paris. Router London compares the MED values received from the two routers: Router Nice advertises a MED of 100 for the route, whereas router Paris advertises a MED of 50. On this basis, router London prefers the path through router Paris.
Because BGP by default compares only MED attributes of routes coming from the same AS, router London can compare only the MED attributes for route 192.168.33.0/24 that it received from routers Paris and Nice. It cannot compare the MED received from router Dublin, because router Dublin is in a different AS than routers Paris and Nice.
However, you can use the bgp always-compare-med command to configure router London to take into account the MED attribute from router Dublin as follows:
host1(config)#router bgp 303
host1(config-router)#neighbor 10.4.4.2 remote-as 73
host1(config-router)#neighbor 10.3.3.2 remote-as 73
host1(config-router)#neighbor 10.5.5.2 remote-as 4
host1(config-router)#network 122.28.8.0 mask 255.255.248.0
host1(config-router)#bgp always-compare-med
Router Dublin advertises a MED of 25 for route 192.168.33.0/24, which is lower (more preferred) than the MED advertised by router Paris or router Nice. However, the AS path for the route through router Dublin is longer than that through router Paris, and AS-path length is evaluated before MED. The AS path is the same length for router Paris and router Nice, but the MED advertised by router Paris is lower than that advertised by router Nice. Consequently, router London prefers the path through router Paris.
Suppose, however, that router Dublin was not configured to set the MED for route 192.168.33.0/24 in its outbound route map 10. Would router London receive a MED of 50 passed along by router Paris through router Dublin? No, because the MED attribute is nontransitive. Router Dublin does not transmit any MED that it receives. A MED is only of value to a direct peer.
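The rule for when two MED values are compared at all can be sketched in Python using the three paths that router London receives. This models only the MED step, not the whole decision process: the Candidate type and function name are hypothetical, and the neighboring AS is taken to be the first AS number in the path.

```python
from typing import NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    """Hypothetical record for one received path."""
    name: str
    as_path: Tuple[int, ...]
    med: int


def lower_med(a: Candidate, b: Candidate,
              always_compare_med: bool = False) -> Optional[Candidate]:
    """Return the path preferred by MED, or None if MED does not decide."""
    same_neighbor_as = a.as_path[0] == b.as_path[0]
    if not (same_neighbor_as or always_compare_med):
        return None  # MEDs from different neighboring ASs are not compared
    if a.med == b.med:
        return None  # tie: later steps of the decision process apply
    return a if a.med < b.med else b


paris = Candidate("Paris", (73,), 50)
nice = Candidate("Nice", (73,), 100)
dublin = Candidate("Dublin", (4, 73), 25)

same_as = lower_med(paris, nice)                       # compared by default
different_as = lower_med(paris, dublin)                # not compared by default
forced = lower_med(paris, dublin, always_compare_med=True)
```

Even when the comparison with router Dublin is forced, the shorter AS path through router Paris has already won at an earlier step of the full decision process.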
bgp always-compare-med
- Use to enable the comparison of the MED for paths from neighbors in different ASs.
- Unless you specify the bgp always-compare-med command, the router compares MED attributes only for paths from external neighbors that are in the same AS.
- The BGP path decision algorithm selects a lower MED value over a higher one.
- Unlike the local preference, the MED attribute is exchanged between ASs; however, a MED that is received by an AS does not leave that AS.
- The value is used for decision making within the AS only.
- When BGP propagates a route received from outside the AS to another AS, it removes the MED.
- Example
host1(config-router)#bgp always-compare-med
- Changes apply automatically whenever BGP subsequently runs the best-path decision process for a destination prefix; that is, whenever a best route is picked for a given prefix. To force BGP to run the decision process on routes already received, you must use the clear ip bgp command to perform an inbound soft clear or hard clear of the current BGP session.
set metric
- Use to set the metric value (for BGP, the MED) for a route.
- Sets an absolute metric. You cannot use both an absolute metric and a relative metric within the same route map sequence. Setting either metric overrides any previously configured value.
- Example
host1(config)#route-map nyc1 permit 10
host1(config-route-map)#set metric 10
- Use the no version to delete the set clause from a route map.
Missing MED Values
By default, a route that arrives with no MED value is treated as if it had a MED of 0, the most preferred value. You can use the bgp bestpath missing-as-worst command to specify that a route with any MED value is always preferred to a route that is missing the MED value.
bgp bestpath missing-as-worst
- Use to set a missing MED value to infinity, the least preferred value.
- After issuing this command, a route missing the MED is always preferred less than any route that has a MED configured.
- Example
host1(config-router)#bgp bestpath missing-as-worst
- Changes apply automatically whenever BGP subsequently runs the best-path decision process for a destination prefix; that is, whenever a best route is picked for a given prefix. To force BGP to run the decision process on routes already received, you must use the clear ip bgp command to perform an inbound soft clear or hard clear of the current BGP session.
- Use the no version to restore the default condition, where a missing MED value is set to 0, the most preferred value.
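The effect of this command on MED comparison can be sketched as follows. The function name and the use of None for a missing MED are hypothetical, and modeling "infinity" as the largest 32-bit value is an assumption based on the MED being a four-octet unsigned integer; the router's internal representation may differ.

```python
from typing import Optional

MED_BEST = 0
# Assumption: "infinity" is modeled as the largest value a 32-bit MED can hold.
MED_WORST = 2**32 - 1


def effective_med(med: Optional[int], missing_as_worst: bool = False) -> int:
    """Return the MED used for comparison; None means the route carried no MED."""
    if med is not None:
        return med
    return MED_WORST if missing_as_worst else MED_BEST


# Default behavior: the route without a MED beats a route with MED 50.
print(effective_med(None) < effective_med(50))
# With bgp bestpath missing-as-worst: the route with MED 50 now wins.
print(effective_med(None, missing_as_worst=True) < effective_med(50, missing_as_worst=True))
```

The first comparison evaluates to True and the second to False, matching the default behavior and the behavior after the command is issued.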
Comparing MED Values Within a Confederation
A BGP speaker within a confederation of sub-ASs might need to compare routes to determine the best path to a destination. By default, BGP does not use the MED value when comparing routes originated in different sub-ASs within the confederation to which the BGP speaker belongs. (Within the confederation, routes learned from different sub-ASs are treated as having originated in different places.) You can use the bgp bestpath med confed command to force MED values to be taken into account within a confederation.
bgp bestpath med confed
- Use to specify that BGP take into account the MED when comparing routes originated in different sub-ASs within the confederation to which the BGP speaker belongs.
- This command does not affect the comparison of routes that are originated in other ASs and does not affect the comparison of routes that are originated in other confederations.
- Example
host1(config-router)#bgp bestpath med confed
- Changes apply automatically whenever BGP subsequently runs the best-path decision process for a destination prefix; that is, whenever a best route is picked for a given prefix. To force BGP to run the decision process on routes already received, you must use the clear ip bgp command to perform an inbound soft clear or hard clear of the current BGP session.
Suppose a BGP speaker has three routes to prefix 10.10.0.0/16:
- Route 1 is originated by sub-AS 1 inside the confederation.
- Route 2 is originated by sub-AS 2 inside the confederation.
- Route 3 is originated by AS 3 outside the confederation.
BGP compares these routes to each other to determine the best path to the prefix. If you have issued the bgp bestpath med confed command, BGP takes into account the MED when comparing Route 1 with Route 2. However, BGP does not take into account the MED when comparing Route 3 with either Route 1 or Route 2, because Route 3 originates outside the confederation.
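The rule for these three routes can be expressed as a small predicate. This is only a sketch of the behavior described here: the Origin type and the function name are hypothetical, routes from the same originating AS or sub-AS are assumed to be MED-comparable, and the effect of bgp always-compare-med is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Origin:
    number: int
    in_confederation: bool   # True for a sub-AS of the local confederation


def med_compared(a: Origin, b: Origin, med_confed: bool = False) -> bool:
    """Report whether the MED is taken into account when comparing two routes."""
    if a == b:
        return True   # same originating AS or sub-AS (assumed comparable)
    if a.in_confederation and b.in_confederation:
        return med_confed   # different sub-ASs: only with bgp bestpath med confed
    return False      # one route originates outside the confederation


route1 = Origin(1, in_confederation=True)
route2 = Origin(2, in_confederation=True)
route3 = Origin(3, in_confederation=False)

print(med_compared(route1, route2))                   # default behavior
print(med_compared(route1, route2, med_confed=True))  # after bgp bestpath med confed
print(med_compared(route1, route3, med_confed=True))  # Route 3 is outside the confederation
```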
Capability Negotiation
The router accepts connections from peers that perform capability negotiation. Capabilities are negotiated by means of the open messages that are exchanged when the session is established. The router supports the following capabilities:
- Cisco-proprietary route refresh: Capability code 128
- Cooperative route filtering: Capability code 3
- Deprecated dynamic capability negotiation: Capability code 66
- Dynamic capability negotiation: Capability code 67
- Four-octet AS numbers: Capability code 65
- Graceful restart: Capability code 64
- Multiprotocol extensions: Capability code 1
  - Address family IPv4 unicast: AFI 1, SAFI 1
  - Address family IPv4 multicast: AFI 1, SAFI 2
  - Address family IPv4 unicast and multicast: AFI 1, SAFI 3
  - Address family VPN-IPv4 unicast: AFI 1, SAFI 128
The router advertises these capabilities, except for the cooperative route filtering capability, by default. You can prevent the advertisement of specific capabilities with the no neighbor capability command. You can also use this command to prevent all capability negotiation with the specified peer.
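To make the capability codes concrete, the following sketch encodes a few of them the way they would appear in the capabilities optional parameter of an open message. The layout (one-octet code, one-octet length, then the value; AFI, a reserved octet, and SAFI for multiprotocol extensions) follows the BGP capabilities advertisement and multiprotocol extensions standards as understood here, not anything stated in this guide, and the function names are hypothetical.

```python
import struct

CAP_MULTIPROTOCOL = 1
CAP_FOUR_OCTET_AS = 65


def capability(code: int, value: bytes = b"") -> bytes:
    """Encode one capability as code (1 octet), length (1 octet), value."""
    return struct.pack("!BB", code, len(value)) + value


def multiprotocol(afi: int, safi: int) -> bytes:
    # Value is AFI (2 octets), a reserved octet set to 0, and SAFI (1 octet).
    return capability(CAP_MULTIPROTOCOL, struct.pack("!HBB", afi, 0, safi))


def four_octet_as(asn: int) -> bytes:
    # Value is the speaker's AS number as a 4-octet integer.
    return capability(CAP_FOUR_OCTET_AS, struct.pack("!I", asn))


# IPv4 unicast, VPN-IPv4 unicast, and four-octet AS support for AS 303.
caps = multiprotocol(1, 1) + multiprotocol(1, 128) + four_octet_as(303)
print(caps.hex())
```

Each multiprotocol capability occupies six octets, so a speaker advertising several address families sends one capability per AFI/SAFI pair.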
Cooperative Route Filtering
The cooperative route filtering capability, also referred to as outbound route filtering (ORF), enables a BGP speaker to send an inbound route filter to a peer and have the peer install it as an outbound filter on the remote end of the session.
You must specify both the type of inbound filter (ORF type) and the direction of the ORF capability. The router currently supports prefix lists as the inbound filter sent by the BGP speaker; the filter can be a prefix list or a Cisco-proprietary prefix list. The BGP speaker must indicate whether it will send inbound filters to peers, accept inbound filters from peers, or both. The router supports both standard and Cisco-proprietary ORF messages.
Dynamic Capability Negotiation
If both peers acknowledge support of dynamic capability negotiation, then at any subsequent point after the session is established, either peer can send a capabilities message to the other indicating a desire to negotiate another capability or to remove a previously negotiated capability.
The data field of the capability message contains a list of all the capabilities that can be dynamically negotiated. In earlier versions, now deprecated, the data field did not carry this information. Use the dynamic-capability-negotiation keyword to include the list. Use the deprecated-dynamic-capability-negotiation keyword to exclude sending the list.
Nondynamic capability negotiation is supported for the cooperative route filtering, four-octet AS numbers, deprecated dynamic capability negotiation, and dynamic capability negotiation capabilities. Dynamic capability negotiation of these capabilities is not supported.
If both sides of the connection advertise support for the new dynamic capability negotiation capability, then the peers negotiate which capabilities are dynamic and which are not.
If both sides of the connection advertise support only for the deprecated dynamic capability negotiation, then the BGP speaker uses dynamic capability negotiation for all capabilities that allow it without attempting to negotiate this with the peer.
Four-Octet AS Numbers
BGP speakers that support four-octet AS and sub-AS numbers are sometimes referred to as "new" speakers. The four-octet AS numbers are employed by the AS-path and aggregator attributes. "Old" speakers are those that do not support the four-octet numbers.
Two new optional transitive attributes, new-as-path and new-aggregator, are used to carry the four-octet numbers across the old speakers. A new speaker communicating with an old speaker sends the new attributes with the four-octet numbers for locally originated and propagated routes. The old speaker propagates the new attributes for received routes. The new speaker also sends the AS-path and aggregator attributes with two-octet numbers; any AS number greater than 65535 is replaced with a reserved AS number, 23456.
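The substitution applied to the two-octet AS-path can be sketched in a few lines. The function name and the sample path are hypothetical; the reserved value 23456 and the 65535 boundary are those given above.

```python
from typing import List

AS_TRANS = 23456        # reserved AS number that stands in for four-octet values
MAX_TWO_OCTET = 65535


def to_two_octet_path(as_path: List[int]) -> List[int]:
    """Build the two-octet AS-path that a new speaker sends to an old speaker."""
    return [asn if asn <= MAX_TWO_OCTET else AS_TRANS for asn in as_path]


# Hypothetical path containing one four-octet AS number (70000).
four_octet_path = [303, 70000, 73]
print(to_two_octet_path(four_octet_path))   # [303, 23456, 73]
```

The original path, with 70000 intact, travels alongside in the new-as-path attribute so that a new speaker farther along can recover it.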
Graceful Restarts
When BGP restarts on a router, all of the router's BGP peers detect that the BGP session transitioned from up to down. The transition causes a routing flap throughout the network as the peers recalculate their best routes in light of the loss of routes from that peering session.
The BGP graceful restart capability reduces the network disruption that normally results from a peer session going down. If the session is with a peer that had previously advertised the graceful restart capability, the receiving BGP speaker marks all routes from that peer in the BGP routing table as stale. BGP keeps these stale routes for a limited time and continues to use these routes to forward traffic. To account for consecutive restarts, any routes from that peer that were already marked as stale by an earlier restart are deleted.
When the restarting peer reestablishes the session, the receiving BGP speaker replaces the stale routes with the fresh routes it receives from the peer. The restarted peer sends an End-of-RIB marker to signal when it has finished sending all its routes to the BGP speaker. Until this point, BGP has still been using the stale routes to forward traffic. Upon receipt of the End-of-RIB marker, the BGP speaker flushes any remaining stale routes from the restarted peer.
The End-of-RIB marker is an update message that contains no advertised or withdrawn prefixes; it is sent only to BGP speakers that have previously advertised the graceful restart capability.
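For IPv4 unicast, such an empty update is simply the smallest possible update message. The sketch below builds it under the standard BGP message layout (16-octet marker of all ones, two-octet length, one-octet type, then the withdrawn-routes length and total path attribute length, both zero). That layout comes from the BGP-4 and graceful restart standards rather than from this guide, the function names are hypothetical, and other address families, which signal End-of-RIB differently, are not modeled.

```python
import struct

MARKER = b"\xff" * 16   # BGP header marker, all ones
TYPE_UPDATE = 2


def end_of_rib_ipv4_unicast() -> bytes:
    """Build an update message with no withdrawn routes, attributes, or NLRI."""
    # Withdrawn routes length and total path attribute length are both 0.
    body = struct.pack("!HH", 0, 0)
    # Header is marker (16) + length field (2) + type field (1).
    length = len(MARKER) + 3 + len(body)
    return MARKER + struct.pack("!HB", length, TYPE_UPDATE) + body


def is_end_of_rib(message: bytes) -> bool:
    return message == end_of_rib_ipv4_unicast()


marker = end_of_rib_ipv4_unicast()
print(len(marker), is_end_of_rib(marker))   # 23 True
```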
The receiving speaker also sends its own routes to the restarted speaker, and sends an End-of-RIB marker when it completes the update. The restarted peer defers reinitiating the BGP best-path selection process until it has received this marker from all peers with which it had a session in the established state and from which it had received an End-of-RIB marker before it restarted.
After running the selection process to pick the best route to all prefixes using the fresh routes, BGP then installs the best routes in the IP routing table on the restarted peer. Any of these that are best overall routes to a prefix are then pushed by the router to the forwarding tables on the line modules.
By waiting for all restarted peers to send the End-of-RIB marker, BGP risks delaying the initiation of the best path decision process indefinitely due to a single very slow peer. For a specific peer, you can avoid this delay by hard clearing the peer or issuing the clear ip bgp wait-end-of-rib command. Either method removes that peer from the set of peers for which BGP is awaiting an End-of-RIB marker. Alternatively, you can minimize this effect by using the bgp graceful-restart path-selection-defer-time-limit command to specify a maximum period that the restarted peer waits for the marker from its peers.
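The interaction between slow peers and the defer time limit can be modeled roughly as follows. This is a timing sketch only, with times expressed in seconds since the restart; the function name and the first two peer addresses are hypothetical, and the 120-second default is the documented default of bgp graceful-restart path-selection-defer-time-limit.

```python
from typing import Dict, Optional

DEFAULT_DEFER_LIMIT = 120   # seconds


def selection_start(eor_arrival: Dict[str, Optional[int]],
                    defer_limit: int = DEFAULT_DEFER_LIMIT) -> int:
    """Return when the restarted peer begins best-path selection.

    eor_arrival maps each awaited peer to the time its End-of-RIB marker
    arrives, or to None if the marker never arrives.
    """
    if not eor_arrival:
        return 0                      # no peers to wait for
    if any(t is None for t in eor_arrival.values()):
        return defer_limit            # a silent peer is cut off by the limit
    return min(max(eor_arrival.values()), defer_limit)


waiting = {"10.1.1.1": 15, "10.2.2.2": 40, "192.168.1.158": None}
print(selection_start(waiting))       # 120: bounded by the defer time limit

# Models clear ip bgp 192.168.1.158 wait-end-of-rib (or a hard clear of the peer).
del waiting["192.168.1.158"]
print(selection_start(waiting))       # 40: set by the slowest remaining peer
```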
Note that the receiving peer does not defer its best-path selection process while waiting for a restarted peer to reestablish a session. The receiving peer continues to use the stale routes from the restarted peer in the decision process. When it flushes stale routes, the receiving peer then uses the freshly updated routes.
A restarting peer must bring the session back up and refresh its routes within a limited period, or BGP on the receiving peer will flush all the stale routes. When a BGP speaker advertises the graceful restart capability, it also advertises how long it expects to take to reestablish a session if it restarts. If the session is not reestablished within this restart period, the speaker's peers flush the stale routes from the speaker. You can use the bgp graceful-restart restart-time command to modify the restart period advertised to all peers; the neighbor graceful-restart restart-time command modifies the restart period advertised to specific peers or peer groups. A receiving peer starts the timer as soon as it recognizes that the session with the restarting peer has transitioned to down.
The receiving peer also has a configurable timer that starts when it recognizes that the session with the restarting peer has gone down. The bgp graceful-restart stalepaths-time command determines how long a receiving peer is willing to use stale paths from any restarted peer; the neighbor graceful-restart stalepaths-time command does the same for a specified restarted peer or peer group. If the receiving peer does not receive an End-of-RIB marker from the restarted peer before the stalepaths timer expires, the receiving peer flushes all stale routes from the peer.
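The two timers on the receiving peer can be combined into one rough timing model. The sketch assumes both timers start when the session goes down, as described above, and expresses all times in seconds since that moment; the function name and parameters are hypothetical, and the sample values use the default restart time (120) and stalepaths time (360).

```python
from typing import Optional


def stale_flush_time(restart_time: int, stalepaths_time: int,
                     reestablished_at: Optional[int],
                     end_of_rib_at: Optional[int]) -> int:
    """Return when the receiving peer flushes stale routes from a restarted peer.

    reestablished_at and end_of_rib_at are None if the event never happens.
    """
    # Session not back up within the advertised restart time: flush at expiry.
    if reestablished_at is None or reestablished_at > restart_time:
        return min(restart_time, stalepaths_time)
    # Session is back: flush remaining stale routes on End-of-RIB, or when the
    # stalepaths timer expires, whichever comes first.
    if end_of_rib_at is None or end_of_rib_at > stalepaths_time:
        return stalepaths_time
    return end_of_rib_at


print(stale_flush_time(120, 360, reestablished_at=60, end_of_rib_at=200))     # 200
print(stale_flush_time(120, 360, reestablished_at=None, end_of_rib_at=None))  # 120
print(stale_flush_time(120, 360, reestablished_at=60, end_of_rib_at=None))    # 360
```

The third case shows why the stalepaths time should be longer than the restart time: it covers a peer that brings the session back up promptly but never finishes sending fresh routes.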
bgp graceful-restart
- Use to enable the BGP graceful restart capability.
- Advertisement of the graceful restart capability is enabled by default.
- The no neighbor capability negotiation command prevents the advertisement of all BGP capabilities, including graceful restart, to the specified peers.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#bgp graceful-restart
- Use the no version to disable advertisement of the graceful restart capability. Use the default version to restore the default condition, advertising this capability.
bgp graceful-restart path-selection-defer-time-limit
- Use to set the maximum time a restarted BGP speaker defers reinitiating the best-path selection process.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#bgp graceful-restart path-selection-defer-time-limit 180
- Use the no version to restore the default value, 120 seconds.
bgp graceful-restart restart-time
- Use to set the time BGP advertises to all peers within which it expects to reestablish a session after restarting. Peers flush stale routes from the speaker if the session is not restarted within this period.
- Specify an interval shorter than the stalepaths time.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#bgp graceful-restart restart-time 240
- Use the no version to restore the default value, 120 seconds.
bgp graceful-restart stalepaths-time
- Use to set the maximum time BGP waits to receive an End-of-RIB marker from any restarted peer before flushing all remaining stale routes from that peer. The timer begins when BGP recognizes that the peer session has gone down.
- This command prevents an excessive delay in BGP reconvergence due to a peer that brings a session back up but is slow to send fresh routes.
- Specify an interval longer than the restart time.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#bgp graceful-restart stalepaths-time 480
- Use the no version to restore the default value, 360 seconds.
clear ip bgp wait-end-of-rib
- Use to clear a peer or peer group from the set of peers for which BGP is waiting to receive an End-of-RIB marker after a peer restart.
- Alternatively, performing a hard clear of a peer without this keyword has the same effect.
- This command takes effect immediately.
- Example
host1#clear ip bgp 192.168.1.158 wait-end-of-rib
- There is no no version.
neighbor graceful-restart
- Use to control the advertisement of the BGP graceful restart capability for specified peers or peer groups.
- Advertisement of the graceful restart capability is enabled by default.
- The no neighbor capability negotiation command prevents the advertisement of all BGP capabilities, including graceful restart, to the specified peers, but does not affect global advertisement of the graceful restart capability.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#no neighbor 10.21.3.5 graceful-restart
- Use the no version to disable advertisement of the graceful restart capability. Use the default version to remove the explicit configuration from the peer or peer group and reestablish inheritance of the capability configuration.
neighbor graceful-restart restart-time
- Use to set the time BGP advertises to specified peers or peer groups within which it expects to reestablish a session after restarting. Peers flush stale routes from the speaker if the session is not restarted within this period.
- Specify an interval shorter than the stalepaths time.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#neighbor graceful-restart restart-time 240
- Use the no version to restore the default value, 120 seconds.
neighbor graceful-restart stalepaths-time
- Use to set the maximum time BGP waits to receive an End-of-RIB marker from the specified restarted peer or peer group before flushing all remaining stale routes from that peer. The timer begins when BGP recognizes that the peer session has gone down.
- This command prevents an excessive delay in BGP reconvergence due to a peer that brings a session back up but is slow to send fresh routes.
- Specify an interval longer than the restart time.
- This command takes effect immediately and automatically bounces the session.
- Example
host1(config-router)#neighbor graceful-restart stalepaths-time 480
- Use the no version to restore the default value, 360 seconds.
Route Refresh
If the router detects that a peer supports both Cisco-proprietary and standard route refresh messages, it will prefer to use the standard route refresh messages.
neighbor capability
- Use to control the advertisement of BGP capabilities to peers. Capability negotiation and advertisement of all capabilities are enabled by default.
- You can specify the deprecated-dynamic-capability-negotiation, dynamic-capability-negotiation, four-octet-as-numbers, orf, route-refresh, and route-refresh-cisco capabilities. The graceful restart capability is controlled by specific graceful-restart commands.
- If you specify a BGP peer group by using the peerGroupName argument, all the members of the peer group inherit the characteristic configured with this command unless it is overridden for a specific peer.
- You cannot configure the receive direction for the orf capability for a peer that is a member of a peer group or for a peer group.
- If you issue the route-refresh or route-refresh-cisco keywords, the command takes effect immediately. If dynamic capability negotiation was negotiated for the session, a capability message is sent to inform the peer of the new capability configuration. If dynamic capability negotiation was not negotiated for the session, the session is bounced automatically.
- If you issue the deprecated-dynamic-capability-negotiation, dynamic-capability-negotiation, four-octet-as-numbers, negotiation, or orf keywords, the command takes effect immediately and bounces the session.
- If the BGP speaker receives a capability message for a capability that BGP did not previously advertise in the dynamic capability negotiation capability, BGP sends a notification to the peer with the error code "capability message error" and error subcode "unsupported capability code."
- IPv6 ORF prefix lists are not supported. Therefore, you can specify an IPv6 address with the orf keyword only within the IPv4 address family and only when you want to advertise IPv4 routes to IPv6 peers.
- Example
host1(config-router)#neighbor 10.6.2.5 capability orf prefix-list both
- Use the no version to prevent advertisement of the specified capability or use the negotiation keyword with the no version to prevent all capability negotiation with the specified peer. Use the default version to restore the default, advertising the capability.