BGP is an exterior gateway protocol (EGP) that is used to exchange routing information among routers in different autonomous systems (ASs). BGP routing information includes the complete route to each destination. BGP uses the routing information to maintain a database of network reachability information, which it exchanges with other BGP systems. BGP uses the network reachability information to construct a graph of AS connectivity, which enables BGP to remove routing loops and enforce policy decisions at the AS level.
Multiprotocol BGP (MBGP) extensions enable BGP to support IP version 6 (IPv6). MBGP defines the attributes MP_REACH_NLRI and MP_UNREACH_NLRI, which are used to carry IPv6 reachability information. Network layer reachability information (NLRI) update messages carry IPv6 address prefixes of feasible routes.
BGP allows for policy-based routing. You can use routing policies to choose among multiple paths to a destination and to control the redistribution of routing information.
BGP uses TCP as its transport protocol, using port 179 for establishing connections. Running over a reliable transport protocol eliminates the need for BGP to implement update fragmentation, retransmission, acknowledgment, and sequencing.
The Junos OS routing protocol software supports BGP version 4. This version of BGP adds support for Classless Interdomain Routing (CIDR), which eliminates the concept of network classes. Instead of assuming which bits of an address represent the network by looking at the first octet, CIDR allows you to explicitly specify the number of bits in the network address, thus providing a means to decrease the size of the routing tables. BGP version 4 also supports aggregation of routes, including the aggregation of AS paths.
This section discusses the following topics:
An autonomous system (AS) is a set of routers that are under a single technical administration and normally use a single interior gateway protocol and a common set of metrics to propagate routing information within the set of routers. To other ASs, an AS appears to have a single, coherent interior routing plan and presents a consistent picture of what destinations are reachable through it.
AS Paths and Attributes
The routing information that BGP systems exchange includes the complete route to each destination, as well as additional information about the route. The route to each destination is called the AS path, and the additional route information is included in path attributes. BGP uses the AS path and the path attributes to completely determine the network topology. Once BGP understands the topology, it can detect and eliminate routing loops and select among groups of routes to enforce administrative preferences and routing policy decisions.
External and Internal BGP
BGP supports two types of exchanges of routing information: exchanges among different ASs and exchanges within a single AS. When used among ASs, BGP is called external BGP (EBGP) and BGP sessions perform inter-AS routing. When used within an AS, BGP is called internal BGP (IBGP) and BGP sessions perform intra-AS routing. Figure 1 illustrates ASs, IBGP, and EBGP.
A BGP system shares network reachability information with adjacent BGP systems, which are referred to as neighbors or peers.
BGP systems are arranged into groups. In an IBGP group, all peers in the group—called internal peers—are in the same AS. Internal peers can be anywhere in the local AS and do not have to be directly connected to one another. Internal groups use routes from an IGP to resolve forwarding addresses. They also propagate external routes among all other internal routers running IBGP, computing the next hop by taking the BGP next hop received with the route and resolving it using information from one of the interior gateway protocols.
In an EBGP group, the peers in the group—called external peers—are in different ASs and normally share a subnet. In an external group, the next hop is computed with respect to the interface that is shared between the external peer and the local router.
Multiple Instances of BGP
You can configure multiple instances of BGP at the following hierarchy levels:
[edit routing-instances routing-instance-name protocols]
[edit logical-systems logical-system-name routing-instances routing-instance-name protocols]
Multiple instances of BGP are primarily used for Layer 3 VPN support.
IGP peers and external BGP (EBGP) peers (both nonmultihop and multihop) are all supported for routing instances. BGP peering is established over one of the interfaces configured under the routing-instances hierarchy.
When a BGP neighbor sends BGP messages to the local routing device, the incoming interface on which these messages are received must be configured in the same routing instance that the BGP neighbor configuration exists in. This is true for neighbors that are a single hop away or multiple hops away.
Routes learned from the BGP peer are added to the instance-name.inet.0 table by default. You can configure import and export policies to control the flow of information into and out of the instance routing table.
For Layer 3 VPN support, configure BGP on the provider edge (PE) router to receive routes from the customer edge (CE) router and to send the instances’ routes to the CE router if necessary. You can use multiple instances of BGP to maintain separate per-site forwarding tables for keeping VPN traffic separate on the PE router.
You can configure import and export policies that allow the service provider to control and rate-limit traffic to and from the customer.
You can configure an EBGP multihop session for a VRF routing instance. Also, you can set up the EBGP peer between the PE and CE routers by using the loopback address of the CE router instead of the interface addresses.
BGP Routes Overview
A BGP route is a destination, described as an IP address prefix, and information that describes the path to the destination.
The following information describes the path:
AS path, which is a list of numbers of the ASs that a route passes through to reach the local router. The first number in the path is that of the last AS in the path—the AS closest to the local router. The last number in the path is the AS farthest from the local router, which is generally the origin of the path.
Path attributes, which contain additional information about the AS path that is used in routing policy.
BGP peers advertise routes to each other in update messages.
BGP stores its routes in the Junos OS routing table (inet.0). The routing table stores the following information about BGP routes:
Routing information learned from update messages received from peers
Local routing information that BGP applies to routes because of local policies
Information that BGP advertises to BGP peers in update messages
For each prefix in the routing table, the routing protocol process selects a single best path, called the active path. Unless you configure BGP to advertise multiple paths to the same destination, BGP advertises only the active path.
The BGP router that first advertises a route assigns it one of the following values to identify its origin. During route selection, the lowest origin value is preferred.
0—The router originally learned the route through an IGP (OSPF, IS-IS, or a static route).
1—The router originally learned the route through an EGP (most likely BGP).
2—The route's origin is unknown.
BGP Route Resolution Overview
An internal BGP (IBGP) route with a next-hop address to a remote BGP neighbor (protocol next hop) must have its next hop resolved using some other route. BGP adds this route to the rpd resolver module for next-hop resolution. If RSVP is used in the network, then the BGP next hop is resolved using the RSVP ingress route. This results in the BGP route pointing to an indirect next hop, and the indirect next hop pointing to a forwarding next hop. The forwarding next hop is derived from the RSVP route next hop. There is often a large set of internal BGP routes that have the same protocol next hop, and in such cases, the set of BGP routes would reference the same indirect next hop.
Prior to Junos OS Release 17.2R1, the resolver module of the routing protocol process (rpd) resolved routes within the IBGP received routes in the following ways:
- Partial route resolution—The protocol next hop is resolved based on helper routes, such as RSVP or IGP routes. The metric values are derived from the helper routes, and the next hop is referred to as the resolver forwarding next hop inherited from helper routes. These metric values are used for selecting routes in the routing information base (RIB), also known as the routing table.
- Complete route resolution—The final next hop is derived and is referred to as the kernel routing table (KRT) forwarding next hop based on the forwarding export policy.
Starting in Junos OS Release 17.2R1, the resolver module of rpd is optimized to increase the throughput of inbound processing flow, accelerating the learning rate of RIB and FIB. With this enhancement, the route resolution is affected as follows:
The partial and complete route resolution methods are triggered for each IBGP route, although each route might inherit the same resolved forwarding next hop or KRT forwarding next hops.
The BGP path selection is deferred until complete route resolution is performed for network layer reachability information (NLRI) received from a BGP neighbor, which might not be the best route in the RIB after route selection.
The benefits of the rpd resolver optimization include:
Lower RIB resolution lookup cost—The output of the resolved path is saved in a resolver cache, so that the same derived next hop and metric values can be inherited to another set of routes sharing the same path behavior instead of performing both partial and complete route resolution flow. This reduces the route resolution lookup cost by maintaining only the most frequent resolver state in a cache with limited depth.
BGP route selection optimization—The BGP route selection algorithm is triggered twice for every IBGP route received—first, while adding the route in the RIB with the next hop as unusable, and second, while adding the route with a resolved next hop in the RIB (after route resolution). This results in selecting the best route twice. With the resolver optimization, the route selection process is triggered in the receive flow only after getting the next-hop information from the resolver module.
Internal caching to avoid frequent lookup—The resolver cache maintains the most frequent resolver state, and as a result, the lookup functionality, such as next-hop lookup and route lookup is done from the local cache.
Path equivalence group—When different paths share the same forwarding state, or are received from the same protocol next hop, the paths can belong to one path equivalence group. This approach avoids the need to perform of complete route resolution for such paths. When a new path requires complete route resolution, it is first looked up in the path equivalence group database, which contains the resolved path output, such as indirect next hop and forwarding next hop.
BGP Messages Overview
All BGP messages have the same fixed-size header, which contains a marker field that is used for both synchronization and authentication, a length field that indicates the length of the packet, and a type field that indicates the message type (for example, open, update, notification, keepalive, and so on).
This section discusses the following topics:
After a TCP connection is established between two BGP systems, they exchange BGP open messages to create a BGP connection between them. Once the connection is established, the two systems can exchange BGP messages and data traffic.
Open messages consist of the BGP header plus the following fields:
Version—The current BGP version number is 4.
Local AS number—You configure this by including the autonomous-system statement at the [edit routing-options] or [edit logical-systems logical-system-name routing-options] hierarchy level.
Hold time—Proposed hold-time value. You configure the local hold time with the BGP hold-time statement.
BGP identifier—IP address of the BGP system. This address is determined when the system starts and is the same for every local interface and every BGP peer. You can configure the BGP identifier by including the router-id statement at the [edit routing-options] or [edit logical-systems logical-system-name routing-options] hierarchy level. By default, BGP uses the IP address of the first interface it finds in the router.
Parameter field length and the parameter itself—These are optional fields.
BGP systems send update messages to exchange network reachability information. BGP systems use this information to construct a graph that describes the relationships among all known ASs.
Update messages consist of the BGP header plus the following optional fields:
Unfeasible routes length—Length of the withdrawn routes field
Withdrawn routes—IP address prefixes for the routes being withdrawn from service because they are no longer deemed reachable
Total path attribute length—Length of the path attributes field; it lists the path attributes for a feasible route to a destination
Path attributes—Properties of the routes, including the path origin, the multiple exit discriminator (MED), the originating system’s preference for the route, and information about aggregation, communities, confederations, and route reflection
Network layer reachability information (NLRI)—IP address prefixes of feasible routes being advertised in the update message
BGP systems exchange keepalive messages to determine whether a link or host has failed or is no longer available. Keepalive messages are exchanged often enough so that the hold timer does not expire. These messages consist only of the BGP header.
BGP systems send notification messages when an error condition is detected. After the message is sent, the BGP session and the TCP connection between the BGP systems are closed. Notification messages consist of the BGP header plus the error code and subcode, and data that describes the error.
BGP systems send route-refresh messages to a peer only if they have received the route refresh capability advertisement from the peer. A BGP system must advertise the route refresh capability to its peers using BGP capabilities advertisement if it wants to receive route-refresh messages. This optional message is sent to request dynamic, inbound, BGP route updates from BGP peers or to send outbound route updates to a BGP peer.
Route-refresh messages consist of the following fields:
AFI—Address Family Identifier (16-bit).
Res—Reserved (8-bit) field, which must be set to 0 by the sender and ignored by the receiver.
SAFI—Subsequent Address Family Identifier (8-bit).
If a peer without the route-refresh capability receives a route-refresh request message from a remote peer, the receiver ignores the message.
Understanding BGP Path Selection
For each prefix in the routing table, the routing protocol process selects a single best path. After the best path is selected, the route is installed in the routing table. The best path becomes the active route if the same prefix is not learned by a protocol with a lower (more preferred) global preference value, also known as the administrative distance. The algorithm for determining the active route is as follows:
- Verify that the next hop can be resolved.
- Choose the path with the lowest preference value (routing
protocol process preference).
Routes that are not eligible to be used for forwarding (for example, because they were rejected by routing policy or because a next hop is inaccessible) have a preference of –1 and are never chosen.
- Prefer the path with higher local preference.
For non-BGP paths, choose the path with the lowest preference2 value.
- If the accumulated interior gateway protocol (AIGP) attribute is enabled, prefer the path with the lower AIGP attribute.
- Prefer the path with the shortest autonomous system (AS)
path value (skipped if the as-path-ignore statement is
A confederation segment (sequence or set) has a path length of 0. An AS set has a path length of 1.
- Prefer the route with the lower origin code.
Routes learned from an IGP have a lower origin code than those learned from an exterior gateway protocol (EGP), and both have lower origin codes than incomplete routes (routes whose origin is unknown).
- Prefer the path with the lowest multiple exit discriminator
Depending on whether nondeterministic routing table path selection behavior is configured, there are two possible cases:
If nondeterministic routing table path selection behavior is not configured (that is, if the path-selection cisco-nondeterministic statement is not included in the BGP configuration), for paths with the same neighboring AS numbers at the front of the AS path, prefer the path with the lowest MED metric. To always compare MEDs whether or not the peer ASs of the compared routes are the same, include the path-selection always-compare-med statement.
If nondeterministic routing table path selection behavior is configured (that is, the path-selection cisco-nondeterministic statement is included in the BGP configuration), prefer the path with the lowest MED metric.
Confederations are not considered when determining neighboring ASs. A missing MED metric is treated as if a MED were present but zero.
MED comparison works for single path selection within an AS (when the route does not include an AS path), though this usage Is uncommon.
By default, only the MEDs of routes that have the same peer autonomous systems (ASs) are compared. You can configure routing table path selection options to obtain different behaviors.
- Prefer strictly internal paths, which include IGP routes and locally generated routes (static, direct, local, and so forth).
- Prefer strictly external BGP (EBGP) paths over external paths learned through internal BGP (IBGP) sessions.
- Prefer the path whose next hop is resolved through the
IGP route with the lowest metric.
A path is considered a BGP equal-cost path (and will be used for forwarding) if a tie-break is performed after the previous step. All paths with the same neighboring AS, learned by a multipath-enabled BGP neighbor, are considered.
BGP multipath does not apply to paths that share the same MED-plus-IGP cost yet differ in IGP cost. Multipath path selection is based on the IGP cost metric, even if two paths have the same MED-plus-IGP cost.
BGP compares the type of IGP metric before comparing the metric value itself in rt_metric2_cmp. For example, BGP routes that are resolved through IGP are preferred over discarded or rejected next-hops that are of type RTM_TYPE_UNREACH. Such routes are declared inactive because of their metric-type.
- If both paths are external, prefer the currently active
path to minimize route-flapping. This rule is not used if any one
of the following conditions is true:
path-selection external-router-id is configured.
Both peers have the same router ID.
Either peer is a confederation peer.
Neither path is the current active path.
- Prefer a primary route over a secondary route. A primary route is one that belongs to the routing table. A secondary route is one that is added to the routing table through an export policy.
- Prefer the path from the peer with the lowest router ID. For any path with an originator ID attribute, substitute the originator ID for the router ID during router ID comparison.
- Prefer the path with the shortest cluster list length. The length is 0 for no list.
- Prefer the path from the peer with the lowest peer IP address.
Routing Table Path Selection
The shortest AS path step of the algorithm, by default, evaluates the length of the AS path and determines the active path. You can configure an option that enables Junos OS to skip this step of the algorithm by including the as-path-ignore option.
Starting with Junos OS Release 14.1R8, 14.2R7, 15.1R4, 15.1F6, and 16.1R1, the as-path-ignore option is supported for routing instances.
The routing process path selection takes place before BGP hands off the path to the routing table to makes its decision. To configure routing table path selection behavior, include the path-selection statement:
For a list of hierarchy levels at which you can include this statement, see the statement summary section for this statement.
Routing table path selection can be configured in one of the following ways:
Emulate the Cisco IOS default behavior (cisco-non-deterministic). This mode evaluates routes in the order that they are received and does not group them according to their neighboring AS. With cisco-non-deterministic mode, the active path is always first. All inactive, but eligible, paths follow the active path and are maintained in the order in which they were received, with the most recent path first. Ineligible paths remain at the end of the list.
As an example, suppose you have three path advertisements for the 192.168.1.0 /24 route:
Path 1—learned through EBGP; AS Path of 65010; MED of 200
Path 2—learned through IBGP; AS Path of 65020; MED of 150; IGP cost of 5
Path 3—learned through IBGP; AS Path of 65010; MED of 100; IGP cost of 10
These advertisements are received in quick succession, within a second, in the order listed. Path 3 is received most recently, so the routing device compares it against path 2, the next most recent advertisement. The cost to the IBGP peer is better for path 2, so the routing device eliminates path 3 from contention. When comparing paths 1 and 2, the routing device prefers path 1 because it is received from an EBGP peer. This allows the routing device to install path 1 as the active path for the route.
We do not recommend using this configuration option in your network. It is provided solely for interoperability to allow all routing devices in the network to make consistent route selections.
Always comparing MEDs whether or not the peer ASs of the compared routes are the same (always-compare-med).
Override the rule that If both paths are external, the currently active path is preferred (external-router-id). Continue with the next step (Step 12) in the path-selection process.
Adding the IGP cost to the next-hop destination to the MED value before comparing MED values for path selection (med-plus-igp).
BGP multipath does not apply to paths that share the same MED-plus-IGP cost, yet differ in IGP cost. Multipath path selection is based on the IGP cost metric, even if two paths have the same MED-plus-IGP cost.
BGP Table path selection
The following parameters are followed for BGP's path selection:
- Prefer the highest local-preference value.
- Prefer the shortest AS-path length.
- Prefer the lowest origin value.
- Prefer the lowest MED value.
- Prefer routes learned from an EBGP peer over an IBGP peer.
- Prefer best exit from AS.
- For EBGP-received routes, prefer the current active route.
- Prefer routes from the peer with the lowest Router ID.
- Prefer paths with the shortest cluster length.
- Prefer routes from the peer with the lowest peer IP address. Steps 2, 6 and 12 are the RPD criteria.
Effects of Advertising Multiple Paths to a Destination
BGP advertises only the active path, unless you configure BGP to advertise multiple paths to a destination.
Suppose a routing device has in its routing table four paths to a destination and is configured to advertise up to three paths (add-path send path-count 3). The three paths are chosen based on path selection criteria. That is, the three best paths are chosen in path-selection order. The best path is the active path. This path is removed from consideration and a new best path is chosen. This process is repeated until the specified number of paths is reached.
Supported Standards for BGP
Junos OS substantially supports the following RFCs and Internet drafts, which define standards for IP version 4 (IPv4) BGP.
For a list of supported IP version 6 (IPv6) BGP standards, see Supported IPv6 Standards.
Junos OS BGP supports authentication for protocol exchanges (MD5 authentication).
RFC 1745, BGP4/IDRP for IP—OSPF Interaction
RFC 1772, Application of the Border Gateway Protocol in the Internet
RFC 1997, BGP Communities Attribute
RFC 2283, Multiprotocol Extensions for BGP-4
RFC 2385, Protection of BGP Sessions via the TCP MD5 Signature Option
RFC 2439, BGP Route Flap Damping
RFC 2545, Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
RFC 2796, BGP Route Reflection – An Alternative to Full Mesh IBGP
RFC 2858, Multiprotocol Extensions for BGP-4
RFC 2918, Route Refresh Capability for BGP-4
RFC 3065, Autonomous System Confederations for BGP
RFC 3107, Carrying Label Information in BGP-4
RFC 3345, Border Gateway Protocol (BGP) Persistent Route Oscillation Condition
RFC 3392, Capabilities Advertisement with BGP-4
RFC 4271, A Border Gateway Protocol 4 (BGP-4)
RFC 4273, Definitions of Managed Objects for BGP-4
RFC 4360, BGP Extended Communities Attribute
RFC 4364, BGP/MPLS IP Virtual Private Networks (VPNs)
RFC 4456, BGP Route Reflection: An Alternative to Full Mesh Internal BGP (IBGP)
RFC 4486, Subcodes for BGP Cease Notification Message
RFC 4659, BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN
RFC 4632, Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan
RFC 4684, Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks (VPNs)
RFC 4724, Graceful Restart Mechanism for BGP
RFC 4760, Multiprotocol Extensions for BGP-4
RFC 4781, Graceful Restart Mechanism for BGP with MPLS
RFC 4798, Connecting IPv6 Islands over IPv4 MPLS Using IPv6 Provider Edge Routers (6PE)
Option 4b (eBGP redistribution of labeled IPv6 routes from AS to neighboring AS) is not supported.
RFC 4893, BGP Support for Four-octet AS Number Space
RFC 5004, Avoid BGP Best Path Transitions from One External to Another
RFC 5065, Autonomous System Confederations for BGP
RFC 5082, The Generalized TTL Security Mechanism (GTSM)
RFC 5291, Outbound Route Filtering Capability for BGP-4 (partial support)
RFC 5292, Address-Prefix-Based Outbound Route Filter for BGP-4 (partial support)
Devices running Junos OS can receive prefix-based ORF messages.
RFC 5396, Textual Representation of Autonomous System (AS) Numbers
RFC 5492, Capabilities Advertisement with BGP-4
RFC 5512, The BGP Encapsulation Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute
RFC 5549, Advertising IPv4 Network Layer Reachability Information with an IPv6 Next Hop
RFC 5575, Dissemination of flow specification rules
RFC 5668, 4-Octet AS Specific BGP Extended Community
RFC 6368, Internal BGP as the Provider/Customer Edge Protocol for BGP/MPLS IP Virtual Private Networks (VPNs)
RFC 6810, The Resource Public Key Infrastructure (RPKI) to Router Protocol
RFC 6811, BGP Prefix Origin Validation
RFC 6996, Autonomous System (AS) Reservation for Private Use
RFC 7300, Reservation of Last Autonomous System (AS) Numbers
RFC 7752, North-Bound Distribution of Link-State and Traffic Engineering (TE) Information Using BGP
RFC 7854, BGP Monitoring Protocol (BMP)
RFC 7911, Advertisement of Multiple Paths in BGP
RFC 8326, Graceful BGP session Shutdown
Internet draft draft-idr-rfc8203bis-00, BGP Administrative Shutdown Communication (expires October 2018)
Internet draft draft-ietf-grow-bmp-adj-rib-out-01, Support for Adj-RIB-Out in BGP Monitoring Protocol (BMP) (expires September 3, 2018)
Internet draft draft-ietf-idr-aigp-06, The Accumulated IGP Metric Attribute for BGP (expires December 2011)
Internet draft draft-ietf-idr-as0-06, Codification of AS 0 processing (expires February 2013)
Internet draft draft-ietf-idr-link-bandwidth-06.txt, BGP Link Bandwidth Extended Community (expires July 2013)
Internet draft draft-ietf-sidr-origin-validation-signaling-00, BGP Prefix Origin Validation State Extended Community (partial support) (expires May 2011)
The extended community (origin validation state) is supported in Junos OS routing policy. The specified change in the route selection procedure is not supported.
Internet draft draft-kato-bgp-ipv6-link-local-00.txt, BGP4+ Peering Using IPv6 Link-local Address
The following RFCs and Internet draft do not define standards, but provide information about BGP and related technologies. The IETF classifies them variously as “Experimental” or “Informational.”
RFC 1965, Autonomous System Confederations for BGP
RFC 1966, BGP Route Reflection—An alternative to full mesh IBGP
RFC 2270, Using a Dedicated AS for Sites Homed to a Single Provider
Internet draft draft-ietf-ngtrans-bgp-tunnel-04.txt, Connecting IPv6 Islands across IPv4 Clouds with BGP (expires July 2002)