Overview
The Border Gateway Protocol (BGP) provides loop-free interdomain routing between autonomous systems (ASs). This section describes the main concepts of BGP.
Conventions in This Chapter
Certain terms used with BGP, such as the names of attributes and messages, are typically expressed in all uppercase letters in the RFCs. For improved readability, those terms are represented in lowercase in this chapter. Table 1-1 lists the terms and their variant spellings.
Table 1-1 Conventions for BGP terms
Autonomous Systems
An autonomous system (AS) is a set of routers that use the same routing policy while running under a single technical administration. An AS runs interior gateway protocols (IGPs) such as RIP, OSPF, and IS-IS within its boundaries. ASs use exterior gateway protocols (EGPs) to exchange routing information with other ASs. BGP is an EGP.
The outside world views an AS as a single entity, even though it could be a collection of IGPs working together to provide routing within its interior.
Each AS has an identification number provided by an Internet registry or by an Internet service provider (ISP) that uniquely identifies it to the outside world.
BGP Speaker
A router that has been configured to run the BGP routing protocol is called a BGP speaker.
BGP Peers and Neighbors
Unlike some other routing protocols, BGP speakers do not automatically discover each other and begin exchanging information. Instead, each BGP speaker must be explicitly configured with a set of BGP peers with which it exchanges routing information. BGP peers do not have to be directly connected to each other in order to share a BGP session. Another term for BGP peer is BGP neighbor. A BGP peer group consists of two or more BGP peers that share a common set of update policies.
In Figure 1-1, router NY and router Chicago are peers. Router NY and router LA are peers. Router NY and router Boston are peers. Router NY and router Philly are not peers. Router Chicago and router LA are not peers.
Note: The figures in this chapter indicate a BGP session with a dotted line. A physical link is represented by a solid line.
![]()
BGP Session
When two BGP speakers have both been configured to be BGP peers of each other, they will establish a BGP session to exchange routing information. A BGP session is simply a TCP connection over which routing information is exchanged according to the rules of the BGP protocol.
Because BGP relies on TCP to provide reliable and flow-controlled transmission of routing information, BGP itself is very simple. However, this reliance also means that two routers can be BGP peers only if each is reachable from the other, in the sense that they can exchange IP packets.
In practice this means that either of the following must be true:
- The BGP peers must be connected to a common IP subnet.
- The BGP peers must be in the same AS, which runs an IGP enabling the BGP peers to reach each other.
IBGP and EBGP
When two BGP speakers are in the same autonomous system, the BGP session is called an internal BGP session, or IBGP session. When two BGP speakers are in different autonomous systems, the BGP session is called an external BGP session, or EBGP session. BGP uses the same types of message on IBGP and EBGP sessions, but the rules for when to send which message and how to interpret each message differ slightly; for this reason some people refer to IBGP and EBGP as two separate protocols.
IBGP requires that BGP speakers within an autonomous system be fully meshed, meaning that there must be a BGP session between each pair of peers within the AS. IBGP does not require that all the peers be physically connected. EBGP does not require full meshing of BGP speakers. EBGP sessions typically exist between peers that are physically connected.
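A consequence of the full-mesh requirement is that the number of IBGP sessions grows quadratically with the number of BGP speakers in the AS. The following Python sketch enumerates the required sessions. The function name and the router names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def ibgp_full_mesh(speakers):
    """Return every pair of speakers that needs an IBGP session in a full mesh."""
    return list(combinations(sorted(speakers), 2))


# Hypothetical router names; any hashable, sortable identifiers would do.
sessions = ibgp_full_mesh(["r1", "r2", "r3", "r4"])
for a, b in sessions:
    print(f"{a} <-> {b}")

# A full mesh of n speakers needs n * (n - 1) / 2 sessions.
print(len(sessions))  # 6
```

With 4 speakers the mesh needs 6 sessions; with 10 speakers it already needs 45.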
Figure 1-2 shows an example of the exchange of information between routers running IBGP and EBGP across multiple ASs.
![]()
Interior Gateway Protocols
Not all the routers within an AS have to be BGP peers. In large enterprise networks, for example, an AS typically contains many more non-BGP routers than BGP routers. These routers communicate using an interior gateway protocol (IGP) such as the following:
- Intermediate System-to-Intermediate System (IS-IS)
- Open Shortest Path First (OSPF)
- Routing Information Protocol (RIP)
Figure 1-3 shows that the routers in AS 53 all communicate with each other using an IGP. Routing information internal to AS 53 is redistributed from the IGP into BGP at router Chicago. Router Chicago redistributes into the IGP the routing information it receives from its external BGP peer, router Atlanta. Router Atlanta has an internal BGP link within its AS, and an external BGP link to router Topeka.
![]()
BGP Messages
BGP speakers exchange routing information with each other by exchanging BGP messages over a BGP session. BGP uses the following five message types:
- Open messages - When two BGP speakers establish a BGP session with each other, the first message they exchange after the underlying TCP session has been established is an open message. This message contains information that enables the two BGP peers to determine whether they want to establish a BGP session with each other (for example, the AS number of the BGP speaker) and to negotiate certain parameters for the BGP session (for example, how often to send a keepalive message).
- Update messages - The update message is the most important message in the BGP protocol. A BGP speaker sends update messages to announce routes to prefixes that it can reach and to withdraw routes to prefixes that it can no longer reach.
- Keepalive messages - BGP speakers periodically exchange keepalive messages to check whether the underlying TCP connection is still up.
- Notification messages - If a BGP speaker wishes to terminate a BGP session (either because it has been configured to do so or because it has detected some error condition), it will send a notification message to its peer specifying the reason for terminating the BGP session.
- Route-refresh messages - BGP speakers can send route-refresh messages to peers that advertise the route-refresh capability. The messages contain a request for the peer to resend its routes to the system. This feature enables the BGP speaker to apply modified or new policies to the routes when it receives them again.
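To make the message types concrete, the following Python sketch builds and parses the fixed header that precedes every BGP message. The layout (a 16-byte marker of all 1s, a 2-byte total length, and a 1-byte type) and the type codes are taken from RFC 4271 and RFC 2918 rather than from this chapter. The names `build_message` and `parse_header` are hypothetical helpers, and the sketch does not validate message bodies.

```python
import struct
from enum import IntEnum


class MessageType(IntEnum):
    """BGP message type codes (RFC 4271; route-refresh from RFC 2918)."""
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4
    ROUTE_REFRESH = 5


# Network byte order: 16-byte marker, 16-bit total length, 8-bit type.
HEADER = struct.Struct("!16sHB")
MARKER = b"\xff" * 16


def build_message(msg_type, body=b""):
    """Prefix a message body with the common BGP header."""
    return HEADER.pack(MARKER, HEADER.size + len(body), msg_type) + body


def parse_header(data):
    """Return (message type, total length) from the start of a message."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("marker field is not all 1s")
    return MessageType(msg_type), length


# A keepalive message consists of the header only, so it is 19 bytes long.
keepalive = build_message(MessageType.KEEPALIVE)
msg_type, length = parse_header(keepalive)
print(msg_type.name, length)
```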
BGP Route
A BGP route consists of two parts, a prefix and a set of path attributes. It is not uncommon to use the term path to refer to a BGP route, although that term technically refers to one of the path attributes of that route.
Routing Information Base
BGP routes are stored in a BGP speaker's routing information base (RIB), which conceptually consists of the following three parts:
- Adj-RIBs-In store unprocessed routes learned from update messages received by the BGP speaker.
- Loc-RIB contains local routes resulting from the BGP speaker applying its local policies to the routes contained in its Adj-RIBs-In.
- Adj-RIBs-Out store routes that the BGP speaker will advertise to its peers via the update messages it sends.
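The flow of routes through the three parts of the RIB can be sketched in Python as follows. This is a conceptual model only: the class and method names are hypothetical, the policies are plain callables, and the selection step compares a single `local_pref` value (with an assumed default of 100) instead of applying a full best-path algorithm.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int = 100  # assumed default, for illustration only


@dataclass
class Rib:
    adj_ribs_in: dict = field(default_factory=dict)   # peer -> {prefix: Route}
    loc_rib: dict = field(default_factory=dict)       # prefix -> Route
    adj_ribs_out: dict = field(default_factory=dict)  # peer -> {prefix: Route}

    def receive(self, peer, route):
        """Store an unprocessed route learned from a peer's update message."""
        self.adj_ribs_in.setdefault(peer, {})[route.prefix] = route

    def run_decision(self, import_policy):
        """Apply local policy to Adj-RIBs-In and rebuild Loc-RIB."""
        self.loc_rib.clear()
        for peer, routes in self.adj_ribs_in.items():
            for route in routes.values():
                accepted = import_policy(peer, route)
                if accepted is None:
                    continue  # policy rejected the route
                best = self.loc_rib.get(accepted.prefix)
                if best is None or accepted.local_pref > best.local_pref:
                    self.loc_rib[accepted.prefix] = accepted

    def advertise(self, peers, export_policy):
        """Fill Adj-RIBs-Out with the Loc-RIB routes each peer may receive."""
        for peer in peers:
            self.adj_ribs_out[peer] = {
                prefix: route
                for prefix, route in self.loc_rib.items()
                if export_policy(peer, route)
            }


rib = Rib()
rib.receive("peer-a", Route("10.1.1.0/24", local_pref=100))
rib.receive("peer-b", Route("10.1.1.0/24", local_pref=200))
rib.run_decision(lambda peer, route: route)          # accept everything
rib.advertise(["peer-c"], lambda peer, route: True)  # export everything
print(rib.loc_rib["10.1.1.0/24"].local_pref)  # 200
```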
Prefixes and CIDR
A prefix describes a set of IP addresses that can be reached using the route. For example, the prefix 10.1.1.0/24 indicates all IP addresses whose first 24 bits contain the value 10.1.1. The term network is sometimes used instead of prefix to describe a set of addresses. To reduce confusion, this chapter restricts network to its more common usage, to refer to a physical structure of routers and links.
Prefixes are made possible by classless interdomain routing (CIDR). CIDR addresses have largely replaced the concept of classful addresses (such as Class A, Class B, and Class C) in the Internet. Classful addresses have an implicit, fixed-length mask corresponding to the predefined class boundaries. For example, 172.16.0.0 is a Class B address with an implicit (or natural) mask of 255.255.0.0.
CIDR uses network prefixes and explicit masks, represented by a prefix length, enabling network prefixes of arbitrary lengths. CIDR represents the sample address above as 172.16.0.0/16. The /16 indicates that the high-order 16 bits (the first 16 bits counting from left to right) in the address mask are all 1s.
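The relationship between a prefix, its prefix length, and its mask can be checked with Python's standard `ipaddress` module. The sketch below reuses the 10.1.1.0/24 prefix mentioned earlier; the two host addresses and the helper `prefix_length_to_mask` are hypothetical and serve only as an illustration.

```python
import ipaddress

net = ipaddress.ip_network("10.1.1.0/24")
print(net.netmask)                               # 255.255.255.0
print(ipaddress.ip_address("10.1.1.77") in net)  # True: first 24 bits match
print(ipaddress.ip_address("10.1.2.77") in net)  # False: third octet differs


def prefix_length_to_mask(length):
    """Build a dotted-decimal IPv4 mask whose high-order `length` bits are 1s."""
    mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    return str(ipaddress.ip_address(mask))


print(prefix_length_to_mask(16))  # 255.255.0.0
```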
CIDR enables you to aggregate multiple classful addresses into a single classless advertisement, reducing the number of advertisements that must be made to provide full access to all the addresses. Suppose an ISP has customers with the following addresses:
Without CIDR, the ISP would have to advertise a route to each address, as shown in Figure 1-4.
![]()
With CIDR, the ISP can aggregate the routes as 192.168.128.0/17 and advertise a single route to that prefix, as shown in Figure 1-5.
![]()
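Aggregation of this kind can be reproduced with `ipaddress.collapse_addresses` from the Python standard library. The customer prefixes below are hypothetical. They are not the addresses from the example above and were chosen only because together they cover 192.168.128.0/17 exactly.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse contiguous prefixes into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Hypothetical customer prefixes, for illustration only.
customers = [
    "192.168.128.0/19",
    "192.168.160.0/19",
    "192.168.192.0/18",
]

print(aggregate(customers))  # ['192.168.128.0/17']
```

Prefixes collapse into one aggregate only when they are contiguous and fill the larger block completely; if a block is missing, `collapse_addresses` returns several prefixes instead of one.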
Path Attributes
A path attribute provides some additional information about a route. If a BGP speaker has more than one route to the same destination prefix, it selects one of those routes to use (the "best" route) based on the path attributes. BGP as implemented on the ERX system specifies detailed and complex criteria for picking the best route; this helps ensure that all routers will converge to the same routing table, a necessary behavior to avoid routing loops. See Selecting the Best Path on page 1-87 for more information.
The following are some of the most important path attributes:
- AS-path specifies the sequence of autonomous systems that must be crossed to reach a certain destination. This path attribute is used to avoid routing loops and to prefer shorter routes over longer routes.
- Next-hop specifies the IP address of the ingress router in the next autonomous system on the path to the destination.
- Local-pref and multiexit discriminator (MED) are metrics that administrators can tune to make certain routes more attractive than other routes. The local-pref attribute specifies a degree of preference that enables a router to select among multiple routes to the same prefix. The MED is used for ASs that have more than one connection to each other. The administrator of one AS sets the MED to express a degree of preference for one link versus another; the BGP peer in the other AS uses this MED to optimize traffic.
- Originator-ID specifies the IP address of the router that originates the route. The system ignores updates that have this attribute set to its own IP address.
- Atomic-aggregate and aggregator inform peers about actions taken by a BGP speaker regarding aggregation of routes. If a BGP speaker aggregates routes that have differing path attributes, it includes the atomic-aggregate attribute with the aggregated prefix to inform update recipients that they must not deaggregate the prefix. A BGP speaker aggregating routes can include the aggregator attribute to indicate the router and AS where the aggregation was performed.
- Community and extended community identify prefixes as sharing some common attribute, providing a means of grouping prefixes and enacting routing policies on the group of prefixes. A prefix can belong to more than one community. You can specify a community name as a 32-bit string, a standards-defined well-known community, or an AS number combined with a 32-bit number to create a unique identifier. An extended community name consists of either an IP address or an AS number, combined with a 32-bit or 16-bit number to create a unique identifier.
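The following Python sketch shows how a few of these attributes might drive loop avoidance and route preference. It is a deliberately simplified illustration and not the best-path algorithm implemented on the ERX system: it assumes that a route whose AS-path already contains the local AS is discarded, and that the remaining routes are ranked by higher local-pref, then shorter AS-path, then lower MED. The `Route` fields, default values, AS numbers, and addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple      # sequence of AS numbers, nearest AS first
    next_hop: str
    local_pref: int = 100  # assumed default, for illustration only
    med: int = 0           # assumed default, for illustration only


def loop_free(route, local_as):
    """A route that already crossed the local AS would form a loop."""
    return local_as not in route.as_path


def select_best(routes, local_as):
    """Pick one route per the simplified ranking described above."""
    candidates = [r for r in routes if loop_free(r, local_as)]
    if not candidates:
        return None
    # Higher local-pref wins, then shorter AS-path, then lower MED.
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


routes = [
    Route("10.1.1.0/24", (200, 300), "192.0.2.1"),
    Route("10.1.1.0/24", (400,), "192.0.2.2"),
    # Contains the local AS (100), so it is discarded despite its local-pref.
    Route("10.1.1.0/24", (500, 100, 300), "192.0.2.3", local_pref=200),
]

best = select_best(routes, local_as=100)
print(best.next_hop)  # 192.0.2.2
```

Real implementations apply more steps and more conditions; for example, the MED is normally meaningful only between routes received from the same neighboring AS, which this sketch ignores.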
Transit and Nontransit Service
While an ISP provides connectivity to its customers, it also provides connectivity to customers of other ISPs. In doing this, an ISP must be able to ensure the appropriate use of its resources.
For example, Figure 1-6 shows three ISPs and three customers. ISP 1, ISP 2, and ISP 3 are directly connected to one another through a physical link and a corresponding EBGP session (represented here by a single line). Customer 1 is connected to ISP 1 through a physical link and corresponding EBGP session. Customer 2 is similarly connected to ISP 2, and Customer 3 is similarly connected to ISP 3. Each ISP provides transit service to its own customers. Figure 1-6 illustrates how the ISP permits traffic to transit across its backbone from its own customers or to its own customers.
![]()
Each ISP provides nontransit service to other ISPs. For example, Figure 1-7 shows that ISP 1 does not permit traffic between ISP 2 and ISP 3 to cross its backbone. If ISP 1 permitted such traffic, it would be squandering its own resources with no benefit to its customers or itself.
![]()
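Because traffic follows advertised routes, one way to picture nontransit service is as a rule on which routes an ISP advertises to which neighbors. The Python sketch below models that rule under the assumption that every neighbor is either a customer or another ISP. The constants and the function name are hypothetical, and real deployments express such rules in router policy configuration, not in code like this.

```python
CUSTOMER = "customer"
ISP = "isp"


def should_advertise(learned_from, advertise_to):
    """Decide whether a route learned from one neighbor type may be
    advertised to another.

    Routes learned from another ISP are never advertised to a different
    ISP, so no ISP-to-ISP traffic is attracted across the backbone.
    Routes to and from customers are always advertised.
    """
    return not (learned_from == ISP and advertise_to == ISP)


# From the point of view of ISP 1:
print(should_advertise(CUSTOMER, ISP))       # True: ISPs can reach Customer 1
print(should_advertise(ISP, CUSTOMER))       # True: Customer 1 can reach other ISPs
print(should_advertise(CUSTOMER, CUSTOMER))  # True: customers can reach each other
print(should_advertise(ISP, ISP))            # False: no transit between ISP 2 and ISP 3
```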