Type 5 EVPN-VXLAN implementation – Control Plane
A Type 5 EVPN-VXLAN fabric consists of a dual-layer BGP architecture made up of an underlay and an overlay.
The underlay serves as the IP transport between VXLAN Tunnel Endpoints (VTEPs)—located at the leaf nodes—and provides IP reachability using EBGP sessions. These sessions are established between directly connected leaf and spine nodes and exchange plain IPv4 or IPv6 unicast routes for the leaf nodes’ loopback interfaces.
The overlay provides IP reachability between GPU-facing Ethernet segments using multihop EBGP sessions. These sessions are established between the leaf and spine nodes using their loopback addresses (IPv4 or IPv6) and include the information required to encapsulate and forward tenant traffic across the fabric while maintaining traffic separation between customers.
EBGP is preferred for overlay routing in Clos fabrics because it enforces loop-free, hop-by-hop forwarding without requiring route reflectors. By using unique ASNs per device, it aligns with Valley-Free Routing principles, ensuring traffic flows cleanly up to the spine and down to the leaf, avoiding loops and maintaining symmetry.
In a Type 5 implementation, the key routes exchanged in the overlay are EVPN Route Type 5 (IP Prefix routes), which enable the distribution of IP routing information decoupled from MAC learning. These routes carry Layer 3 prefixes along with their associated route targets, route distinguishers, and next-hop VTEP information, enabling scalable IP routing across the fabric without the need for ARP or MAC learning for every endpoint. As a result, the control plane supports efficient distribution of tenant IP routes while preserving isolation through BGP extended communities.
Fabric Underlay Control Implementation Options
There are different ways to implement the underlay in an EVPN-VXLAN fabric, depending on design goals, operational preferences, and hardware capabilities. The options are:
- IPv4 addresses (/31 subnet masks) numbered interfaces
- IPv6 addresses (/127 subnet masks) numbered interfaces
- IPv6 link-local addresses, where BGP operates in unnumbered mode (as defined in RFC 5549)
Table 8. Comparison of Underlay Control Plane Implementation Options in EVPN-VXLAN Fabrics
Implementation Options | IPv4 /31 | IPv6 /127 | IPv6 Link-Local (RFC 5549) |
---|---|---|---|
Leaf to Spine Interface Addressing | /31 IPv4 addresses | (routable) /127 IPv6 addresses | Link-local IPv6 (no global addressing needed) |
BGP Peer Configuration | Explicit neighbor configuration per interface, using the IPv4 address of the directly connected interface. | Explicit neighbor configuration per interface, using the routable IPv6 address of the directly connected interface. | No explicit neighbor configuration required; neighbors are discovered using interface-scoped link-local addresses (e.g., fe80::1%et-0/0/0). |
Benefits | Simple; widely supported; low configuration overhead | Avoids IPv4 exhaustion; IPv6-native underlay; aligns with modern fabrics | Zero IP allocation needed; ideal for massive fabrics; minimal IPAM |
Drawbacks | Not IPv6-ready | Needs dual-stack address planning | Reduced traceroute visibility; RFC 5549 knowledge required |
Use Cases / Industry Trend | IPv4 /31 remains the most widely used option in enterprise and service provider fabrics. For most enterprise and traditional data center fabrics, it remains the recommended and most straightforward underlay option. | IPv6 /127 is gaining traction in dual-stack environments and in organizations preparing for IPv6 transitions. It conserves IPv4 space and offers a clean separation between infrastructure and tenant traffic. | IPv6 link-local (RFC 5549) is trending in hyperscaler, telco, and modern leaf-spine fabrics, especially where address scale or IPAM simplicity is critical. Many cloud providers (Azure, AWS internal fabrics, Meta, etc.) use unnumbered underlays. This option is becoming the preferred choice for hyperscale, IP-scarce, or automation-heavy fabrics with experienced operations teams, where managing IPs is a burden. |
While all three options are described below, this JVD and the associated testing focused on the IPv6 Link-Local (RFC 5549) underlay, which is the preferred design choice.
Choosing IPv6 link-local addressing with RFC 5549 (unnumbered underlay) simplifies network design and reduces operational complexity—especially in large-scale fabrics where scalability and automation are key. By removing the need to assign and manage global IP addresses on leaf-to-spine links, this approach eliminates a major source of administrative overhead and prevents IPv4 address exhaustion. Each link automatically generates unique addresses, streamlining both deployment and ongoing maintenance. This is a significant advantage in environments with thousands of interfaces, where traditional IP management becomes a scaling bottleneck.
From a design and operational perspective, unnumbered underlays align perfectly with modern data center and AI fabric principles. They reduce configuration touch points, lower the chance of human error, and support dynamic, automation-friendly environments. For customers prioritizing agility, scalability, and simplified management—such as hyperscalers, telcos, and AI-driven infrastructures—IPv6 link-local fabrics not only meet today’s technical requirements but also provide a future-proof foundation that supports growth without requiring continuous IP planning. While this model requires familiarity with RFC 5549 and may introduce minor trade-offs in traceability, the operational and business benefits far outweigh the challenges for teams that are ready to embrace modern networking best practices.
Fabric Overlay Control Plane Implementation
After loopback addresses have been exchanged, EBGP sessions are formed between leaf and spine nodes. In the overlay, BGP is configured with an export policy to advertise the /31 point-to-point IP addresses used between GPU servers and leaf nodes.
Because these interfaces are part of tenant-specific IP VRFs, the prefixes are advertised as EVPN Type 5 routes, using VNI 1 for Tenant A. These routes provide L3 reachability between GPUs assigned to the same tenant, even when distributed across multiple servers and racks. The routes are installed in the corresponding VRF routing table.
EVPN Type 5 routes, also known as IP Prefix routes, are used to advertise Layer 3 (routed) prefixes across an EVPN-VXLAN fabric. Unlike Type 2 routes, which carry MAC and IP bindings for bridging or IRB use cases, Type 5 routes are designed for pure L3 routing and are ideal for deployments where no MAC learning, IRBs, or anycast gateways are required.
In this design, Type 5 routes are used to advertise the /31 point-to-point IP addresses between GPU servers and leaf switches, allowing routed connectivity between GPUs assigned to the same tenant, even when they are hosted on different servers or connected to different leaf nodes. Each Type 5 route includes the information shown in Table 9.
Table 9. EVPN Type 5 Route Fields Description
Field | Description |
---|---|
Route Type | IP Prefix Route (Type 5) |
Route Distinguisher (RD) | RD of advertising PE (e.g., based on loopback IP) to make the route unique across the fabric |
Ethernet Tag ID | 0 (because it’s not associated with a specific VLAN or MAC-VRF) |
IP Prefix | The IPv4 or IPv6 prefix being advertised (e.g., 10.200.0.2/31) |
Prefix Length | The length of the IP prefix (e.g., 31) |
Label | VXLAN VNI (e.g., 1) identifying the virtual routing domain |
Next-hop | Loopback address of the advertising leaf node |
Extended Community | Route-target to identify the associated tenant VRF (e.g., target:65000:1) |
Other BGP Attributes | BGP attributes like origin, AS-path, local-pref, etc. |
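To illustrate how these fields map to device configuration, the following Junos-style snippet is a minimal sketch of a tenant IP VRF that originates EVPN Type 5 routes. The instance name, route distinguisher, route target, and VNI match the examples used in this section; the GPU-facing interface name (et-0/0/10.0) is a placeholder and does not represent the validated configuration.
set routing-instances RT5-IPVRF_TENANT-A instance-type vrf
set routing-instances RT5-IPVRF_TENANT-A interface et-0/0/10.0
set routing-instances RT5-IPVRF_TENANT-A route-distinguisher 10.0.1.1:1
set routing-instances RT5-IPVRF_TENANT-A vrf-target target:65000:1
set routing-instances RT5-IPVRF_TENANT-A protocols evpn ip-prefix-routes advertise direct-nexthop
set routing-instances RT5-IPVRF_TENANT-A protocols evpn ip-prefix-routes encapsulation vxlan
set routing-instances RT5-IPVRF_TENANT-A protocols evpn ip-prefix-routes vni 1
In this sketch, the VRF's route target populates the Extended Community field and VNI 1 populates the Label field of the resulting Type 5 routes; an export policy can additionally be applied under ip-prefix-routes to control which VRF routes (such as the direct /31s toward the GPU servers) are advertised.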
Control plane implementation with IPv4 underlay and IPv4 overlay
This model provides an IPv4 transport underlay and IPv4-based EVPN-VXLAN in the overlay that can still support IPv4-only devices communicating across the fabric. This model aligns with traditional IP fabric designs, where interface addressing is fully controlled and visible, and neighbor relationships are explicitly defined.
The interfaces between leaf and spine nodes are configured with explicit /31 IPv4 addresses assigned from a pool of IPv4 addresses reserved for the underlay. Each device on the point-to-point link is configured with one of the two usable IPv4 addresses in the corresponding /31 subnet, allowing efficient address assignment for the point-to-point links between leaf and spine nodes. All leaf and spine nodes are also configured with IPv4 loopback addresses (under lo0.0).
The underlay EBGP sessions are set up between the leaf and spine nodes by explicitly configuring each neighbor, using the /31 IPv4 addresses assigned to the directly connected interfaces.
The EBGP configuration for this model includes each neighbor's IPv4 address and Autonomous System (AS) number, the local AS number, and the export policy that advertises the routes needed to reach all the leaf and spine nodes in the fabric. These routes are standard IPv4 unicast routes advertising the IPv4 addresses assigned to the loopback interface (lo0.0).
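As an illustration only, the following Junos-style sketch shows this underlay configuration on a leaf node, using values consistent with the example that follows (local AS 201, loopback 10.0.1.1, spine neighbor 10.2.1.1 in AS 101); the group and policy names are placeholders, not the validated configuration.
set interfaces lo0 unit 0 family inet address 10.0.1.1/32
set routing-options autonomous-system 201
set policy-options policy-statement EXPORT-LO0 term loopback from interface lo0.0
set policy-options policy-statement EXPORT-LO0 term loopback then accept
set protocols bgp group UNDERLAY type external
set protocols bgp group UNDERLAY export EXPORT-LO0
set protocols bgp group UNDERLAY neighbor 10.2.1.1 peer-as 101
In practice, multipath multiple-as is also typically enabled on the group so that equal-cost routes learned from the different spines can be used simultaneously.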
The overlay EBGP sessions are also set up by explicitly configuring each neighbor, using the IPv4 addresses of the loopback interfaces advertised by the underlay EBGP sessions. These EBGP sessions are also established between the leaf and spine nodes. The leaf nodes act as VTEPs, and advertise the IPv4 prefixes assigned to the links between the GPU servers and the leaf nodes using EVPN Type 5 routes.
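A minimal sketch of the corresponding overlay group on the same leaf node, assuming the loopback and AS values from the example below (the group name is a placeholder):
set protocols bgp group OVERLAY type external
set protocols bgp group OVERLAY multihop no-nexthop-change
set protocols bgp group OVERLAY local-address 10.0.1.1
set protocols bgp group OVERLAY family evpn signaling
set protocols bgp group OVERLAY neighbor 10.0.0.1 peer-as 101
In eBGP overlay designs, the multihop no-nexthop-change statement is commonly used so that the originating VTEP's loopback is preserved as the protocol next hop when routes are re-advertised across the fabric.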
EXAMPLE:
Consider the example depicted in Figure 30.
For the underlay, STRIPE1 LEAF 1 in AS 201 establishes an EBGP session with SPINE 1 in AS 101, over the directly connected IPv4 link 10.2.1.2/31 <=> 10.2.1.1/31. Similarly, STRIPE2 LEAF 1 in AS 209 establishes an EBGP session with SPINE 1 over the link 10.2.9.2/31 <=> 10.2.9.1/31.
Figure 30. IPv4 underlay and IPv4 overlay example
These sessions exchange IPv4 unicast routes advertising the address of the loopback interface (lo0.0) of STRIPE1 LEAF 1 (10.0.1.1), STRIPE2 LEAF 1 (10.0.1.9) and SPINE 1 (10.0.0.1).
Although it is not shown in the diagram, STRIPE1 LEAF 1 and STRIPE2 LEAF 1 will also establish EBGP sessions with SPINE 2, SPINE 3, and SPINE 4 to ensure multiple paths are available for traffic.
For the overlay, EBGP sessions are established between the leaf nodes and SPINE 1 using their loopback addresses (10.0.1.1, 10.0.1.9, and 10.0.0.1, respectively).
The leaf nodes, acting as VTEPs, advertise the /31 prefixes on the links connecting the GPU servers to the leaf nodes as EVPN Type 5 routes.
For example, STRIPE1 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 1 GPU1 and SERVER 2 GPU1 to STRIPE1 LEAF 1 (10.1.1.0/31 and 10.1.1.16/31, respectively). Similarly, STRIPE2 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 3 GPU1 and SERVER 4 GPU1 (10.1.1.32/31 and 10.1.1.40/31, respectively).
Assuming all four GPUs in the example belong to the same tenant, their associated interfaces are mapped to the same VRF, RT5-IPVRF_TENANT-A.
RT5-IPVRF_TENANT-A is configured on both STRIPE1 LEAF 1 and STRIPE2 LEAF 1 with the same VXLAN Network Identifier (VNI) and route targets. STRIPE1 LEAF 1 advertises the prefixes 10.1.1.0/31 and 10.1.1.16/31 to SPINE 1 as EVPN Route Type 5, with its own loopback (10.0.1.1) as the next-hop VTEP. STRIPE2 LEAF 1 advertises 10.1.1.32/31 and 10.1.1.40/31 with 10.0.1.9 as the next-hop.
When SERVER 1 GPU1 sends traffic to SERVER 3 GPU1, the destination prefix 10.1.1.32/31 is matched in the VRF routing table on STRIPE1 LEAF 1. The route points to STRIPE2 LEAF 1 (VTEP at 10.0.1.9) as the next hop and specifies VNI 1 as the VXLAN encapsulation ID. The packet is then VXLAN-encapsulated and tunneled across the fabric to its destination over the IPv4 underlay.
Control plane implementation with IPv6 underlay and IPv6 overlay
This model provides an IPv6 transport underlay and IPv6-based EVPN-VXLAN in the overlay that can still support IPv4-only devices communicating across the fabric. This aligns with traditional dual-stack IP fabric designs, where interface addressing is fully controlled and visible, neighbor relationships are explicitly defined, and both IPv4 and IPv6 are supported.
The interfaces between leaf and spine nodes are configured with explicit /127 IPv6 addresses assigned from a pool of IPv6 addresses reserved for the underlay. These addresses can be globally routable or unique local (ULA) IPv6 addresses. Each device on the point-to-point link is configured with one of the two usable IPv6 addresses in the corresponding /127 subnet, allowing efficient address assignment for the point-to-point links between leaf and spine nodes. All leaf and spine nodes are also configured with both IPv4 and IPv6 loopback addresses (under lo0.0).
The underlay EBGP sessions are set up between the leaf and spine nodes, by explicitly configuring each neighbor, using the /127 IPv6 addresses assigned between them.
The EBGP configuration for this model includes each neighbor's IPv6 address and Autonomous System (AS) number, the local AS number, and the export policy that advertises the routes needed to reach all the leaf and spine nodes in the fabric. These routes are standard IPv6 unicast routes advertising the IPv6 addresses assigned to the loopback interface (lo0.0).
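As an illustration only, the equivalent underlay sketch for STRIPE1 LEAF 1 in the example that follows (local AS 201, IPv6 loopback 2001:10::1:1, spine neighbor 2001:0:2:1::1 in AS 101); the group and policy names are placeholders.
set interfaces lo0 unit 0 family inet6 address 2001:10::1:1/128
set routing-options autonomous-system 201
set policy-options policy-statement EXPORT-LO0-V6 term loopback from interface lo0.0
set policy-options policy-statement EXPORT-LO0-V6 term loopback then accept
set protocols bgp group UNDERLAY-V6 type external
set protocols bgp group UNDERLAY-V6 export EXPORT-LO0-V6
set protocols bgp group UNDERLAY-V6 neighbor 2001:0:2:1::1 peer-as 101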
The overlay EBGP sessions are also set up by explicitly configuring each neighbor, using the IPv6 addresses of the loopback interfaces advertised by the underlay EBGP sessions.
These EBGP sessions are also established between the leaf and spine nodes. The leaf nodes act as VTEPs, and exchange EVPN Type 5 routes advertising the IPv4 prefixes assigned to the links between the GPU servers and the leaf nodes.
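For reference, an illustrative overlay group using the IPv6 loopbacks from the example below might look like this (the group name is a placeholder):
set protocols bgp group OVERLAY-V6 type external
set protocols bgp group OVERLAY-V6 multihop no-nexthop-change
set protocols bgp group OVERLAY-V6 local-address 2001:10::1:1
set protocols bgp group OVERLAY-V6 family evpn signaling
set protocols bgp group OVERLAY-V6 neighbor 2001:10::1 peer-as 101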
EXAMPLE:
Consider the example depicted in Figure 31.
For the underlay, STRIPE1 LEAF 1 in AS 201 establishes an EBGP session with SPINE 1 in AS 101 over the directly connected IPv6 point-to-point link 2001:0:2:1::2/127 <=> 2001:0:2:1::1/127. Similarly, STRIPE2 LEAF 1 in AS 209 establishes an EBGP session with SPINE 1 over the link 2001:0:2:9::2/127 <=> 2001:0:2:9::1/127.
Figure 31. IPv6 underlay and IPv6 overlay example
These sessions exchange IPv6 unicast routes advertising the address of the loopback interface (lo0.0) of STRIPE1 LEAF 1 (2001:10::1:1), STRIPE2 LEAF 1 (2001:10::1:9) and SPINE 1 (2001:10::1).
Although it is not shown in the diagram, STRIPE1 LEAF 1 and STRIPE2 LEAF 1 will also establish EBGP sessions with SPINE 2, SPINE 3, and SPINE 4 to ensure multiple paths are available for traffic.
For the overlay, EBGP sessions are established between the leaf nodes and SPINE 1 using their loopback addresses (2001:10::1:1, 2001:10::1:9, and 2001:10::1, respectively).
The leaf nodes, acting as VTEPs, advertise the /31 prefixes on the links connecting the GPU servers to the leaf nodes as EVPN Type 5 routes.
For example, STRIPE1 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 1 GPU1 and SERVER 2 GPU1 to STRIPE1 LEAF 1 (10.1.1.0/31 and 10.1.1.16/31, respectively). Similarly, STRIPE2 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 3 GPU1 and SERVER 4 GPU1 (10.1.1.32/31 and 10.1.1.40/31, respectively).
Assuming all four GPUs in the example belong to the same tenant, their associated interfaces are mapped to the same VRF, RT5-IPVRF_TENANT-A.
RT5-IPVRF_TENANT-A is configured on both STRIPE1 LEAF 1 and STRIPE2 LEAF 1 with the same VXLAN Network Identifier (VNI) and route targets. STRIPE1 LEAF 1 advertises the prefixes 10.1.1.0/31 and 10.1.1.16/31 to SPINE 1 as EVPN Route Type 5, with its own IPv6 loopback (2001:10::1:1) as the next-hop VTEP. STRIPE2 LEAF 1 advertises 10.1.1.32/31 and 10.1.1.40/31 with 2001:10::1:9 as the next hop.
When SERVER 1 GPU1 sends traffic to SERVER 3 GPU1, the destination prefix 10.1.1.32/31 is matched in the VRF routing table on STRIPE1 LEAF 1. The route points to STRIPE2 LEAF 1 (VTEP at 2001:10::1:9) as the next hop and specifies VNI 1 as the VXLAN encapsulation ID. The packet is then VXLAN-encapsulated and tunneled across the fabric to its destination over the IPv6 underlay.
Control plane implementation with IPv6 Link-Local (RFC 5549) underlay and IPv4/IPv6 overlay
This model also provides an IPv6 transport underlay and IPv4/IPv6-based EVPN-VXLAN in the overlay that can still support IPv4-only devices communicating across the fabric. Unlike the previous two models, interface addressing on the leaf-spine links is not explicitly configured and underlay neighbor relationships are discovered dynamically rather than statically defined, while both IPv4 and IPv6 remain supported.
The interfaces between leaf and spine nodes do not require explicitly configured IP addresses. It is sufficient to enable IPv6 (e.g. family inet6) or both IPv4 and IPv6 (e.g. family inet and family inet6). Enabling IPv6 on an interface automatically assigns a link-local IPv6 address, which is then advertised through standard router advertisements as part of the IPv6 Neighbor Discovery process. This simplifies configuration and eliminates the need for manual IP addressing on leaf–spine links. All leaf and spine nodes are also configured with both IPv4 and IPv6 loopback addresses (under lo0.0).
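For example, enabling the address families and router advertisements on a fabric-facing interface can be as simple as the following sketch (the interface name is a placeholder; the loopback values match the example later in this section):
set interfaces et-0/0/48 unit 0 family inet
set interfaces et-0/0/48 unit 0 family inet6
set protocols router-advertisement interface et-0/0/48.0
set interfaces lo0 unit 0 family inet address 10.0.1.1/32
set interfaces lo0 unit 0 family inet6 address 2001:10::1:1/128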
The underlay EBGP sessions are set up using unnumbered links (RFC 5549), also referred to as BGP auto-discovery or BGP auto-peering, which allows devices to dynamically discover directly connected neighbors and form BGP sessions using IPv6 link-local addresses. This design leverages Junos OS support for:
- RFC 5549: Advertising IPv4 Network Layer Reachability information with an IPv6 Next Hop
- RFC 4861: Neighbor Discovery for IP version 6 (IPv6)
- RFC 2462: IPv6 Stateless Address Autoconfiguration
Traditionally, BGP requires explicit configuration of neighbors, Autonomous System (AS) numbers, and routing policies to control route exchanges. With BGP unnumbered peering, neighbors are discovered dynamically, and BGP sessions are established automatically, eliminating the need for manual configuration and enabling faster, more scalable underlay deployments in EVPN-VXLAN data center fabrics.
Neighbor discovery uses standard IPv6 mechanisms to learn the link-local addresses of directly connected neighbors. These addresses are then used to automatically establish EBGP sessions, which can advertise both IPv6 and IPv4 reachability using IPv6 next hops, in accordance with RFC 5549.
The EBGP configuration for this model includes the local Autonomous System (AS) number, a list of accepted remote AS numbers, the list of interfaces where dynamic BGP neighbors will be accepted, and the export policy that advertises the routes needed to reach all the leaf and spine nodes in the fabric. These routes are standard IPv6 unicast routes advertising the IPv6 addresses assigned to the loopback interface (lo0.0). Peer auto-discovery using IPv6 neighbor discovery (ipv6-nd) must also be enabled for the BGP group.
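The following sketch shows one possible shape of such a BGP group. The statement names follow Junos examples for BGP auto-discovered neighbors and should be treated as an approximation to be verified against the Junos release in use; the group, dynamic-neighbor, AS-list, interface, and policy names are placeholders, and the accepted AS range assumes the four spines use AS 101 through 104.
set policy-options as-list FABRIC-AS members 101-104
set policy-options policy-statement EXPORT-LO0 term loopback from interface lo0.0
set policy-options policy-statement EXPORT-LO0 term loopback then accept
set routing-options autonomous-system 201
set protocols bgp group UNDERLAY type external
set protocols bgp group UNDERLAY family inet unicast extended-nexthop
set protocols bgp group UNDERLAY family inet6 unicast
set protocols bgp group UNDERLAY export EXPORT-LO0
set protocols bgp group UNDERLAY peer-as-list FABRIC-AS
set protocols bgp group UNDERLAY dynamic-neighbor FABRIC peer-auto-discovery family inet6 ipv6-nd
set protocols bgp group UNDERLAY dynamic-neighbor FABRIC peer-auto-discovery interface et-0/0/48.0
The extended-nexthop statement enables the RFC 5549 behavior of advertising IPv4 routes with an IPv6 next hop over the link-local sessions.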
The overlay EBGP sessions are also set up by explicitly configuring each neighbor, using either the IPv4 or IPv6 addresses of the loopback interfaces advertised by the underlay EBGP sessions.
These EBGP sessions are also established between the leaf and spine nodes. The leaf nodes act as VTEPs, and exchange EVPN Type 5 routes advertising the IPv4 or IPv6 prefixes assigned to the links between the GPU servers and the leaf nodes.
Although this approach introduces more configuration complexity and requires deeper familiarity with IPv6 link-local addressing and Junos routing policy, it offers significant operational advantages in highly scalable environments. By eliminating the need to assign and manage IP addresses on point-to-point links, this model simplifies IP planning and is ideal for large-scale, automated EVPN-VXLAN deployments.
By using IPv6 link-local addresses and BGP unnumbered peering, this implementation provides a modern and efficient underlay model that minimizes manual configuration, reduces IP planning effort, and enables dynamic peer discovery — all while maintaining full compatibility with IPv4 or IPv6 overlays using RFC 5549 behavior.
EXAMPLE:
Consider the example depicted in Figure 32.
For the underlay, STRIPE1 LEAF 1 in AS 201 automatically establishes an EBGP session with SPINE 1 in AS 101 over the directly connected link FE80::1 <=> FE80::2. Similarly, STRIPE2 LEAF 1 in AS 209 establishes an EBGP session with SPINE 1 over the link FE80::1 <=> FE80::2. The addresses used are the link-local addresses automatically assigned to the interfaces based on their MAC addresses (shown here as FE80::1 and FE80::2 for simplicity) and are auto-discovered using standard IPv6 neighbor discovery mechanisms.
Figure 32. IPv6 Link-Local (RFC 5549) underlay and IPv4 overlay example
These sessions exchange IPv4 and IPv6 unicast routes advertising the addresses of the loopback interface (lo0.0) of STRIPE1 LEAF 1 (10.0.1.1 / 2001:10::1:1), STRIPE2 LEAF 1 (10.0.1.9 / 2001:10::1:9), and SPINE 1 (10.0.0.1 / 2001:10::1).
Although it is not shown in the diagram, STRIPE1 LEAF 1 and STRIPE2 LEAF 1 will also establish EBGP sessions with SPINE 2, SPINE 3, and SPINE 4 to ensure multiple paths are available for traffic.
For the overlay, EBGP sessions are established between the leaf nodes and SPINE 1 using their loopback addresses, either IPv4 (10.0.1.1, 10.0.1.9, and 10.0.0.1, respectively) or IPv6 (2001:10::1:1, 2001:10::1:9, and 2001:10::1, respectively). The leaf nodes, acting as VTEPs, advertise the /31 prefixes on the links connecting the GPU servers to the leaf nodes as EVPN Type 5 routes.
For example, STRIPE1 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 1 GPU1 and SERVER 2 GPU1 to STRIPE1 LEAF 1 (10.1.1.0/31 and 10.1.1.16/31, respectively). Similarly, STRIPE2 LEAF 1 advertises routes to the IPv4 addresses on the links connecting SERVER 3 GPU1 and SERVER 4 GPU1 (10.1.1.32/31 and 10.1.1.40/31, respectively).
Assuming all four GPUs in the example belong to the same tenant, their associated interfaces are mapped to the same VRF, RT5-IPVRF_TENANT-A.
RT5-IPVRF_TENANT-A is configured on both STRIPE1 LEAF 1 and STRIPE2 LEAF 1 with the same VXLAN Network Identifier (VNI) and route targets. STRIPE1 LEAF 1 advertises the prefixes 10.1.1.0/31 and 10.1.1.16/31 to SPINE 1 as EVPN Route Type 5, with its own loopback (10.0.1.1) as the next-hop VTEP. STRIPE2 LEAF 1 advertises 10.1.1.32/31 and 10.1.1.40/31 with 10.0.1.9 as the next hop.
When SERVER 1 GPU1 sends traffic to SERVER 3 GPU1, the destination prefix 10.1.1.32/31 is matched in the VRF routing table on STRIPE1 LEAF 1. The route points to STRIPE2 LEAF 1 (VTEP at 10.0.1.9) as the next hop and specifies VNI 1 as the VXLAN encapsulation ID. The packet is then VXLAN-encapsulated and tunneled across the fabric to its destination over the IPv6 underlay.