
Appendix A – Fabric Implementation with IPv4 Underlay and IPv4 Overlay

This section outlines the configuration components for an IPv4 underlay and IPv4 overlay deployment.

Spine nodes to leaf connections

The interfaces between the leaf and spine nodes are configured as untagged interfaces with family inet and explicitly assigned IPv4 addresses, as shown in Figure 40.

Figure 40. IPv4 underlay and IPv4 overlay configuration example


The interfaces between the leaf and spine nodes are configured with /31 addresses as shown in table 50.

Table 50. IPv4 Address Assignments for Leaf-to-Spine Interfaces (/31 Subnetting)

LEAF NODE INTERFACE LEAF NODE IPv4 ADDRESS SPINE NODE INTERFACE SPINE IPv4 ADDRESS
Stripe 1 Leaf 1 - et-0/0/30:0 10.0.2.65/31 Spine 1 – et-0/0/0:0 10.0.2.64/31
Stripe 1 Leaf 1 - et-0/0/31:0 10.0.2.83/31 Spine 2 – et-0/0/1:0 10.0.2.82/31
Stripe 1 Leaf 1 - et-0/0/32:0 10.0.2.99/31 Spine 3 – et-0/0/2:0 10.0.2.98/31
Stripe 1 Leaf 1 - et-0/0/33:0 10.0.2.115/31 Spine 4 – et-0/0/3:0 10.0.2.114/31
Stripe 1 Leaf 2 - et-0/0/30:0 10.0.2.69/31 Spine 1 – et-0/0/0:0 10.0.2.68/31
Stripe 1 Leaf 2 - et-0/0/31:0 10.0.2.85/31 Spine 2 – et-0/0/1:0 10.0.2.84/31
Stripe 1 Leaf 2 - et-0/0/32:0 10.0.2.101/31 Spine 3 – et-0/0/2:0 10.0.2.100/31
Stripe 1 Leaf 2 - et-0/0/33:0 10.0.2.119/31 Spine 4 – et-0/0/3:0 10.0.2.118/31

...

These interfaces are configured as untagged interfaces, with family inet and static IPv4 addresses, as shown in the example for the link between Stripe 1 Leaf 1 and Spine 1 below:

Table 51. Example Junos Configuration for Leaf-Spine IPv4 Interface

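The original configuration capture is not reproduced here. As an illustrative sketch only, the leaf-side interface configuration for this link could look like the following, using the et-0/0/30:0 interface and 10.0.2.65/31 address from Table 50 (the spine side mirrors it with 10.0.2.64/31 on et-0/0/0:0; the interface description is an assumption):

interfaces {
    et-0/0/30:0 {
        description "Stripe 1 Leaf 1 to Spine 1";   /* illustrative description */
        unit 0 {
            family inet {
                address 10.0.2.65/31;
            }
        }
    }
}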

The loopback and Autonomous System numbers for all devices in the fabric are included in table 52:

Table 52. Loopback IPv4 Addresses and Autonomous System Numbers for Fabric Devices

DEVICE lo0.0 IPv4 ADDRESS Local AS #
Stripe 1 Leaf 1 10.0.1.1/32 201
Stripe 1 Leaf 2 10.0.1.2/32 202
Stripe 1 Leaf 3 10.0.1.3/32 203
Stripe 1 Leaf 4 10.0.1.4/32 204
Stripe 1 Leaf 5 10.0.1.5/32 205
Stripe 1 Leaf 6 10.0.1.6/32 206
Stripe 1 Leaf 7 10.0.1.7/32 207
Stripe 1 Leaf 8 10.0.1.8/32 208

...
SPINE1 10.0.0.1/32 101
SPINE2 10.0.0.2/32 102
SPINE3 10.0.0.3/32 103
SPINE4 10.0.0.4/32 104

Table 53. Example Junos Configuration for Loopback Interfaces and Routing Options
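The captured configuration is not shown here. As a minimal sketch using the lo0.0 address and local AS number for Stripe 1 Leaf 1 from Table 52, the loopback and routing options could be expressed as follows (setting the router-id to the loopback address is a common convention and an assumption here; the AS number may instead be applied per BGP group with local-as in the generated configuration):

interfaces {
    lo0 {
        unit 0 {
            family inet {
                address 10.0.1.1/32;
            }
        }
    }
}
routing-options {
    router-id 10.0.1.1;          /* assumed to follow the loopback address */
    autonomous-system 201;
}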

GPU Backend Fabric Underlay with IPv4

The underlay EBGP sessions are configured between the leaf and spine nodes using the IP addresses of the directly connected links, as shown in the example between Stripe1 Leaf 1 and the spine nodes below:

Table 54. EBGP Underlay Configuration Example: Stripe 1 Leaf 1 to Spine 1

Table 55. EBGP Underlay Configuration Example: Stripe 1 Leaf 1 to Spine 2

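As a representative sketch only (the group name and BFD timers are assumptions, and the rendered configuration may differ), the leaf-side underlay session from Stripe 1 Leaf 1 to Spine 1 could look like the following, combining the neighbor address from Table 50, the spine AS from Table 52, and the export policies described below. The parenthesized policy expression is one way to require both export policies to accept a route, matching the behavior described later in this section:

protocols {
    bgp {
        group UNDERLAY {                         /* hypothetical group name */
            type external;
            export ( LEAF_TO_SPINE_FABRIC_OUT && BGP-AOS-Policy );
            multipath {
                multiple-as;
            }
            neighbor 10.0.2.64 {
                description "Spine 1";
                peer-as 101;
                bfd-liveness-detection {
                    minimum-interval 1000;       /* example timer only */
                    multiplier 3;
                }
            }
        }
    }
}

The session toward Spine 2 is analogous, using neighbor 10.0.2.82 and peer-as 102.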

All the BGP sessions are configured with multipath multiple-as, which allows multiple paths (to the same destination) with different AS paths to be considered for ECMP (Equal-Cost Multi-Path) routing, and with BFD to improve convergence in case of failures.

To control the propagation of routes, export policies are applied to these EBGP sessions as shown in the example in table 56.

Table 56. Export policy example to advertise IPv4 routes over IPv4 Underlay


These policies ensure loopback reachability is advertised cleanly and without the risk of route loops.

On the spine nodes, routes are exported only if they are accepted by both the SPINE_TO_LEAF_FABRIC_OUT and BGP-AOS-Policy export policies.

  • The SPINE_TO_LEAF_FABRIC_OUT policy has no match conditions and accepts all routes unconditionally, tagging them with the FROM_SPINE_FABRIC_TIER community (0:15).
  • The BGP-AOS-Policy accepts BGP-learned routes as well as any routes accepted by the nested AllPodNetworks policy.
  • The AllPodNetworks policy, in turn, matches directly connected IPv4 routes and tags them with the DEFAULT_DIRECT_V4 community (1:20007 and 21001:26000 on Spine1).
  • As a result, each spine advertises both its directly connected routes (including its loopback interface) and any routes it has received from other leaf nodes.

Example:
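Since the example output is not reproduced, the following policy-options sketch illustrates the spine-side behavior described above (term names are illustrative, and the community values follow the Spine 1 example in the text):

policy-options {
    policy-statement SPINE_TO_LEAF_FABRIC_OUT {
        term TagAndAccept {
            then {
                community add FROM_SPINE_FABRIC_TIER;
                accept;
            }
        }
    }
    policy-statement BGP-AOS-Policy {
        term PodNetworks {
            from policy AllPodNetworks;
            then accept;
        }
        term BgpRoutes {
            from protocol bgp;
            then accept;
        }
        term RejectRest {
            then reject;
        }
    }
    policy-statement AllPodNetworks {
        term DirectV4 {
            from {
                family inet;
                protocol direct;
            }
            then {
                community add DEFAULT_DIRECT_V4;
                accept;
            }
        }
        term RejectRest {
            then reject;
        }
    }
    community FROM_SPINE_FABRIC_TIER members 0:15;
    community DEFAULT_DIRECT_V4 members [ 1:20007 21001:26000 ];
}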

On the leaf nodes, routes are exported only if they are accepted by both the LEAF_TO_SPINE_FABRIC_OUT and BGP-AOS-Policy export policies.

  • The LEAF_TO_SPINE_FABRIC_OUT policy accepts all routes except those learned via BGP that are tagged with the FROM_SPINE_FABRIC_TIER community (0:15). These routes are explicitly rejected to prevent re-advertisement of spine-learned routes back into the spine layer. As described earlier, spine nodes tag all routes they advertise to leaf nodes with this community to facilitate this filtering logic.
  • The BGP-AOS-Policy accepts all routes allowed by the nested AllPodNetworks policy, which matches directly connected IPv4 routes and tags them with the DEFAULT_DIRECT_V4 community (5:20007 and 21001:26000 for Stripe1-Leaf1).

As a result, leaf nodes will advertise only their directly connected interface routes—including their loopback interfaces—to the spines.
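A matching leaf-side sketch is shown below. The BGP-AOS-Policy and AllPodNetworks policies follow the same structure as in the spine sketch above, with the DEFAULT_DIRECT_V4 community set to the Stripe1-Leaf1 values, so only the community filter differs (term names remain illustrative):

policy-options {
    policy-statement LEAF_TO_SPINE_FABRIC_OUT {
        term RejectSpineRoutes {
            from {
                protocol bgp;
                community FROM_SPINE_FABRIC_TIER;
            }
            then reject;
        }
        term AcceptRest {
            then accept;
        }
    }
    community FROM_SPINE_FABRIC_TIER members 0:15;
    community DEFAULT_DIRECT_V4 members [ 5:20007 21001:26000 ];
}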

GPU Backend Fabric Overlay with IPv4

The overlay EBGP sessions are configured between the leaf and spine nodes using the IPv4 addresses of the loopback interfaces, as shown in the example between Stripe 1 Leaf 1 and the spine nodes.

Table 57. EVPN Overlay EBGP Configuration Example: Stripe 1 Leaf 1 to Spine 1


Table 58. EVPN Overlay EBGP Configuration Example: Stripe 2 Leaf 1 to Spine 1


The overlay BGP sessions use family evpn signaling to enable EVPN route exchange. The multihop ttl 1 statement allows EBGP sessions to be established between the loopback interfaces.

As with the underlay BGP sessions, these sessions are configured with multipath multiple-as, allowing multiple EVPN paths with different AS paths to be considered for ECMP (Equal-Cost Multi-Path) routing. BFD (Bidirectional Forwarding Detection) is also enabled to improve convergence time in case of failures.

The no-nexthop-change knob is used to preserve the original next-hop address, which is critical in EVPN for ensuring that the remote VTEP can be reached directly. The vpn-apply-export statement is included to ensure that the export policies are evaluated for VPN address families, such as EVPN, allowing fine-grained control over which routes are advertised to each peer.
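Combining the statements described above, a leaf-side overlay session toward Spine 1 could be sketched as follows (the group name, local-address placement, and BFD timers are assumptions; the generated configuration may differ). The loopback and AS values are taken from Table 52:

protocols {
    bgp {
        group OVERLAY {                          /* hypothetical group name */
            type external;
            multihop {
                ttl 1;
                no-nexthop-change;
            }
            local-address 10.0.1.1;              /* leaf loopback */
            family evpn {
                signaling;
            }
            export ( LEAF_TO_SPINE_EVPN_OUT && EVPN_EXPORT );
            vpn-apply-export;
            multipath {
                multiple-as;
            }
            neighbor 10.0.0.1 {
                description "Spine 1 overlay";
                peer-as 101;
                bfd-liveness-detection {
                    minimum-interval 1000;       /* example timer only */
                    multiplier 3;
                }
            }
        }
    }
}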

To control the propagation of routes, export policies are applied to these EBGP sessions as shown in the example in table 59.

Table 59. Export Policy example to advertise EVPN routes over IPv4 overlay


These policies are simpler in structure and are intended to enable end-to-end EVPN reachability between tenant GPUs, while preventing route loops within the overlay.

Routes will only be advertised if EVPN routing-instances have been created. Example:

Table 60. EVPN Routing-Instances for a single tenant example across different leaf nodes.

On the spine nodes, routes are exported if they are accepted by the SPINE_TO_LEAF_EVPN_OUT policy.

  • The SPINE_TO_LEAF_EVPN_OUT policy has no match conditions and accepts all routes. It tags each exported route with the FROM_SPINE_EVPN_TIER community (0:14).
  • As a result, the spine nodes export EVPN routes received from one leaf to all other leaf nodes, allowing tenant-to-tenant communication across the fabric.

Example:
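An illustrative sketch of the spine-side policy follows (the term name is an assumption):

policy-options {
    policy-statement SPINE_TO_LEAF_EVPN_OUT {
        term TagAndAccept {
            then {
                community add FROM_SPINE_EVPN_TIER;
                accept;
            }
        }
    }
    community FROM_SPINE_EVPN_TIER members 0:14;
}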

On the leaf nodes, routes are exported if they are accepted by both the LEAF_TO_SPINE_EVPN_OUT and EVPN_EXPORT policies.

  • The LEAF_TO_SPINE_EVPN_OUT policy rejects any BGP-learned routes that carry the FROM_SPINE_EVPN_TIER community (0:14). These routes are explicitly rejected to prevent re-advertisement of spine-learned routes back into the spine layer. As described earlier, spine nodes tag all routes they advertise to leaf nodes with this community to facilitate this filtering logic.
  • The EVPN_EXPORT policy accepts all routes without additional conditions.

As a result, the leaf nodes export only locally originated EVPN routes for the directly connected interfaces between GPU servers and the leaf nodes. These routes are part of the tenant routing instances and are required to establish reachability between GPUs belonging to the same tenant.
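The corresponding leaf-side policies could be sketched as follows (again, term names are illustrative):

policy-options {
    policy-statement LEAF_TO_SPINE_EVPN_OUT {
        term RejectSpineRoutes {
            from {
                protocol bgp;
                community FROM_SPINE_EVPN_TIER;
            }
            then reject;
        }
        term AcceptRest {
            then accept;
        }
    }
    policy-statement EVPN_EXPORT {
        term AcceptAll {
            then accept;
        }
    }
    community FROM_SPINE_EVPN_TIER members 0:14;
}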

Configuration and verification example

Consider the following scenario where Tenant-A has been assigned GPU 0 on Server 1 and GPU 1 on Server 2, and Tenant-B has been assigned GPU 0 on Server 2 and GPU 1 on Server 1, as shown in figure 41.

Figure 41. GPU Assignment Across Servers for Tenant-A and Tenant-B


Both Stripe 1 Leaf 1 and Leaf 2 have been configured for Tenant-A and Tenant-B as shown below:

Table 61. EVPN Routing-Instance for Tenant-A and Tenant-B Across Stripe 1 and Stripe 2
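The captured configuration is not reproduced here. The following sketch shows how the Tenant-A and Tenant-B routing instances on Stripe 1 Leaf 1 could be expressed, using the route distinguishers, route targets, VNIs, and export policy named in the list that follows (the interface-to-tenant mapping and the Tenant-B policy name are assumptions for illustration):

routing-instances {
    Tenant-A {
        instance-type vrf;
        interface et-0/0/0:0.0;                  /* GPU-facing link, assumed mapping */
        route-distinguisher 10.0.1.1:2001;
        vrf-target target:20001:1;
        protocols {
            evpn {
                ip-prefix-routes {
                    advertise direct-nexthop;
                    encapsulation vxlan;
                    vni 20001;
                    export BGP-AOS-Policy-Tenant-A;
                }
            }
        }
    }
    Tenant-B {
        instance-type vrf;
        interface et-0/0/1:0.0;                  /* GPU-facing link, assumed mapping */
        route-distinguisher 10.0.1.1:2002;
        vrf-target target:20002:1;
        protocols {
            evpn {
                ip-prefix-routes {
                    advertise direct-nexthop;
                    encapsulation vxlan;
                    vni 20002;
                    export BGP-AOS-Policy-Tenant-B;   /* assumed Tenant-B analog */
                }
            }
        }
    }
}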

Table 62. Policies Examples for Tenant-A and Tenant-B Across Stripe 1 and Stripe 2

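As an illustrative sketch of the Tenant-A export policy logic described in item 5 below (term names are assumptions; the community values are those given for Stripe1-Leaf1):

policy-options {
    policy-statement BGP-AOS-Policy-Tenant-A {
        term PodNetworks {
            from policy AllPodNetworks-Tenant-A;
            then accept;
        }
        term RejectRest {
            then reject;
        }
    }
    policy-statement AllPodNetworks-Tenant-A {
        term DirectV4 {
            from {
                family inet;
                protocol direct;
            }
            then {
                community add TENANT-A_COMMUNITY_V4;
                accept;
            }
        }
        term RejectRest {
            then reject;
        }
    }
    community TENANT-A_COMMUNITY_V4 members [ 5:20007 21002:26000 ];
}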

The routing instances create separate routing spaces for the two tenants, providing full route and traffic isolation across the EVPN-VXLAN fabric. Each routing instance has been configured with the following key elements:

  1. Interfaces: The interfaces listed under each tenant VRF (e.g. et-0/0/0:0.0 and et-0/0/1:0.0) are explicitly added to the corresponding routing table. By placing these interfaces under the VRF, all routing decisions and traffic forwarding associated with them are isolated from other tenants and from the global routing table. Assigning an interface that connects a particular GPU to the leaf node effectively maps that GPU to a specific tenant, isolating it from GPUs assigned to other tenants.
  2. Route-distinguisher (RD):

    10.0.1.1:2001 and 10.0.1.1:2002 uniquely identify EVPN routes from Tenant-A and Tenant-B, respectively. Even if both tenants use overlapping IP prefixes, the RD ensures their routes remain distinct in the BGP control plane. Although the GPU-to-leaf links use unique /31 prefixes, an RD is still required to advertise these routes over EVPN.

  3. Route target (RT) community:

    VRF targets 20001:1 and 20002:1 control which routes are exported from and imported into each tenant routing table. These values determine which routes are shared between VRFs that belong to the same tenant across the fabric and are essential for enabling fabric-wide tenant connectivity—for example, when a tenant has GPUs assigned to multiple servers across different stripes.

  4. Protocols evpn parameters:
    • The ip-prefix-routes statement controls how IP prefix routes (EVPN Type 5 routes) are advertised.
    • The advertise direct-nexthop statement enables the leaf node to send IP prefix information using EVPN pure Type 5 routes. These routes include a Router MAC extended community, which allows the remote VTEP to resolve the next-hop MAC address without relying on Type 2 routes.
    • The encapsulation vxlan indicates that the payload traffic for this tenant will be encapsulated using VXLAN. The same type of encapsulation must be used end to end.
    • The VXLAN Network Identifier (VNI) acts as the encapsulation tag for traffic sent across the EVPN-VXLAN fabric. When EVPN Type 5 (IP Prefix) routes are advertised, the associated VNI is included in the BGP update. This ensures that remote VTEPs can identify the correct VXLAN segment for returning traffic to the tenant’s VRF.

      Unlike traditional use cases where a VNI maps to a single Layer 2 segment, in EVPN Type 5 the VNI represents the tenant-wide Layer 3 routing domain. All point-to-point subnets that belong to the same VRF, such as the /31 links between the GPU servers and the leaf, are advertised with the same VNI.

    • In this configuration, VNIs 20001 and 20002 are mapped to the Tenant-A and Tenant-B VRFs, respectively. All traffic destined for interfaces in Tenant-A will be forwarded using VNI 20001, and all traffic for Tenant-B will use VNI 20002.
    • Notice that the same VNI is configured for the tenant on both Stripe1-Leaf1 and Stripe2-Leaf1.
    • The export policy BGP-AOS-Policy-Tenant-A controls which prefixes from this VRF are allowed to be advertised into EVPN.
  5. Export Policy Logic
    • EVPN Type 5 routes from Tenant-A are exported if they are accepted by the BGP-AOS-Policy-Tenant-A export policy, which references a nested policy named AllPodNetworks-Tenant-A.
    • Policy BGP-AOS-Policy-Tenant-A accepts any route that is permitted by the AllPodNetworks-Tenant-A policy and explicitly rejects all other routes.
    • Policy AllPodNetworks-Tenant-A accepts directly connected IPv4 routes (family inet, protocol direct) that are part of the Tenant-A VRF. It tags these routes with the TENANT-A_COMMUNITY_V4 community (5:20007 21002:26000) before accepting them. All other routes are rejected.

As a result, only the directly connected IPv4 routes from Tenant-A (the /31 links between GPU servers and the leaf) are exported as EVPN Type 5 routes.

To verify the interface assignments to the different tenants, use show interfaces routing-instance <tenant-name> terse.

You can also check the direct routes installed in the corresponding routing table, for example with show route table <tenant-name>.inet.0 protocol direct:

To verify EVPN L3 contexts, including the encapsulation, VNI, and router MAC address, use show evpn l3-context.

Add <tenant-name> and the extensive option for more details.

When EVPN Type 5 is used to implement L3 tenant isolation across a VXLAN fabric, multiple routing tables are instantiated on each participating leaf node. These tables are responsible for managing control-plane separation, enforcing tenant boundaries, and supporting the overlay forwarding model. Each routing instance (VRF) creates its own set of routing and forwarding tables, in addition to the global and EVPN-specific tables used for fabric-wide communication. These tables are listed in table 63.

Table 63. Routing and Forwarding Tables for EVPN Type 5

TABLE DESCRIPTION

bgp.evpn.0

Holds EVPN route information received via BGP, including Type 5 (IP Prefix) routes and other EVPN route types. This is the control-plane source for EVPN-learned routes.

:vxlan.inet.0

Used internally for VXLAN tunnel resolution. Maps VTEP IP addresses to physical next hops.

<tenant>.inet.0

The tenant-specific IPv4 unicast routing table. Contains directly connected and EVPN-imported Type 5 prefixes for that tenant. Used for routing data-plane traffic.

<tenant>.evpn.0

The tenant-specific EVPN table, populated with the EVPN routes whose route target matches the tenant VRF.

The protocol next hop is extracted from each EVPN route and resolved through inet.0; the resolved tunnel next hop is placed in :vxlan.inet.0, while the EVPN route itself is added to the bgp.evpn.0 table.

The route-target community value is used to determine which tenant the route belongs to, and the route is placed in <tenant>.evpn.0. From there, IPv4 routes are imported into <tenant>.inet.0 and used for route lookups when traffic arrives on the interfaces belonging to the VRF.

The IPv4 EBGP sessions advertising EVPN routes for Tenant-A and Tenant-B should be established, and the routes should be installed in both the bgp.evpn.0 table and the <tenant>.inet.0 table.

To check that EVPN routes are being advertised, use show route advertising-protocol bgp <neighbor>. For a specific route, use the match-prefix option and include the entire EVPN prefix, as shown in the example below:

The /248 prefixes represent EVPN Type 5 routes, each advertising an IPv4 prefix from the links connecting the GPU servers and leaf nodes.

For example, 5:10.0.1.2:2001::0::10.200.0.0::31/248 is an EVPN Type 5 route for prefix 10.200.0.0/31, where:

Table 64. EVPN Type 5 Route Advertisement Fields Description.

Name Value Description
Route type 5 Indicates the route is a Type 5 (IP Prefix) route
Route Distinguisher 10.0.1.2:2001 Uniquely identifies routes from the originating tenant VRF
Placeholder fields ::0:: For MAC address and other Type 2-related fields (not used here)
IP Prefix 10.200.0.0::31 The actual prefix being advertised
VNI 20001 VNI to push for traffic to the destination
Advertising router 10.0.0.1 (Spine 1) Spine node the route was received from

To check that EVPN routes are being received, use show route receive-protocol bgp <neighbor>. For a specific route, use the match-prefix option and include the entire EVPN prefix, as shown in the example below:

Note:

The examples show routes received from Spine 1, but each route is received from all four spine nodes, which you can also confirm by entering:

Additional information for a given route can be found using the extensive keyword:

Table 65. EVPN Type 5 Route Advertisement Fields Description - Extensive

Name Value Description
Route type 5 Indicates the route is a Type 5 (IP Prefix) route
Route Distinguisher 10.0.1.2:2001 Uniquely identifies routes from the originating tenant VRF
Placeholder fields ::0:: For MAC address and other Type 2-related fields (not used here)
IP Prefix 10.200.105.0::24 The actual prefix being advertised
VNI 20001 VNI to push for traffic to the destination
Advertising router 10.0.0.1 Spine node the route was received from
Protocol next hop 10.0.1.2 (Stripe 1 Leaf 2) Router that originated the EVPN route (remote VTEP)
Encapsulation Type: 0x08 Standardized IANA-assigned value for VXLAN encapsulation in the EVPN Encapsulation extended community (RFC 9014)
Route target target:20001:1 Identifies the route as belonging to Tenant-A

To check that the routes are being imported into the corresponding tenant routing tables, use show route table <tenant-name>.inet.0 protocol evpn, as shown in the example below: