EVPN Maintenance Mode for Multihomed Leaf Isolation
EVPN Maintenance Mode enables seamless maintenance and upgrades in EVPN environments by isolating multihomed ERB Leaf nodes, rerouting traffic to maintain network stability.
EVPN maintenance mode streamlines maintenance and upgrade operations in EVPN environments by minimizing traffic disruption. Isolating devices in a controlled manner helps prevent service interruptions during upgrades. This feature operates with multihomed edge-routed bridging (ERB) leaf nodes, rerouting traffic to alternate nodes to maintain network stability. It includes pre-checks and validations to ensure that multihomed configurations, service tracking, and non-revertive designated forwarder (DF) settings are correctly configured. Non-revertive DF and egress link protection (ELP) enhance resilience, prevent traffic loss, and improve convergence times, while network isolation (NISO) profiles help maintain service continuity through core isolation management. You use command-line interface (CLI) commands to configure, validate, and monitor maintenance mode, and to ensure that all pre-check requirements are met before network changes.
Benefits of Maintenance Mode CLI
- Minimizes network disruption by isolating multihomed ERB leaf nodes during upgrades, ensuring continued network operations.
- Enhances system resilience with non-revertive DF election, maintaining stability even during node outages.
- Improves route convergence times through ELP, ensuring efficient traffic rerouting and reduced downtime.
- Supports multihoming configurations, providing robust connectivity options that maintain network performance during maintenance activities.
- Ensures compatibility and optimal performance by adhering to recommended pre-check and post-check procedures, reducing the risk of upgrade failures.
- Facilitates service continuity and stability during maintenance using NISO profiles to manage core isolation and interface actions effectively.
Overview
EVPN maintenance mode is a sophisticated feature that facilitates seamless maintenance and
upgrades in EVPN networks by isolating multihomed ERB leaf nodes and shutting down
access-side interfaces. Activating this mode minimizes traffic disruption by ensuring
effective traffic rerouting, maintaining network performance during maintenance and system
upgrades. You initiate maintenance mode using the CLI command
set protocols evpn maintenance-mode erb-leaf action-type (validate-only|validate-and-set|force).
The validate-only, validate-and-set, and force options control whether the
pre-checks run and whether the node enters maintenance mode.
You can validate configurations before and after starting maintenance mode with the
request evpn
validate-maintenance-mode-pre-checks command. Then use show evpn
maintenance-mode-status to view results of the pre-checks and monitor
the current status. If you change any maintenance mode related configurations, rerun
request evpn
validate-maintenance-mode-pre-checks to update the validation.
The egress link protection (ELP) and fast reroute (FRR) features are integral to maintaining network reliability, especially during link failures. When a link to a customer edge device fails, these features facilitate rapid route convergence, rerouting traffic swiftly to avoid significant data loss. This proactive approach ensures that during maintenance or unexpected failures, the network remains stable and operational. The non-revertive DF election further stabilizes the network by maintaining consistent designated forwarder status even after node reboots, preventing unnecessary traffic switching and ensuring a seamless transition back to normal operations. Network isolation (NISO) profiles also play a crucial role in core isolation management, vital during maintenance activities. These profiles manage interface actions and service-tracking mechanisms to ensure continuity and stability.
EVPN Maintenance Mode Prerequisites
Before initiating EVPN maintenance mode, prepare your network by ensuring out-of-band management capabilities and bandwidth availability to handle potential traffic loads when traffic is diverted to the multihomed peer. This feature does not support single-homed configurations or in-band network management during maintenance.
Follow the operational guidelines, which include conducting pre-maintenance health checks, backing up configurations, and cleaning system storage. After completing the maintenance activity, verify the system and network status to confirm stability and performance.
The maintenance mode CLI feature has the following prerequisites for implementation.
- Multihomed EVPN instances with the ESI configured with df-election-type preference non-revertive.
- NISO profiles with service-tracking core-isolation and service-tracking-action link-down configurations.
- Access interfaces configured with hold-time up for better traffic convergence post maintenance.
- (EVPN-MPLS) ELP (evpn-egress-link-protection) configured under the routing-options forwarding-table hierarchy.
- (EVPN-VXLAN) FRR (reroute-address) configured under the forwarding-options evpn-vxlan hierarchy.
- Junos OS Release 25.3R1 or later.
When you remove the maintenance-mode configurations, the hold-time configured in the NISO groups is not
triggered. Therefore, configure hold-time up on the access interfaces.
Non-revertive DF
The non-revertive option for Preference-Based DF Election ensures that once a DF is elected, it will not be preempted by the previously designated DF coming back online after a failure. This non-revertive mode is key in maintaining a stable network environment and avoiding unnecessary service disruptions.
interfaces {
    ge-0/1/0 {
        esi {
            00:01:02:03:04:05:06:07:08:09;
            (single-active | all-active);
            df-election-type {
                preference {
                    value (0-65535);
                    non-revertive;
                    least;
                }
            }
        }
    }
}
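As a rough illustration, the hierarchy above could be entered in set-command form along the following lines. The interface name ae0, the ESI value, the all-active mode, and the preference value of 100 are placeholder assumptions chosen for the example, not values from a specific deployment.

```text
# Illustrative values only: ae0, the ESI, and preference 100 are assumptions.
set interfaces ae0 esi 00:01:01:01:01:01:01:01:01:01
set interfaces ae0 esi all-active
set interfaces ae0 esi df-election-type preference value 100
# non-revertive keeps the elected DF in place when the former DF comes back online.
set interfaces ae0 esi df-election-type preference non-revertive
```

The maintenance-mode pre-checks look for the non-revertive setting on multihomed ESIs, so it makes sense to apply it to every CE-facing ESI interface on the leaf.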
Network Isolation Groups and Service Tracking
Network isolation (NISO) groups define what to track to detect core isolation and what action to take on layer 2 interfaces. Core isolation tracking can be based on core side link tracking or service tracking.
You define NISO groups using the set protocols network-isolation
group statement. Then you associate the group name with CE-facing interfaces
using the network-isolation-profile
network-isolation-group-name statement under the
[edit interfaces interface-name] hierarchy. This
configures those interfaces for fast switchover of traffic in the event of core isolation.
The maintenance mode CLI disables access-facing interfaces by sending an EVPN
maintenance-mode message to the Layer 2 Address Learning Daemon (L2ALD). This causes the
interfaces connected to the multihomed CE to go into an LACP out-of-sync (OOS) state.
However, the LACP interfaces might not go down right away, so you must configure the
service-tracking core-isolation and service-tracking-action link-down statements in the NISO group to
bring down the maintenance node's interfaces immediately, which allows faster
traffic convergence. Additionally, setting hold-time down to 0 on the interface
(interfaces interface-name hold-time down 0) ensures that the interface shuts down
immediately when the maintenance mode CLI is enabled.
You can use the interfaces interface-name hold-time up
statement to configure a short delay when bringing the access-facing interfaces back up.
This improves traffic convergence by enabling the core-facing interfaces to come up and
synchronize routes with the neighboring PEs first. The hold-time configured in the NISO group
is not triggered when you remove the maintenance mode CLI configuration.
interfaces {
    ae1 {
        hold-time up 240000 down 0;
        esi {
            00:11:00:11:11:11:11:11:11:11;
            (all-active | single-active);
        }
        network-isolation-profile network-isolation-group-name;
    }
}
protocols {
    network-isolation {
        group network-isolation-group-name {
            detection {
                hold-time {
                    up time-in-milliseconds;
                }
                service-tracking {
                    core-isolation;
                }
            }
            service-tracking-action {
                (link-down | lacp-oos);
            }
        }
    }
}
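The same NISO configuration can be sketched as set commands. The group name NISO-MM and the interface ae1 are hypothetical names used only for illustration. The hold-time values mirror the template above and should be tuned for your network.

```text
# NISO-MM and ae1 are hypothetical names used for illustration.
set protocols network-isolation group NISO-MM detection service-tracking core-isolation
set protocols network-isolation group NISO-MM service-tracking-action link-down
# Attach the NISO group to the CE-facing interface.
set interfaces ae1 network-isolation-profile NISO-MM
# Go down immediately (0 ms) and delay bring-up (240000 ms) so the core side can converge first.
set interfaces ae1 hold-time up 240000 down 0
```

This sketch uses link-down rather than lacp-oos as the service-tracking action because the text above requires link-down to bring the interfaces down immediately.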
ELP and FRR Configurations
You configure the evpn-egress-link-protection statement for EVPN-MPLS or
the reroute-address statement for EVPN-VXLAN to enable the egress link
protection fast reroute (ELP/FRR) feature on multihoming peer provider edge (PE) devices.
This feature helps to minimize load-balanced traffic loss when the link from a PE device
to a multihomed CE device goes down.
The ELP/FRR check passes when there is a valid EVPN-MPLS or EVPN-VXLAN style configuration as shown below.
EVPN-MPLS configuration:
routing-options {
    forwarding-table {
        evpn-egress-link-protection;
    }
}
EVPN-VXLAN configuration:
forwarding-options {
    evpn-vxlan {
        reroute-address {
            inet {
                reroute-ip-address;
            }
        }
    }
}
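In set-command form, the two variants might look like the following sketch. Configure only the one that matches your encapsulation. The address 192.0.2.1 is a documentation-range placeholder and an assumption, not a recommended value.

```text
# EVPN-MPLS: enable egress link protection in the forwarding table.
set routing-options forwarding-table evpn-egress-link-protection

# EVPN-VXLAN: 192.0.2.1 is a placeholder for the reroute address.
set forwarding-options evpn-vxlan reroute-address inet 192.0.2.1
```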
Command Utilization and Best Practices
To effectively implement the maintenance mode CLI, it is essential to understand the commands and their usage.
The request
evpn validate-maintenance-mode-pre-checks erb-leaf command verifies that
the device configuration meets the maintenance-mode prerequisites. This
command initiates the same validation checks as the set protocols evpn maintenance-mode
erb-leaf action-type (validate-only|validate-and-set) configuration and can be
run before, during, or after configuring maintenance-mode. You should run
this command after any configuration changes to ensure those changes are considered during
re-validation. Then use the show evpn
maintenance-mode-status command to check the status.
The show evpn maintenance-mode-status command displays the results of the most recent
validation checks and whether the current EVPN Maintenance Status
is Under Maintenance or Not-under Maintenance.
- Under Maintenance: Indicates that the access-facing interfaces are disabled or down, all EVPN Type 2 and Type 4 routes have been withdrawn, and the node is ready for maintenance or upgrade.
- Not-under Maintenance: Indicates that the node is not in maintenance-mode. Use the set protocols evpn maintenance-mode erb-leaf action-type command with the validate-and-set or force option to enter maintenance-mode.
You should regularly consult show evpn
maintenance-mode-status to assess the current mode and review validation
results. These insights are crucial for making informed decisions during maintenance
activities.
show evpn maintenance-mode-status
user@Leaf-1> show evpn maintenance-mode-status
Pre-check Validation Status : PASS
ELP / FRR : PASS
NISO-Profile : PASS
Non-revertive DF : PASS
EVPN Maintenance Status : (Under Maintenance | Not-under Maintenance)
The show evpn
maintenance-mode-status extensive output provides status details on
individual EVPN instances. This output also displays untracked interfaces. These are
non-EVPN interfaces that might go down during the maintenance if they have a NISO profile
configured.
show evpn maintenance-mode-status extensive
user@Leaf-1> show evpn maintenance-mode-status extensive
Pre-check Validation Status : PASS
ELP / FRR : PASS
NISO-Profile : PASS
Non-revertive DF : PASS
EVPN Maintenance Status : Under Maintenance
Instance: rd_pr_mac_vrf_1
Encapsulation type: VXLAN
FRR : FAIL
Instance: test
Encapsulation type: VXLAN
FRR : FAIL
Interfaces : CE-facing-Intf-Name
ae0.0 ESI: 00:01:01:01:01:01:01:01:01:01
Multihomed : Yes (all-active)
NISO Status : PASS
Non-revertive DF Status : PASS
Untracked Interfaces: ae1
ae2
ge-0/0/1
You configure the set protocols evpn maintenance-mode
erb-leaf action-type (validate-only|validate-and-set|force) statement to initiate
maintenance-mode. The validate-only option allows you to
run pre-checks without altering the network, while validate-and-set
executes the checks and transitions the node into maintenance-mode if the
checks succeed. The force option bypasses the validation checks. Use this
option with caution because it can cause traffic loss if the required configurations are
missing.
The validation options verify that your network is adequately configured before the node enters maintenance mode. The pre-checks verify multihomed configurations, service-tracking actions, and non-revertive DF settings. Meeting these conditions reduces the risk of service disruption during maintenance. Additionally, non-revertive DF election and ELP with fast reroute (FRR) improve network convergence times and resilience, which helps prevent traffic loss during maintenance operations.
The set protocols evpn maintenance-mode and set protocols l2-learning niso-maintenance
down statements cannot be configured at the same time.
Use the delete protocols evpn maintenance-mode command to remove the
maintenance-mode configuration, bring the EVPN instances back online,
and resume normal operations. You can do this as soon as maintenance is completed to
restart EVPN immediately. Alternatively, you can leave maintenance-mode configured
and remove it after a reboot if necessary.
For example, if you reboot the node and maintenance-mode is configured
with the force option, all the CE-facing interfaces remain down. If
maintenance-mode is configured with the validate-and-set
option, the node validates the pre-checks and acts accordingly.
Alternatively, if you change maintenance-mode to use the
validate-only option, the node clears maintenance mode
and brings the CE-facing interfaces up.
Upgrade an ERB Leaf Node
In the edge-routed bridging (ERB) design the spine devices provide only IP connectivity between the leaf devices. They require no VXLAN functionality and are referred to as lean spines. The leaf devices provide connectivity to attached workloads and provide layer 2 (L2) and layer 3 (L3) VXLAN functionality in the overlay network. L2 gateways provide bridging within the same VLAN and L3 gateways handle the traffic between VLANs using integrated routing and bridging (IRB) interfaces.
You define NISO groups and attach them to layer 2 or CE facing interfaces for fast switchover of traffic in the event of core isolation. These groups define what to track to detect core isolation and what action to take on layer 2 interfaces. You can use the maintenance mode CLI to leverage the NISO groups during maintenance or upgrades to simplify isolating multihomed nodes and minimize traffic disruption.
The Junos OS Upgrade for EVPN VXLAN Network document provides detailed recommended procedures for upgrading an EVPN-VXLAN ERB leaf node. Chapter 2 covers the guidelines for planning the upgrade and includes the recommended pre-upgrade and post-upgrade health checks. Chapter 3 provides an overview and specific steps for upgrading an ERB leaf node. You can use the maintenance mode CLI to isolate the multihomed access interfaces for the steps that disable and enable the end-device facing access interfaces.
Enter EVPN Maintenance Mode
EVPN maintenance mode leverages the NISO configurations to disable the end-device facing access interfaces and divert traffic towards multihomed peer devices. This helps to isolate the node and prepare it for maintenance.
Use the following steps to enable maintenance mode.
- Use the request evpn validate-maintenance-mode-pre-checks erb-leaf and show evpn maintenance-mode-status commands to ensure that the device configuration and state meet the prerequisites for EVPN maintenance mode.
- Enter maintenance mode by configuring the set protocols evpn maintenance-mode erb-leaf action-type (validate-only|validate-and-set|force) statement with either the validate-and-set or the force option.
  - validate-only: Initiate the EVPN maintenance-mode pre-check process and provide the validation results.
  - validate-and-set: Initiate the EVPN maintenance-mode pre-checks and, if the validations pass, enter maintenance-mode.
  - force: Skip the EVPN maintenance-mode pre-checks and enter maintenance-mode immediately.
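The steps above can be sketched as a CLI session. This is an illustrative sequence rather than captured output: the hostname Leaf-1 is taken from the sample output earlier in this document, and command output is omitted.

```text
# Step 1: run the pre-checks and review the results (operational mode).
user@Leaf-1> request evpn validate-maintenance-mode-pre-checks erb-leaf
user@Leaf-1> show evpn maintenance-mode-status

# Step 2: enter maintenance mode only if the pre-checks pass (configuration mode).
user@Leaf-1> configure
user@Leaf-1# set protocols evpn maintenance-mode erb-leaf action-type validate-and-set
user@Leaf-1# commit

# Confirm that the EVPN Maintenance Status field reports Under Maintenance.
user@Leaf-1# run show evpn maintenance-mode-status
```

The sketch uses validate-and-set rather than force so that the node enters maintenance mode only when the ELP/FRR, NISO profile, and non-revertive DF checks pass.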
Exit EVPN Maintenance Mode
You exit maintenance mode to enable the end-device facing access interfaces that maintenance mode disabled. The maintenance mode CLI configurations remain active as long as they are configured, even after a reboot. For example:
- If you configure maintenance-mode with the validate-and-set option, it will validate the pre-checks after rebooting and act accordingly.
- If you configure maintenance-mode with the force option, it will skip the pre-checks after rebooting and re-enter maintenance-mode immediately.
You can exit maintenance-mode by changing the
maintenance-mode option to validate-only or by
removing the maintenance-mode CLI configuration as follows:
- Use set protocols evpn maintenance-mode erb-leaf action-type validate-only to exit maintenance-mode but still run the validation checks.
- Use the delete protocols evpn maintenance-mode command to exit maintenance-mode by removing the configuration.
When you exit maintenance mode, the hold-time configured in the NISO groups is not triggered. Therefore,
configure hold-time up on the access interfaces.
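A minimal sketch of exiting maintenance mode by removing the configuration follows. The hostname Leaf-1 comes from the sample output earlier in this document, and command output is omitted rather than guessed.

```text
# Remove the maintenance-mode configuration to bring the access interfaces back up.
user@Leaf-1> configure
user@Leaf-1# delete protocols evpn maintenance-mode
user@Leaf-1# commit

# Confirm that the EVPN Maintenance Status field reports Not-under Maintenance.
user@Leaf-1# run show evpn maintenance-mode-status
```

To keep the validation checks running after exit, replace the delete statement with set protocols evpn maintenance-mode erb-leaf action-type validate-only.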