QFX Series MC-LAG Fabric Upgrade Procedure

About This Network Configuration Example

This network configuration example (NCE) shows how to manually upgrade an MC-LAG pair of QFX Series devices. The process keeps service disruption to a minimum and has little impact on data center workloads.

Use Case Overview

To eliminate the access switch as a single point of failure in a data center environment, multichassis link aggregation groups (MC-LAGs) enable a client device to form a logical LAG interface between two MC-LAG peers. An MC-LAG provides redundancy and load balancing between the two MC-LAG peers, multihoming support, and a loop-free Layer 2 network without running STP. This example uses a basic MC-LAG configuration, but you can use this process for many different use cases.

This example does not cover how to perform a non-stop software upgrade (NSSU).

Technical Overview

Manually upgrading MC-LAG peers is similar to an NSSU. The manual upgrade process relies on a high-availability design: you systematically remove one device from service, upgrade it, and reboot it. When servers are dual-homed to both MC-LAG peers, the network can handle the removal of one peer during the upgrade window. Overall network bandwidth is reduced during the process, but the network remains available.

The MC-LAG is in active-active state and uses the Inter-Chassis Control Protocol (ICCP) to keep the device state synchronized between the members of the MC-LAG. While one peer handles the traffic, the other peer is taken offline to upgrade the software.

Figure 1 illustrates a basic MC-LAG topology.

Figure 1: Basic MC-LAG Topology

Figure description: Network topology diagram with a core device, two Juniper QFX5100 switches labeled QFX5100-A and QFX5100-B, and a server. Aggregated Ethernet interfaces ae0 and ae1 connect the devices for link aggregation.

Here’s the sequence of events that occurs during an upgrade of two MC-LAG peers (Node 1 and Node 2):

  1. All traffic is shifted from Node 1 to Node 2.

  2. Node 1 is no longer handling traffic, so the MC-LAG is no longer operational.

  3. Software is installed on Node 1, and Node 1 then reboots.

  4. Node 1 comes online, and all traffic is shifted from Node 2 to Node 1.

  5. Software is installed on Node 2, and Node 2 then reboots.

  6. When Node 2 is online, the MC-LAG interfaces are re-enabled between Node 1 and Node 2.
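The six events above can be modeled as a small table that records which node is forwarding traffic after each event. The following Python sketch is only an illustration of the availability argument; the class and function names are hypothetical, and the forwarding flags are our reading of the list, not output from a device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    node1_forwarding: bool  # True if Node 1 carries traffic after this event
    node2_forwarding: bool  # True if Node 2 carries traffic after this event


# The six events from the list above, in order.
SEQUENCE = [
    Step("Shift all traffic from Node 1 to Node 2", False, True),
    Step("Node 1 idle; MC-LAG no longer operational", False, True),
    Step("Install software on Node 1 and reboot it", False, True),
    Step("Node 1 online; shift traffic from Node 2 to Node 1", True, False),
    Step("Install software on Node 2 and reboot it", True, False),
    Step("Node 2 online; re-enable MC-LAG interfaces", True, True),
]


def always_available(sequence):
    """Return True if at least one node forwards traffic after every event."""
    return all(s.node1_forwarding or s.node2_forwarding for s in sequence)


def reduced_bandwidth_steps(sequence):
    """Return the 1-based numbers of events after which only one node forwards."""
    return [
        number
        for number, s in enumerate(sequence, start=1)
        if s.node1_forwarding != s.node2_forwarding
    ]


if __name__ == "__main__":
    print("Network stays available:", always_available(SEQUENCE))
    print("Single-node steps:", reduced_bandwidth_steps(SEQUENCE))
```

In this model the network is never without a forwarding node, and only one peer carries traffic from the first event until the MC-LAG is re-established in the last one.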

How to Perform a QFX Series MC-LAG Fabric Upgrade

Requirements

This example uses the following hardware and software components:

  • Two QFX5100 devices running Junos OS Release 18.2R3-S3

  • Junos OS Release 18.4R3.3

  • A test server running Ubuntu Linux 16.04

Overview

To ensure a minimum of downtime, upgrading between software releases requires a sequence of steps coordinated among all of the network elements. This topology uses servers with redundant connections to the MC-LAG to achieve high availability during the switchover between MC-LAG peers.

To upgrade the fabric to a new version of Junos OS with minimal traffic disruption, you need to disable the MC-LAG and upgrade the MC-LAG peers as standalone units. After the software has been upgraded on both MC-LAG peers, you will re-connect them and re-establish the MC-LAG.

Topology

Figure 2 illustrates the MC-LAG topology referred to in this example.

Figure 2: Topology

Figure description: Network topology diagram with two Juniper QFX5100 switches, a server, and Demo Core Device. Switches connected via aggregated Ethernet ae0. VRRP configured with virtual IP 10.1.1.1/24; Node 1 is primary. Server connected to switches via ae1, part of VLANs v100 and v500, IP 10.1.1.11/24. Demonstrates redundant high-availability network design.

QFX Series MC-LAG Fabric Upgrade Configuration

Prepare for the Upgrade

Step-by-Step Procedure

Use this procedure to upgrade both peers of an MC-LAG fabric consisting of QFX5100 switches to the same Junos OS release. We strongly recommend that both members of the MC-LAG be the same platform.

This configuration example shows how to manually upgrade MC-LAG peers from Junos OS Release 18.2R3-S3 to Junos OS Release 18.4R3.3.

  1. Verify that the MC-LAG state is operational between both MC-LAG peers by checking the MC-LAG parameters.

Upgrade the QFX Series MC-LAG Fabric

Procedure

Step-by-Step Procedure
  1. Copy the new Junos OS software image to the /var/tmp directories on both peers.

    Copying the software on both MC-LAG peers stages the software for the upgrade procedure. The copy operation takes some time to complete while it transfers the Junos OS software images from the server to the MC-LAG peers.

  2. Disable the server-facing interfaces on QFX5100-A to minimize disruption during the switchover to QFX5100-B.

    Figure 3: Disabling the Server-Facing Interface on QFX5100-A

    Figure description: Network topology diagram with two switches QFX5100-A and QFX5100-B, a server, and a core device. Core device connects to both switches via et-0/0/52. Node 1 has a failed xe-0/0/10 interface, IP 10.1.1.9/24, priority 200. Node 2 has IP 10.1.1.10/24, priority 100. Server connects through interfaces enp7s0f0 and enp7s0f1 aggregated into ae1 with IP 10.1.1.11/24. VRRP virtual IP is 10.1.1.1/24. VLANs v100 and v500 configured network-wide. Aggregated Ethernet ae0 links Node 1 and Node 2.

  3. Disable the uplink interfaces on QFX5100-A.

    Figure 4: Disabling the Uplink Interface on QFX5100-A

    Figure description: Network topology with two Juniper QFX5100 switches, a server, and a Demo Core Device using VLANs, VRRP, and aggregated Ethernet interfaces for redundancy and high availability.

  4. Disable the interfaces between QFX5100-A and QFX5100-B.

    This breaks up the MC-LAG.

    Figure 5: Disabling Interfaces Between QFX5100-A and QFX5100-B

    Figure description: Network topology diagram with two Juniper QFX5100 switches, a server, and a core device. Node 1 has priority 200; Node 2 has priority 100. VRRP IP is 10.1.1.1/24. Server IP is 10.1.1.11/24. VLANs v100 and v500 configured. Aggregated Ethernet interfaces ae0 and ae1 provide redundancy. Red X marks indicate failed connections.

  5. Upgrade QFX5100-A.

    Figure 6: Upgrading QFX5100-A

    Figure description: Network diagram of two nodes QFX5100-A and QFX5100-B using VRRP with IP 10.1.1.1/24 and VLANs v100, v500. Node 1 priority 200, Node 2 priority 100. Server IP 10.1.1.11/24 connected via enp7s0f0 and enp7s0f1. Red crosses mark disabled connections.

  6. To redirect the traffic from QFX5100-B to QFX5100-A, re-enable the server-facing and uplink interfaces on QFX5100-A.

    Figure 7: Re-enabling Server-Facing and Uplink Interfaces

    Figure description: Network topology diagram with two Juniper QFX5100 switches, a server, and Demo Core Device. Node 1, QFX5100-A, interfaces xe-0/0/8, xe-0/0/9, xe-0/0/10 in subnet 10.1.1.9/24, is VRRP master. Node 2, QFX5100-B, interfaces xe-0/0/8, xe-0/0/9, xe-0/0/10 in subnet 10.1.1.10/24, is VRRP backup. Server interfaces enp7s0f0, enp7s0f1 aggregated to ae1 in VLANs v100, v500, IP 10.1.1.11/24. Red X indicates issue with ae0 link between switches.

  7. Disable the server-facing interfaces on QFX5100-B.

    Figure 8: Disabling Server-Facing Interfaces on QFX5100-B

    Figure description: Network topology with two Juniper QFX5100 switches Node 1 and Node 2, a core device, and a server. Node 1, VRRP master, connects to the core and server, with VRRP priority 200. Node 2 acts as backup, VRRP priority 100. Both switches and server use VLANs v100 and v500. Aggregated Ethernet interfaces ae0 and ae1 enable link aggregation. Red X marks indicate failed/disconnected links between the switches and core device.

  8. Disable the uplink interfaces on QFX5100-B, so that the traffic goes through QFX5100-A.

  9. Upgrade QFX5100-B.

    Figure 10: Upgrading QFX5100-B

    Figure description: Network topology with two Juniper QFX5100 switches, a server, and core device. VLANs, VRRP with virtual IP 10.1.1.1/24, aggregated Ethernet interfaces ae0 and ae1 are shown. Node 1 is active with VRRP priority 200. Red X marks indicate link failures.

  10. Re-enable the ICCP-PL interface between QFX5100-A and QFX5100-B.

    Figure description: Network topology with two Juniper QFX5100 switches, server, and core device. Node 1 is VRRP master with IP 10.1.1.9/24. Node 2 has links down, IP 10.1.1.10/24. Server connects via aggregated links, IP 10.1.1.11/24. VLANs v100 and v500 configured. VRRP virtual IP is 10.1.1.1/24.

  11. Re-enable the server-facing and uplink interfaces on QFX5100-B.

    Figure description: Network topology diagram with two Juniper QFX5100 switches, core device, server, VLANs v100 and v500, link aggregation ae0 and ae1, and VRRP for redundancy.
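To make the order of steps 2 through 11 concrete, the following Python sketch prints a possible command plan for each peer. It is a sketch under stated assumptions, not the commands from this example: the interface roles (xe-0/0/10 as server-facing, et-0/0/52 as uplink, ae0 as the inter-peer link) are read from the figure descriptions, the image path is a placeholder, and the sketch assumes the inter-peer link is disabled and later re-enabled on QFX5100-A. The Junos statements it emits (`set interfaces <name> disable`, `delete interfaces <name> disable`, `commit`, and `request system software add <package> reboot`) are standard, but confirm the exact options for your platform and release before using them.

```python
# Assumed interface roles, taken from the figure descriptions; adjust to your fabric.
SERVER_FACING = ["xe-0/0/10"]
UPLINKS = ["et-0/0/52"]
INTER_PEER = ["ae0"]

# Placeholder: substitute the image file you copied to /var/tmp in step 1.
IMAGE = "/var/tmp/<junos-image>.tgz"


def disable(interfaces):
    """Configuration-mode commands that administratively disable interfaces."""
    return [f"set interfaces {name} disable" for name in interfaces] + ["commit"]


def enable(interfaces):
    """Configuration-mode commands that remove the disable statement."""
    return [f"delete interfaces {name} disable" for name in interfaces] + ["commit"]


def upgrade(image):
    """Operational-mode command that installs the image and reboots."""
    return [f"request system software add {image} reboot"]


def upgrade_plan():
    """Return (device, step, commands) tuples for steps 2 through 11."""
    a, b = "QFX5100-A", "QFX5100-B"
    return [
        (a, "2. Disable server-facing interfaces", disable(SERVER_FACING)),
        (a, "3. Disable uplink interfaces", disable(UPLINKS)),
        (a, "4. Disable inter-peer interfaces", disable(INTER_PEER)),  # assumed on A
        (a, "5. Upgrade", upgrade(IMAGE)),
        (a, "6. Re-enable server-facing and uplink interfaces",
         enable(SERVER_FACING + UPLINKS)),
        (b, "7. Disable server-facing interfaces", disable(SERVER_FACING)),
        (b, "8. Disable uplink interfaces", disable(UPLINKS)),
        (b, "9. Upgrade", upgrade(IMAGE)),
        (a, "10. Re-enable the ICCP-PL interface", enable(INTER_PEER)),  # assumed on A
        (b, "11. Re-enable server-facing and uplink interfaces",
         enable(SERVER_FACING + UPLINKS)),
    ]


if __name__ == "__main__":
    for device, step, commands in upgrade_plan():
        print(f"[{device}] {step}")
        for command in commands:
            print(f"    {command}")
```

Because the disable statement is part of the committed configuration, an interface disabled before the upgrade stays disabled after the reboot until you delete the statement.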

Verification

Verify that the MC-LAG Fabric is Operational
Purpose

Verify that the MC-LAG Fabric is operational.

Action
Meaning

You can see that the MC-LAG is operational because the MC-AE interface and ICCP connections are up.

Verify that the New Version of Junos OS is Installed
Purpose

Verify that the new version of Junos OS is installed on QFX5100-A and QFX5100-B.

Action
Meaning

You can see that Junos OS 18.4R3.3 is installed on QFX5100-A and QFX5100-B.
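The two verification checks above can be combined into one pass/fail test. The Python sketch below assumes you have already collected the state of each peer, for example from `show version`, `show iccp`, and `show interfaces mc-ae`; parsing that output is not shown, and the `PeerState` class and `problems` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

TARGET_RELEASE = "18.4R3.3"  # the release this example upgrades to


@dataclass
class PeerState:
    name: str
    junos_release: str  # as reported by `show version`
    iccp_up: bool       # ICCP connection state, from `show iccp`
    mc_ae_up: bool      # MC-AE interface state, from `show interfaces mc-ae`


def problems(peers, target=TARGET_RELEASE):
    """Return a list of findings; an empty list means both checks passed."""
    found = []
    for peer in peers:
        if peer.junos_release != target:
            found.append(
                f"{peer.name}: running {peer.junos_release}, expected {target}"
            )
        if not peer.iccp_up:
            found.append(f"{peer.name}: ICCP connection is down")
        if not peer.mc_ae_up:
            found.append(f"{peer.name}: MC-AE interface is down")
    return found


if __name__ == "__main__":
    fabric = [
        PeerState("QFX5100-A", "18.4R3.3", True, True),
        PeerState("QFX5100-B", "18.4R3.3", True, True),
    ]
    print(problems(fabric) or "Fabric is operational on the target release")
```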

Conclusion

Manually upgrading QFX Series MC-LAG Fabric

Device Configuration Details

Procedure

Step-by-Step Procedure

This is the MC-LAG configuration used in this example.

QFX5100-A

QFX5100-B