How to Upgrade a Four-Member QFX Series VCF

About This Network Configuration Example

This network configuration example (NCE) shows how to upgrade a four-member QFX Series Virtual Chassis Fabric (VCF) when the nonstop software upgrade (NSSU) process is either not available or undesirable. This process minimizes service disruption and has minimal impact on data center workloads.

Configuration Example

Requirements

We use the following in this example:

  • A two-spine and two-leaf VCF composed of QFX5100 switches running Junos OS Release 14.1X53-D47.6

  • Pre-provisioned mode VCF that is configured using VCF best practices such as Virtual Chassis graceful Routing Engine switchover (GRES) and nonstop bridging (NSB)

  • Layer 2-only VCF

  • MX Series router as the uplink device

  • Serial console access (mandatory)

  • Junos OS Release 18.4R1.8 as the target release for the upgrade

You can use this approach for an upgrade between any releases as long as all devices in the VCF are running the same release version.

You can use this procedure for the following QFX Series VCFs:

  • A four-member QFX5100 VCF consisting of only QFX5100s

  • A four-member QFX5110 VCF consisting of:

    • Only QFX5110s, or

    • Two QFX5110s as spine devices and two QFX5100s in mixed mode as leaf devices, or

    • Two QFX5110s as spine devices and one QFX5100 and one QFX5110 in mixed mode as leaf devices

The uplink device can be any device with routing functions.
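
The supported compositions listed above can be expressed as a small check. This is a minimal sketch in Python, not part of the original procedure; the function name and the way models are passed in are assumptions made for illustration.

```python
# Sketch: check a four-member VCF against the compositions listed above.
# Model names are plain strings; the function name is hypothetical.

def is_supported_vcf(spines, leaves):
    """Return True if two spines and two leaves match a listed composition."""
    if len(spines) != 2 or len(leaves) != 2:
        return False
    spine_models = set(spines)
    if spine_models == {"QFX5100"}:
        # A QFX5100 VCF must consist of only QFX5100s.
        return set(leaves) == {"QFX5100"}
    if spine_models == {"QFX5110"}:
        # QFX5110 spines allow QFX5110 leaves, QFX5100 leaves, or one of each.
        return set(leaves) <= {"QFX5100", "QFX5110"}
    return False
```

For example, `is_supported_vcf(["QFX5110", "QFX5110"], ["QFX5100", "QFX5110"])` returns `True`, while QFX5100 spines with QFX5110 leaves are rejected because that combination is not in the list.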

Overview

Sometimes it is either not possible or not desirable to upgrade a VCF to another software release using NSSU. This document shows an alternative method to upgrade a four-member QFX Series VCF with minimal downtime. This method does not replace NSSU. It is a minimally invasive alternative to use only when necessary, and it requires careful planning as described in the following steps.

To upgrade the VCF, first divide it into two VCFs that each consist of one Routing Engine and one line card. After rerouting traffic through one VCF, upgrade the other pair of devices. Reroute traffic through the upgraded VCF before upgrading the remaining pair of devices. Restore the four-member VCF by reconnecting the devices one at a time to the new two-member VCF.

You might see alerts during this procedure, including SNMP traps and system log messages.

Topology

Figure 1 illustrates the topology of the VCF. Members 0 and 1 are connected to the uplink device, while the line cards, Members 2 and 3, are connected to the server.

Figure 1: Topology of the VCF

Configuration

Prepare for the Upgrade

Step-by-Step Procedure
  1. Log in to the device as the root user or as another user that you have configured with administrative privileges.

  2. Check the status of the VCF before you begin the upgrade. Note the serial numbers, member IDs, and associated roles of the devices.

  3. Check the Virtual Chassis ports (VCPs) and create a topology diagram for reference. Figure 1 shows the topology of the VCF in this example.

  4. Check that all four members are present and check the Junos OS image running on each device. Every device must be running the same Junos OS version. A member with a mismatched version should show as inactive.

  5. Use FTP to copy the new Junos OS image to the primary Routing Engine. Then copy the new image from the primary Routing Engine to the other VCF members. See Remote Access Overview for how to configure FTP.

    Figure 2 illustrates how the new Junos OS image is distributed among the members.

    Figure 2: Copy Junos OS Image to VCF members

    To copy the image from the /var/tmp directory on the primary Routing Engine to the /var/tmp directory on Member 3 (also called fpc 3), use the following statement:

    Note:

    Copying the image can take a little while, so be patient.

    Do the same for the other members. The FPC number is the same as the member number.

  6. Access each member from the VCF primary Routing Engine and confirm that the file was copied to each member. For example, to access Member 3:

    Next, check the /var/tmp directory on this VCF member for the new Junos OS image.

    When you are done, use exit to get back to the primary device.

    Repeat the image check on each device in the VCF.

  7. When you split the VCF in half, you will temporarily form two Virtual Chassis with two members each. We recommend disabling split detection whenever you form a Virtual Chassis with only two members. If you do not disable split detection, the primary device may take on a linecard role and stop the control and data planes when you disconnect it from the backup Routing Engine later in this example.

    Disable split detection on the primary device.

  8. To check for any traffic loss during the procedure, start a continuous ping from the server to IRB 192.168.100.1 on the uplink MX Series router.
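
Steps 5 through 7 above can be scripted by generating the CLI lines for each member. The following is a minimal sketch, assuming the standard Junos `file copy`, `request session member`, `file list`, and `set virtual-chassis no-split-detection` syntax. The image file name and the helper names are hypothetical, so check each command against your release before use.

```python
# Hypothetical image name; substitute the file you copied to /var/tmp.
IMAGE = "/var/tmp/junos-image.tgz"


def copy_image_commands(image_path, members, primary):
    """Operational commands to copy the image from the primary to other members.

    The FPC number is the same as the member number.
    """
    return [
        f"file copy {image_path} fpc{m}:/var/tmp/"
        for m in members
        if m != primary
    ]


def verify_image_commands(member):
    """Commands to open a session to a member, list /var/tmp, and return."""
    return [f"request session member {member}", "file list /var/tmp/", "exit"]


def disable_split_detection_commands():
    """Configuration-mode commands to disable split detection."""
    return ["configure", "set virtual-chassis no-split-detection", "commit"]
```

With Member 1 as the primary Routing Engine, `copy_image_commands(IMAGE, [0, 1, 2, 3], primary=1)` produces one `file copy` line each for fpc0, fpc2, and fpc3.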

Reroute Traffic Through Member 1 and Member 3

Step-by-Step Procedure
  1. Using Figure 1, identify the Link Aggregation Control Protocol (LACP) member interfaces and VCPs you will need to disable on Members 0 and 2 to isolate them from the rest of the VCF. The VCPs you will be disabling are port 2 on Member 0 and port 53 on Member 2.

    Use the command below on the primary Routing Engine (Member 1) to determine the names of the relevant interfaces. You will be disabling the LACP member interfaces towards the uplink device and servers. In this case, et-0/0/23.0 is the Member 0 upstream interface and xe-2/0/46.0 is the Member 2 downstream interface.

  2. Access the primary device (Member 1) console and do the following:

    Disable the interface on Member 0 to the uplink device.

    Disable the interface from Member 2 to the server.

    Commit the configuration for it to take effect.

  3. On Member 1:

    Delete the VCP from Member 0 towards Member 3.

    Refer to Step 3 in Prepare for the Upgrade and your topology diagram to determine which VCP you need to disable. Under fpc0 in the table, identify the VCP in the Interface Type or PIC/Port column going to Neighbor ID 3. In this case, disable the VCP identified as PIC/Port 0/2, which is vcp-255/0/2.

    Delete the VCP from Member 2 towards Member 1.

  4. Check that the members were removed from the VCF and marked as NotPrsnt.
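
The isolation steps above map interface and VCP names to a short list of CLI lines. The following sketch builds them, assuming the standard `set interfaces ... disable` and `request virtual-chassis vc-port delete` syntax. The helper names are hypothetical. The PIC slot for port 53 on Member 2 is assumed to be 0, since only the port number is given in this example.

```python
import re


def physical_name(interface):
    """Strip the logical unit, for example et-0/0/23.0 -> et-0/0/23."""
    return interface.split(".")[0]


def disable_interface_commands(interfaces):
    """Configuration-mode commands to disable interfaces and commit."""
    return [f"set interfaces {physical_name(i)} disable" for i in interfaces] + ["commit"]


def parse_vcp(name):
    """Split a VCP name such as vcp-255/0/2 into (pic_slot, port)."""
    match = re.fullmatch(r"vcp-255/(\d+)/(\d+)", name)
    if match is None:
        raise ValueError(f"not a VCP interface name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def vcp_delete_command(member, pic_slot, port):
    """Operational command, run on the primary, to delete a VCP on a member."""
    return (
        "request virtual-chassis vc-port delete "
        f"pic-slot {pic_slot} port {port} member {member}"
    )
```

For this example, `vcp_delete_command(0, *parse_vcp("vcp-255/0/2"))` yields the command for Member 0, and `vcp_delete_command(2, 0, 53)` yields the one for Member 2 under the PIC slot assumption above.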

Upgrade Member 0 and Member 2

Step-by-Step Procedure
  1. Access the consoles for Members 0 and 2. Enter the following command to upgrade the members to the Junos OS image that was copied onto the devices.

  2. Once each isolated member is upgraded, verify that the isolated members, Member 0 and Member 2, are present.

    A new VCF automatically formed because Member 0 was already configured as a backup Routing Engine, so it took on the primary Routing Engine role when it was disconnected from the original primary device. Member 2 was already configured in the linecard role.

    The output above displays the VCP interfaces that link the devices. If the output does not display VCP interfaces in the last column, complete Step 3.

  3. If the output in the previous step does not show that Member 0 and Member 2 are connected and that they are the members of a new VCF, configure the VCP link between them.

    On Member 0, enable VCP 10.

    On Member 2, enable VCP 52.

  4. Confirm the upgrade was successful.
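
The decision in Steps 2 and 3 above (configure the VCP link only if the two upgraded members do not see each other) can be sketched as follows. The helper names and the `neighbors` mapping are hypothetical; the mapping is assumed to be filled in by hand from the neighbor list shown for each member. The commands assume the standard `request system software add` and `request virtual-chassis vc-port set` syntax, with PIC slot 0 assumed for ports 10 and 52.

```python
def upgrade_command(image_path, reboot=True):
    """Operational command to install an image already copied to the member."""
    command = f"request system software add {image_path}"
    return f"{command} reboot" if reboot else command


def needs_vcp_link(neighbors, a, b):
    """Return True if members a and b do not list each other as neighbors.

    `neighbors` maps a member ID to the set of neighbor IDs it reports.
    """
    return b not in neighbors.get(a, set()) or a not in neighbors.get(b, set())


def vcp_set_command(pic_slot, port):
    """Operational command, run locally on a member, to enable a VCP."""
    return f"request virtual-chassis vc-port set pic-slot {pic_slot} port {port}"


def repair_commands(neighbors):
    """Per-member commands for Step 3, or an empty dict if the link is up."""
    if not needs_vcp_link(neighbors, 0, 2):
        return {}
    return {0: vcp_set_command(0, 10), 2: vcp_set_command(0, 52)}
```

If Member 0 and Member 2 already list each other, `repair_commands` returns an empty dictionary and Step 3 can be skipped.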

Reroute Traffic Through Member 0 and Member 2

Step-by-Step Procedure
  1. Simultaneously disable the uplink and server-facing ports on the old VCF pair (Member 1 and Member 3) and enable the server and uplink interfaces in the new VCF that we formed from the upgraded Member 0 and Member 2. This redirects the traffic through the new VCF.

    It is very important to commit the configuration at the same time on both devices so that the LACP states on the host and the uplink MX Series router are maintained. You can do this with scripting, for example with Ansible tooling.

    If you do not commit the configurations at the same time, traffic will be dropped and service impacted for as long as it takes you to disable the ports on the old VCF and enable the interfaces on the new VCF.

    On Member 1, remove the residual configuration from when it was the primary device of the four-member VCF.

    Disable the uplink and server-facing ports on Member 1.

    Enable the uplink and server-facing ports on Member 0.

  2. Run commit at the same time on Member 1 and Member 0.

  3. Check that the continuous ping from the server to IRB 192.168.100.1 on the uplink MX Series router is still running successfully. This confirms the traffic path was switched successfully.
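
The requirement to commit on both devices at the same time can be met by staging each change first and then releasing both commits together. The following sketch shows that pattern with a barrier from the Python standard library. It is an illustration rather than the Ansible approach mentioned above: `stage` and `commit` are hypothetical callables that you would implement with whatever tool you use to reach the devices.

```python
import threading
import time


def synchronized_commit(devices, stage, commit, timeout_s=60.0):
    """Stage a change on every device, then release all commits together.

    stage(device) loads the candidate configuration without committing it.
    commit(device) commits it. If staging fails or times out on any device,
    no device commits.
    Returns (start_times, errors), both keyed by device.
    """
    barrier = threading.Barrier(len(devices))
    lock = threading.Lock()
    start_times, errors = {}, {}

    def worker(device):
        try:
            stage(device)
            barrier.wait(timeout=timeout_s)  # block until every device is staged
        except Exception as exc:  # includes BrokenBarrierError
            barrier.abort()  # release the other workers without committing
            with lock:
                errors[device] = exc
            return
        started = time.monotonic()
        try:
            commit(device)
        except Exception as exc:
            with lock:
                errors[device] = exc
            return
        with lock:
            start_times[device] = started

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return start_times, errors
```

The barrier guarantees only that the commit calls start together. The time each device takes to complete its commit can still differ, so keep watching the continuous ping while the change is applied.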

Upgrade Member 1 and Member 3

Step-by-Step Procedure
  1. Check that the old VCF consists of one primary device and one device in a linecard role.

  2. Break the old VCF by deleting the VCPs between Member 1 and Member 3. Since Member 1 is the primary device, you can run these commands on Member 1.

    To delete the VCP from Member 3 towards Member 1:

    To delete the VCP from Member 1 towards Member 3:

    Verify this was successful on each device.

    On Member 1:

    Access the Member 3 console:

  3. Upgrade Member 3 to Junos OS Release 18.4R1.

    Confirm the upgrade was successful.

  4. Upgrade Member 1 to Junos OS Release 18.4R1.

    Confirm the upgrade was successful.
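
Confirming the upgrade comes down to comparing the release string each member reports with the target release. The following is a minimal sketch. The helper names are hypothetical, and the version strings are assumed to be copied by hand from each member's version output. The match accepts a build suffix, so a member that reports 18.4R1.8 satisfies a target of 18.4R1.

```python
def matches_release(reported, target):
    """True if `reported` is the target release, with or without a build suffix."""
    return reported == target or reported.startswith(target + ".")


def members_not_on_release(versions, target):
    """Return the sorted member IDs whose reported release is not the target.

    `versions` maps member ID to the release string that member reports.
    """
    return sorted(m for m, v in versions.items() if not matches_release(v, target))
```

For example, `members_not_on_release({1: "18.4R1.8", 3: "14.1X53-D47.6"}, "18.4R1")` returns `[3]`, which would indicate that Member 3 still needs to be upgraded.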

Restore the Four-Member VCF

Step-by-Step Procedure
  1. Add Member 3 to the new VCF by enabling VCP 49 on Member 3 and VCP 2 on Member 0. Figure 3 shows the status of the new VCF after these ports have been enabled.

    Figure 3: Add Member 3 to the New VCF

    On Member 3, enable the VCP towards Member 0:

    On Member 0:

    • Enable the VCP towards Member 3:

    • Verify that vcp-255/0/2 is enabled on Member 0 and vcp-255/0/49 is enabled on Member 3:

  2. Since Member 1 was the primary Routing Engine of the original VCF, it will have some residual configurations for Members 0, 2, and 3. These configurations might interfere with the VCF when you add it to the new VCF, especially if Member 1 preempts Member 0 as the primary device for the new VCF.

    On Member 0, the primary device of the new VCF, use the following command to re-enable the server-facing interface for Member 3 and keep Member 1 from accidentally shutting it down.

  3. On Member 1, keep the uplink-facing interface et-1/0/23 disabled. Traffic passes to the uplink MX Series router from the neighboring new VCF primary device.

    Note:

    If Member 1 preempts Member 0 as the primary device of the new VCF during the next step, the set interfaces et-1/0/23 disable statement is carried forward to the new VCF. That could cause traffic disruption, in which case this statement would need to be removed immediately.

  4. To add Member 1 to the new VCF, restore the VCP links from Member 1 to Members 2 and 3, as shown in Figure 4.

    Figure 4: Add Member 1 to the New VCF

    On Member 1:

    • Set the VCP that connects to Member 3.

    • Set the VCP that connects to Member 2.

    On Member 0, the new VCF primary device:

    • Set the VCP on Member 2 that connects to Member 1.

    • Set the VCP on Member 3 that connects to Member 1.

  5. In most cases, the configuration of the new VCF primary Routing Engine is applied to the newly joined backup Routing Engine. Sometimes the newly joined backup Routing Engine (which was the original VCF primary Routing Engine) may preempt and take over the primary role from the newer VCF primary device. Check whether this has occurred.

    In this example, Member 1 has taken over the primary role. This may disrupt the traffic flow. If you observe this, immediately enable the uplink as shown in the next step.

  6. On Member 1, enable the uplink-facing interface et-1/0/23 on the new VCF.

    You have now formed a four-member VCF, as shown in Figure 5.

    Figure 5: Restore the Four-Member VCF

  7. Expect less than a minute of traffic disruption as the LACP states reset when Member 1 joins the new VCF. Monitor the ongoing ping from the server.

  8. Because Member 1 has retaken the primary role, check that the uplink and server-facing interfaces were not automatically disabled because of a residual configuration. Run the following commands on Member 1 and check that the LACP child interfaces are up and back in the collecting distributing state.

  9. On Member 1, the new primary device of the VCF, confirm that all VCF members are running the intended Junos OS release.

  10. On the ongoing ping, verify that traffic from the server is flowing normally through the VCF. Expect a downtime of 40-50 seconds.

    The traffic is flowing normally through the VCF. Your four-member VCF is upgraded and fully functional.
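
Two checks from the steps above lend themselves to a short script: detecting whether Member 1 preempted Member 0, and measuring the outage seen by the continuous ping. The following is a minimal sketch. The role labels, the helper names, and the one-second ping interval are assumptions. The remediation lines use the standard `delete interfaces ... disable` syntax to remove the `et-1/0/23 disable` statement described in the note above.

```python
def preempted(roles, expected_primary=0):
    """True if the member holding the primary role is not the expected one.

    `roles` maps member ID to a role label such as "primary", "backup",
    or "linecard" (labels are an assumption for this sketch).
    """
    primaries = [m for m, role in roles.items() if role == "primary"]
    return primaries != [expected_primary]


def remediation_commands(roles):
    """Configuration-mode commands to restore the uplink after preemption."""
    if not preempted(roles):
        return []
    return ["delete interfaces et-1/0/23 disable", "commit"]


def longest_outage_seconds(replies, interval_s=1.0):
    """Longest run of consecutive lost pings, converted to seconds.

    `replies` is an ordered list of booleans, True for each answered ping.
    """
    longest = current = 0
    for answered in replies:
        current = 0 if answered else current + 1
        longest = max(longest, current)
    return longest * interval_s
```

With a one-second ping interval, a run of 45 consecutive lost replies gives `longest_outage_seconds(...) == 45.0`, which falls within the 40-50 second downtime that this procedure says to expect.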

Conclusion

This procedure outlines one of the recommended ways to upgrade an entire VCF with minimal impact to data center workloads when NSSU is not available or not desirable.