
    High Availability

    This design meets the high availability requirements of hardware redundancy and software redundancy.

    Hardware Redundancy

    To provide hardware redundancy in the virtualized IT data center, this solution uses:

    • Redundant server hardware—Two IBM 3750 standalone servers and two IBM Pure Flex System Chassis
    • Redundant access and aggregation PODs—Two QFX3000-M QFabric systems
    • Redundant core switches—Two EX9214 switches
    • Redundant edge firewalls—Two SRX3600 Services Gateways
    • Redundant edge routers—Two MX240 Universal Edge routers
    • Redundant storage—Two EMC VNX5500 unified storage arrays
    • Redundant load balancers—Two F5 LTM 4200v load balancers
    • Redundant out-of-band management switches—Four EX4300 switches configured as a Virtual Chassis

    Software Redundancy

    To provide software redundancy in the virtualized IT data center, this solution uses:

    • Graceful restart—Helper routers assist restarting devices in restoring routing protocols, state, and convergence.
    • Graceful Routing Engine switchover—Keeps the operating system state synchronized between the master and backup Routing Engines in a Juniper Networks device.
    • In-service software upgrade (for the core switches and edge routers)—Enables the network operating system to be upgraded without downtime.
    • MC-LAG—Enables aggregated Ethernet interface bundles to contain interfaces from more than one device.
    • Nonstop active routing—Keeps the Layer 3 protocol state synchronized between the master and backup Routing Engines.
    • Nonstop bridging—Keeps the Layer 2 protocol state synchronized between the master and backup Routing Engines.
    • Nonstop software upgrade (for the QFX3000-M QFabric system PODs)—Enables the network operating system to be upgraded with minimal impact to forwarding.
    • Virtual Router Redundancy Protocol (VRRP)—Provides a virtual IP address for traffic and forwards the traffic to one of two peer routers, depending on which one is operational.

    MC-LAG Design Considerations

    To allow all the links to forward traffic without using Spanning Tree Protocol (STP), you can configure MC-LAG on edge routers and core switches. The edge routers use MC-LAG toward the edge firewalls, and the core switches use MC-LAG toward each QFabric POD, application load balancer (F5), and out-of-band (OOB) management switch.

    Multichassis link aggregation group (MC-LAG) is a feature that supports aggregated Ethernet bundles spread across more than one device. Link Aggregation Control Protocol (LACP) supports MC-LAG and is used for dynamic configuration and monitoring on links. The available options for MC-LAG include Active/Standby (where one device is active and the other assists if the active device fails) or Active/Active (where both devices actively participate in the MC-LAG connection).

    For this solution, MC-LAG Active/Active is preferred because it provides link-level and node-level protection for Layer 2 networks and Layer 2/Layer 3 combined hybrid environments.

    Highlights of MC-LAG Active/Active

    MC-LAG Active/Active has the following characteristics:

    • Both core switches have active aggregated Ethernet member interfaces and forward the traffic. If one of the core switches fails, the other core switch will forward the traffic. Traffic is load balanced by default, so link-level efficiency is 100 percent.
    • The Active/Active method has faster convergence than the Active/Standby method. Fast convergence occurs because information is exchanged between the routers during operations. After a failure, the remaining operational core switch does not need to relearn any routes and continues to forward the traffic.
    • Routing protocols (such as OSPF) can be used over MC-LAG/IRB interfaces for Layer 3 termination.
    • If you configure Layer 3 protocols in the core, you can use an integrated routing and bridging (IRB) interface to offer a hybrid Layer 2 and Layer 3 environment at the core switch.
    • Active/Active also offers maximum utilization of resources and end-to-end load balancing.

    To extend a link aggregation group (LAG) across two devices (MC-LAG):

    • Both devices must synchronize their aggregated Ethernet LACP configurations.
    • Learned MAC address and ARP entries must be synchronized.

    The above MC-LAG requirements are achieved by using the following protocols/mechanisms as shown in Figure 1:

    1. Interchassis Control Protocol (ICCP)
    2. Interchassis Link Protection Link (ICL-PL)

    Figure 1: MC-LAG – ICCP and ICL Design

    1. ICCP
      • ICCP is a control plane protocol for MC-LAG. It uses TCP as a transport protocol and Bidirectional Forwarding Detection (BFD) for fast convergence. When you configure ICCP, you must also configure BFD.
      • ICCP synchronizes configurations and operational states between the two MC-LAG peers.
      • ICCP also synchronizes MAC address and ARP entries learned from one MC-LAG node and shares them with the other peer.
      • Peering with the ICCP peer's loopback IP address is recommended so that ICCP does not depend on any single direct link between the MC-LAG peers. As long as a logical connection between the peers remains up, ICCP stays up.
      • Although you can configure ICCP on either a single link or an aggregated bundle link, an aggregated Ethernet LAG is preferred.
      • You can also configure ICCP and ICL links on a single aggregated Ethernet bundle under multiple logical interfaces using flexible VLAN tagging, which is supported on MX Series platforms.
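
    The ICCP peering described above can be sketched in Junos configuration as follows; the loopback addresses and timer values are illustrative assumptions, not values from this solution:

    ```
    protocols {
        iccp {
            local-ip-addr 10.255.0.1;               # this peer's loopback address (hypothetical)
            peer 10.255.0.2 {                       # MC-LAG peer's loopback address (hypothetical)
                redundancy-group-id-list 1;
                session-establishment-hold-time 50;
                liveness-detection {                # BFD is required when ICCP is configured
                    minimum-interval 1000;
                    multiplier 3;
                }
            }
        }
    }
    ```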
    2. ICL-PL
      • The ICL is a special Layer 2 link between the MC-LAG peers, used only in Active/Active mode.
      • ICL-PL is needed to protect MC-LAG connectivity if all core-facing links on one MC-LAG node fail.
      • If a traffic receiver is single-homed to one MC-LAG node (N1), the ICL is used to forward packets that the other MC-LAG node (N2) receives on its MC-LAG interface over to N1.
      • Split horizon is enabled on the ICL to avoid loops.
      • There is no data plane MAC learning over the ICL.
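
    On MX Series platforms, the ICL is typically an aggregated Ethernet bundle designated as the protection link for the ICCP peer. A minimal sketch, assuming ae1 is the ICL bundle and 10.255.0.2 is the peer's loopback address (both hypothetical):

    ```
    multi-chassis {
        multi-chassis-protection 10.255.0.2 {   # ICCP peer address (hypothetical)
            interface ae1;                      # aggregated Ethernet bundle used as the ICL
        }
    }
    ```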

    MC-LAG Specific Configuration Parameters

    Redundancy group ID—ICCP uses a redundancy group to associate multiple chassis that perform similar redundancy functions. A redundancy group establishes a communication channel so that applications on ICCP peers can reach each other. When an application wants to send a message to a particular redundancy group, the application provides the information and ICCP delivers it to the members of that redundancy group. A redundancy group ID is similar to a mesh group identifier.

    MC-AE ID—The multichassis aggregated Ethernet (MC-AE) ID identifies a multichassis interface. For example, if one MC-AE interface is spread across two core switches, you assign the same MC-AE ID on both switches.

    Service ID—A service ID object for bridge domains overrides any global switch options configuration for the bridge domain. The service ID must be unique across the entire network for a given service to allow correct synchronization. For example, a service ID synchronizes applications such as IGMP, ARP, and MAC address learning for a given service across the core switches. (Note: Both MC-LAG peers must share the same service ID for a given bridge domain.)
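
    These parameters come together in the MC-AE interface configuration; a sketch, in which the interface name and ID values are illustrative assumptions:

    ```
    interfaces {
        ae0 {
            aggregated-ether-options {
                lacp {
                    active;
                    system-id 00:01:02:03:04:05;   # same LACP system ID on both peers
                    admin-key 1;                   # same admin key on both peers
                }
                mc-ae {
                    mc-ae-id 10;                   # same MC-AE ID on both peers for this bundle
                    redundancy-group 1;            # matches the ICCP redundancy group ID
                    chassis-id 0;                  # 0 on one peer, 1 on the other
                    mode active-active;
                    status-control active;         # standby on the other peer
                }
            }
        }
    }
    ```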

    MC-LAG Active/Active Layer 3 Routing Features

    MC-LAG Active/Active is a Layer 2 logical link. IRB interfaces are used to create integrated Layer 2 and Layer 3 links. As a result, you have two design options when assigning IP addresses across MC-LAG peers:

    • Option 1: VRRP over MC-LAG Active/Active provides a common virtual IP address and virtual MAC address, plus unique physical IP and MAC addresses per peer. Both address types are needed if you configure routing protocols on MC-LAG Active/Active interfaces. Junos OS modifies the VRRP data forwarding logic when you configure both MC-LAG Active/Active and VRRP: both the MC-LAG and VRRP peers forward traffic and load-balance the traffic between them, as shown in Figure 2.

      Figure 2: VRRP and MC-LAG – Active/Active Option


      Data packets received by the backup VRRP peer on the MC-LAG member link are forwarded to the core link without sending them to the master VRRP peer.
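
      A sketch of Option 1 on one peer, assuming a hypothetical IRB unit and addressing; the other peer would use its own physical address and a different VRRP priority:

      ```
      interfaces {
          irb {
              unit 100 {
                  family inet {
                      address 10.10.100.2/24 {          # unique physical IP per peer (hypothetical)
                          vrrp-group 1 {
                              virtual-address 10.10.100.1;  # shared gateway address
                              priority 200;                 # use a lower priority on the other peer
                              accept-data;                  # allow this peer to answer traffic to the VIP
                          }
                      }
                  }
              }
          }
      }
      ```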

    • Option 2: MAC address synchronization (Figure 3) provides a common gateway by configuring the same IP address on both peers and synchronizing the IRB MAC address between them. Use Option 2 if you do not plan to configure routing protocols on the MC-LAG Active/Active interfaces.

      Figure 3: MC-LAG – MAC Address Synchronization Option

      • You configure the same IP address on the IRB interfaces of both nodes.
      • The lower of the two IRB MAC addresses is selected as the gateway MAC address.
      • The peer with the higher IRB MAC address learns the peer's MAC address through ICCP and installs it as its own MAC address.
      • On MX Series platforms, configure mcae-mac-synchronize in the bridge domain configuration.
      • On EX9214 switches, configure mcae-mac-synchronize in the VLAN configuration.
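
      A sketch of Option 2 on an MX Series platform, assuming a hypothetical bridge domain bd100; on an EX9214 switch the same statement goes under the corresponding VLAN:

      ```
      bridge-domains {
          bd100 {
              vlan-id 100;
              service-id 100;            # must match on both MC-LAG peers
              mcae-mac-synchronize;      # share the synchronized gateway MAC on both peers
          }
      }
      ```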

    We recommend Option 1 as the preferred method for the MetaFabric 1.0 solution for the following reasons:

    • The solution requires OSPF as the routing protocol between the QFabric PODs and the core switches on the MC-LAG IRB interfaces and only Option 1 supports routing protocols.
    • Layer 3 extends to the QFabric PODs for some VLANs for hybrid Layer 2/Layer 3 connectivity to the core.

    MC-LAG Active/Active Traffic Forwarding Rules

    Figure 4: MC-LAG – Traffic Forwarding Rules


    As shown in Figure 4, the following forwarding rules apply to MC-LAG Active/Active:

    • Traffic received on N1 from MCAE1 could be flooded to the ICL link to reach N2. When it reaches N2, it must not be flooded back to MCAE1.
    • Traffic received on SH1 could be flooded to MCAE1 and ICL by way of N1. When N2 receives SH1 traffic across the ICL link, it must not be again flooded to MCAE1. N2 also receives the SH1 traffic by way of the MC-AE link.
      • When receiving a packet from the ICL link, the MC-LAG peers forward the traffic to all local SH links. If the corresponding MCAE link on the peer is down, the receiving peer also forwards the traffic to its MCAE links.

      Note: ICCP is used to signal MCAE link state between the peers.

    • When N2 receives traffic from the ICL link and the N1 core link is up, the traffic should not be forwarded to the N2 core link.

    MC-LAG Active/Active High Availability Events

    ICCP is down while ICL is up:

    Figure 5: MC-LAG – ICCP Down


    Here are the actions that happen when the ICCP link is down and the ICL link is up:

    • By default, if the ICCP link fails, as shown in Figure 5, each peer reverts to its own local LACP system ID, and the links of only one peer (whichever one negotiates with the customer edge [CE] router first) remain attached to the bundle. There is some traffic impact until LACP converges with the new system ID.
    • One peer stays active, while the other enters standby mode (but this is nondeterministic).
    • The access switch selects a core switch and establishes LACP peering.

    To optimize for this condition, include the prefer-status-control-active statement on the active peer.

    • With the prefer-status-control-active statement configured on the active peer, the peer remains active and retains the same LACP system ID.
    • With the force-icl-down statement, the ICL link shuts down when the ICCP link fails.
    • By configuring these statements, traffic impact is minimized during an ICCP link failure.
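
    These two statements are configured under the mc-ae events hierarchy on the active peer; the interface name here is an illustrative assumption:

    ```
    interfaces {
        ae0 {
            aggregated-ether-options {
                mc-ae {
                    status-control active;                 # this peer should stay active
                    events {
                        iccp-peer-down {
                            force-icl-down;                # shut the ICL when ICCP fails
                            prefer-status-control-active;  # keep the same LACP system ID
                        }
                    }
                }
            }
        }
    }
    ```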

    ICCP is up and ICL goes down:

    Figure 6: MC-LAG – ICL Down


    Here are the actions that happen when the ICCP link is up and the ICL link is down:

    • If you configure a peer with the prefer-status-control-standby statement, the MC-AE interfaces shared with the peer and connected to the ICL go down.
    • This configuration ensures a loop-free topology because it does not forward duplicate packets in the Layer 2 network.

    Active MC-LAG node down with ICCP loopback peering with prefer-status-control-active on both peers:

    Figure 7: MC-LAG – Peer Down


    Here are the actions that happen when both MC-LAG peers are configured with the prefer-status-control-active statement and the active peer goes down:

    • When you configure MC-LAG Active/Active between SW1/SW2 and the QFabric POD, SW1 becomes active and SW2 becomes standby. During an ICCP failure event, if SW1 has the prefer-status-control-active statement configured and it fails, SW2 cannot tell whether ICCP or SW1 itself failed. As a result, SW2 reverts to its default LACP system ID, which causes the MC-LAG link to go down and come back up, resulting in long traffic reconvergence times.
    • To avoid this situation, configure the prefer-status-control-active statement on both SW1 and SW2. Also, you should prevent ICCP failures by configuring ICCP on a loopback interface.
    • Configure backup-liveness-detection on both the active and standby peers. BFD helps detect peer failures and enables subsecond reconvergence.
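
    Loopback peering with backup liveness detection can be sketched as follows; the loopback and out-of-band management addresses are illustrative assumptions:

    ```
    protocols {
        iccp {
            local-ip-addr 10.255.0.1;           # lo0 address, so ICCP survives any single link failure
            peer 10.255.0.2 {                   # peer's lo0 address (hypothetical)
                backup-liveness-detection {
                    backup-peer-ip 172.16.0.2;  # peer's out-of-band management address (hypothetical)
                }
                liveness-detection {
                    minimum-interval 1000;
                }
            }
        }
    }
    ```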

    The design for high availability in the MetaFabric 1.0 solution meets the requirements for hardware redundancy and software redundancy.

    Published: 2015-04-20