MC-LAG Technical Overview


Multichassis link aggregation groups (MC-LAGs) enable a client device to form a logical LAG interface between two MC-LAG peers. An MC-LAG provides redundancy and load balancing between the two MC-LAG peers, multihoming support, and a loop-free Layer 2 network without running the Spanning Tree Protocol (STP).

Figure 1 illustrates the basic MC-LAG topology. On one end of the MC-LAG, there are two MC-LAG peers. Each of the MC-LAG peers has one or more physical links connected to the client device, such as a server or access switch. The client device, which is at the other end of the MC-LAG link, does not need to have an MC-LAG configured and does not need to be aware of MC-LAG. From its perspective, it is connecting to a single device through a LAG. The MC-LAG peers use the Inter-Chassis Control Protocol (ICCP) to exchange control information and coordinate with each other to ensure that data traffic is forwarded properly.

Figure 1: Basic MC-LAG Topology

This topic provides an overview of MC-LAG.

ICCP and ICL

The MC-LAG peers use the Inter-Chassis Control Protocol (ICCP) to exchange control information and coordinate with each other to ensure that data traffic is forwarded properly. ICCP replicates control traffic and forwarding states across the MC-LAG peers and communicates the operational state of the MC-LAG members. It uses TCP as a transport protocol and requires Bidirectional Forwarding Detection (BFD) for fast convergence. Because ICCP uses TCP/IP to communicate between the peers, the two peers must have IP connectivity to each other. ICCP messages exchange MC-LAG configuration parameters and ensure that both peers use the correct LACP parameters.

The interchassis link (ICL), also known as the interchassis link-protection link (ICL-PL), is used to forward data traffic across the MC-LAG peers. This link provides redundancy when a link failure (for example, an MC-LAG trunk failure) occurs on one of the active links. The ICL can be a single physical Ethernet interface or an aggregated Ethernet interface.

You can configure multiple ICLs between MC-LAG peers. Each ICL can learn up to 512K MAC addresses. You can configure additional ICLs for virtual switch instances.

When configuring ICCP and the ICL, we recommend that you:

  • Use the peer loopback address to establish ICCP peering. Doing so avoids any direct link failure between MC-LAG peers. As long as the logical connection between the peers remains up, ICCP stays up.

  • Use separate ports and choose different FPCs for the ICL and ICCP interfaces. Although you can use a single link for the ICCP interface, an aggregated Ethernet interface is preferred.

  • Configure the ICCP liveness-detection interval (the BFD timer) to be at least 8 seconds, if you have configured ICCP connectivity through an IRB interface. A liveness-detection interval of 8 seconds or more allows graceful Routing Engine switchover (GRES) to work seamlessly. By default, ICCP liveness detection uses multihop BFD, which runs in centralized mode.

    This recommendation does not apply if you have configured ICCP connectivity through a dedicated physical interface. In this case, you can configure single-hop BFD.

  • Configure a session establishment hold time for ICCP. Doing so results in faster ICCP connection establishment. The recommended value is 50 seconds.

  • Configure a hold-down timer on the ICL member links that is greater than the configured BFD timer for the ICCP interface. This prevents the ICL from being advertised as being down before the ICCP link is down. If the ICL goes down before the ICCP link, this causes a flap of the MC-LAG interface on the status-control standby node, which leads to a delay in convergence.
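For reference, the recommendations above map to statements along the lines of the following sketch, where the loopback addresses (10.255.0.1 and 10.255.0.2) and the ICL member interface (xe-1/0/0) are placeholders:

    set protocols iccp local-ip-addr 10.255.0.1
    set protocols iccp peer 10.255.0.2 session-establishment-hold-time 50
    set protocols iccp peer 10.255.0.2 liveness-detection minimum-interval 8000
    set interfaces xe-1/0/0 hold-time up 0 down 10000

The liveness-detection minimum-interval is expressed in milliseconds, so 8000 corresponds to the 8-second BFD timer recommended for IRB-based ICCP connectivity, and the hold-time down value (10000 ms in this sketch) exceeds the BFD timer as recommended.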

Active/Standby and Active/Active Modes

MC-LAG can be configured in active/standby mode, in which only one device actively forwards traffic, or in active/active mode, in which both devices actively forward traffic.

In active/standby mode, only one of the MC-LAG peers is active at any given time. The other MC-LAG peer is in backup (standby) mode. The active MC-LAG peer uses the Link Aggregation Control Protocol (LACP) to advertise to client devices that its child link is available for forwarding data traffic.

In active/active mode, all member links are active on the MC-LAG. In this mode, media access control (MAC) addresses learned on one MC-LAG peer are propagated to the other MC-LAG peer.

Figure 2 illustrates the difference between active/standby and active/active.

Figure 2: MC-LAG Active/Standby Versus Active/Active

This network configuration example uses active/active as the preferred mode for the following reasons:

  • Traffic is load-balanced in active/active mode, resulting in a link-level efficiency of 100 percent.

  • Convergence is faster in active/active mode than in active/standby mode. In active/active mode, information is exchanged between devices during operations. After a failure, the operational switch or router does not need to relearn any routes and continues to forward traffic.

  • Active/active mode enables you to configure Layer 3 protocols on integrated routing and bridging (IRB) interfaces, providing a hybrid Layer 2 and Layer 3 environment on the core switch.

MC-LAG Interface

You configure an MC-LAG interface under the same configuration hierarchy as a LAG interface. You must configure the following:

  • LACP—Configure LACP on the LAG. LACP is a subcomponent of the IEEE 802.3ad standard. LACP is used to discover multiple links from a client device connected to an MC-LAG peer. LACP must be configured on all member links for an MC-LAG to work correctly.

  • LACP system ID—Configure the same LACP system ID for the MC-LAG on each MC-LAG peer.

  • MC-LAG-specific options—MC-LAG-specific options are configured under the mc-ae statement. Table 1 describes the mc-ae options.

Table 1: mc-ae Statement Options

mc-ae Option

Description

mc-ae-id

Specifies which MC-LAG group the aggregated Ethernet interface belongs to.

redundancy-group

Used by ICCP to associate multiple chassis that perform similar redundancy functions and to establish a communication channel so that applications on peering chassis can send messages to each other.

We recommend that you configure only one redundancy group between MC-LAG nodes. The redundancy group represents the domain of high availability between the MC-LAG nodes. One redundancy group is sufficient between a pair of MC-LAG nodes. If you are using logical systems, this recommendation applies to each logical system—that is, configure one redundancy group between MC-LAG nodes in each logical system.

init-delay-time

Specifies the number of seconds by which to delay bringing the MC-LAG interface back to the up state when the MC-LAG peer is rebooted. By delaying the bring-up of the interface until after protocol convergence, you can prevent packet loss during the recovery of failed links and devices.

This network configuration example uses a delay time of 520 seconds. This delay time might not be optimal for your network and should be adjusted to fit your network requirements.

chassis-id

Used by LACP for calculating the port number of the MC-LAG physical member links. Each MC-LAG peer should have a unique chassis ID.

mode

Indicates whether an MC-LAG is in active/standby mode or active/active mode. Chassis that are in the same group must be in the same mode. In this configuration example, the mode is active/active.

status-control

Specifies whether this node becomes active or goes into standby mode when an ICL failure occurs. Must be active on one node and standby on the other node.

events iccp-peer-down force-icl-down

Forces the ICL down if the peer of this node goes down.

events iccp-peer-down prefer-status-control-active

Allows the LACP system ID to be retained during a reboot, which provides better convergence after a failover. Note that if you configure both nodes as prefer-status-control-active, as this configuration example shows, you must also configure ICCP peering using the peer’s loopback address to make sure that the ICCP session does not go down due to physical link failure.
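Taken together, the options in Table 1 might be configured on one peer as in the following sketch; the interface name (ae1), the IDs, and the LACP system ID are placeholders drawn from this example:

    set interfaces ae1 aggregated-ether-options lacp active
    set interfaces ae1 aggregated-ether-options lacp system-id 00:01:02:03:04:05
    set interfaces ae1 aggregated-ether-options mc-ae mc-ae-id 1
    set interfaces ae1 aggregated-ether-options mc-ae redundancy-group 1
    set interfaces ae1 aggregated-ether-options mc-ae chassis-id 0
    set interfaces ae1 aggregated-ether-options mc-ae mode active-active
    set interfaces ae1 aggregated-ether-options mc-ae status-control active
    set interfaces ae1 aggregated-ether-options mc-ae init-delay-time 520
    set interfaces ae1 aggregated-ether-options mc-ae events iccp-peer-down prefer-status-control-active

The other peer uses the same LACP system ID and mc-ae-id but a unique chassis-id (for example, 1) and status-control standby.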

Additional MC-LAG Specific Configuration

In addition to configuring ICCP, the ICL, and the MC-LAG interfaces, you must configure the following:

  • Multichassis link protection—Configure multichassis link protection on each MC-LAG peer. Multichassis link protection provides link protection between the two MC-LAG peers hosting an MC-LAG. If the ICCP connection is up and the ICL comes up, the peer configured as standby brings up the MC-LAG interfaces shared with the peer.

    You can configure multichassis link protection under the multi-chassis hierarchy or under the logical interface configuration for each MC-LAG.

  • Service ID—You must configure the same service ID on each MC-LAG peer when the MC-LAG logical interfaces are part of a bridge domain, as they are in this example. The service ID, which is configured under the switch-options hierarchy, is used to synchronize applications such as IGMP, ARP, and MAC learning across MC-LAG members. If you are configuring virtual switch instances, configure a different service ID for each virtual switch instance.
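As a sketch, multichassis link protection and the service ID might be configured as follows, where 10.255.0.2 is the peer loopback address, ae0 is the ICL, and 10 is an arbitrary service ID (all placeholders):

    set multi-chassis multi-chassis-protection 10.255.0.2 interface ae0
    set switch-options service-id 10

Configure the same service-id value on both MC-LAG peers.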

Data Traffic Forwarding Rules in Active/Active MC-LAG Topologies

In active/active MC-LAG topologies, network interfaces can be categorized into three interface types, as follows:

  • Single-homed link terminating on an MC-LAG peer device

  • MC-LAG links

  • ICL

These links are shown in Figure 3, which is used to illustrate the traffic forwarding rules that apply to MC-LAG active/active.

Figure 3: MC-LAG Traffic Forwarding Rules

The traffic forwarding rules are:

  • Traffic received on MC-LAG peer N1 from the MC-LAG interface could be flooded to the ICL link to reach N2. When it reaches N2, it is not flooded back to the MC-LAG interface.

  • Traffic received on SH1 could be flooded to the MC-LAG interface and the ICL by way of N1. When N2 receives SH1 traffic across the ICL link, it is not flooded to the MC-LAG interface.

  • When receiving a packet from the ICL link, the MC-LAG peers forward the traffic to all local SH links. If the corresponding MC-LAG link on the peer is down, the receiving peer also forwards the traffic to its MC-LAG links.

    Note

    ICCP is used to signal MC-LAG link state between the peers.

  • When N2 receives traffic from the ICL link, the traffic is not forwarded to the N2 upstream link if the upstream link is an MC-LAG link and the corresponding MC-LAG link on N1 is up.

Failure Handling During a Split-Brain State

A split-brain state occurs when ICCP adjacency is lost between the MC-LAG peers while both peers are still up. Configuring ICCP adjacency over an aggregated link with child links on multiple FPCs mitigates the possibility of a split-brain state. To further guard against this problem, enable backup liveness detection. With backup liveness detection enabled, the MC-LAG peers establish an out-of-band channel through the management network in addition to the ICCP channel.

During a split-brain state, both active and standby peers change LACP system IDs. Because both MC-LAG peers change the LACP system ID, the CE device accepts the LACP system ID of the first link that comes up and brings down other links carrying different LACP system IDs. When the ICCP connection is active, both of the MC-LAG peers use the configured LACP system ID. If the LACP system ID is changed during failures, the server that is connected over the MC-LAG removes these links from the aggregated Ethernet bundle.

When the ICL is operationally down and the ICCP connection is active, the LACP state of the links with status control configured as standby is set to the standby state. When the LACP state of the links is changed to standby, the server that is connected over the MC-LAG makes these links inactive and does not use them for sending data.

Recovery from the split-brain state occurs automatically when the ICCP adjacency comes up between the MC-LAG peers.

If only one physical link is available for ICCP, then ICCP might go down because of a link failure or FPC failure while the node itself is still up. This results in a split-brain state. Without additional configuration to avoid this situation, the MC-LAG interfaces change the LACP system ID to their local defaults, with the result that the downstream device keeps only the first link that comes up and brings down the others. The LACP state changes on both the active and standby nodes cause a convergence delay.

To avoid this problem of the split-brain state and resultant convergence delays, configure one of the following two options:

  • Enable backup liveness detection on the management (fxp0) interface. This is the preferred option.

    For example:
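    A minimal sketch, assuming the peer's loopback address is 10.255.0.2 and the management (fxp0) addresses are 172.16.0.1 (local) and 172.16.0.2 (peer), all placeholders:

        set protocols iccp peer 10.255.0.2 backup-liveness-detection backup-peer-ip 172.16.0.2
        set groups re0 interfaces fxp0 unit 0 family inet address 172.16.0.1/24 master-only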

    When you configure backup-liveness-detection, an out-of-band channel is established between the nodes, through the management network, to test the liveness of the Routing Engine. When both ICCP and backup liveness detection fail, the remote node is considered down, so the LACP system ID is not changed on the local node.

    You must also configure the master-only statement on the IP address of the fxp0 interface for backup liveness detection, on both the master and backup Routing Engines, to ensure that the connection is not reset during GRES in the remote peer.

  • Configure prefer-status-control-active under the mc-ae options for the MC-LAG on both nodes.

    For example:
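    A minimal sketch, assuming the MC-LAG interface is ae1 (a placeholder); apply the same statement on both nodes:

        set interfaces ae1 aggregated-ether-options mc-ae events iccp-peer-down prefer-status-control-active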

    When you configure prefer-status-control-active, if ICCP alone fails (the peer node itself stays up), the LACP system ID is not changed on the status-control active node but is changed on the standby node.

Layer 2 Feature Support

Support for the following Layer 2 features is discussed in this section:

MAC Address Management

Without proper MAC address management, an MC-LAG configuration could result in unnecessary flooding. For example:

  • When an MC-LAG is configured to be active/active, upstream and downstream traffic could go through different MC-LAG peer devices. This means that the MAC address learned on one peer would have to be relearned on the other peer, causing unnecessary flooding.

  • A single-homed client's MAC address is learned only on the MC-LAG peer that it is attached to. If a client attached to the peer MC-LAG network device needs to communicate with that single-homed client, then traffic would be flooded on the peer MC-LAG network device.

To avoid unnecessary flooding, whenever a MAC address is learned on one of the MC-LAG peers, the address is replicated to the other MC-LAG peer. MAC address replication is performed as follows:

  • MAC addresses learned on an MC-LAG of one MC-LAG peer are replicated as learned on the same MC-LAG of the other MC-LAG peer.

  • MAC addresses learned on single-homed clients of one MC-LAG peer are replicated as learned on the ICL interface of the other MC-LAG peer.

  • MAC address learning from the data path is disabled on the ICL. MAC address learning on the ICL depends on software installing MAC addresses replicated through ICCP.

MAC Aging

MAC aging support in the Juniper Networks Junos® operating system (Junos OS) extends the aggregated Ethernet aging logic for a specified MC-LAG. A MAC address ages out only when it is no longer seen on either of the MC-LAG peers, and the address is not deleted from software until all Packet Forwarding Engines have deleted it.

Spanning Tree Protocol

STP can be used to prevent loops in MC-LAG topologies. A potential loop, such as one that can happen due to miscabling at the core or access switching layer or due to a bug in server software, is broken by STP blocking one of the interfaces in the downstream network.

If your network topology requires RSTP or VSTP to prevent loops, configure the two MC-LAG nodes with the same STP virtual root ID using the Reverse Layer 2 Gateway Protocol (RL2GP). This root ID must be superior to that of all bridges in the downstream network, and the downstream bridges must be capable of running STP. Because both MC-LAG nodes act as the (virtual) root bridge, the MC-LAG interface remains in the forwarding state. A downstream bridge receives bridge protocol data units (BPDUs) from both nodes and thus receives twice the number of BPDUs on its aggregated Ethernet interface. If both MC-LAG nodes use the same aggregated Ethernet interface name, the STP port number is identical, which reduces the STP load on the downstream bridge.

This network configuration example provides an example of configuring RSTP with RL2GP.

Note

STP is not supported on the ICL. If you enable STP globally, disable it on the ICL. This also means RSTP and VSTP cannot be configured on the ICL or ICL-PL.

Note

When configuring RSTP or VSTP in Junos OS, the MC-AE nodes must have the same system identifier configured as well as the highest bridge priority in the topology.

Layer 2 Multicast Feature Support

Layer 2 unknown multicast and IGMP snooping are supported. Key elements of this support are as follows:

  • Flooding happens on all links across peers if both peers have virtual LAN membership. Only one of the peers forwards traffic on a given MC-LAG link.

  • Known and unknown multicast packets are forwarded across the peers by adding the ICL as a multicast router port.

  • IGMP membership learned on MC-LAG links is propagated across peers.

  • During an MC-LAG peer reboot, known multicast traffic is flooded until the IGMP snooping state is synced with the peer.

IGMP Snooping on an Active/Active MC-LAG

IGMP snooping controls multicast traffic in a switched network. When IGMP snooping is not enabled, the Layer 2 device broadcasts multicast traffic out of all of its ports, even if the hosts on the network do not want the multicast traffic. With IGMP snooping enabled, a Layer 2 device monitors the IGMP join and leave messages sent from each connected host to a multicast router. This enables the Layer 2 device to keep track of the multicast groups and associated member ports. The Layer 2 device uses this information to make intelligent decisions and to forward multicast traffic only to the intended destination hosts. Routing of multicast traffic between subnets is handled by Protocol Independent Multicast (PIM), which uses distribution trees to determine which traffic is forwarded.

In an active/active MC-LAG configuration, IGMP snooping replicates the Layer 2 multicast routes so that each MC-LAG peer has the same routes. If a device is connected to an MC-LAG peer by way of a single-homed interface, IGMP snooping replicates join messages to its IGMP snooping peer. If a multicast source is connected to an MC-LAG by way of a Layer 3 device, the Layer 3 device passes this information to the IRB that is configured on the MC-LAG. The first hop designated router (DR) is responsible for sending the register and register-stop messages for the multicast group. The last hop DR is responsible for sending PIM join and leave messages toward the rendezvous point and source for the multicast group. The routing device with the smallest preference metric forwards traffic on transit LANs.

When configuring IGMP snooping, keep these guidelines in mind:

  • You must configure the ICL interface as a multicast router interface (by configuring the multicast-router-interface statement) for multicast forwarding to work in an MC-LAG environment. For the scenario in which traffic arrives by way of a Layer 3 interface, you must enable PIM and IGMP on the IRB interface configured on the MC-LAG peers.

  • You must configure the multichassis-lag-replicate-state statement under the multicast-snooping-options hierarchy for Internet Group Management Protocol (IGMP) snooping to work properly in an MC-LAG environment.
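These guidelines might be expressed as in the following sketch, assuming VLAN v100, ICL logical interface ae0.0, and IRB interface irb.100 (all placeholders); the igmp-snooping hierarchy varies by platform, so verify it against your device's documentation:

    set multicast-snooping-options multichassis-lag-replicate-state
    set protocols igmp-snooping vlan v100 interface ae0.0 multicast-router-interface
    set protocols igmp interface irb.100
    set protocols pim interface irb.100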

Layer 3 Feature Support

To provide Layer 3 routing functions to downstream clients, the MC-LAG network peers must be configured to provide the same gateway address to the downstream clients. To the upstream routers, the MC-LAG network peers could be viewed as either equal-cost multipath (ECMP) or two routes with different preference values. The following two methods can be used to enable Layer 3 functionality across an MC-LAG:

  • VRRP over IRB—Configure different IP addresses on IRB interfaces on the MC-LAG peers and run the Virtual Router Redundancy Protocol (VRRP) over the IRB interfaces. The virtual IP address is the gateway IP address for the MC-LAG clients.

  • MAC address synchronization—Configure the same IP address on the IRB interfaces on the MC-LAG peers, and configure the MAC address synchronization feature using the mcae-mac-synchronize statement. The IP address will be the gateway IP address for the MC-LAG clients.

We recommend that you use the VRRP over IRB method. Use MAC address synchronization only when you cannot configure VRRP over IRB. This network configuration example uses VRRP over IRB.

The following Layer 3 features are supported:

VRRP over IRB

Junos OS supports active/active MC-LAGs by using VRRP in active/standby mode. VRRP in active/standby mode enables Layer 3 routing over the multichassis aggregated Ethernet (MC-AE) interfaces on the MC-LAG peers. In this mode, the MC-LAG peers act as virtual routers. The peers share the virtual IP address that corresponds to the default route configured on the host or server connected to the MC-LAG. This virtual IP address (of the IRB interface) maps to either of the VRRP MAC addresses or to the logical interfaces of the MC-LAG peers. The host or server uses the VRRP MAC address to send any Layer 3 upstream packets.

At any time, one of the VRRP devices is the master (active), and the other is a backup (standby). Usually, a VRRP backup node does not forward incoming packets. However, when VRRP over IRB is configured in an MC-LAG active/active environment, both the VRRP master and the VRRP backup forward Layer 3 traffic arriving on the MC-AE interface, as shown in Figure 4. If the master fails, all the traffic shifts to the MC-AE link on the backup.

Figure 4: VRRP Forwarding in MC-LAG Configuration
Note

You must configure VRRP on both MC-LAG peers for both the active and standby members to accept and route packets.

Routing protocols run on the primary IP address of the IRB interface, and both of the MC-LAG peers run routing protocols independently. The routing protocols use the primary IP address of the IRB interface and the IRB MAC address to communicate with the MC-LAG peers. The IRB MAC address of each MC-LAG peer is replicated on the other MC-LAG peer and is installed as a MAC address that has been learned on the ICL.
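As a sketch, the VRRP over IRB configuration on one peer might look like the following, where 192.168.100.2/24 is this peer's IRB address and 192.168.100.1 is the shared virtual gateway address (placeholders):

    set interfaces irb unit 100 family inet address 192.168.100.2/24 vrrp-group 1 virtual-address 192.168.100.1
    set interfaces irb unit 100 family inet address 192.168.100.2/24 vrrp-group 1 priority 200
    set interfaces irb unit 100 family inet address 192.168.100.2/24 vrrp-group 1 accept-data

The other peer configures a different physical IRB address (for example, 192.168.100.3/24) and a lower priority, with the same virtual address.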

Note

If you are using the VRRP over IRB method to enable Layer 3 functionality, you must configure static ARP entries through the ICL for the IRB interface of the remote MC-LAG peer to allow routing protocols to run over the IRB interfaces.

For example, the following configures static ARP entries for IRB.21, where ae0.21 is the ICL interface:
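In this sketch, the IP addresses and the peer's IRB MAC address are placeholders; substitute the actual values from your remote MC-LAG peer:

    set interfaces irb unit 21 family inet address 192.168.21.1/24 arp 192.168.21.2 l2-interface ae0.21 mac aa:bb:cc:dd:ee:01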

MAC Address Synchronization

MAC address synchronization enables an MC-LAG peer to forward Layer 3 packets arriving on MC-AE interfaces with either its own IRB MAC address or its peer’s IRB MAC address. Each MC-LAG peer installs its own IRB MAC address as well as the peer’s IRB MAC address in the hardware. Each MC-LAG peer treats the packet as if it were its own packet. If MAC address synchronization is not enabled, the IRB MAC address is installed on the MC-LAG peer as if it was learned on the ICL.

Note

Use MAC address synchronization only if you are not planning to run routing protocols on the IRB interfaces. MAC address synchronization does not support routing protocols on the IRB interfaces. If you need routing capability, configure both VRRP and routing protocols on each MC-LAG peer.

Control packets destined for a particular MC-LAG peer that arrive on an MC-AE interface of its MC-LAG peer are not forwarded on the ICL interface. Additionally, using the gateway IP address as a source address when you issue either a ping, traceroute, telnet, or FTP request is not supported.

Note

Gratuitous ARP requests are not sent when the MAC address on the IRB interface changes.

To enable the MAC address synchronization feature, issue the set vlan vlan-name mcae-mac-synchronize command on each MC-LAG peer. Configure the same IP address on both MC-LAG peers. This IP address is used as the default gateway for the MC-LAG servers or hosts.

Additional guidelines for implementing MAC address synchronization include:

  • Make sure that you configure the primary IP address on both MC-LAG peers. Doing this prevents both MC-LAG peers from becoming assert winners.

  • Using Bidirectional Forwarding Detection (BFD) and MAC address synchronization together is not supported because ARP fails.

Address Resolution Protocol Synchronization for Active/Active MC-LAG Support

The Address Resolution Protocol (ARP) maps IP addresses to MAC addresses. Junos OS uses ARP response packet snooping to support active/active MC-LAGs, providing easy synchronization without the need to maintain any specific state. Without synchronization, if one MC-LAG peer sends an ARP request, and the other MC-LAG peer receives the response, ARP resolution is not successful. With synchronization, the MC-LAG peers synchronize the ARP resolutions by sniffing the packet at the MC-LAG peer receiving the ARP response and replicating this to the other MC-LAG peer. This ensures that the entries in ARP tables on the MC-LAG peers are consistent.

When one of the MC-LAG peers restarts, the ARP destinations on its MC-LAG peer are synchronized. Because the ARP destinations are already resolved, its MC-LAG peer can forward Layer 3 packets out of the MC-AE interface.

Note

In some cases, ARP messages received by one MC-LAG peer are replicated to the other MC-LAG peer through ICCP. This optimization feature is applicable only for ARP replies, not ARP requests, received by the MC-LAG peers.

Note

Dynamic ARP resolution over the ICL interface is not supported. Consequently, incoming ARP replies on the ICL are discarded. However, ARP entries can be populated on the ICL interface through ICCP exchanges from a remote MC-LAG peer.

Note

During graceful Routing Engine switchover (GRES), ARP entries that were learned remotely will be purged and then learned again.

Note

ARP and MAC address tables normally stay synchronized in MC-LAG configurations, but might get out of sync under certain network conditions (such as link flapping). To ensure these tables remain in sync while those conditions are being resolved, we recommend enabling the arp-l2-validate statement on IRB interfaces in an MC-LAG configuration, as follows:
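A sketch of the statement at the [edit interfaces] hierarchy:

    set interfaces irb arp-l2-validate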

This option turns on validation of ARP and MAC table entries, automatically applying updates if they become out of sync.

DHCP Relay with Option 82

Note

DHCP relay is not supported with MAC address synchronization. If DHCP relay is required, configure VRRP over IRB for Layer 3 functionality.

DHCP relay with option 82 provides information about the network location of DHCP clients. The DHCP server uses this information to assign IP addresses or other parameters to the client. With DHCP relay enabled, DHCP request packets might take the path to the DHCP server through either of the MC-LAG peers. Because the MC-LAG peers have different hostnames, chassis MAC addresses, and interface names, you need to observe these requirements when you configure DHCP relay with option 82:

  • Use the interface description instead of the interface name.

  • Do not use the hostname as part of the circuit ID or remote ID strings.

  • Do not use the chassis MAC address as part of the remote ID string.

  • Do not enable the vendor ID.

  • If the ICL interface receives DHCP request packets, the packets are dropped to avoid duplicate packets in the network.

    A counter called Due to received on ICL interface has been added to the output of the show helper statistics command; this counter tracks the packets that the ICL interface drops.

    An example of the CLI output follows:

    user@switch> show helper statistics
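    The following is an illustrative sketch of that output; all counter names and values other than the ICL drop counter are placeholders, and the exact layout varies by Junos OS release:

        BOOTP:
            Received packets:                   120
            Forwarded packets:                  114
            Dropped packets:                    6
              Due to received on ICL interface: 6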

    The output shows that six packets received on the ICL interface have been dropped.

Layer 3 Multicast Feature Support

The Protocol Independent Multicast (PIM) protocol and the Internet Group Management Protocol (IGMP) provide support for Layer 3 multicast.

PIM Operation

In the standard designated router election mode, one of the MC-LAG peers becomes the designated router through the PIM designated router election mechanism. The elected designated router maintains the rendezvous-point tree (RPT) and shortest-path tree (SPT) so that it can receive data from the source device, and it participates in periodic PIM join and prune activities toward the rendezvous point (RP) or the source.

The trigger for initiating these join and prune activities is the IGMP membership reports that are received from interested receivers. IGMP reports received over MC-AE interfaces (potentially hashing on either of the MC-LAG peers) and single-homed links are synchronized to the MC-LAG peer through ICCP.

Both MC-LAG peers receive traffic on their incoming interface (IIF). The non-designated router receives traffic by way of the ICL interface, which acts as a multicast router (mrouter) interface.

If the designated router fails, the non-designated router has to build the entire forwarding tree (RPT and SPT), which can cause multicast traffic loss.

Layer 3 Multicast Configuration Guidelines

When you configure Layer 3 multicast, keep in mind the following guidelines:

  • Enable PIM on the IRB interfaces on both MC-LAG nodes.

  • Configure the ICL interface as a router-facing interface (by configuring the multicast-router-interface statement) for multicast forwarding to work in an MC-LAG environment.

  • On the MC-LAG peer that has status-control active configured, configure a higher IP address or a higher DR priority so that this peer is elected the PIM designated router.
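These guidelines might look like the following sketch on the status-control active peer, where irb.100, VLAN v100, ICL logical interface ae0.0, and the priority value are placeholders:

    set protocols pim interface irb.100 priority 250
    set protocols igmp-snooping vlan v100 interface ae0.0 multicast-router-interface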

MC-LAG Upgrade Guidelines

Upgrade the MC-LAG peers according to the following guidelines.

Note

After a reboot, the MC-LAG interfaces come up immediately and might start receiving packets from the server. If routing protocols are enabled, and the routing adjacencies have not been formed, packets might be dropped.

To prevent this scenario, issue the set interfaces interface-name aggregated-ether-options mc-ae init-delay-time time command to set a time by which the routing adjacencies are formed.

  1. Make sure that both of the MC-LAG peers (node1 and node2) are in the active/active state using the following command on any one of the MC-LAG peers:
    user@switch> show interfaces mc-ae id 1
  2. Upgrade node1 of the MC-LAG.

    When node1 is upgraded, it is rebooted, and all traffic is sent across the available LAG interfaces of node2, which is still up. The amount of traffic lost depends on how quickly the neighbor devices detect the link loss and rehash the flows of the LAG.

  3. Verify that node1 is running the software you just installed by issuing the show version command.
  4. Make sure that both nodes of the MC-LAG (node1 and node2) are in the active/active state after the reboot of node1.
  5. Upgrade node2 of the MC-LAG.

    Repeat Step 1 through Step 3 to upgrade node2.

You can also use unified in-service software upgrade (ISSU) to upgrade the MC-LAG peers. On a device with dual Routing Engines, such as an EX9200, unified ISSU enables you to upgrade between two different Junos OS releases with no disruption on the control plane and with minimal disruption of traffic.

The guidelines for upgrading an MC-LAG using unified ISSU are similar to those for a regular upgrade:

  • You must upgrade each MC-LAG peer independently. A unified ISSU performed on one peer does not trigger an upgrade of the other peer.

  • We recommend that you upgrade each peer sequentially. Wait until one peer is fully upgraded before initiating a unified ISSU on the other peer.

In addition, graceful Routing Engine switchover (GRES) and nonstop active routing (NSR) must be enabled on each peer.
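A sketch of enabling GRES and NSR on each peer before starting a unified ISSU (NSR also requires synchronized commits between the Routing Engines; exact requirements vary by platform):

```
[edit]
user@switch# set chassis redundancy graceful-switchover
user@switch# set routing-options nonstop-routing
user@switch# set system commit synchronize
```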

Summary of MC-LAG Configuration Guidelines

Table 2 summarizes key configuration guidelines for an active/active MC-LAG configuration.

Table 2: Summary of Configuration Guidelines

ICCP and ICL

  • Use the peer loopback IP address for ICCP peering so that a failure of a direct link between the MC-LAG peers does not bring down ICCP. As long as the logical connection between the peers remains up, ICCP stays up.

  • Use separate ports and choose different FPCs for the ICL and ICCP interfaces. Although you can use a single link for the ICCP interface, an aggregated Ethernet interface is preferred.

  • Configure the ICCP liveness-detection interval (the BFD timer) to be at least 8 seconds, if you have configured ICCP connectivity through an IRB interface.

  • Configure a session establishment hold time for ICCP. Doing so results in faster ICCP connection establishment. The recommended value is 50 seconds.

  • Configure a hold-down timer on the ICL member links that is greater than the configured BFD timer for the ICCP interface. Doing so can minimize convergence delay.

MC-LAG Interface

  • Configure LACP on all member links.

  • Configure only one redundancy group between MC-LAG nodes.

  • Configure either backup liveness detection or prefer-status-control-active on both MC-LAG peers to avoid LACP system ID flap during a reboot. Backup liveness detection is the preferred method of avoiding the LACP system ID flap. Use the prefer-status-control-active method only when you can ensure that ICCP goes down only when the node goes down.

Layer 2 Multicast

  • Configure the ICL as a multicast router interface.

  • Configure the multichassis-lag-replicate-state statement under the multicast-snooping-options hierarchy.

Layer 3

  • Use VRRP over IRB or MAC address synchronization to enable Layer 3 routing. We recommend using VRRP over IRB. If you use MAC address synchronization, routing protocols on IRBs are not supported.

  • Configure static ARP on the IRB peers to enable IRB-to-IRB connectivity across the ICL.

  • We recommend enabling the arp-l2-validate statement on IRBs as follows:

    user@host# set interfaces irb arp-l2-validate

Layer 3 Multicast

  • Enable PIM on the IRB interfaces on both MC-LAG nodes.

  • Configure the ICL as a multicast router interface.

  • On the MC-LAG peer that has status-control-active configured, configure a high IP address or a high DR priority.
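The ICCP and multicast guidelines in Table 2 can be sketched as a partial configuration (the addresses are placeholders, and the exact hierarchies can vary by platform and Junos OS release):

```
[edit]
# Peer over the loopback addresses so a direct-link failure does not drop ICCP
user@switch# set protocols iccp local-ip-addr 10.10.10.1
# Session establishment hold time of 50 seconds for faster ICCP connection setup
user@switch# set protocols iccp peer 10.10.10.2 session-establishment-hold-time 50
# BFD timer of at least 8 seconds when ICCP connectivity runs through an IRB interface
user@switch# set protocols iccp peer 10.10.10.2 liveness-detection minimum-interval 8000
# Replicate Layer 2 multicast snooping state between the peers
user@switch# set multicast-snooping-options multichassis-lag-replicate-state
```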

Understanding Multichassis Link Aggregation Group (MC-LAG) Configuration Synchronization

Starting with Junos OS Release 16.1R1, configuration synchronization enables you to easily propagate, synchronize, and commit configurations from one MC-LAG peer to another. You can log in to either MC-LAG peer to manage both peers, giving you a single point of management. To simplify the configuration process, use configuration groups: create one configuration group for the local MC-LAG peer, one for the remote MC-LAG peer, and one for the global configuration, which is the configuration common to both MC-LAG peers.

In addition, conditional groups let you specify when a configuration is synchronized with the other MC-LAG peer. Enable the peers-synchronize statement at the [edit system commit] hierarchy to synchronize the configurations across the MC-LAG peers by default. NETCONF over SSH provides a secure connection between the MC-LAG peers, and Secure Copy Protocol (SCP) copies the configurations securely between them.

To enable MC-LAG configuration synchronization, perform the following steps:

  1. Create configuration groups for local, remote, and global configurations.

  2. Create conditional groups.

  3. Create apply groups.

  4. Enable NETCONF over SSH.

  5. Configure MC-LAG peer details and user authentication details for MC-LAG configuration synchronization.

  6. Enable the peers-synchronize statement or issue the commit peers-synchronize command to synchronize and commit the configurations between local and remote MC-LAG peers.
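Putting the steps together, a condensed sketch on the local peer (Switch A; the group names, hostnames, and credentials are the examples used in the surrounding text, and the group contents are placeholders):

```
[edit]
# Steps 1-3: configuration groups, a conditional (when peers) group, and apply-groups
user@SwitchA# set groups GroupC when peers [ SwitchA SwitchB ]
user@SwitchA# set apply-groups [ GroupA GroupB GroupC ]
# Step 4: NETCONF over SSH for the connection between the peers
user@SwitchA# set system services netconf ssh
# Step 5: remote-peer details and user authentication
user@SwitchA# set system commit peers SwitchB user administrator authentication test123
# Step 6: synchronize configurations on every commit
user@SwitchA# set system commit peers-synchronize
```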

Understanding Configuration Groups

You can create configuration groups for local, remote, and global configurations. A local configuration group is used by the local MC-LAG peer, a remote configuration group is used by the remote MC-LAG peer, and a global configuration group is shared between the local and remote MC-LAG peers.

For example, you could create a local configuration group called Group A, which would include the configuration used by the local MC-LAG peer (Switch A), a remote configuration group called Group B, which would include the configuration used by the remote MC-LAG peer (Switch B), and a global configuration group called Group C, which would include the configuration that is common to both MC-LAG peers.

Create configuration groups at the [edit groups] hierarchy level.

Note

MC-LAG configuration synchronization does not support nested groups.

Understanding Conditional Groups

You can create conditional groups to specify when a particular configuration group is applied. To do this, issue the set groups name-of-group when peers [ static-hostname-of-local-peer static-hostname-of-remote-peer ] command. The when statement defines the conditions under which the configuration group is applied. The peers statement enables you to specify the conditions, which in this case are the static hostnames of the MC-LAG peers. For example, to specify that peers Switch A and Switch B apply the configuration group called Group C, issue the set groups GroupC when peers [ SwitchA SwitchB ] command.

Understanding Apply Groups

To apply the configuration groups, enable the apply-groups statement at the [edit] hierarchy level. For example, to apply the local configuration group (Group A, for example), remote configuration group (Group B), and global configuration group (Group C), issue the set apply-groups [ GroupA GroupB GroupC ] command.

Understanding Peer Configuration Details for MC-LAG Configuration Synchronization

To synchronize configurations between two MC-LAG peers, you need to configure the hostname or IP address, username, and password for both the local and remote MC-LAG peers. To do this, issue the set peers <hostname-of-remote-peer> user <name-of-user> authentication <plain-text-password-string> command at the [edit system commit] hierarchy on each MC-LAG peer. For example, to synchronize a configuration from Switch A to Switch B, issue the set peers SwitchB user administrator authentication test123 command on Switch A. To synchronize a configuration from Switch B to Switch A, issue the set peers SwitchA user administrator authentication test123 command on Switch B. If you only want to synchronize configurations from Switch A to Switch B, you do not need to configure the peers statement on Switch B.

The configuration details from the peers statements are also used to establish a NETCONF over SSH connection between the MC-LAG peers. To enable NETCONF over SSH, issue the set system services netconf ssh command on both MC-LAG peers.

Understanding How Configurations Are Synchronized Between MC-LAG Peers

The local (or requesting) MC-LAG peer on which you enable the peers-synchronize statement or issue the commit peers-synchronize command copies and loads its configuration to the remote (or responding) MC-LAG peer. Each MC-LAG peer then performs a syntax check on the configuration file being committed. If no errors are found, the configuration is activated and becomes the current operational configuration on both MC-LAG peers. The commits are propagated using a remote procedure call (RPC).

The following events occur during configuration synchronization:

  1. The local MC-LAG peer sends the sync-peers.conf file (the configuration that will be shared with the peer specified in the conditional group) to the remote MC-LAG peer.

  2. The remote MC-LAG peer loads the configuration, sends the results of the load to the local MC-LAG peer, exports its configuration to the local MC-LAG peer, and replies that the commit is complete.

  3. The local MC-LAG peer reads the reply from the remote MC-LAG peer.

  4. If successful, the configuration is committed.

Configuration synchronization fails if the remote MC-LAG peer is unavailable, or if the remote MC-LAG peer is reachable but one of the following failures occurs:

  • SSH connection fails because of user and authentication issues.

  • Junos OS RPC fails because a lock cannot be obtained on the remote database.

  • Loading the configuration fails because of syntax problems.

  • Commit check fails.

The peers-synchronize statement uses the hostname or IP address, username, and password for the MC-LAG peers that you configured in the peers statement. With the peers-synchronize statement enabled, you can simply issue the commit command to synchronize the configuration from one MC-LAG peer to the other. For example, if you configured the peers statement on the local MC-LAG peer and want to synchronize the configuration with the remote MC-LAG peer, issue the commit command on the local MC-LAG peer. If the remote MC-LAG peer is not reachable when you issue the commit command, you receive a warning message saying that the remote MC-LAG peer is not reachable and that only the configuration on the local MC-LAG peer is committed.

If you do not have the peers statement configured with the remote MC-LAG peer information and you issue the commit command, only the configuration on the local MC-LAG peer is committed. If the remote MC-LAG peer is unreachable and there are other failures, the commit is unsuccessful on both the local and remote MC-LAG peers.

Note

When you enable the peers-synchronize statement and issue the commit command, the commit might take longer than a normal commit. Even if the configuration is the same across the MC-LAG peers and does not require synchronization, the system still attempts to synchronize the configurations.

The commit peers-synchronize command also uses the hostname or IP address, username, and password for the MC-LAG peers configured in the peers statement. If you issue the commit peers-synchronize command on the local MC-LAG peer to synchronize the configuration with the remote MC-LAG peer and the remote MC-LAG peer is reachable but there are other failures, the commit fails on both the local and remote MC-LAG peers.
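For a one-time synchronized commit instead of enabling peers-synchronize by default, issue the command directly on the local peer:

```
[edit]
user@SwitchA# commit peers-synchronize
```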

Understanding Multichassis Link Aggregation Group (MC-LAG) Configuration Consistency Check

Starting with Junos OS Release 16.1R1, the configuration consistency check feature was introduced. Configuration consistency check uses the Inter-Chassis Control Protocol (ICCP) to exchange MC-LAG configuration parameters (chassis ID, service ID, and so on) and checks for configuration inconsistencies across MC-LAG peers. An example of an inconsistency is configuring identical chassis IDs on both peers instead of unique chassis IDs. When there is an inconsistency, you are notified and can take action to resolve it. Configuration consistency check is invoked after you issue a commit on an MC-LAG peer.

The following events take place during configuration consistency check after you issue a commit on the local MC-LAG peer:

  1. Commit an MC-LAG configuration on the local MC-LAG peer.

  2. ICCP parses the MC-LAG configuration and then sends the configuration to the remote MC-LAG peer.

  3. The remote MC-LAG peer receives the MC-LAG configuration from the local MC-LAG peer and compares it with its own MC-LAG configuration.

    If there is a severe inconsistency between the two MC-LAG configurations, the MC-LAG interface is brought down, and syslog messages are issued.

    If there is a moderate inconsistency between the two configurations, syslog messages are issued.

The following events take place during configuration consistency check after you issue a commit on the remote MC-LAG peer:

  1. Commit an MC-LAG configuration on the remote MC-LAG peer.

  2. ICCP parses the MC-LAG configuration and then sends the configuration to the local MC-LAG peer.

  3. The local MC-LAG peer receives the configuration from the remote MC-LAG peer and compares it with its own configuration.

    If there is a severe inconsistency between the two configurations, the MC-LAG interface is brought down, and syslog messages are issued.

    If there is a moderate inconsistency between the two configurations, syslog messages are issued.

There are different configuration consistency requirements depending on the MC-LAG parameters. The consistency requirements are either identical or unique, which means that some parameters must be configured identically and some must be configured uniquely on the MC-LAG peers. For example, the chassis ID must be unique on both peers, whereas the LACP mode must be identical on both peers.

Note

For information on the MC-LAG parameters that are checked for consistency, as well as the commands you can issue to verify configuration consistency check, see Understanding Multichassis Link Aggregation Group Configuration Consistency Check.

The enforcement level of the consistency requirements (identical or unique) is either mandatory or desired. When the enforcement level is mandatory and you configure the MC-LAG parameter incorrectly, the system brings down the MC-LAG interface and issues a syslog message. For example, you receive a syslog message that says: “Some of the Multichassis Link Aggregation (MC-LAG) configuration parameters between the peer devices are not consistent. The concerned MC-LAG interfaces were explicitly brought down to prevent unwanted behavior.” When you correct the inconsistency and issue a successful commit, the system brings up the interface. When the enforcement level is desired and you configure the MC-LAG parameter incorrectly, you receive a syslog message that says: “Some of the Multichassis Link Aggregation (MC-LAG) configuration parameters between the peer devices are not consistent. This may lead to sub-optimal performance of the feature.” As the message notes, performance is suboptimal in this situation.

You can also issue the show interfaces mc-ae command to display the configuration consistency check status of the multichassis aggregated Ethernet interface. If there are multiple inconsistencies, only the first inconsistency is shown. If the enforcement level for an MC-LAG parameter is mandatory and you did not configure that parameter correctly, the command shows that the MC-LAG interface is down.

When you issue a commit on the local peer, and the remote peer is not reachable, configuration consistency check will pass so that the local peer can come up in standalone mode. When the remote peer becomes reachable, ICCP exchanges the configurations between the peers. If the consistency check fails, the MC-LAG interface goes down, and the system notifies you of the parameter that caused the inconsistency. When you correct the inconsistency, and issue a successful commit, the system brings up the interface.

Consistency check is not enabled by default. To enable consistency check, issue the set multi-chassis mc-lag consistency-check command.
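A sketch of enabling consistency check and then verifying the status (output varies by configuration, so none is shown here):

```
[edit]
user@switch# set multi-chassis mc-lag consistency-check
user@switch# commit

user@switch> show interfaces mc-ae
```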