Configuring a Virtual Chassis Heartbeat Connection

Starting in Junos OS Release 14.1, you must configure an IP-based, bidirectional “heartbeat” packet connection between the primary router and backup router in an MX Series Virtual Chassis. The member routers that form this connection exchange heartbeat packets that report the health and availability of each member router. During a disruption or split in the Virtual Chassis configuration, the heartbeat connection prevents the member routers from changing primary roles unnecessarily. Without the heartbeat connection, a role change in such a situation can produce undesirable results, such as two Virtual Chassis primary routers or no Virtual Chassis primary router.

Benefits of Configuring a Virtual Chassis Heartbeat Connection

Configuring a Virtual Chassis heartbeat connection provides the following benefits for an MX Series Virtual Chassis:

  • Improved resiliency during failure scenarios

    Configuring the heartbeat connection improves resiliency of the Virtual Chassis in the event of an adjacency disruption or split caused by a failure of the Virtual Chassis port interfaces, or when one of the member routers goes out of service. If the heartbeat connection detects that the Virtual Chassis primary router (VC-P) is still operating and able to respond during a split, the software maintains the primary role on the existing VC-P, isolates the Virtual Chassis backup router (VC-B) until the Virtual Chassis recovers, and resumes the backup role on the VC-B when the Virtual Chassis forms again. As a result, the heartbeat connection prevents the member routers from changing primary roles unnecessarily. Such changes consume system resources and can cause unexpected and undesirable results.

    When the VC-B is isolated during a disruption, the software immediately restarts all line cards and powers off all network ports until the disruption is resolved and the Virtual Chassis forms again. This behavior supports network applications with external equipment that requires a physical link-down condition to switch the traffic paths to other connections.

  • Enhanced primary-role election process

    The Virtual Chassis Control Protocol (VCCP) controls primary-role election in a Virtual Chassis. When you configure the heartbeat connection in an MX Series Virtual Chassis, the VCCP software assesses the health information collected from the heartbeat connection to help determine which member router should become the global primary (VC-P) in the event of an adjacency disruption or split. When the heartbeat connection detects that the peer member router is responsive, the VCCP software suppresses unnecessary changes in primary role.

    By contrast, when the heartbeat connection is not configured, the VCCP software does not have this additional health information when it determines the appropriate roles after a disruption or split.

  • Ability to easily view and clear statistics related to the heartbeat connection

    Operational commands for the Virtual Chassis enable you to display the status of the heartbeat connection, review detailed statistics and latency measurements related to the heartbeat connection, and clear heartbeat-related statistics counters and timestamp fields for one or both member routers.

Configuration Requirements for the Heartbeat Connection

To establish a heartbeat connection for an MX Series Virtual Chassis, you must configure a secure and reliable route between the primary router and backup router for the exchange of TCP/IP heartbeat packets. Specifically, you must ensure that the primary Routing Engine in the Virtual Chassis backup router (VC-Bp) can make a TCP/IP connection to the master-only IP address of the primary Routing Engine in the Virtual Chassis primary router (VC-Pp).

The following additional requirements apply when you configure the heartbeat connection:

  • Configure the heartbeat connection only between Virtual Chassis member routers eligible to become the Virtual Chassis primary router, also known as the protocol primary or global primary.

    In a two-member MX Series Virtual Chassis configuration, you assign the routing-engine role to each router as part of the preprovisioned configuration. The routing-engine role enables the router to function either as the primary router or backup router of the Virtual Chassis as needed. As a result, you can configure the heartbeat connection between both member routers in a two-member MX Series Virtual Chassis configuration.

  • Use the router’s Ethernet management interface (fxp0) as the heartbeat path.

    The management interface is generally available earlier than the line card interfaces, and is typically connected to a more secure network than the other interfaces.

  • Configure a master-only IP address for the fxp0 management interface to ensure consistent access to the VC-Pp, regardless of which Routing Engine is currently active.

    The master-only address is active only on the management interface for the VC-Pp. During a switchover, the master-only address moves to the new Routing Engine currently functioning as the VC-Pp.

  • Ensure TCP connectivity between the VC-Pp and VC-Bp member routers.

    The Virtual Chassis heartbeat connection opens a proprietary TCP port numbered 33087 on the VC-Pp to listen for heartbeat messages. If your network design includes firewalls or filters, make sure the network allows traffic between TCP port 33087 on the VC-Pp and the dynamically allocated TCP port on the VC-Bp.

  • When using a heartbeat connection, do not configure the no-split-detection statement as part of the preprovisioned Virtual Chassis configuration.

    The no-split-detection statement suppresses any action when a split is detected in the Virtual Chassis. Using the no-split-detection statement is prohibited when you configure a heartbeat connection, and the software prevents you from configuring both the no-split-detection and heartbeat-address statements at the same time. If you attempt to do so, the software displays an error message and causes the commit operation to fail.
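The TCP connectivity requirement in the list above can be spot-checked from a host on the management network before you rely on the heartbeat connection. The following Python sketch is illustrative only and is not a Juniper tool: the function name heartbeat_port_reachable and the 2-second default timeout are assumptions, and the only value taken from this topic is TCP port 33087. A successful result shows only that the path and any firewalls or filters permit a TCP handshake to that port.

```python
import socket

# TCP port on which the VC-Pp listens for heartbeat messages (from this topic).
HEARTBEAT_PORT = 33087


def heartbeat_port_reachable(address: str, port: int = HEARTBEAT_PORT,
                             timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (address, port) can be opened.

    `address` is expected to be the master-only IP address of the VC-Pp.
    The function name and the default timeout are illustrative assumptions.
    """
    try:
        # create_connection() performs the TCP handshake; the socket is
        # closed again as soon as the `with` block exits.
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

For example, heartbeat_port_reachable("192.0.2.10") tests a placeholder address from the documentation range; substitute the master-only address used in your network.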

In a two-member MX Series Virtual Chassis, you can configure a heartbeat connection with both member routers in the same subnet, or with each member router in a different subnet. Table 1 summarizes the important differences between the configuration procedures for member routers in the same subnet and member routers in different subnets.

Table 1: Comparison of Heartbeat Connection Configuration Tasks for Member Routers in Same Subnet and Member Routers in Different Subnets

Task: Configure the master-only IP address for the fxp0 management interface.

  • Member routers in same subnet: Configure the same fxp0 master-only IP address for all four member Routing Engines.

  • Member routers in different subnets: Configure two different master-only IP addresses for the fxp0 management interface: one address for the subnet in which the Virtual Chassis primary router resides, and one for the subnet in which the backup router resides.

Task: Configure a network path for the heartbeat connection.

  • Member routers in same subnet: Provide a path for the member routers to reach each other by means of a TCP/IP connection. For example, in a Virtual Chassis with member routers in the same subnet, you can use the router’s default gateway. Alternatively, you can create a global static route as described in Example: Determining Member Health Using an MX Series Virtual Chassis Heartbeat Connection with Member Routers in the Same Subnet.

  • Member routers in different subnets: Provide a path for the member routers to reach each other by means of a TCP/IP connection. In a Virtual Chassis with member routers in different subnets, you must ensure that both member routers can reach each other’s network. For example, you can create static routes to both subnets on each member Routing Engine, as described in Example: Determining Member Health Using an MX Series Virtual Chassis Heartbeat Connection with Member Routers in Different Subnets.

Task: Configure the heartbeat address to establish the heartbeat connection.

  • Member routers in same subnet: Configure a single (global) master-only IP address for the fxp0 management interface as the heartbeat address to establish the connection.

  • Member routers in different subnets: Configure a heartbeat address for each member Routing Engine to cross-connect to the master-only IP address for the corresponding Routing Engine in the other subnet. For example, assume that member0-re0 and member0-re1 reside in subnet 10.4.0.0, and member1-re0 and member1-re1 reside in subnet 10.5.0.0. In this configuration, you would set the heartbeat address for member0-re0 to the master-only IP address for member1-re0 to cross-connect member0-re0 and member1-re0. You would cross-connect member0-re1 and member1-re1 in a similar manner.
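The heartbeat address assignments in Table 1 can be worked through as a small planning aid. The following Python sketch is not Junos OS configuration, and the function name heartbeat_addresses is hypothetical. It assumes that each member router has one fxp0 master-only address shared by its two Routing Engines, and that the subnets in the example use a /16 prefix length, which this topic does not state. The host addresses in the usage note are also invented for illustration.

```python
import ipaddress


def heartbeat_addresses(master_only: dict[str, str]) -> dict[str, str]:
    """Map each Routing Engine to the heartbeat address it would point to.

    `master_only` maps "member0" and "member1" to that member's fxp0
    master-only address in interface notation, for example "10.4.2.210/16".
    """
    if set(master_only) != {"member0", "member1"}:
        raise ValueError("expected exactly the keys member0 and member1")

    iface = {m: ipaddress.ip_interface(a) for m, a in master_only.items()}
    same_subnet = iface["member0"].network == iface["member1"].network

    # Same-subnet design: one global master-only address for all four REs.
    if same_subnet and iface["member0"].ip != iface["member1"].ip:
        raise ValueError("members in the same subnet share one master-only address")

    plan = {}
    for member, peer in (("member0", "member1"), ("member1", "member0")):
        for re in ("re0", "re1"):
            # Different subnets: cross-connect to the peer's master-only address.
            target = iface[member] if same_subnet else iface[peer]
            plan[f"{member}-{re}"] = str(target.ip)
    return plan
```

With the hypothetical inputs {"member0": "10.4.2.210/16", "member1": "10.5.2.210/16"}, both member0 Routing Engines map to 10.5.2.210 and both member1 Routing Engines map to 10.4.2.210, which matches the cross-connect arrangement described for different subnets.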

How the Heartbeat Connection Works

When the Virtual Chassis is operating properly, the heartbeat connection periodically sends heartbeat packets over the TCP/IP path between the primary Routing Engine in the Virtual Chassis primary router and the primary Routing Engine in the Virtual Chassis backup router.

When an adjacency disruption or split is detected in the Virtual Chassis, each member router sends a final heartbeat message to determine whether the other member is able to respond, and stops sending additional periodic messages until the Virtual Chassis forms again. The other member must respond to the heartbeat message within the default heartbeat timeout period (2 seconds), or within a configured heartbeat timeout period in the range 1 through 60 seconds. To determine the time period that elapses in your network between transmission of a heartbeat request message and receipt of a heartbeat response message, you can issue the show virtual-chassis heartbeat detail command to view the number of seconds reported in the Maximum latency and Minimum latency fields.

Best Practice:

If your network is congested or has a round-trip latency that exceeds 2 seconds, we recommend that you increase the value of the heartbeat timeout period to account for this delay during a Virtual Chassis adjacency disruption or split.
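The relationship between measured latency and the heartbeat timeout can be expressed as a simple check. The following Python sketch is illustrative and is not part of Junos OS: the function names are hypothetical, and treating a response that arrives exactly at the timeout as on time is an assumption. The default of 2 seconds and the range of 1 through 60 seconds come from this topic. The latency value would be the figure you read from the Maximum latency field.

```python
DEFAULT_TIMEOUT_S = 2                  # default heartbeat timeout period
MIN_TIMEOUT_S, MAX_TIMEOUT_S = 1, 60   # configurable range, in seconds


def check_timeout(timeout_s: int = DEFAULT_TIMEOUT_S) -> int:
    """Return timeout_s if it lies in the configurable range, else raise."""
    if not MIN_TIMEOUT_S <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(
            f"heartbeat timeout must be {MIN_TIMEOUT_S} through "
            f"{MAX_TIMEOUT_S} seconds, got {timeout_s}"
        )
    return timeout_s


def timeout_covers_latency(max_latency_s: float,
                           timeout_s: int = DEFAULT_TIMEOUT_S) -> bool:
    """True if a response delayed by max_latency_s still arrives in time.

    Assumption: a response arriving exactly at the timeout counts as on time.
    """
    return max_latency_s <= check_timeout(timeout_s)
```

For example, a maximum latency of 3.5 seconds is not covered by the default timeout, but it is covered after the timeout is raised to 5 seconds.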

Heartbeat Connection and Virtual Chassis Failure Conditions

Configuring the heartbeat connection prevents unnecessary primary role changes between the Virtual Chassis member routers when an adjacency disruption or split occurs. Table 2 describes the effects on primary role for common failure conditions when you enable the heartbeat connection in a two-member MX Series Virtual Chassis.

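Table 2: Effect of Heartbeat Connection on Common Virtual Chassis Failure Conditions

Failure condition: Virtual Chassis port interfaces go down.

  • Result on Virtual Chassis primary router (VC-P): Retains VC-P role.

  • Result on Virtual Chassis backup router (VC-B): If the VC-P is in service but the Virtual Chassis port interfaces are down, the VC-B goes offline after the heartbeat timeout period expires because the Routing Engine state is invalid.

Failure condition: VC-P chassis fails.

  • Result on Virtual Chassis primary router (VC-P): Goes out of service.

  • Result on Virtual Chassis backup router (VC-B): Becomes VC-P.

Failure condition: VC-B chassis fails.

  • Result on Virtual Chassis primary router (VC-P): Retains VC-P role.

  • Result on Virtual Chassis backup router (VC-B): Goes out of service.

Failure condition: Heartbeat connection fails.

  • Result on Virtual Chassis primary router (VC-P): Retains VC-P role.

  • Result on Virtual Chassis backup router (VC-B): Retains VC-B role.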

In all cases except when the VC-P chassis fails, primary role of the Virtual Chassis is maintained on the existing VC-Pp if the heartbeat connection detects that the VC-P is still operating and able to respond during a split. Preventing an unnecessary role change minimizes the system load caused by a protocol primary role switch, and reduces the likelihood of unpredictable results.

Losing both the Virtual Chassis heartbeat connection and VCP adjacency at the same time is a double-fault condition that effectively returns the Virtual Chassis to “no-split-detection” behavior, because neither member router can verify the condition of its peer. In this case, the heartbeat feature applies the “no-split-detection” reactions: the VC-P remains in the protocol primary role, and the VC-B also takes the VC-P role. This “split-master” condition is not ideal for routing protocols and other topology management mechanisms, but in this scenario it is better than operating without any protocol primary member.

Virtual Chassis Heartbeat communication is active only when the Virtual Chassis is properly formed with successful election of VC-P and VC-B member chassis roles. Roles determined during a VCP adjacency loss are maintained until the Virtual Chassis is properly formed again. Disruption of Virtual Chassis Heartbeat connectivity does not impact protocol roles in the Virtual Chassis through the duration of “split” conditions.
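The outcomes described in this section can be summarized from the point of view of a single member router at the moment a split is detected. The following Python sketch is illustrative only: the names Role, Outcome, and resolve_split are hypothetical, and the logic is inferred from Table 2 and the double-fault description above, not taken from the VCCP implementation.

```python
from enum import Enum


class Role(Enum):
    VC_P = "VC-P"   # Virtual Chassis primary router
    VC_B = "VC-B"   # Virtual Chassis backup router


class Outcome(Enum):
    REMAIN_PRIMARY = "retains VC-P role"
    BECOME_PRIMARY = "becomes VC-P"
    ISOLATE = "goes offline until the Virtual Chassis forms again"


def resolve_split(role: Role, peer_responded: bool) -> Outcome:
    """Decide what a member does after VCP adjacency is lost.

    `peer_responded` says whether the peer answered the final heartbeat
    message within the heartbeat timeout period.
    """
    if role is Role.VC_P:
        # The existing primary keeps its role whether or not the peer answers.
        return Outcome.REMAIN_PRIMARY
    if peer_responded:
        # The VC-P is alive, so the backup is isolated rather than promoted.
        return Outcome.ISOLATE
    # No answer: the VC-P chassis failed, or this is the double-fault case.
    return Outcome.BECOME_PRIMARY
```

In the double-fault case, each member calls resolve_split with peer_responded set to False, so the VC-P remains primary and the VC-B also becomes primary, which is the split-master condition described above.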

Heartbeat Connection Compared to Split Detection

In certain Virtual Chassis failure conditions, the split detection setting (enabled by default, or explicitly disabled) can cause unpredictable and undesirable results such as a Virtual Chassis with two primary routers, or a Virtual Chassis with no primary router.

Best Practice:

You must use the heartbeat connection instead of the split detection feature in an MX Series Virtual Chassis to avoid unnecessary primary role changes during an adjacency disruption or split, and to provide additional member health information for the primary-role election process.

Table 3 compares the effects of split detection and the heartbeat connection for two common failure conditions: failure of the Virtual Chassis port interfaces and failure of the VC-B chassis.

Table 3: Comparison of Heartbeat Connection and Split Detection for Virtual Chassis Failure Conditions

Failure condition: Virtual Chassis port interfaces go down.

Results with heartbeat connection:

  • VC-P chassis retains VC-P role.

  • If the VC-P chassis is in service but the Virtual Chassis port interfaces are down, the VC-B chassis goes offline after the heartbeat timeout period expires because the Routing Engine state is invalid.

Results with split detection, when split detection is disabled:

  • VC-P chassis retains VC-P role.

  • VC-B chassis also takes VC-P role.

  • Virtual Chassis has two primary routers, each of which maintains subscriber state information. The effect on subscribers, traffic patterns, behavior of external applications, and subscriber login and logout operations is unpredictable while the Virtual Chassis port interfaces are disconnected.

Failure condition: VC-B chassis fails.

Results with heartbeat connection:

  • VC-P chassis retains VC-P role.

  • VC-B chassis is out of service.

Results with split detection, when split detection is enabled:

  • VC-P chassis takes line-card (VC-L) role, which isolates and removes it from the Virtual Chassis until connectivity is restored.

  • VC-B chassis is out of service.

  • Virtual Chassis does not have a primary router. This state halts interchassis routing and effectively disables the Virtual Chassis configuration.

Release History Table

Release: 14.1

Description: Starting in Junos OS Release 14.1, you must configure an IP-based, bidirectional “heartbeat” packet connection between the primary router and backup router in an MX Series Virtual Chassis.