The control plane software, which operates in active/backup mode, is an integral part of JUNOS software that is active on the primary node of a cluster. It achieves redundancy by communicating state, configuration, and other information to the inactive Routing Engine on the secondary node. If the primary Routing Engine fails, the secondary one is ready to assume control.
The control plane software:
Information from the control plane software follows two paths:
The control plane software running on the primary Routing Engine maintains state for the entire cluster, and only processes running on its node can update state information. The primary Routing Engine synchronizes state for the secondary node and also processes all host traffic.
The control link relies on a proprietary protocol to transmit session state, configuration, and liveliness signals across the nodes.
On SRX 5600 and SRX 5800 devices, all control ports are disabled by default. Each SPC in a device has two control ports, and each device can have multiple SPCs installed. To set up the control link in a chassis cluster with SRX 5600 or SRX 5800 devices, you connect and configure the control ports that you will use on each device (one fpc n port per device, where n is the FPC slot number) and then initialize the device in cluster mode.
For SRX 3400 and SRX 3600 devices, there are dedicated chassis cluster (HA) control ports on the switch fabric board. No control link configuration is needed for SRX 3400 and SRX 3600 devices.
For SRX 210 devices, the fe-0/0/7 interface is used for the control link.
In a J-series chassis cluster, the control link is a physical connection between the ge-0/0/3 ports on each device. Both ports are renamed fxp1 when the devices are initialized in cluster mode.
To set up the control link on J-series devices, you connect the control interfaces on the two devices back-to-back. When you initialize a device in cluster mode, JUNOS software renames the control interface to fxp1 and uses that interface for the cluster control link. To enable the control link to transmit data, the system provides each fxp1 control link interface with an internal IP address.
JUNOS software transmits heartbeat signals over the control link at a configured interval. The system uses heartbeat transmissions to determine the “health” of the control link. When the number of missed heartbeats reaches the configured threshold, the system assesses whether a failure condition exists.
You specify the heartbeat threshold and heartbeat interval when you configure the chassis cluster.
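The relationship between the heartbeat interval and the heartbeat threshold can be illustrated with a minimal Python sketch. This is not JUNOS code: the class name `HeartbeatMonitor`, the example values (1000 ms and 3), and the choice to count consecutive misses and reset the counter on any received heartbeat are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical model: counts consecutive missed heartbeats."""
    interval_ms: int   # heartbeat interval (illustrative value only)
    threshold: int     # missed heartbeats tolerated before a failure assessment
    missed: int = 0

    def tick(self, heartbeat_received: bool) -> bool:
        """Record one interval; return True when a failure assessment is due."""
        if heartbeat_received:
            self.missed = 0          # assumption: any heartbeat resets the count
        else:
            self.missed += 1
        return self.missed >= self.threshold

    def detection_window_ms(self) -> int:
        """Shortest span of continuous loss that reaches the threshold."""
        return self.interval_ms * self.threshold


monitor = HeartbeatMonitor(interval_ms=1000, threshold=3)
results = [monitor.tick(received) for received in (True, False, False, False)]
print(results)                        # [False, False, False, True]
print(monitor.detection_window_ms())  # 3000
```

Under these assumptions, a shorter interval or a lower threshold shortens the detection window, at the cost of reacting to brief interruptions on the control link.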
The system monitors the control link's status by default.
If the control link fails, JUNOS software disables the secondary node to prevent both nodes from becoming primary for all redundancy groups, including redundancy group 0.
In the event of a legitimate control link failure, redundancy group 0 remains primary on the node on which it is currently primary, inactive redundancy groups x on the primary node become active, and the secondary node enters a disabled state in which it is not handling traffic.
Note: When the secondary node is disabled, you can still log in to the management port and run diagnostics.
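The outcome of a legitimate control link failure described above can be modeled as a small state change. The following Python sketch is illustrative only: the function name `on_control_link_failure`, the node labels `node0` and `node1`, and the state strings are hypothetical and do not correspond to any JUNOS interface.

```python
from typing import Dict, Tuple


def on_control_link_failure(
    rg_primary: Dict[int, str], nodes: Tuple[str, str]
) -> Tuple[Dict[int, str], Dict[str, str]]:
    """Return (redundancy-group ownership, node states) after a
    legitimate control link failure, following the prose description."""
    rg0_owner = rg_primary[0]  # redundancy group 0 stays where it is
    # Every redundancy group becomes active on the node that holds group 0.
    new_rg_primary = {rg: rg0_owner for rg in rg_primary}
    # The other node is disabled and stops handling traffic.
    node_state = {
        node: "primary" if node == rg0_owner else "disabled" for node in nodes
    }
    return new_rg_primary, node_state


ownership, states = on_control_link_failure(
    {0: "node0", 1: "node0", 2: "node1"}, ("node0", "node1")
)
print(ownership)  # {0: 'node0', 1: 'node0', 2: 'node0'}
print(states)     # {'node0': 'primary', 'node1': 'disabled'}
```

In this example, redundancy group 2 was active on node1 before the failure and moves to node0, the node that is primary for redundancy group 0.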
To determine if a legitimate control link failure has occurred, the system relies on redundant liveliness signals sent across the control link and the data link.
The system periodically transmits probes over the fabric data link and heartbeat signals over the control link. Probes and heartbeat signals share a common sequence number that maps them to a unique time event. The software identifies a legitimate control link failure if the following two conditions exist:
When a legitimate control link failure occurs, the following conditions apply:
If the system cannot determine which Routing Engine is primary, the node with the higher priority value for redundancy group 0 is primary and its Routing Engine is active. (You configure the priority for each node when you configure the redundancy-group statement for redundancy group 0.)
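As a rough illustration of the priority rule, the Python sketch below picks the node with the higher redundancy group 0 priority. The function name `elect_primary`, the node labels, and the priority values are hypothetical. The text does not say what happens when priorities are equal, so the sketch simply returns the first node listed in that case, which is an assumption.

```python
from typing import Mapping


def elect_primary(rg0_priority: Mapping[str, int]) -> str:
    """Return the node with the highest redundancy group 0 priority.

    Tie handling is NOT specified by the source text; max() returns the
    first node encountered with the highest value (an assumption).
    """
    return max(rg0_priority, key=lambda node: rg0_priority[node])


primary = elect_primary({"node0": 100, "node1": 200})
print(primary)  # node1
```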
To recover a device from disabled mode, you must reboot the device. When you reboot the disabled node, the node will synchronize its dynamic state with the primary node.
Note: If you make any changes to the configuration while the secondary node is disabled, execute the commit command to synchronize the configuration after you reboot the node. If you did not make configuration changes, the configuration file remains synchronized with that of the primary node.
You cannot enable preemption for redundancy group 0. If you want to change the primary node for redundancy group 0, you must do a manual failover.
To have the system recover from a control link failure automatically, configure the control-link-recovery statement. In this case, once the system determines that the control link is healthy, it issues an automatic reboot on the disabled node. When the disabled node reboots, it rejoins the cluster.
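The difference between manual and automatic recovery can be summarized as a small decision function. This Python sketch is a simplified model of the behavior described in this topic, not the JUNOS implementation; the function name `recovery_action` and the returned strings are hypothetical.

```python
def recovery_action(
    node_disabled: bool, link_healthy: bool, auto_recovery: bool
) -> str:
    """Describe what brings a disabled node back into the cluster.

    auto_recovery models whether control-link-recovery is configured.
    """
    if not node_disabled:
        return "no action needed"
    if not auto_recovery:
        # Without automatic recovery, an operator must reboot the node.
        return "manual reboot required"
    if link_healthy:
        # The system reboots the disabled node, which then rejoins the cluster.
        return "automatic reboot"
    return "wait for healthy control link"


print(recovery_action(True, True, True))    # automatic reboot
print(recovery_action(True, True, False))   # manual reboot required
print(recovery_action(True, False, True))   # wait for healthy control link
```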