Troubleshooting a Redundancy Group that Does Not Fail Over in an SRX Chassis Cluster

Problem

Description

A redundancy group (RG) in a high-availability (HA) SRX chassis cluster does not fail over.

Environment

SRX chassis cluster

Diagnosis

From the command prompt of the SRX Series Services Gateway that is part of the chassis cluster, run the show chassis cluster status command.

Sample output:

In the sample output, check the priority of the redundancy group that does not fail over.

Resolution

Redundancy Group Manual Failover

  1. Check whether a manual failover of the redundancy group was initiated earlier by using the show chassis cluster status command.

    Sample output:

    In the sample output, the priority value of redundancy group 1 (RG1) is 255 and the status of Manual failover is yes, which means that a manual failover of the redundancy group was initiated earlier. You must reset the redundancy group priority.

    Note:

    After a manual failover of a redundancy group, we recommend that you reset the manual failover flag in the cluster status to allow further failovers.

  2. Reset the redundancy group priority by using the request chassis cluster failover reset redundancy-group <1-128> command.

    For example:

  3. This should resolve the issue and allow further redundancy group failovers. If these steps do not resolve the issue, proceed to the section What's Next.

  4. If you want to initiate a redundancy group x (redundancy groups numbered 1 through 128) failover manually, see Understanding Chassis Cluster Redundancy Group Manual Failover.
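The check and reset described in the steps above can be sketched as a small helper. This is an illustrative sketch only: the function names `needs_manual_failover_reset` and `build_reset_command` are hypothetical, and the code assumes you have already read the priority and the Manual failover value from the show chassis cluster status output yourself. It only builds the command string; it does not connect to a device.

```python
def needs_manual_failover_reset(priority: int, manual_failover: str) -> bool:
    """Return True when the status fields indicate an earlier manual failover.

    Per the steps above: a priority of 255 together with a Manual failover
    value of "yes" means a manual failover was initiated earlier.
    """
    return priority == 255 and manual_failover.strip().lower() == "yes"


def build_reset_command(redundancy_group: int) -> str:
    """Build the reset command for a redundancy group numbered 1 through 128."""
    if not 1 <= redundancy_group <= 128:
        raise ValueError("redundancy group must be between 1 and 128")
    return (
        "request chassis cluster failover reset "
        f"redundancy-group {redundancy_group}"
    )


if __name__ == "__main__":
    # Hypothetical values read from the status output for RG1.
    if needs_manual_failover_reset(255, "yes"):
        print(build_reset_command(1))
```

The range check mirrors the <1-128> argument shown in step 2, so an out-of-range group number is rejected before a command is ever issued.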

Redundancy Group Auto Failover

  1. Check the configuration and link status of the control and fabric links by using the show chassis cluster interfaces command.

    Sample output for a branch SRX Series Services Gateway:

    Sample output for a high-end SRX Series Services Gateway:

  2. Proceed to Step 3 if both the control link and fabric link are up.

  3. Check the interface monitoring or IP monitoring configuration. If the configuration is not correct, rectify it. If the configuration is correct, proceed to Step 4.

  4. Check the priority of each node in the output of the show chassis cluster status command.

    • If the priority is 0, see KB article KB16869 for JSRP (Junos OS Services Redundancy Protocol) chassis clusters and KB article KB19431 for branch SRX Series Firewalls.

    • If the priority is 255, see Redundancy Group Manual Failover.

    • If the priority is between 1 and 254 and the redundancy group still does not fail over, proceed to the section What's Next.
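The priority-based branching in step 4 can be summarized as a simple lookup. The following is a minimal sketch, not a Juniper tool: the function name `next_step` is hypothetical, and it assumes the priority has already been read from the show chassis cluster status output as an integer.

```python
def next_step(priority: int) -> str:
    """Map a node's redundancy group priority to the follow-up named in step 4."""
    if priority == 0:
        return (
            "See KB16869 (JSRP chassis clusters) or "
            "KB19431 (branch SRX Series Firewalls)"
        )
    if priority == 255:
        return "See Redundancy Group Manual Failover"
    if 1 <= priority <= 254:
        return "Proceed to What's Next"
    # Values outside 0-255 are not covered by the steps above.
    raise ValueError(f"unexpected priority: {priority}")


if __name__ == "__main__":
    for value in (0, 100, 255):
        print(value, "->", next_step(value))
```

Raising an error for values outside 0 through 255 is a design choice of this sketch, so that a misread field does not silently fall into one of the three documented cases.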

What's Next

  1. If these steps do not resolve the issue, see KB article KB15911 for redundancy group failover tips.

  2. If you wish to debug further, see KB article KB21164 to check the debug logs.

  3. To open a JTAC case with the Juniper Networks Support team, see Data Collection for Customer Support for the data you should collect to assist in troubleshooting before you open a JTAC case.