Configuring Routing Engine Redundancy

 

The following sections describe how to configure Routing Engine redundancy:

Note

To complete the tasks in the following sections, re0 and re1 configuration groups must be defined. For more information about configuration groups, see the CLI User Guide.

Modifying the Default Routing Engine Mastership

For routers with two Routing Engines, you can configure which Routing Engine is the master and which is the backup. By default, the Routing Engine in slot 0 is the master (re0) and the one in slot 1 is the backup (re1).

Note

In systems with two Routing Engines, you cannot configure both Routing Engines as master at the same time. Doing so causes the commit check to fail.

To modify the default configuration, include the routing-engine statement at the [edit chassis redundancy] hierarchy level:
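
The statement takes the general form routing-engine slot-number (master | backup | disabled). As a minimal sketch, assuming you want to reverse the default roles so that the Routing Engine in slot 1 is the master, the configuration might look like this (the slot assignments are illustrative, not a recommendation):

```
chassis {
    redundancy {
        # Illustrative assumption: reverse the default roles.
        routing-engine 0 backup;
        routing-engine 1 master;
    }
}
```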

slot-number can be 0 or 1. To configure the Routing Engine to be the master, specify the master option. To configure it to be the backup, specify the backup option. To disable a Routing Engine, specify the disabled option.

Note

To switch between the master and the backup Routing Engines, see Manually Switching Routing Engine Mastership.

Configuring Automatic Failover to the Backup Routing Engine

The following sections describe how to configure automatic failover to the backup Routing Engine when certain failures occur on the master Routing Engine.

Without Interruption to Packet Forwarding

For routers with two Routing Engines, you can configure graceful Routing Engine switchover (GRES). When graceful switchover is configured, socket reconnection occurs seamlessly without interruption to packet forwarding. For information about how to configure graceful Routing Engine switchover, see Configuring Graceful Routing Engine Switchover.

On Detection of a Hard Disk Error on the Master Routing Engine

After you configure a backup Routing Engine, you can direct it to take mastership automatically if it detects a hard disk error on the master Routing Engine. To enable this feature, include the on-disk-failure statement at the [edit chassis redundancy failover] hierarchy level.
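
A minimal sketch of this setting, written as it would appear in the configuration, assuming no other failover options are set:

```
chassis {
    redundancy {
        failover {
            on-disk-failure;
        }
    }
}
```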

On Detection of a Broken LCMD Connectivity Between the VM and RE

You can configure an automatic Routing Engine (RE) switchover to occur when the LCMD connectivity between the VM and the RE is broken. To enable this feature, include the on-loss-of-vm-host-connection statement at the [edit chassis redundancy failover] hierarchy level.
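
As a sketch, assuming a platform and Junos OS release that support this statement, the setting might appear as follows:

```
chassis {
    redundancy {
        failover {
            # Assumes the platform supports VM host failover detection.
            on-loss-of-vm-host-connection;
        }
    }
}
```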

If the LCMD process crashes on the master, the system switches over after one minute, provided that the LCMD connection on the backup RE is stable. The system does not switch over if the LCMD connection on the backup RE is unstable. If the current master has only just gained mastership, the switchover is delayed and happens only after four minutes.

On Detection of a Loss of Keepalive Signal from the Master Routing Engine

After you configure a backup Routing Engine, you can direct it to take mastership automatically if it detects a loss of keepalive signal from the master Routing Engine.

To enable failover on receiving a loss of keepalive signal, include the on-loss-of-keepalives statement at the [edit chassis redundancy failover] hierarchy level:
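
A minimal sketch of this setting, written as it would appear in the configuration:

```
chassis {
    redundancy {
        failover {
            on-loss-of-keepalives;
        }
    }
}
```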

When graceful Routing Engine switchover is not configured, by default, failover occurs after 300 seconds (5 minutes). You can configure a shorter or longer time interval.

Note

The keepalive time period is reset to 360 seconds when the master Routing Engine has been manually rebooted or halted.

To change the keepalive time period, include the keepalive-time statement at the [edit chassis redundancy] hierarchy level:
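
The statement takes the general form keepalive-time seconds. As a sketch, assuming graceful Routing Engine switchover is not configured, a 60-second period (an arbitrary illustrative value) might be set like this:

```
chassis {
    redundancy {
        # 60 is an arbitrary example value, in seconds.
        keepalive-time 60;
    }
}
```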

The range for keepalive-time is 2 through 10,000 seconds.

The following example describes the sequence of events if you configure the backup Routing Engine to detect a loss of keepalive signal in the master Routing Engine:

  1. Manually configure a keepalive-time of 25 seconds.

  2. After the Packet Forwarding Engine connection to the master Routing Engine is lost and the keepalive timer expires, packet forwarding is interrupted.

  3. After 25 seconds of keepalive loss, a message is logged, and the backup Routing Engine attempts to take mastership. An alarm is generated when the backup Routing Engine becomes active, and the display is updated with the current status of the Routing Engine.

  4. After the backup Routing Engine takes mastership, it continues to function as master.
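
The configuration behind the sequence above could be sketched as follows, assuming graceful Routing Engine switchover is not configured (otherwise the keepalive time cannot be set manually):

```
chassis {
    redundancy {
        # Step 1: keepalive time of 25 seconds.
        keepalive-time 25;
        failover {
            # Lets the backup take mastership once keepalives are lost.
            on-loss-of-keepalives;
        }
    }
}
```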

Note

When graceful Routing Engine switchover is configured, the keepalive signal is automatically enabled and the failover time is set to 2 seconds (4 seconds on M20 routers). You cannot manually reset the keepalive time.

Note

When you halt or reboot the master Routing Engine, Junos OS resets the keepalive time to 360 seconds, and the backup Routing Engine does not take over mastership until the 360-second keepalive time period expires.

A former master Routing Engine becomes a backup Routing Engine if it returns to service after a failover to the backup Routing Engine. To restore master status to the former master Routing Engine, you can use the request chassis routing-engine master switch operational mode command.

If at any time one of the Routing Engines is not present, the remaining Routing Engine becomes master automatically, regardless of how redundancy is configured.

On Detection of the em0 Interface Failure on the Master Routing Engine

After you configure a backup Routing Engine, you can direct it to take mastership automatically if the em0 interface fails on the master Routing Engine. To enable this feature, include the on-re-to-fpc-stale statement at the [edit chassis redundancy failover] hierarchy level.
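
As a sketch, assuming a platform that supports this statement, the setting might appear as follows:

```
chassis {
    redundancy {
        failover {
            on-re-to-fpc-stale;
        }
    }
}
```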

When a Software Process Fails

To configure automatic switchover to the backup Routing Engine if a software process fails, include the failover other-routing-engine statement at the [edit system processes process-name] hierarchy level:
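
A minimal sketch follows, using routing (the routing protocol process) as an illustrative choice of process-name; substitute whichever valid process name applies to your case:

```
system {
    processes {
        # "routing" is an example process name chosen for illustration.
        routing {
            failover other-routing-engine;
        }
    }
}
```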

process-name is one of the valid process names. If this statement is configured for a process, and that process fails four times within 30 seconds, the router reboots from the other Routing Engine. Another statement available at the [edit system processes] hierarchy level is failover alternate-media. For information about the alternate media option, see the Junos OS Administration Library.

Manually Switching Routing Engine Mastership

To manually switch Routing Engine mastership, use one of the following commands:

  • On the backup Routing Engine, request that the backup Routing Engine take mastership by issuing the request chassis routing-engine master acquire command.

  • On the master Routing Engine, request that the backup Routing Engine take mastership by issuing the request chassis routing-engine master release command.

  • On either Routing Engine, switch mastership by issuing the request chassis routing-engine master switch command.
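
For illustration, the three commands above are entered in operational mode as shown below. The prompts user@backup and user@master are hypothetical host names used only to indicate where each command is issued. Command output is omitted.

```
user@backup> request chassis routing-engine master acquire

user@master> request chassis routing-engine master release

user@master> request chassis routing-engine master switch
```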

Verifying Routing Engine Redundancy Status

A separate log file is provided for redundancy logging at /var/log/mastership. To view the log, use the file show /var/log/mastership command. Table 1 lists the mastership log event codes and descriptions.

Table 1: Routing Engine Mastership Log

Event Code    Description
E_NULL = 0    The event is a null event.
E_CFG_M       The Routing Engine is configured as master.
E_CFG_B       The Routing Engine is configured as backup.
E_CFG_D       The Routing Engine is configured as disabled.
E_MAXTRY      The maximum number of tries to acquire or release mastership was exceeded.
E_REQ_C       A claim mastership request was sent.
E_ACK_C       A claim mastership acknowledgement was received.
E_NAK_C       A claim mastership request was not acknowledged.
E_REQ_Y       Confirmation of mastership is requested.
E_ACK_Y       Mastership is acknowledged.
E_NAK_Y       Mastership is not acknowledged.
E_REQ_G       A release mastership request was sent by a Routing Engine.
E_ACK_G       The Routing Engine acknowledged release of mastership.
E_CMD_A       The command request chassis routing-engine master acquire was issued from the backup Routing Engine.
E_CMD_F       The command request chassis routing-engine master acquire force was issued from the backup Routing Engine.
E_CMD_R       The command request chassis routing-engine master release was issued from the master Routing Engine.
E_CMD_S       The command request chassis routing-engine master switch was issued from a Routing Engine.
E_NO_ORE      No other Routing Engine is detected.
E_TMOUT       A request timed out.
E_NO_IPC      Routing Engine connection was lost.
E_ORE_M       Other Routing Engine state was changed to master.
E_ORE_B       Other Routing Engine state was changed to backup.
E_ORE_D       Other Routing Engine state was changed to disabled.