Monitoring IP Addresses on a Chassis Cluster

IP Monitoring Overview

IP monitoring checks the end-to-end connectivity of configured IP addresses and allows a redundancy group to automatically fail over when the monitored IP address is not reachable through the redundant Ethernet (reth) interface. Both the primary and secondary nodes in the chassis cluster monitor specific IP addresses to determine whether an upstream device in the network is reachable.

IP monitoring allows for failover based upon end-to-end reachability of a configured monitored IP address. On SRX Series Firewalls, the reachability test is done by sending a ping to the monitored IP address from both the primary node and the secondary node through the reth interface and checking whether a response is returned. The monitored IP address can be on a directly connected host in the same subnet as the reth interface or on a remote device reachable through a next-hop router.

The reachability states of the monitored IP address are reachable, unreachable, and unknown. The status is "unknown" if the Packet Forwarding Engines are not yet up and running. The status then changes to either "reachable" or "unreachable," depending on the corresponding message from the Packet Forwarding Engine.

We do not recommend configuring chassis cluster IP monitoring on Redundancy Group 0 (RG0) for SRX Series Firewalls.

Table 1 provides details of different combinations of monitored results from both the primary and secondary nodes, and the corresponding actions by the Juniper Services Redundancy Protocol (jsrpd) process.

Table 1: IP Monitoring Results and Failover Action
Primary Node Monitored Status	Secondary Node Monitored Status	Failover Action
Reachable	Reachable	No action
Unreachable	Reachable	Failover
Reachable	Unreachable	No action
Unreachable	Unreachable	No action
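
The decision logic in Table 1 can be sketched in a few lines of Python. This is an illustrative model only, not jsrpd code; the names Reach and failover_action are hypothetical and exist only for this sketch.

```python
from enum import Enum


class Reach(Enum):
    REACHABLE = "reachable"
    UNREACHABLE = "unreachable"


def failover_action(primary: Reach, secondary: Reach) -> str:
    """Return the Table 1 action for one pair of monitored results."""
    # Failover only helps when the secondary node can reach the address
    # that the primary node cannot.
    if primary is Reach.UNREACHABLE and secondary is Reach.REACHABLE:
        return "Failover"
    return "No action"


# Print every combination in the same order of columns as Table 1.
for p in Reach:
    for s in Reach:
        print(f"{p.value:12} {s.value:12} {failover_action(p, s)}")
```

The only combination that triggers a failover is the one in which moving the redundancy group to the secondary node would actually restore reachability.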

You can configure up to 64 IP addresses for IP monitoring on SRX5000 line devices.
On SRX Branch Series devices, IP monitoring for redundancy groups is not supported when the reth interface has more than one physical interface configured. The SRX Series Firewall uses the lowest interface in the bundle for tracking on the secondary node. If the peer forwards the reply on any port other than the one on which it received the request, the SRX Series Firewall drops the reply.
The minimum IP monitoring interval is 1 second and the maximum is 30 seconds. The default interval is 1 second.
The minimum IP monitoring threshold is 5 requests and the maximum is 15 requests. If the monitored IP address fails to respond to consecutive requests (exceeding the threshold value), IP monitoring reports that the monitored IP address is unreachable. The default threshold is 5.
The IP monitoring CLI configuration supports a reth interface that is not associated with a redundancy group (RG).
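
The interval and threshold limits above can be illustrated with a small Python sketch that counts consecutive missed replies. It is a simplified model, not device behavior: the class ProbeTracker and its methods are hypothetical, and the sketch assumes the address is declared unreachable as soon as the number of consecutive misses equals the threshold, which may differ by one request from what the device does.

```python
MIN_INTERVAL_S, MAX_INTERVAL_S = 1, 30   # interval limits from the text
MIN_COUNT, MAX_COUNT = 5, 15             # threshold limits from the text


class ProbeTracker:
    """Tracks consecutive missed replies for one monitored IP address."""

    def __init__(self, retry_interval: int = 1, retry_count: int = 5):
        if not MIN_INTERVAL_S <= retry_interval <= MAX_INTERVAL_S:
            raise ValueError("interval must be 1 through 30 seconds")
        if not MIN_COUNT <= retry_count <= MAX_COUNT:
            raise ValueError("threshold must be 5 through 15 requests")
        self.retry_interval = retry_interval
        self.retry_count = retry_count
        self.misses = 0

    def record(self, reply_received: bool) -> str:
        # Any reply resets the streak; only consecutive misses count.
        self.misses = 0 if reply_received else self.misses + 1
        # Assumption: unreachable once the streak reaches the threshold.
        return "unreachable" if self.misses >= self.retry_count else "reachable"

    def approx_detection_seconds(self) -> int:
        # Rough time from first lost reply to the unreachable verdict.
        return self.retry_interval * self.retry_count


tracker = ProbeTracker(retry_interval=3, retry_count=10)
print(tracker.approx_detection_seconds())
```

With the defaults (1 second, 5 requests), the model declares an address unreachable roughly 5 seconds after replies stop; a single reply at any point resets the count.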

Table 2 provides details on multiple interface combinations of IOC2 and IOC3 with maximum MAC numbers.

Table 2: Maximum MACs Supported for IP Monitoring on IOC2 and IOC3
Cards	Interfaces	Maximum MACs Supported for IP Monitoring
IOC2 (SRX5K-MPC)	10XGE	10
	20GE	20
	2X40GE	2
	1X100GE	1
IOC3 (SRX5K-MPC3-40G10G or SRX5K-MPC3-100G10G)	24x10GE	24
	6x40GE	6
	2x100GE + 4x10GE	6

Note the following limitations for IP monitoring support on SRX5000 line IOC2 and IOC3:

IP monitoring is supported through the reth or the RLAG interface. If your configuration does not specify either of these interfaces, the route lookup returns a non-reth/RLAG interface, which results in a failure report.
Equal-cost multipath (ECMP) routing is not supported in IP monitoring.

Benefits of Monitoring IP Addresses in a Chassis Cluster

Helps determine the status of a specific IP address in a chassis cluster setup as unknown, reachable, or unreachable.
Initiates failover based upon end-to-end reachability of a configured monitored IP address. If the monitored IP address becomes unreachable, the redundancy group can fail over to its backup to maintain service.

Understanding Chassis Cluster Redundancy Group IP Address Monitoring

Redundancy group IP address monitoring checks end-to-end connectivity and allows a redundancy group to fail over because of the inability of a redundant Ethernet interface (known as a reth) to reach a configured IP address. Redundancy groups on both devices in a cluster can be configured to monitor specific IP addresses to determine whether an upstream device in the network is reachable. The redundancy group can be configured such that if the monitored IP address becomes unreachable, the redundancy group will fail over to its backup to maintain service. The primary difference between this monitoring feature and interface monitoring is that IP address monitoring allows for failover when the interface is still up but the network device it is connected to is not reachable for some reason. It may be possible under those circumstances for the other node in the cluster to route traffic around the problem.

If you want to dampen the failovers occurring because of IP address monitoring failures, use the hold-down-interval statement.

IP address monitoring configuration allows you to set not only the address to monitor and its failover weight but also a global IP address monitoring threshold and weight. Only after the IP address monitoring global-threshold is reached because of cumulative monitored address reachability failure will the IP address monitoring global-weight value be deducted from the redundancy group's failover threshold. Thus, multiple addresses can be monitored simultaneously and weighted to reflect their importance to maintaining traffic flow. Also, when an unreachable IP address becomes reachable again, its weight is restored to the monitoring threshold. This will not, however, cause a failback unless the preempt option has been enabled.
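
The relationship between per-address weights, the global-threshold, and the global-weight can be modeled as follows. This Python sketch is an illustration of the arithmetic described above, not an implementation of jsrpd; the class IpMonitor, its method names, and the addresses and weights in the usage lines are hypothetical.

```python
class IpMonitor:
    """Toy model of IP monitoring weights for one redundancy group."""

    def __init__(self, global_threshold: int, global_weight: int,
                 weights: dict[str, int]):
        self.global_threshold = global_threshold
        self.global_weight = global_weight
        self.weights = weights              # monitored address -> weight
        self.unreachable: set[str] = set()

    def set_state(self, address: str, reachable: bool) -> None:
        if reachable:
            self.unreachable.discard(address)   # weight is restored
        else:
            self.unreachable.add(address)

    def remaining_threshold(self) -> int:
        lost = sum(self.weights[a] for a in self.unreachable)
        return max(self.global_threshold - lost, 0)

    def deduction_from_redundancy_group(self) -> int:
        # The global-weight is charged against the redundancy group only
        # once the global-threshold has been used up.
        return self.global_weight if self.remaining_threshold() == 0 else 0


mon = IpMonitor(200, 255, {"192.0.2.1": 150, "192.0.2.2": 50})
mon.set_state("192.0.2.1", False)
print(mon.remaining_threshold(), mon.deduction_from_redundancy_group())
mon.set_state("192.0.2.2", False)
print(mon.remaining_threshold(), mon.deduction_from_redundancy_group())
```

In this model, losing only the more heavily weighted address leaves 50 of the global-threshold, so nothing is deducted from the redundancy group; losing both addresses exhausts the global-threshold and the global-weight is deducted.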

When configured, the IP address monitoring failover value (global-weight) is considered along with interface monitoring—if set—and built-in failover monitoring, including SPU monitoring, cold-sync monitoring, and NPC monitoring (on supported platforms). The main IP addresses that should be monitored are router gateway addresses to ensure that valid traffic coming into the services gateway can be forwarded to the appropriate network router.

Starting in Junos OS Release 12.1X46-D35 and Junos OS Release 17.3R1, for all SRX Series Firewalls, the reth interface supports proxy ARP.

One Services Processing Unit (SPU) or Packet Forwarding Engine (PFE) per node is designated to send Internet Control Message Protocol (ICMP) ping packets for the monitored IP addresses on the cluster. The primary PFE sends ping packets using Address Resolution Protocol (ARP) requests resolved by the Routing Engine (RE). The source for these pings is the redundant Ethernet interface MAC and IP addresses. The secondary PFE resolves ARP requests for the monitored IP address itself. The source for these pings is the physical child MAC address and a secondary IP address configured on the redundant Ethernet interface. For the ping reply to be received on the secondary interface, the I/O card (IOC), central PFE processor, or Flex IOC adds both the physical child MAC address and the redundant Ethernet interface MAC address to its MAC table. The secondary PFE responds with the physical child MAC address to ARP requests sent to the secondary IP address configured on the redundant Ethernet interface.
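
The difference between the ping source used on the primary node and on the secondary node can be summarized in a short Python sketch. The function probe_source and the MAC and IP values are hypothetical placeholders chosen for illustration; only the selection rule comes from the description above.

```python
from typing import NamedTuple


class ProbeSource(NamedTuple):
    mac: str
    ip: str


def probe_source(is_primary: bool, reth_mac: str, reth_ip: str,
                 child_mac: str, secondary_ip: str) -> ProbeSource:
    """Pick the source MAC and IP address for an IP monitoring ping."""
    if is_primary:
        # Primary node: reth interface MAC and IP addresses.
        return ProbeSource(reth_mac, reth_ip)
    # Secondary node: physical child MAC and the configured secondary IP.
    return ProbeSource(child_mac, secondary_ip)


# Placeholder values for illustration only.
print(probe_source(False, "00:00:5e:00:53:01", "10.1.1.1",
                   "00:00:5e:00:53:02", "10.1.1.101"))
```

Because the secondary node sources its pings from a different MAC and IP address, the upstream device sees two distinct senders, which is why the secondary IP address must be configured for monitoring from the secondary node.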

Note:

IP address monitoring is not supported on SRX5000 line devices if the redundant Ethernet interface is configured for a VPN routing and forwarding (VRF) instance.

The default interval to check the reachability of a monitored IP address is once per second. The interval can be adjusted using the retry-interval command. The default number of permitted consecutive failed ping attempts is 5. The number of allowed consecutive failed ping attempts can be adjusted using the retry-count command. After failing to reach a monitored IP address for the configured number of consecutive attempts, the IP address is determined to be unreachable and its failover value is deducted from the redundancy group's global-threshold.

On SRX5600 and SRX5800 devices, only two of the 10 ports on each PIC of 40-port 1-Gigabit Ethernet I/O cards (IOCs) can simultaneously enable IP address monitoring. Because there are four PICs per IOC, this permits a total of eight ports per IOC to be monitored. If more than two ports per PIC on 40-port 1-Gigabit Ethernet IOCs are configured for IP address monitoring, the commit will succeed but a log entry will be generated, and the accuracy and stability of IP address monitoring cannot be ensured. This limitation does not apply to any other IOCs or devices.
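
Because the commit succeeds even when the two-ports-per-PIC limit is exceeded, it can be useful to check a planned port list offline. The following Python sketch is one possible check, assuming the monitored ports are named in the usual ge-fpc/pic/port form; the function overloaded_pics is hypothetical and is not a Junos OS tool.

```python
import re
from collections import Counter

MAX_MONITORED_PORTS_PER_PIC = 2          # limit stated in the text
_IFNAME = re.compile(r"^ge-(\d+)/(\d+)/(\d+)$")


def overloaded_pics(monitored_ports: list[str]) -> dict[tuple[int, int], int]:
    """Return {(fpc, pic): port count} for PICs above the two-port limit."""
    counts: Counter = Counter()
    for name in set(monitored_ports):    # ignore duplicate entries
        match = _IFNAME.match(name)
        if match is None:
            raise ValueError(f"not a ge-fpc/pic/port name: {name}")
        fpc, pic, _port = (int(g) for g in match.groups())
        counts[(fpc, pic)] += 1
    return {pic: n for pic, n in counts.items()
            if n > MAX_MONITORED_PORTS_PER_PIC}


print(overloaded_pics(["ge-0/0/0", "ge-0/0/1", "ge-0/0/2", "ge-0/1/0"]))
```

An empty result means that no PIC in the list carries more than two monitored ports.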

Once the IP address is determined to be unreachable, its weight is deducted from the global-threshold. If the recalculated global-threshold value is not 0, the IP address is marked unreachable, but the global-weight is not deducted from the redundancy group's threshold. If the redundancy group IP monitoring global-threshold reaches 0 and there are unreachable IP addresses, the redundancy group will continuously fail over and fail back between the nodes until either an unreachable IP address becomes reachable or a configuration change removes unreachable IP addresses from monitoring. Note that failover dampening, whether from the default or a configured hold-down-interval, remains in effect.

Every redundancy group x has a threshold tolerance value initially set to 255. When an IP address monitored by redundancy group x becomes unavailable, its weight is subtracted from the redundancy group x's threshold. When redundancy group x's threshold reaches 0, it fails over to the other node. For example, if redundancy group 1 was primary on node 0, on the threshold-crossing event, redundancy group 1 becomes primary on node 1. In this case, all the child interfaces of redundancy group 1's redundant Ethernet interfaces begin handling traffic.

A redundancy group x failover occurs because the cumulative weight of the redundancy group x's monitored IP addresses and other monitoring has brought its threshold value to 0. When the monitored IP addresses of redundancy group x on both nodes reach their thresholds at the same time, redundancy group x is primary on the node with the lower node ID, which is typically node 0.
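
The threshold-crossing and tie-break rules can be expressed as a simplified Python model. It deliberately ignores node priority, preempt, and hold-down timers, and the function names rg_threshold and primary_node are hypothetical.

```python
RG_INITIAL_THRESHOLD = 255   # starting tolerance value from the text


def rg_threshold(deductions: list[int]) -> int:
    """Threshold left after subtracting every active monitoring deduction."""
    return max(RG_INITIAL_THRESHOLD - sum(deductions), 0)


def primary_node(current_primary: int, thresholds: dict[int, int]) -> int:
    """Pick the primary node for one redundancy group.

    thresholds maps node ID -> remaining redundancy group threshold.
    """
    healthy = [node for node, left in thresholds.items() if left > 0]
    if not healthy:
        # Both nodes reached 0 at the same time: lower node ID wins.
        return min(thresholds)
    if thresholds[current_primary] > 0:
        return current_primary        # no threshold crossing, no failover
    return min(healthy)               # fail over to the healthy peer


# Node 0 loses its full tolerance; node 1 is unaffected.
print(primary_node(0, {0: rg_threshold([255]), 1: rg_threshold([])}))
```

The deductions list can mix the IP monitoring global-weight with interface monitoring weights, since all of them are subtracted from the same redundancy group threshold.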

Upstream device failure detection for the chassis cluster feature is supported on SRX Series Firewalls.

Starting in Junos OS Release 15.1X49-D60 and Junos OS Release 17.3R1, configuring Address Resolution Protocol (ARP) request throttling is supported on SRX5000 line devices. This feature allows you to bypass the previously hard-coded ARP request throttling time default (10 seconds per SPU for each IP address) and set the time to a greater value (10 through 100 seconds). Setting the throttling time to a greater value reduces the high utilization of the Routing Engine, allowing it to work more efficiently. You can configure the ARP request throttling time using the set forwarding-options next-hop arp-throttle <seconds> command.

Monitoring can be accomplished only if the IP address is reachable on a redundant Ethernet interface (known as a reth in CLI commands and interface listings), and IP addresses cannot be monitored over a tunnel. For an IP address to be monitored through a redundant Ethernet interface on a secondary cluster node, the interface must have a secondary IP address configured. IP address monitoring cannot be used on a chassis cluster running in transparent mode. The maximum number of monitoring IP addresses that can be configured per cluster is 64 for the SRX5000 line of devices, SRX1500, SRX1600, SRX2300, and SRX4000 line of devices.

Redundancy group IP address monitoring is not supported for IPv6 destinations.

Example: Configure Chassis Cluster Redundancy Group IP Address Monitoring

This example shows how to configure redundancy group IP address monitoring for an SRX Series Firewall in a chassis cluster.

Requirements

Before you begin:

Set the chassis cluster node ID and cluster ID. See Example: Setting the Node ID and Cluster ID for Security Devices in a Chassis Cluster.
Configure the chassis cluster management interface. See Example: Configuring the Chassis Cluster Management Interface.
Configure the chassis cluster fabric. See Example: Configuring the Chassis Cluster Fabric Interfaces.

Overview

You can configure redundancy groups to monitor upstream resources by pinging specific IP addresses that are reachable through redundant Ethernet interfaces on either node in a cluster. You can also configure global threshold, weight, retry interval, and retry count parameters for a redundancy group. When a monitored IP address becomes unreachable, the weight of that monitored IP address is deducted from the redundancy group IP address monitoring global threshold. When the global threshold reaches 0, the global weight is deducted from the redundancy group threshold. The retry interval determines the ping interval for each IP address monitored by the redundancy group. The pings are sent as soon as the configuration is committed. The retry count sets the number of allowed consecutive ping failures for each IP address monitored by the redundancy group.

In this example, you configure the following settings for redundancy group 1:

IP address to monitor—10.1.1.10
IP address monitoring global-weight—100
IP address monitoring global-threshold—200
(The global threshold applies cumulatively to all IP addresses monitored by the redundancy group.)
IP address retry-interval—3 seconds
IP address retry-count—10
Weight—100
Redundant Ethernet interface—reth1.0
Secondary IP address—10.1.1.101

Configuration

CLI Quick Configuration

To quickly configure this example, copy the following commands, paste them into a text file, remove any line breaks, change any details necessary to match your network configuration, copy and paste the commands into the CLI at the [edit] hierarchy level, and then enter commit from configuration mode.

Step-by-Step Procedure

To configure redundancy group IP address monitoring:

Specify a global monitoring weight.

Specify the global monitoring threshold.

Specify the retry interval.

Specify the retry count.

Specify the IP address to be monitored, weight, redundant Ethernet interface, and secondary IP address.
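
The five steps above each correspond to one configuration statement under the redundancy group's ip-monitoring hierarchy. The following Python sketch renders those statements from the parameter values listed in the Overview. The helper ip_monitoring_commands is hypothetical, and the statement paths are written from the option names used in this example (global-weight, global-threshold, retry-interval, retry-count, and the secondary IP address); verify them against the CLI reference for your release before use.

```python
def ip_monitoring_commands(rg: int, global_weight: int, global_threshold: int,
                           retry_interval: int, retry_count: int,
                           address: str, weight: int, interface: str,
                           secondary_ip: str) -> list[str]:
    """Render one set statement per step of the procedure."""
    # Assumed hierarchy: chassis cluster redundancy-group <n> ip-monitoring.
    base = f"set chassis cluster redundancy-group {rg} ip-monitoring"
    return [
        f"{base} global-weight {global_weight}",
        f"{base} global-threshold {global_threshold}",
        f"{base} retry-interval {retry_interval}",
        f"{base} retry-count {retry_count}",
        f"{base} family inet {address} weight {weight} "
        f"interface {interface} secondary-ip-address {secondary_ip}",
    ]


# Values taken from the Overview of this example.
for line in ip_monitoring_commands(1, 100, 200, 3, 10,
                                   "10.1.1.10", 100, "reth1.0", "10.1.1.101"):
    print(line)
```

Generating the statements from one set of parameters keeps the monitored address, its weight, and the secondary IP address consistent when the same pattern is repeated for several addresses.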

Results

From configuration mode, confirm your configuration by entering the show chassis cluster redundancy-group 1 command. If the output does not display the intended configuration, repeat the configuration instructions in this example to correct it.

For brevity, this show command output includes only the configuration that is relevant to this example. Any other configuration on the system has been replaced with ellipses (...).

If you are done configuring the device, enter commit from configuration mode.

Verification

Verifying the Status of Monitored IP Addresses for a Redundancy Group

Purpose

Verify the status of monitored IP addresses for a redundancy group.

Action

From operational mode, enter the show chassis cluster ip-monitoring status command. For information about a specific group, enter the show chassis cluster ip-monitoring status redundancy-group command.

Example: Configuring IP Monitoring on SRX5000 Line Devices for IOC2 and IOC3

This example shows how to monitor IP addresses on an SRX5000 line device with chassis cluster enabled.

Requirements

This example uses the following hardware and software:

Two SRX5400 Services Gateways with MIC (SRX-MIC-10XG-SFPP [IOC2]), and one Ethernet switch
Junos OS Release 15.1X49-D30

The procedure mentioned in this example is also applicable to IOC3.

Before you begin:

Physically connect the two SRX5400 devices (back-to-back for the fabric and control ports).
Configure the two devices to operate in a chassis cluster.

Overview

IP address monitoring checks end-to-end reachability of the configured IP address and allows a redundancy group to automatically fail over when the address is not reachable through a child link of the redundant Ethernet (reth) interface. Redundancy groups on both devices, or nodes, in a cluster can be configured to monitor specific IP addresses to determine whether an upstream device in the network is reachable.

Topology

In this example, two SRX5400 devices in a chassis cluster are connected to an Ethernet switch. The example shows how the redundancy groups can be configured to monitor key upstream resources reachable through redundant Ethernet interfaces on either node in a cluster.

You set the system to send pings every 3 seconds, with 10 consecutive losses required to declare a monitored IP address unreachable. You also set up a secondary IP address to allow testing from the secondary node.

In this example, you configure the following settings for redundancy group 1:

IP address to be monitored—192.0.2.2, 198.51.100.2, 203.0.113.2
IP monitoring global-weight—255
IP monitoring global-threshold—240
IP monitoring retry-interval—3 seconds
IP monitoring retry-count—10
Weight for monitored IP address—80
Secondary IP addresses— 192.0.2.12, 198.51.100.12, 203.0.113.12
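
With these values, the effect of losing one, two, or all three monitored addresses can be worked out directly. The short Python sketch below does the arithmetic, assuming the redundancy group starts from the initial threshold of 255 and that no other monitoring deductions are active; the function name rg_threshold_after is hypothetical.

```python
WEIGHT_PER_ADDRESS = 80    # weight for each monitored IP address
GLOBAL_THRESHOLD = 240     # IP monitoring global-threshold
GLOBAL_WEIGHT = 255        # IP monitoring global-weight
RG_THRESHOLD = 255         # initial redundancy group threshold


def rg_threshold_after(unreachable_count: int) -> int:
    """Redundancy group threshold left with N monitored addresses down."""
    remaining_global = max(
        GLOBAL_THRESHOLD - WEIGHT_PER_ADDRESS * unreachable_count, 0)
    # The global-weight is deducted only when the global-threshold hits 0.
    deduction = GLOBAL_WEIGHT if remaining_global == 0 else 0
    return max(RG_THRESHOLD - deduction, 0)


for down in range(4):
    print(down, rg_threshold_after(down))
```

Under these assumptions, one or two unreachable addresses leave the redundancy group threshold at 255. Only when all three are unreachable (3 x 80 = 240) does the global-threshold reach 0, at which point the global-weight of 255 brings the redundancy group threshold to 0 and a failover is triggered.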

Configuration

CLI Quick Configuration

To quickly configure this example, copy the following commands, paste them into a text file, remove any line breaks, change any details to match your network configuration, copy and paste the commands into the CLI at the [edit] hierarchy level, and then enter commit from configuration mode.

Configuring IP Monitoring on a 10x10GE SFP+ MIC

Step-by-Step Procedure

To configure IP monitoring on a 10x10GE SFP+ MIC:

Specify the number of redundant Ethernet interfaces.

Configure the control ports.

Configure fabric interfaces.

Specify a redundancy group's priority for primacy on each node of the cluster. The higher number takes precedence.

Configure IP monitoring under redundancy-group 1 with global weight, global threshold, retry interval and retry count.

Configure the redundant Ethernet interfaces to redundancy-group 1. Assign a weight to the IP address to be monitored, and configure a secondary IP address that will be used to send packets from the secondary node to track the IP address being monitored.

Assign child interfaces for the redundant Ethernet interfaces from node 0 and node 1.

Configure the redundant Ethernet interfaces to redundancy-group 1.

Create a security zone and assign interfaces to the zone.

Results

From configuration mode, confirm your configuration by entering the show security chassis cluster and show interfaces commands. If the output does not display the intended configuration, repeat the configuration instructions in this example to correct it.

If you are done configuring the device, enter commit from configuration mode.

Verification

Confirm the configuration is working properly.

Verifying IP Monitoring Status

Purpose

Verify the status of the monitored IP addresses from both nodes and the failure count for each node.

Action

From operational mode, enter the show chassis cluster ip-monitoring status command.

Meaning

All the monitored IP addresses are reachable.
