Chassis Cluster Overview

Chassis clustering provides network node redundancy by grouping a pair of supported SRX Series devices of the same model into a cluster. Both devices must be running the same version of the Junos® operating system (Junos OS).

Chassis cluster functionality includes:

  • Resilient system architecture, with a single active control plane for the entire cluster and multiple Packet Forwarding Engines. This architecture presents a single device view of the cluster.

  • Synchronization of configuration and dynamic runtime states between nodes within a cluster.

  • Monitoring of physical interfaces, and failover if the failure parameters cross a configured threshold.

  • Support for generic routing encapsulation (GRE) and IP-over-IP IPv4 tunnels used to route encapsulated IPv4/IPv6 traffic.
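
As an illustration of the tunnel support mentioned in the last item, the following is a minimal sketch of a GRE tunnel interface definition; the gr-0/0/0 interface name and all addresses are placeholder values chosen for this example.

  set interfaces gr-0/0/0 unit 0 tunnel source 203.0.113.1
  set interfaces gr-0/0/0 unit 0 tunnel destination 203.0.113.2
  set interfaces gr-0/0/0 unit 0 family inet address 10.0.0.1/30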

Control Plane and Data Plane

When creating a chassis cluster, the control ports on the respective nodes are connected to form a control plane that synchronizes configuration and kernel state to facilitate high availability of interfaces and services. Similarly, the data planes of the respective nodes are connected over the fabric ports to form a unified data plane. The fabric link allows for the management of cross-node flow processing and session redundancy.
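
As a sketch, the fabric link is typically built by assigning one revenue port on each node as a child of the fab0 (node 0) and fab1 (node 1) interfaces, and on some platforms, such as the SRX5000 line, the control ports must also be configured explicitly; the interface names and FPC numbers below are placeholder values.

  set interfaces fab0 fabric-options member-interfaces ge-0/0/2
  set interfaces fab1 fabric-options member-interfaces ge-7/0/2
  set chassis cluster control-ports fpc 1 port 0
  set chassis cluster control-ports fpc 13 port 0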

The control plane software operates in active or backup mode. When configured as a chassis cluster, the two nodes back up each other, with one node acting as the primary device and the other as the secondary device, ensuring stateful failover of processes and services in the event of system or hardware failure. If the primary device fails, the secondary device takes over processing of traffic.
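
To verify which node currently holds the primary role, the following operational commands display the priority and the primary or secondary state of each node, either for all redundancy groups or for the control plane group (redundancy group 0) alone.

  show chassis cluster status
  show chassis cluster status redundancy-group 0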

The data plane software operates in active/active mode. In a chassis cluster, session information is updated as traffic traverses either device, and this information is transmitted between the nodes over the fabric link to guarantee that established sessions are not dropped when a failover occurs. In active/active mode, it is possible for traffic to ingress the cluster on one node and egress from the other node.
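
One way to confirm that session state is being synchronized over the fabric link is to check the chassis cluster statistics, which include counters for the session synchronization messages (real-time objects) exchanged between the nodes, along with the status of the control and fabric links themselves.

  show chassis cluster statistics
  show chassis cluster interfaces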

When a device joins a cluster, it becomes a node of that cluster. With the exception of unique node settings and management IP addresses, nodes in a cluster share the same configuration.
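
The unique node settings are normally kept in the node0 and node1 configuration groups and applied with the special "${node}" apply-group so that each node uses only its own values; the host names and management (fxp0) addresses below are placeholder values.

  set groups node0 system host-name srx-node0
  set groups node0 interfaces fxp0 unit 0 family inet address 192.0.2.1/24
  set groups node1 system host-name srx-node1
  set groups node1 interfaces fxp0 unit 0 family inet address 192.0.2.2/24
  set apply-groups "${node}"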

Clusters and nodes are identified in the following ways:

  • A cluster is identified by a cluster ID (cluster-id) specified as a number from 1 through 15.

  • A cluster node is identified by a node ID (node) specified as 0 or 1.
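
For example, the following operational-mode commands, run once on each device, assign cluster ID 1 and node IDs 0 and 1 and then reboot the devices so that they come up as members of the cluster; cluster ID 1 is an arbitrary value within the allowed range.

  On the first device:  set chassis cluster cluster-id 1 node 0 reboot
  On the second device: set chassis cluster cluster-id 1 node 1 reboot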

Chassis clustering of interfaces and services is provided through redundancy groups and primacy within groups. A redundancy group is an abstract construct that includes and manages a collection of objects on both nodes. At any given time, a redundancy group is primary on one node and backup on the other; when a redundancy group is primary on a node, the objects it manages on that node are active. The redundancy group is a concept of Junos OS Services Redundancy Protocol (JSRP) clustering that is similar to the virtual security interface (VSI) in Juniper Networks ScreenOS® Software: each node has an interface in the redundancy group, and only one of those interfaces is active at a time. Redundancy group 0 is always for the control plane, while redundancy groups 1 and higher are always for the data plane ports.
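
A minimal sketch of redundancy group configuration follows: each group is given different priorities on the two nodes so that node 0 is preferred as primary, and a monitored interface contributes a weight toward data plane failover; the priorities, interface name, and weight are example values.

  set chassis cluster redundancy-group 0 node 0 priority 200
  set chassis cluster redundancy-group 0 node 1 priority 100
  set chassis cluster redundancy-group 1 node 0 priority 200
  set chassis cluster redundancy-group 1 node 1 priority 100
  set chassis cluster redundancy-group 1 interface-monitor ge-0/0/3 weight 255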

At any given instant, a cluster node can be in one of the following states: hold, primary, secondary-hold, secondary, ineligible, and disabled. A state transition can be triggered by events such as interface monitoring, Services Processing Unit (SPU) monitoring, failures, and manual failovers.
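
A manual failover of a redundancy group can be initiated and later cleared with the following operational commands, and the resulting state transitions can be observed with show chassis cluster status; redundancy group 1 and node 1 are example values.

  request chassis cluster failover redundancy-group 1 node 1
  request chassis cluster failover reset redundancy-group 1
  show chassis cluster status redundancy-group 1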

The following high-end SRX Series Services Gateways are supported:

  • SRX1400

  • SRX3400

  • SRX3600

  • SRX5600

  • SRX5800