Multinode High Availability Support for vSRX Instances

SUMMARY Read this topic to understand Multinode High Availability support for vSRX instances.

Overview

Starting in Junos OS Release 22.3R1, we support Multinode High Availability on vSRX virtual firewalls. Multinode High Availability addresses high availability requirements for private and public cloud deployments by offering inter-chassis resiliency.

We support Multinode High Availability for vSRX instances for the following private and public cloud platforms:

  • Kernel-based virtual machine (KVM) and VMware ESXi
  • Amazon Web Services (AWS)

Multinode High Availability in AWS

You can configure Multinode High Availability on vSRX firewalls deployed on AWS. On the participating nodes, both the control plane and the data plane are active at the same time, and the nodes back each other up to ensure a fast, synchronized failover in case of a system or hardware failure. The inter-chassis link (ICL) connection between the two devices synchronizes and maintains the state information and handles device failover scenarios.

Let's begin by getting familiar with the Multinode High Availability terms specific to the AWS deployment.

Terminology

Term Description

Elastic IP Addresses

An Elastic IP address (EIP) is a public IPv4 address that is routable from the network or the Internet. EIPs are dynamically bound to an interface of one of the nodes in a Multinode High Availability setup. At any given time, each EIP is bound to only one interface, and all EIPs are bound to the same node. The Multinode High Availability setup uses EIPs to control the traffic in AWS deployments. An EIP acts like the floating IP address in a Layer 3 deployment or the virtual IP address in a default gateway deployment. The node with an active SRG1 owns the EIP and draws the traffic toward itself.

Inter-chassis link (ICL)

IP-based link (logical link) that connects nodes over a routed network in a Multinode High Availability system. The security device uses the ICL to synchronize and maintain the state information and to handle device failover scenarios. You can use only the ge-0/0/0 interface to configure an ICL. The ICL uses the MAC address assigned by AWS (not the virtual MAC address created by the vSRX VM). When you configure the ICL, ensure that the IP address belongs to a subnet of the VPC. Note that cross-VPC deployment is not supported.

Juniper Services Redundancy Protocol (jsrpd) process

The jsrpd process manages activeness determination and enforcement and provides split-brain protection.

In Junos OS Release 22.3R1, we don't support IPsec VPN for Multinode High Availability in AWS deployments.

Architecture

Figure 1 shows two vSRX instances in a Multinode High Availability setup deployed on AWS. In this deployment, the two vSRX instances, one acting as the active node and the other as the backup node, form a high availability pair. Both nodes run the same Junos OS image and have the same number of network interfaces configured.

Figure 1: Public Cloud Deployment

In a Multinode High Availability setup, the two vSRX instances operate in active/backup mode. The nodes connect to each other using an ICL to synchronize control plane and data plane states. The vSRX instance on which SRG1 is active hosts the Elastic IP address, and the active node uses the Elastic IP address to draw traffic toward itself. The backup node remains in standby mode and takes over on failover.

The Juniper Services Redundancy Protocol (jsrpd) process communicates with the AWS infrastructure to perform activeness determination and enforcement and to provide split-brain protection.

During a failover, an AWS SDK API call moves the Elastic IP address from the old active node to the new active node, which then draws the traffic toward itself. AWS updates the route tables to divert the traffic to the new active node.

This mechanism enables clients to communicate with the nodes using a single IP address. The Elastic IP is configured on the interface that connects to participating networks/segments.
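The failover behavior described above can be pictured with a small Python sketch. This is a toy in-memory model only: it is not the jsrpd implementation and it makes no AWS SDK call. The class and method names (HaPair, eip_owner, fail_over) are hypothetical, and the addresses are the example EIPs used later in this topic.

```python
from dataclasses import dataclass, field


@dataclass
class HaPair:
    """Toy model of a two-node pair sharing Elastic IP addresses (hypothetical)."""
    nodes: tuple                      # names of the two nodes
    active: str                       # node on which SRG1 is currently active
    eips: list = field(default_factory=list)

    def eip_owner(self, eip):
        # Every EIP follows the node that holds the active SRG1.
        if eip not in self.eips:
            raise KeyError(eip)
        return self.active

    def fail_over(self):
        # Stand-in for the cloud API call that re-binds the EIPs to the peer.
        old = self.active
        self.active = self.nodes[1] if old == self.nodes[0] else self.nodes[0]
        return old, self.active


pair = HaPair(nodes=("vSRX-1", "vSRX-2"), active="vSRX-1",
              eips=["10.0.1.103", "10.0.2.203"])
before = pair.eip_owner("10.0.1.103")
pair.fail_over()
after = pair.eip_owner("10.0.1.103")
print(before, "->", after)
```

The point of the model is that clients keep using the same address while the owner behind it changes.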

Split-Brain Protection

When the ICL between the two nodes goes down, each node starts sending probes to the peer node's interface IP address. If the peer node is healthy, it responds to the probes. Otherwise, the jsrpd process communicates with the AWS infrastructure to enforce the active role for the healthy node.
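The probe-based decision described above can be sketched as a small Python function. This is an illustrative simplification, not the actual jsrpd logic: the function name resolve_role and its parameters are hypothetical, and the assumption that a node keeps its current role when the peer answers the probes is ours, since the text only states what happens when the peer does not respond.

```python
def resolve_role(icl_up, peer_answers_probe, current_role):
    """Return the role a node should hold (hypothetical decision rule)."""
    if icl_up:
        # Normal operation: the ICL carries state, nothing to resolve.
        return current_role
    if peer_answers_probe:
        # ICL is down but the peer is healthy: assumed to keep existing roles.
        return current_role
    # ICL is down and the peer is silent: the healthy node takes the active role.
    return "ACTIVE"


for icl_up, answers in [(True, True), (False, True), (False, False)]:
    print(icl_up, answers, resolve_role(icl_up, answers, "BACKUP"))
```

Only the last case changes the role, which is what prevents both nodes from claiming the active role while they can still reach each other.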

Configuring Multinode High Availability In Amazon Web Services (AWS) Deployment

In this example, we'll show you how to configure Multinode High Availability on two vSRX instances in the Amazon Virtual Private Cloud (Amazon VPC).

Requirements

This example uses the following hardware and software components:

Topology

Figure 2 shows the topology used in this example.

Figure 2: Multinode High Availability In AWS Deployment

As shown in the topology, two vSRX instances are deployed in the Amazon Virtual Private Cloud (Amazon VPC). The nodes communicate with each other using a routable IP address (Elastic IP address). The untrust side connects to the public network, and the trust side connects to the protected resources.

Complete the following configurations before configuring Multinode High Availability on vSRX instances:

  • Use instance tags in AWS to identify the two vSRX instances as Multinode High Availability peers. For example, set the Name tag to vsrx-node-1 and the ha-peer tag to vsrx-node-2.

  • Deploy both vSRX instances in the same Amazon VPC and availability zone.
  • Assign an IAM role to each vSRX instance and launch the vSRX as an Amazon Elastic Compute Cloud (EC2) instance with full permissions.
  • Enable communication to the Internet by placing the vSRX instances in a public subnet. In the Amazon VPC, public subnets have access to the Internet gateway.
  • Configure one VPC with multiple subnets to form the high availability pair. The subnets connect the two vSRX nodes through logical connections, similar to physical cables connecting ports. In this example, the CIDR for the VPC is 10.0.0.0/16, and a total of four subnets host the vSRX traffic. Each vSRX instance also needs a minimum of four interfaces. Table 1 provides the subnet and interface details.
    Table 1: Subnet Configurations
    Function | Port Number | Interface | Connection | Traffic Type | Subnet
    Management | 0 | fxp0 | Management interface | Management traffic | 10.0.254.0/24
    ICL | 1 | ge-0/0/0 | Inter-chassis link to peer node | RTO, sync, and probe-related traffic | 10.0.253.0/24
    Public | 2 | ge-0/0/1 | Connects to public network (revenue interface) | External traffic | 10.0.1.0/24
    Private | 3 | ge-0/0/2 | Connects to private network (revenue interface) | Internal traffic | 10.0.2.0/24

    Note that the interface-to-function mapping shown in the table is the default configuration, and we recommend that you use the same mapping in your configuration.

  • Configure the interfaces with primary and secondary IP addresses. You can associate an Elastic IP address (EIP) with the secondary IP address of an interface. The primary IP address is required when the instance is launched, and the secondary IP address can be transferred from one vSRX node to the other during a failover. Table 2 shows the interface and IP address mappings used in this example.
    Table 2: Interface and IP Address Mappings
    Instance | Interface | Primary IP | Secondary IP
    vSRX-1 | ge-0/0/1 | 10.0.1.101 | 10.0.1.103 (EIP)
    vSRX-1 | ge-0/0/2 | 10.0.2.201 | 10.0.2.203 (EIP)
    vSRX-2 | ge-0/0/1 | 10.0.1.102 | 10.0.1.103 (EIP)
    vSRX-2 | ge-0/0/2 | 10.0.2.202 | 10.0.2.203 (EIP)
  • Configure the neighboring routers to include the vSRX in the data path and to use the vSRX as the next hop for the traffic. You can use the EIP to configure the route. Example: sudo ip route add x.x.x.x/x via 10.0.2.203 dev ens6, where 10.0.2.203 is the EIP.
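Before launching the instances, it can help to sanity-check the addressing plan listed above. The following Python sketch uses only the standard ipaddress module and the example values from this topic (the VPC CIDR, the four subnets, and the revenue-interface addresses). The variable names are illustrative, and the check itself is a suggestion rather than part of the documented procedure.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "Management": ipaddress.ip_network("10.0.254.0/24"),
    "ICL": ipaddress.ip_network("10.0.253.0/24"),
    "Public": ipaddress.ip_network("10.0.1.0/24"),
    "Private": ipaddress.ip_network("10.0.2.0/24"),
}
# Primary and secondary addresses of the revenue interfaces on both nodes.
addresses = {
    "Public": ["10.0.1.101", "10.0.1.102", "10.0.1.103"],
    "Private": ["10.0.2.201", "10.0.2.202", "10.0.2.203"],
}

# Subnets that are not contained in the VPC CIDR (cross-VPC is not supported).
outside_vpc = [name for name, net in subnets.items() if not net.subnet_of(vpc)]

# Pairs of subnets whose address ranges overlap.
names = list(subnets)
overlapping = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
               if subnets[a].overlaps(subnets[b])]

# Interface addresses that fall outside the subnet they are meant to live in.
misplaced = [ip for name, ips in addresses.items() for ip in ips
             if ipaddress.ip_address(ip) not in subnets[name]]

print("outside VPC:", outside_vpc)
print("overlapping:", overlapping)
print("misplaced:", misplaced)
```

All three lists are empty for the example values; a non-empty list points to the entry that needs correcting.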

Configuration

CLI Quick Configuration

To quickly configure this example, copy the following commands, paste them into a text file, remove any line breaks, change any details necessary to match your network configuration, copy and paste the commands into the CLI at the [edit] hierarchy level, and then enter commit from configuration mode.

These configurations are captured from a lab environment, and are provided for reference only. Actual configurations may vary based on the specific requirements of your environment.

On vSRX-1 Device

On vSRX-2 Device

Step-by-Step Procedure

The following example requires you to navigate various levels in the configuration hierarchy. For instructions on how to do that, see Using the CLI Editor in Configuration Mode in the CLI User Guide.

  1. Configure the interface for the ICL.

  2. Configure interfaces for internal and external traffic.

    The secondary IP address assigned to the ge-0/0/1 interface is used as the EIP.

  3. Configure security zones, assign interfaces to the zones, and specify the allowed system services for the security zones.

  4. Configure routing options.

    Create a routing instance of type virtual router to separate management traffic from revenue traffic.

  5. Configure local node and peer node details.

  6. Associate the interface with the peer node for interface monitoring and configure the liveness detection details.

  7. Configure SRG1 by setting the mode and the deployment type.

  8. Associate a peer ID to SRG1 and define activeness-priority and preemption.

  9. Configure the AWS deployment-related options, such as setting the service type to EIP-based, and specify the details for the monitoring options.

Results

vSRX-1

From configuration mode, confirm your configuration by entering the following commands.

If the output does not display the intended configuration, repeat the configuration instructions in this example to correct it.

If you are done configuring the device, enter commit from configuration mode.

vSRX-2

From configuration mode, confirm your configuration by entering the following commands.

If the output does not display the intended configuration, repeat the configuration instructions in this example to correct it.

If you are done configuring the device, enter commit from configuration mode.

Verification

Check Multinode High Availability Details

Purpose

View and verify the details of the Multinode High Availability setup configured on your vSRX instance.

Action

From operational mode, run the following command:

vSRX-1

vSRX-2

Meaning

Verify these details from the command output:

  • Local node and peer node details such as IP address and ID.

  • The field Deployment Type: CLOUD indicates that configuration is for the cloud deployment.

  • The field Services Redundancy Group: 1 indicates the status of the SRG1 (ACTIVE or BACKUP) on that node.

Check Multinode High Availability Information on AWS

Purpose

View and verify cloud deployment details.

Action

From operational mode, run the following command:

Meaning

Verify these details from the command output:

  • The field Cloud Type: AWS indicates the deployment is for AWS.

  • The field Cloud Service Type: EIP indicates that EIP service type is used to control the traffic in AWS deployment.

  • The field Cloud Service Status: Bind to Local Node displays the EIP binding with the local node. For the backup node, this field displays Bind to Peer Node.
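If you collect this output in a script, the three fields above can be checked programmatically. The Python sketch below is a minimal example under stated assumptions: the sample text is a hand-written stand-in that contains only the field labels named in this section, not real command output, and the helper name parse_fields is hypothetical. Real output is likely to contain additional lines and different spacing.

```python
def parse_fields(text):
    """Collect 'Label: value' lines into a dictionary (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Hand-written stand-in, NOT captured from a device.
sample = """\
Cloud Type: AWS
Cloud Service Type: EIP
Cloud Service Status: Bind to Local Node
"""

fields = parse_fields(sample)
is_eip_owner = (
    fields.get("Cloud Type") == "AWS"
    and fields.get("Cloud Service Type") == "EIP"
    and fields.get("Cloud Service Status") == "Bind to Local Node"
)
print(is_eip_owner)
```

On the backup node, the same check is expected to fail because the status field reads Bind to Peer Node.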

Check Multinode High Availability Peer Node Status

Purpose

Check the Multinode High Availability peer node status.

Action

From operational mode, run the following command:

vSRX-1

vSRX-2

Meaning

Verify these details from the command output:

  • Peer node details such as interface used, IP address, and ID.

  • Packet statistics across the node.

Check Multinode High Availability SRG

Purpose

View and verify SRG details in Multinode High Availability.

Action

From operational mode, run the following command:

Meaning

Verify these details from the command output:

  • SRG details such as deployment type, status, activeness priority, and preemption details.

  • Peer node details.

  • Split-brain prevention probe details.

Verify the Multinode High Availability Status Before and After Failover

Purpose

Check the change in node status before and after a failover in a Multinode High Availability setup.

Action

Check the Multinode High Availability status on the backup node (vSRX-2).

From operational mode, run the following command:

Under Services Redundancy Group: 1, you can see Status: BACKUP. This indicates that SRG1 is in backup mode on the device.

Initiate the failover on the active node (vSRX-1) and again run the command on the backup node (vSRX-2).

Meaning

Under Services Redundancy Group: 1, the status of SRG1 has changed from BACKUP to ACTIVE. This indicates that the node has transitioned to the active role and that the other node (the previous active node) has transitioned to the backup role. You can see the other node's status under Peer Information, where the status is BACKUP.