Solution Architecture

Traffic Path in IPsec Scale-Out Solution

The scale-out solution uses BGP as its dynamic routing protocol. BGP enables the MX Series Routers and SRX Series Firewalls to learn routes from their surrounding networks and, most importantly, to exchange path information for the network traffic that is sent from one MX Series Router across each SRX Series Firewall to the next MX Series Router.

BGP enables the exchange of network paths for the external user subnets coming from IPsec peers and for the specific internal networks. When each SRX Series Firewall announces its own IKE/IPsec termination gateway to its BGP peers, each with the same “network cost”, the load balancing algorithm can use those routes to balance traffic across the SRX Series Firewalls. For IPsec traffic, the Internet side needs to announce the IP address used for establishing the tunnel (the IKE destination at minimum; usually both the IKE phase 1 and the IPsec phase 2 negotiations use the exact same address). The other side needs to announce the inner traffic transported over IPsec (the IP addresses negotiated in the traffic selector of each party).

The following diagram shows how traffic flows are distributed from an MX Series Router to multiple SRX Series Firewalls using the ECMP load balancing method, for IKE/IPsec traffic started from the ENB in a mobile network. In the diagram, the SRX Series Firewalls sit in a symmetric sandwich between the two MX Series Routers. Whether those MX Series Routers are a single physical node configured with two routing instances (more typical) or two physical MX Series Router nodes, one on each side, the routing principle is the same as with two routing nodes, and the traffic flow distribution stays consistent in both directions. However, one side of the SRX Series Firewalls carries encrypted traffic and the other side carries the clear-text traffic to and from the mobile endpoint.

Figure 1: Flow Distribution with IPsec

The MX Series Router on the left side uses the UNTRUST-VR routing instance to forward traffic to each SRX Series Firewall. On the left side, only IPsec traffic is seen. The only IP addresses to announce are the ones used by the remote sites (IPsec gateways or users) and the IKE gateway IP address, which is the same on every SRX Series Firewall. The routes on this side are announced through BGP to the next hop, so that the MX Series Router has a path through each SRX Series Firewall (with the same cost, for load balancing).

The MX Series Router on the right side uses TRUST-VR to receive traffic from each SRX Series Firewall and forward it to the next hop toward the target resources. When an IPsec tunnel is established on the left side (remote site to SRX Series Firewall), the negotiation of the IPsec security association includes one or more inner IP addresses assigned to the remote entity. These are the IP addresses announced to the router on the right side, which makes the return path unique toward the specific SRX Series Firewall hosting that IPsec security association. For example, the diagram shows a simple network IP address with a /24 prefix for IPv4 and a /120 prefix for IPv6.

Routes are announced through BGP. Each MX Series Router has its own BGP autonomous system (AS) and peers with the SRX Series Firewalls on their two sides (the TRUST and UNTRUST zones, in a single routing instance). The MX Series Router also peers with any other routers that provide connectivity to the Internet and to the servers or data center.

Figure 2 shows how traffic comes in through the remote gateway. The remote gateway starts an IPsec negotiation with one of the SRX Series Firewalls (the destination is selected by the load balancing mechanism), and the traffic is then transported over IPsec to that SRX Series Firewall. The SRX Series Firewall decrypts the packet and sends the content to the next hop. The return path goes through the correct SRX Series Firewall because of the Auto Route Injection (ARI) route, which that SRX Series Firewall announces to the MX Series Router on the right side. Each route is announced through BGP to make every network reachable.

Figure 2: Network BGP Within/Outside IPsec
Note:

In the above diagram and in the later configuration examples, the “publicly” announced IKE gateway address is 172.16.1.1/32 for the sake of the demonstration (although it belongs to the RFC 1918 private address space). All 10.0.0.0/8 addresses are considered private IP addresses as per the same RFC.

Introduction to SRX Series Firewalls Multinode High Availability

This section is an extract from the public documentation on MNHA. For more information, see https://www.juniper.net/documentation/us/en/software/junos/high-availability/topics/topic-map/mnha-introduction.html.

Juniper Networks SRX Series Firewalls support a new solution, Multinode High Availability (MNHA), to address high availability requirements for modern networks. In this solution, both the control plane and the data plane of the participating devices (nodes) are active at the same time. Thus, the solution provides inter-chassis resiliency.

The participating devices are either co-located or geographically separated across different rooms or buildings. Having nodes with high availability across geographical locations ensures resilient service. If a disaster affects one physical location, MNHA can fail over to a node in another physical location, thereby ensuring continuity.

In MNHA, both SRX Series Firewalls have an active control plane and communicate their status over an Inter Chassis Link (ICL) that can be direct or routed across the network. This allows the nodes to be geo-dispersed while synchronizing the sessions and IKE security associations. The nodes do not share a common configuration, which allows different IP address settings on each SRX Series Firewall. Use the commit sync mechanism for the configuration elements that must be the same on both platforms.

Each SRX Series Firewall uses one or more services redundancy groups (SRGs) for the data plane. SRG1 and above can be either active or backup on a node. The exception is SRG0 (zero), which is always active on both nodes. The scale-out solution can use SRG0 natively to load balance traffic across both SRX Series Firewalls at the same time. However, other modes are also of interest, for example Active/Backup for SRG1 combined with Backup/Active for SRG2. This behaves like the always-active SRG0 but can also add routing information (such as BGP as-path-prepend) under certain conditions. SRG1 and above offer additional health checking of the surrounding environment, which can be used to make an SRGn group active, backup, or ineligible.

Figure 3: Multinode High Availability General Architecture

MNHA supports the following three network modes:

  • Default Gateway or L2 mode: Uses the same L2 network segment on each side of the SRX Series Firewalls (for example, TRUST and UNTRUST), and both SRX Series Firewalls share a common IP/MAC address on each network segment. This does not mean that the SRX Series Firewall is in switching mode. It still routes between its interfaces, but it shares the same broadcast domain with the other SRX Series Firewall on one side, and likewise on the other side.
  • Hybrid mode or mix of L2 and L3: Uses an L2 broadcast domain and IP address on one side of the SRX Series Firewalls (for example, TRUST) and routing on the other side (for example, UNTRUST), which therefore has different IP subnets.
  • Routing mode or L3: Each side of each SRX Series Firewall uses a different IP address, with no common IP subnet even between the SRX Series Firewalls, and all communication with the rest of the network happens through routing. The JVD uses this architecture. This mode is well suited to scale-out communication using BGP with the MX Series Router.
Figure 4: MNHA Network Modes

Whether you use SRG0 Active/Active, SRG1 Active/Backup (a single node active at a time), or a combination of SRG1 Active/Backup and SRG2 Backup/Active, the result is simply that one or two SRX Series Firewalls in a cluster are in use at the same time.

ECMP Consistent Hashing (CHASH) Load Balancing Overview

This feature relates to topology 1 (a single MX Series Router with multiple standalone SRX Series Firewalls), which uses ECMP. Dual MX Series Routers and/or dual SRX Series Firewalls are not possible with this load balancing method.

Figure 5: Topology 1 - ECMP CHASH

ECMP Consistent Hashing in MX Series Router

Equal Cost Multi Path (ECMP) is a network routing strategy that allows traffic of the same session, or flow (that is, traffic with the same source and destination), to be transmitted across multiple paths of equal cost. It is a mechanism that load balances traffic and increases bandwidth by fully utilizing otherwise unused bandwidth on links to the same destination.

When forwarding a packet, the routing technology must decide which next-hop path to use. In deciding, the device considers the packet header fields that identify a flow. When ECMP is used, next-hop paths of equal cost are identified based on routing metric calculations and hash algorithms. That is, routes of equal cost have the same preference and metric values, and the same cost to the network. The ECMP process identifies a set of routers, each of which is a legitimate equal cost next hop towards the destination. The routes that are identified are referred to as an ECMP set. An ECMP set is formed when the routing table contains multiple next-hop addresses for the same destination with equal cost (routes of equal cost have the same preference and metric values). If there is an ECMP set for the active route, Junos OS uses a hash algorithm to choose one of the next-hop addresses in the ECMP set to install in the forwarding table. You can configure Junos OS so that the multiple next-hop entries in an ECMP set are installed in the forwarding table. On Juniper Networks devices, one can perform per-packet load balancing to spread traffic across multiple paths between routing devices.
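
As a rough illustration of how an ECMP set forms, the following Python sketch groups routes to the same destination and keeps only those that tie on preference and metric. It is a simplified model, not the Junos OS route selection algorithm: the `Route` record, the `ecmp_set` and `pick_next_hop` helpers, the SHA-256 hash, and the extra 10.9.9.9 path are assumptions made for illustration. The addresses reuse the example values from this document.

```python
import hashlib
from collections import namedtuple

# Simplified route record; real BGP path selection compares more attributes.
Route = namedtuple("Route", "prefix next_hop preference metric")


def ecmp_set(routes, prefix):
    """Next hops of every route to `prefix` that ties for the lowest cost."""
    candidates = [r for r in routes if r.prefix == prefix]
    if not candidates:
        return []
    best = min((r.preference, r.metric) for r in candidates)
    return [r.next_hop for r in candidates if (r.preference, r.metric) == best]


def pick_next_hop(next_hops, flow_key):
    """Hash a flow key onto one member of the ECMP set (illustrative hash)."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return next_hops[int.from_bytes(digest[:8], "big") % len(next_hops)]


routes = [
    # 170 is the Junos OS default preference for BGP routes.
    Route("172.16.1.1/32", "10.1.1.0", 170, 0),
    Route("172.16.1.1/32", "10.1.1.8", 170, 0),
    Route("172.16.1.1/32", "10.1.1.16", 170, 0),
    Route("172.16.1.1/32", "10.9.9.9", 170, 50),  # hypothetical worse path
]

hops = ecmp_set(routes, "172.16.1.1/32")
print(hops)  # ['10.1.1.0', '10.1.1.8', '10.1.1.16']
print(pick_next_hop(hops, "172.16.255.10"))
```

The route with the higher metric is left out of the set, and any given flow key always maps to the same member for as long as the set is unchanged.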

The following example shows the learned routes and the forwarding table for the same destination (assuming the traffic target is the exact address 172.16.1.1/32 and the SRX Series Firewall BGP peers are 10.1.1.0, 10.1.1.8, and 10.1.1.16):

In a scale-out architecture where stateful security devices are connected, maintaining the symmetry of flows through the security devices is the primary objective. Symmetry means that traffic from a subscriber (remote user or remote site) and traffic to the same subscriber must always go through the same SRX Series Firewall (which maintains the subscriber state). To reach the same SRX Series Firewall, the traffic must be hashed onto the same link toward that SRX Series Firewall in both directions.

A subscriber is identified by the source IP address in the upstream direction (client to server) and by the destination IP address in the downstream direction (server to client). MX Series Routers perform symmetric hashing, that is, for a given (sip, dip) tuple the same hash is calculated irrespective of the direction of the flow, even if sip and dip are swapped. However, this is not enough here, because all flows from a subscriber must reach the same SRX Series Firewall. The solution is therefore to hash only on the source IP address (and not the destination IP address) in one direction, and only on the destination IP address in the reverse direction.
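
The following Python sketch illustrates why hashing on the subscriber address alone keeps every flow of a subscriber on one firewall in both directions. It is a minimal model under stated assumptions: the firewall names, the server addresses, and the SHA-256 based bucket function are hypothetical and do not represent the hash used by the MX Series Router.

```python
import hashlib

SRX_NEXT_HOPS = ["srx1", "srx2", "srx3"]  # hypothetical firewall names


def _bucket(ip, size):
    """Map an IP address string onto a bucket index (illustrative hash)."""
    digest = hashlib.sha256(ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def upstream_next_hop(src_ip, dst_ip):
    """Client to server: hash on the source (subscriber) address only."""
    return SRX_NEXT_HOPS[_bucket(src_ip, len(SRX_NEXT_HOPS))]


def downstream_next_hop(src_ip, dst_ip):
    """Server to client: hash on the destination (subscriber) address only."""
    return SRX_NEXT_HOPS[_bucket(dst_ip, len(SRX_NEXT_HOPS))]


subscriber = "172.16.255.10"
servers = ["10.20.0.5", "10.20.0.6", "10.30.1.1"]  # hypothetical servers

up = {upstream_next_hop(subscriber, s) for s in servers}
down = {downstream_next_hop(s, subscriber) for s in servers}
print(up == down and len(up) == 1)  # True
```

Because the server address never enters the hash, the subscriber reaches the same firewall whichever server it talks to, and the replies from those servers are hashed back to that same firewall.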

However, in the present IPsec use case, the traffic is not the same on both sides of the firewall. On the left side there is IPsec traffic, coming from remote sites and terminating on the SRX Series Firewalls. On the right side there is the inner traffic, going from the SRX Series Firewalls to the internal servers or data center. The traffic must nevertheless remain symmetric. The SRX Series Firewall that receives the initial IKE/IPsec request establishes a tunnel with the source of that request (the remote site), and the IPsec negotiation (IKE phase 2) also negotiates the source and destination IP addresses (the traffic selector, or encryption domain, depending on the terminology used). The source IP address or subnet that the remote site negotiates in this traffic selector is then announced through BGP to the next MX Series Router in the chain (this is the ARI route, or Auto Route Injection). As a result, the return traffic to that remote site reaches the correct SRX Series Firewall, which routes it back into the proper IPsec tunnel to its destination.

By default, when a failure occurs in one or more paths, the hashing algorithm recalculates the next hop for all paths, which typically redistributes all flows. In this environment, redistributing all flows when a link fails can result in significant traffic loss or a loss of service on SRX Series Firewalls whose links remain active. Consistent load balancing enables you to override this behavior so that only flows on inactive links are redirected. Flows on active links are maintained and continue uninterrupted, and only the flows affected by one or more link failures are remapped.
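
To make the difference concrete, the following Python sketch compares a plain modulo hash with a consistent scheme when one next hop fails. Rendezvous (highest-random-weight) hashing is used here only as a stand-in for consistent hashing; it is an assumption for illustration and not the algorithm implemented on the MX Series Router. The firewall names and flow addresses are hypothetical.

```python
import hashlib


def _score(flow, hop):
    """Per (flow, next hop) weight used by rendezvous hashing."""
    digest = hashlib.sha256(f"{flow}|{hop}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_pick(flow, hops):
    """Default-style hashing: the index depends on the number of next hops."""
    digest = hashlib.sha256(flow.encode()).digest()
    return hops[int.from_bytes(digest[:8], "big") % len(hops)]


def consistent_pick(flow, hops):
    """Rendezvous hashing: the next hop with the highest weight wins."""
    return max(hops, key=lambda hop: _score(flow, hop))


def moved(pick, flows, before, after):
    """Flows whose next hop changes between two next-hop sets."""
    return [f for f in flows if pick(f, before) != pick(f, after)]


flows = [f"172.16.255.{i}" for i in range(1, 201)]
before = ["srx1", "srx2", "srx3", "srx4"]
after = ["srx1", "srx2", "srx4"]  # srx3 has failed

print("modulo hash moved:", len(moved(modulo_pick, flows, before, after)))
print("consistent hash moved:", len(moved(consistent_pick, flows, before, after)))
```

With the consistent scheme, every flow that moves was previously on the failed next hop. With the modulo scheme, the index of most flows changes as soon as the number of next hops changes, so flows on healthy firewalls are also disturbed.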

This feature applies to topologies where members of an ECMP group are external BGP neighbors in a single-hop BGP session. Consistent load balancing does not apply when you add a new ECMP path or modify an existing path in any way. In this design, however, new SRX Series Firewalls are added gracefully, with the intent of taking an equal share of flows from each active SRX Series Firewall, so the addition causes minimal impact to existing ECMP flows. For example, if four active SRX Series Firewalls each carry 25% of the total flows and a fifth (previously unseen) SRX Series Firewall is added, 5% of the total flows move from each existing SRX Series Firewall to the new one. Each SRX Series Firewall then carries 20% of the flows, and 20% of the flows in total are redistributed from the four existing SRX Series Firewalls to the new one.
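
The arithmetic of the example can be checked with a short Python sketch. It assumes an ideal, perfectly even redistribution, and the `share_moved` helper is a hypothetical name introduced for illustration.

```python
from fractions import Fraction


def share_moved(existing, added=1):
    """Ideal flow shares before and after adding firewalls to the group."""
    before = Fraction(1, existing)          # share per firewall before
    after = Fraction(1, existing + added)   # share per firewall after
    per_node = before - after               # share leaving each existing firewall
    total = per_node * existing             # share of all flows that moves
    return before, after, per_node, total


before, after, per_node, total = share_moved(4)
print(f"{float(before):.0%} -> {float(after):.0%} per firewall")           # 25% -> 20%
print(f"{float(per_node):.0%} of all flows leave each existing firewall")  # 5%
print(f"{float(total):.0%} of all flows move in total")                    # 20%
```

Note that the 5% figure is a share of the total flows, not of the flows carried by one firewall: each existing firewall hands over one fifth of its own 25%.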

When traffic is redistributed (loss of a single SRX Series Firewall or addition of a new one), the IPsec peer renegotiates with its “new” peer IKE gateway, because the security association does not exist there yet.

With an SRX MNHA pair, a failover from one node to the other in the same pair (for example, when one node is lost) reuses the existing synchronized IPsec security association, and no renegotiation happens.

ECMP CHASH Usage in Topology 1 (Single MX Series Router, Scale-Out SRXs) for IPsec:

Note:

The IPsec use case usually accepts IPsec connections from remote sites (or from remote users). The connections arrive from the left side, as shown in Figure 6. The internal traffic transported over IPsec is then decapsulated and sent to the right side, typically to internal servers or a private data center.

Figure 6: Topology 1 - ECMP CHASH - IPsec Use Case
  • All the scale-out SRX Series Firewalls connected to the MX Series Router are configured with EBGP connections.
  • All the scale-out SRX Series Firewalls need to be configured with an auto-vpn configuration and with the same anycast IP address, hosted on a loopback interface, as the IKE endpoint IP address. All the SRX Series Firewalls are in IPsec responder-only mode.
  • IPsec tunnels initiated from IPsec initiators behind the MX Series Router use the same SRX IKE endpoint IP address with unique traffic selectors. Each SRX Series Firewall uses the traffic selector to install a unique ARI route, which attracts the return data traffic from the server to the right IPsec tunnel.
  • A load-balancing policy with source-hash for the anycast IP address route is configured on the forwarding table.
  • The MX Series Router receives the anycast IP address route on the UNTRUST side, where the SRX Series Firewalls advertise it using EBGP. The MX Series Router imports this route into the UNTRUST instance using a load-balancing consistent-hash policy.
  • The MX Series Router on the UNTRUST side has an ECMP route for the anycast IP address.
  • IKE traffic from the IPsec initiator router reaches the MX Series Router on the UNTRUST instance, hits the ECMP anycast IP address route, and takes one of the ECMP next hops to an SRX Series Firewall based on the hash value calculated from the source IP address.
  • The SRX Series Firewall anchors the IKE session and installs the ARI route.
  • The SRX Series Firewall advertises the ARI route toward the TRUST direction of the MX Series Router.
  • IPsec data traffic from clients behind the IPsec initiator router goes through the IPsec tunnel and reaches the SRX Series Firewall where the tunnel is anchored. Clear-text packets coming out of the tunnel are routed toward the TRUST direction to reach the server.
  • Reply traffic from the server toward the client reaches the MX Series Router on the TRUST side and is routed through the unique ARI route to the SRX Series Firewall where the tunnel is anchored.
  • The SRX Series Firewall encrypts the traffic and sends it over the tunnel to the IPsec initiator, which forwards it to the client.
  • When any SRX Series Firewall goes down, CHASH on the MX Series Router ensures that IPsec sessions on the other SRX Series Firewalls are not disturbed and that only the IPsec sessions on the down SRX Series Firewall are redistributed.
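
The return-path steps above rely on the TRUST side holding one unique ARI route per tunnel. The following Python sketch models that lookup as a longest-prefix match. It is a simplified illustration, not router code: the `TrustRib` class, the firewall names, and the 10.100.x.0/24 traffic selectors are hypothetical.

```python
import ipaddress


class TrustRib:
    """Toy TRUST-side routing table holding ARI routes learned from each firewall."""

    def __init__(self):
        self.routes = {}  # ip_network -> name of the firewall anchoring the tunnel

    def install_ari(self, traffic_selector, srx):
        """Record the ARI route announced by `srx` for a negotiated traffic selector."""
        self.routes[ipaddress.ip_network(traffic_selector)] = srx

    def lookup(self, dst_ip):
        """Return the firewall for the longest matching prefix, or None."""
        addr = ipaddress.ip_address(dst_ip)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


rib = TrustRib()
rib.install_ari("10.100.1.0/24", "srx1")  # tunnel of remote site A anchored on srx1
rib.install_ari("10.100.2.0/24", "srx2")  # tunnel of remote site B anchored on srx2

print(rib.lookup("10.100.1.25"))  # srx1
print(rib.lookup("10.100.2.7"))   # srx2
print(rib.lookup("10.100.9.1"))   # None
```

No hashing is needed in this direction: because each traffic selector is unique, the route itself identifies the firewall that holds the security association.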

Traffic Load Balancer Overview

This feature relates to topology 2 (single MX Series Router, scale-out SRX MNHA pairs) and topology 3 (dual MX Series Routers and scale-out SRX MNHA pairs).

Figure 7: Topologies 2 and 3 – TLB – IPsec Use Cases

Traffic Load Balancer in MX Series Router

Traffic Load Balancer (TLB) provides stateless load balancing, translated or non-translated, as an inline PFE service in MX Series Routers. Load balancing in this context is a method where incoming transit traffic is distributed across configured servers that are in service. Because the load balancer is stateless, no state is created for any connection, so connection state imposes no scaling limit and throughput can be close to line rate. TLB has two load balancing modes: translated (L3) and non-translated Direct Server Return (L3).

The scale-out solution uses the non-translated Direct Server Return (L3) mode. The TLB configuration contains a list of available SRX Series Firewall addresses, and the MX Series Router PFE programs a selector table based on this list. TLB performs a health check for each SRX Series Firewall individually (usually ICMP, although HTTP, TCP, and custom checks are also possible). The health check runs on the MX Series Router Routing Engine. If an SRX Series Firewall passes the health check, TLB installs a specific IP address route or a wildcard IP address route (a TLB configuration option) in the routing table, with a composite next hop. The composite next hop in the PFE is programmed with all the available SRX Series Firewalls in the selector table. Filter-based forwarding pushes the "client to server" traffic to the TLB, where it hits the TLB-installed route and is sprayed across the available SRX Series Firewalls using a source or destination hash. "Server to client" traffic is routed directly back to the client instead of going through the TLB.
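
The interaction between health checks, route installation, and the selector table can be sketched in Python as follows. This is a conceptual model only: the `TlbGroup` class, its method names, the health-check addresses, and the rendezvous-style hash are assumptions made for illustration and do not describe how the PFE actually programs its selector table.

```python
import hashlib


class TlbGroup:
    """Toy model of a TLB group: health-gated, stateless source-hash selection."""

    def __init__(self, servers):
        self.servers = list(servers)  # configured firewall addresses
        self.healthy = set()          # firewalls currently passing the probe

    def health_result(self, server, passed):
        """Record the outcome of one health-check probe."""
        if passed:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def route_installed(self):
        """The TLB route exists only while at least one firewall is healthy."""
        return bool(self.healthy)

    def pick(self, src_ip):
        """Select a healthy firewall from a hash of the source IP address."""
        live = [s for s in self.servers if s in self.healthy]
        if not live:
            return None
        return max(live, key=lambda s: hashlib.sha256(f"{src_ip}|{s}".encode()).digest())


group = TlbGroup(["10.3.0.1", "10.3.0.2", "10.3.0.3"])  # hypothetical addresses
print(group.route_installed())  # False
for server in group.servers:
    group.health_result(server, True)
print(group.route_installed())  # True
print(group.pick("172.16.255.10"))
```

No per-connection state is kept: the choice is recomputed from the packet's source address each time, which is what allows the load balancer to stay stateless.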

Figure 8: TLB Work in RE and PFE
Note:

TLB has been used in Junos OS on the MX Series Router family for several years (as early as Junos OS Release 16.1R6) and is used successfully on large server farms with around 20,000 servers.

On MX240/480/960 and MX2000 chassis, TLB previously ran its control part and health check on MS-MPC or MX-SPC3 service cards, while the data plane (PFE) was already on the line cards. It now runs on the RE, as implemented on MX304/MX10000 chassis.

For more information, see https://www.juniper.net/documentation/us/en/software/junos/interfaces-next-gen-services/interfaces-adaptive-services/topics/concept/tdf-tlb-overview.html

Using TLB in MX Series Router for Scale-Out SRX Series Firewalls Solution with IPsec

In this scenario, the IPsec traffic originates from remote sites (enterprise sites or remote users) that reside on the left side of Figure 9. When a remote site connects to the SRX Series Firewalls using IPsec, TLB load balances the connection to one of the SRX Series Firewalls. The topology can be drawn the other way around for an enterprise. The principle stays the same, and only the interface IP addresses, routing-instance names, and zone names might change. A single anycast IP address, hosted on each SRX Series Firewall, is used for all IKE/IPsec connections.

Figure 9: Topology 3 - Scale-Out IPsec with TLB
  • All SRX Series Firewalls are configured with BGP to establish eBGP peering sessions with the MX Series Router nodes.
  • All the scale-out SRX Series Firewalls need to be configured with an auto-vpn configuration and with the same anycast IP address as the IKE endpoint IP address. All SRX Series Firewalls are in IPsec responder-only mode.
  • IPsec tunnels initiated from behind the MX Series Routers use the same SRX IKE endpoint IP address with unique traffic selectors. Each SRX Series Firewall uses the traffic selector to install a unique ARI (Auto Route Injection) route, which attracts the return data traffic from the server to the right IPsec tunnel. The ARI routes must also be unique.
  • MX Series Routers are configured with TLB on the IPsec VR routing instance to load balance the IKE traffic coming from the MX Series Router toward the scale-out SRX Series Firewalls.
  • All the scale-out SRX Series Firewalls connected to the MX Series Routers are configured with unique IP addresses, which MX TLB uses to perform the health check and build the selector table in the PFE. The PFE uses this selector table to load balance packets across the available next hops. The health check address is reachable through the BGP connection. The anycast IP address used as the IKE endpoint is reachable through this unique IP address on each SRX Series Firewall.
  • Filter-based forwarding, matching on the source IP address, is used on the MX Series Router to push IPsec-specific traffic to the TLB IPsec forwarding instance.
  • The TLB forwarding instance has a default route whose next hop is the list of SRX Series Firewalls. TLB installs this default route when the health check passes for at least one SRX Series Firewall.
  • TLB performs source-based hash load balancing across all the available SRX Series Firewall next hops.
  • Load-balanced IPsec tunnel sessions are anchored on any available SRX Series Firewall, which installs the ARI route. Packets are then decrypted and routed to the server through the MX Series Router over the TRUST routing instance.

For the return traffic, coming from the server toward the client on the MX TRUST routing instance, the unique ARI routes are used to route the traffic back to the same SRX Series Firewall where the IPsec tunnel is anchored.

  • The SRX Series Firewall uses the same IPsec tunnel session to encrypt the packet and routes the IPsec traffic toward the MX Series Router in the UNTRUST VR direction.
  • The MX Series Router routes the IPsec traffic back to the IPsec initiators.

Configuration Example for ECMP CHASH

The following sample configurations illustrate the elements that make this solution work, including configurations for both the MX Series Router and some SRX Series Firewalls. They contain many repetitive statements and are shown in the Junos OS hierarchical view.

Source-hash for the forward flow is common to all ECMP-based and TLB-based solutions. CHASH is used during any next-hop failure: it helps existing sessions on active next hops remain undisturbed, while sessions on the down next hop are redistributed over the other active next hops. This CHASH behavior is built into the TLB solution. However, in an ECMP-based solution you must configure CHASH explicitly using a BGP import policy.

Note:

The following sample configurations use 172.16.1.1/32 as the “publicly” announced IKE gateway address and the 172.16.255.0/24 range for the remote site “public” IP addresses, for the sake of the demonstration (although both belong to the RFC 1918 private address space).

All other 10.0.0.0/8 addresses are considered private addresses as per the same RFC.

The following MX Series Router configuration is an example for ECMP load balancing using source hash on the UNTRUST side (only toward the IKE gateway anycast address shared by each SRX Series Firewall):

The following MX Series Router configuration is an example for specific forward traffic with ECMP CHASH on the UNTRUST side (on the IPsec encrypted traffic side):

The following MX Series Router configuration is an example for specific forward traffic on the TRUST side with decrypted mobile traffic (only the IP addresses negotiated by the remote sites, coming from the Auto Route Injection traffic selectors (ARI-TS), need to be announced):

After reviewing the MX Series Router configuration, consider the following example showing the SRX1 configuration for IPsec on the UNTRUST side (including the security zone). A very similar configuration applies to all the other SRX Series Firewalls, with the same IKE loopback address and the same BGP AS number, but different IP addresses for their own networks.

The following example shows SRX1 configuration for security gateway (referred to as SECGW in code) on the TRUST side (using the single and same VR as above):

The following example shows SRX1 configuration for security gateway at the security level (IKE/IPsec listening settings – example with PSK here - and security policies):

Note:

These configurations can also use IPv6.

When running tests, ECMP CHASH outputs such as the following show the route selections. Notice the IKE anycast IP address for the gateway, reachable through different BGP peers on the UNTRUST side:

The inner IP addresses coming out of the IPsec tunnels (allocated to each connected mobile, hence shown as /32) are announced to the TRUST router:

Note:

This configuration is also available in the CSDS configuration example, which uses the exact same technology and configuration for ECMP CHASH. For more information, see https://www.juniper.net/documentation/us/en/software/connected-security-distributed-services/csds-deploy/topics/example/configure-csds-ecmp-chash-singlemx-standalonesrx-scaledout-nat-statefulfw.html (some IP addresses or AS numbers might differ).

Configuration Example for TLB

As with ECMP CHASH, the TRUST-VR and UNTRUST-VR configurations are similar in the TLB use case, with BGP peering with the SRX Series Firewalls on each side. However, the TLB services need a different configuration, including additional routing instances and fewer policy statements.

Source-hash for the forward flow and destination-hash for the reverse flow are common to all solutions based on ECMP or TLB. CHASH is used during any next-hop failure: it helps existing sessions on active next hops remain undisturbed, and only the sessions on down next hops are redistributed over the other active next hops. This CHASH behavior is built into the TLB solution.

General load balancing strategy for anything except TLB:

The following MX Series Router configuration is an example for specific forward and return traffic. TLB uses forwarding as a new routing-instance type.

The following configuration example shows how traffic is redirected to TLB instance using Filter Based Forwarding (associated with routing-instance srx-tproxy-fi) to extract that specific traffic for load balancing it to each SRX Series Firewalls:

TLB uses the following interface loopbacks for health checking to the SRX Series Firewalls:

The following shows the TLB service part (in this example, with the IPsec service, only the TRUST side TLB instance is used, as the ARI route is announced for return traffic):

After the MX Series Router configuration, the following sample shows the SRX1 configuration for the IPsec security gateway.

In the case of an SRX MNHA pair, the same loopback IP address is shared so that it can fail over if any event affects the active device. This loopback IKE gateway IP address is announced through BGP to the MX Series Router peer (on the TRUST side). The following example shows the SRX1 configuration for MNHA and loopback export:

When running tests, TLB output such as the following shows the group usage and the packets and bytes sent to each SRX Series Firewall:

Common Configurations for ECMP CHASH and TLB

Some configuration elements need to be in place for both load balancing methods. The following sample configurations cover the TRUST and UNTRUST VRs and the peering with each SRX Series Firewall. They also show some other, less common configuration elements.

The following sample shows a common configuration when using the dual MX Series Router topology. Both MX Series Routers calculate the same hash value when both have the same number of next hops.
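
The point about the hash value can be illustrated with a small Python sketch. It assumes, for illustration only, that both routers apply the same hash function and list their next hops in the same firewall order; the `Router` class, the link addresses, and the firewall names are hypothetical.

```python
import hashlib


def bucket(src_ip, size):
    """Map a source IP address onto a next-hop index (illustrative hash)."""
    digest = hashlib.sha256(src_ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


class Router:
    """Toy router that forwards on a source hash over its own next-hop list."""

    def __init__(self, links):
        self.links = links  # ordered list of (next-hop address, firewall name)

    def forward(self, src_ip):
        return self.links[bucket(src_ip, len(self.links))][1]


# Each router has its own link addresses but reaches the same three firewalls.
mx1 = Router([("10.1.1.0", "srx1"), ("10.1.1.8", "srx2"), ("10.1.1.16", "srx3")])
mx2 = Router([("10.2.1.0", "srx1"), ("10.2.1.8", "srx2"), ("10.2.1.16", "srx3")])

flows = [f"172.16.255.{i}" for i in range(1, 101)]
print(all(mx1.forward(f) == mx2.forward(f) for f in flows))  # True
```

If one router had fewer next hops than the other, the index calculation would differ between them, and the same flow could land on different firewalls depending on which router it entered.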