SSR Cluster Concepts and Terminology

A Session State Register cluster has both a physical and a logical organization. The physical elements are servers; the logical elements are nodes. The two terms are not interchangeable and should not be used as synonyms.

Session State Register Servers

The Session State Register has requirements for the entire cluster and for all servers that participate in it, over and above the requirements for standalone SBR Carrier servers. An SSR cluster must not have any single point of failure, so each server in a cluster must have its own memory and disks. We do not recommend or support virtual servers, network shares, network file systems, or SANs.

All servers in the cluster require at least two physical Ethernet ports that provide the same throughput. Multi-pathing the NICs to a single IP address is required. A Session State Register cluster can operate over a 100Base-T network, but we recommend 1000Base-T (Gigabit Ethernet).

All data servers must have equal processor power, memory space, and available bandwidth because they are tightly coupled and share data. If the overall throughput of the data servers varies from machine to machine, performance degrades. The configuration of SBR Carrier servers and management servers may vary from machine to machine, as long as the basic standalone requirements are met.

Session State Register Nodes

There are four types of SSR nodes, each with a specific role within the cluster:

  • An SBR Carrier node is a machine that hosts the RADIUS process, any optional modules, and all related processes that read and write data in the SSR database. This type of node accesses and manipulates the cluster’s shared data, which is hosted by the data nodes.
  • A management node is a machine that hosts the SSR management process, which controls itself and all data nodes in the cluster. It provides configuration data, starts and stops nodes, and can back up the database and perform other database operations. It also manages a database process that supports the SSR storage engine. Cluster configuration data is located in an identical config.ini file on each of the cluster’s management nodes.
  • A data node is a machine that hosts the SSR data process, ndbd. The ndbd process cooperatively manages, replicates, and stores data in the SSR storage engine together with the other data nodes. Each data node has its own memory and permanent storage, and each maintains both a portion of the working copy of the SSR database and a portion of one or more replicas of the database.
  • An SBRC/SSR management node is a machine that hosts both a RADIUS process and an SSR management process.
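The config.ini file mentioned above might look something like the following minimal sketch. The section names follow the MySQL Cluster (NDB) conventions that the ndbd process implies, but every host name, NodeId, and value here is an illustrative assumption, not a setting taken from this document:

```ini
# Illustrative sketch only: host names and NodeIds are assumptions.
[ndbd default]
NoOfReplicas=2                  # SSR is fixed at two replicas per partition

[ndb_mgmd]                      # first management node (M or SM)
NodeId=1
HostName=mgmt1.example.com

[ndb_mgmd]                      # second management node
NodeId=2
HostName=mgmt2.example.com

[ndbd]                          # data node D1
NodeId=10
HostName=data1.example.com

[ndbd]                          # data node D2
NodeId=11
HostName=data2.example.com
```

Because every management node reads this file, it must be byte-for-byte identical on each of them, as the bullet above notes.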

Each machine in the front end and the SSR cluster is assigned a node type that indicates which processes it hosts: S (SBRC), SM (SBRC and SSR management), M (management), or D (data). Thus, the following terms may be used to describe machines that are members of either the front end or the SSR cluster: S node, SM node, M node, and D node.

All the data nodes in a cluster run a special process, the shared storage engine, that manages the working copy of the SSR database. The management nodes coordinate the service among the participating data nodes. The shared storage engine and the SSR database replace the on-board database used by standalone Steel-Belted Radius Carrier servers. The shared storage engine ensures that the database is updated by a synchronous replication mechanism that keeps cluster nodes synchronized: a transaction is not committed until all cluster nodes are updated.
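The commit rule described above can be sketched in a few lines. This is an illustrative model only, not the SSR or ndbd API; the class and function names are assumptions made for the example:

```python
# Hedged sketch of synchronous replication: a transaction commits only
# after every participating data node acknowledges the update.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.store = {}      # working copy of the data, held by this node
        self.online = True

    def apply(self, key, value):
        """Apply an update and acknowledge it (False if the node is down)."""
        if not self.online:
            return False
        self.store[key] = value
        return True

def commit(nodes, key, value):
    """Commit succeeds only if all nodes acknowledge the update."""
    acks = [n.apply(key, value) for n in nodes]
    # A real cluster would roll back or exclude a failed node;
    # here we simply report that the transaction did not commit.
    return all(acks)

nodes = [DataNode("D1"), DataNode("D2")]
assert commit(nodes, "session:42", "active") is True   # both nodes updated
nodes[1].online = False
assert commit(nodes, "session:43", "active") is False  # no cluster-wide commit
```

The point of the sketch is the ordering: no caller sees the transaction as committed until every node has applied it, which is what keeps the working copies identical.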

Note: In some cases, the terms node and machine have been used interchangeably. The term node refers to software processes that can be collocated on the same machine.

SSR Data Entities

Each data node participates in a node group of two data nodes. A Starter Kit cluster has a single node group with two members; a Starter Kit with an Expansion Kit has two node groups, each with two data nodes. Each node group stores different partitions and replicas.

  • A partition is a portion of all the data stored by the cluster. There are as many cluster partitions as node groups in the cluster. Each node group keeps at least one copy of any partitions assigned to it (that is, at least one replica) available to the cluster.
  • A replica is a copy of a partition. Each data node in a node group stores a replica of each partition assigned to that group. A replica belongs entirely to a single data node, and a node can (and usually does) store several replicas; maintaining two replicas per partition is the fixed setting for SSR.

Figure 243 shows the data components of a data cluster with four data nodes arranged in two node groups of two nodes each. Nodes 1 and 2 belong to Node Group 1. Nodes 3 and 4 belong to Node Group 2.

  • Because there are four data nodes, there are four partitions.
  • The number of replicas is two, creating two copies of each partition.

The cluster remains viable as long as at least one data node in each node group is operating.
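The viability rule is simple enough to express as a predicate: the cluster survives as long as every node group still has at least one running data node. This is an illustrative sketch; the function name and the boolean representation of node state are assumptions:

```python
# Hedged sketch of the cluster viability rule described above.

def cluster_viable(node_groups):
    """node_groups: list of node groups, each a list of booleans
    (True = data node is up). Viable iff every group has a live node."""
    return all(any(group) for group in node_groups)

# Node Group 1 = [node 1, node 2], Node Group 2 = [node 3, node 4]
assert cluster_viable([[True, True], [True, True]])        # all nodes up
assert cluster_viable([[True, False], [False, True]])      # one per group
assert not cluster_viable([[False, False], [True, True]])  # group 1 lost
```

Losing both nodes of one group loses that group's partitions entirely, which is why a whole-group failure takes the cluster down even when the other group is healthy.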

Figure 243: SSR with Four Data Nodes in Two Groups


The data stored by the cluster in Figure 243 is divided into four partitions: 0, 1, 2, and 3. Both replicas of a given partition are stored within the same node group, and successive partitions are assigned to alternating node groups:

  • Partition 0 is stored on Node Group 1. A primary replica is stored on Data Node 1 and a backup replica is stored on Data Node 2.
  • Partition 1 is stored on the other node group, Node Group 2. The primary replica is on Data Node 3 and its backup replica is on Data Node 4.
  • Partition 2 is stored on Node Group 1. The placement of its two replicas is reversed from that of Partition 0; the primary replica is stored on Data Node 2 and the backup on Data Node 1.
  • Partition 3 is stored on Node Group 2, and the placement of its two replicas is reversed from that of Partition 1: the primary replica is on Data Node 4 and the backup on Data Node 3.
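The placement pattern enumerated above is regular enough to compute. The following sketch reproduces the Figure 243 layout for four partitions over two node groups; the function and the modular arithmetic are illustrative assumptions, not the actual SSR placement algorithm:

```python
# Hedged sketch: partition-to-replica placement as enumerated above.

NODE_GROUPS = {1: ["Data Node 1", "Data Node 2"],
               2: ["Data Node 3", "Data Node 4"]}

def placement(partition):
    """Return (node_group, primary_node, backup_node) for a partition."""
    group_id = partition % 2 + 1     # partitions alternate between groups
    nodes = NODE_GROUPS[group_id]
    flip = (partition // 2) % 2      # roles reverse for partitions 2 and 3
    primary, backup = nodes[flip], nodes[1 - flip]
    return group_id, primary, backup

assert placement(0) == (1, "Data Node 1", "Data Node 2")
assert placement(2) == (1, "Data Node 2", "Data Node 1")
```

Alternating the primary role within each group spreads the read/write load evenly across both data nodes instead of leaving one node serving only backups.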

    Tip: The terms primary and replica are used in another context in the Steel-Belted Radius Carrier environment and documentation, which can cause some confusion. These terms mean something specific in the context of Session State Register, but they are also used when talking about centralized configuration management (CCM).

    CCM is a feature that coordinates Steel-Belted Radius Carrier server settings between a primary RADIUS server and one or more replica RADIUS servers. It copies critical configuration files from the primary to the replicas, so it keeps multiple SBR Carrier servers operating the same way.

    CCM is a separate tool and process that is not tied or linked to SSR, but it is often used in SSR environments to keep the SBR Carrier nodes operating identically.

Cluster Configurations

For the highest level of redundancy, we recommend that each node in a cluster run on its own server. In many locations and for many installations, that might not be practical, so you can run an SBR Carrier node and a management node together on the same server; in fact, that is the default configuration for the SSR Starter Kit cluster. However, neither a management node nor an SBR Carrier node can run on the same machine as a data node. Separation is required so that management arbitration services continue if one of the data node servers fails.

Using these separation guidelines, the recommended minimum size of a Session State Register cluster is four physical computers: two servers that each run an SBR Carrier node and a management node, and two servers to host the data nodes. This configuration supports all licenses and nodes included in the Session State Register Cluster Starter Kit and is shown in Figure 244:

Figure 244: Basic Session State Register Starter Kit Cluster


Session State Register Scaling

You scale a Session State Register cluster by adding a separately licensed SSR Expansion Kit to a Starter Kit, by adding a third management node, or by adding more SBR Carrier front end servers.

Adding a Data Node Expansion Kit

An Expansion Kit adds two data nodes, increasing the number of data nodes in a cluster to four. The additional nodes form a second node group (as shown in Figure 243) that provides more working memory for the SSR shared database. With the Expansion Kit in place, each node group manages its own partitions of the database and their replicas. The data in each partition is synchronously replicated between the group’s data nodes, so if one data node fails, the remaining node can still access all the data. This configuration also provides very quick failover times if a node fails.

Figure 245: SSR Cluster with an Expansion Kit Setup to Create Two-Node Groups


Adding a Third Management Node

A Management Node Expansion Kit provides software and a license for a third management node. If the third management node is set up on a separate host instead of alongside an SBR Carrier node on a shared server, it also increases the resiliency of the cluster by providing an additional arbiter in case of a node failure.

Adding More SBR Carrier Front End Servers

The service capacity of the SBR Carrier environment grows when you add more stateless SBR Carrier servers to the front end. Adding SBR Carrier servers increases both the resiliency of the cluster and the speed of processing a particular transaction because wait time is reduced. A data cluster can support up to 20 Steel-Belted Radius Carrier nodes.

The SBR Carrier servers do not require identical configurations; they can be configured with different optional modules or communications interfaces. Each one requires a separate SBR Carrier license, but they all share the Session State Register license.

We recommend installing a load balancer in front of the SBR Carrier servers to evenly distribute the RADIUS load between front end SBR Carrier nodes. Regular server-based load balancing works if the front ends only process RADIUS transactions. Use a RADIUS-aware load balancer if the front ends perform multi-round authentication.

Cluster Network Requirements

A redundant cluster requires a redundant network. At the computer level, we require dual interface cards in each computer and multi-pathing.

We recommend that the network be a dedicated subnet with dual switches. This fully duplicates the network, and each computer in the cluster has at least two routes to every other computer, as shown in Figure 246.

Figure 246: Starter Kit SSR Cluster with Redundant Network


The SSR database schema uses primary key lookups as often as possible during transaction processing, so the database cluster performance scales almost linearly based on the number of data nodes in the cluster.

Do not configure the subnet to be shared beyond the cluster computers because communications between nodes are not encrypted or shielded in any way. The only means of protecting transmissions within a cluster is to run your cluster on a protected network; do not interpose firewalls between any of the nodes.

Running the cluster on a private or protected network also increases efficiency because the cluster has exclusive use of all bandwidth between cluster hosts. This protects the cluster nodes from interference caused by transmissions between other computers on the network.

Gigabit Ethernet is the strongly recommended network type; 100Base-T is the minimum supported speed. Network latency can severely degrade performance, so we also recommend that all servers be close enough together that latency is always less—much less—than 10 ms.

Table 70: Latency Between Servers and Its Effect on Performance

  Latency Times               Performance Degradation
  0 ms latency (LAN)          Baseline performance as designed.
  10 ms latency               Up to 40% performance loss.
  20 ms latency               Up to 60% performance loss.
  More than 20 ms latency     Not supported.
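Table 70 can be encoded as a simple threshold check, which is handy when validating a planned deployment. The thresholds come from the table; the function itself is an illustrative sketch, not part of the product:

```python
# Hedged sketch encoding Table 70: worst-case performance degradation
# as a function of inter-server latency.

def degradation(latency_ms):
    """Return the expected worst-case loss, or None if unsupported."""
    if latency_ms <= 0:
        return "baseline"            # LAN: performance as designed
    if latency_ms <= 10:
        return "up to 40% loss"
    if latency_ms <= 20:
        return "up to 60% loss"
    return None                      # more than 20 ms is not supported

assert degradation(0) == "baseline"
assert degradation(10) == "up to 40% loss"
assert degradation(25) is None
```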

Modified: 2018-01-11