Overview of Starting and Stopping a Session State Register Cluster

 

Having to stop all nodes in a cluster is uncommon because most system maintenance can be performed on one system at a time. Taking the whole cluster offline defeats the intention of the cluster—to avoid downtime. So make sure that taking all systems down at the same time is truly required before proceeding. Rather than taking down all nodes, determine whether stopping just the SBR processes or the database management processes might be sufficient.

Stopping a server that hosts both an SBR Carrier node and a management node creates a double fault, but does not damage the cluster because a fully redundant cluster always has more than one of each type of node. Stopping multiple nodes that provide redundancy to each other, however, causes multiple faults that may damage the cluster and take it entirely offline.

In the SSR environment, each type of node is started, and stopped, in a specific order so that required resources are available when other nodes require them. This means that several commands may need to be executed on servers that host both SBR Carrier and management nodes.

Startup and shutdown commands must be executed by root on each node.

Starting the Cluster

If all nodes in the cluster are shut down, restarting requires bringing each type of node online in the order shown in Table 72. If the systems were completely shut down, rebooting the computers should restart the appropriate processes automatically because automatic start is the default configuration for all types of Session State Register nodes.

If the systems have not been totally shut down and just the SSR processes have been halted, log in as root and execute the commands listed in Table 72 to start each type of node’s processes. If more than one type of node runs on a single system, maintain the listed order: start all management nodes first, then all data nodes, and finish with the SBR Carrier nodes.

Table 72: Starting Nodes in a Session State Register Cluster

Start Order and Type of Node      Execute This Command
1. SSR Management Node            /opt/JNPRsbr/radius/sbrd start ssr
2. SSR Data Node                  /opt/JNPRsbr/radius/sbrd start ssr
3. Steel-Belted Radius Carrier    /opt/JNPRsbr/radius/sbrd start radius
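
For example, in a deployment where the management, data, and SBR Carrier nodes run on separate hosts, the full startup sequence might be scripted from an administrative host as shown in this sketch. The hostnames are hypothetical, and the script assumes that root can run remote commands over ssh:

#!/bin/sh
# Illustrative SSR cluster startup sequence (hostnames are placeholders).
# Order matters: management nodes, then data nodes, then SBR Carrier nodes.
for host in mgmt1 mgmt2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd start ssr
done
for host in data1 data2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd start ssr
done
for host in sbr1 sbr2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd start radius
done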

To monitor the cluster coming online:

  1. Log in to a management node as hadm. (Or if you are root, execute: su - hadm.)

  2. Run the show command.

    Execute:

    ./Monitor.sh "ndb_mgm -e show"

    Results similar to this example are displayed:
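
    (The output below is a representative sketch: the node IDs, IP addresses, and version strings are placeholders, and the actual output varies with the cluster configuration. In this sketch the cluster has two data nodes, two management nodes, and two SBR Carrier API nodes.)

    Connected to Management Server at: localhost:1186
    Cluster Configuration
    ---------------------
    [ndbd(NDB)]     2 node(s)
    id=1    @192.168.1.10  (Version: x.y.z, Nodegroup: 0, Master)
    id=2    @192.168.1.11  (Version: x.y.z, Nodegroup: 0)

    [ndb_mgmd(MGM)] 2 node(s)
    id=51   @192.168.1.20  (Version: x.y.z)
    id=52   @192.168.1.21  (Version: x.y.z)

    [mysqld(API)]   2 node(s)
    id=61   @192.168.1.30  (Version: x.y.z)
    id=62   @192.168.1.31  (Version: x.y.z)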

Stopping the Cluster

If you do need to shut down all nodes in the cluster, make sure that you shut down each type of node in the order shown in Table 73. Log in as root on each computer in turn and execute the command to stop the node’s processes. If more than one type of node runs on a single computer, maintain the correct order: stop all SBR Carrier nodes first, then all management nodes, and finally all data nodes.

Caution

Stopping multiple systems and processes removes all redundancy, creates multiple faults, and may damage the cluster. If you do need to stop the cluster, be sure to restart it properly. See Starting the Cluster.

Table 73: Stopping Nodes in a Session State Register Cluster

Stop Order and Type of Node       Execute This Command
1. Steel-Belted Radius Carrier    /opt/JNPRsbr/radius/sbrd stop
2. SSR Management Node            /opt/JNPRsbr/radius/sbrd stop ssr
3. SSR Data Node                  /opt/JNPRsbr/radius/sbrd stop ssr

Note

To shut down remote SSR Management nodes, execute the /opt/JNPRsbr/radius/sbrd stop ssr command on each node separately.
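
Mirroring the startup sketch in Starting the Cluster, a scripted full shutdown might look like this (hostnames are again hypothetical):

#!/bin/sh
# Illustrative SSR cluster shutdown sequence (hostnames are placeholders).
# Order matters: SBR Carrier nodes, then management nodes, then data nodes.
for host in sbr1 sbr2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd stop
done
for host in mgmt1 mgmt2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd stop ssr
done
for host in data1 data2; do
  ssh root@$host /opt/JNPRsbr/radius/sbrd stop ssr
done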

Stopping a Single Node

You can stop any single node in the cluster to perform maintenance without affecting the integrity of the cluster (because the node’s redundant partner takes on the primary role). These commands stop only the node’s processes—they have no effect on the system itself, which may still need to be shut down separately.

If the node is an SBR Carrier node, remember that modifying configuration files often requires a restart before the changes take effect; see When to Stop, Start, or Restart SBR Carrier Nodes.

The stop commands for each type of node are listed in Table 74:

Table 74: Stopping a Single Node in a Session State Register Cluster

Type of Node                      Execute This Command
Steel-Belted Radius Carrier       /opt/JNPRsbr/radius/sbrd stop radius
SSR Management Node               /opt/JNPRsbr/radius/sbrd stop ssr
SSR Data Node                     /opt/JNPRsbr/radius/sbrd stop ssr

Starting a Single Node

To restart a node, use the appropriate command from Table 75:

Table 75: Starting a Single Node in a Session State Register Cluster

Type of Node                      Execute This Command
Steel-Belted Radius Carrier       /opt/JNPRsbr/radius/sbrd start radius
SSR Management Node               /opt/JNPRsbr/radius/sbrd start ssr
SSR Data Node                     /opt/JNPRsbr/radius/sbrd start ssr

To monitor the startup cycle, use the show command. See Starting the Cluster.
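
For example, a typical maintenance cycle on a single SBR Carrier node might look like the following sketch (the edited file is illustrative):

cd /opt/JNPRsbr/radius
./sbrd stop radius     # stop only the RADIUS process on this node
vi radius.ini          # perform the maintenance, such as a configuration edit
./sbrd start radius    # return the node to service
./sbrd status          # confirm that the process is running again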

sbrd

The sbrd script starts and stops the processes for all three types of Session State Register nodes on their host machines. The script may reside in either of two directories on a server, depending on whether the server is configured to start all processes automatically.

All sbrd commands are executed by root. In an SSR environment, the hadm user can also execute the script against SSR processes, but attempts to manage the RADIUS processes, which are owned by root, result in errors.

Running sbrd on Session State Register Nodes

This section applies to running sbrd on nodes in a Session State Register cluster.

Syntax
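
The following synopsis is reconstructed from the options described below and from the commands shown elsewhere in this section; consult the SBR Carrier Reference Guide for the authoritative syntax:

./sbrd start [ radius | ssr | GWrelay ] [ force ]
./sbrd start ssr --nowait-nodes=node-ids
./sbrd stop [ radius | ssr | GWrelay | cluster ] [ force ]
./sbrd restart [ radius | ssr | GWrelay ] [ force ]
./sbrd status [ -v ] [ -p password ]
./sbrd clean
./sbrd hup [ process-name ]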

Options

  • The start, stop, and restart arguments start, stop, and restart the node’s processes. If a subsystem is not specified, stop operates only on the RADIUS and GWrelay processes because SSR processes are not normally stopped; to stop them, specify ssr explicitly, for example: sbrd stop ssr.

  • Executing stop cluster on an SBR Carrier server stops both the SSR and RADIUS processes. Executing stop cluster on a management node also stops the data nodes controlled by that management node.

  • The clean argument removes temporary files. When it is executed on a data node, clean also prepares the node to take part in a new environment; for example, if an expansion kit is added to increase the number of data nodes from two to four.

  • The status option displays information such as SBR package version, SBR process status, and loaded plug-in information. For more information about the RADIUS status information, see the Displaying RADIUS Status Information section in the SBR Carrier Installation Guide.

  • The hup option operates as the kill -HUP command does on SBR Carrier nodes, but does not require the process ID. Executing sbrd hup authGateway issues the SIGHUP (1) signal to all authGateway processes running on the SBR Carrier node. To issue the SIGHUP (1) signal only to a specific authGateway process, execute the hup option with the authGateway process name, for example: sbrd hup authGateway GMT.

  • The radius, ssr, or GWrelay optional argument specifies which process to operate on when executed on a server that hosts more than one node.

    • radius specifies the local Steel-Belted Radius Carrier processes.

    • ssr specifies data node and management node processes, according to the type of node on which the command is executed.

    • GWrelay specifies the GWrelay application.

  • Executing start ssr --nowait-nodes=node-ids starts the cluster without waiting for the full cluster to initialize. The node-ids variable is a comma-separated list of the node IDs that are unreachable, for example: sbrd start ssr --nowait-nodes=51,52. Use this argument only when one half of the cluster has network connectivity but has lost the ability to communicate with the other half. When network connectivity between the two halves of the cluster is restored, you can start the remaining nodes with the normal startup scripts.

  • The force argument makes sbrd attempt to disregard or overcome any errors that occur when processing the command. Normal behavior without the argument is to halt on errors. For example, sbrd start does not attempt to start software that is already running, but sbrd start force ignores a running process. This may produce unintended results, so use force with great care.

  • The -v option displays additional information about the RADIUS process along with basic information such as the SBR package version, SBR process status, and SBR process ID. If you have changed the default Lightweight Directory Access Protocol (LDAP) Configuration Interface (LCI) password, use the -p option to specify the password. For more information about the RADIUS status information, see the Displaying RADIUS Status Information section in the SBR Carrier Installation Guide.
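
For example, to display verbose RADIUS status on a node where the default LCI password has been changed (the password value is a placeholder, and the option order is an assumption):

./sbrd status -v -p password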

Examples

This example shows the effect of sbrd stop ssr executed on a cluster management node:

This example shows the effect of sbrd start ssr on a management node. Be aware that this does not start the data nodes.

Note
  • When sbrd is executed without a node type argument, it runs against all node processes on the server. For example, sbrd start starts both RADIUS and SSR processes for all nodes on a server.

  • In an SSR environment, because some servers may host both SBR Carrier and management nodes, sbrd may be executed more than once with different arguments.

  • Use the clean argument only when initializing new data nodes; it removes temporary files and sets file locks to support creation of a new database.

When to Stop, Start, or Restart SBR Carrier Nodes

When modifications are made to SBR Carrier node configuration files, some processes on the node must be restarted to force the newly modified file to be read and used. Table 76 lists typical configuration control files and settings. The Yes entry in each row indicates the least drastic action that causes the new settings to take effect:

Table 76: When to Stop and Restart SBR Carrier and SSR Processes

Item changes                      Save the window or file   Issue a SIGHUP (1) signal   Stop/restart the server
Access window or object           Yes                       (Also works)                (Also works)
access.ini file                   No                        No                          Yes
*.acc files                       No                        No                          Yes
account.ini file                  No                        No                          Yes
admin.ini file                    No                        No                          Yes
*.aut files                       No                        (Sometimes)                 Yes
blacklist.ini file                No                        No                          Yes
Authentication policy             Yes                       (Also works)                (Also works)
*.dct files                       No                        No                          Yes
*.dic files                       No                        No                          Yes
*.dhc files                       No                        No                          Yes
dhcp.ini file                     No                        No                          Yes
*.dir files                       No                        (Sometimes)                 Yes
*.eap files                       No                        (Sometimes)                 Yes
eap.ini file                      No                        No                          Yes
enterprises.oid file              No                        No                          Yes
events.ini file                   No                        No                          Yes
filter.ini file                   No                        Yes                         (Also works)
*.gen files                       No                        No                          Yes
Import *.rif or users file        Yes                       (Also works)                (Also works)
*.ini for directed accounting     No                        No                          Yes
IP Pools dialog or object         Yes                       (Also works)                (Also works)
lockout.ini file                  No                        No                          Yes
Log levels (in radius.ini file)   No                        Yes                         (Also works)
Profiles dialog or object         Yes                       (Also works)                (Also works)
Proxy dialog or object            Yes                       (Also works)                (Also works)
*.pro files                       No                        Yes                         (Also works)
proxy.ini file                    No                        (Sometimes)                 Yes
radius.dct file                   No                        No                          Yes
radius.ini file                   No                        (Sometimes)                 Yes
RADIUS Clients dialog or object   Yes                       (Also works)                (Also works)
Servers dialog or object          Yes                       (Also works)                (Also works)
services file                     No                        No                          Yes
snmpdx.acl file                   No                        No                          Yes
tacplus.ini file                  No                        No                          Yes
Trace levels                      No                        Yes                         (Also works)
Tunnels page or object            Yes                       (Also works)                (Also works)
Users dialog or object            Yes                       (Also works)                (Also works)
vendor.ini file                   No                        No                          Yes
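
For example, Table 76 shows that a log level change in radius.ini takes effect on a SIGHUP (1) signal, so no restart is needed. A minimal sketch, assuming that sbrd hup with no process argument signals the RADIUS process as described in the sbrd section above:

cd /opt/JNPRsbr/radius
vi radius.ini    # change the log level (illustrative edit)
./sbrd hup       # send SIGHUP so that the server rereads the log level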