Overview of Starting and Stopping a Session State Register Cluster
Having to stop all nodes in a cluster is uncommon because most system maintenance can be performed on one system at a time. Taking the whole cluster offline defeats the purpose of the cluster, which is to avoid downtime, so confirm that taking all systems down at the same time is truly required before proceeding. Rather than taking down all nodes, determine whether stopping just the SBR processes or just the database management processes is sufficient.
Stopping a server that hosts both an SBR Carrier node and a management node creates a double fault, but does not damage the cluster because a fully redundant cluster always has more than one of each type of node. However, stopping multiple nodes that provide redundancy to each other causes multiple faults that can damage the cluster and take the entire cluster offline.
In the SSR environment, each type of node is started and stopped in a specific order so that required resources are available when other nodes need them. This means that several commands may need to be executed on servers that host both SBR Carrier and management nodes.
Startup and shutdown commands must be executed by root on each node.
Starting the Cluster
If all nodes in the cluster are shut down, restarting requires bringing each type of node online in the order shown in Table 72. If the systems are completely shut down, rebooting each computer should restart the appropriate processes automatically because automatic start is the default configuration for all types of Session State Register nodes.
If the systems have not been completely shut down and only the SSR processes have been halted, log in as root and execute the command listed in Table 72 to start each type of node’s processes. If more than one type of node runs on a single system, maintain the listed order: start all management nodes first, then all data nodes, and finish with the SBR Carrier nodes.
Table 72: Starting Nodes in a Session State Register Cluster
Start Order and Type of Node | Execute This Command |
---|---|
1. SSR Management Node | /opt/JNPRsbr/radius/sbrd start ssr |
2. SSR Data Node | /opt/JNPRsbr/radius/sbrd start ssr |
3. Steel-Belted Radius Carrier | /opt/JNPRsbr/radius/sbrd start radius |
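For example, on a hypothetical three-server cluster in which each node type runs on its own host (the hostnames mgmt1, data1, and sbr1 are placeholders), the startup sequence might look like this sketch, with each command executed as root on the named host:

```
# 1. Start the management node first
mgmt1# /opt/JNPRsbr/radius/sbrd start ssr

# 2. Start the data node next
data1# /opt/JNPRsbr/radius/sbrd start ssr

# 3. Start the SBR Carrier node last
sbr1# /opt/JNPRsbr/radius/sbrd start radius
```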
To monitor the cluster coming online:
1. Log in to a management node as hadm. (If you are logged in as root, execute su - hadm.)
2. Run the show command by executing ./Monitor.sh "ndb_mgm -e show". Results similar to this example are displayed:
```
hadm@sbrha-4:~> ./Monitor.sh "ndb_mgm -e show"
=================[1] Wed Mar 18 17:13:56 (TZ=+00:00) 2009=================
Connected to Management Server at: 172.28.84.36:5235
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=10   @172.28.84.163 (mysql-5.7.25 ndb-7.6.9, Nodegroup: 0, Master)
id=11   @172.28.84.113 (mysql-5.7.25 ndb-7.6.9, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @172.28.84.36 (mysql-5.7.25 ndb-7.6.9)
id=2    @172.28.84.166 (mysql-5.7.25 ndb-7.6.9)

[mysqld(API)]   4 node(s)
id=21   @172.28.84.36 (mysql-5.7.25 ndb-7.6.9)
id=22   @172.28.84.166 (mysql-5.7.25 ndb-7.6.9)
id=30   @172.28.84.36 (mysql-5.7.25 ndb-7.6.9)
id=31   @172.28.84.166 (mysql-5.7.25 ndb-7.6.9)
```
Stopping the Cluster
If you do need to shut down all nodes in the cluster, shut down each type of node in the order shown in Table 73. Log in as root on each computer in turn and execute the command to stop the node’s processes. If more than one type of node runs on a single computer, maintain the correct order: stop all SBR Carrier nodes first, then all management nodes, and finally all data nodes.
Stopping multiple systems and processes removes all redundancy, creates multiple faults, and may damage the cluster. If you do need to stop the cluster, be sure to restart it properly. See Starting the Cluster.
Table 73: Stopping Nodes in a Session State Register Cluster
Stop Order and Type of Node | Execute This Command |
---|---|
1. Steel-Belted Radius Carrier | /opt/JNPRsbr/radius/sbrd stop radius
2. SSR Management Node | /opt/JNPRsbr/radius/sbrd stop ssr |
3. SSR Data Node | /opt/JNPRsbr/radius/sbrd stop ssr |
To shut down remote SSR Management nodes, execute the /opt/JNPRsbr/radius/sbrd stop ssr command on each node separately.
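As with startup, the shutdown sequence for the same hypothetical three-server cluster might look like this sketch, with each command executed as root on the named host:

```
# 1. Stop the SBR Carrier node first
sbr1# /opt/JNPRsbr/radius/sbrd stop radius

# 2. Stop the management node next
mgmt1# /opt/JNPRsbr/radius/sbrd stop ssr

# 3. Stop the data node last
data1# /opt/JNPRsbr/radius/sbrd stop ssr
```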
Stopping a Single Node
You can stop any single node in the cluster to perform maintenance without affecting the integrity of the cluster, because the server’s redundant partner takes on the primary role. These commands stop only the node’s processes; they have no effect on the system itself, which may still need to be shut down separately.
If the node is an SBR Carrier node, remember that modifying its configuration files often requires a restart; see When to Stop, Start, or Restart SBR Carrier Nodes.
The stop commands for each type of node are listed in Table 74:
Table 74: Stopping a Single Node in a Session State Register Cluster
Type of Node | Execute This Command |
---|---|
Steel-Belted Radius Carrier | /opt/JNPRsbr/radius/sbrd stop radius |
SSR Management Node | /opt/JNPRsbr/radius/sbrd stop ssr |
SSR Data Node | /opt/JNPRsbr/radius/sbrd stop ssr |
Starting a Single Node
To restart a node, use the appropriate command from Table 75:
Table 75: Starting a Single Node in a Session State Register Cluster
Type of Node | Execute This Command |
---|---|
Steel-Belted Radius Carrier | /opt/JNPRsbr/radius/sbrd start radius |
SSR Management Node | /opt/JNPRsbr/radius/sbrd start ssr |
SSR Data Node | /opt/JNPRsbr/radius/sbrd start ssr |
To monitor the startup cycle, use the show command. See Starting the Cluster.
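Putting Table 74 and Table 75 together, a typical single-node maintenance cycle on an SBR Carrier host might look like this sketch:

```
# Stop only the local RADIUS processes; the redundant partner keeps serving
/opt/JNPRsbr/radius/sbrd stop radius

# ...perform the maintenance work...

# Return the node to service
/opt/JNPRsbr/radius/sbrd start radius
```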
sbrd
The sbrd script starts and stops the processes on host machines for all three types of Session State Register nodes. The script may reside in either of two directories, /opt/JNPRsbr/radius or /etc/init.d, depending on whether the server has been configured to start all processes automatically.
All sbrd commands are executed by root. In an SSR environment, the hadm user can execute the script against SSR processes, but expect errors if hadm runs it against RADIUS processes, which are owned by root.
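To check which copy of the script a given server uses (a minimal sketch; the two paths are the ones used by the tables and examples in this section):

```
# List whichever copies of sbrd exist on this host
ls -l /opt/JNPRsbr/radius/sbrd /etc/init.d/sbrd 2>/dev/null
```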
Running sbrd on Session State Register Nodes
This section applies to running sbrd on nodes in a Session State Register cluster.
Syntax
```
sbrd status [radius|ssr|GWrelay]
sbrd start [radius|ssr|GWrelay] [force]
sbrd start ssr --nowait-nodes=node-ids
sbrd stop [radius|ssr|GWrelay] [force]
sbrd stop [cluster] [force]
sbrd restart [radius|ssr|GWrelay] [force]
sbrd clean [radius|ssr] [force]
sbrd hup [radius|ssr|authGateway [process-name]]
sbrd status [radius|ssr|GWrelay] -v [-p <LCI password>]
```
Options
The start, stop, and restart arguments start, stop, and restart the processes. If no subsystem is specified, stop acts only on the RADIUS and GWrelay processes, because SSR processes are not normally stopped; to stop them, you must specify ssr explicitly. For example: sbrd stop ssr.
Executing stop cluster on an SBR Carrier server stops both SSR and RADIUS processes. Executing stop cluster on a management node also stops the data nodes controlled by that management node.
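For instance, on a server that hosts both node types, the following invocations behave differently (a sketch based on the descriptions above):

```
# Stops only the RADIUS and GWrelay processes; SSR processes keep running
/opt/JNPRsbr/radius/sbrd stop

# Stops the SSR processes explicitly
/opt/JNPRsbr/radius/sbrd stop ssr

# Stops both the SSR and RADIUS processes
/opt/JNPRsbr/radius/sbrd stop cluster
```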
The clean argument removes temporary files. When it is executed on a data node, clean also prepares the node to take part in a new environment; for example, if an expansion kit is added to increase the number of data nodes from two to four.
The status option displays information such as SBR package version, SBR process status, and loaded plug-in information. For more information about the RADIUS status information, see the Displaying RADIUS Status Information section in the SBR Carrier Installation Guide.
The hup option operates as the kill -HUP command does on SBR Carrier nodes, but does not require the process ID. Executing sbrd hup authGateway issues the SIGHUP (1) signal to all authGateway processes running on the SBR Carrier. To issue the SIGHUP (1) signal to a specific authGateway process only, execute the hup option with the authGateway process name, for example: sbrd hup authGateway GMT.
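For example (the process name GMT comes from the text above):

```
# Send SIGHUP to every authGateway process on this SBR Carrier node
/opt/JNPRsbr/radius/sbrd hup authGateway

# Send SIGHUP only to the authGateway process named GMT
/opt/JNPRsbr/radius/sbrd hup authGateway GMT
```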
The radius, ssr, or GWrelay optional argument specifies which process to operate on when executed on a server that hosts more than one node.
- radius specifies the local Steel-Belted Radius Carrier processes.
- ssr specifies data node and management node processes, according to the type of node on which the command is executed.
- GWrelay specifies the GWrelay application.
Executing start ssr --nowait-nodes=node-ids starts the cluster without waiting for the full cluster to initialize. The node-ids variable is a comma-separated list of the node IDs that are unreachable, for example: sbrd start ssr --nowait-nodes=51,52. Use this argument only if one half of the cluster has network connectivity but has lost the ability to communicate with the other half. When network connectivity between the two halves of the cluster is restored, you can start the remaining nodes with the normal startup scripts.
The force argument makes sbrd attempt to disregard or overcome any errors that occur when processing the command. Normal behavior without the argument is to halt on errors. For example, sbrd start does not attempt to start software that is already running, but sbrd start force ignores a running process. This may produce unintended results, so use force with great care.
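For example (a sketch of the behavior described above):

```
# Halts with an error if the RADIUS processes are already running
/opt/JNPRsbr/radius/sbrd start radius

# Attempts the start anyway, disregarding the running process
/opt/JNPRsbr/radius/sbrd start radius force
```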
The -v option displays additional information about the RADIUS process along with basic information such as the SBR package version, SBR process status, and SBR process ID. If you have changed the default Lightweight Directory Access Protocol (LDAP) Configuration Interface (LCI) password, use the -p option to specify the password. For more information about the RADIUS status information, see the Displaying RADIUS Status Information section in the SBR Carrier Installation Guide.
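For example, to display detailed RADIUS status when the LCI password has been changed from the default (substitute your own password for the placeholder from the syntax above):

```
# Detailed status for the local RADIUS process
/opt/JNPRsbr/radius/sbrd status radius -v -p <LCI password>
```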
Examples
This example shows the effect of sbrd stop ssr executed on a cluster management node:
```
root@wrx07:~> /etc/init.d/sbrd stop ssr
Stopping ssr auxiliary processes
Stopping ssr management processes
Connected to Management Server at: 172.28.84.36:5235
Node 2 has shutdown.
Disconnecting to allow Management Server to shutdown
```
This example shows the effect of sbrd start ssr on a management node. Be aware that this does not start the data nodes.
```
root@wrx07:~> /etc/init.d/sbrd start ssr
Starting ssr management processes
bash-3.00#
```
When sbrd is executed without a node type argument, it runs against all node processes on the server. For example, sbrd start starts both RADIUS and SSR processes for all nodes on a server.
In an SSR environment, because some servers may host both SBR Carrier and management nodes, sbrd may be executed more than once with different arguments.
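For example, to shut down such a combined host in the correct order, run the script twice (a sketch; recall that SBR Carrier processes stop before management processes):

```
# Stop the RADIUS processes first...
/opt/JNPRsbr/radius/sbrd stop radius

# ...then stop the SSR management processes
/opt/JNPRsbr/radius/sbrd stop ssr
```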
Use the clean argument only when initializing new data nodes; it removes temporary files and sets file locks to support creation of a new database.
When to Stop, Start, or Restart SBR Carrier Nodes
When modifications are made to SBR Carrier node configuration files, some processes on the node must be restarted to force the newly modified file to be read and used. Table 76 lists typical configuration control files and settings; the Yes entry indicates the least drastic action that causes the new settings to take effect. A short example follows the table.
Table 76: When to Stop and Restart SBR Carrier and SSR Processes
If this item changes: | Save the window or file | Issue a SIGHUP (1) signal | Stop/restart the server |
---|---|---|---|
Access window or object | Yes | (Also works) | (Also works) |
access.ini file | No | No | Yes |
*.acc files | No | No | Yes |
account.ini file | No | No | Yes |
admin.ini file | No | No | Yes |
*.aut files | No | (Sometimes) | Yes |
blacklist.ini file | No | No | Yes |
Authentication policy | Yes | (Also works) | (Also works) |
*.dct files | No | No | Yes |
*.dic | No | No | Yes |
*.dhc files | No | No | Yes |
dhcp.ini file | No | No | Yes |
*.dir files | No | (Sometimes) | Yes |
*.eap files | No | (Sometimes) | Yes |
eap.ini file | No | No | Yes |
enterprises.oid file | No | No | Yes |
events.ini file | No | No | Yes |
filter.ini file | No | Yes | (Also works) |
*.gen files | No | No | Yes |
Import *.rif or users file | Yes | (Also works) | (Also works) |
*.ini for directed accounting | No | No | Yes |
IP Pools dialog or object | Yes | (Also works) | (Also works) |
lockout.ini file | No | No | Yes |
Log levels (in radius.ini file) | No | Yes | (Also works) |
Profiles dialog or object | Yes | (Also works) | (Also works) |
Proxy dialog or object | Yes | (Also works) | (Also works) |
*.pro files | No | Yes | (Also works) |
proxy.ini file | No | (Sometimes) | Yes |
radius.dct file | No | No | Yes |
radius.ini file | No | (Sometimes) | Yes |
RADIUS Clients dialog or object | Yes | (Also works) | (Also works) |
Servers dialog or object | Yes | (Also works) | (Also works) |
services file | No | No | Yes |
snmpdx.acl file | No | No | Yes |
tacplus.ini file | No | No | Yes |
Trace levels | No | Yes | (Also works) |
Tunnels page or object | Yes | (Also works) | (Also works) |
Users dialog or object | Yes | (Also works) | (Also works) |
vendor.ini file | No | No | Yes |
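For example, filter.ini accepts a SIGHUP, while radius.dct requires a full restart. Using the hup and restart arguments described earlier, the least drastic actions are:

```
# After editing filter.ini: a SIGHUP is sufficient
/opt/JNPRsbr/radius/sbrd hup radius

# After editing radius.dct: the server must be stopped and restarted
/opt/JNPRsbr/radius/sbrd restart radius
```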