Understanding High Availability Nodes in a Cluster

A Junos Space cluster must include at least two nodes to achieve high availability (HA). Adding more than two nodes does not increase the availability of the cluster, but each added node increases the load that the cluster can handle. At any given time, only two nodes in the cluster provide HA to the whole cluster. By default, these two nodes alone (referred to as the HA nodes in the cluster) form the Linux HA cluster, the Apache Load Balancer cluster, and the MySQL cluster. If you have added dedicated database nodes to the cluster, the MySQL cluster is instead formed by the primary and secondary database nodes.
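The membership rules above can be sketched as a small model. This is an illustrative sketch only, not part of Junos Space: the Fabric class, its field names, and the cluster keys are hypothetical, and the model assumes that dedicated database nodes are tracked separately from the other nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Fabric:
    """Toy model of a Junos Space fabric; every name here is hypothetical."""
    nodes: list[str]  # nodes in the order they were added to the cluster
    db_nodes: list[str] = field(default_factory=list)  # dedicated database nodes, if any

    def ha_nodes(self) -> list[str]:
        # By default, the first two nodes added function as the HA nodes.
        return self.nodes[:2]

    def logical_clusters(self) -> dict[str, list[str]]:
        ha = self.ha_nodes()
        return {
            "linux_ha": ha,
            "apache_load_balancer": ha,
            # Dedicated database nodes, when present, form the MySQL cluster.
            "mysql": self.db_nodes if self.db_nodes else ha,
        }
```

In this model, adding a third or fourth entry to nodes leaves every logical cluster unchanged, which mirrors the statement that extra nodes add capacity but not availability.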

By default, the first two nodes added to the cluster function as the HA nodes. In the topic Understanding the Logical Clusters Within a Junos Space Cluster, the example shows that the first two nodes (Node-1 and Node-2) are HA nodes. If you delete Node-1 or Node-2 from the Network Management Platform > Administration > Fabric workspace, the system checks whether other nodes in the cluster are available to replace the deleted HA node. The system then displays the list of capable nodes (only Node-3 in the example), from which you select a replacement. After you confirm the selected node, the Distributed Resource Manager (DRM) service adds the node to the HA cluster by sending requests to the Node Management Agent (NMA) running on the newly selected node. The following actions are initiated on the node added to the HA cluster:

  • Apache HTTP server with the mod_proxy load balancer is started on the node, and the node is configured with all JBoss nodes as members.

  • If there are no dedicated database nodes in the cluster, the database from the MySQL server on the other HA node in the cluster is copied to the node, and the MySQL server is started on the node. This server is configured as a backup of the other MySQL server in the cluster, and it resynchronizes with the primary in the background. The existing MySQL server is also reconfigured to act as a backup of this new server to ensure a symmetric primary/backup configuration on both.
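The replacement sequence described above can be sketched as follows. This is a simplified model rather than the DRM or NMA implementation: the function names and action strings are hypothetical, and the sketch assumes that every remaining non-HA node counts as a capable node, which the real system may restrict further.

```python
def capable_nodes(nodes, ha_nodes, deleted):
    """Nodes offered as replacements (assumption: every remaining non-HA node)."""
    return [n for n in nodes if n != deleted and n not in ha_nodes]


def replace_ha_node(nodes, ha_nodes, deleted, chosen, dedicated_db=False):
    """Return the new HA pair and the ordered actions for the chosen node."""
    if deleted not in ha_nodes:
        raise ValueError(f"{deleted} is not an HA node")
    if chosen not in capable_nodes(nodes, ha_nodes, deleted):
        raise ValueError(f"{chosen} cannot replace {deleted}")

    survivor = next(n for n in ha_nodes if n != deleted)
    members = [n for n in nodes if n != deleted]  # all remaining JBoss nodes
    actions = [f"start Apache mod_proxy load balancer on {chosen} with members {members}"]
    if not dedicated_db:
        # Without dedicated database nodes, MySQL runs on the HA nodes, so the
        # new HA node receives a copy and the pair is configured symmetrically.
        actions += [
            f"copy MySQL database from {survivor} to {chosen}",
            f"start MySQL on {chosen} as backup of {survivor}",
            f"reconfigure MySQL on {survivor} as backup of {chosen}",
        ]
    return [survivor, chosen], actions
```

With the three-node example from the text, deleting Node-1 leaves Node-3 as the only candidate, and selecting it yields the HA pair Node-2 and Node-3.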

When you add dedicated database nodes to the Junos Space cluster, you add two nodes together as the primary and secondary database nodes to form the MySQL cluster. The database is copied from the active HA node to the two database nodes and is then disabled on the HA nodes. If you delete one of the database nodes from the cluster, the other database node is designated the primary database node. The system checks whether non-HA nodes in the cluster are available to replace the deleted database node and displays the list of nodes that you can select as the replacement.

After you select a node, the Distributed Resource Manager (DRM) service adds the node to the MySQL cluster by sending requests to the Node Management Agent (NMA) running on the newly selected node.

The following actions are initiated on the node added to the MySQL cluster:

  • The database from the MySQL server on the primary database node in the cluster is copied, and the MySQL server is started on the newly added secondary database node. This server is configured as a backup of the MySQL server on the primary database node, and it resynchronizes with the primary in the background. The existing MySQL server on the primary database node is also reconfigured to act as a backup of this new server on the secondary database node to ensure a symmetric primary/backup configuration on both.

  • The JBoss server is stopped on the newly added database node.
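The database node replacement described above follows a similar pattern and can be sketched the same way. The function below is a hypothetical model, not a Junos Space API: its name, parameters, and action strings are assumptions made for illustration.

```python
def replace_db_node(primary, secondary, deleted, non_ha_nodes, chosen):
    """Model the MySQL cluster after one dedicated database node is deleted."""
    if deleted not in (primary, secondary):
        raise ValueError(f"{deleted} is not a database node")
    if chosen not in non_ha_nodes:
        raise ValueError(f"{chosen} is not an available non-HA node")

    # The database node that remains is designated the primary.
    new_primary = secondary if deleted == primary else primary
    actions = [
        f"copy MySQL database from {new_primary} to {chosen}",
        f"start MySQL on {chosen} as backup of {new_primary}",
        f"reconfigure MySQL on {new_primary} as backup of {chosen}",
        f"stop JBoss on {chosen}",
    ]
    return {"primary": new_primary, "secondary": chosen, "actions": actions}
```

Only non-HA nodes are accepted as replacements here, matching the check that the system performs before it displays the list of selectable nodes.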

In addition to the three default logical clusters, if you have a Cassandra cluster as part of the Junos Space fabric, the files uploaded to Cassandra are copied to all the Cassandra nodes that are part of the Cassandra cluster. Hence, if one Cassandra node fails, the files from the failed node are not lost. However, Junos Space Platform cannot upload files to or delete files in the Cassandra database until the node that failed is deleted.

If the Cassandra service is enabled on an HA node that goes down and you want to run the Cassandra service on the newly added HA node, you must manually enable and start the Cassandra service on that node. When the last node with the Cassandra service running is deleted, the files stored in the Cassandra database are lost.
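The Cassandra behavior described above can be summarized in a toy model. This sketch does not use any Cassandra client library and does not reflect how Junos Space Platform stores files internally; the CassandraCluster class and its methods are hypothetical, and the model assumes that any failed, undeleted node blocks uploads and deletions.

```python
class CassandraCluster:
    """Toy model: every file is copied to all nodes in the cluster."""

    def __init__(self, nodes):
        self.up = set(nodes)   # nodes with the Cassandra service running
        self.failed = set()    # failed nodes that have not been deleted yet
        self.files = set()     # files available from the surviving copies

    def fail(self, node):
        self.up.discard(node)
        self.failed.add(node)

    def delete_node(self, node):
        self.up.discard(node)
        self.failed.discard(node)
        if not self.up:
            # The last node running the Cassandra service is gone: files are lost.
            self.files.clear()

    def _check_writable(self):
        if self.failed:
            raise RuntimeError("delete the failed node before changing files")
        if not self.up:
            raise RuntimeError("no Cassandra node is running")

    def upload(self, name):
        self._check_writable()
        self.files.add(name)

    def delete_file(self, name):
        self._check_writable()
        self.files.discard(name)
```

In this model, a node failure leaves existing files readable but rejects uploads and deletions until delete_node is called for the failed node.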