Configuring the Cassandra Database in a Multiple Data Center Environment

 

NorthStar Controller uses the Cassandra database to manage database replicas in a NorthStar cluster. The default setup of Cassandra assumes a single data center. In other words, Cassandra knows only the total number of nodes; it knows nothing about the distribution of nodes within data centers.

In a production environment, however, as opposed to a lab environment, it is typical to have multiple data centers with one or more NorthStar nodes in each data center. In a multiple data center environment, it is preferable for Cassandra to be aware of the data center topology and to take it into consideration when placing database replicas.

This topic provides the steps for configuring Cassandra for use in a multiple data center environment. Because Apache Cassandra is open-source software, its usage, terminology, and best practices are well documented elsewhere on the web.

To aid in visualization, consider Figure 1, which shows a NorthStar cluster consisting of nine NorthStar nodes distributed across three data centers. We refer to this example in the procedure that follows.

Figure 1: Multiple Data Center Example

Before you begin the configuration, we recommend that you verify the NorthStar status on all nodes and check the status of the Cassandra cluster; sample commands for these checks follow the list below.

  1. Check the status of processes on the active node. All processes should be running.
  2. Check the status of processes on standby nodes. On standby nodes, at least the northstar: and northstar_pcs: processes should be STOPPED.
  3. Check the status of the Cassandra cluster.
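
    For example, assuming the process groups named above (exact commands and output vary by release):

      # On the active node, all processes should show RUNNING:
      supervisorctl status

      # On each standby node, at least the northstar: and northstar_pcs:
      # process groups should show STOPPED:
      supervisorctl status

      # From any node, check the Cassandra cluster status:
      source /opt/northstar/northstar.env
      nodetool status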

To configure Cassandra to support NorthStar HA in a multiple data center environment, perform the following steps:

  1. Modify the cluster name.

    Change the cluster name in all servers (data centers 1, 2, and 3 in our example) to “NorthStar Cluster” from the default of “Test Cluster”:
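
    For example, in cassandra.yaml on every node (the location of the file depends on your installation):

      # cassandra.yaml
      cluster_name: 'NorthStar Cluster'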

  2. Modify the endpoint snitch.

    The snitch provides information to Cassandra about the network topology so that requests can be routed efficiently and replicas can be distributed according to the assigned grouping. The recommended snitch is GossipingPropertyFileSnitch. It propagates the rack and data center defined in the cassandra-rackdc.properties file on each node.

    Update the endpoint_snitch entry in cassandra.yaml in all of the nodes in all of the data centers:
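
    For example:

      # cassandra.yaml, on every node in every data center
      endpoint_snitch: GossipingPropertyFileSnitch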

  3. Update the seed nodes.

    Select one node from each data center to act as a seed node. In our example, we select NS12 for DC1, NS22 for DC2, and NS32 for DC3. Seed nodes are used during initial startup to discover the cluster and to bootstrap the gossip process for new nodes joining the cluster:
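
    For example, in the seed_provider section of cassandra.yaml on every node (the placeholder addresses stand for the IP addresses of NS12, NS22, and NS32):

      seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
            - seeds: "<NS12-IP>,<NS22-IP>,<NS32-IP>"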

  4. Modify data center and rack properties.

    In each data center, update cassandra-rackdc.properties in all nodes to reflect the name of the data center. In our example, dc=DC1 for nodes in data center 1, dc=DC2 for nodes in data center 2, and dc=DC3 for nodes in data center 3. Use a rack name common to all data centers (rack=RAC1 in our example):
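
    For example, in cassandra-rackdc.properties:

      # On all nodes in data center 1:
      dc=DC1
      rack=RAC1

      # On all nodes in data center 2:
      dc=DC2
      rack=RAC1

      # On all nodes in data center 3:
      dc=DC3
      rack=RAC1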

  5. Modify the limits.conf file.

    These settings increase the system resource limits available to Cassandra. Modify limits.conf by commenting out any current ‘soft’ or ‘hard’ settings for nofile and nproc on all nodes in all data centers:
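
    For example, in /etc/security/limits.conf (the path and the existing values shown here are typical; adjust to your system):

      # Comment out any existing nofile and nproc entries, for example:
      # *    soft    nofile    65536
      # *    hard    nofile    65536
      # *    soft    nproc     65536
      # *    hard    nproc     65536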

  6. Modify supervisord_infra.conf for Cassandra.

    Modify the supervisord_infra.conf file on all nodes in all data centers so that the user parameter and the command option run Cassandra as the pcs user:
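
    A minimal sketch of the relevant section, assuming a [program:cassandra] entry; the file location and the exact command line depend on your installation, so keep your existing command and change only the user-related settings:

      [program:cassandra]
      ; Run Cassandra in the foreground as the pcs user.
      ; The command path below is hypothetical; retain your installation's command.
      user=pcs
      command=/opt/northstar/thirdparty/cassandra/bin/cassandra -f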

  7. Stop the Cassandra database and any processes that could access the database.

    Stop the Cassandra database using the supervisorctl stop infra:cassandra command. Also stop any processes that could access Cassandra. Perform this step on all nodes in the cluster. For our example, it must be performed on all nine nodes.
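
    For example, on each node:

      supervisorctl stop infra:cassandra
      # Also stop any other processes that access Cassandra, as appropriate
      # for your deployment.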

  8. Remove the existing Cassandra database.

    During the initial installation, remove existing Cassandra data to avoid conflicts between the existing data and the new configuration. If you omit this step, you might encounter errors or exceptions. This procedure involves clearing the existing backup directory (data.orig) and moving the existing data to the now-cleared backup directory, leaving the data directory empty for new data. Perform this step on all nodes in all data centers:
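
    A sketch of this sequence, assuming the Cassandra data lives under /opt/northstar/data/cassandra (substitute the data directory used by your installation):

      cd /opt/northstar/data/cassandra     # hypothetical data location
      rm -rf data.orig                     # clear the previous backup directory
      mv data data.orig                    # move the existing data into the backup
      mkdir data                           # leave an empty data directory for new data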

  9. Update supervisorctl and start Cassandra.

    Execute supervisorctl update to restart the processes defined under supervisord_infra.conf and start Cassandra. Perform this step in all nodes in all data centers.
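
    For example, on each node:

      supervisorctl update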

    Note

    It could take up to three minutes for all processes to restart.

  10. Verify the Cassandra status.

    Check that the Cassandra process is running by executing the supervisorctl status command. To verify the status of the Cassandra database, first ensure that the proper environment is set up by running source /opt/northstar/northstar.env, and then execute the nodetool status command:
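
    For example:

      supervisorctl status infra:cassandra     # should report RUNNING
      source /opt/northstar/northstar.env
      nodetool status                          # each node should be listed as UN
                                               # (Up/Normal) under its data center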

  11. Change the Cassandra password.

    After the data directory has been removed and the Cassandra database has been restarted, the database credentials revert to the default, “cassandra”. To change the Cassandra password, use the cqlsh shell. In this example, we change the Cassandra password to “Embe1mpls”. In practice, use the password assigned by your system administrator. Changing the Cassandra password need only be done on one server in the cluster (choose any server in any data center); it is then propagated across all nodes in the cluster.
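
    For example, from any one node (replace <node-IP> with that node’s address):

      cqlsh <node-IP> -u cassandra -p cassandra
      cassandra@cqlsh> ALTER USER cassandra WITH PASSWORD 'Embe1mpls';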

  12. Verify the new Cassandra password.

    Log in to any of the NorthStar nodes and connect with the cqlsh shell using the new password to verify that the password change has taken effect.
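
    For example:

      cqlsh <node-IP> -u cassandra -p Embe1mpls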

  13. Replicate the Cassandra user to all nodes:

    Then verify which nodes received a replica of the Cassandra credentials after this operation:
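
    A sketch of one way to do this, assuming the system_auth keyspace is replicated to all three nodes in each data center and that the Cassandra roles table holds the credentials; the exact statements in your installation may differ:

      cqlsh <node-IP> -u cassandra -p Embe1mpls
      cassandra@cqlsh> ALTER KEYSPACE system_auth WITH replication =
                       {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 3};

      # Then, from the shell, check which nodes hold a replica of the cassandra role:
      nodetool getendpoints system_auth roles cassandra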

  14. Perform nodetool repair to update the Cassandra user data across nodes. This step need only be performed on one of the NorthStar nodes (NS11, for example) in the cluster.
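
    For example, on NS11 (limiting the repair to the system_auth keyspace is an assumption; a full nodetool repair also works):

      source /opt/northstar/northstar.env
      nodetool repair system_auth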
  15. Add a new user called “northstar” in the Cassandra database.

    Create a user called “northstar” with the assigned credential. In this example, the user “northstar” is assigned the password “Embe1mpls”. Configuring this user need only be done on one server in the cluster (NS11, for example). The password information is replicated across all nodes in all data centers in the cluster.

    Verify that user “northstar” has been created in all nodes in all data centers:
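
    For example (granting SUPERUSER is an assumption; use the privileges your deployment requires):

      cqlsh <node-IP> -u cassandra -p Embe1mpls
      cassandra@cqlsh> CREATE USER northstar WITH PASSWORD 'Embe1mpls' SUPERUSER;

      # On any other node, confirm that the user has been replicated:
      cqlsh <other-node-IP> -u cassandra -p Embe1mpls
      cassandra@cqlsh> LIST USERS;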

  16. Modify the northstar.cfg file to use the “northstar” user.

    For the NorthStar application to access the Cassandra database using the new “northstar” user, you must first change the db_username to “northstar” in the northstar.cfg file. This change must be implemented in all nodes in all data centers.
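
    For example, in northstar.cfg (commonly under /opt/northstar/data/; confirm the location in your installation):

      db_username=northstar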

  17. Change the replication factor.

    Use cqlsh to change the default replication strategy, “SimpleStrategy”, to “NetworkTopologyStrategy” so that replicas are defined per data center. The “system_auth” keyspace is replicated to all nodes in all data centers for purposes of authentication. The other keyspaces are replicated to two nodes per data center, with the exception of the “system_traces” keyspace, which is replicated to only one node per data center.

    Changing the replication factor need only be done on one of the nodes in one of the data centers (NS11, for example). The new replication factor information is updated across all nodes in all data centers in the cluster.
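
    A sketch of the corresponding statements for our nine-node example, assuming the keyspaces present at this point are the system keyspaces shown below (adjust the keyspace names and per-data-center counts to your deployment):

      cassandra@cqlsh> ALTER KEYSPACE system_auth WITH replication =
                       {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 3};
      cassandra@cqlsh> ALTER KEYSPACE system_distributed WITH replication =
                       {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 2, 'DC3': 2};
      cassandra@cqlsh> ALTER KEYSPACE system_traces WITH replication =
                       {'class': 'NetworkTopologyStrategy', 'DC1': 1, 'DC2': 1, 'DC3': 1};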

  18. Initialize the Cassandra keyspace and tables.

    Select one of the servers (NS11, for example) to initialize the Cassandra database using the custom script, init_db.sh. The information is then replicated across all nodes in all data centers in the cluster.
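
    For example, on NS11 (the location of init_db.sh depends on your installation; the path shown here is hypothetical):

      source /opt/northstar/northstar.env
      /opt/northstar/utils/init_db.sh     # hypothetical path to the custom script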

  19. Verify the changes to the replication factor.

    Use the cqlsh client to verify that the new replication strategy has been applied:
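
    For example, the keyspace definitions should now show NetworkTopologyStrategy with the per-data-center replication counts configured above:

      cqlsh <node-IP> -u northstar -p Embe1mpls
      northstar@cqlsh> DESCRIBE KEYSPACE system_auth;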

  20. Use a Cassandra tool to update the replicas.

    Cassandra comes with a useful tool called “nodetool” that enables you to manage the Cassandra database, including repairing nodes and troubleshooting. Select one of the nodes in one of the data centers and perform nodetool repair with the dc parallel option. The tool compares the replicas with each other and updates all the data to the most recent version, ensuring data consistency across the cluster. It can take time for the data to be replicated across the cluster, depending on the discrepancies discovered. The repair update activity is logged to /opt/northstar/logs/dbRepair.log.

    Nodetool offers additional options as well, as shown in this example.

    Note

    Be sure to source the environment variables before using nodetool.
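
    For example, from one node:

      source /opt/northstar/northstar.env
      nodetool repair -dcpar     # repair the data centers in parallel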

  21. To resume services, restart the stopped processes.

    Restart the stopped processes on the active node.
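
    For example, assuming the process groups stopped earlier were northstar: and northstar_pcs: (adjust to the processes you actually stopped):

      supervisorctl start northstar:* northstar_pcs:*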