Frequently Asked Questions: Cluster Administration

 

This section answers frequently asked questions about administering high availability clusters.



What does quorum mean and why is it necessary?

Quorum is a voting algorithm used by the cluster manager.

A cluster can function correctly only if there is general agreement among the cluster members about quorum rules. A cluster has quorum if a majority of its nodes are operational, communicating, and agree on the active cluster members. For example, in a 13-node cluster, quorum is reached only if seven or more nodes are communicating. If the number of communicating nodes drops to six, the cluster loses quorum and can no longer function.

A cluster must maintain quorum to prevent split-brain problems. For example, if quorum rules were not enforced in the same 13-node cluster and a communication error partitioned it, seven nodes might operate on the shared disk while the other six nodes operate on it independently. Because the two partial clusters cannot coordinate, they would overwrite areas of the disk and corrupt the file system. If quorum rules are enforced, only one of the partial clusters can use the shared storage, which protects data integrity.

Quorum rules do not prevent the communication split itself, but they do determine which group of members is dominant and allowed to function. If a split occurs, quorum ensures that no more than one cluster group takes action.
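
On a running cluster that uses the cman cluster manager, you can check the current vote counts and the quorum threshold with cman_tool. The output below is an illustrative sketch for the 13-node example; the exact fields vary by release:

    # cman_tool status | egrep "Nodes|votes|Quorum"
    Nodes: 13
    Expected votes: 13
    Total votes: 13
    Node votes: 1
    Quorum: 7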



What is the minimum size of a quorum disk or partition?

The official minimum size for a quorum disk or partition is 10 MB. The space actually used is approximately 100 KB; however, we recommend reserving at least 10 MB to allow for future expansion and features.
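
As an illustrative sketch (the device name and label are hypothetical), you can initialize a dedicated partition as a quorum disk with the mkqdisk utility, giving it a label that the cluster's quorum daemon references:

    # mkqdisk -c /dev/sdb1 -l myqdisk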



How can I rename my cluster?

To rename your cluster, complete the following steps (a consolidated command example follows the list):

  1. Unmount all GFS partitions and stop all clustering software on all nodes in the cluster.
  2. In the /etc/cluster/cluster.conf file, change the old cluster name to the new cluster name.
  3. If you have GFS partitions in your cluster, issue the following command to change their superblock to use the new cluster name:
    # gfs_tool sb /dev/vg_name/gfs1 table new_cluster_name:gfs1
  4. Restart the clustering software on all nodes in the cluster.
  5. Remount your GFS partitions.
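
Assuming the cman-based cluster stack and a single GFS file system mounted at /mnt/gfs1 (the mount point, volume names, and service list are examples; adjust them to your configuration), the procedure might look like this:

    # umount /mnt/gfs1                                    # on every node
    # service rgmanager stop; service gfs stop            # on every node
    # service clvmd stop; service cman stop               # on every node
    # vi /etc/cluster/cluster.conf                        # change the cluster name
    # gfs_tool sb /dev/vg_name/gfs1 table new_cluster_name:gfs1   # once, from one node (step 3)
    # service cman start; service clvmd start             # on every node
    # service gfs start; service rgmanager start          # on every node
    # mount /mnt/gfs1                                     # on every node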


If both nodes in a two-node cluster lose contact with each other, don’t they try to fence each other?

Yes. When each node recognizes that the other node has stopped responding, it tries to fence the other node. Fencing is the process of separating an unavailable or malfunctioning cluster node from the resources it manages, without requiring the cooperation of the node being fenced. When used in combination with a quorum disk, fencing can prevent resources from being improperly used in a high availability cluster.

The node that fences the other node first “wins” and becomes dominant. However, if both nodes succeed in fencing each other simultaneously, both go down and the entire cluster is lost.

To avoid cluster loss in a two-node cluster, you can use an Intelligent Platform Management Interface (IPMI) LAN that serializes the two fencing operations, ensuring that one node reboots and the other node never fences the first.
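
As a hypothetical sketch (node names, IP addresses, and credentials are placeholders), a two-node cluster.conf can serialize fencing by adding a delay to the fence device of the node you want to survive a simultaneous race; the fence_ipmilan agent accepts a delay parameter for this purpose:

    <cluster name="new_cluster_name" config_version="1">
      <cman two_node="1" expected_votes="1"/>
      <clusternodes>
        <clusternode name="node1" nodeid="1">
          <fence>
            <method name="ipmi">
              <!-- fencing node1 is delayed 15 s, so node1 wins a simultaneous race -->
              <device name="ipmi1" delay="15"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node2" nodeid="2">
          <fence>
            <method name="ipmi">
              <device name="ipmi2"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <fencedevice agent="fence_ipmilan" name="ipmi1" ipaddr="10.0.0.1" login="admin" passwd="example"/>
        <fencedevice agent="fence_ipmilan" name="ipmi2" ipaddr="10.0.0.2" login="admin" passwd="example"/>
      </fencedevices>
    </cluster>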



Where can I get more information?

For more information about cluster administration, see the Red Hat Enterprise Linux 6 cluster documentation.