View Cluster Health
You can view the health status of the Routing Director deployment cluster on the GUI without
having to log in to a node VM. The Deployment Shell request deployment
health-check command retrieves and displays the health status of the Routing Director
deployment cluster on the CLI. In addition, a cron job runs hourly by default and
performs the same check. The results of every health-check operation, whether run
manually or by the cron job, are stored in a health-check database.
You can view the result of the last executed cluster health-check operation on the Routing Director GUI banner in the format Cluster Health: Health-Status. The Health-Status can be GREEN, AMBER, RED, or UNKNOWN. If any parameter check status is empty or has a status other than OK, AMBER, or NOT OK, the overall cluster status in the GUI is displayed as UNKNOWN.
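The UNKNOWN rule described above can be sketched in Python. Only the UNKNOWN rule comes from this page. The mapping from parameter statuses to GREEN, AMBER, and RED (worst status wins), the handling of an empty result set, and the name overall_cluster_status are assumptions made for illustration, not documented product behavior.

```python
from typing import Iterable, Optional

# Parameter statuses that this page names as valid.
VALID_PARAMETER_STATUSES = {"OK", "AMBER", "NOT OK"}


def overall_cluster_status(parameter_statuses: Iterable[Optional[str]]) -> str:
    """Derive an overall status from per-parameter check statuses."""
    statuses = list(parameter_statuses)

    # Assumption: with no parameter results at all, nothing can be concluded.
    if not statuses:
        return "UNKNOWN"

    # Rule from the text: an empty status, or any status other than
    # OK, AMBER, or NOT OK, makes the overall status UNKNOWN.
    if any(status not in VALID_PARAMETER_STATUSES for status in statuses):
        return "UNKNOWN"

    # Assumption: the worst parameter status decides the overall color.
    if "NOT OK" in statuses:
        return "RED"
    if "AMBER" in statuses:
        return "AMBER"
    return "GREEN"
```

For example, under these assumptions a result set of OK, AMBER, and an empty status yields UNKNOWN, because the empty status takes precedence over everything else.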
Click the Cluster Health button to view the following information on the deployment cluster health status.
Cluster Health Status
To access the Cluster Health Checks page, click Cluster Health > Cluster Health Status on the banner.
This page displays the result and output of the most recent health-check command. The latest health-check status is retrieved from the health-check database and displayed irrespective of whether the cluster health was checked manually or through the cron job. Expand Status Details to view complete output of the health-check command. The output on the GUI matches the command output on the CLI.
When a health check is in progress, a clock symbol (⏱) appears at the top right of the Health-Status currently shown on the GUI banner. The Cluster Health Checks page indicates that a health check is underway and displays a completion graph on the right that tracks the percentage completion of the health-check command. Below the graph, you can view the number of parameters validated at that point in time against the total number of parameters that the health-check command validates. The page auto-refreshes to keep the status current. When the graph reaches 100%, the page displays the result of the health-check operation with an option to view the complete status details.
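The relationship between the completion graph and the counter below it is a simple ratio of validated parameters to total parameters. The sketch below illustrates that arithmetic; the function names and the label wording are hypothetical and are not taken from the product.

```python
def completion_percentage(validated: int, total: int) -> float:
    """Return the share of parameters validated so far, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of parameters")
    if not 0 <= validated <= total:
        raise ValueError("validated must be between 0 and total")
    return 100.0 * validated / total


def progress_label(validated: int, total: int) -> str:
    """Build a hypothetical label such as '12 of 48 checks (25%)'."""
    percent = completion_percentage(validated, total)
    return f"{validated} of {total} checks ({percent:.0f}%)"
```

When validated equals total, the percentage reaches 100, which corresponds to the point at which the page switches to showing the final result.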
When the health check completes, the clock icon disappears, the banner displays the final health status, and the Cluster Health Checks page displays the latest result.
Table 1 describes the fields displayed on the Cluster Health Checks page.
| Field | Description |
|---|---|
| Overall Cluster Status | Displays the final output of the latest cluster health-check operation. The status can be GREEN, AMBER, RED, or UNKNOWN. A green status indicates a healthy cluster, and a red status indicates serious issues in the cluster. An amber status indicates that there may be certain noncritical issues in the cluster. The health-check command validates multiple parameters when executed. Each parameter is assigned a status of OK, AMBER, or NOT OK. If any parameter returns a status outside these values, the overall cluster status is set to UNKNOWN. |
| Status Details | Expand Status Details to view the detailed and complete output of the health-check command. The health-check command validates multiple parameters such as the status of the node (readiness, diskpressure, pod-status, CPU and memory usage, taints-status, and I/O latency), status of databases and services, and so on. The output matches the command output on the CLI. |
| Last Checked | Displays the date and time stamp of when the latest health-check operation was started in the yyyy-mm-dd hh:mm:ss format. |
| When a health check is in progress | |
| Current Status | Displays IN PROGRESS indicating that a health check is currently ongoing. |
| Previous Status | Displays the result of the health-check operation completed prior to the ongoing one. This is the status visible on the banner. |
| Timestamp | Displays the initiation date and time stamp of the ongoing health-check operation in the mmm dd, yyyy, hh:mm:ss format. |
| Execution Mode | Displays whether the ongoing health-check operation is user initiated (manual) or automatic (cron). |
| Session ID | A unique identifier to identify the ongoing health-check operation. The session ID is used to retrieve the health-check status using APIs. |
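Table 1 uses two different timestamp layouts: yyyy-mm-dd hh:mm:ss for Last Checked and mmm dd, yyyy, hh:mm:ss for Timestamp. The sketch below shows how the same instant might look in each layout using Python's standard strftime codes. It assumes a 24-hour clock, a zero-padded day, and English month abbreviations, none of which this page confirms; the function names are hypothetical.

```python
from datetime import datetime

# Assumed strftime equivalents of the two layouts named in Table 1.
LAST_CHECKED_FORMAT = "%Y-%m-%d %H:%M:%S"   # yyyy-mm-dd hh:mm:ss
TIMESTAMP_FORMAT = "%b %d, %Y, %H:%M:%S"    # mmm dd, yyyy, hh:mm:ss


def format_last_checked(moment: datetime) -> str:
    """Render a datetime in the Last Checked layout."""
    return moment.strftime(LAST_CHECKED_FORMAT)


def format_timestamp(moment: datetime) -> str:
    """Render a datetime in the Timestamp layout.

    %b depends on the process locale; the default C locale yields
    English abbreviations such as 'Mar'.
    """
    return moment.strftime(TIMESTAMP_FORMAT)


started = datetime(2025, 3, 7, 14, 5, 9)
last_checked_text = format_last_checked(started)
timestamp_text = format_timestamp(started)
```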
Cluster Health Check History
To access the Cluster Health Check History page, click Cluster Health > Cluster Health Check History on the banner.
By default, this page displays details of the last five health-check operations. You can choose to display the details of up to 10 health-check operations.
Tasks You Can Perform
- Click Data Points to view the history of the last 10 health-check operations. By default, five is selected and displayed.
- Select any health-check entry in the History table and click More > Details to view the complete status output of the selected health-check operation on the Health Timeline Detail page.
  The Health Timeline Detail page displays the overall cluster status and the date and timestamp that the health-check operation was initiated. You can also expand Status Details to view the complete output of the health-check command.
- Mouse over any health-check entry and click the Details icon that appears to the left of the entry to view the complete status output of the selected health-check operation on the Health Timeline Detail page.
- Click ⋮ > Show/Hide Columns to view the list of columns that you can display in the History table. Select or clear the check box next to a column to show or hide the column, respectively.
| Field | Description |
|---|---|
| Session ID | The unique identifier to identify the health-check operation. |
| Timestamp | The date and time stamp of the health-check operation in the mmm dd, yyyy, hh:mm:ss format. |
| Overall Status | The health-check operation result. |
| Execution Mode | Whether the health-check operation was user initiated (manual) or automatic (cron). |
| Total Checks | Total number of parameters that can be validated during the health-check operation. |
| Completed Checks | Number of parameters that were actually validated during the health-check operation. |
| OK Count | Number of validated parameters with an OK status. |
| Amber Count | Number of parameters with an Amber status. |
| Red Count | Number of parameters with a Red status. |
| Success Rate | Percentage completion of the health-check operation. |
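To make the History table fields and the Data Points selection concrete, the sketch below models a history that keeps at most 10 entries and returns the five most recent by default. The class names, field names, and in-memory storage are hypothetical; this page does not describe how the health-check database is organized. Success Rate is computed as completed checks over total checks, following the field description in the table.

```python
from collections import deque
from dataclasses import dataclass
from typing import List

MAX_DATA_POINTS = 10      # largest history size selectable on the page
DEFAULT_DATA_POINTS = 5   # number of entries displayed by default


@dataclass(frozen=True)
class HistoryEntry:
    """One row of the History table (hypothetical field names)."""
    session_id: str
    overall_status: str
    execution_mode: str   # "manual" or "cron"
    total_checks: int
    completed_checks: int
    ok_count: int
    amber_count: int
    red_count: int

    @property
    def success_rate(self) -> float:
        """Percentage completion: completed checks over total checks."""
        if self.total_checks == 0:
            return 0.0
        return 100.0 * self.completed_checks / self.total_checks


class HealthCheckHistory:
    """Keeps only the most recent MAX_DATA_POINTS entries."""

    def __init__(self) -> None:
        self._entries: deque = deque(maxlen=MAX_DATA_POINTS)

    def record(self, entry: HistoryEntry) -> None:
        # The oldest entry is dropped automatically once the deque is full.
        self._entries.append(entry)

    def latest(self, data_points: int = DEFAULT_DATA_POINTS) -> List[HistoryEntry]:
        """Return up to data_points entries, newest first."""
        if not 1 <= data_points <= MAX_DATA_POINTS:
            raise ValueError("data_points must be between 1 and 10")
        return list(self._entries)[-data_points:][::-1]
```

A bounded deque is used here so that older operations fall out of the history without explicit cleanup, which mirrors the 10-entry ceiling described on this page.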
Settings
To access the Settings page, click Cluster Health > Settings on the banner.
Use this page to control whether the GUI displays a notification message when the deployment cluster health status changes. Enable the Show Notifications on Status Change toggle to be notified of any change in the cluster health. When the toggle is enabled and the cluster health status changes, the GUI displays a message at the top. For example:
The cluster health status changed from GREEN to AMBER.
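The notification behavior can be sketched as a comparison between the previous and current status, gated by the toggle. The function name and parameters below are hypothetical; only the message wording is taken from the example above.

```python
from typing import Optional


def status_change_message(
    previous: str, current: str, notifications_enabled: bool
) -> Optional[str]:
    """Return the banner message for a status change, or None if none is due."""
    # No message when the toggle is off or the status did not change.
    if not notifications_enabled or previous == current:
        return None
    return f"The cluster health status changed from {previous} to {current}."
```

Returning None for the unchanged case keeps the caller free to skip rendering entirely rather than display an empty message.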