Reboot Nodes in Paragon Automation
Read these instructions to reboot the nodes in your Paragon Automation cluster. You reboot the nodes one at a time. Follow these steps to reboot a node:
1. Back up your current Paragon Automation cluster data.
root@primary-node:~# data.sh --backup
2. Copy the backed-up data to a secure secondary server outside the cluster. The output of the data.sh script includes the location of the backup file. Run the scp -prv command to copy the backup file from the local host to the secondary server outside the cluster.
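For example, assuming the backup was written to a directory reported by data.sh and the secondary server is reachable as backup-server (the path, user, and host below are illustrative, not from the product documentation), the copy might look like this:
root@primary-node:~# scp -prv /<backup-directory> user@backup-server:/var/paragon-backups/
The -p option preserves file timestamps and modes, -r copies the directory recursively, and -v prints verbose progress output.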
3. Check for any errors in the pods by using the health-check.sh script.
root@primary-node:~# health-check.sh
4. Use the kubectl get nodes command to view the status of the cluster nodes. The status of the nodes must be Ready, and the roles must be either control-plane or none.
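Sample output is shown below; the node addresses match the examples later in this topic, but which nodes hold the control-plane role, the ages, and the versions are illustrative:
root@primary-node:~# kubectl get nodes
NAME            STATUS   ROLES           AGE   VERSION
172.25.152.18   Ready    control-plane   42d   v1.25.6
172.25.152.19   Ready    control-plane   42d   v1.25.6
172.25.152.20   Ready    <none>          42d   v1.25.6
172.25.152.21   Ready    <none>          42d   v1.25.6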
5. Cordon off the primary node to remove it from scheduling.
Cordoning a Kubernetes node marks it as unavailable to the Kubernetes scheduler, preventing it from hosting any new pods. This is useful when you need to perform maintenance on a node without affecting the currently running pods.
root@primary-node:~# kubectl cordon <ip-address>
node/<ip-address> cordoned
The command cordons the node, making it ineligible to host any new pods.
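To verify that the node is cordoned, you can list it again; a cordoned node reports SchedulingDisabled alongside its status (output is illustrative):
root@primary-node:~# kubectl get nodes <ip-address>
NAME            STATUS                     ROLES           AGE   VERSION
172.25.152.18   Ready,SchedulingDisabled   control-plane   42d   v1.25.6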
6. After cordoning a node, drain it to evict the running pods and reschedule them onto other nodes. Use the following command to drain the node (safely evict all pods from the node).
root@primary-node:~# kubectl drain <node-name/ip-address> --ignore-daemonsets --grace-period=0 --force --delete-emptydir-data
The --ignore-daemonsets option skips pods managed by DaemonSets (which cannot be evicted), --force evicts pods that are not managed by a controller, and --delete-emptydir-data deletes pods that use emptyDir volumes.
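The drain command prints a line for each pod it evicts and ends with a drained message. Typical output looks similar to the following sketch (the evicted pod name is taken from the sample output later in this topic and is illustrative):
root@primary-node:~# kubectl drain 172.25.152.18 --ignore-daemonsets --grace-period=0 --force --delete-emptydir-data
node/172.25.152.18 already cordoned
evicting pod kube-system/backup-29140585-t5s6g
pod/backup-29140585-t5s6g evicted
node/172.25.152.18 drained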
7. Identify whether any pods are waiting to be rescheduled.
root@primary-node:~# kubectl get po -A -o wide | grep -v Running | grep -v Completed
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE   IP   NODE   NOMINATED NODE   READINESS GATES
If only the header line is displayed, as in this example, no pods are waiting to be rescheduled. Pods that are pending on the cordoned node are listed in the output; fields that do not have a value are marked <none>. For example, the following output shows only Completed pods, which have finished running and do not need to be rescheduled:
[root@rhel-84-node1 ~]# kubectl get po -A -o wide | grep -v Running
NAMESPACE     NAME                                        READY   STATUS      RESTARTS   AGE     IP              NODE            NOMINATED NODE   READINESS GATES
auditlog      auditlog-purge-cron-29139840-g7l4r          0/1     Completed   0          12h     10.244.2.158    172.25.152.20   <none>           <none>
ems           jobmanager-purge-cron-29138400-f4ln5        0/1     Completed   0          36h     10.244.2.175    172.25.152.20   <none>           <none>
ems           jobmanager-purge-cron-29139840-drllc        0/1     Completed   0          12h     10.244.2.176    172.25.152.20   <none>           <none>
kube-system   backup-29140575-p6tpx                       0/1     Completed   0          14m     172.25.152.18   172.25.152.18   <none>           <none>
kube-system   backup-29140580-24pq5                       0/1     Completed   0          9m50s   172.25.152.18   172.25.152.18   <none>           <none>
kube-system   backup-29140585-t5s6g                       0/1     Completed   0          4m50s   172.25.152.18   172.25.152.18   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.18-g72hc   0/1     Completed   0          4h55m   10.244.142.9    172.25.152.18   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.19-9vvx4   0/1     Completed   0          4h55m   10.244.227.21   172.25.152.19   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.20-sklvr   0/1     Completed   0          4h55m   10.244.2.174    172.25.152.20   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.21-q7vdx   0/1     Completed   0          4h55m   10.244.91.143   172.25.152.21   <none>           <none>
8. If there are no pods waiting to be rescheduled, recheck for any errors in the pods by using the health-check.sh script.
root@primary-node:~# health-check.sh
9. Reboot the cordoned node.
The node takes roughly 5 to 10 minutes to reboot.
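The documentation does not mandate a specific reboot method; one option, assuming you have SSH access to the node as root, is to reboot it remotely:
root@primary-node:~# ssh root@<ip-address> reboot
While the node is down, kubectl get nodes reports its status as NotReady,SchedulingDisabled; once the node is back up, the status returns to Ready,SchedulingDisabled because the node is still cordoned.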
10. Uncordon the rebooted node to return it to scheduling. Run the following command on primary node 1.
root@primary-node:~# kubectl uncordon <ip-address>
The pods in the cluster are redistributed within 15 minutes of running the command.
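You can confirm that the node is schedulable again by checking that SchedulingDisabled no longer appears in its status (output is illustrative):
root@primary-node:~# kubectl get nodes <ip-address>
NAME            STATUS   ROLES           AGE   VERSION
172.25.152.18   Ready    control-plane   42d   v1.25.6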
11. After the pods are redistributed, check for any errors in the pods by using the health-check.sh script.
root@primary-node:~# health-check.sh
12. Identify the newly rebooted node.
root@primary-node:~# kubectl get po -A -o wide | grep -v Running
NAMESPACE     NAME                                        READY   STATUS      RESTARTS   AGE     IP              NODE            NOMINATED NODE   READINESS GATES
auditlog      auditlog-purge-cron-29139840-g7l4r          0/1     Completed   0          12h     10.244.2.158    172.25.152.20   <none>           <none>
ems           jobmanager-purge-cron-29138400-f4ln5        0/1     Completed   0          36h     10.244.2.175    172.25.152.20   <none>           <none>
ems           jobmanager-purge-cron-29139840-drllc        0/1     Completed   0          12h     10.244.2.176    172.25.152.20   <none>           <none>
kube-system   backup-29140575-p6tpx                       0/1     Completed   0          14m     172.25.152.18   172.25.152.18   <none>           <none>
kube-system   backup-29140580-24pq5                       0/1     Completed   0          9m50s   172.25.152.18   172.25.152.18   <none>           <none>
kube-system   backup-29140585-t5s6g                       0/1     Completed   0          4m50s   172.25.152.18   172.25.152.18   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.18-g72hc   0/1     Completed   0          4h55m   10.244.142.9    172.25.152.18   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.19-9vvx4   0/1     Completed   0          4h55m   10.244.227.21   172.25.152.19   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.20-sklvr   0/1     Completed   0          4h55m   10.244.2.174    172.25.152.20   <none>           <none>
rook-ceph     rook-ceph-osd-prepare-172.25.152.21-q7vdx   0/1     Completed   0          4h55m   10.244.91.143   172.25.152.21   <none>           <none>
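As an alternative to filtering with grep, you can list only the pods scheduled on the rebooted node and inspect the AGE column; recently restarted pods indicate the node that came back up (a sketch, not part of the documented procedure):
root@primary-node:~# kubectl get po -A -o wide --field-selector spec.nodeName=<ip-address>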
13. Repeat step 3 through step 12 to reboot the other nodes in Paragon Automation.