OpenSearch Fails During Cluster Upgrade
Problem
While upgrading Paragon Automation from an earlier release to the current release, OpenSearch might fail.
Solution
Restart OpenSearch or restore OpenSearch data on the cluster node.
Log in to the Linux root shell of a cluster node and restart OpenSearch.
root@pa1:~# kubectl scale sts -n common opensearch-cluster-master --replicas=0; sleep 120; kubectl scale sts -n common opensearch-cluster-master --replicas=3
statefulset.apps/opensearch-cluster-master scaled
statefulset.apps/opensearch-cluster-master scaled
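The restart above can also be wrapped in a small function that stops at the first failing command and then waits for the pods to report ready. This is only a sketch: the function name restart_opensearch, the 300-second timeout, and the use of kubectl rollout status are assumptions and not part of the documented procedure, and rollout status is assumed to be usable with this StatefulSet's update strategy.

```shell
# Hypothetical helper: scale the OpenSearch StatefulSet down, pause, scale it
# back up to 3 replicas, then wait until the pods are reported ready.
# $1 is the pause in seconds between scale-down and scale-up (default 120,
# matching the sleep in the documented command).
restart_opensearch() {
  kubectl scale sts -n common opensearch-cluster-master --replicas=0 &&
  sleep "${1:-120}" &&
  kubectl scale sts -n common opensearch-cluster-master --replicas=3 &&
  # Assumption: a 300s timeout is enough for the three pods to become ready.
  kubectl rollout status sts/opensearch-cluster-master -n common --timeout=300s
}

# Usage on a cluster node:
# restart_opensearch || echo "OpenSearch restart did not complete cleanly"
```

Chaining with && instead of ; means the scale-up is skipped if the scale-down fails, which makes a failed restart easier to notice.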
Check the OpenSearch health status using the paragon-utils curl http://opensearch-cluster-master.common:9200/_cat/health?v command. If the status is yellow or red, perform the following steps.
List all the pods.
root@pa1:~# kubectl get po -n common
NAME                                      READY   STATUS      RESTARTS       AGE
alert-manager-follower-5ccf865f99-7g7l5   1/1     Running     6 (4d1h ago)   6d23h
alert-manager-leader-659c4c888b-4lr9x     1/1     Running     6 (4d1h ago)   6d23h
atom-db-0                                 2/2     Running     0              6d23h
atom-db-1                                 2/2     Running     0              6d23h
atom-db-2                                 2/2     Running     0              6d23h
backup-server-644cc495dd-kxk26            1/1     Running     0              6d23h
backup-vmdb-0ld7nh-gp9mk                  0/1     Completed   0              5d8h
backup-vmdb-nqkop4-cdnfz                  0/1     Completed   0              5d8h
backup-vmdb-zew5t9-prlbt                  0/1     Completed   0              5d8h
cfssl-76474f7455-bwr7b                    1/1     Running     0              6d23h
common-utils-865547d799-t2w78             1/1     Running     0              6d23h
kafka-0                                   2/2     Running     0              6d23h
kafka-1                                   2/2     Running     0              6d23h
kafka-2                                   2/2     Running     0              6d23h
local-volume-provisioner-6csjw            1/1     Running     0              7d
local-volume-provisioner-g6pzl            1/1     Running     0              7d
local-volume-provisioner-qnlh7            1/1     Running     0              7d
local-volume-provisioner-qsdxb            1/1     Running     0              7d
mailservice-9ff8dff9f-gfk5m               2/2     Running     0              6d23h
nats-0                                    3/3     Running     0              6d23h
nats-1                                    3/3     Running     0              6d23h
nats-2                                    3/3     Running     0              6d23h
nats-box-5886877c65-b486r                 1/1     Running     0              6d23h
opensearch-backup-7f7dcbb77f-2q5p5        1/1     Running     0              6d23h
opensearch-backup-cron-28975680-x8wjp     0/1     Completed   0              17h
opensearch-backup-cron-28976040-z6p9m     0/1     Completed   0              11h
opensearch-backup-cron-28976400-9psp5     0/1     Completed   0              5h48m
opensearch-cleanup-cron-28972800-zqkkr    0/1     Completed   0              2d17h
opensearch-cleanup-cron-28974240-vrl4k    0/1     Completed   0              41h
opensearch-cleanup-cron-28975680-2fppn    0/1     Completed   0              17h
opensearch-cluster-master-0               1/1     Running     0              4d1h
opensearch-cluster-master-1               1/1     Running     0              4d1h
opensearch-cluster-master-2               1/1     Running     0              4d1h
postgres-operator-78dcf4b786-96wdb        1/1     Running     0              6d23h
redis-master-5877db87fd-szvhp             1/1     Running     0              6d23h
zookeeper-0                               1/1     Running     0              6d23h
zookeeper-1                               1/1     Running     0              6d23h
zookeeper-2                               1/1     Running     0              6d23h
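The health check that gates these recovery steps can be scripted rather than read by eye. The sketch below assumes the _cat/health?v response contains a header row with a column named status followed by a data row. The helper names health_status and needs_recovery are hypothetical.

```shell
# Hypothetical helper: print the value of the "status" column from
# _cat/health?v output read on stdin. The column is located by its header
# name, so its position in the row does not matter.
health_status() {
  awk '
    !col { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
    col  { print $col; exit }
  '
}

# Hypothetical helper: succeed (exit 0) when the status calls for the
# recovery steps, that is, when it is yellow or red.
needs_recovery() {
  case "$1" in
    yellow|red) return 0 ;;
    *)          return 1 ;;
  esac
}

# Usage on a cluster node (the curl command is the one given in the step above):
# status=$(paragon-utils curl 'http://opensearch-cluster-master.common:9200/_cat/health?v' | health_status)
# if needs_recovery "$status"; then echo "OpenSearch is $status, continue with the restore steps"; fi
```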
Determine the name of the opensearch-backup-** pod and log in to the pod.
root@pa1:~# kubectl exec -it -n common opensearch-backup-7f7dcbb77f-2q5p5 -- bash
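Because the pod name carries a generated suffix, it can be looked up instead of copied by hand. A minimal sketch, assuming the backup pod is the only Running pod whose name begins with opensearch-backup- and is not one of the opensearch-backup-cron- job pods. The helper name backup_pod_name is hypothetical.

```shell
# Hypothetical helper: read `kubectl get po -n common` output on stdin and
# print the name of the first Running opensearch-backup pod, skipping the
# completed opensearch-backup-cron-* job pods.
backup_pod_name() {
  awk '$1 ~ /^opensearch-backup-/ &&
       $1 !~ /^opensearch-backup-cron-/ &&
       $3 == "Running" { print $1; exit }'
}

# Usage on a cluster node:
# pod=$(kubectl get po -n common --no-headers | backup_pod_name)
# kubectl exec -it -n common "$pod" -- bash
```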
List the OpenSearch backup directories on the pod.
root@opensearch-backup-7f7dcbb77f-2q5p5:/# ls -l /opt/paragon/opensearch-backup
total 0
drwxr-xr-x 4 root root 2 Feb 2 18:00 20250202_180003-priority
drwxr-xr-x 4 root root 2 Feb 3 00:00 20250203_000003-priority
drwxr-xr-x 4 root root 2 Feb 3 06:00 20250203_060002-priority
drwxr-xr-x 4 root root 2 Feb 3 12:00 20250203_120003-priority
drwxr-xr-x 4 root root 2 Feb 3 18:00 20250203_180002-priority
drwxr-xr-x 3 root root 1 Feb 3 18:00 temp
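The directory names in this listing appear to begin with a YYYYMMDD_HHMMSS timestamp, so sorting by name should also sort by age. On that assumption, the most recent backup can be selected with a short function. The helper name latest_backup is hypothetical, and the newest backup is not necessarily the one you want to restore.

```shell
# Hypothetical helper: print the newest backup directory under the given base
# directory (default: /opt/paragon/opensearch-backup).
# Assumes backup directory names start with a YYYYMMDD_HHMMSS timestamp, so
# the shell's sorted glob expansion lists them oldest first.
latest_backup() {
  lb_base=${1:-/opt/paragon/opensearch-backup}
  lb_latest=""
  for lb_dir in "$lb_base"/[0-9]*_[0-9]*-*/; do
    [ -d "$lb_dir" ] || continue   # glob matched nothing
    lb_latest=${lb_dir%/}          # keep the last (newest) match
  done
  [ -n "$lb_latest" ] && printf '%s\n' "$lb_latest"
}

# Usage inside the opensearch-backup pod:
# latest_backup
```

The pattern skips entries such as temp that do not start with a timestamp, and the function returns a non-zero status when no backup directory is found.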
Determine the backup directory that you want to restore and restore the backup.
root@opensearch-backup-7f7dcbb77f-2q5p5:/# common-utils/paragon-opensearch-restore /opt/paragon/opensearch-backup/20250203_180002-priority
2025-02-03 18:28:41 Usage: [backup_directory] [space-seperated indices_list]
2025-02-03 18:28:41 Restoring OpenSearch data from backup: /opt/paragon/opensearch-backup/20250203_180002-priority
2025-02-03 18:28:41 Restoring datastream template
2025-02-03 18:28:42 Warning: Failed to create datastream template for jcloud_alerts_cleared.
2025-02-03 18:28:42
2025-02-03 18:28:42 Restoring datastream metadata
2025-02-03 18:28:42 Warning: Failed to create datastream metadata for jcloud_alerts_cleared.
2025-02-03 18:28:42
2025-02-03 18:28:42 Normal indices Restore complete.
2025-02-03 18:28:42 Restoring indices from datastream
2025-02-03 18:28:42 Data Streams Restore complete.
2025-02-03 18:28:42 Restoring ISM policy
2025-02-03 18:28:43 Restoring ISM policy associations
2025-02-03 18:28:44 Restore completed.