Troubleshoot Paragon Automation Installation
SUMMARY This topic provides a general guide to troubleshooting some typical problems you might encounter during and after installation.
Resolve Merge Conflicts of the Configuration File
The init script creates the template configuration files. If you update an existing installation using the same config-dir directory that was used for the installation, the template files that the init script creates are merged with the existing configuration files. Sometimes, this merging action creates a merge conflict that you must resolve. The script prompts you about how to resolve the conflict. When prompted, select one of the following options:
C—You can retain the existing configuration file and discard the new template file. This is the default option.
n—You can discard the existing configuration file and reinitialize the template file.
m—You can merge the files manually. Conflicting sections are marked with lines starting with "<<<<<<<<", "||||||||", "========", and ">>>>>>>>" (see the illustration after this list). You must edit the file and remove the merge markers before you proceed with the update.
d—You can view the differences between the files before you decide how to resolve the conflict.
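For illustration, a manually merged configuration file might contain a conflicted section similar to the following schematic. The placeholder lines are not from a real file; the actual content depends on your configuration:
<<<<<<<<
    value from your existing configuration file
||||||||
    value from the original template
========
    value from the new template
>>>>>>>>
Keep only the lines you want, delete the marker lines, and then rerun the update.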
Common Backup and Restore Issues
If you destroy an existing cluster and redeploy a software image on the same cluster nodes, restoring a configuration from a previously backed up configuration folder might fail. The restore operation fails because the mount path of the backed-up configuration has changed. When you destroy an existing cluster, the persistent volume is deleted. When you redeploy the new image, the persistent volume is re-created on whichever cluster node has available space, which is not necessarily the node on which it was previously present. Hence, the restore operation fails.
As a workaround (see the sketch after these steps):
Determine the mount path of the new persistent volume.
Copy the contents of the previous persistent volume's mount path to the new path.
Retry the restore operation.
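As a minimal sketch, assuming the persistent volume is backed by local storage on a cluster node and that the volume name and paths shown here are placeholders, you might locate the new mount path and copy the data as follows:
root@primary-node:~# kubectl get pv                      # list persistent volumes and identify the re-created volume
root@primary-node:~# kubectl describe pv <pv-name>       # the mount path appears in the volume source details
root@cluster-node:~# cp -a <old-mount-path>/. <new-mount-path>/   # copy the backed-up data to the new path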
View Installation Log Files
If the deploy script fails, you must check the installation log files in the config-dir directory. By default, the config-dir directory stores six zipped log files. The current log file is saved as log, and the previous log files are saved as log.1 through log.5. Every time you run the deploy script, the current log is saved and the oldest one is discarded.
Error messages are typically found at the end of a log file. View the error message, and fix the configuration.
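For example, assuming your configuration directory is named config-dir (the path is a placeholder), you might inspect the end of the most recent log file as follows:
root@control-host:~# tail -n 50 config-dir/log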
View Log Files in Kibana
System logs are stored in Elasticsearch and can be accessed through the Kibana application. To view logs in Kibana:
Troubleshooting using the kubectl Interface
The main interface in the Kubernetes cluster is kubectl, which is installed on a primary node. You can log in to the primary node and use the kubectl interface to access the Kubernetes API, view node details, and perform basic troubleshooting actions. The admin.conf file is copied to the config-dir directory on the control host as part of the installation process.
You can also access the Kubernetes API from any other node that has access to the cluster. To use a node other than the primary node, you must copy the admin.conf file to that node and set the KUBECONFIG environment variable to point to it; for example, use the export KUBECONFIG=config-dir/admin.conf command.
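As a minimal sketch, assuming the control host stores admin.conf in a directory named config-dir and that the hostnames and paths shown here are placeholders, you might run:
root@other-node:~# scp root@control-host:config-dir/admin.conf /root/admin.conf
root@other-node:~# export KUBECONFIG=/root/admin.conf
root@other-node:~# kubectl get no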
SUMMARY Use the following sections to troubleshoot and view installation details using the kubectl interface.
- View node status
- View pod status
- View detailed information about a pod
- View the logs for a container in a pod
- Run a command on a container in a pod
- View services
- Frequently used kubectl commands
View node status
Use the kubectl get no command to view the status of the cluster nodes. The status of the nodes must be Ready, and the roles should be either control-plane or none. For example:
root@primary-node:~# kubectl get no
NAME          STATUS   ROLES                  AGE    VERSION
10.49.xx.x1   Ready    control-plane,master   5d5h   v1.20.4
10.49.xx.x6   Ready    <none>                 5d5h   v1.20.4
10.49.xx.x7   Ready    <none>                 5d5h   v1.20.4
10.49.xx.x8   Ready    <none>                 5d5h   v1.20.4
If a node is not Ready, verify whether the kubelet process is running. You can also use the system log of the node to investigate the issue.
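For example, on a node that is not Ready, and assuming the node runs systemd, you might check the kubelet service and its recent log messages as follows:
root@cluster-node:~# systemctl status kubelet
root@cluster-node:~# journalctl -u kubelet --since "1 hour ago"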
View pod status
Use the kubectl get po -n namespace | -A command to view the status of a pod. You can specify an individual namespace (such as healthbot, northstar, and common), or you can use the -A parameter to view the status of all namespaces.
For example:
root@primary-node:~# kubectl get po -n northstar
NAME                           READY   STATUS    RESTARTS   AGE
bmp-854f8d4b58-4hwx4           3/3     Running   1          30h
dcscheduler-55d69d9645-m9ncf   1/1     Running   1          7h13m
The status of healthy pods must be displayed as Running or Completed, and the number of ready containers should match the total. If the status of a pod is not Running, or if the number of ready containers does not match the total, use the kubectl describe po command to troubleshoot the issue further.
View detailed information about a pod
Use the kubectl describe po -n namespace pod-name command to view detailed information about a specific pod. For example:
root@primary-node:~# kubectl describe po -n northstar bmp-854f8d4b58-4hwx4
Name:         bmp-854f8d4b58-4hwx4
Namespace:    northstar
Priority:     0
Node:         10.49.xx.x1/10.49.xx.x1
Start Time:   Mon, 10 May 2021 07:11:17 -0700
Labels:       app=bmp
              northstar=bmp
              pod-template-hash=854f8d4b58
…
View the logs for a container in a pod
Use the kubectl logs -n namespace pod-name [-c container-name] command to view the logs for a particular pod. If a pod has multiple containers, you must specify the container for which you want to view the logs. For example:
root@primary-node:~# kubectl logs -n common atom-db-0 | tail -3
2021-05-31 17:39:21.708 36 LOG {ticks: 0, maint: 0, retry: 0}
2021-05-31 17:39:26,292 INFO: Lock owner: atom-db-0; I am atom-db-0
2021-05-31 17:39:26,350 INFO: no action. i am the leader with the lock
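To find the container names of a multi-container pod before requesting its logs, you might use a command similar to the following. The bmp pod is taken from the earlier example; the container name is a placeholder:
root@primary-node:~# kubectl get po -n northstar bmp-854f8d4b58-4hwx4 -o jsonpath='{.spec.containers[*].name}'
root@primary-node:~# kubectl logs -n northstar bmp-854f8d4b58-4hwx4 -c <container-name> | tail -3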
Run a command on a container in a pod
Use the kubectl exec -ti -n namespace pod-name [-c container-name] -- command-line command to run commands on a container inside a pod. For example:
root@primary-node:~# kubectl exec -ti -n common atom-db-0 -- bash
 ____        _ _
/ ___| _ __ (_) | ___
\___ \| '_ \| | |/ _ \
 ___) | |_) | | | (_) |
|____/| .__/|_|_|\___/
      |_|
This container is managed by runit, when stopping/starting services use sv
Examples:
sv stop cron
sv restart patroni
Current status: (sv status /etc/service/*)
run: /etc/service/cron: (pid 29) 26948s
run: /etc/service/patroni: (pid 27) 26948s
run: /etc/service/pgqd: (pid 28) 26948s
root@atom-db-0:/home/postgres#
After you run the exec command, you get a bash shell into the Postgres database server. You can access the bash shell inside the container and run commands to connect to the database itself. Not all containers provide a bash shell. Some containers provide only SSH, and some containers do not have any shells.
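For example, from the shell inside the atom-db-0 container you might list the databases with the psql client. This is a sketch that assumes the psql client is available in the container and that the default postgres user has access:
root@atom-db-0:/home/postgres# psql -U postgres -c '\l'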
View services
Use the kubectl get svc -n namespace | -A command to view the cluster services. You can specify an individual namespace (such as healthbot, northstar, and common), or you can use the -A parameter to view the services for all namespaces.
For example:
root@primary-node:~# kubectl get svc -A --sort-by spec.type
NAMESPACE    NAME                    TYPE           EXTERNAL-IP    PORT(S)                                                    …
healthbot    tsdb-shim               LoadBalancer   10.54.xxx.x3   8086:32081/TCP
healthbot    ingest-snmp-proxy-udp   LoadBalancer   10.54.xxx.x3   162:32685/UDP
healthbot    hb-proxy-syslog-udp     LoadBalancer   10.54.xxx.x3   514:31535/UDP
ems          ztpservicedhcp          LoadBalancer   10.54.xxx.x3   67:30336/UDP
ambassador   ambassador              LoadBalancer   10.54.xxx.x2   80:32214/TCP,443:31315/TCP,7804:32529/TCP,7000:30571/TCP
northstar    ns-pceserver            LoadBalancer   10.54.xxx.x4   4189:32629/TCP
…
In this example, the services are sorted by type, and only services of type LoadBalancer are displayed. You can view the services that are provided by the cluster and the external IP addresses that are selected by the load balancer to access those services.
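To check a single service and the external IP address assigned to it, you can query the service directly. The ambassador service shown here is taken from the example above:
root@primary-node:~# kubectl get svc -n ambassador ambassador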
Frequently used kubectl commands
List the deployments and stateful sets:
# kubectl get -n namespace deploy
# kubectl get -n namespace statefulset
Restart a component:
# kubectl rollout restart -n namespace deploy deployment-name
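For example, to restart the dcscheduler deployment in the northstar namespace and watch the restart progress (the deployment name is inferred from the pod name shown earlier and might differ in your installation):
# kubectl rollout restart -n northstar deploy dcscheduler
# kubectl rollout status -n northstar deploy dcscheduler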
Edit a Kubernetes resource: You can edit a deployment or any Kubernetes API object, and these changes are saved to the cluster. However, if you reinstall the cluster, these changes are not preserved.
# kubectl edit -n namespace deploy deployment-name