Back Up and Restore
SUMMARY Learn how to use etcdctl commands to back up and restore the etcd database.
We provide these example procedures purely for informational purposes. For more information on backing up and restoring etcd, see the official Kubernetes documentation (https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#backing-up-an-etcd-cluster and https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster).
Back Up the Etcd Database
Use this example procedure to back up the etcd database.
- SSH into one of the control plane nodes in the cluster.
-
Install etcdctl version 3.4.13 or later. Etcdctl is the command-line tool for managing etcd.
If the node already has etcdctl version 3.4.13 or later installed, then you can skip this step.
Otherwise, install etcdctl:
root@cp1:~# etcdctl version
Command 'etcdctl' not found
Download the etcd release archive. In this example, we move the file to the /tmp directory.
curl -L -O https://storage.googleapis.com/etcd/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz
mv etcd-v3.5.0-linux-amd64.tar.gz /tmp
Extract and copy the executable to /usr/local/bin.
mkdir -p /tmp/etcd-download
tar -xzvf /tmp/etcd-v3.5.0-linux-amd64.tar.gz -C /tmp/etcd-download --strip-components=1
cp /tmp/etcd-download/etcdctl /usr/local/bin
Verify by querying the version.
root@cp1:~# etcdctl version
etcdctl version: 3.5.0
API version: 3.5
-
Set the required ETCDCTL environment variables.
These variables are used implicitly by the etcdctl commands. The file paths listed are the default file paths. You can obtain these file paths by issuing the kubectl describe pod <etcd-pod> command, as shown in the example below.
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
export ETCDCTL_API=3
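For example, assuming the etcd pod runs in the kube-system namespace and exposes its certificate paths as command-line flags (as kubeadm-provisioned etcd pods typically do), you can filter the describe output for those flags. The namespace and flag names here are assumptions about your deployment:
kubectl describe pod <etcd-pod> -n kube-system | grep -E 'cert-file|key-file|trusted-ca-file'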
-
Set permissions on the certificate files.
chmod 777 /etc/kubernetes/pki/etcd/server.key
chmod 777 /etc/kubernetes/pki/etcd/ca.crt
- Repeat step 1 to step 4 on all the control plane nodes.
-
Back up the etcd database.
SSH back into one of the control plane nodes and take a snapshot of the etcd database.
This takes a snapshot of the database and stores it in /tmp/etcdBackup.db.
etcdctl snapshot save /tmp/etcdBackup.db --endpoints=https://127.0.0.1:2379
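Optionally, verify that the snapshot file is intact before copying it off the node. This is a quick check using the etcdctl snapshot status subcommand; the table output format is only for readability:
etcdctl snapshot status /tmp/etcdBackup.db --write-out=table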
- Copy the snapshot off the node and store it in a safe place.
Restore the Etcd Database
Use this example procedure to restore the etcd database from a snapshot.
-
Restore the snapshot on all the control plane nodes.
- SSH into one of the control plane nodes.
- Copy the saved snapshot to /tmp/etcdBackup.db (for example).
-
Restore the backup.
etcdctl snapshot restore /tmp/etcdBackup.db \
  --name=<cp1-etcd-pod> \
  --initial-cluster=<cp1-etcd-pod>=https://<cp1-etcd-pod-ip>:2380,<cp2-etcd-pod>=https://<cp2-etcd-pod-ip>:2380,<cp3-etcd-pod>=https://<cp3-etcd-pod-ip>:2380 \
  --initial-advertise-peer-urls=https://<cp1-etcd-pod-ip>:2380
where <cp1-etcd-pod> is the name of the contrail-etcd pod on the node you're currently logged in to and <cp1-etcd-pod-ip> is the IP address of that pod. The <cp2-etcd-pod> and <cp3-etcd-pod> placeholders refer to the other contrail-etcd pods. This command creates a <cp1-etcd-pod>.etcd directory on the node.
-
Repeat for the other control plane nodes, substituting the --name and --initial-advertise-peer-urls parameters with the respective pod name and IP address.
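For illustration only, assume three contrail-etcd pods named contrail-etcd-0, contrail-etcd-1, and contrail-etcd-2 with pod IP addresses 10.0.0.11, 10.0.0.12, and 10.0.0.13 (hypothetical names and addresses). On the node hosting contrail-etcd-0, the restore command would then look like this; on the other two nodes, you would change only the --name and --initial-advertise-peer-urls values:
etcdctl snapshot restore /tmp/etcdBackup.db \
  --name=contrail-etcd-0 \
  --initial-cluster=contrail-etcd-0=https://10.0.0.11:2380,contrail-etcd-1=https://10.0.0.12:2380,contrail-etcd-2=https://10.0.0.13:2380 \
  --initial-advertise-peer-urls=https://10.0.0.11:2380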
-
Stop the API server on all the control plane nodes.
- SSH into one of the control plane nodes.
-
Stop the API server.
mkdir -p /tmp/k8s
mv /etc/kubernetes/manifests/*.yaml /tmp/k8s
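Moving the manifests out of /etc/kubernetes/manifests causes the kubelet to stop the corresponding static pods, including kube-apiserver. If crictl is available and configured on the node (an assumption about your container runtime tooling), you can confirm that the API server container is no longer running; the command should return no output once the pod has stopped:
crictl ps | grep kube-apiserver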
- Repeat for the other control plane nodes.
-
Move the restored etcd snapshot to /var/lib/etcd on all the control plane nodes.
- SSH into one of the control plane nodes.
-
Move the restored etcd snapshot.
mv /var/lib/etcd/member /var/lib/etcd/member.bak
mv <restored-etcd-directory>/member /var/lib/etcd/
where <restored-etcd-directory> is the .etcd directory created in step 1.
- Repeat for the other control plane nodes.
-
Restore the API server on all control plane nodes.
- SSH into one of the control plane nodes.
-
Restore the API server.
mv /tmp/k8s/*.yaml /etc/kubernetes/manifests
- Repeat for the other control plane nodes.
-
Restart the kubelet on all control plane nodes.
- SSH into one of the control plane nodes.
-
Restart the kubelet.
systemctl stop kubelet
systemctl start kubelet
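Optionally, confirm that the kubelet came back up cleanly before moving to the next node:
systemctl is-active kubelet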
- Repeat for the other control plane nodes.
-
Restart the kube-system apiserver and controller.
Delete all the kube-apiserver and kube-controller pods.
kubectl delete pod <kube-apiserver-xxx> -n kube-system
kubectl delete pod <kube-controller-xxx> -n kube-system
These pods will automatically restart.
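If your control plane pods carry the standard kubeadm labels (an assumption: component=kube-apiserver and component=kube-controller-manager), you can delete them across all nodes with label selectors instead of naming each pod:
kubectl delete pod -n kube-system -l component=kube-apiserver
kubectl delete pod -n kube-system -l component=kube-controller-manager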
-
Restart the contrail-system apiserver and controller.
Delete all the contrail-k8s-apiserver and contrail-k8s-controller pods.
kubectl delete pod <contrail-k8s-apiserver-xxx> -n contrail-system
kubectl delete pod <contrail-k8s-controller-xxx> -n contrail-system
These pods will automatically restart.
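If you prefer not to look up each pod name, you can delete all matching pods by name pattern. This sketch assumes the pod names begin with contrail-k8s-apiserver and contrail-k8s-controller:
kubectl get pods -n contrail-system --no-headers -o custom-columns=:metadata.name \
  | grep -E '^contrail-k8s-(apiserver|controller)' \
  | xargs -r kubectl delete pod -n contrail-system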
-
Restart the vrouters.
Delete all the contrail-vrouter-masters and contrail-vrouter-nodes pods.
kubectl delete pod <contrail-vrouter-masters-xxx> -n contrail
kubectl delete pod <contrail-vrouter-nodes-xxx> -n contrail
These pods will automatically restart.
-
Check that all pods are in running state.
kubectl get pods -n contrail-system
kubectl get pods -n contrail
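As an alternative to polling manually, you can wait for the pods to report Ready; the 300-second timeout here is an arbitrary choice:
kubectl wait --for=condition=Ready pods --all -n contrail-system --timeout=300s
kubectl wait --for=condition=Ready pods --all -n contrail --timeout=300s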