How to Back Up and Restore Contrail Databases in JSON Format in OpenStack Environments Using the OpenStack 16.1 Director Deployment
This document shows how to back up and restore the Contrail databases (Cassandra and Zookeeper) in JSON format when Contrail Networking is running in OpenStack-orchestrated environments that were deployed using the Red Hat OpenStack 16.1 director deployment.
If you are deploying Contrail Networking in an OpenStack-orchestrated environment that was deployed using an OpenStack 13-based or Ansible deployer, see How to Backup and Restore Contrail Databases in JSON Format in Openstack Environments Using the Openstack 13 or Ansible Deployer.
Contrail Networking is first supported in OpenStack environments using the OpenStack 16.1 director deployment in Contrail Networking Release 2008. See Contrail Networking Supported Platforms for a matrix of Contrail Networking release support within orchestration platforms and deployers.
Before You Begin
The backup and restore procedure must be completed for nodes running the same Contrail Networking release. The procedure is used to back up the Contrail Networking databases only; it does not include instructions for backing up orchestration system databases.
Database backups must be consistent across all systems because the state of the Contrail database is associated with other system databases, such as OpenStack databases. Database changes associated with northbound APIs must be stopped on all the systems before performing any backup operation. For example, you might block the external VIP for northbound APIs at the load balancer level, such as HAProxy.
Simple Database Backup in JSON Format
This procedure provides a simple database backup in JSON format.
This procedure is performed using the db_json_exim.py script located inside the config-api container in /usr/lib/python2.7/site-packages/cfgm_common on the controller node.
To perform this database backup:
From a controller node, open a shell in the config-api container and confirm that the db_json_exim.py script is present:
(overcloud) [user@overcloud-contrailcontroller-0 heat-admin]# podman exec -it contrail_config_api bash
(config-api)[user@overcloud-contrailcontroller-0 /]$ ls /usr/lib/python2.7/site-packages/cfgm_common/db_json_exim.py
/usr/lib/python2.7/site-packages/cfgm_common/db_json_exim.py
Log in to one of the config nodes and create the /tmp/db-dump directory on that config node host.
mkdir /tmp/db-dump
On the same config node, copy the contrail-api.conf file from the container to the host.
podman cp contrail_config_api:/etc/contrail/contrail-api-0.conf /tmp/db-dump/contrail-api.conf
The Cassandra database instance on any configuration node includes the complete Cassandra database for all configuration nodes in the cluster. Steps 2 and 3, therefore, only need to be performed on one configuration node.
On all Contrail controller nodes, stop the following Contrail configuration services:
systemctl stop tripleo_contrail_config_svc_monitor.service
systemctl stop tripleo_contrail_config_device_manager.service
systemctl stop tripleo_contrail_config_schema.service
systemctl stop tripleo_contrail_config_api.service
systemctl stop tripleo_contrail_config_nodemgr.service
systemctl stop tripleo_contrail_config_database_nodemgr.service
This step must be performed on each individual controller node in the cluster.
On all nodes hosting Contrail Analytics containers, stop the following analytics services:
systemctl stop tripleo_contrail_analytics_kafka.service
systemctl stop tripleo_contrail_analytics_snmp_nodemgr.service
systemctl stop tripleo_contrail_analytics_alarmgen.service
systemctl stop tripleo_contrail_analytics_alarm_nodemgr.service
systemctl stop tripleo_contrail_analytics_topology.service
systemctl stop tripleo_contrail_analytics_collector.service
systemctl stop tripleo_contrail_analytics_nodemgr.service
systemctl stop tripleo_contrail_analytics_snmp_collector.service
systemctl stop tripleo_contrail_analytics_api.service
This step must be performed on each individual analytics node in the cluster.
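The stop commands in the two preceding steps share one naming pattern, so they can be generated in a loop. The following is a minimal sketch rather than part of the documented procedure: the function name contrail_units and the DRY_RUN variable are hypothetical, and the unit names are copied from the lists above. By default it only prints the commands so that they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: contrail_units and DRY_RUN are illustrative names, not Contrail tooling.
# Unit suffixes are copied from the systemctl commands listed in this procedure.
CONFIG_UNITS="config_svc_monitor config_device_manager config_schema config_api config_nodemgr config_database_nodemgr"
ANALYTICS_UNITS="analytics_kafka analytics_snmp_nodemgr analytics_alarmgen analytics_alarm_nodemgr analytics_topology analytics_collector analytics_nodemgr analytics_snmp_collector analytics_api"

# Usage: contrail_units <stop|start> <config|analytics>
# Prints the systemctl commands unless DRY_RUN=0, in which case it runs them.
contrail_units() {
    action=$1
    case $2 in
        config)    units=$CONFIG_UNITS ;;
        analytics) units=$ANALYTICS_UNITS ;;
        *) echo "unknown role: $2" >&2; return 1 ;;
    esac
    for unit in $units; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            systemctl "$action" "tripleo_contrail_${unit}.service"
        else
            echo "systemctl $action tripleo_contrail_${unit}.service"
        fi
    done
}

# Print the stop commands for the configuration services.
contrail_units stop config
```

Because the same function accepts start as its first argument, it can also print the matching start commands once the backup is complete.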
Return to the config node where you performed steps 2 and 3.
Use the podman images command to list the name or ID of the config api image.
podman images | grep config-api
Example:
(overcloud) [user@overcloud-contrailcontroller-0 db-dump]# podman images | grep config-api
192.168.24.1:8787/contrail/contrail-controller-config-api 2011.L1.297 2dcd2feaeed5 2 months ago 876 MB
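The registry name and tag from this output are needed when the container is started, so it can help to extract them as a single image reference. The following is a sketch that assumes the column layout shown in the example above (repository first, tag second); config_api_image is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: config_api_image is an illustrative helper, not a podman or Contrail command.
# Reads `podman images` output on stdin and prints repository:tag for the
# first row whose repository ends in contrail-controller-config-api.
config_api_image() {
    awk '$1 ~ /contrail-controller-config-api$/ { print $1 ":" $2; exit }'
}

# Sample row taken from the example output above.
echo "192.168.24.1:8787/contrail/contrail-controller-config-api 2011.L1.297 2dcd2feaeed5 2 months ago 876 MB" | config_api_image
```

On a config node the helper would be fed from the real command, for example podman images | config_api_image.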
From the same config node, start the config api container with the entrypoint overridden to /bin/bash instead of the entrypoint.sh script, and map the /tmp/db-dump directory on the host to the /tmp directory inside the container. Overriding the entrypoint ensures that the API services are not started on the config node.
Include the -v /etc/contrail/ssl:/etc/contrail/ssl:ro option when cassandra_use_ssl is set as an api-server configuration parameter, so that the TLS certificates are mounted to the Contrail SSL directory. This mounting ensures that the backup procedure succeeds in environments with endpoints that require TLS authentication.
The registry_name and container_tag variables must match the output of step 6.
podman run --rm -it -v /tmp/db-dump/:/tmp:Z -v /etc/contrail/ssl:/etc/contrail/ssl:ro --network host --entrypoint=/bin/bash registry_name/contrail-controller-config-api:container_tag
Example:
podman run --rm -it -v /tmp/db-dump/:/tmp:Z -v /etc/contrail/ssl:/etc/contrail/ssl:ro --network host --entrypoint=/bin/bash 192.168.24.1:8787/contrail/contrail-controller-config-api:2011.L1.297
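Whether the SSL mount is needed depends on the api-server configuration, so the command line can be assembled from the copied contrail-api.conf file. This is a sketch under one loud assumption: that the file uses a key = value line such as cassandra_use_ssl = True. Check the actual file before relying on it. The function name podman_run_cmd is hypothetical, and the function only prints the command.

```shell
#!/bin/sh
# Sketch only: podman_run_cmd is an illustrative helper.
# ASSUMPTION: contrail-api.conf contains a line like "cassandra_use_ssl = True"
# when SSL is enabled; verify this against your own file.
# Usage: podman_run_cmd <path to contrail-api.conf> <image reference>
podman_run_cmd() {
    conf=$1
    image=$2
    ssl_mount=""
    if grep -Eiq '^[[:space:]]*cassandra_use_ssl[[:space:]]*=[[:space:]]*true' "$conf" 2>/dev/null; then
        ssl_mount="-v /etc/contrail/ssl:/etc/contrail/ssl:ro "
    fi
    # Print the command instead of running it so it can be reviewed first.
    echo "podman run --rm -it -v /tmp/db-dump/:/tmp:Z ${ssl_mount}--network host --entrypoint=/bin/bash $image"
}

podman_run_cmd /tmp/db-dump/contrail-api.conf 192.168.24.1:8787/contrail/contrail-controller-config-api:2011.L1.297
```

Keeping the SSL mount in the command when it is not strictly required mirrors the documented example, so the conditional is a convenience rather than a requirement.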
From the container created on the config node in step 7, use the db_json_exim.py script to back up data in JSON format. The script writes the dump to /tmp/db-dump.json inside the container, which places the file in the /tmp/db-dump/ directory on this config node.
Example:
(config-api)[user@overcloud-contrailcontroller-0 /]$ cd /usr/lib/python2.7/site-packages/cfgm_common
(config-api)[user@overcloud-contrailcontroller-0 /usr/lib/python2.7/site-packages/cfgm_common]$ python db_json_exim.py --export-to /tmp/db-dump.json --api-conf /tmp/contrail-api.conf
2021-06-30 19:47:27,120 INFO: Cassandra DB dumped
2021-06-30 19:47:28,878 INFO: Zookeeper DB dumped
2021-06-30 19:47:28,895 INFO: DB dump wrote to file /tmp/db-dump.json
The Cassandra database instance on any configuration node includes the complete Cassandra database for all configuration nodes in the cluster. You therefore only need to run the export from one of the configuration nodes.
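Before services are restarted, it can be worth confirming that the dump file exists and parses as JSON. The sketch below assumes the host-side path /tmp/db-dump/db-dump.json that results from the volume mapping described above; check_dump is a hypothetical helper, and the parse check runs only if a python3 or python interpreter is found on the node.

```shell
#!/bin/sh
# Sketch only: check_dump is an illustrative helper, not part of Contrail.
# Usage: check_dump <path>
# Returns 0 only if the file exists, is non-empty, and (when a Python
# interpreter is available) parses as JSON.
check_dump() {
    dump=$1
    [ -s "$dump" ] || { echo "missing or empty: $dump" >&2; return 1; }
    for py in python3 python; do
        if command -v "$py" >/dev/null 2>&1; then
            "$py" -m json.tool "$dump" >/dev/null 2>&1 && return 0
            echo "not valid JSON: $dump" >&2
            return 1
        fi
    done
    echo "no Python interpreter found; skipped JSON parse of $dump" >&2
    return 0
}

check_dump /tmp/db-dump/db-dump.json || echo "do not rely on this backup"
```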
(Optional, recommended) From the same config node, before exiting the container, enter the following command to view a more readable version of the dump file.
cat /tmp/db-dump.json | python -m json.tool | less
From the same config node, exit out of the config api container. This will stop the container.
exit
On each configuration node, start the following configuration services:
systemctl start tripleo_contrail_config_svc_monitor.service
systemctl start tripleo_contrail_config_device_manager.service
systemctl start tripleo_contrail_config_schema.service
systemctl start tripleo_contrail_config_api.service
systemctl start tripleo_contrail_config_nodemgr.service
systemctl start tripleo_contrail_config_database_nodemgr.service
This step must be performed on each individual config node.
On each analytics node, start the following analytics services:
systemctl start tripleo_contrail_analytics_kafka.service
systemctl start tripleo_contrail_analytics_snmp_nodemgr.service
systemctl start tripleo_contrail_analytics_alarmgen.service
systemctl start tripleo_contrail_analytics_alarm_nodemgr.service
systemctl start tripleo_contrail_analytics_topology.service
systemctl start tripleo_contrail_analytics_collector.service
systemctl start tripleo_contrail_analytics_nodemgr.service
systemctl start tripleo_contrail_analytics_snmp_collector.service
systemctl start tripleo_contrail_analytics_api.service
This step must be performed on each individual analytics node.
On each config node, enter the contrail-status command to confirm that services are in the active or running states.
Note: Some command output and output fields are removed for readability. Output shown is from a single node hosting config and analytics services.
contrail-status
Pod              Service      Original Name                 State
analytics        api          contrail-analytics-api        running
analytics        collector    contrail-analytics-collector  running
analytics        nodemgr      contrail-nodemgr              running
analytics        provisioner  contrail-provisioner          running
analytics        redis        contrail-external-redis       running
analytics-alarm  alarm-gen    contrail-analytics-alarm-gen  running
analytics-alarm  kafka        contrail-external-kafka       running
<some output removed for readability>
== Contrail control ==
control: active
nodemgr: active
named: active
dns: active
== Contrail analytics-alarm ==
nodemgr: active
kafka: active
alarm-gen: active
== Contrail database ==
nodemgr: active
query-engine: active
cassandra: active
== Contrail analytics ==
nodemgr: active
api: active
collector: active
== Contrail config-database ==
nodemgr: active
zookeeper: active
rabbitmq: active
cassandra: active
== Contrail webui ==
web: active
job: active
== Contrail analytics-snmp ==
snmp-collector: active
nodemgr: active
topology: active
== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: active
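Scanning this output by eye is error-prone on nodes that host many services. The sketch below filters the "service: state" lines in the sectioned part of the output and reports any whose state is not active. It is an illustration only: contrail_not_active is a hypothetical helper, it ignores the container table at the top, and it treats every state other than active as a problem, which may need loosening if your deployment reports other acceptable states.

```shell
#!/bin/sh
# Sketch only: contrail_not_active is an illustrative helper.
# Reads contrail-status output on stdin, prints each "service: state" line
# whose state is not "active" (prefixed by its section header), and exits
# non-zero if any such line was found.
contrail_not_active() {
    awk '
        BEGIN { bad = 0 }
        /^== .* ==$/ { section = $0; next }
        /^[ \t]*[A-Za-z0-9_-]+: / {
            if ($2 != "active") { print section " " $1 " " $2; bad = 1 }
        }
        END { exit bad }
    '
}

# Demonstration with a short, made-up excerpt in the same layout.
contrail_not_active <<'EOF' || echo "some services are not active"
== Contrail config ==
api: active
schema: initializing
EOF
```

On a node, the helper would read the real command output, for example contrail-status | contrail_not_active.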
Restore Database from the Backup in JSON Format
This procedure provides the steps to restore a system using the simple database backup JSON file that was created in Simple Database Backup in JSON Format.
To restore a system from a backup JSON file:
Copy the contrail-api.conf file from the container to the host on any one of the config nodes.
podman cp contrail_config_api:/etc/contrail/contrail-api-0.conf /tmp/db-dump/contrail-api.conf
On all of the Contrail controller nodes, stop these configuration services:
systemctl stop tripleo_contrail_config_svc_monitor.service
systemctl stop tripleo_contrail_config_device_manager.service
systemctl stop tripleo_contrail_config_schema.service
systemctl stop tripleo_contrail_config_api.service
systemctl stop tripleo_contrail_config_nodemgr.service
systemctl stop tripleo_contrail_config_database_nodemgr.service
On all nodes hosting Contrail Analytics containers, stop the following services:
systemctl stop tripleo_contrail_analytics_kafka.service
systemctl stop tripleo_contrail_analytics_snmp_nodemgr.service
systemctl stop tripleo_contrail_analytics_alarmgen.service
systemctl stop tripleo_contrail_analytics_alarm_nodemgr.service
systemctl stop tripleo_contrail_analytics_topology.service
systemctl stop tripleo_contrail_analytics_collector.service
systemctl stop tripleo_contrail_analytics_nodemgr.service
systemctl stop tripleo_contrail_analytics_snmp_collector.service
systemctl stop tripleo_contrail_analytics_api.service
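Many steps in this procedure must be repeated on every controller or analytics node. One way to keep the nodes in step is to generate the per-node commands from a single list. The sketch below is an assumption-heavy illustration: the node names other than overcloud-contrailcontroller-0, the heat-admin SSH user, and the use of sudo are all hypothetical and must be adapted to your deployment. The function on_each_node only prints the commands.

```shell
#!/bin/sh
# Sketch only: on_each_node, the node list, the heat-admin user, and sudo
# are assumptions for illustration; adjust them to your environment.
CONTROLLERS="overcloud-contrailcontroller-0 overcloud-contrailcontroller-1 overcloud-contrailcontroller-2"

# Usage: on_each_node "<space-separated node list>" <command...>
# Prints one ssh command per node; nothing is executed.
on_each_node() {
    nodes=$1
    shift
    for node in $nodes; do
        echo "ssh heat-admin@$node sudo $*"
    done
}

on_each_node "$CONTROLLERS" systemctl stop tripleo_contrail_config_api.service
```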
Stop the Cassandra service on all the config-db controllers.
systemctl stop tripleo_contrail_config_database.service
Stop the Zookeeper service on all controllers.
systemctl stop tripleo_contrail_config_zookeeper.service
Back up the Zookeeper data directory on all the controllers.
cd /var/lib/contrail/config_zookeeper
cp -aR version-2/ zookper-bkp.save
Delete the Zookeeper data directory contents on all the controllers.
rm -rf version-2
Back up the Cassandra data directory on all the controllers.
cd /var/lib/contrail/config_cassandra
cp -aR data/ Cassandra_data-save
Delete the Cassandra data directory contents on all controllers.
rm -rf data/
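Because the delete steps are destructive, it can be safer to tie each deletion to a successful copy. The following sketch combines the back-up and delete steps for one directory and refuses to run if the source is missing or the backup name is already taken. The function name backup_then_clear is hypothetical; the directory names in the usage note are the ones used in the steps above. The demonstration runs against a temporary directory only.

```shell
#!/bin/sh
# Sketch only: backup_then_clear is an illustrative helper.
# Usage: backup_then_clear <parent dir> <data dir name> <backup name>
# Copies <parent>/<data> to <parent>/<backup>, then removes <parent>/<data>.
# The removal happens only if the copy succeeded.
backup_then_clear() {
    parent=$1
    data=$2
    backup=$3
    if [ -z "$parent" ] || [ -z "$data" ] || [ -z "$backup" ]; then
        echo "usage: backup_then_clear <parent> <data> <backup>" >&2
        return 1
    fi
    [ -d "$parent/$data" ] || { echo "missing: $parent/$data" >&2; return 1; }
    [ ! -e "$parent/$backup" ] || { echo "already exists: $parent/$backup" >&2; return 1; }
    cp -a "$parent/$data" "$parent/$backup" || return 1
    rm -rf "$parent/$data"
}

# Demonstration against a throwaway directory.
demo=$(mktemp -d)
mkdir "$demo/version-2"
backup_then_clear "$demo" version-2 zookper-bkp.save && ls "$demo"
rm -rf "$demo"
```

On a controller, the equivalent calls would be backup_then_clear /var/lib/contrail/config_zookeeper version-2 zookper-bkp.save and backup_then_clear /var/lib/contrail/config_cassandra data Cassandra_data-save, run only after the Zookeeper and Cassandra services have been stopped.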
Start the Zookeeper service on all of the controllers.
systemctl start tripleo_contrail_config_zookeeper.service
Start the Cassandra service on all of the controllers.
systemctl start tripleo_contrail_config_database.service
Use the podman image ls command to list the name or ID of the config api image.
podman image ls | grep config-api
Example:
[user@overcloud-contrailcontroller-0 heat-admin]# podman image ls | grep config-a
192.168.24.1:8787/contrail/contrail-controller-config-api 2011.L1.297 2dcd2feaeed5 1 months ago 876 MB
Run a new podman container using the name or ID of the config_api image on the same config node.
Include the -v /etc/contrail/ssl:/etc/contrail/ssl:ro option when cassandra_use_ssl is set as an api-server configuration parameter, so that the TLS certificates are mounted to the Contrail SSL directory. This mounting ensures that the restore procedure succeeds in environments with endpoints that require TLS authentication.
Use the registry_name and container_tag from the output of step 12.
podman run --rm -it -v /tmp/db-dump/:/tmp:Z -v /etc/contrail/ssl:/etc/contrail/ssl:ro --network host --entrypoint=/bin/bash <registry_name>/contrail-controller-config-api:<container_tag>
Example:
podman run --rm -it -v /tmp/db-dump/:/tmp:Z -v /etc/contrail/ssl:/etc/contrail/ssl:ro --network host --entrypoint=/bin/bash 192.168.24.1:8787/contrail/contrail-controller-config-api:2011.L1.297
Restore the data in the new running container on the same config node.
cd /usr/lib/python2.7/site-packages/cfgm_common
python db_json_exim.py --import-from /tmp/db-dump.json --api-conf /tmp/contrail-api.conf
Example:
cd /usr/lib/python2.7/site-packages/cfgm_common
python db_json_exim.py --import-from /tmp/db-dump.json --api-conf /tmp/contrail-api.conf
2021-07-06 17:22:17,157 INFO: DB dump file loaded
2021-07-06 17:23:12,227 INFO: Cassandra DB restored
2021-07-06 17:23:14,236 INFO: Zookeeper DB restored
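The example output shows one log line per restored database, which gives a simple success check before the container is exited. The sketch below reads captured script output and confirms that both messages are present. The function name restore_ok is hypothetical, and the message strings are taken from the example above, so they are assumed to be stable across releases.

```shell
#!/bin/sh
# Sketch only: restore_ok is an illustrative helper.
# Reads db_json_exim.py output on stdin and succeeds only if both
# restore messages from the documented example are present.
restore_ok() {
    log=$(cat)
    for msg in "Cassandra DB restored" "Zookeeper DB restored"; do
        case $log in
            *"$msg"*) ;;
            *) echo "missing log line: $msg" >&2; return 1 ;;
        esac
    done
    return 0
}

# Demonstration with the log lines from the example above.
restore_ok <<'EOF' && echo "restore messages found"
2021-07-06 17:22:17,157 INFO: DB dump file loaded
2021-07-06 17:23:12,227 INFO: Cassandra DB restored
2021-07-06 17:23:14,236 INFO: Zookeeper DB restored
EOF
```

In practice the script output would be captured first, for example by appending 2>&1 | tee /tmp/import.log to the import command, and then passed to the helper with restore_ok < /tmp/import.log.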
Exit out of the config api container. This will stop the container.
exit
Start config services on all of the controllers:
systemctl start tripleo_contrail_config_svc_monitor.service
systemctl start tripleo_contrail_config_device_manager.service
systemctl start tripleo_contrail_config_schema.service
systemctl start tripleo_contrail_config_api.service
systemctl start tripleo_contrail_config_nodemgr.service
systemctl start tripleo_contrail_config_database_nodemgr.service
Start services on all of the analytics nodes:
systemctl start tripleo_contrail_analytics_kafka.service
systemctl start tripleo_contrail_analytics_snmp_nodemgr.service
systemctl start tripleo_contrail_analytics_alarmgen.service
systemctl start tripleo_contrail_analytics_alarm_nodemgr.service
systemctl start tripleo_contrail_analytics_topology.service
systemctl start tripleo_contrail_analytics_collector.service
systemctl start tripleo_contrail_analytics_nodemgr.service
systemctl start tripleo_contrail_analytics_snmp_collector.service
systemctl start tripleo_contrail_analytics_api.service
Enter the contrail-status command on each configuration node and, when applicable, on each analytics node to confirm that services are in the active or running states.
Note: Output shown for a config node. Some command output and output fields are removed for readability.
contrail-status
Pod     Service         Original Name                        State
config  api             contrail-controller-config-api       running
config  device-manager  contrail-controller-config-devicemgr running
config  dnsmasq         contrail-controller-config-dnsmasq   running
config  nodemgr         contrail-nodemgr                     running
config  provisioner     contrail-provisioner                 running
config  schema          contrail-controller-config-schema    running
config  stats           contrail-controller-config-stats     running
<some output removed for readability>
== Contrail control ==
control: active
nodemgr: active
named: active
dns: active
== Contrail database ==
nodemgr: active
query-engine: active
cassandra: active
== Contrail config-database ==
nodemgr: active
zookeeper: active
rabbitmq: active
cassandra: active
== Contrail webui ==
web: active
job: active
== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: active