Upgrade Instructions
Instructions for upgrading your Contrail Cloud to the specified release.
Upgrade Your Contrail Cloud to the Current Release
Juniper supports an n+1 upgrade path for releases. This procedure remains unchanged and supports upgrade from 13.5 to 13.6.
This is an in-place upgrade as defined by the RHOSP TripleO model. You now have the option to run a parallel update of roles to complete this upgrade. You must follow a reboot process after the upgrade if the nodes were not rebooted automatically.
No deployment configuration changes are required when updating. If deployment configuration changes must be made for any reason, they must be applied to your existing Contrail Cloud deployment before upgrading to the current version. As a best practice, review your configuration files to make sure they adhere to the proper schema and meet the needs of your deployment environment.
The Contrail Cloud upgrade procedure allows for fine-grained
control of the upgrade process. Control of the upgrade process is
expressed through configurations in the update plan found in the config/site.yml
file.
Before You Upgrade
Take these initial steps before starting your Contrail Cloud Upgrade. This will help eliminate possible errors that might occur during the upgrade process and will help ensure expected results. The sections below are a prerequisite to the upgrade of your Contrail Cloud.
Review Your Configuration Files
At this point you want to review your current setup to ensure all configuration settings are accurate and reflect a desired deployment for your Contrail Cloud environment.
Review all the YAML files in the
/var/lib/contrail_cloud/config
directory and ensure all values match your expected results. Compare the old configs against the new Contrail Cloud config schema to check for gaps. To check that the configs are compatible, run:
/var/lib/contrail_cloud/scripts/node-configuration.py schema
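The schema check can be used as a gate before any upgrade script is started. The sketch below is one way to do that. The helper name run_schema_check is hypothetical, and the sketch assumes that the script exits with a non-zero status when the configuration does not match the schema, which this guide does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: run the schema check and report the result.
# Assumption (not confirmed by this guide): the script exits non-zero
# when the configuration files do not match the schema.
run_schema_check() {
    script="${1:-/var/lib/contrail_cloud/scripts/node-configuration.py}"
    if "$script" schema; then
        echo "schema check passed"
    else
        echo "schema check failed; review /var/lib/contrail_cloud/config before upgrading" >&2
        return 1
    fi
}

# Usage on the jump host:
#   run_schema_check || exit 1
```

If the exit status turns out not to reflect the result, read the script output instead and treat any reported mismatch as a reason to stop.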
Verify Undercloud/Overcloud Health and Service Operations
It is vital that you always check the health of your cloud and the services running in your cloud before attempting any deployment or upgrade activities. You must ensure that the undercloud/overcloud is fully functional, healthy, and that all services are active. Any problems in your cloud health may cause errors during upgrading. Incorrect settings and configurations will carry over to the upgraded Contrail Cloud deployment.
- Check the health of the undercloud, the overcloud, and the nodes running on them. To verify the health of your cloud and the services, see Node Reboot and Health Check and refer to the “Verify Quorum and Node Health” section in that document.
Back Up Your Undercloud and Overcloud
Make sure to back up your undercloud and overcloud before running the update script. For complete instructions to back up your cloud, see BACK UP AND RESTORE THE DIRECTOR UNDERCLOUD, Backing up the overcloud control plane services, and Backing up Contrail Databases in JSON Format.
Pause and Shutdown Business Services
You must pause or shut down external business services at this time to ensure a smooth upgrade and to prevent possible data loss or workload errors. These business services include anything that is outside of the Contrail Cloud deployment but interacts with Contrail Cloud as a whole. The steps to complete the tasks below depend on the specific business service or VM that is running. Consult the documentation for the specific service you need to pause or shut down.
Quiesce all external API requests, for example, Horizon.
Gracefully shutdown any vulnerable workloads.
You will want to consider migrating your services/VMs to a different cloud that is outside of the upgrade environment.
Review the Configuration Options
The sample below shows the upgrade configuration options in
its entirety and is configured in the site.yml
file. Use this upgrade configuration sample to determine accurate
whitespace and indentations within the configuration hierarchy. Come
back and reference this sample as needed to assist you through the
upgrade process.
update_plan:
  # Directory for lockfiles.
  # Consists all lockfiles created during update. Lockfiles are created
  # for each step, batch name, item from nodes_list if update ends with success.
  lockfile_directory: "/home/{{ undercloud['vm']['user'] }}/contrail_cloud_update"
  # Set how many nodes from role will be updated at the same time
  # If don't set serial for custom role (with profile or/and leaf) we will use from basic
  # role: ComputeKernel/ComputeDpdk/ComputeSriov/CephStorage
  serial:
    Controller: 1
    ContrailController: 1
    ContrailAnalytics: 1
    ContrailAnalyticsDatabase: 1
    AppformixController: 1
    ComputeKernel: 25
    ComputeDpdk: 25
    ComputeSriov: 25
    CephStorage: 1
  # can be parallel or sequence or disabled
  reboot_computes: parallel
  batches:
    # unique batch name
    - name: controller_nodes
      # possible values: `update_type: parallel` or `update_type: sequence`
      # If parallel: batch will be updated in parallel, all positions from list
      # at the same time.
      # If sequence: all positions from batch list will be processed one by one.
      update_type: parallel
      # nodes_list may contain 'all' word or role or node name from ironic.
      # if set 'all' all overcloud nodes will be upgraded role by role.
      # if set 'all' and serial > 1 for role, more than 1 node from role
      # will be processed at the same time.
      # Nodes list can be listed through command executed on undercloud:
      # `. stackrc && openstack server list -f value -c Flavor | sort -u`
      # node_list should consist all role/nodes used in overcloud.
      # When on node_list will be a specific Compute node name and node role
      # into which specific Compute node belongs it will be:
      # - updated twice, firstly as a node name, secondly as a part of group
      # - rebooted once as node name
      nodes_list:
        - Controller
        - ContrailController
        - ContrailAnalytics
        - ContrailAnalyticsDatabase
        - AppformixController
    - name: ceph_nodes
      update_type: sequence
      nodes_list:
        - overcloudkt0-cephstorage2hw6-0
        - overcloudkt0-cephstorage1hw7-0
        - overcloudkt0-cephstorage0hw6-0
    - name: compute_nodes
      update_type: sequence
      nodes_list:
        - ComputeKernel
        - ComputeDpdk
        - overcloudkt0-compsriov1hw3-0
  # during `contrail-cloud-upgrade-overcloud-step2.sh` step:
  # - all nodes from `nodes_list` will be reregistered
  # - compute nodes from `nodes_list` restarted
  # - during each run, only one batch is processed
  # - script need to executed multiple times, to process all batches
  # - when all batches will be processed, proper lockfile will be created,
  #   you can start with step 3 script
Start the Contrail Cloud Upgrade
Time to upgrade your Contrail Cloud Release. The process will deliver updated containers, Red Hat RHEL/RHOSP/Storage content, and kernel version that are associated with the chosen Release.
The procedure below will guide you through the update. There is a small disruption in service during the update. However, the update preserves existing overcloud configurations. For example: images, projects, networks, volumes, virtual machines, and so on.
Retrieve Adjusted Keys and Install
Follow these steps to start your upgrade:
- Send an e-mail message to contrail_cloud_subscriptions@juniper.net
and request a Contrail Cloud upgrade. Provide the following information:
Include your current activation key in the email request. Your Contrail Cloud activation key will be adjusted to the requested version.
Specify the time and date you would like to upgrade your Contrail Cloud. The Contrail Cloud team will prepare the activation for your maintenance window.
- Refresh your Contrail Cloud subscription on the jump host
server by running the contrail_cloud_installer.sh from
the jump host with the arguments:
./contrail_cloud_installer.sh \
  --satellite_host ${SATELLITE} \
  --satellite_key ${SATELLITE_KEY} \
  --satellite_org ${SATELLITE_ORG}
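The installer command reads three shell variables, and an unset variable would silently pass an empty argument. A minimal pre-flight sketch follows. The helper name require_vars is hypothetical and is not part of Contrail Cloud.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: fail early if any named variable is
# unset or empty, so the installer is never called with blank arguments.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name (POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, from the directory that holds the installer:
#   require_vars SATELLITE SATELLITE_KEY SATELLITE_ORG && \
#   ./contrail_cloud_installer.sh \
#       --satellite_host "${SATELLITE}" \
#       --satellite_key "${SATELLITE_KEY}" \
#       --satellite_org "${SATELLITE_ORG}"
```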
Upgrade Contrail Cloud
The following procedure and scripts will upgrade your Contrail Cloud.
As the “contrail” user (su - contrail from root), execute the following scripts on the jump host to perform the update:
- Upgrade the jump host and the undercloud VM.
/var/lib/contrail_cloud/scripts/contrail-cloud-update-undercloud.sh
This will:
Update the packages and containers on the jump host and the undercloud VM.
Update Red Hat OpenStack Platform Director on the undercloud VM.
Update image on the undercloud VM used to provision all new overcloud role instances.
overcloud-image-full is updated and used to provision any new overcloud role instance.
- Prepare the overcloud for upgrade:
/var/lib/contrail_cloud/scripts/contrail-cloud-update-overcloud-step1.sh
This will:
Publish new containers to the registry on the undercloud VM.
Update the overcloud plan on the undercloud VM.
Prepare the overcloud nodes for update: openstack overcloud update prepare.
- Perform the overcloud upgrade.
The overcloud upgrade (contrail-cloud-update-overcloud-step2.sh) will:
Upgrade all nodes as defined in config/site.yml using nodes_list.
Upgrade packages and containers for each node.
Upgrade one node batch per script run.
Automatically create a lockfile when the batch has been processed.
Reboot the compute nodes, unless manually disabled.
There are different methods that can be used to complete the overcloud upgrade step. The different methods are listed below (choose one):
Default method. All nodes will upgrade in one run.
All roles are upgraded one by one.
Within each role the nodes are upgraded one by one.
Targeted method. You have the ability to target roles and even nodes to control the upgrade sequence.
You can set the desired upgrade targets in the config/site.yml file.
It is typical to upgrade all control plane roles together with this method.
Computes can be upgraded in small targeted groups.
If you encounter failures while running the contrail-cloud-update-overcloud-step2.sh script, see If an Upgrade Fails in the sections below.
Note: The overcloud upgrade script contrail-cloud-update-overcloud-step2.sh has a hard timeout of 4 hours, which may not be sufficient for complex deployments. Consider using targeted updates to allow for incremental role upgrades which can complete within that timeframe.
Default Method
To upgrade all the nodes using the default method, run the script below. This will upgrade all nodes in one run and require no additional steps. The update will apply to all roles one at a time and one node at a time within each role.
/var/lib/contrail_cloud/scripts/contrail-cloud-update-overcloud-step2.sh
Targeted Method
The procedure below allows you to target specific roles and nodes during the update. This approach allows for control and predictability of the update and subsequent compute node reboots. This method is desirable if you want to target specific resources to be updated as workloads are migrated. The roles can now be updated in parallel and the nodes within each role can be updated sequentially.
To complete a targeted update, copy the sample plan samples/features/update-contrail-cloud/site.yml into your config/site.yml. Edit the sample plan to match your deployment for each targeted group and run the update script for each batch defined within the update plan. The step2 script is run multiple times: once for each batch defined in the update plan. Per-node control allows for planning around node reboot. For how to reboot your compute nodes, see Node Reboot and Health Check and refer to the node reboot section.
Note: Compute nodes will automatically reboot as part of the upgrade process, unless manually disabled. Select “disabled” for reboot_computes: to stop the automated reboots. You will have to follow the manual reboot procedure after the upgrade is complete for updated packages to take effect (e.g., kernel updates).
- Configure the update plan in your config/site.yml. Define how many nodes from each role will be updated at the same time:
update_plan:
  serial:
    Controller: 1
    ContrailController: 1
    ContrailAnalytics: 1
    ContrailAnalyticsDatabase: 1
    AppformixController: 1
    ComputeKernel: 25
    ComputeDpdk: 25
    ComputeSriov: 25
    CephStorage: 1
- Configure the update plan for the desired reboot behavior:
update_plan:
  # can be parallel or sequence or disabled
  reboot_computes: parallel
- Now define your batches.
This is where you define your series of batches. You define the roles and nodes that belong to each unique batch. Other batch update characteristics are set here as well. When the update script is run, the script will identify the first unique batch which has not already been executed and updated. A lockfile is created after each successful batch update to identify it as being completed.
Name the unique batch and configure the update type with a value of either parallel or sequence. The update will be performed in the batch order you configure in your site.yml file. You can also target specific nodes you want to update (e.g. computes) by including the node name. To start, you might configure it to look like this:
update_plan:
  batches:
    # unique batch name
    - name: controller_nodes
      update_type: parallel
- Set the node types in nodes_list. This list belongs to
the unique batch name defined above. In this example, this would be
all the node types associated with the batch named controller_nodes:
update_plan:
  batches:
    # unique batch name
    - name: controller_nodes
      update_type: parallel
      nodes_list:
        - Controller
        - ContrailController
        - ContrailAnalytics
        - ContrailAnalyticsDatabase
        - AppformixController
- Define the specific nodes that are unique within the named
node role. Below is an example of defining both storage and compute
nodes:
update_plan:
  batches:
    # unique batch name
    - name: ceph_nodes
      update_type: sequence
      nodes_list:
        - overcloudkt0-cephstorage2hw6-0
        - overcloudkt0-cephstorage1hw7-0
        - overcloudkt0-cephstorage0hw6-0
    # unique batch name
    - name: compute_nodes
      update_type: sequence
      nodes_list:
        - ComputeKernel
        - ComputeDpdk
        - overcloudkt0-compsriov1hw3-0
- Run the update script after you have set your variables
for each defined batch in the update plan. Rerun the update script
until all batches have successfully updated:
/var/lib/contrail_cloud/scripts/contrail-cloud-update-overcloud-step2.sh
- Converge the overcloud upgrade. The script below updates Ceph and converges the overcloud heat stack. Note that the overcloud['deployment_timeout'] value in config/site.yml can be increased to avoid timeouts in the Ceph upgrade.
/var/lib/contrail_cloud/scripts/contrail-cloud-update-overcloud-step3.sh
This will:
Ensure that the stack resource structure aligns with the new packages and configurations.
Update the Ceph cluster configuration: openstack overcloud ceph-upgrade run.
Run update converge: openstack overcloud update converge.
Finalize the overcloud update.
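In the targeted method the step2 script processes one batch per run, so it has to be started once per batch. The repeated runs can be sketched as a small loop that stops at the first failure. The wrapper name run_batches is hypothetical, and the batch count is an input you supply by counting the entries under batches in your update plan. The sketch does not inspect the lockfiles.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command once per batch and stop at the first
# failure so that the failed batch can be investigated before continuing.
#   $1        number of batches defined under update_plan.batches
#   $2...     the command to run for each batch
run_batches() {
    total="$1"
    shift
    i=1
    while [ "$i" -le "$total" ]; do
        echo "batch run $i of $total"
        if ! "$@"; then
            echo "batch run $i failed; stopping" >&2
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}

# Usage with the three batches from the sample plan:
#   run_batches 3 /var/lib/contrail_cloud/scripts/contrail-cloud-update-overcloud-step2.sh
```

Running the batches by hand instead of in a loop leaves room for a health check between batches, which may be preferable in production.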
Move on to the next sections to upgrade AppFormix and Contrail Command.
Upgrade AppFormix
Upgrade AppFormix for use with Contrail Cloud.
- As the “contrail” user (su - contrail from root), execute the following script on the jump host to perform the update:
/var/lib/contrail_cloud/scripts/contrail-cloud-update-appformix.sh
This will:
Upgrade all packages and containers on the AppFormix nodes.
- Verify the status of AppFormix.
Run the following command to view the status of AppFormix:
ansible -i /usr/bin/tripleo-ansible-inventory AppformixController -m shell -a "curl -s http://127.0.0.1:9000/appformix/controller/v2.0/status"
This will return a 200 on success. Any other code returned should be considered a failure. The API output also contains the AppFormix version. This is helpful to verify the correct version has been installed. See the sample below:
{ "Version": "2.19.10-65aa34f7ad", "DBVersion": "70" }
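The curl -s call above prints the response body, not the HTTP status code. The sketch below shows one way to see the code explicitly and to pull the version out of the body. Both function names are hypothetical. The first function would need to run on an AppFormix controller node, because it targets 127.0.0.1.

```shell
#!/bin/sh
# Print only the HTTP status code of the AppFormix status endpoint.
# -s silences progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints the numeric response code.
appformix_http_code() {
    curl -s -o /dev/null -w '%{http_code}' \
        "http://127.0.0.1:9000/appformix/controller/v2.0/status"
}

# Read the status JSON on stdin and print the value of the "Version" key.
# The leading quote in the pattern keeps "DBVersion" from matching.
appformix_version() {
    sed -n 's/.*"Version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage on an AppFormix controller node:
#   [ "$(appformix_http_code)" = "200" ] || echo "AppFormix status check failed" >&2
#   curl -s http://127.0.0.1:9000/appformix/controller/v2.0/status | appformix_version
```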
Upgrade Contrail Command
Upgrade Contrail Command for use with Contrail Cloud.
- As the “contrail” user (su - contrail from root), execute the following script on the jump host to perform the update:
/var/lib/contrail_cloud/scripts/contrail-cloud-update-command.sh
This will:
Upgrade all packages and containers in the Contrail Command VM.
- Log in to the Contrail Command web UI to verify that it was successfully installed. You access Contrail Command by entering https://<jumphost>:9091 in your browser.
Review the /var/lib/contrail_cloud/config/vault-data.yml for Contrail Command authentication details.
If an Upgrade Fails
If at any point your upgrade fails you will need to troubleshoot. Follow these basic steps for failure analysis:
Review the failure output and take screenshots. The screenshots will help others review your failure.
Review your configuration files. There could be mistakes in your YAML configuration files. Some common configuration errors include (but are not limited to): NIC setup, role assignment, network assignment, and networking related errors.
Gather information to help troubleshoot the problem. One common troubleshooting step is to retrieve the log from a failed node. You do this by connecting to the node over ssh and checking /var/log/messages. Use the following sequence of CLI commands:
- Log in to the jump host as the root user.
- su - contrail
- ssh undercloud
- source stackrc
- Run openstack stack failures list overcloud to identify any stack failures and the roles that are having issues. Nothing is returned in the CLI if there are no failures to report.
- nova list
- ssh <address>. Use the list generated by nova list to identify the node you need to ssh to.
- sudo vi /var/log/messages from within the selected node.
You must bring all services back to health for the failure to be considered corrected.
Restore the Pacemaker cluster that was stopped as a result of the failed step in the upgrade procedure (pcs cluster start on the controller nodes that have it stopped) to bring the cluster back to a healthy state.
Re-run the failed script only when the failure has been corrected and Pacemaker has been started with the cluster healthy again. Move forward with the upgrade procedure only after the failed playbook runs successfully.
You can safely move on to reboot your nodes if you received no failures during the upgrade process.
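When reviewing /var/log/messages on a failed node, filtering the log can be quicker than paging through it in an editor. A minimal sketch follows. The helper name recent_errors, the search pattern, and the default line limit are illustrative choices, not part of the documented procedure.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent lines in a log file that
# mention an error or a failure (case-insensitive).
#   $1  log file to scan (defaults to /var/log/messages)
#   $2  maximum number of matching lines to show (defaults to 50)
recent_errors() {
    log_file="${1:-/var/log/messages}"
    limit="${2:-50}"
    grep -iE 'error|fail' "$log_file" | tail -n "$limit"
}

# Usage on the failed node, as a user that can read the log:
#   recent_errors /var/log/messages 100
```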
Remove Duplicate vRouters
It is possible that duplicate instances of the vRouter might occur during the upgrade process, and it is necessary to remove these duplicates. Access the GUI at this point to identify and remove any duplicate vRouters before continuing with the upgrade process.
Reboot Your Nodes
A Contrail Cloud update introduces a new RHEL image and kernel. You need to reboot your compute nodes now if you chose to disable automatic reboots. You also need to reboot the control plane, control hosts, and storage at this time. Reboot your nodes as described in Node Reboot and Health Check.
Upgrade from Contrail Cloud Release 13.1 to 13.2
This is an in-place upgrade as defined by Red Hat. You will have to upgrade role-by-role and host-by-host to complete this upgrade. You must follow a reboot process after the upgrade.
There are no changes in the configuration YAML files between Contrail Cloud 13.1 and 13.2, so no configuration changes are needed for this upgrade. If configuration changes must be made for any reason, they must be applied to your existing Contrail Cloud 13.1 deployment before upgrading to Version 13.2. As a best practice, review your configuration files to make sure they adhere to the proper schema and meet the needs of your deployment environment.
Before You Upgrade
Take these initial steps before starting your Contrail Cloud Upgrade. This will help eliminate possible errors that might occur during the upgrade process and will help ensure expected results. The sections below are a prerequisite to the upgrade of your Contrail Cloud.
Review Your Configuration Files
At this point you want to review your current setup to ensure all configuration settings are accurate and reflect a desired deployment for your Contrail Cloud environment.
Review all the YAML files in the
/var/lib/contrail_cloud/config
directory and ensure all values match your expected results.
Verify Undercloud/Overcloud Health and Service Operations
It is vital that you always check the health of your cloud and the services running in your cloud before attempting any deployment or upgrade activities. You must ensure that the undercloud/overcloud is fully functional, healthy, and that all services are active. Any problems in your cloud health may cause errors during upgrading. Incorrect settings and configurations will carry over to the Contrail Cloud 13.2 deployment.
- Check the health of the undercloud, the overcloud, and the nodes running on them. To verify the health of your cloud and the services, see Node Reboot and Health Check and refer to the “Verify Quorum and Node Health” section in that document.
Back Up Your Undercloud and Overcloud
Make sure to back up your undercloud and overcloud before running the update script. For complete instructions to back up your cloud, see BACK UP AND RESTORE THE DIRECTOR UNDERCLOUD, Backing up the overcloud control plane services, and Backing up Contrail Databases in JSON Format.
Pause and Shutdown Business Services
You must pause or shut down external business services at this time to ensure a smooth upgrade and to prevent possible data loss or workload errors. These business services include anything that is outside of the Contrail Cloud deployment but interacts with Contrail Cloud as a whole. The steps to complete the tasks below depend on the specific business service or VM that is running. Consult the documentation for the specific service you need to pause or shut down.
Quiesce all external API requests, for example, Horizon.
Gracefully shutdown any vulnerable workloads.
You will want to consider migrating your services/VMs to a different cloud that is outside of the upgrade environment.
Start the Upgrade from Contrail Cloud Release 13.1 to 13.2
Time to upgrade to Contrail Cloud Release 13.2. Contrail Cloud 13.2 will deliver updated containers, RHEL image and kernel version that are associated with Release 13.2.
The procedure below will guide you through the update. There is a small disruption in service during the update. However, the update preserves existing overcloud configurations. For example: images, projects, networks, volumes, virtual machines, and so on.
Retrieve Adjusted Keys and Install
Follow these steps to start your upgrade:
- Send an e-mail message to contrail_cloud_subscriptions@juniper.net
to request Contrail Cloud 13.2. Provide the following information:
Include your current activation key in the email request. Your Contrail Cloud activation key will be adjusted to Version 13.2.
Specify the time and date you would like to upgrade your Contrail Cloud version. The Contrail Cloud team will prepare the activation for your maintenance window.
- Refresh your Contrail Cloud subscription on the jump host
server by running the contrail_cloud_installer.sh from
the jump host with the arguments:
./contrail_cloud_installer.sh \
  --satellite_host ${SATELLITE} \
  --satellite_key ${SATELLITE_KEY} \
  --satellite_org ${SATELLITE_ORG}
- Ensure that all overcloud nodes have valid subscription-manager registrations.
Upgrade to Contrail Cloud 13.2
The following procedure and scripts will upgrade your Contrail Cloud to Version 13.2.
As the “contrail” user (su - contrail from root), execute the following scripts on the jump host to perform the update:
- Upgrade the jump host and the undercloud VM.
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-undercloud.sh
- Prepare the overcloud for upgrade:
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-overcloud-step1.sh
- Perform the overcloud upgrade.
There are two different methods that can be used to complete this step. The different methods are listed below (choose one):
Default method. All nodes will upgrade in one run.
All roles are upgraded one by one.
Within each role the nodes are upgraded one by one.
Targeted method. You have the ability to target roles and even nodes to control the upgrade sequence.
You can set the desired upgrade targets in the config/site.yml file.
It is typical to upgrade all control plane roles together with this method.
Computes can be upgraded in small targeted groups.
If you encounter failures while running the contrail-cloud-upgrade-overcloud-step2.sh script, see If an Upgrade Fails in the sections below.
Note: The overcloud upgrade script contrail-cloud-upgrade-overcloud-step2.sh has a hard timeout of 4 hours, which may not be sufficient for complex deployments. Consider using targeted updates to allow for incremental role upgrades which can complete within that timeframe.
Default Method
To upgrade all the nodes using the default method, run the script below. This will upgrade all nodes in one run and require no additional steps. The update will apply to all roles one at a time and one node at a time within each role.
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-overcloud-step2.sh
Targeted Method
The procedure below allows you to target specific roles and nodes during the upgrade. This approach allows for control and predictability of the upgrade and subsequent compute node reboots. This method is desirable if you want to update the control plane roles at one time, and then target specific compute resources to be updated as workloads are migrated.
To complete a targeted update, edit your /config/site.yml for each targeted group and rerun the update script each time a change is made. This process can be rerun multiple times if necessary. You can use the name of a specific node, or the name of a specific role to upgrade. Remember to change your /config/site.yml with each update. Per-node control allows for planning around node reboot. It may be desirable to reboot compute nodes as they are updated to avoid disruption later. For how to reboot your compute nodes, see Node Reboot and Health Check and refer to the node reboot section.
Note: Compute nodes may automatically reboot as part of the upgrade process.
- You need to configure your /config/site.yml to reflect the nodes you want upgraded. The upgrade will be performed in the order you configure in the site.yml file. To start, you might configure it to look like this:
upgrade:
  nodes_list:
    - Controller
    - ContrailController
    - ContrailAnalytics
    - ContrailAnalyticsDatabase
    - AppformixController
    - CephStorage
- Run the update script after you have set your variables:
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-overcloud-step2.sh
- You can now edit your /config/site.yml to target the specific nodes you want to update (e.g. computes). Replace the role names with the node names you want to update. Below is an example targeting specific compute nodes to be upgraded:
upgrade:
  nodes_list:
    - overcloudc54-compkernel1hw0-0
    - overcloudc54-compdpdk0hw0-0
Run the update script after all variables have been set:
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-overcloud-step2.sh
You need to create a flag file to mark contrail-cloud-upgrade-overcloud-step2.sh as complete once all overcloud nodes have been upgraded. The flag file is required before running the next upgrade script. Run the following command:
ssh undercloud touch /home/stack/.run-contrail-containers-upgrade
- Converge the overcloud upgrade. The script below updates Ceph and converges the overcloud heat stack. Note that the overcloud['deployment_timeout'] value in config/site.yml can be increased to avoid timeouts in the Ceph upgrade.
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-overcloud-step3.sh
Move on to the next sections to upgrade your AppFormix and Contrail Command for Contrail Cloud 13.2.
Upgrade AppFormix
Upgrade to the latest version of AppFormix for use with Contrail Cloud 13.2.
- As the “contrail” user (su - contrail from root), execute the following script on the jump host to perform the update:
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-appformix.sh
- Verify the status of AppFormix.
Run the following command to view the status of AppFormix:
ansible -i /usr/bin/tripleo-ansible-inventory AppformixController -m shell -a "curl -s http://127.0.0.1:9000/appformix/controller/v2.0/status"
This will return a 200 on success. Any other code returned should be considered a failure. The API output also contains the AppFormix version. This is helpful to verify the correct version has been installed. See the sample below:
{ "Version": "2.19.10-65aa34f7ad", "DBVersion": "70" }
Upgrade Contrail Command
Upgrade to the latest version of Contrail Command for use with Contrail Cloud 13.2.
- As the “contrail” user (su - contrail from root), execute the following script on the jump host to perform the update:
/var/lib/contrail_cloud/scripts/contrail-cloud-upgrade-command.sh
- Log in to the Contrail Command web UI to verify that it was successfully installed. You access Contrail Command by entering https://<jumphost>:9091 in your browser.
Review the /var/lib/contrail_cloud/config/vault-data.yml for Contrail Command authentication details.
If an Upgrade Fails
If at any point your upgrade fails you will need to troubleshoot. Follow these basic steps for failure analysis:
Review the failure output and take screenshots. The screenshots will help others review your failure.
Review your configuration files. There could be mistakes in your YAML configuration files. Some common configuration errors include (but are not limited to): NIC setup, role assignment, and networking related errors.
Gather information to help troubleshoot the problem. One common troubleshooting step is to retrieve the log from a failed node. You do this by connecting to the node over ssh and checking /var/log/messages. Use the following sequence of CLI commands:
- Log in to the jump host as the root user.
- su - contrail
- ssh undercloud
- source stackrc
- Run openstack stack failures list overcloud to identify any stack failures and the roles that are having issues. Nothing is returned in the CLI if there are no failures to report.
- nova list
- ssh <address>. Use the list generated by nova list to identify the node you need to ssh to.
- sudo vi /var/log/messages from within the selected node.
You must bring all services back to health for the failure to be considered corrected.
Restore the Pacemaker cluster that was stopped as a result of the failed step in the upgrade procedure (pcs cluster start on the controller nodes that have it stopped) to bring the cluster back to a healthy state.
Re-run the failed script only when the failure has been corrected and Pacemaker has been started with the cluster healthy again. Move forward with the upgrade procedure only after the failed playbook runs successfully.
You can safely move on to reboot your nodes if you received no failures during the upgrade process.
Remove Duplicate vRouters
It is possible that duplicate instances of the vRouter might occur during the upgrade process, and it is necessary to remove these duplicates. Access the GUI at this point to identify and remove any duplicate vRouters before continuing with the upgrade process.
Reboot Your Nodes
Contrail Cloud 13.2 introduces a new RHEL image and kernel. You need to reboot the nodes as described in Node Reboot and Health Check.
Upgrade from Contrail Cloud Release 13.0.2 to 13.1
Contrail Cloud 13.1 does not support upgrade from earlier releases. You must redeploy using adjusted activation keys and retrieve new software packages from the Contrail Cloud Satellite.
- Send a request to contrail_cloud_subscriptions@juniper.net regarding the adjustment of your Contrail Cloud keys to Version 13.1.
- Redeploy Contrail Cloud using the adjusted activation
keys.
For more information, see Deploying Contrail Cloud.
Upgrade from Contrail Cloud Release 13.0.1 to 13.0.2
Upgrade to Contrail Cloud Release 13.0.2 to apply the updated containers that are delivered with Contrail Networking 5.0.2. This update restarts each instance of overcloud roles, one-by-one, so there is a small disruption in service during the update. However, the update preserves existing overcloud configurations. For example: images, projects, networks, volumes, virtual machines, and so on.
To update Contrail Cloud to 13.0.2:
- Ensure that the overcloud is fully functional and that all services are active.
- Review the config/site.yml.
Remove any overcloud.registry configuration.
Validate that the control host storage allocations use defined storage pools. If the defaults were not used then it might be necessary to adjust the control-host configuration.
- Review the config/overcloud-nics.yml, config/control-host-nodes.yml, and config/appformix-nodes.yml to rename all instances of ControlInterfaceDefaultRoute to ControlPlaneDefaultRoute.
- Send an e-mail message to contrail_cloud_subscriptions@juniper.net to coordinate the deployment activation key from Contrail Cloud 13.0.1 to Contrail Cloud 13.0.2. An update script cc-update.sh is then provided.
- Download the cc-update.sh script to /var/lib/contrail_cloud/scripts/cc-upgrade.sh on the jumphost. Make this file executable:
sudo chmod +x /var/lib/contrail_cloud/scripts/cc-upgrade.sh; sudo chown contrail /var/lib/contrail_cloud/scripts/cc-upgrade.sh
- As the “contrail” user, execute the following script on the jumphost to perform the update:
/var/lib/contrail_cloud/scripts/cc-upgrade.sh
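The rename of ControlInterfaceDefaultRoute to ControlPlaneDefaultRoute in the steps above can be done with a text substitution. The sketch below is one way to do it. The helper name rename_default_route is hypothetical. Each file is edited in place and the original is kept beside it with a .bak suffix.

```shell
#!/bin/sh
# Hypothetical helper: rename the parameter in every file given as an
# argument, keeping a .bak copy of each original next to the edited file.
rename_default_route() {
    for file in "$@"; do
        sed -i.bak 's/ControlInterfaceDefaultRoute/ControlPlaneDefaultRoute/g' "$file" || return 1
    done
}

# Usage from /var/lib/contrail_cloud:
#   rename_default_route config/overcloud-nics.yml \
#       config/control-host-nodes.yml config/appformix-nodes.yml
```

Compare each file with its .bak copy before continuing, and remove the copies once the result looks right.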
Workaround for DPDK Compute Nodes
The update script does not update the contrail-vrouter-agent-dpdk
container on the DPDK compute nodes.
Use the instructions below to update the Contrail Cloud 13.0.2 DPDK compute nodes:
- For each DPDK compute node, update /etc/sysconfig/network-scripts/network-functions-vrouter-dpdk-env to the following:
#!/bin/bash
CONTRAIL_VROUTER_AGENT_DPDK_DOCKER_IMAGE=192.0.2.1:8787/contrail-vrouter-agent-dpdk:5.0.2-0.360-rhel-queens-13.0.2
#CONTRAIL_VROUTER_AGENT_DPDK_DOCKER_IMAGE=192.0.2.1:8787/contrail-vrouter-agent-dpdk:5.0.1-0.214-rhel-queens
CONTRAIL_VROUTER_AGENT_CONTAINER_NAME=contrail-vrouter-agent
CONTRAIL_VROUTER_AGENT_DPDK_CONTAINER_NAME=contrail-vrouter-agent-dpdk
DPDK_UIO_DRIVER=uio_pci_generic
- Restart the vhost0 interface for the changes to take effect.
sudo ifdown vhost0
sudo ifup vhost0
Workaround for Kernel vRouter Compute Nodes
The update script does not update the contrail-vrouter-kernel-init
container on the kernel compute nodes.
Use the instructions below to update the Contrail Cloud 13.0.2 kernel vRouter compute nodes:
- For each kernel vRouter compute node, pull the latest
Docker image:
docker pull 192.0.2.1:8787/contrail-vrouter-kernel-init:5.0.2-0.360-rhel-queens-13.0.2
- Find the docker image ID:
docker images | grep ker | grep 360
192.0.2.1:8787/contrail-vrouter-kernel-init 5.0.2-0.360-rhel-queens-13.0.2 17c02b0e122d 4 weeks ago 1.6 GB
- Run the init container:
docker run -v /dev:/dev:rw -v /bin:/host/bin:rw -v /lib/modules:/lib/modules:rw -v /etc/sysconfig/network-scripts:/etc/sysconfig/network-scripts:rw 17c02b0e122d
- Restart the vRouter agent and vhost0 interface:
docker stop contrail_vrouter_agent
ifdown vhost0
ifup vhost0
docker start contrail_vrouter_agent
- Reboot to apply the updates:
sudo reboot 0
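The image ID used in the docker run step is read by eye from the docker images output in the procedure above. As a sketch, the ID can also be extracted by matching the repository and tag columns exactly, which avoids picking up an unrelated image that happens to match the grep patterns. The function name image_id_for is hypothetical.

```shell
#!/bin/sh
# Read `docker images` output on stdin and print the IMAGE ID (third
# column) of the row whose repository and tag match the two arguments.
image_id_for() {
    awk -v repo="$1" -v tag="$2" '$1 == repo && $2 == tag { print $3 }'
}

# Usage on a kernel vRouter compute node:
#   docker images | image_id_for \
#       192.0.2.1:8787/contrail-vrouter-kernel-init 5.0.2-0.360-rhel-queens-13.0.2
```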