
Known Behavior

This section lists known limitations with this release.

Known Behavior in Contrail Release 1912.L1

  • CEM-7424 In Contrail fabric deployments where an MX acts as the DC-GW, FIP for VMs hosted on OpenStack computes and for bare metal server workloads hosted on the datacenter fabric cannot be enabled at the same time.

  • CEM-11479 vhost0 loses its IP address because of a DHCP client timeout. On one of the gateways, the DHCP lease on the vhost0 interface is set to 1582 seconds and expires after that. The lease renewal fails and vhost0 loses its IP address.

    As a workaround, perform the following steps on Google Cloud gateways only.

    1. Log in to Google Cloud vrouter-gateway instances.

    2. Check the DHCP lease on the vhost0 interface using the following command:

      ip a | grep vhost0 -A3| grep valid_lft

    3. Check if the above command returns a forever value. For example:

      [root@g3312v1g1 vrouter]# ip a | grep vhost0 -A3| grep valid_lft

      valid_lft forever preferred_lft forever

    4. Ensure that the value of ‘valid_lft’ is not a short lease in the 1K range, for instance 1582 seconds or less.

      Note:

      This value is in seconds and decrements every second.

    5. Disable the ifup-vhost script by running the following command.

      chmod 400 /etc/sysconfig/network-scripts/ifup-vhost

    6. Reboot the instance by running the following command.

      reboot

    Once the instance is up, check the vhost0 lease by rerunning the command in step 2. The value of ‘valid_lft’ must be in the 157 million range.
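The lease check in steps 2 through 4 can be sketched as a small shell helper. This is a minimal sketch, not part of the product: the function name `classify_lease` and its output labels are hypothetical, and the 1582-second threshold is taken from the example in the issue description rather than from any documented limit.

```shell
#!/bin/sh
# Classify the valid_lft value found in one line of `ip a` output.
# Assumption: lifetimes appear as "forever" or as "<N>sec".
classify_lease() {
    # Extract the token that follows "valid_lft ".
    lft=$(printf '%s\n' "$1" | sed -n 's/.*valid_lft \([^ ]*\).*/\1/p')
    case "$lft" in
        forever)
            echo "ok" ;;
        *sec)
            secs=${lft%sec}
            # Reject anything that is not a plain non-negative integer.
            case "$secs" in
                ''|*[!0-9]*) echo "unknown"; return ;;
            esac
            # Threshold is illustrative, taken from the 1582-second example.
            if [ "$secs" -le 1582 ]; then
                echo "short"
            else
                echo "ok"
            fi ;;
        *)
            echo "unknown" ;;
    esac
}
```

On a gateway it could be called as `classify_lease "$(ip a | grep vhost0 -A3 | grep valid_lft)"`; a result of `short` suggests that the workaround has not taken effect.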

  • CEM-11411 Packet loss is seen on overlay pings with packet sizes of 1430 bytes and above, across all providers.

    • To support jumbo frames for underlay and overlay from on-premises to AWS, ensure that the on-premises Contrail cluster and the underlay IP fabric support jumbo frames.

    • For Google Cloud, ensure that you use the c2-standard-8 instance for the Contrail Multicloud GW and a minimum of c2-standard-4 for vrouterCNI+k8snode.

    • If you see any issues, check the MTU in each cloud and adjust it accordingly.
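When checking the MTU, it can help to probe a path with fragmentation prohibited. The sketch below only computes the payload size and prints a candidate command; it does not send traffic. The function names and the target address are hypothetical, and `-M do`, `-c`, and `-s` are options of the Linux iputils `ping`. The arithmetic covers a plain IPv4 packet only; overlay encapsulation adds further overhead that is not modeled here.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one IPv4 packet of a given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print (do not run) a ping command that probes with the DF bit set.
# $1 = MTU to test, $2 = target address (placeholder supplied by the caller).
probe_cmd() {
    mtu=$1
    target=$2
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") $target"
}
```

For example, `probe_cmd 1500 192.0.2.10` prints a command that tests whether a full 1500-byte packet crosses the path unfragmented.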

  • CEM-11338 Reconfiguring sFlow collectors fails after a fabric is deleted and added back. A well-planned cluster deployment with sufficient sFlow nodes provisioned at the beginning prevents this situation.

  • CEM-11163 On the Fortville X710 NIC, performance degradation is observed with TX and RX buffers as mbufs get exhausted.

  • CEM-11160 WebUI returns a stack trace when navigating to config_sc_svcInstances in a Juju-based installation.

  • CEM-10199 In public cloud deployments, after deleting the public cloud, the snapshots are left in the cloud. To clear them, the user has to log in to the respective cloud console (AWS/Azure/GCP) and deregister the AMIs and delete the snapshots from there.

  • CEM-9979 During upgrade of DPDK computes deployed with OOO Heat Templates in an RHOSP environment, vRouter coredumps are observed. This is due to the sequence in which the services are started during upgrade and has no impact on cluster operation.

  • CEM-9278 The sFlow stats for a BMS added after initial provisioning of a cluster are not displayed. As a workaround, to enable sFlow stats for a BMS added after initial provisioning, do the following:

    1. Add the host as Remote Host in AppFormix UI.

      Go to AppFormix Swagger API (Settings > API Documentation > Link to AppFormix Documentation).

      Click Show/Hide to get the API Details.

      Go to /Hosts POST API.

      Set X-Auth-Type as OpenStack and fill the X-Auth-Token with Keystone token. Specify the following in the body:

      Send POST request.

    2. Once a device is added in the UI, go to Settings > Network Devices. Select the Network Device which you want to add to BMS.

      Go to the Edit section, set LLD to Disabled, select SNMP, click Next, set the SNMP community string, and click Save.

      Go to Edit Connection Info > Continue, select the Network Device, add the Target Device as BMS, set the interface on the Network Device that is connected to this BMS, and click Save.

      Go to the Contrail Command UI, where the BMS stats can now be seen.

  • CEM-8701 While bringing up a BMS using the Life Cycle Management workflow, sometimes on faster servers the re-image does not go through and the instance is not moved from the ironic VN to the tenant VN. This happens when the PXE boot request from the BMS is sent before the routes have converged between the BMS port and the TFTP service running on the Contrail nodes. As a workaround, reboot the servers or configure the BIOS in the servers to have a delayed boot.

  • CEM-8149 BMS LCM with a fabric set with enterprise_style=True is not supported. By default, enterprise_style is set to False. Avoid using enterprise_style=True if the fabric object will onboard a BMS LCM instance.

  • CEM-7874 User-defined alarms might not be generated when the third stunnel/Redis service instance is down after the first two instances were restarted.

  • CEM-5788 Installation fails if FQDN is used to deploy Contrail Cluster through Contrail Command with OpenStack orchestration.

  • CEM-5284 Cloud compute/vRouter nodes are not listed on the cluster-nodes/compute node page; all nodes/computes are listed on the servers page.

  • CEM-5141 For deleting compute nodes, the UI workflow will not work. Instead, update the instances.yaml with “ENABLE_DESTROY: True” and “roles:” (leave it empty) and run the following playbooks.

    For example:

  • CEM-5043 A VNI update on an LR does not update the RouteTable. As a workaround, delete the LogicalRouter and create a new LogicalRouter with the new VNI.

  • CEM-5041 Provisioning of Region or VPC objects only on the cloud without any nodes is not supported. Add at least one node while provisioning a Region/VPC.

  • CEM-5024 Current multicloud provisioning does not enable the On-prem TOR to exchange public cloud subnets with the On-Prem controllers. The user needs to add static routes on the controllers to all the public cloud subnets.
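The static routes for CEM-5024 could be generated along the following lines. This is a sketch under stated assumptions: the controllers are Linux hosts with iproute2, the next hop and subnets are placeholders chosen by the operator, and the helper name `gen_static_routes` is hypothetical. The helper only prints commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print one `ip route add` command per public cloud subnet.
# Usage: gen_static_routes NEXT_HOP SUBNET [SUBNET...]
gen_static_routes() {
    next_hop=$1
    shift
    for subnet in "$@"; do
        echo "ip route add $subnet via $next_hop"
    done
}
```

For example, `gen_static_routes 192.0.2.1 10.10.0.0/16 10.20.0.0/16` prints two commands. Routes added with `ip route add` do not survive a reboot, so they would also need to be recorded in the host's persistent network configuration.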

  • CEM-4941 The multicloud gateway on the public cloud cannot be shared across different subnets. Each subnet must have its own gateway.

  • CEM-4865 Provisioning of Contrail Controllers on public cloud is not supported. Controllers need to be provisioned On-prem.

  • CEM-4467 On DPDK computes, VM creation sometimes fails with a "Connection is closed" error. The issue is not related to any of the Contrail components. It is related to the systemd-machined service registering VMs. As a workaround, restart the systemd-machined service.

  • CEM-4381 Contrail Fabric device manager tasks can fail if one or more Contrail API servers are down. Use contrail-status on the Contrail config nodes to determine whether this situation has occurred.
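A quick way to spot a down service in contrail-status output is to filter out the healthy lines. The sketch below assumes that the output contains `== section ==` headers and `name: state` lines, and that `active` and `backup` are the healthy states; both the format and the helper name `inactive_services` are assumptions to verify against the output of your release.

```shell
#!/bin/sh
# Read contrail-status style output on stdin and print every
# "name: state" line whose state is neither "active" nor "backup".
inactive_services() {
    awk '
        /^==/ { next }                      # skip section headers
        {
            i = index($0, ": ")
            if (i == 0) next                # not a name: state line
            name  = substr($0, 1, i - 1)
            state = substr($0, i + 2)
            if (state != "active" && state != "backup")
                print name ": " state
        }'
}
```

On a config node it could be used as `contrail-status | inactive_services`; empty output means that no service was reported in an unhealthy state.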

  • CEM-4370 After creating a PNF Service Instance, fields such as PNF eBGP ASN*, RP IP Address, PNF Left BGP Peer ASN*, Left Service VLAN*, PNF Right BGP Peer ASN*, and Right Service VLAN* cannot be modified. To modify these values, delete and re-create the Service Instance with the intended values.

  • CEM-3959 BMS movement across TORs is not supported. To move a BMS across TORs, the whole VPG needs to be moved. That is, if more than one BMS is associated with a VPG and one of them needs to be moved, the whole VPG needs to be deleted and reconfigured as per the new association.

  • CEM-3324 Users cannot provision a Contrail Cluster entirely in the public cloud. The Contrail Cluster needs to be On-Prem, and vRouters can be extended to the public cloud.

  • JCB-204796 In a Helm-based provisioned cluster, VM launch fails if MariaDB replication is set to >1.

  • JCB-202874 After deleting a vRouter chart with DPDK, the NICs do not rebind to the host in Helm.

  • JCB-190956 While creating ironic-provision, the service address in the subnet must point to the OpenStack ironic node IP/kolla internal VIP.

  • JCB-187320 On a DPDK compute, vif --list --rate core-dumps when traffic is present.

  • JCB-187287 High Availability provisioning of Kubernetes master is not supported.

  • JCB-186493 When a snapshot of an active VM fails, shut down the VM before generating the snapshot.

  • JCB-184837 After provisioning Contrail on a Helm-based provisioned cluster, restart the nova-compute container.

  • JCB-184776 When the vRouter receives the head fragment of an ICMPv6 packet, the head fragment is immediately enqueued to the assembler. The flow is created as hold flow and then trapped to the agent. If fragments corresponding to this head fragment are already in the assembler or if new fragments arrive immediately after the head fragment, the assembler releases them to flow module. Fragments get enqueued in the hold queue if agent does not write flow action by the time the assembler releases fragments to the flow module. A maximum of three fragments are enqueued in the hold queue at a time. The remaining fragments are dropped from the assembler to the flow module.

    As a workaround, the head fragment is enqueued to assembler only after flow action is written by agent. If the flow is already present in non-hold state, it is immediately enqueued to assembler.

  • JCB-177787 In DPDK vRouter use cases such as SNAT and LBaaS that require netns, jumbo MTU cannot be set. Maximum MTU allowed: <=1500.

  • JCB-177541 If you receive an error message during Kolla provisioning, rerunning the code does not work. For the provisioning to work, restart it from scratch.

  • JCB-171466 Metadata SSL works only in HA deployment mode.

  • JCB-163773 A false alarm for config service is generated when config and configdb services are installed on different nodes. Ignore the false alarm.

  • JCB-162927 SR-IOV with DPDK co-existence deployment is not supported using contrail-helm-deployer.

Known Behavior in Contrail Release 1912

  • CEM-11479 vhost0 loses its IP address because of a DHCP client timeout. On one of the gateways, the DHCP lease on the vhost0 interface is set to 1582 seconds and expires after that. The lease renewal fails and vhost0 loses its IP address.

    As a workaround, perform the following steps on Google Cloud gateways only.

    1. Log in to Google Cloud vrouter-gateway instances.

    2. Check the DHCP lease on the vhost0 interface using the following command:

      ip a | grep vhost0 -A3| grep valid_lft

    3. Check if the above command returns a forever value. For example:

      [root@g3312v1g1 vrouter]# ip a | grep vhost0 -A3| grep valid_lft

      valid_lft forever preferred_lft forever

    4. Ensure that the value of ‘valid_lft’ is not a short lease in the 1K range, for instance 1582 seconds or less.

      Note:

      This value is in seconds and decrements every second.

    5. Disable the ifup-vhost script by running the following command.

      chmod 400 /etc/sysconfig/network-scripts/ifup-vhost

    6. Reboot the instance by running the following command.

      reboot

    Once the instance is up, check the vhost0 lease by rerunning the command in step 2. The value of ‘valid_lft’ must be in the 157 million range.

  • CEM-11411 Packet loss is seen on overlay pings with packet sizes of 1430 bytes and above, across all providers.

    • To support jumbo frames for underlay and overlay from on-premises to AWS, ensure that the on-premises Contrail cluster and the underlay IP fabric support jumbo frames.

    • For Google Cloud, ensure that you use the c2-standard-8 instance for the Contrail Multicloud GW and a minimum of c2-standard-4 for vrouterCNI+k8snode.

    • If you see any issues, check the MTU in each cloud and adjust it accordingly.

  • CEM-11338 Reconfiguring sFlow collectors fails after a fabric is deleted and added back. A well-planned cluster deployment with sufficient sFlow nodes provisioned at the beginning prevents this situation.

  • CEM-11163 On the Fortville X710 NIC, performance degradation is observed with TX and RX buffers as mbufs get exhausted.

  • CEM-11160 WebUI returns a stack trace when navigating to config_sc_svcInstances in a Juju-based installation.

  • CEM-10199 In public cloud deployments, after deleting the public cloud, the snapshots are left in the cloud. To clear them, the user has to log in to the respective cloud console (AWS/Azure/GCP) and deregister the AMIs and delete the snapshots from there.

  • CEM-9979 During upgrade of DPDK computes deployed with OOO Heat Templates in an RHOSP environment, vRouter coredumps are observed. This is due to the sequence in which the services are started during upgrade and has no impact on cluster operation.

  • CEM-9278 The sFlow stats for a BMS added after initial provisioning of a cluster are not displayed. As a workaround, to enable sFlow stats for a BMS added after initial provisioning, do the following:

    1. Add the host as Remote Host in AppFormix UI.

      Go to AppFormix Swagger API (Settings > API Documentation > Link to AppFormix Documentation).

      Click Show/Hide to get the API Details.

      Go to /Hosts POST API.

      Set X-Auth-Type as OpenStack and fill the X-Auth-Token with Keystone token. Specify the following in the body:

      Send POST request.

    2. Once a device is added in the UI, go to Settings > Network Devices. Select the Network Device which you want to add to BMS.

      Go to the Edit section, set LLD to Disabled, select SNMP, click Next, set the SNMP community string, and click Save.

      Go to Edit Connection Info > Continue, select the Network Device, add the Target Device as BMS, set the interface on the Network Device that is connected to this BMS, and click Save.

      Go to the Contrail Command UI, where the BMS stats can now be seen.

  • CEM-8701 While bringing up a BMS using the Life Cycle Management workflow, sometimes on faster servers the re-image does not go through and the instance is not moved from the ironic VN to the tenant VN. This happens when the PXE boot request from the BMS is sent before the routes have converged between the BMS port and the TFTP service running on the Contrail nodes. As a workaround, reboot the servers or configure the BIOS in the servers to have a delayed boot.

  • CEM-8149 BMS LCM with a fabric set with enterprise_style=True is not supported. By default, enterprise_style is set to False. Avoid using enterprise_style=True if the fabric object will onboard a BMS LCM instance.

  • CEM-7874 User-defined alarms might not be generated when the third stunnel/Redis service instance is down after the first two instances were restarted.

  • CEM-5788 Installation fails if FQDN is used to deploy Contrail Cluster through Contrail Command with OpenStack orchestration.

  • CEM-5284 Cloud compute/vRouter nodes are not listed on the cluster-nodes/compute node page; all nodes/computes are listed on the servers page.

  • CEM-5141 For deleting compute nodes, the UI workflow will not work. Instead, update the instances.yaml with “ENABLE_DESTROY: True” and “roles:” (leave it empty) and run the following playbooks.

    For example:

  • CEM-5043 A VNI update on an LR does not update the RouteTable. As a workaround, delete the LogicalRouter and create a new LogicalRouter with the new VNI.

  • CEM-5041 Provisioning of Region or VPC objects only on the cloud without any nodes is not supported. Add at least one node while provisioning a Region/VPC.

  • CEM-5024 Current multicloud provisioning does not enable the On-prem TOR to exchange public cloud subnets with the On-Prem controllers. The user needs to add static routes on the controllers to all the public cloud subnets.

  • CEM-4941 The multicloud gateway on the public cloud cannot be shared across different subnets. Each subnet must have its own gateway.

  • CEM-4865 Provisioning of Contrail Controllers on public cloud is not supported. Controllers need to be provisioned On-prem.

  • CEM-4467 On DPDK computes, VM creation sometimes fails with a "Connection is closed" error. The issue is not related to any of the Contrail components. It is related to the systemd-machined service registering VMs. As a workaround, restart the systemd-machined service.

  • CEM-4381 Contrail Fabric device manager tasks can fail if one or more Contrail API servers are down. Use contrail-status on the Contrail config nodes to determine whether this situation has occurred.

  • CEM-4370 After creating a PNF Service Instance, fields such as PNF eBGP ASN*, RP IP Address, PNF Left BGP Peer ASN*, Left Service VLAN*, PNF Right BGP Peer ASN*, and Right Service VLAN* cannot be modified. To modify these values, delete and re-create the Service Instance with the intended values.

  • CEM-3959 BMS movement across TORs is not supported. To move a BMS across TORs, the whole VPG needs to be moved. That is, if more than one BMS is associated with a VPG and one of them needs to be moved, the whole VPG needs to be deleted and reconfigured as per the new association.

  • CEM-3324 Users cannot provision a Contrail Cluster entirely in the public cloud. The Contrail Cluster needs to be On-Prem, and vRouters can be extended to the public cloud.

  • JCB-204796 In a Helm-based provisioned cluster, VM launch fails if MariaDB replication is set to >1.

  • JCB-202874 After deleting a vRouter chart with DPDK, the NICs do not rebind to the host in Helm.

  • JCB-190956 While creating ironic-provision, the service address in the subnet must point to the OpenStack ironic node IP/kolla internal VIP.

  • JCB-187320 On a DPDK compute, vif --list --rate core-dumps when traffic is present.

  • JCB-187287 High Availability provisioning of Kubernetes master is not supported.

  • JCB-186493 When a snapshot of an active VM fails, shut down the VM before generating the snapshot.

  • JCB-184837 After provisioning Contrail on a Helm-based provisioned cluster, restart the nova-compute container.

  • JCB-184776 When the vRouter receives the head fragment of an ICMPv6 packet, the head fragment is immediately enqueued to the assembler. The flow is created as hold flow and then trapped to the agent. If fragments corresponding to this head fragment are already in the assembler or if new fragments arrive immediately after the head fragment, the assembler releases them to flow module. Fragments get enqueued in the hold queue if agent does not write flow action by the time the assembler releases fragments to the flow module. A maximum of three fragments are enqueued in the hold queue at a time. The remaining fragments are dropped from the assembler to the flow module.

    As a workaround, the head fragment is enqueued to assembler only after flow action is written by agent. If the flow is already present in non-hold state, it is immediately enqueued to assembler.

  • JCB-177787 In DPDK vRouter use cases such as SNAT and LBaaS that require netns, jumbo MTU cannot be set. Maximum MTU allowed: <=1500.

  • JCB-177541 If you receive an error message during Kolla provisioning, rerunning the code does not work. For the provisioning to work, restart it from scratch.

  • JCB-171466 Metadata SSL works only in HA deployment mode.

  • JCB-163773 A false alarm for config service is generated when config and configdb services are installed on different nodes. Ignore the false alarm.

  • JCB-162927 SR-IOV with DPDK co-existence deployment is not supported using contrail-helm-deployer.