Known Issues

AWS Spoke

  • The AWS device activation process takes up to 30 minutes. If the process does not complete within 30 minutes, a timeout might occur and you must retry the process. You do not need to download the CloudFormation template again.

    To retry the process:

    1. Log in to Customer Portal.
    2. Access the Activate Device page, enter the activation code, and click Next.
    3. After the CREATE_COMPLETE message is displayed on the AWS server, click Next on the Activate Device page to proceed with device activation.

    Bug Tracking Number: CXU-19102.

  • For an AWS spoke, during the activation process, the device status on the Activate Device page is displayed as Detected even though the device is down.

    Workaround: None.

    Bug Tracking Number: CXU-19779.

CSO HA

  • In a CSO HA environment, two RabbitMQ nodes are clustered together, but the third RabbitMQ node does not join the cluster. This might occur just after the initial installation, if a virtual machine reboots, or if a virtual machine is powered off and then powered on.

    Workaround: Do the following:

    1. Log in to the RabbitMQ dashboard for the central microservices VM (http://central-microservices-vip:15672) and the regional microservices VM (http://regional-microservices-vip:15672).
    2. Check the RabbitMQ overview in the dashboards to see if all the available infrastructure nodes are present in the cluster.
    3. If an infrastructure node is not present in the cluster, do the following:
      1. Log in to the VM of that infrastructure node.
      2. Open a shell prompt and execute the following commands sequentially:

        rabbitmqctl stop_app

        service rabbitmq-server stop

        rm -rf /var/lib/rabbitmq/mnesia/

        service rabbitmq-server start

        rabbitmqctl start_app

    4. In the RabbitMQ dashboards for the central and regional microservices VMs, confirm that all the available infrastructure nodes are present in the cluster.
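
    To confirm cluster membership from the command line, you can also run the following on any infrastructure VM (a minimal check; run it as a user with RabbitMQ administrative privileges):

      rabbitmqctl cluster_status

    The running_nodes list in the output should include all the available infrastructure nodes.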

    Bug Tracking Number: CXU-12107.

  • In an HA setup, the time configured for the CAN VMs might not be synchronized with the time configured for the other VMs in the setup. This can cause issues in the throughput graphs.

    Workaround:

    1. Log in to can-vm1 as the root user.
    2. Modify the /etc/ntp.conf file to point to the desired NTP server.
    3. Restart the NTP process.

    After the NTP process restarts successfully, can-vm2 and can-vm3 automatically resynchronize their times with can-vm1.
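
    A minimal sketch of steps 2 and 3 on can-vm1 (the NTP server name is illustrative, and the service name might be ntp or ntpd depending on the operating system):

      echo "server ntp.example.net iburst" >> /etc/ntp.conf
      service ntp restart
      ntpq -p

    After a few polling intervals, the ntpq -p output should show the configured server with a nonzero reach value.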

    Bug Tracking Number: CXU-15681.

  • In some cases, when the power fails, the ArangoDB cluster does not form.

    Workaround:

    1. Log in to the centralinfravm3 VM.
    2. Execute the service arangodb3.cluster stop command.
    3. Log in to the centralinfravm2 VM.
    4. Execute the service arangodb3.cluster stop command.
    5. Log in to the centralinfravm1 VM.
    6. Execute the service arangodb3.cluster stop command.
    7. On the centralinfravm1 VM, execute the service arangodb3.cluster start command and wait for 20 seconds for the command to finish executing.
    8. On the centralinfravm2 VM, execute the service arangodb3.cluster start command and wait for 20 seconds for the command to finish executing.
    9. On the centralinfravm3 VM, execute the service arangodb3.cluster start command and wait for 20 seconds for the command to finish executing.
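
    After the restarts, you can confirm that the ArangoDB processes are up and their ports are bound on each central infrastructure VM (a minimal check):

      ps -ef | grep arangod | grep -v grep
      netstat -tuplen | grep arangod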

    Bug Tracking Number: CXU-20346.

  • When an HA setup comes back up after a power outage, MariaDB instances do not come back up on the VMs.

    Workaround:

    You can recover the MariaDB instances by executing the recovery.sh script (packaged with the CSO installation package):

    1. Log in to the installer VM.
    2. Navigate to the current deployment directory for CSO; for example, /root/Contrail_Service_Orchestration_3.3/.
    3. Execute the ./recovery.sh command and follow the instructions.
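
    After the script completes, you can confirm that MariaDB is running by executing the following command on each infrastructure VM (a minimal check):

      service mysql status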

    Bug Tracking Number: CXU-20260.

  • In an HA setup, the operation to roll back to CSO Release 3.2.1 might fail because, during the health check, the Redis component is reported as unhealthy.

    Workaround:

    1. Log in to the infrastructure VM on which the Redis component was reported as unhealthy.
    2. Execute the redis-trib.rb check infra-vm-ip:6379 command, where infra-vm-ip is the IP address of the infrastructure VM.

      A sample command and output are shown below:

      root@csp-central-infravm3:~# redis-trib.rb check 192.0.2.12:6379
      >>> Performing Cluster Check (using node 192.0.2.12:6379)
      M: f197f6be870b9a84c075d967ab23ce1657c2b864 192.0.2.12:6379
         slots:0-5460 (5461 slots) master
         1 additional replica(s)
      S: 2d9f94ea80e7c17b1b13c38213ae20f57157b9be 192.0.2.10:6380
         slots: (0 slots) slave
         replicates 38f840d6acb17808d85a79b22e22fdbd2642535b
      M: 38f840d6acb17808d85a79b22e22fdbd2642535b 192.0.2.11:6379
         slots:10923-16383 (5461 slots) master
         1 additional replica(s)
      M: 94e2b955552ff162690d7b8c41b44dad7aa904cb 192.0.2.10:6379
         slots:5461-10922 (5462 slots) master
         1 additional replica(s)
      S: 0328dc8a9bc4be9411d9d01adc1ce3d15b106b93 192.0.2.11:6380
         slots: (0 slots) slave
         replicates f197f6be870b9a84c075d967ab23ce1657c2b864
      S: 0fb89166609886e7093efafbf9713759def4b066 192.0.2.12:6380
         slots: (0 slots) slave
         replicates 94e2b955552ff162690d7b8c41b44dad7aa904cb
      [ERR] Nodes don't agree about configuration!
      >>> Check for open slots...
      [WARNING] Node 192.0.2.12:6379 has slots in importing state (6256,9093,9259).
      [WARNING] The following slots are open: 6256,9093,9259
      >>> Check slots coverage...
      [OK] All 16384 slots covered.
      
    3. Check the output of the command (for the keyword [ERR]) to find out which nodes are not in sync with the cluster, and note down the IP addresses of those nodes.
    4. For each node that is not in sync with the cluster:
      1. Execute the redis-trib.rb fix out-of-sync-node-IP:6379 command, where out-of-sync-node-IP is the IP address of the node that is not in sync with the cluster.
      2. Execute the following commands:
        service redis-master stop
        service redis-slave stop
        service redis-master start
        service redis-slave start
    5. Check the status of the cluster by executing the redis-trib.rb check out-of-sync-node-IP:6379 command.

      If the cluster configuration is fine, the message [OK] All nodes agree about slots configuration is displayed in the output. A sample command and output are shown below.

      root@csp-central-infravm3:~# redis-trib.rb check 192.0.2.12:6379
      >>> Performing Cluster Check (using node 192.0.2.12:6379)
      S: f197f6be870b9a84c075d967ab23ce1657c2b864 192.0.2.12:6379
         slots: (0 slots) slave
         replicates 0328dc8a9bc4be9411d9d01adc1ce3d15b106b93
      M: 94e2b955552ff162690d7b8c41b44dad7aa904cb 192.0.2.10:6379
         slots:5461-10922 (5462 slots) master
         1 additional replica(s)
      M: 38f840d6acb17808d85a79b22e22fdbd2642535b 192.0.2.11:6379
         slots:10923-16383 (5461 slots) master
         1 additional replica(s)
      S: 2d9f94ea80e7c17b1b13c38213ae20f57157b9be 192.0.2.10:6380
         slots: (0 slots) slave
         replicates 38f840d6acb17808d85a79b22e22fdbd2642535b
      M: 0328dc8a9bc4be9411d9d01adc1ce3d15b106b93 192.0.2.11:6380
         slots:0-5460 (5461 slots) master
         1 additional replica(s)
      S: 0fb89166609886e7093efafbf9713759def4b066 192.0.2.12:6380
         slots: (0 slots) slave
         replicates 94e2b955552ff162690d7b8c41b44dad7aa904cb
      [OK] All nodes agree about slots configuration.
      >>> Check for open slots...
      >>> Check slots coverage...
      [OK] All 16384 slots covered. 

    Bug Tracking Number: CXU-21302.

  • In an HA setup, if you shut down all the CSO servers, MariaDB and ArangoDB fail to form their respective clusters after the servers are restarted successfully.

    Workaround:

    1. Perform a clean reboot of the central infrastructure VMs.
    2. After the VMs have rebooted successfully, check the cluster health from the HAProxy page (http://central-ip-address:1936, where central-ip-address is the IP address of the VM that hosts the microservices for the central POP).
    3. If the MariaDB and ArangoDB clusters are still down, you can recover the clusters as follows:
      • To recover the MariaDB cluster, do the following:

        1. On the centralinfravm1 VM, execute the service mysql stop command.
        2. On the centralinfravm2 VM, execute the service mysql stop command.
        3. On the centralinfravm3 VM, execute the service mysql stop command.
        4. On all three central infrastructure VMs, verify that the service has stopped by executing the service mysql status command.
        5. On the centralinfravm1 VM, start the service by executing the service mysql start command.
        6. On the centralinfravm2 VM, start the service by executing the service mysql start command.
        7. On the centralinfravm3 VM, start the service by executing the service mysql start command.
        8. On all three central infrastructure VMs, verify that the service has started by executing the service mysql status command.
      • To recover the ArangoDB cluster, do the following:

        1. On the centralinfravm1 VM, execute the service arangodb3.cluster stop command.
        2. On the centralinfravm2 VM, execute the service arangodb3.cluster stop command.
        3. On the centralinfravm3 VM, execute the service arangodb3.cluster stop command.
        4. On all three central infrastructure VMs, verify that the service has stopped by executing the ps -aef|grep arangodb command.
        5. On the centralinfravm1 VM, start the service by executing the service arangodb3.cluster start command.
        6. On the centralinfravm2 VM, start the service by executing the service arangodb3.cluster start command.
        7. On the centralinfravm3 VM, start the service by executing the service arangodb3.cluster start command.
        8. On all three central infrastructure VMs, verify that the service has started by executing the ps -aef|grep arangodb command.
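
      A compact sketch of the MariaDB sequence above, run from a host that can reach the three central infrastructure VMs over SSH (direct root SSH access is an assumption; the ArangoDB sequence is analogous, using service arangodb3.cluster instead of service mysql):

        for vm in centralinfravm1 centralinfravm2 centralinfravm3; do ssh root@$vm 'service mysql stop'; done
        for vm in centralinfravm1 centralinfravm2 centralinfravm3; do ssh root@$vm 'service mysql status'; done
        for vm in centralinfravm1 centralinfravm2 centralinfravm3; do ssh root@$vm 'service mysql start'; done
        for vm in centralinfravm1 centralinfravm2 centralinfravm3; do ssh root@$vm 'service mysql status'; done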

    Bug Tracking Number: CXU-21819.

  • The connection of the Celery worker to RabbitMQ might be lost in some cases, such as a shutdown of an infrastructure VM, poor network connectivity, or high load on the infrastructure VMs. After the broken connection is detected, Celery is restarted. However, if an exception occurs during the restart, the Celery worker loses the connection to RabbitMQ.

    Workaround: To manually restart the connection between the Celery worker and RabbitMQ, do the following:

    1. Access the RabbitMQ management page and check the consumers count for related celery queues.
    2. If the consumers count is zero, restart the pods of the related core microservices on the central and regional microservices VMs (a sketch of one way to do this follows these steps).
    3. Wait for approximately five minutes for the pods to restart.

      The connection of the Celery worker to RabbitMQ is then established.

    4. (Optional) To confirm that the connection is established, verify that the consumers count for the related celery queues is not zero.
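
    One way to restart a pod of a related core microservice is to delete it so that Kubernetes re-creates it (a minimal sketch with placeholder names; this assumes the pod is managed by a controller that re-creates it automatically):

      kubectl get pods | grep <related-microservice>
      kubectl delete pod <pod-name>
      kubectl get pods | grep <related-microservice>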

    Bug Tracking Number: CXU-21823.

  • In an HA setup, if you onboard devices and deploy policies on the devices, and one of the policy deployments is in progress when a microservices or infrastructure node goes down, the deployment job is stuck in the In Progress state for about 90 minutes, which is the default timeout. During this time, you cannot perform deploy operations for the tenant.

    Workaround: Wait for the job to fail and then redeploy the policy.

    Bug Tracking Number: CXU-21922.

SD-WAN

  • The LTE link can only be a backup link. Therefore, SLA metrics are not applicable to it, and the default values of zero that might be displayed on the Application SLA Performance page can be ignored.

    Workaround: None.

    Bug Tracking Number: CXU-19943.

  • In an SRX Series dual CPE site, when the application traffic takes the Z-mode path, the application throughput reported in the Administration Portal GUI is lower than the actual data throughput.

    Workaround: None.

    Bug Tracking Number: PR1347723.

  • If all the active links, including OAM connectivity to CSO, are down and the LTE link is used for traffic, and if the DHCP addresses change to a new subnet, the traffic is dropped because CSO is unable to reconfigure the device.

    Workaround: None.

    Bug Tracking Number: CXU-19080.

  • On the Site SLA Performance page, applications with different SLA scores are plotted at the same coordinate on the x-axis.

    Workaround: None.

    Bug Tracking Number: CXU-19768.

  • When all local breakout links are down, site-to-Internet traffic fails even though there is an active overlay to the hub.

    Workaround: None.

    Bug Tracking Number: CXU-19807.

  • When the CPE device is not able to reach CSO, DHCP address changes on WAN interfaces might not be detected and reconfigured.

    Workaround: None.

    Bug Tracking Number: CXU-19856.

  • When the OAM link is down, the communication between the CPE devices and CSO does not work even though CSO can be reached over other WAN links. There is no impact to the traffic.

    Workaround: None.

    Bug Tracking Number: CXU-19881.

  • In the bandwidth-optimized SD-WAN mode, when the same SLA is used in the SD-WAN policy for different departments and an SLA violation occurs, two link switch events are displayed that appear identical because the department name is missing from the event details.

    Workaround: None.

    Bug Tracking Number: CXU-20529.

  • In a Cloud hub multihoming topology, after a link switch, the GRE tunnel links on the secondary hub might be displayed as red in the CSO GUI, even though the GRE tunnels are up.

    Workaround: Wait for approximately 10 minutes; the links are then displayed as green, indicating that the GRE tunnels are up.

    Bug Tracking Number: CXU-20550.

  • On the SD-WAN Events page, for link switch events, if you mouse over the Reason field, the values displayed for the SLA metrics are the ones that are recorded when the system logs are sent from the device and not the values for which the SLA violation was detected.

    Workaround: None.

    Bug Tracking Number: CXU-21461.

  • In a tenant with Real time-optimized SD-WAN, the duration of the link switch violation (on the WAN tab of the Site-Name page) might be displayed incorrectly.

    Workaround: None.

    Bug Tracking Number: CXU-21590.

  • On the SD-WAN Policy page, when you click the Up or Down arrow to reorder a policy intent, the policy intent moves to the top of the list instead of moving one position up or down, respectively. In addition, when you click Deploy, the changes are not deployed to the device.

    Workaround:

    • There is no workaround for reordering the policy intent.

    • To deploy the policy, modify at least one policy intent and redeploy the policy.

    Bug Tracking Number: CXU-20861.

  • If you try to delete the SLA profile associated with the SD-WAN policy immediately after deleting the SD-WAN policy, an error message might be displayed and the SLA profile is not deleted.

    Workaround: Wait for approximately three minutes after deleting the SD-WAN policy, and then trigger the deletion of the associated SLA profile.

    Bug Tracking Number: CXU-22168.

Security Management

  • On the Active Database page in Customer Portal, the wrong installed device count is displayed. The count displayed is for all tenants and not for a specific tenant.

    Workaround: None.

    Bug Tracking Number: CXU-20531.

  • If you restart a central microservices VM or the csp.secmgt-sm Kubernetes pod on a central microservices VM when the deployment of a firewall policy or NAT policy is in progress, the deployment job fails.

    In addition, after the restart is completed, if you modify the firewall or NAT policy, the changes fail to deploy.

    Workaround: After the restart of the central microservices VM or the csp.secmgt-sm Kubernetes pod is completed and the pod is Up, do the following:

    1. Redeploy the firewall or NAT policy that failed without modifying the policy, shared objects, or departments.
    2. After the deployment is successful, modify the firewall or NAT policy, shared objects, or departments.
    3. Redeploy the firewall or NAT policy. The deployment now goes through successfully.

    Bug Tracking Number: CXU-21106.

  • A user with the Tenant Administrator role cannot install application signatures on the devices belonging to a tenant.

    Workaround: A user with the MSP Administrator role can install application signatures on the devices for a tenant.

    Bug Tracking Number: CXU-22064.

Site and Tenant Workflow

  • The tenant delete operation fails when CSO is installed with an external Keystone.

    Workaround: You must manually delete the tenant from the Contrail OpenStack user interface.

    Bug Tracking Number: CXU-9070.

  • For tenants with a large number of spoke sites, the tenant deletion job fails because of token expiry.

    Workaround: Retry the tenant delete operation.

    Bug Tracking Number: CXU-19990.

  • In some cases, if automatic license installation is enabled in the device profile, the license might not be installed on the CPE device after ZTP is complete, even though the license key is configured successfully.

    Workaround: Reinstall the license on the CPE device by using the Licenses page on the Administration Portal.

    Bug Tracking Number: PR1350302.

  • LAN segments with overlapping IP prefixes are not supported across tenants or sites.

    Workaround: Create LAN segments with unique IP prefixes across tenants and sites.

    Bug Tracking Number: CXU-20347.

  • On the Monitor > Overview page, if you select a cloud hub site and access the WAN tab, an error message is displayed.

    Workaround: None.

    Bug Tracking Number: CXU-20353.

  • When the primary and backup interfaces of the CPE device use the same WAN interface of the hub, the backup underlay might be used for Internet or site-to-site traffic even though the primary links are available.

    Workaround: Ensure that you connect the WAN links of each CPE device to unique WAN links of the hub.

    Bug Tracking Number: CXU-20564.

  • After you configure a site, you cannot modify the configuration either before or after activation.

    Workaround: None.

    Bug Tracking Number: CXU-21165.

  • If you initiate RMA on an NFX Series device that was successfully onboarded and provisioned with stage-2 templates, the device RMA might get stuck in the device activation stage if the stage-2 configuration templates have inter-dependencies.

    Workaround: Ensure that the stage-2 templates that are deployed on the device do not have inter-dependencies before initiating the device RMA workflow.

    Bug Tracking Number: CXU-21464.

  • On the Monitor > Overview page, if you click a site for which a major alarm was triggered (the site icon turns orange) and, in the subsequent pop-up, click the link for major alarms in the Alerts & Alarms section, the Alarms page opens. However, no alarm for the device is displayed.

    Workaround: None.

    Bug Tracking Number: CXU-21828.

General

  • If you create VNF instances in the Contrail cloud by using Heat Version 2.0 APIs, a timeout error occurs after 120 instances are created.

    Workaround: Contact Juniper Networks Technical Support.

    Bug Tracking Number: CXU-15033.

  • When you upgrade the gateway router by using the CSO GUI, after the upgrade completes and the gateway router reboots, the gateway router configuration reverts to the base configuration and loses the IPsec configuration added during Zero Touch Provisioning (ZTP).

    Workaround: Before you upgrade the gateway router by using the CSO GUI, ensure that you do the following:

    1. Log in to the Juniper Device Manager (JDM) CLI of the NFX Series device.
    2. Execute the virsh list command to obtain the name of the gateway router (GWR_NAME).
    3. Execute the request virtual-network-functions GWR_NAME restart command, where GWR_NAME is the name of the gateway router obtained in the preceding step (an example follows this procedure).
    4. Wait a few minutes for the gateway router to come back up.
    5. Log out of the JDM CLI.
    6. Proceed with the upgrade of the gateway router by using the CSO GUI.
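
    For example, if the virsh list output shows a gateway router named vsrx-gwr (an illustrative name), the restart command in step 3 would be:

      request virtual-network-functions vsrx-gwr restart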

    Bug Tracking Number: CXU-11823.

  • CSO might not come up after a power failure.

    Workaround:

    1. Log in to the installer VM.
    2. Navigate to the /root/Contrail_Service_Orchestration_3.3/ directory.
    3. Run the reinitialize_pods.py script as follows:
      ./python.sh recovery/components/reinitialize_pods.py
    4. SSH to the VRR by using the VRR IP address to check if you are able to access the VRR.

      If there is an error in connecting (port 22: Connection refused), you must recover the VRR by following steps 5 through 19.

    5. Log in to the physical server hosting the VRR.
    6. Execute the virsh destroy vrr command to destroy the VRR.

      Warning: Do not execute the virsh undefine vrr command because doing so will cause the VRR configuration to be lost and the configuration cannot be recovered.

    7. Delete the VRR image located at /root/ubuntu_vm/vrr/vrr-15.1R6.7.qcow2.
    8. Copy the fresh VRR image from /root/disks/vrr-15.1R6.7.qcow2 to /root/ubuntu_vm/vrr/vrr-15.1R6.7.qcow2.
    9. Execute the virsh start vrr command and wait for approximately 5 minutes for the command to finish executing.
    10. Execute the virsh list --all command to check whether the VRR is running.

      If the VRR is not running, check that the copied image is not corrupted and retry from step 7.

    11. If the VRR is running, navigate to the /root/ubuntu_vm/vrr/ directory.
    12. Run the ./vrr.exp command to push the base configuration to the VRR.
    13. Use POSTMAN to import the VRR configuration.

      The following is the base configuration (in JSON format) for the VRR. In the configuration below, <vrr-ip-address> is the IP address of the VRR and <vrr-password> is the password that was configured for the VRR.

      {
        "input": {
          "job_name_prefix": "ImportPop",
          "pop": [{
            "dc_name": "regional",
            "device": [{
              "name": "vrr-<vrr-ip-address>",
              "family": "VRR",
              "device_ip": "<vrr-ip-address>",
              "assigned_device_profile": "VRR_Advanced_SDWAN_option_1",
              "authentication": {
                "password_based": {
                  "username": "root",
                  "password": "<vrr-password>"
                }
              },
              "management_state": "managed",
              "pnf_package": "null"
            }],
            "name": "regional"
          }]
        }
      }
    14. Verify whether the VRR configuration is imported properly:
      1. Log in to the CSO Administration Portal.
      2. Click Resources > POPs > Import POPs > Import History and confirm that the ImportPop job is running and that it has completed successfully.
    15. On the Tenants page, add a tenant named recovery.
    16. After the tenant is successfully created, log in to the VRR and access the Junos OS CLI.
    17. Execute the show configuration | display set command and verify that the tenant configuration (for the previously configured tenants) is recovered.
    18. Execute the show bgp summary command and check that the BGP sessions to the hub and spokes are in the Established state.
    19. If the sessions are not in the Established state, add the routes for the OAM traffic of the hub and spokes to the VRR and recheck the status.

    Bug Tracking Number: CXU-16530.

  • If you run the script to revert the upgraded setup to CSO Release 3.2.1, in some cases, the ArangoDB cluster becomes unhealthy.

    Workaround:

    1. Log in to the centralinfravm3 VM.
    2. Execute the service arangodb3 stop command and wait for 30 seconds.
      • If the command executes successfully, proceed to Step 3.

      • If there is no progress after 30 seconds:

        1. Press Ctrl+c to abort the command.
        2. Execute the kill -9 `ps -ef | grep arangod | grep -v grep | awk '{print $2}'` command.
    3. Log in to the centralinfravm2 VM.
    4. Execute the service arangodb3 stop command and wait for 30 seconds.
      • If the command executes successfully, proceed to Step 5.

      • If there is no progress after 30 seconds:

        1. Press Ctrl+c to abort the command.
        2. Execute the kill -9 `ps -ef | grep arangod | grep -v grep | awk '{print $2}'` command.
    5. Log in to the centralinfravm1 VM.
    6. Execute the service arangodb3 stop command and wait for 30 seconds.
      • If the command executes successfully, proceed to Step 7.

      • If there is no progress after 30 seconds:

        1. Press Ctrl+c to abort the command.
        2. Execute the kill -9 `ps -ef | grep arangod | grep -v grep | awk '{print $2}'` command.
    7. On the centralinfravm3 VM, execute the service arangodb3 start command and wait for 20 seconds for the command to finish executing.
    8. On the centralinfravm2 VM, execute the service arangodb3 start command and wait for 20 seconds for the command to finish executing.
    9. On the centralinfravm1 VM, execute the service arangodb3 start command and wait for 20 seconds for the command to finish executing.
    10. Execute the netstat -tuplen | grep arangod command on all three central infrastructure VMs to check the status of the ArangoDB cluster. If the port binding is successful for all the central infrastructure VMs, then the ArangoDB cluster is healthy.

      The following is a sample output.

          tcp6 0 0 :::8528 :::* LISTEN 0 54213 9220/arangodb
          tcp6 0 0 :::8529 :::* LISTEN 0 44018 9327/arangod
          tcp6 0 0 :::8530 :::* LISTEN 0 91216 9289/arangod
          tcp6 0 0 :::8531 :::* LISTEN 0 42530 9232/arangod 

    Bug Tracking Number: CXU-20397.

  • The provisioning of CPE devices fails if all VRRs within a redundancy group are unavailable.

    Workaround: Recover the VRR that is down and retry the provisioning job.

    Bug Tracking Number: CXU-19063.

  • The CSO health check displays the following error message: ERROR: ONE OR MORE KUBE-SYSTEM PODS ARE NOT RUNNING

    Workaround:

    1. Log in to the central microservices VM.
    2. Execute the kubectl get pods --namespace=kube-system command.
    3. If the kube-proxy process is not in the Running state, execute the kubectl apply -f /etc/kubernetes/manifests/kube-proxy.yaml command.
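
      For example, to check only the kube-proxy pod and reapply its manifest if it is not in the Running state:

        kubectl get pods --namespace=kube-system | grep kube-proxy
        kubectl apply -f /etc/kubernetes/manifests/kube-proxy.yaml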

    Bug Tracking Number: CXU-20275.

  • After the upgrade, the health check on the standalone Contrail Analytics Node (CAN) fails.

    Workaround:

    1. Log in to the CAN VM.
    2. Execute the docker exec analyticsdb service contrail-database-nodemgr restart command.
    3. Execute the docker exec analyticsdb service cassandra restart command.
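
    To confirm that the services restarted, you can check their status inside the analyticsdb container (a minimal check using the same service names):

      docker exec analyticsdb service contrail-database-nodemgr status
      docker exec analyticsdb service cassandra status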

    Bug Tracking Number: CXU-20470.

  • The class-of-service scheduler configuration does not take effect on the CPE device.

    Workaround:

    1. Log in to the CPE device and access the Junos OS CLI.
    2. Enable the scheduler-map on each physical interface manually by executing the following commands:
      set class-of-service interfaces interface-name unit * scheduler-map scheduler-map-name
      set interfaces interface-name per-unit-scheduler

      Where interface-name is the name of the physical interface (for example, ge-0/0/4), and scheduler-map-name is the name of the scheduler map. A filled-in example follows this procedure.

    3. Commit the configuration on the CPE device.
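
    For example, with the physical interface ge-0/0/4 and a scheduler map named wan-scheduler (an illustrative name), the commands in steps 2 and 3 would be:

      set class-of-service interfaces ge-0/0/4 unit * scheduler-map wan-scheduler
      set interfaces ge-0/0/4 per-unit-scheduler
      commit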

    Bug Tracking Number: CXU-20708.

  • The load services data operation or health check of the infrastructure components might fail if the data in the Salt server cache is lost because of an error.

    Workaround: If you encounter a Salt server-related error, do the following:

    1. Log in to the installer VM.
    2. Execute the salt '*' deployutils.get_role_ips 'cassandra' command to confirm whether one or more Salt minions have lost the cache.
      • If the output returns the IP address for all the Salt minions, this means that the Salt server cache is fine; proceed to step 7.

      • If the IP address for some minions is not present in the output, this means that the Salt server has lost its cache for those minions, and the cache must be rebuilt as explained starting with step 3.

    3. Navigate to the current deployment directory for CSO; for example, /root/Contrail_Service_Orchestration_3.3.1/.
    4. Redeploy the central infrastructure services (up to the NTP step):
      1. Execute the DEPLOYMENT_ENV=central ./deploy_infra_services.sh command.
      2. Press Ctrl+c when you see the following message on the console:
        2018-04-10 17:17:03 INFO utils.core Deploying roles set(['ntp']) to servers ['csp-central-msvm', 'csp-contrailanalytics-1', 'csp-central-k8mastervm', 'csp-central-infravm']
    5. Redeploy the regional infrastructure services (up to the NTP step):
      1. Execute the DEPLOYMENT_ENV=regional ./deploy_infra_services.sh command.
      2. Press Ctrl+c when you see a message similar to the one for the central infrastructure services.
    6. Execute the salt '*' deployutils.get_role_ips 'cassandra' command and confirm that the output displays the IP addresses of all the Salt minions.
    7. Re-run the load services data operation or the health component check that had previously failed.

    Bug Tracking Number: CXU-20815.

  • In some cases, high values of round-trip time (RTT) and jitter are displayed in the CSO GUI because of high values reported in the device system log.

    Workaround: None.

    Bug Tracking Number: CXU-21434.

  • On an NFX Series CPE device, if you try to upgrade a vSRX gateway router, the upgrade might fail due to a lack of storage space on the VM.

    Workaround:

    Before triggering the upgrade of the vSRX gateway router on an NFX Series device, do the following:

    1. Access the vSRX CLI on the NFX Series device.
    2. Execute the request system storage cleanup command.
    3. Access the JDM CLI on the NFX Series device.
    4. Execute the show virtual-network-functions command and note down the name of the vSRX gateway router VM.
    5. Execute the request virtual-network-functions gwr-vm-name restart command to reboot the VM, where gwr-vm-name is the name of the vSRX gateway router VM that was obtained in the preceding step.
    6. Wait for the vSRX gateway router VM to successfully reboot.

    Then trigger the upgrade of the vSRX gateway router by using the CSO GUI.

    Bug Tracking Number: CXU-21440.

  • In some cases, when the infrastructure VMs in the CSO setup are unhealthy and you initiate the upgrade, the upgrade process fails to perform a health check before starting the upgrade.

    Workaround: Recover the infrastructure VMs manually before proceeding with the upgrade.

    Bug Tracking Number: CXU-21536.

  • For an MX Series Cloud hub, if you have configured the Internet link type as OAM_and_DATA, the reverse traffic fails to reach the spoke device unless you perform additional configuration by using the Junos OS CLI on the MX Series device.

    Workaround:

    1. Log in to the MX Series device and access the Junos OS CLI.
    2. Find the next-hop-service outside-service-interface as follows:
      1. Execute the show configuration | display set | grep outside-service-interface command.
      2. In the output of the command, look for the multiservices interface (ms-) corresponding to the service set that CSO created on the device.

        The name of the service set is in the format sset<tenant-name>_DefaultVPN-<tenant-name>, where tenant-name is the name of the tenant.

        An example of the command and output follows:

        show configuration | display set | grep outside-service-interface
        set groups mx-hub-Acme-Acme_DefaultVPN-vpn-routing-config services service-set ssetAcme_DefaultVPN-Acme next-hop-service outside-service-interface ms-1/0/0.4008

        In this example, the tenant name is Acme and the multiservices interface used is ms-1/0/0.4008.

    3. After you determine the correct interface, add the following configuration on the device: set routing-instances WAN_0 interface ms-interface

      where ms-interface is the name of the multiservices interface obtained in the preceding step (a filled-in example follows this procedure).

    4. Commit the configuration.
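
    Continuing the example above, where the multiservices interface is ms-1/0/0.4008, the configuration added in step 3 would be:

      set routing-instances WAN_0 interface ms-1/0/0.4008
      commit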

    Bug Tracking Number: CXU-21818.

  • In a full mesh topology, the simultaneous deletion of LAN segments on all sites is not supported.

    Workaround: Delete LAN segments one site at a time.

    Bug Tracking Number: CXU-21936.

  • On a CSO setup that was upgraded from Release 3.2.1 to Release 3.3.0, if you start upgrading to Release 3.3.1, the ArangoDB storage engine upgrade might fail due to an issue with the Salt server synchronization.

    Workaround:

    1. Log in to the installer VM.
    2. Navigate to the current deployment directory for CSO; for example, /root/Contrail_Service_Orchestration_3.3.1/.
    3. Redeploy the central infrastructure services (up to the NTP step):
      1. Execute the DEPLOYMENT_ENV=central ./deploy_infra_services.sh command.
      2. Press Ctrl+c when you see the following message on the console:
        2018-04-10 17:17:03 INFO utils.core Deploying roles set(['ntp']) to servers ['csp-central-msvm', 'csp-contrailanalytics-1', 'csp-central-k8mastervm', 'csp-central-infravm']
    4. Redeploy the infrastructure services in the other regions (up to the NTP step):
      1. Execute the DEPLOYMENT_ENV=region-name ./deploy_infra_services.sh command, where region-name is the name of the region.
      2. Press Ctrl+c when you see a message similar to the one for the central infrastructure services.
    5. Rerun the upgrade script (upgrade.sh).

    Bug Tracking Number: CXU-22066.

  • When a factory default SRX Series device is activated by using ZTP with a redirect server, the device activation fails because the learned phone home server is deleted during the activation process.

    Workaround: Configure the phone home server IP address on the SRX Series device and retry the ZTP workflow.

    Bug Tracking Number: CXU-22154.

Modified: 2018-05-18