Troubleshooting Site, Device, and Link Issues

 

Secure OAM Activation Failure

Problem

Description: After you enter the activation code, the CPE device status remains in the DEVICE_DETECTED state, and the csp.tssm_bootstrap-<site-name> job either fails or remains in the In Progress state for a long time.

Solution

Check whether CSO is reachable by running the following command on the CPE device:

user@host> ping <cso-ip> source <management-ip-configured-on-loopback-interface>

If the ping fails, check whether the secure OAM tunnels are up by using the following command:

user@host> show security ipsec inactive-tunnels

If the secure OAM tunnels are not up, verify the connectivity to the OAM hub.
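The check sequence above can be summarized as a small decision helper. This is a minimal sketch in Python and not part of CSO. The function name and its inputs are hypothetical and stand for results that you observe on the CPE device.

```python
from typing import Optional


def next_oam_check(ping_ok: bool, tunnels_up: Optional[bool] = None) -> str:
    """Return the next step for a secure OAM activation failure.

    ping_ok:    True if the ping to CSO from the loopback management IP succeeds.
    tunnels_up: True if the secure OAM tunnels are up, False if they are not,
                None if the tunnels have not been checked yet.
    Both inputs are hypothetical flags filled in by the operator.
    """
    if ping_ok:
        return "CSO is reachable from the CPE device"
    if tunnels_up is None:
        # Ping failed and the tunnel state is unknown: run the tunnel check.
        return "run: show security ipsec inactive-tunnels"
    if not tunnels_up:
        return "verify the connectivity to the OAM hub"
    # Ping fails although the tunnels are up: the procedure above gives no step.
    return "not covered by this procedure"


if __name__ == "__main__":
    print(next_oam_check(ping_ok=False))
    print(next_oam_check(ping_ok=False, tunnels_up=False))
```

The helper encodes only the order of the checks. It does not run any command on the device.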

Configure SD-WAN Site Failure

Problem

Description: The configure site operation fails for a spoke site.

Solution

  1. Log in to Customer Portal and select Sites > Site Management.

    The site status must be Configured. If the site status is Configuration Failed, then the “tssm configure sites” job must have failed.

  2. Click Monitor > Jobs and check the job details to verify which task has failed.

    If the ship device task has failed, then CSO has failed to push the required secure OAM tunnel configuration to the hub device.

  3. Check the connectivity between CSO and the hub.

  4. If there are any other failures, go to Sites > Site Management > Site-Name > Configure Site and review the input provided for configuring the site.

Device Activation Failure

Problem

Description: After you enter the activation code, the device status remains in the DEVICE_DETECTED state for a long time.

Solution

After you enter the activation code, the activation window must display the progress of device activation and must indicate that the device has been successfully detected. If the device status remains in the DEVICE_DETECTED state, follow these steps:

  1. Log in to Customer Portal and select Resources > Devices.

    The Devices page appears.

  2. Check the Management Status of the device.

    If the management status is DEVICE_DETECTED, then either the deployment of the stage-1 configuration on the device has failed or the device has failed to send the BOOTSTRAP COMPLETE notification to CSO.

  3. Log in to the device and verify whether the stage-1 configuration is committed on the device.
  4. Verify the connectivity between CSO and the device loopback address.
  5. Navigate to the Monitor > Jobs page and verify the status of the csp.tssm_bootstrap-<site-name> job.
    • If the job is successful, the ZTP job is triggered.

    • If the job is in the in-progress state, the CPE device has failed to establish the connection over the secure OAM tunnel.

  6. If the device fails to establish the connection within an hour, or if the csp.tssm_bootstrap-<site-name> job fails, check the bootstrap task details.
  7. After the connectivity issue is resolved, navigate to Resources > Devices and activate the device.

    The csp.tssm_ztp-<site-name> job must complete successfully. If the job fails, check the task details to verify which task has failed.
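The interpretation of the bootstrap job status in steps 5 and 6 can be sketched as follows. This is an illustrative Python helper, not a CSO API. The status strings "success", "in progress", and "failure" are assumed labels for what the Jobs page shows, and the elapsed time is something the operator supplies.

```python
from datetime import timedelta


def interpret_bootstrap_job(status: str, elapsed: timedelta) -> str:
    """Map the state of the csp.tssm_bootstrap-<site-name> job to an action.

    status:  assumed label, one of "success", "in progress", "failure".
    elapsed: time since the activation code was entered (operator-supplied).
    """
    status = status.strip().lower()
    if status == "success":
        return "ZTP job is triggered"
    if status == "failure":
        return "check the bootstrap task details"
    if status == "in progress":
        if elapsed >= timedelta(hours=1):
            # No connection within an hour: treat it like a failed job.
            return "check the bootstrap task details"
        return "CPE device has not connected over the secure OAM tunnel"
    raise ValueError(f"unknown job status: {status!r}")


if __name__ == "__main__":
    print(interpret_bootstrap_job("In Progress", timedelta(minutes=20)))
```

The one-hour threshold comes from step 6. Everything else about the function signature is an assumption made for the example.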

Dual-CPE Activation Failure for NFX Series Devices

Problem

Description: The ZTP job fails for dual CPE NFX Series devices.

Solution

For a site with dual CPE NFX Series devices, two ZTP jobs are created, one for each node: csp.tssm_ztp-<site-name>_cpe0 and csp.tssm_ztp-<site-name>_cpe1.

While these jobs are still in progress, and after the Gateway Router (GWR) is spawned successfully, two more jobs named form_device_cluster are created, one for each node, for cluster formation.

Log in to Administration Portal and select Monitor > Jobs to view the form_device_cluster job. If cluster formation fails, the form_device_cluster job and the csp.tssm_ztp-<site-name>_cpe0 and csp.tssm_ztp-<site-name>_cpe1 jobs are reported as failed.

For any cluster formation job failure, check the logs from the device at /tmp/cluster_gwr.log.

For further troubleshooting, collect the logs and command output and contact the Juniper Networks SRE team.
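When you look up these jobs on the Jobs page, it can help to derive the expected job names from the site name. The following Python sketch does only that. The function names, the job-status dictionary, and the "failure" status label are hypothetical. The job name pattern and the log path are the ones given above.

```python
from typing import Dict, List

# Log file on the device to check when cluster formation fails.
CLUSTER_LOG = "/tmp/cluster_gwr.log"


def ztp_job_names(site_name: str) -> List[str]:
    """Return the two per-node ZTP job names for a dual CPE site."""
    return [f"csp.tssm_ztp-{site_name}_cpe{node}" for node in (0, 1)]


def failed_ztp_jobs(site_name: str, job_status: Dict[str, str]) -> List[str]:
    """Return the ZTP jobs of the site whose status is "failure".

    job_status maps job name to a status label copied by hand from the
    Jobs page; the "failure" label is an assumption for this example.
    """
    return [
        name
        for name in ztp_job_names(site_name)
        if job_status.get(name) == "failure"
    ]


if __name__ == "__main__":
    status = {
        "csp.tssm_ztp-branch1_cpe0": "failure",
        "csp.tssm_ztp-branch1_cpe1": "success",
    }
    for job in failed_ztp_jobs("branch1", status):
        print(f"{job} failed; check {CLUSTER_LOG} on the device")
```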

Dual-CPE Activation Failure for SRX Series Devices

Problem

Description: The ZTP job fails for dual SRX Series devices.

Solution

For a site with dual CPE SRX Series devices, two ZTP jobs are created, one for each node: csp.tssm_ztp-<site-name>_cpe0 and csp.tssm_ztp-<site-name>_cpe1.

For dual SRX Series devices, the chassis cluster must already be formed manually before you start the device activation. The csp.tssm_ztp-<site-name>_cpe1 job reports success quickly, and the actual ZTP progress can be tracked through the csp.tssm_ztp-<site-name>_cpe0 job. If a failure occurs, refer to the ZTP job task details.

Problem

Description: The link switch event is not displayed in the UI.

Solution

Check whether the device can reach the southbound load balancer VM (SBLB VM) and whether its time is synchronized with the NTP server.

Even when the link switch is successful on the device, it might not be indicated in the UI because of missing syslog events. The UI indicates a link switch event based on the APPQOE_BEST_PATH_SELECTED syslog, with the reason sla violated, that is received from the CPE device.

Log in to Customer Portal and select Monitor > Device Events to view all the syslogs that are received from the CPE device. To filter the APPQOE_BEST_PATH_SELECTED events, use the following query: Event Name = APPQOE_BEST_PATH_SELECTED and Reason = sla violated.
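If you export the device events for offline review, the same filter can be applied in a script. The sketch below is in Python and assumes each event is a dictionary with the hypothetical keys event_name and reason. The actual export format of the Device Events page is not described here, so adapt the keys to your data.

```python
from typing import Dict, Iterable, List

LINK_SWITCH_EVENT = "APPQOE_BEST_PATH_SELECTED"
LINK_SWITCH_REASON = "sla violated"


def link_switch_events(events: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the events that the UI would treat as link switches.

    Mirrors the query:
        Event Name = APPQOE_BEST_PATH_SELECTED and Reason = sla violated
    The keys "event_name" and "reason" are assumptions for this example.
    """
    return [
        event
        for event in events
        if event.get("event_name") == LINK_SWITCH_EVENT
        and event.get("reason", "").strip().lower() == LINK_SWITCH_REASON
    ]


if __name__ == "__main__":
    sample = [
        {"event_name": "APPQOE_BEST_PATH_SELECTED", "reason": "sla violated"},
        {"event_name": "APPQOE_ACTIVE_SLA_METRIC_REPORT"},
    ]
    # An empty result for a known link switch suggests that the syslog
    # never arrived from the CPE device.
    print(len(link_switch_events(sample)))
```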

Problem

Description: WAN link performance parameters, such as latency, packet loss, E2E delay, jitter, and throughput, are not displayed in the UI.

Solution

Check whether the device can reach the southbound load balancer VM (SBLB VM) and whether its time is synchronized with the NTP server.

Log in to Customer Portal and select Sites > Site Management > Site-Name > WAN tab to view the WAN link performance.

  • The WAN link performance details for latency, packet loss, E2E delay, and jitter are retrieved from the APPQOE_ACTIVE_SLA_METRIC_REPORT syslog. To filter the APPQOE_ACTIVE_SLA_METRIC_REPORT events, use the following query:

    Event Name = APPQOE_ACTIVE_SLA_METRIC_REPORT and Site = <site-name>.

  • The WAN link performance details for throughput are retrieved from the APPTRACK_ACTIVE_SLA_METRIC_REPORT syslog. To filter the APPTRACK_ACTIVE_SLA_METRIC_REPORT events, use the following query:

    Event Name = APPTRACK_SESSION_CLOSE and Site = <site-name>.
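The two queries differ only in the event name, so a small helper can build the right query for a missing metric. This Python sketch is illustrative only. The function and dictionary names are hypothetical, and the event names are taken from the query lines exactly as written above. For throughput, the prose names a different syslog than the query does, and the sketch follows the query.

```python
# Metric name -> event name used in the corresponding Device Events query.
METRIC_EVENT = {
    "latency": "APPQOE_ACTIVE_SLA_METRIC_REPORT",
    "packet loss": "APPQOE_ACTIVE_SLA_METRIC_REPORT",
    "e2e delay": "APPQOE_ACTIVE_SLA_METRIC_REPORT",
    "jitter": "APPQOE_ACTIVE_SLA_METRIC_REPORT",
    "throughput": "APPTRACK_SESSION_CLOSE",
}


def wan_metric_query(metric: str, site_name: str) -> str:
    """Build the filter query for the syslog behind a WAN link metric."""
    key = metric.strip().lower()
    if key not in METRIC_EVENT:
        raise KeyError(f"no query known for metric: {metric!r}")
    return f"Event Name = {METRIC_EVENT[key]} and Site = {site_name}"


if __name__ == "__main__":
    print(wan_metric_query("jitter", "branch1"))
    print(wan_metric_query("throughput", "branch1"))
```

If a query returns no events for the site, the device is most likely not delivering that syslog, which points back to the SBLB VM reachability and NTP checks.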

LTE Interface Issues

Problem

Description: The LTE interface does not receive an IP address.

Solution

  • Check the data validity of the SIM using the mobile device.

  • Check the LTE module connection status to ensure that there is adequate mobile signal strength.

    user@host> show modem wireless network cl-1/1/0

    Check the Current Modem Status, Current Service Status, Current Service Type, and Current Service Mode fields.

  • For an NFX150 device, ensure that the external antenna is connected properly.