Monitoring and Troubleshooting

SUMMARY This section describes the network monitoring and troubleshooting features of Junos OS.

Ping Hosts

Purpose

Use the CLI ping command to verify that a host can be reached over the network. This command is useful for diagnosing host and network connectivity problems. The device sends a series of Internet Control Message Protocol (ICMP) echo (ping) requests to a specified host and receives ICMP echo responses.

Action

To use the ping command to send four requests (ping count) to host3:

Sample Output

command-name

Meaning

  • The ping results show the following information:

    • Size of the ping response packet (in bytes).

    • IP address of the host from which the response was sent.

    • Sequence number of the ping response packet. You can use this value to match the ping response to the corresponding ping request.

    • Time-to-live (ttl) hop-count value of the ping response packet.

    • Total time between the sending of the ping request packet and the receiving of the ping response packet, in milliseconds. This value is also called round-trip time.

    • Number of ping requests (probes) sent to the host.

    • Number of ping responses received from the host.

    • Packet loss percentage.

    • Round-trip time statistics: minimum, average, maximum, and standard deviation of the round-trip time.
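The summary fields listed above can be derived from the individual probe results. The following is a minimal Python sketch, not Junos OS code: the function name `ping_summary` and the sample round-trip times are hypothetical, and the use of the population standard deviation is an assumption, because this section does not state which formula the device applies.

```python
import math


def ping_summary(sent, rtts_ms):
    """Summarize one ping run.

    sent    -- number of echo requests (probes) sent
    rtts_ms -- round-trip times, in milliseconds, of the responses received
    """
    if sent <= 0:
        raise ValueError("sent must be a positive probe count")
    received = len(rtts_ms)
    summary = {
        "sent": sent,
        "received": received,
        "loss_pct": 100.0 * (sent - received) / sent,
    }
    if received:
        avg = sum(rtts_ms) / received
        # Population standard deviation (an assumption, see the text above).
        variance = sum((rtt - avg) ** 2 for rtt in rtts_ms) / received
        summary.update(
            min=min(rtts_ms),
            avg=avg,
            max=max(rtts_ms),
            stddev=math.sqrt(variance),
        )
    return summary


# Four probes sent, three responses received (made-up RTT values).
print(ping_summary(4, [1.0, 2.0, 3.0]))
```

With four probes sent and three responses received, the sketch reports 25 percent packet loss; the round-trip statistics cover only the responses that arrived.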

Monitor Traffic Through the Router or Switch

To diagnose a problem, display real-time statistics about the traffic passing through physical interfaces on the router or switch.

To display real-time statistics about physical interfaces, perform these tasks:

Display Real-Time Statistics About All Interfaces on the Router or Switch

Purpose

Display real-time statistics about traffic passing through all interfaces on the router or switch.

Action

To display real-time statistics about traffic passing through all interfaces on the router or switch:

Sample Output
command-name

Meaning

The sample output displays traffic data for active interfaces and the amount that each field has changed since the command started or since the counters were cleared by using the C key. In this example, the monitor interface command has been running for 15 seconds since the command was issued or since the counters last returned to zero.
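The relationship between a cumulative counter and the delta shown by the monitor interface command can be illustrated with a small Python sketch. This is a conceptual model only: the class `DeltaCounter` and the packet counts are hypothetical and do not reflect how the device implements its counters.

```python
class DeltaCounter:
    """Tracks how much a cumulative counter has changed since a baseline."""

    def __init__(self, cumulative):
        # The baseline is taken when monitoring starts.
        self.baseline = cumulative

    def delta(self, cumulative):
        """Change since monitoring started or since the last clear."""
        return cumulative - self.baseline

    def clear(self, cumulative):
        """Model of the C key: reset the delta, leave the cumulative value alone."""
        self.baseline = cumulative


input_packets = 1_000_000          # cumulative counter on the interface
counter = DeltaCounter(input_packets)

input_packets += 500               # traffic arrives
print(counter.delta(input_packets))  # 500

counter.clear(input_packets)       # operator presses C
input_packets += 120
print(counter.delta(input_packets))  # 120
print(input_packets)                 # 1000620, the cumulative counter is untouched
```

Clearing only moves the baseline, which is why the cumulative counter keeps its value while the displayed delta returns to zero.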

Display Real-Time Statistics About an Interface on the Router or Switch

Purpose

Display real-time statistics about traffic passing through an interface on the router or switch.

Action

To display traffic passing through an interface on the router or switch, use the following Junos OS CLI operational mode command:

Sample Output
command-name

Meaning

The sample output shows the input and output packets for a particular SONET interface (so-0/0/1). The information can include common interface failures, such as SONET/SDH and T3 alarms, loopbacks detected, and increases in framing errors. For more information, see Checklist for Tracking Error Conditions.

To control the output of the command while it is running, use the keys shown in Table 1.

Table 1: Output Control Keys for the monitor interface Command

| Action | Key |
| --- | --- |
| Display information about the next interface. The monitor interface command scrolls through the physical or logical interfaces in the same order that they are displayed by the show interfaces terse command. | N |
| Display information about a different interface. The command prompts you for the name of a specific interface. | I |
| Freeze the display, halting the display of updated statistics. | F |
| Thaw the display, resuming the display of updated statistics. | T |
| Clear (zero) the delta counters accumulated since monitor interface was started. This does not clear the cumulative counters. | C |
| Stop the monitor interface command. | Q |

See the CLI Explorer for details on using match conditions with the monitor traffic command.

Dynamic Ternary Content Addressable Memory Overview

In ACX Series routers, Ternary Content Addressable Memory (TCAM) is used by various applications such as firewall, connectivity fault management, PTPoE, and RFC 2544. The Packet Forwarding Engine (PFE) in ACX Series routers uses TCAM with defined TCAM space limits. The allocation of TCAM resources for various filter applications is statically distributed. This static allocation leads to inefficient utilization of TCAM resources when not all filter applications use TCAM resources simultaneously.

The dynamic allocation of TCAM space in ACX Series routers efficiently allocates the available TCAM resources for various filter applications. In the dynamic TCAM model, various filter applications (such as inet-firewall, bridge-firewall, cfm-filters, etc.) can optimally utilize the available TCAM resources as and when required. Dynamic TCAM resource allocation is usage driven: TCAM space is allocated to filter applications as needed. When a filter application no longer uses the TCAM space, the resource is freed and becomes available for use by other applications. This dynamic TCAM model supports a higher scale of TCAM resource utilization based on application demand.
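The usage-driven model described above can be pictured as a shared pool from which applications take space on demand and to which they return it. The following Python sketch is a deliberately simplified model under stated assumptions: the class `TcamPool`, the entry counts, and the single flat pool are all hypothetical. The real PFE allocates TCAM per stage, and this section does not describe its allocation algorithm.

```python
class TcamPool:
    """Simplified shared pool of TCAM entries (hypothetical model)."""

    def __init__(self, total_entries):
        self.total_entries = total_entries
        self.in_use = {}  # application name -> entries currently held

    def free_entries(self):
        return self.total_entries - sum(self.in_use.values())

    def allocate(self, app, entries):
        """Grant entries only if the pool still has room; return success."""
        if entries > self.free_entries():
            return False
        self.in_use[app] = self.in_use.get(app, 0) + entries
        return True

    def release(self, app):
        """Return everything the application holds to the pool."""
        return self.in_use.pop(app, 0)


pool = TcamPool(total_entries=1000)
print(pool.allocate("inet-firewall", 700))  # True
print(pool.allocate("cfm-filters", 400))    # False: only 300 entries are free
pool.release("inet-firewall")               # the application no longer needs TCAM
print(pool.allocate("cfm-filters", 400))    # True: the freed space is reused
```

Under a static split, space reserved for an idle application could not be lent to a busy one; in the pooled model the second request succeeds as soon as the first application releases its space.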

Applications using Dynamic TCAM Infrastructure

The following filter application categories use the dynamic TCAM infrastructure:

  • Firewall filter: All the firewall configurations.

  • Implicit filter: Routing Engine (RE) daemons that use filters to achieve their functionality. For example, connectivity fault management, IP MAC validation, etc.

  • Dynamic filters: Applications that use filters to achieve their functionality at the PFE level. For example, logical interface level fixed classifier, RFC 2544, etc. RE daemons are not aware of these filters.

  • System-init filters: Filters that require entries at the system level or a fixed set of entries during the router's boot sequence. For example, Layer 2 and Layer 3 control protocol trap, default ARP policer, etc.

    Note:

    The system-init filter, which includes the applications for the Layer 2 and Layer 3 control protocol traps, is essential for overall system functionality. The applications in this control group consume a fixed and minimal amount of the overall TCAM space. The system-init filter does not use the dynamic TCAM infrastructure and is created when the router is initialized during the boot sequence.

Features Using TCAM Resource

An application that uses TCAM resources is termed a tcam-app in this document. For example, inet-firewall, bridge-firewall, connectivity fault management, link fault management, and so on are all different tcam-apps.

Table 2 lists the tcam-apps that use TCAM resources.

Table 2: Features Using TCAM Resource

| TCAM Apps/TCAM Users | Feature/Functionality | TCAM Stage |
| --- | --- | --- |
| bd-dtag-validate | Bridge domain dual-tagged validate. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Egress |
| bd-tpid-swap | Bridge domain vlan-map with swap tpid operation | Egress |
| cfm-bd-filter | Connectivity fault management implicit bridge-domain filters | Ingress |
| cfm-filter | Connectivity fault management implicit filters | Ingress |
| cfm-vpls-filter | Connectivity fault management implicit vpls filters. Note: This feature is supported only on ACX5048 and ACX5096 routers. | Ingress |
| cfm-vpls-ifl-filter | Connectivity fault management implicit vpls logical interface filters. Note: This feature is supported only on ACX5048 and ACX5096 routers. | Ingress |
| cos-fc | Logical interface level fixed classifier | Pre-ingress |
| fw-ccc-in | Circuit cross-connect family ingress firewall | Ingress |
| fw-family-out | Family level egress firewall | Egress |
| fw-fbf | Firewall filter-based forwarding | Pre-ingress |
| fw-fbf-inet6 | Firewall filter-based forwarding for inet6 family | Pre-ingress |
| fw-ifl-in | Logical interface level ingress firewall | Ingress |
| fw-ifl-out | Logical interface level egress firewall | Egress |
| fw-inet-ftf | Inet family ingress firewall on a forwarding-table | Ingress |
| fw-inet6-ftf | Inet6 family ingress firewall on a forwarding-table | Ingress |
| fw-inet-in | Inet family ingress firewall | Ingress |
| fw-inet-rpf | Inet family ingress firewall on RPF fail check | Ingress |
| fw-inet6-in | Inet6 family ingress firewall | Ingress |
| fw-inet6-family-out | Inet6 family level egress firewall | Egress |
| fw-inet6-rpf | Inet6 family ingress firewall on RPF fail check | Ingress |
| fw-inet-pm | Inet family firewall with port-mirror action. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |
| fw-l2-in | Bridge family ingress firewall on Layer 2 interface | Ingress |
| fw-mpls-in | MPLS family ingress firewall | Ingress |
| fw-semantics | Firewall sharing semantics for CLI configured firewall | Pre-ingress |
| fw-vpls-in | VPLS family ingress firewall on VPLS interface | Ingress |
| ifd-src-mac-fil | Physical interface level source MAC filter | Pre-ingress |
| ifl-statistics-in | Logical level interface statistics at ingress | Ingress |
| ifl-statistics-out | Logical level interface statistics at egress | Egress |
| ing-out-iff | Ingress application on behalf of egress family filter for log and syslog | Ingress |
| ip-mac-val | IP MAC validation | Pre-ingress |
| ip-mac-val-bcast | IP MAC validation for broadcast | Pre-ingress |
| ipsec-reverse-fil | Reverse filters for IPsec service. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |
| irb-cos-rw | IRB CoS rewrite | Egress |
| lfm-802.3ah-in | Link fault management (IEEE 802.3ah) at ingress. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |
| lfm-802.3ah-out | Link fault management (IEEE 802.3ah) at egress | Egress |
| lo0-inet-fil | Loopback interface inet filter | Ingress |
| lo0-inet6-fil | Loopback interface inet6 filter | Ingress |
| mac-drop-cnt | Statistics for drops by MAC validate and source MAC filters | Ingress |
| mrouter-port-in | Multicast router port for snooping | Ingress |
| napt-reverse-fil | Reverse filters for network address port translation (NAPT) service. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |
| no-local-switching | Bridge no-local-switching | Ingress |
| ptpoe | Point-to-Point-Over-the-Ethernet traps. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |
| ptpoe-cos-rw | CoS rewrite for PTPoE. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Egress |
| rfc2544-layer2-in | RFC2544 for Layer 2 service at ingress | Pre-ingress |
| rfc2544-layer2-out | RFC2544 for Layer 2 service at egress. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Egress |
| service-filter-in | Service filter at ingress. Note: This feature is not supported on ACX5048 and ACX5096 routers. | Ingress |

Monitoring TCAM Resource Usage

You can use the show and clear commands to monitor and troubleshoot dynamic TCAM resource usage.

Table 3 summarizes the command-line interface (CLI) commands you can use to monitor and troubleshoot dynamic TCAM resource usage.

Table 3: Show and Clear Commands to Monitor and Troubleshoot Dynamic TCAM

| Task | Command |
| --- | --- |
| Display the shared and the related applications for a particular application | show pfe tcam app |
| Display the TCAM resource usage for an application and stages (egress, ingress, and pre-ingress) | show pfe tcam usage; (ACX5448) show pfe filter hw summary |
| Display the TCAM resource usage errors for applications and stages (egress, ingress, and pre-ingress) | show pfe tcam errors |
| Clear the TCAM resource usage error statistics for applications and stages (egress, ingress, and pre-ingress) | clear pfe tcam-errors |

Example: Monitoring and Troubleshooting the TCAM Resource

This section describes a use case where you can monitor and troubleshoot TCAM resources using show commands. In this use case scenario, you have configured Layer 2 services and the Layer 2 service-related applications are using TCAM resources. The dynamic approach, as shown in this example, gives you the complete flexibility to manage TCAM resources on a need basis.

The service requirement is as follows:

  • Each bridge domain has one UNI and one NNI interface.

  • Each UNI interface has:

    • One logical interface level policer to police the traffic at 10 Mbps.

    • A multifield classifier with four terms to assign the forwarding class and loss-priority.

  • Each UNI interface configures a CFM UP MEP at level 4.

  • Each NNI interface configures a CFM DOWN MEP at level 2.

Consider a scenario in which 100 services are configured on the router. At this scale, all the applications are configured successfully and their status shows the OK state.

  1. Viewing TCAM resource usage for all stages.

    To view the TCAM resource usage for all stages (egress, ingress, and pre-ingress), use the show pfe tcam usage all-tcam-stages detail command. On ACX5448 routers, use the show pfe filter hw summary command to view the TCAM resource usage.

  2. Configure additional Layer 2 services on the router.

    For example, add 20 more services on the router, thereby increasing the total number of services to 120. After adding more services, you can check the status of the configuration by verifying either the syslog message using the command show log messages, or by running the show pfe tcam errors command.

    The following is a sample syslog message output showing the TCAM resource shortage for Ethernet-switching family filters for newer configurations by running the show log messages CLI command.

    If you use the show pfe tcam errors all-tcam-stages detail CLI command to verify the status of the configuration, the output will be as shown below:

    The output indicates that the fw-l2-in application is running out of TCAM resources and moves into a FAILED state. Although there are two TCAM slices available at the ingress stage, the fw-l2-in application is not able to use the available TCAM space due to its mode (DOUBLE), resulting in resource shortage failure.

  3. Fixing the applications that have failed due to the shortage of TCAM resources.

    The fw-l2-in application failed because adding more services to the router resulted in a shortage of TCAM resources. Although the other applications seem to work fine, it is recommended that you deactivate or remove the newly added services so that the fw-l2-in application moves to the OK state. After removing or deactivating the newly added services, run the show pfe tcam usage and show pfe tcam errors commands to verify that no applications remain in the failed state.

    To view the TCAM resource usage for all stages (egress, ingress, and pre-ingress), use the show pfe tcam usage all-tcam-stages detail command. For ACX5448 routers, use the show pfe filter hw summary command to view the TCAM resource usage.

    To view TCAM resource usage errors for all stages (egress, ingress, and pre-ingress), use the show pfe tcam errors all-tcam-stages command.

    All the applications using TCAM resources are in the OK state, which indicates that the hardware has been successfully configured.

Note:

As shown in the example, you will need to run the show pfe tcam errors and show pfe tcam usage commands at each step to ensure that your configurations are valid and that the applications using TCAM resource are in OK state. For ACX5448 routers, use the show pfe filter hw summary command to view the TCAM resource usage.

Monitoring and Troubleshooting TCAM Resource in ACX Series Routers

The dynamic allocation of Ternary Content Addressable Memory (TCAM) space in ACX Series routers efficiently allocates the available TCAM resources for various filter applications. In the dynamic TCAM model, various filter applications (such as inet-firewall, bridge-firewall, cfm-filters, etc.) can optimally utilize the available TCAM resources as and when required. Dynamic TCAM resource allocation is usage driven: TCAM space is allocated to filter applications as needed. When a filter application no longer uses the TCAM space, the resource is freed and becomes available for use by other applications. This dynamic TCAM model supports a higher scale of TCAM resource utilization based on application demand. You can use the show and clear commands to monitor and troubleshoot dynamic TCAM resource usage in ACX Series routers.

Note:

An application that uses TCAM resources is termed a tcam-app in this document.

Table 4 shows the tasks and the commands to monitor and troubleshoot TCAM resources in ACX Series routers.

Table 4: Commands to Monitor and Troubleshoot TCAM Resource in ACX Series

How to

Command

View the shared and the related applications for a particular application.

show pfe tcam app (list-shared-apps | list-related-apps)

View the number of applications across all tcam stages.

show pfe tcam usage all-tcam-stages

View the number of applications using the TCAM resource at a specified stage.

show pfe tcam usage tcam-stage (ingress | egress | pre-ingress)

View the TCAM resource used by an application in detail.

show pfe tcam usage app <application-name> detail

View the TCAM resource used by an application at a specified stage.

show pfe tcam usage tcam-stage (ingress | egress | pre-ingress) app <application-name>

View the number of TCAM resources consumed by a tcam-app.

show pfe tcam usage app <application-name>

View the TCAM resource usage errors for all stages.

show pfe tcam errors all-tcam-stages detail

View the TCAM resource usage errors for a stage

show pfe tcam errors tcam-stage (ingress | egress | pre-ingress)

View the TCAM resource usage errors for an application.

show pfe tcam errors app <application-name>

View the TCAM resource usage errors for an application along with its other shared application.

show pfe tcam errors app <application-name> shared-usage

Clear the TCAM resource usage error statistics for all stages.

clear pfe tcam-errors all-tcam-stages

Clear the TCAM resource usage error statistics for a specified stage

clear pfe tcam-errors tcam-stage (ingress | egress | pre-ingress)

Clear the TCAM resource usage error statistics for an application.

clear pfe tcam-errors app <application-name>

To learn more about dynamic TCAM in ACX Series routers, see Dynamic Ternary Content Addressable Memory Overview.

Service Scaling on ACX5048 and ACX5096 Routers

On ACX5048 and ACX5096 routers, a typical service (such as ELINE, ELAN, and IP VPN) that is deployed might require applications (such as policers, firewall filters, connectivity fault management IEEE 802.1ag, and RFC2544) that use the dynamic TCAM infrastructure.

Note:

Service applications that use TCAM resources are limited by TCAM resource availability. Therefore, the scale of the service depends on the TCAM resources that such applications consume.
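The dependency between service scale and TCAM consumption can be made concrete with a rough estimate. The following Python sketch is a back-of-the-envelope model, not a sizing tool: the function `max_services`, the per-stage capacities, and the per-service entry counts are all hypothetical values chosen for illustration, and the model assumes that every service consumes the same number of entries in each stage.

```python
def max_services(stage_capacity, per_service_usage):
    """Estimate how many identical services fit before one stage runs out.

    stage_capacity    -- TCAM entries available in each stage
    per_service_usage -- TCAM entries one service consumes in each stage
    """
    limits = []
    for stage, used in per_service_usage.items():
        if used > 0:
            # Whole services only, so use integer division.
            limits.append(stage_capacity[stage] // used)
    if not limits:
        raise ValueError("the service consumes no TCAM entries")
    # The most constrained stage determines the achievable scale.
    return min(limits)


# Hypothetical numbers, for illustration only.
capacity = {"pre-ingress": 256, "ingress": 1000, "egress": 500}
one_service = {"pre-ingress": 1, "ingress": 8, "egress": 2}
print(max_services(capacity, one_service))  # 125
```

In this made-up case the ingress stage is the bottleneck, so adding services beyond 125 would exhaust ingress TCAM even though the other stages still have room.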

A sample use case for monitoring and troubleshooting service scale in ACX5048 and ACX5096 routers can be found at the Dynamic Ternary Content Addressable Memory Overview section.

Troubleshooting DNS Name Resolution in Logical System Security Policies (Primary Administrators Only)

Problem

Description

The address of a hostname in an address book entry that is used in a security policy might fail to resolve correctly.

Cause

Normally, address book entries that contain dynamic hostnames refresh automatically for SRX Series Firewalls. The TTL field associated with a DNS entry indicates the time after which the entry should be refreshed in the policy cache. Once the TTL value expires, the SRX Series Firewall automatically refreshes the DNS entry for an address book entry.

However, if the SRX Series Firewall is unable to obtain a response from the DNS server (for example, the DNS request or response packet is lost in the network or the DNS server cannot send a response), the address of a hostname in an address book entry might fail to resolve correctly. This can cause traffic to drop as no security policy or session match is found.
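The refresh behavior described above can be sketched as a cache entry that is re-resolved when its TTL expires. The following Python model is illustrative only: the class `AddressBookEntry`, the resolver functions, the hostname, and the address are hypothetical. Keeping the previous address after a failed refresh is a choice made for the sketch; this section does not state what the device does with the old value.

```python
class AddressBookEntry:
    """A hostname whose resolved address is cached until its TTL expires."""

    def __init__(self, hostname):
        self.hostname = hostname
        self.address = None
        self.expires_at = 0.0  # nothing cached yet, so a refresh is due

    def needs_refresh(self, now):
        return now >= self.expires_at

    def refresh(self, resolver, now):
        """Re-resolve the hostname; return False if no DNS response arrives."""
        try:
            address, ttl = resolver(self.hostname)
        except LookupError:
            return False  # entry stays stale and remains due for refresh
        self.address = address
        self.expires_at = now + ttl
        return True


def good_resolver(hostname):
    # Stand-in for a DNS server that answers with an address and a TTL.
    return ("192.0.2.10", 60)


def failing_resolver(hostname):
    # Stand-in for a lost request or response, or an unresponsive server.
    raise LookupError("no response from DNS server")


entry = AddressBookEntry("host.example.net")
print(entry.refresh(good_resolver, now=0))      # True
print(entry.needs_refresh(now=59))              # False: TTL has not expired
print(entry.refresh(failing_resolver, now=60))  # False: refresh failed
print(entry.needs_refresh(now=61))              # True: entry is still stale
```

The last two lines correspond to the failure case in this topic: the TTL has expired, the refresh did not succeed, and the entry can no longer be trusted to match the host's current address.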

Solution

The primary administrator can use the show security dns-cache command to display DNS cache information on the SRX Series Firewall. If the DNS cache information needs to be refreshed, the primary administrator can use the clear security dns-cache command.

Note:

These commands are available only to the primary administrator on devices that are configured for logical systems. They are not available in user logical systems or on devices that are not configured for logical systems.

Troubleshooting Security Policies

Synchronizing Policies Between Routing Engine and Packet Forwarding Engine

Problem

Description

Security policies are stored in the Routing Engine and the Packet Forwarding Engine. Security policies are pushed from the Routing Engine to the Packet Forwarding Engine when you commit configurations. If the security policies on the Routing Engine are out of sync with the Packet Forwarding Engine, the commit of a configuration fails. Core dump files might be generated if the commit is tried repeatedly. The out-of-sync condition can be caused by:

  • A policy message from the Routing Engine to the Packet Forwarding Engine that is lost in transit.

  • An error with the Routing Engine, such as a reused policy UID.

Environment

The policies in the Routing Engine and Packet Forwarding Engine must be in sync for the configuration to be committed. However, under certain circumstances, policies in the Routing Engine and the Packet Forwarding Engine might be out of sync, which causes the commit to fail.

Symptoms

When the policy configurations are modified and the policies are out of sync, the following error message is displayed: error: Warning: policy might be out of sync between RE and PFE <SPU-name(s)> Please request security policies check/resync.

Solution

Use the show security policies checksum command to display the security policy checksum value and use the request security policies resync command to synchronize the configuration of security policies in the Routing Engine and Packet Forwarding Engine, if the security policies are out of sync.
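The idea behind comparing checksums can be sketched in a few lines of Python. This is a conceptual illustration only: the functions `policy_checksum` and `in_sync`, the policy names, and the policy text are hypothetical, and SHA-256 is a stand-in, because this section does not describe how the device computes the value shown by the show security policies checksum command.

```python
import hashlib


def policy_checksum(policies):
    """Digest an ordered list of (name, definition) pairs.

    Order is included in the digest because policy order is significant.
    """
    digest = hashlib.sha256()
    for name, definition in policies:
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent fields cannot run together
        digest.update(definition.encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def in_sync(re_policies, pfe_policies):
    """Two copies are considered in sync when their checksums match."""
    return policy_checksum(re_policies) == policy_checksum(pfe_policies)


# Hypothetical policy sets held by the two engines.
routing_engine = [("allow-web", "permit tcp 443"), ("deny-all", "deny any")]
forwarding_engine = [("allow-web", "permit tcp 443")]  # one update was lost

print(in_sync(routing_engine, routing_engine))     # True
print(in_sync(routing_engine, forwarding_engine))  # False
```

A mismatch like the second result corresponds to the case where a policy message was lost in transit, which is when a resync is needed.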

Checking a Security Policy Commit Failure

Problem

Description

Most policy configuration failures occur during a commit or runtime.

Commit failures are reported directly on the CLI when you execute the commit check command in configuration mode. These errors are configuration errors, and you cannot commit the configuration without fixing them.

Solution

To fix these errors, do the following:

  1. Review your configuration data.

  2. Open the file /var/log/nsd_chk_only. This file is overwritten each time you perform a commit check and contains detailed failure information.

Verifying a Security Policy Commit

Problem

Description

Upon performing a policy configuration commit, if you notice that the system behavior is incorrect, use the following steps to troubleshoot this problem:

Solution

  1. Operational show commands: Execute the operational commands for security policies and verify that the information shown in the output is consistent with what you expected. If it is not, change the configuration appropriately.

  2. Traceoptions: Configure traceoptions in your policy configuration. Select the flags under this hierarchy based on your analysis of the show command output. If you cannot determine which flag to use, use the all flag to capture all trace logs.

You can also configure an optional filename to capture the logs.

If you specified a filename in the trace options, check the log file at /var/log/<filename> to determine whether any errors were reported. (If you did not specify a filename, the default filename is eventd.) The error messages indicate where the failure occurred and the reason for it.

After configuring the trace options, you must recommit the configuration change that caused the incorrect system behavior.

Debugging Policy Lookup

Problem

Description

When you have the correct configuration, but some traffic was incorrectly dropped or permitted, you can enable the lookup flag in the security policies traceoptions. The lookup flag logs the lookup related traces in the trace file.

Solution

Log Error Messages used for Troubleshooting ISSU-Related Problems

The following problems might occur during an ISSU upgrade. You can identify the errors by using the details in the logs. For detailed information about specific system log messages, see System Log Explorer.

Chassisd Process Errors

Problem

Description

Errors related to chassisd.

Solution

Use the error messages to understand the issues related to chassisd.

When ISSU starts, a request is sent to chassisd to check whether there are any problems related to the ISSU from a chassis perspective. If there is a problem, a log message is created.

Understanding Common Error Handling for ISSU

Problem

Description

You might encounter some problems in the course of an ISSU. This section provides details on how to handle them.

Solution

Any errors encountered during an ISSU result in the creation of log messages, and the ISSU continues to function without impact to traffic. If reverting to previous versions is required, the event is either logged or the ISSU is halted, so as not to create mismatched versions on the two nodes of the chassis cluster. Table 8 lists some of the common error conditions and their workarounds. The sample messages used in Table 8 are from the SRX1500 device and also apply to all supported SRX Series Firewalls.

Table 8: ISSU-Related Errors and Solutions

Error Conditions

Solutions

Attempt to initiate an ISSU when previous instance of an ISSU is already in progress

The following message is displayed:

warning: ISSU in progress

You can abort the current ISSU process by using the request chassis cluster in-service-upgrade abort command, and then initiate the ISSU again.

Reboot failure on the secondary node

No service downtime occurs, because the primary node continues to provide required services. Detailed console messages are displayed requesting that you manually clear existing ISSU states and restore the chassis cluster.

error: [Oct  6 12:30:16]: Reboot secondary node failed (error-code: 4.1)

       error: [Oct  6 12:30:16]: ISSU Aborted! Backup node maybe in inconsistent state, Please restore backup node
       [Oct  6 12:30:16]: ISSU aborted. But, both nodes are in ISSU window.
       Please do the following:
       1. Rollback the node with the newer image using rollback command
          Note: use the 'node' option in the rollback command
          otherwise, images on both nodes will be rolled back
       2. Make sure that both nodes (will) have the same image
       3. Ensure the node with older image is primary for all RGs
       4. Abort ISSU on both nodes
       5. Reboot the rolled back node

Starting with Junos OS Release 17.4R1, the hold timer for the initial reboot of the secondary node during the ISSU process is extended from 15 minutes (900 seconds) to 45 minutes (2700 seconds) in chassis clusters on SRX1500, SRX4100, SRX4200, and SRX4600 devices.

Secondary node failed to complete the cold synchronization

The primary node times out if the secondary node fails to complete the cold synchronization. Detailed console messages are displayed requesting that you manually clear existing ISSU states and restore the chassis cluster. No service downtime occurs in this scenario.

[Oct  3 14:00:46]: timeout waiting for secondary node node1 to sync(error-code: 6.1)
        Chassis control process started, pid 36707 

       error: [Oct  3 14:00:46]: ISSU Aborted! Backup node has been upgraded, Please restore backup node 
       [Oct  3 14:00:46]: ISSU aborted. But, both nodes are in ISSU window. 
       Please do the following: 
      1. Rollback the node with the newer image using rollback command 
          Note: use the 'node' option in the rollback command 
          otherwise, images on both nodes will be rolled back 
      2. Make sure that both nodes (will) have the same image 
      3. Ensure the node with older image is primary for all RGs 
      4. Abort ISSU on both nodes 
      5. Reboot the rolled back node  

Failover of newly upgraded secondary failed

No service downtime occurs, because the primary node continues to provide required services. Detailed console messages are displayed requesting that you manually clear existing ISSU states and restore the chassis cluster.

[Aug 27 15:28:17]: Secondary node0 ready for failover.
[Aug 27 15:28:17]: Failing over all redundancy-groups to node0
ISSU: Preparing for Switchover
error: remote rg1 priority zero, abort failover.
[Aug 27 15:28:17]: failover all RGs to node node0 failed (error-code: 7.1)
error: [Aug 27 15:28:17]: ISSU Aborted!
[Aug 27 15:28:17]: ISSU aborted. But, both nodes are in ISSU window.
Please do the following:
1. Rollback the node with the newer image using rollback command
    Note: use the 'node' option in the rollback command
           otherwise, images on both nodes will be rolled back
2. Make sure that both nodes (will) have the same image
3. Ensure the node with older image is primary for all RGs
4. Abort ISSU on both nodes
5. Reboot the rolled back node
{primary:node1}

Upgrade failure on primary

No service downtime occurs, because the secondary node fails over as primary and continues to provide required services.

Reboot failure on primary node

Before the primary node reboots, the devices are out of the ISSU setup, so no ISSU-related error messages are displayed. The following reboot error message is displayed if any other failure is detected:


ISSU Support-Related Errors

Problem

Description

Installation failure occurs because of unsupported software and unsupported feature configuration.

Solution

Use the following error messages to understand the compatibility-related problems:

Initial Validation Checks Failure

Problem

Description

The initial validation checks fail.

Solution

The validation checks fail if the image is not present or if the image file is corrupt. The following error messages are displayed when the initial validation checks fail and the ISSU is aborted:

When Image Is Not Present

When Image File Is Corrupted

If the image file is corrupted, the following output displays:

The primary node validates the device configuration to ensure that it can be committed using the new software version. If anything goes wrong, the ISSU aborts and error messages are displayed.

Installation-Related Errors

Problem

Description

The install image file does not exist or the remote site is inaccessible.

Solution

Use the following error messages to understand the installation-related problems:

ISSU downloads the install image as specified in the ISSU command as an argument. The image file can be a local file or located at a remote site. If the file does not exist or the remote site is inaccessible, an error is reported.

Redundancy Group Failover Errors

Problem

Description

Problems with automatic redundancy group (RG) failover.

Solution

Use the following error messages to understand the problem:

Kernel State Synchronization Errors

Problem

Description

Errors related to ksyncd.

Solution

Use the following error messages to understand the issues related to ksyncd:

ISSU checks whether there are any ksyncd errors on the secondary node (node 1). If there are any problems, it displays the error message and aborts the upgrade.

Change History Table

Feature support is determined by the platform and release you are using. Use Feature Explorer to determine if a feature is supported on your platform.

| Release | Description |
| --- | --- |
| 17.4R1 | Starting with Junos OS Release 17.4R1, the hold timer for the initial reboot of the secondary node during the ISSU process is extended from 15 minutes (900 seconds) to 45 minutes (2700 seconds) in chassis clusters on SRX1500, SRX4100, SRX4200, and SRX4600 devices. |