Troubleshooting EX9253 Components

 

Troubleshooting the Cooling System in an EX9253 Switch

Problem

Description: The fans in the fan tray are not functioning normally.

Cause

Solution

Follow these guidelines to troubleshoot the fans:

  • Check the status LED on the fan tray and the alarm LED on the front panel.

    If the alarm LED on the front panel glows, use the CLI to get information about the source of an alarm condition:

    user@switch> show chassis alarms

    If the CLI output lists only one fan failure and the other fans are functioning normally, the fan is most likely faulty. You cannot replace a single fan: if one or more fans fail, you must replace the entire fan tray.

  • Place your hand near the exhaust vents at the side of the chassis to determine whether the fans are pushing air out of the chassis.

  • If a fan tray is removed, both a minor alarm and a major alarm occur.

  • The following conditions automatically cause the fans to run at full speed and also trigger the indicated alarm:

    • A fan fails (major alarm).

    • The switch temperature exceeds the temperature warm threshold (minor alarm).

    • The temperature of the switch exceeds the temperature hot threshold (major alarm and automatic shutdown of the power supplies).
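The conditions listed above can be summarized in a short sketch. This is an illustrative model only, not Junos OS code: the function name, the result type, and the threshold parameters are hypothetical, and the actual warm and hot threshold values for your switch are not given in this topic.

```python
from typing import NamedTuple, Optional


class CoolingResponse(NamedTuple):
    fans_full_speed: bool
    alarm: Optional[str]      # "major", "minor", or None
    power_shutdown: bool      # automatic shutdown of the power supplies


def cooling_response(fan_failed: bool, temp_c: float,
                     warm_c: float, hot_c: float) -> CoolingResponse:
    """Model the documented fan and temperature conditions.

    warm_c and hot_c are placeholders for the warm and hot thresholds;
    their real values depend on the switch and are assumptions here.
    """
    if temp_c > hot_c:
        # Hot threshold exceeded: major alarm and power supply shutdown.
        return CoolingResponse(True, "major", True)
    if fan_failed:
        # A failed fan raises a major alarm and forces full speed.
        return CoolingResponse(True, "major", False)
    if temp_c > warm_c:
        # Warm threshold exceeded: minor alarm, fans at full speed.
        return CoolingResponse(True, "minor", False)
    return CoolingResponse(False, None, False)
```

For example, with placeholder thresholds of 60 and 90 degrees Celsius, `cooling_response(False, 70, 60, 90)` reports a minor alarm with the fans at full speed.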

Troubleshooting the Power Supply in an EX9253 Switch

Problem

Description: The power system is not functioning normally.

Cause

Solution

  • Check the LEDs on each power supply faceplate. If a power supply is installed correctly and is functioning normally, the status LED lights steadily.

  • Issue the show chassis environment pem command to check the status of the power supplies. As shown in the sample output, the value Online in the rows labeled State indicates that each power supply is functioning normally.

    user@switch> show chassis environment pem

If a power supply is not functioning normally, perform the following steps to diagnose and correct the problem:

  • If a major alarm condition occurs, issue the show chassis alarms command to determine the source of the problem.

  • If all power supplies have failed, the system temperature might have exceeded the threshold, causing the system to shut down.

    Note

    If the system temperature exceeds the threshold, Junos OS shuts down all power supplies so that no status is displayed.

    Junos OS can also shut down one of the power supplies for other reasons. In this case, the remaining power supplies provide power to the switch, and you can still view the system status through the CLI or display.

  • Check that the DC circuit breaker or AC input switch is in the on position and that the power supply is receiving power.

  • Verify that the source circuit breaker has the proper current rating. Each power supply must be connected to a separate source circuit breaker.

  • Verify that the AC power cord or DC power cables from the power source to the switch are not damaged. If the insulation is cracked or broken, immediately replace the cord or cable.

  • Connect the power supply to a different power source with a new power cord or power cables. If the power supply status LEDs indicate that the power supply is not operating normally, the power supply is the source of the problem. Replace the power supply with a spare.

Troubleshooting Line Cards in an EX9253 Switch

Problem

Description: Line card is not functioning normally.

Solution

  • The Routing Engine downloads the line card software to the line card under two conditions: when the line card is present while the Routing Engine boots Junos OS, and when the line card is installed and requested online through the CLI or the button on the front panel. The line card then runs diagnostics. When the line card is online and functioning normally, the OK/FAIL LED is lit green steadily.

  • Make sure the line card is properly seated in the chassis. Check that each ejector handle is tight.

  • Check the OK/FAIL LED on the line card. When the line card is online and functioning normally, the OK/FAIL LED is lit green steadily.

  • Issue the show chassis fpc command to check the status of installed line cards. As shown in the sample output, the value Online in the column labeled Slot State indicates that the line card is functioning normally:

    user@switch> show chassis fpc

    Note

    The show chassis fpc command displays the status of the line cards.

    For more detailed output, add the detail option. The following example does not specify a slot number, which is optional:

    user@switch> show chassis fpc detail

Troubleshoot Temperature Alarms in EX Series Switches

Problem

Description: EX Series switches generate a temperature alarm FPC 0 EX-PFE1 Temp Too Hot.

Cause

Temperature sensors in the chassis monitor the temperature of the chassis. The switch raises an alarm if a fan fails or if the temperature of the chassis exceeds permissible levels.

Solution

When the switch raises a temperature alarm such as the FPC 0 EX-PFE1 Temp Too Hot alarm, use the show chassis environment and the show chassis temperature-thresholds commands to identify the condition that triggered the alarm.

Caution

To prevent the switch from overheating, do not operate it in an area that exceeds the maximum recommended ambient temperature. To prevent airflow restriction, allow at least 6 inches (15.2 cm) of clearance around the ventilation openings.

  1. Connect to the switch by using Telnet and issue the show chassis environment command. This command displays environmental information about the switch chassis, including the temperature, and information about the fans, power supplies, and Routing Engines. Following is a sample output on an EX9208 switch. The output is similar on other EX Series switches.

    show chassis environment (EX9208 Switch)

    user@switch> show chassis environment

    Table 1 lists the output fields for the show chassis environment command. Output fields are listed in the approximate order in which they appear.

    Table 1: show chassis environment Output Fields

    Field Name

    Field Description

    Class

    Information about the category or class of chassis component:

    • Temp: Temperature of air flowing through the chassis in degrees Celsius (°C) and degrees Fahrenheit (°F).

    • Fans: Information about the status of fans and blowers.

    Item

    Information about the chassis components: Flexible PIC Concentrators (FPCs), that is, the line cards; Control Boards (CBs); Routing Engines (REs); and Power Entry Modules (PEMs), that is, the power supplies.

    Status

    Status of the specified chassis component. For example, if Class is Fans, the fan status can be:

    • OK: The fans are operational.

    • Testing: The fans are being tested during initial power-on.

    • Failed: The fans have failed or the fans are not spinning.

    • Absent: The fan tray is not installed.

    Measurement

    Depends on the Class. For example, if Class is Temp, indicates the temperature in degrees Celsius (°C) and degrees Fahrenheit (°F). If the Class is Fans, indicates actual fan RPM.

  2. Issue the command show chassis temperature-thresholds. This command displays the chassis temperature threshold settings. Following is a sample output on an EX9208 switch. The output is similar on other EX Series switches.

    show chassis temperature-thresholds (EX9208 Switch)

    user@host> show chassis temperature-thresholds

    Table 2 lists the output fields for the show chassis temperature-thresholds command. Output fields are listed in the approximate order in which they appear.

    Table 2: show chassis temperature-thresholds Output Fields

    Field Name

    Field Description

    Item

    Chassis component. You can configure the output to display threshold information for components such as the chassis, the Routing Engines, and the FPC for each slot in each FRU. By default, information is displayed only for the chassis and the Routing Engines.

    Fan speed

    Temperature thresholds, in degrees Celsius, for the fans to operate at normal and at high speed.

    • Normal—The temperature threshold at which the fans operate at normal speed and when all the fans are present and functioning normally.

    • High—The temperature threshold at which the fans operate at high speed or when a fan has failed or is missing.

    Note: An alarm is not triggered until the temperature exceeds the threshold settings for a yellow or amber alarm or a red alarm.

    Yellow or amber alarm

    Temperature thresholds, in degrees Celsius, that trigger a yellow or amber alarm.

    • Normal—The temperature threshold that must be exceeded on the component to trigger a yellow or amber alarm when the fans are running at full speed.

    • Bad fan—The temperature threshold that must be exceeded on the component to trigger a yellow or amber alarm when one or more fans have failed or are missing.

    Red alarm

    Temperature thresholds, in degrees Celsius, that trigger a red alarm.

    • Normal—The temperature threshold that must be exceeded on the component to trigger a red alarm when the fans are running at full speed.

    • Bad fan—The temperature threshold that must be exceeded on the component to trigger a red alarm when one or more fans have failed or are missing.

    Fire Shutdown

    Temperature threshold, in degrees Celsius, for the switch to shut down.

When a temperature alarm is triggered, you can identify the condition that triggered it by running the show chassis environment command to display the chassis temperature values for each component and comparing those with the temperature threshold values, which you can display by running the show chassis temperature-thresholds command.

For example, for FPC 3:

  • If the temperature of FPC 3 exceeds 55°C, the output indicates that the fans are operating at high speed (no alarm is triggered).

  • If the temperature of FPC 3 exceeds 65°C while one or more fans have failed or are missing, a yellow alarm is triggered.

  • If the temperature of FPC 3 exceeds 75°C while all fans are functioning normally, a yellow alarm is triggered to indicate that the temperature threshold limit is exceeded.

  • If the temperature of FPC 3 exceeds 80°C while one or more fans have failed or are missing, a red alarm is triggered.

  • If the temperature of FPC 3 exceeds 105°C while all fans are functioning normally, a red alarm is triggered to indicate that the temperature threshold limit is exceeded.

  • If the temperature of FPC 3 exceeds 110°C, the switch is powered off.
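The comparison described in this example can be sketched as a small function. This is a minimal illustration, not how Junos OS implements the check. The class, field, and function names are hypothetical, and the only data used are the FPC 3 values quoted in the list above, interpreted according to the Normal and Bad fan definitions in Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Temperature thresholds, in degrees Celsius, for one component."""
    fan_high: float
    yellow_bad_fan: float
    yellow_normal: float
    red_bad_fan: float
    red_normal: float
    fire_shutdown: float


# Values from the FPC 3 example in this topic.
FPC3 = Thresholds(fan_high=55, yellow_bad_fan=65, yellow_normal=75,
                  red_bad_fan=80, red_normal=105, fire_shutdown=110)


def evaluate(temp_c: float, fans_ok: bool, th: Thresholds = FPC3) -> str:
    """Return the condition that a measured temperature corresponds to."""
    if temp_c > th.fire_shutdown:
        return "shutdown"
    # The Bad fan thresholds apply when a fan has failed or is missing.
    red = th.red_normal if fans_ok else th.red_bad_fan
    yellow = th.yellow_normal if fans_ok else th.yellow_bad_fan
    if temp_c > red:
        return "red alarm"
    if temp_c > yellow:
        return "yellow alarm"
    # Fans also run at high speed whenever a fan has failed or is missing.
    if temp_c > th.fan_high or not fans_ok:
        return "fans at high speed"
    return "normal"
```

With these values, a reading of 70°C on FPC 3 evaluates to "fans at high speed" when all fans are healthy, but to "yellow alarm" when a fan has failed, because the lower Bad fan threshold applies.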

Table 3 lists the possible causes for the switch to generate a temperature alarm and the respective remedies.

Table 3: Causes and Remedies for Temperature Alarms

Cause

Remedy

Ambient temperature is above threshold temperature.

Ensure that the ambient temperature is within the threshold temperature limit. See Environmental Requirements and Specifications for EX Series Switches.

Fan module or fan tray has failed.

  • Check the fan.

  • Replace the faulty fan module or fan tray.

  • If the above two checks show no problems, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

Restricted airflow through the switch due to insufficient clearance around the installed switch.

Ensure that there is sufficient clearance around the installed switch. See the following topics to understand the clearance requirements of various EX Series switches.

Understand Alarm Types and Severity Levels on EX Series Switches

Note

This topic applies only to the J-Web Application package.

Alarms alert you to conditions that might prevent normal operation of the switch. Before monitoring alarms on a Juniper Networks EX Series Ethernet switch, become familiar with the terms defined in Table 4.

Table 4: Alarm Terms

Term

Definition

alarm

Signal alerting you to conditions that might prevent normal operation. On a switch, the alarm signal is the ALM LED lit on the front of the chassis.

alarm condition

Failure event that triggers an alarm.

alarm severity

Seriousness of the alarm. If the Alarm (ALM) LED is red, this indicates a major alarm. If the Alarm LED is yellow or amber, this indicates a minor alarm. If the Alarm LED is unlit, there is no alarm or the switch is halted.

chassis alarm

Preset alarm triggered by a physical condition on the switch such as a power supply failure, excessive component temperature, or media failure.

system alarm

Preset alarm triggered by a missing rescue configuration or failure to install a license for a licensed software feature.

Note: On EX6200 switches, a system alarm can be triggered by an internal link error.

Alarm Types

The switch supports these alarms:

  • Chassis alarms indicate a failure on the switch or one of its components. Chassis alarms are preset and cannot be modified.

  • System alarms indicate a missing rescue configuration. System alarms are preset and cannot be modified, although you can configure them to appear automatically in the J-Web interface display or the CLI display.

Alarm Severity Levels

Alarms on switches have two severity levels:

  • Major (red)—Indicates a critical situation on the switch that has resulted from one of the following conditions. A red alarm condition requires immediate action.

    • One or more hardware components have failed.

    • One or more hardware components have exceeded temperature thresholds.

    • An alarm condition configured on an interface has triggered a critical warning.

  • Minor (yellow or amber)—Indicates a noncritical condition on the switch that, if left unchecked, might cause an interruption in service or degradation in performance. A yellow or amber alarm condition requires monitoring or maintenance.

    A missing rescue configuration generates a yellow or amber system alarm.
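As a quick reference, the relationship between the ALM LED color and the alarm severity described in Table 4 and in the list above can be written as a lookup. This is an illustrative sketch; the function name and the returned strings are hypothetical and do not correspond to any Junos OS or J-Web interface.

```python
def severity_from_alarm_led(color: str) -> str:
    """Map the ALM LED color to the alarm severity it indicates."""
    normalized = color.strip().lower()
    if normalized == "red":
        return "major"
    if normalized in ("yellow", "amber"):
        return "minor"
    if normalized in ("off", "unlit"):
        # An unlit LED means either no alarm or a halted switch.
        return "none or halted"
    raise ValueError(f"unexpected LED color: {color!r}")
```

An unlit LED is deliberately reported as ambiguous, because the LED alone cannot distinguish a healthy switch from a halted one.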

Chassis Component Alarm Conditions on EX9253 Switches

This topic describes the chassis component alarm conditions on EX9253 switches.

Table 5 lists the alarms that the chassis components can generate on EX9253 switches.

Table 5: Chassis Component Alarm Conditions on EX9253 Switches

Chassis Component

Alarm Condition

Alarm Severity

Remedy

Alternative media

The switch boots from an alternate boot device, the secondary SSD. The primary SSD (SSD0) is typically the primary boot device. The Routing Engine boots from the secondary SSD (SSD1) when the primary boot device fails.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

Line Cards

A line card is offline.

Minor (yellow)

Check the line card. Remove and reinstall the line card. If this fails, replace the failed card.

A line card has failed.

Major (red)

Replace the failed line card.

A line card has been removed.

Major (red)

Install a line card in the empty slot.

Fan trays

A fan tray has been removed from the chassis.

Major (red)

Install the missing fan tray.

One fan in the chassis is not spinning or is spinning below required speed.

Major (red)

Replace the fan tray.

Hot swapping

Too many hot-swap interrupts are occurring. This message generally indicates that a hardware component that plugs into the switch’s backplane from the front (generally, an FPC) is broken.

Major (red)

Replace the failed components.

Power supplies

A power supply has been removed from the chassis.

Minor (yellow)

Install a power supply in the empty slot.

A power supply has a high temperature.

Major (red)

Replace the failed power supply.

A power supply input has failed.

Major (red)

Check power supply input connection.

A power supply output has failed.

Major (red)

Check power supply output connection.

A power supply has failed.

Major (red)

Replace the failed power supply.

AC and DC power supplies are installed.

Major (red)

Do not mix AC and DC power supplies.

Inadequate number of power supplies.

Major (red)

Install an additional power supply.

Routing Engine

Excessive framing errors on console port.

An excessive framing error alarm is triggered when the default framing error threshold of 20 errors per second on a serial port is exceeded.

A faulty serial console port cable might be connected to the device.

Minor (yellow)

Replace the serial cable connected to the device.

If the cable is replaced and no excessive framing errors are detected within five minutes from the last detected framing error, the alarm is cleared automatically.

Error in reading or writing SSD.

Minor (yellow)

Reformat the SSD and install the bootable image. If this fails, replace the failed Routing Engine.

System booted from the default backup Routing Engine. If you manually switched primary role, ignore this alarm condition.

Minor (yellow)

Install the bootable image on the default primary Routing Engine. If this fails, replace the failed Routing Engine.

System booted from SSD.

Minor (yellow)

Install the bootable image on the SSD. If this fails, replace the failed Routing Engine.

SSD missing in boot list.

Major (red)

Replace the failed Routing Engine.

Routing Engine failed to boot.

Major (red)

Replace the failed Routing Engine.

The Ethernet management interface (fxp0 or em0) on the Routing Engine is down.

Major (red)

  • Check the interface cable connection.

  • Reboot the system.

  • If the alarm recurs, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

/var partition usage is high.

Minor (yellow)

Clean up the system file storage space on the switch. For more information, see Freeing Up System Storage Space.

/var partition is full.

Major (red)

Clean up the system file storage space on the switch. For more information, see Freeing Up System Storage Space.

Rescue configuration is not set.

Minor (yellow)

Use the request system configuration rescue save command to set the rescue configuration.

Feature usage requires a license or the license for the feature usage has expired.

Minor (yellow)

Install the required license for the feature specified in the alarm. For more information, see Understanding Software Licenses for EX Series Switches.

Temperature

The temperature has exceeded the defined threshold for the Bad fan condition listed in the output of the show chassis temperature-thresholds command.

Minor (yellow)

  • Check room temperature.

  • Check air filter and replace it, if required.

  • Check airflow.

  • Replace the fan tray.

The temperature has exceeded the defined threshold for the Normal condition listed in the output of the show chassis temperature-thresholds command.

Minor (yellow)

  • Check room temperature.

  • Check air filter and replace it, if required.

  • Check airflow.

  • Check the fans.

The temperature has exceeded the defined threshold for the Red alarm condition listed in the output of the show chassis temperature-thresholds command.

Major (red)

  • Check room temperature.

  • Check air filter and replace it, if required.

  • Check airflow.

  • Check the fans.

The temperature has exceeded the defined threshold for the Fire Shutdown condition listed in the output of the show chassis temperature-thresholds command.

Major (red)

  • Check room temperature.

  • Check air filter and replace it, if required.

  • Check airflow.

  • Check fans.

The temperature sensor has failed.

Major (red)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).
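The console-port framing error condition in Table 5 combines a rate threshold with a timed clear. The following sketch models that behavior under stated assumptions: the class and method names are hypothetical, samples are assumed to arrive as one errors-per-second reading at a time, and the only values taken from the table are the default threshold of 20 errors per second and the five-minute clearing interval.

```python
from dataclasses import dataclass
from typing import Optional

FRAMING_ERROR_THRESHOLD = 20   # errors per second (default, per Table 5)
CLEAR_AFTER_SECONDS = 5 * 60   # five minutes without excessive errors


@dataclass
class FramingErrorAlarm:
    """Hypothetical model of the console-port framing error alarm."""
    active: bool = False
    last_excess_at: Optional[float] = None

    def observe(self, now: float, errors_per_second: int) -> bool:
        """Record one sample and return whether the alarm is active."""
        if errors_per_second > FRAMING_ERROR_THRESHOLD:
            # Threshold exceeded: raise the alarm and restart the timer.
            self.active = True
            self.last_excess_at = now
        elif self.active and now - self.last_excess_at >= CLEAR_AFTER_SECONDS:
            # No excessive errors for five minutes: clear the alarm.
            self.active = False
        return self.active
```

In this model the alarm stays raised until a full five minutes have passed since the last excessive reading, which mirrors the automatic clearing described in the remedy column.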

Backup Routing Engine Alarms

For switches with primary and backup Routing Engines, a primary Routing Engine can generate alarms for events that occur on a backup Routing Engine. Table 6 lists chassis alarms generated for events that occur on a backup Routing Engine.

Note

Because the failure occurs on the backup Routing Engine, alarm severity for some events (such as Ethernet interface failures) is yellow instead of red.

Note

For information about configuring redundant Routing Engines, see the Junos OS High Availability Library for Routing Devices.

Table 6: Backup Routing Engine Alarms

Chassis Component

Alarm Condition

Alarm Severity

Remedy

Alternative media

The backup Routing Engine boots from an alternate boot device, the secondary SSD. The primary SSD (SSD0) is typically the primary boot device. The Routing Engine boots from the secondary SSD (SSD1) when the primary boot device fails.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

Boot Device

The boot device is missing in boot list on the backup Routing Engine.

Major (red)

Replace the failed backup Routing Engine.

Ethernet

The Ethernet management interface (fxp0 or em0) on the backup Routing Engine is down.

Minor (yellow)

  • Check the interface cable connection.

  • Reboot the system.

  • If the alarm recurs, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

FRU Offline

The backup Routing Engine has stopped communicating with the primary Routing Engine.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

Multibit Memory ECC

The backup Routing Engine reports a multibit ECC error.

Minor (yellow)

  • Reboot the system with the board reset button on the backup Routing Engine.

  • If the alarm recurs, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll free, US & Canada) or 1-408-745-9500 (from outside the United States).

Monitor System Log Messages

Purpose

Note

This topic applies only to the J-Web Application package.

Use the monitoring functionality to filter and view system log messages for EX Series switches.

Action

To view events in the J-Web interface, select Monitor > Events and Alarms > View Events.

Apply a filter or a combination of filters to view messages. You can use filters to display relevant events. Table 7 describes the different filters, their functions, and the associated actions.

To view events in the CLI, enter the following command:

show log

Table 7: Filtering System Log Messages

Field

Function

Your Action

System Log File

Specifies the name of a system log file for which you want to display the recorded events.

Lists the names of all the system log files that you configure.

By default, a log file, messages, is included in the /var/log/ directory.

To specify events recorded in a particular file, select the system log filename from the list, for example, messages.

Select Include archived files to include archived files in the search.

Process

Specifies the name of the process generating the events you want to display.

To view all the processes running on your system, enter the CLI command show system processes.

For more information about processes, see the Junos OS Installation and Upgrade Guide.

To specify events generated by a process, type the name of the process.

For example, type mgd to list all messages generated by the management process.

Date From

To

Specifies the time period in which the events you want displayed are generated.

Displays a calendar that allows you to select the year, month, day, and time. It also allows you to select the local time.

By default, the messages generated during the last hour are displayed. End Time shows the current time and Start Time shows the time one hour before End Time.

To specify the time period:

  • For Date From, click the Calendar icon and select the year, month, and date, for example, 02/10/2007.

  • For To, click the Calendar icon and select the year, month, and date, for example, 02/10/2007.

  • Click to select the time in hours, minutes, and seconds.

Event ID

Specifies the event ID for which you want to display the messages.

Allows you to type part of the ID and completes the remainder automatically.

An event ID, also known as a system log message code, uniquely identifies a system log message. It begins with a prefix that indicates the generating software process or library.

To specify events with a specific ID, type the partial or complete ID, for example, TFTPD_AF_ERR.

Description

Specifies text from the description of events that you want to display.

Allows you to use regular expressions to match text from the event description.

Note: Regular expression matching is case-sensitive.

To specify events with a specific description, type a text string from the description with regular expression.

For example, type ^Initial* to display all messages with lines beginning with the term Initial.

Search

Applies the specified filter and displays the matching messages.

To apply the filter and display messages, click Search.

Reset

Resets all the fields in the Events Filter box.

To reset the field values that are listed in the Events Filter box, click Reset.

Generate Raw Report

Note:

  • Starting in Junos OS Release 14.1X53, a Raw Report can be generated from the log messages being loaded in the Events Detail table. The Generate Raw Report button is enabled after the event log messages start loading in the Events Detail table.

  • After the log messages are completely loaded in the Events Detail table, Generate Raw Report changes to Generate Report.

Generates a list of event log messages in nontabular format.

To generate a raw report:

  1. Click Generate Raw Report.

    The Opening filteredEvents.html window appears.

  2. Select Open with to open the HTML file or select Save File to save the file.
  3. Click OK.

Generate Report

Note: Starting in Junos OS Release 14.1X53, a Formatted Report can be generated from event log messages being loaded in an Events Detail table. The Generate Report button appears only after event log messages are completely loaded in the Events Detail table. The Generate Raw Report button is displayed while event log messages are being loaded.

Generates a list of event log messages in tabular format, which shows system details, events filter criteria, and event details.

To generate a formatted report:

  1. Click Generate Report.

    The Opening Report.html window appears.

  2. Select Open with to open the HTML file or select Save File to save the file.
  3. Click OK.
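To get a feel for how the Description filter in Table 7 behaves, you can try a pattern offline before typing it into J-Web. The sketch below uses Python's re module as a stand-in. The regular expression flavor that J-Web uses is not stated in this topic, so treat the matching rules here as an assumption. The function name and the sample messages are hypothetical.

```python
import re
from typing import List


def filter_by_description(messages: List[str], pattern: str) -> List[str]:
    """Return the messages whose text matches the pattern.

    Matching is case-sensitive, as noted for the Description filter.
    """
    compiled = re.compile(pattern)
    return [message for message in messages if compiled.search(message)]


# Hypothetical event descriptions, for illustration only.
sample_messages = [
    "Initial configuration loaded",
    "Initiating shutdown",
    "initial lowercase entry",
    "Completed Initial sync",
]

print(filter_by_description(sample_messages, "^Initial*"))
```

In Python's syntax, the trailing * applies only to the preceding l, so ^Initial* also matches text that begins with Initia, such as the second sample message. The lowercase and mid-line entries are excluded because matching is case-sensitive and anchored to the start of the text.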

Meaning

Table 8 describes the Event Summary fields.

Note

By default, the View Events page in the J-Web interface displays the most recent 25 events, with severity levels highlighted in different colors. After you specify the filters, Event Summary displays the events matching the specified filters. Click the First, Next, Prev, and Last links to navigate through messages.

Table 8: Viewing System Log Messages

Field

Function

Additional Information

Process

Displays the name and ID of the process that generated the system log message.

The information displayed in this field is different for messages generated on the local Routing Engine than for messages generated on another Routing Engine (on a system with two Routing Engines installed and operational). Messages from the other Routing Engine also include the identifiers re0 and re1 that identify the Routing Engine.

Severity

Severity level of a message is indicated by different colors.

  • Unknown—Gray—Indicates no severity level is specified.

  • Debug/Info/Notice—Green—Indicates conditions that are not errors but are of interest or might warrant special handling.

  • Warning—Yellow or Amber—Indicates conditions that warrant monitoring.

  • Error—Blue—Indicates standard error conditions that generally have less serious consequences than errors in the emergency, alert, and critical levels.

  • Critical—Pink—Indicates critical conditions, such as hard-drive errors.

  • Alert—Orange—Indicates conditions that require immediate correction, such as a corrupted system database.

  • Emergency—Red—Indicates system panic or other conditions that cause the switch to stop functioning.

A severity level indicates how seriously the triggering event affects switch functions. When you configure a location for logging a facility, you also specify a severity level for the facility. Only messages from the facility that are rated at that level or higher are logged to the specified file.

Event ID

Displays a code that uniquely identifies the message.

The prefix on each code identifies the message source, and the rest of the code indicates the specific event or error.

The event ID begins with a prefix that indicates the generating software process.

Some processes on a switch do not use codes. This field might be blank in a message generated from such a process.

An event can belong to one of the following type categories:

  • Error—Indicates an error or failure condition that might require corrective action.

  • Event—Indicates a condition or occurrence that does not generally require corrective action.

Event Description

Displays a more detailed explanation of the message.

 

Time

Displays the time at which the message was logged.
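The severity rule described in Table 8, that only messages rated at the configured level or higher are logged, can be sketched as follows. This is an illustration only: the names are hypothetical, and the ordering assumes the standard syslog severity order, which matches the ascending order in which Table 8 lists the levels. The colors are copied from Table 8.

```python
# Severity levels from lowest to highest (assumed standard syslog order).
SEVERITY_ORDER = ["debug", "info", "notice", "warning",
                  "error", "critical", "alert", "emergency"]

# Highlight colors used on the View Events page, per Table 8.
SEVERITY_COLORS = {
    "unknown": "gray",
    "debug": "green",
    "info": "green",
    "notice": "green",
    "warning": "yellow or amber",
    "error": "blue",
    "critical": "pink",
    "alert": "orange",
    "emergency": "red",
}


def is_logged(message_severity: str, configured_severity: str) -> bool:
    """Return True if a message meets or exceeds the configured level."""
    message_rank = SEVERITY_ORDER.index(message_severity.lower())
    configured_rank = SEVERITY_ORDER.index(configured_severity.lower())
    return message_rank >= configured_rank
```

For example, if a facility is configured at the warning level, an error message is logged but a notice message is not.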

 
Release History Table

Release

Description

14.1X53

Starting in Junos OS Release 14.1X53, a Raw Report can be generated from the log messages being loaded in the Events Detail table.

14.1X53

Starting in Junos OS Release 14.1X53, a Formatted Report can be generated from event log messages being loaded in an Events Detail table.