Troubleshooting EX8200 Components

 

Understanding Alarm Types and Severity Levels on EX Series Switches

Note

This topic applies only to the J-Web Application package.

Alarms alert you to conditions that might prevent normal operation of the switch. Before monitoring alarms on a Juniper Networks EX Series Ethernet switch, become familiar with the terms defined in Table 1.

Table 1: Alarm Terms

Term

Definition

alarm

Signal alerting you to conditions that might prevent normal operation. On a switch, the alarm signal is the ALM LED lit on the front of the chassis.

alarm condition

Failure event that triggers an alarm.

alarm severity

Seriousness of the alarm. If the Alarm (ALM) LED is red, this indicates a major alarm. If the Alarm LED is yellow, this indicates a minor alarm. If the Alarm LED is unlit, there is no alarm or the switch is halted.

chassis alarm

Preset alarm triggered by a physical condition on the switch such as a power supply failure, excessive component temperature, or media failure.

system alarm

Preset alarm triggered by a missing rescue configuration or failure to install a license for a licensed software feature.

Note: On EX6200 switches, a system alarm can be triggered by an internal link error.

Alarm Types

The switch supports these alarms:

  • Chassis alarms indicate a failure on the switch or one of its components. Chassis alarms are preset and cannot be modified.

  • System alarms indicate a missing rescue configuration or a missing license for a licensed software feature. System alarms are preset and cannot be modified, although you can configure them to appear automatically in the J-Web interface display or the CLI display.

Alarm Severity Levels

Alarms on switches have two severity levels:

  • Major (red)—Indicates a critical situation on the switch that has resulted from one of the following conditions. A red alarm condition requires immediate action.

    • One or more hardware components have failed.

    • One or more hardware components have exceeded temperature thresholds.

    • An alarm condition configured on an interface has triggered a critical warning.

  • Minor (yellow or amber)—Indicates a noncritical condition on the switch that, if left unchecked, might cause an interruption in service or degradation in performance. A yellow alarm condition requires monitoring or maintenance.

    A missing rescue configuration generates a yellow system alarm.

Chassis Component Alarm Conditions on EX8200 Switches

Purpose

This topic describes the chassis alarm conditions on EX8200 switches and how to respond when a particular chassis alarm appears on your switch.

Various conditions related to the chassis components trigger yellow and red alarms. You cannot configure these conditions. See Understanding Alarm Types and Severity Levels on EX Series Switches.

Action

You can monitor chassis alarms by watching the ALM chassis status LED and using the LCD panel to gather information about the alarm. See Chassis Status LEDs in an EX8200 Switch and LCD Panel in an EX8200 Switch.

To display switch chassis alarms in the CLI, use the following command:

user@switch> show chassis alarms

The command output displays the number of alarms currently active, the time when the alarm began, the severity level, and an alarm description. Note the date and time of an alarm so that you can correlate it with error messages in the messages system log file.
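The correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a Juniper tool: it assumes the messages file uses the traditional syslog timestamp layout (month, day, time) at the start of each line, and the function name, the 60-second window, and the sample log lines are all hypothetical.

```python
from datetime import datetime, timedelta


def lines_near_alarm(alarm_time, log_lines, window_seconds=60):
    """Return the log lines stamped within window_seconds of alarm_time.

    Assumes each line starts with a 15-character timestamp such as
    'Mar 12 15:00:01'. The year is taken from alarm_time because this
    timestamp layout does not carry one.
    """
    window = timedelta(seconds=window_seconds)
    matches = []
    for line in log_lines:
        try:
            stamp = datetime.strptime(
                f"{alarm_time.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if abs(stamp - alarm_time) <= window:
            matches.append(line)
    return matches


# Hypothetical alarm time and log lines, for illustration only.
ALARM_TIME = datetime(2014, 3, 12, 15, 0, 2)
SAMPLE_LOG = [
    "Mar 12 14:20:00 switch mgd[1200]: hypothetical unrelated message",
    "Mar 12 15:00:01 switch chassisd[1300]: hypothetical message near the alarm",
    "Mar 12 15:00:40 switch chassisd[1300]: hypothetical follow-up message",
    "Mar 12 16:30:00 switch mgd[1200]: hypothetical later message",
]

for line in lines_near_alarm(ALARM_TIME, SAMPLE_LOG):
    print(line)
```

Widening or narrowing the window trades completeness against noise; a short window is usually enough when the alarm time is noted accurately.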

You can also monitor chassis alarms using the J-Web interface. See Checking Active Alarms with the J-Web Interface.

Table 2 lists some of the chassis alarms that an EX8200 switch can generate.

Table 2: Chassis Alarms for EX8200 Switches

Component

Alarm Condition

Severity

Remedy

Additional Information

Fan tray

The fan tray has been removed from the chassis.

Minor (yellow) or Major (red)

Install the fan tray.

The switch will eventually get too hot to operate if a fan tray is removed. Temperature alarms will follow.

This alarm is expected during fan tray removal and installation.

One or more fans in a fan tray are spinning below the required speed.

Major (red)

Replace the fan tray.

Individual fans cannot be replaced; you must replace the fan tray.

The fan tray might not be properly installed.

Major (red)

Remove and reinstall the fan tray.

If removing and reinstalling the fan tray does not resolve the problem, reboot the switch.

The switch will eventually get too hot to operate if a fan tray is not operating. Temperature alarms will follow.

Power supply

A power supply slot that contained a power supply at bootup is now empty.

Minor (yellow)

Install a power supply in the empty power supply slot.

You can ignore this alarm in cases in which a power supply slot can remain empty.

You will not see this alarm if the switch is booted with an empty power supply slot.

This alarm is expected during power supply removal and installation.

This alarm can be triggered during a line card installation. When triggered for this reason, the alarm condition clears on its own.

A power supply has failed due to an input or output failure, or due to temperature issues.

Major (red)

Replace the failed power supply.

 

The power supply might not be properly installed.

Major (red)

  • Remove and reinstall the power supply.

  • If removing and reinstalling the power supply does not resolve the problem, reboot the switch.

 

A power supply fan has failed.

Minor (yellow)

Replace the failed power supply.

 

A power supply has a high temperature.

Major (red)

Check the power supply fan.

 

Insufficient power input

Major (red)

Check the power supply.

 

An unknown power supply is installed.

Major (red)

  • Check the power supply.

  • Install a power supply recommended by Juniper Networks.

 
Temperature

The chassis warm temperature threshold has been exceeded and fan speeds have increased.

Minor (yellow)

Bring down the room temperature, if possible.

Ensure that the airflow through the switch is unobstructed.

The chassis is warm and must be cooled down. The switch is still functioning normally.

To monitor temperature:

user@switch> show chassis environment

To monitor temperature thresholds:

user@switch> show chassis temperature-thresholds

The chassis high temperature threshold has been exceeded and the fans are operating at full speed.

Major (red)

Bring down the room temperature, if possible.

Ensure that the airflow through the switch is unobstructed.

The chassis is hot and must be cooled down. The switch might still function normally but is close to shutting down if it hasn’t already.

To monitor temperature:

user@switch> show chassis environment

To monitor temperature thresholds:

user@switch> show chassis temperature-thresholds

The chassis warm temperature threshold has been exceeded, and one or more fans are not operating properly. The operating fans are running at full speed.

Minor (yellow)

Replace the fan tray that has the faulty fan or fans.

Bring down the room temperature, if possible.

Ensure that the airflow through the switch is unobstructed.

The chassis is warm and must be cooled down. The switch is still functioning normally.

To monitor temperature:

user@switch> show chassis environment

To monitor temperature thresholds:

user@switch> show chassis temperature-thresholds

The chassis high temperature threshold has been exceeded, and one or more fans are not operating properly. The operating fans are running at full speed.

Major (red)

Replace the fan tray that has the faulty fan or fans.

Bring down the room temperature, if possible.

Ensure that the airflow through the switch is unobstructed.

The chassis is hot and must be cooled down. The switch might still function normally but is close to shutting down if it hasn’t already.

To monitor temperature:

user@switch> show chassis environment

To monitor temperature thresholds:

user@switch> show chassis temperature-thresholds

The temperature sensor on a hardware component has failed.

Minor (yellow)

Replace the hardware component.

 
Management Ethernet interface

Management Ethernet link is down.

Major (red)

Check whether a cable is connected to the management Ethernet interface, or whether the cable is defective. Replace the cable if required.

If you are unable to resolve the problem, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 
Media

Minor loss of communication with backup Routing Engine.

Minor (yellow)

Not applicable

Informs the user of an intermittent loss of communication with the backup Routing Engine.

Device booted from backup root.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

/var or /config full (only 10% free).

Major (red)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

/var or /config full (only 25% free).

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Upgrade bank is empty or corrupted.

Major (red)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Firmware version is not the latest.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Single-bit ECC error detected.

Major (red)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Console device encounters framing error storm.

Not applicable

Check for a faulty console cable.

 
Routing Engine module (RE module), Switch Fabric and Routing Engine module (SRE module), or Switch Fabric module (SF module)

The RE module, SRE module, or the SF module has failed.

Major (red)

Replace the failed module.

 

Rescue configuration is not set.

Minor (yellow)

Use the request system configuration rescue save command to set the rescue configuration.

 

Feature usage requires a license or the license for the feature usage has expired.

Minor (yellow)

Install the required license for the feature specified in the alarm. For more information, see Understanding Software Licenses for EX Series Switches.

 

Backup Routing Engine is active.

Minor (yellow)

Not applicable

Informational alarm that notifies the user.

Link Status

The link to the network is down.

Major (red) or Minor (yellow)

Check network connectivity.

The network link is disabled by default, so you might see this alarm before you connect the switch to the network.

Line Cards

Hardware errors: a Packet Forwarding Engine error, a line card that fails to initiate, or a line card that is unresponsive.

Major (red)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Sensor errors: a temperature sensor error or a voltage sensor error.

Minor (yellow)

Open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

 

Checking Active Alarms with the J-Web Interface

Purpose

Note

This topic applies only to the J-Web Application package.

Use the monitoring functionality to view alarm information for EX Series switches, including the alarm type, alarm severity, and a brief description of each active alarm on the switching platform.

Action

To view the active alarms:

  1. Select Monitor > Events and Alarms > View Alarms in the J-Web interface.
  2. Select an alarm filter based on alarm type, severity, description, and date range.
  3. Click Go.

    All the alarms matching the filter are displayed.

Note

When the switch is reset, the active alarms are displayed.

Meaning

Table 3 lists the alarm output fields.

Table 3: Summary of Key Alarm Output Fields

Field

Values

Type

Category of the alarm:

  • Chassis—Indicates an alarm condition on the chassis (typically an environmental alarm such as one related to temperature).

  • System—Indicates an alarm condition in the system.

Severity

Alarm severity—either major (red) or minor (yellow).

Description

Brief synopsis of the alarm.

Time

Date and time when the failure was detected.

Monitoring System Log Messages

Purpose

Note

This topic applies only to the J-Web Application package.

Use the monitoring functionality to filter and view system log messages for EX Series switches.

Action

To view events in the J-Web interface, select Monitor > Events and Alarms > View Events.

Apply a filter or a combination of filters to display only the relevant events. Table 4 describes the different filters, their functions, and the associated actions.

To view events in the CLI, enter the following command:

show log

Table 4: Filtering System Log Messages

Field

Function

Your Action

System Log File

Specifies the name of a system log file for which you want to display the recorded events.

Lists the names of all the system log files that you configure.

By default, a log file, messages, is included in the /var/log/ directory.

To specify events recorded in a particular file, select the system log filename from the list— for example, messages.

Select Include archived files to include archived files in the search.

Process

Specifies the name of the process generating the events you want to display.

To view all the processes running on your system, enter the CLI command show system processes.

For more information about processes, see the Junos OS Installation and Upgrade Guide.

To specify events generated by a process, type the name of the process.

For example, type mgd to list all messages generated by the management process.

Date From

To

Specifies the time period in which the events you want displayed are generated.

Displays a calendar that allows you to select the year, month, day, and time. It also allows you to select the local time.

By default, the messages generated during the last hour are displayed. End Time shows the current time and Start Time shows the time one hour before End Time.

To specify the time period:

  • To set the start date (Date From), click the Calendar icon and select the year, month, and date (for example, 02/10/2007).

  • To set the end date (To), click the Calendar icon and select the year, month, and date (for example, 02/10/2007).

  • Click to select the time in hours, minutes, and seconds.

Event ID

Specifies the event ID for which you want to display the messages.

Allows you to type part of the ID and completes the remainder automatically.

An event ID, also known as a system log message code, uniquely identifies a system log message. It begins with a prefix that indicates the generating software process or library.

To specify events with a specific ID, type the partial or complete ID— for example, TFTPD_AF_ERR.

Description

Specifies text from the description of events that you want to display.

Allows you to use regular expressions to match text from the event description.

Note: Regular expression matching is case-sensitive.

To specify events with a specific description, type a text string or a regular expression that matches text in the description.

For example, type ^Initial* to display all messages with lines beginning with the term Initial.

Search

Applies the specified filter and displays the matching messages.

To apply the filter and display messages, click Search.

Reset

Resets all the fields in the Events Filter box.

To reset the field values that are listed in the Events Filter box, click Reset.

Generate Raw Report

Note:

  • Starting in Junos OS Release 14.1X53, a Raw Report can be generated from the log messages being loaded in the Events Detail table. The Generate Raw Report button is enabled after the event log messages start loading in the Events Detail table.

  • After the log messages are completely loaded in the Events Detail table, Generate Raw Report changes to Generate Report.

Generates a list of event log messages in nontabular format.

To generate a raw report:

  1. Click Generate Raw Report.

    The Opening filteredEvents.html window appears.

  2. Select Open with to open the HTML file or select Save File to save the file.
  3. Click OK.

Generate Report

Note: Starting in Junos OS Release 14.1X53, a Formatted Report can be generated from event log messages being loaded in an Events Detail table. The Generate Report button appears only after event log messages are completely loaded in the Events Detail table. The Generate Raw Report button is displayed while event log messages are being loaded.

Generates a list of event log messages in tabular format, which shows system details, events filter criteria, and event details.

To generate a formatted report:

  1. Click Generate Report.

    The Opening Report.html window appears.

  2. Select Open with to open the HTML file or select Save File to save the file.
  3. Click OK.
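The Description filter in Table 4 matches event text with a case-sensitive regular expression. The sketch below uses Python's re module as a stand-in to show how the ^Initial* example behaves; the regular expression dialect used by J-Web is not stated in this topic, so treat the exact matching rules as an assumption. The helper name and the sample descriptions are hypothetical.

```python
import re


def filter_descriptions(pattern, descriptions):
    """Return the descriptions that match pattern (case-sensitive)."""
    compiled = re.compile(pattern)  # no IGNORECASE flag: matching is case-sensitive
    return [text for text in descriptions if compiled.search(text)]


# Hypothetical event descriptions, for illustration only.
SAMPLE = [
    "Initial configuration loaded",
    "initial configuration loaded",
    "Initiating shutdown",
    "Completed Initial sync",
]

print(filter_descriptions(r"^Initial*", SAMPLE))
print(filter_descriptions(r"^Initial", SAMPLE))
```

In this dialect the trailing * applies only to the final letter l, so ^Initial* also matches text that begins with Initia, such as the third sample. Dropping the * restricts the match to text that begins with Initial. The lowercase sample never matches, which reflects the case-sensitivity note in Table 4.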

Meaning

Table 5 describes the Event Summary fields.

Note

By default, the View Events page in the J-Web interface displays the most recent 25 events, with severity levels highlighted in different colors. After you specify the filters, Event Summary displays the events matching the specified filters. Click the First, Next, Prev, and Last links to navigate through messages.

Table 5: Viewing System Log Messages

Field

Function

Additional Information

Process

Displays the name and ID of the process that generated the system log message.

The information displayed in this field is different for messages generated on the local Routing Engine than for messages generated on another Routing Engine (on a system with two Routing Engines installed and operational). Messages from the other Routing Engine also include the identifiers re0 and re1 that identify the Routing Engine.

Severity

Severity level of a message is indicated by different colors.

  • Unknown—Gray—Indicates no severity level is specified.

  • Debug/Info/Notice—Green—Indicates conditions that are not errors but are of interest or might warrant special handling.

  • Warning—Yellow—Indicates conditions that warrant monitoring.

  • Error—Blue—Indicates standard error conditions that generally have less serious consequences than errors in the emergency, alert, and critical levels.

  • Critical—Pink—Indicates critical conditions, such as hard-drive errors.

  • Alert—Orange—Indicates conditions that require immediate correction, such as a corrupted system database.

  • Emergency—Red—Indicates system panic or other conditions that cause the switch to stop functioning.

A severity level indicates how seriously the triggering event affects switch functions. When you configure a location for logging a facility, you also specify a severity level for the facility. Only messages from the facility that are rated at that level or higher are logged to the specified file.

Event ID

Displays a code that uniquely identifies the message.

The prefix on each code identifies the message source, and the rest of the code indicates the specific event or error.

The event ID begins with a prefix that indicates the generating software process.

Some processes on a switch do not use codes. This field might be blank in a message generated from such a process.

An event can belong to one of the following type categories:

  • Error—Indicates an error or failure condition that might require corrective action.

  • Event—Indicates a condition or occurrence that does not generally require corrective action.

Event Description

Displays a more detailed explanation of the message.

 

Time

Displays the time at which the message was logged.
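The Severity row of Table 5 describes two ideas that a short sketch can make concrete: the color assigned to each severity level, and the rule that only messages rated at the configured level or higher are logged. The Python below is an illustration only. The colors come from Table 5, the numeric ranking follows the standard syslog ordering (0 is most severe), and the function names are hypothetical.

```python
# Standard syslog ordering: a lower number means a more severe message.
SEVERITY_RANK = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}

# Colors as listed in Table 5.
SEVERITY_COLOR = {
    "emergency": "red", "alert": "orange", "critical": "pink",
    "error": "blue", "warning": "yellow",
    "notice": "green", "info": "green", "debug": "green",
}


def color_for(severity):
    """Return the display color for a severity; unknown levels are gray."""
    return SEVERITY_COLOR.get(severity.lower(), "gray")


def is_logged(message_severity, configured_severity):
    """True if the message is rated at the configured level or higher."""
    return SEVERITY_RANK[message_severity] <= SEVERITY_RANK[configured_severity]


print(color_for("Alert"))                # orange
print(is_logged("critical", "warning"))  # True: critical outranks warning
print(is_logged("info", "warning"))      # False: info is below warning
```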

 

Troubleshooting an EX8200 Line Card’s Failure to Power On

Problem

Description: After you have installed a line card in an EX8200 switch, the line card fails to power on correctly. The ON LED on the line card is unlit or is not lit steadily.

Cause

The line card’s failure to power on might have resulted from any one of these causes:

  • The line card is not seated correctly in the slot in the switch chassis.

  • The switch does not have sufficient power supplies installed to power on the line card while maintaining its N+1 or N+N power configuration.

  • The line card requires a particular minimum Junos OS release to power on, and that minimum release is not running on the switch.
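The power-related cause above comes down to simple arithmetic, sketched here in Python. This is a simplified model, not the switch's actual power management logic: it assumes identical power supplies, treats N+1 as holding one supply in reserve and N+N as holding half of the supplies in reserve, and uses hypothetical wattage figures and function names.

```python
def usable_power_watts(supply_count, watts_per_supply, redundancy):
    """Power available to components after setting aside the redundant supplies."""
    if redundancy == "N+1":
        usable_supplies = supply_count - 1   # one supply held in reserve
    elif redundancy == "N+N":
        usable_supplies = supply_count // 2  # half of the supplies held in reserve
    else:
        raise ValueError(f"unknown redundancy mode: {redundancy}")
    return max(usable_supplies, 0) * watts_per_supply


def can_power_on(card_watts, in_use_watts, supply_count, watts_per_supply, redundancy):
    """True if the line card fits in the budget without breaking redundancy."""
    budget = usable_power_watts(supply_count, watts_per_supply, redundancy)
    return in_use_watts + card_watts <= budget


# Hypothetical figures: four 2000 W supplies and a 500 W line card.
print(can_power_on(500, 5800, 4, 2000, "N+1"))  # False: 6300 W exceeds 6000 W
print(can_power_on(500, 3000, 4, 2000, "N+N"))  # True: 3500 W fits in 4000 W
```

Under this model, a line card can stay powered off even though the installed supplies could physically carry it, because the reserve capacity is not counted toward the budget.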

Solution

Possible solutions to these problems are:

If the ON LED is unlit:

If the ON LED blinks in green but is not lit steadily:

  • Tighten the captive screws on the faceplate of the line card to ensure that the line card is seated correctly in the slot in the switch chassis.

Troubleshooting Temperature Alarms in EX Series Switches

Problem

Description: An EX Series switch generates a temperature alarm such as FPC 0 EX-PFE1 Temp Too Hot.

Cause

Temperature sensors in the chassis monitor the temperature of the chassis. The switch raises an alarm if a fan fails or if the temperature of the chassis exceeds permissible levels.

Solution

When the switch raises a temperature alarm such as the FPC 0 EX-PFE1 Temp Too Hot alarm, use the show chassis environment and the show chassis temperature-thresholds commands to identify the condition that triggered the alarm.

Caution

To prevent the switch from overheating, do not operate it in an area that exceeds the maximum recommended ambient temperature. To prevent airflow restriction, allow at least 6 inches (15.2 cm) of clearance around the ventilation openings.

  1. Connect to the switch by using Telnet and issue the show chassis environment command. This command displays environmental information about the switch chassis, including the temperature, and information about the fans, power supplies, and Routing Engines. Following is a sample output on an EX9208 switch. The output is similar on other EX Series switches.

    show chassis environment (EX9208 Switch)

    user@switch> show chassis environment

    Table 6 lists the output fields for the show chassis environment command. Output fields are listed in the approximate order in which they appear.

    Table 6: show chassis environment Output Fields

    Field Name

    Field Description

    Class

    Information about the category or class of chassis component:

    • Temp: Temperature of air flowing through the chassis in degrees Celsius (°C) and degrees Fahrenheit (°F).

    • Fans: Information about the status of fans and blowers.

    Item

    Information about the chassis components: Flexible PIC Concentrators (FPCs), that is, the line cards; Control Boards (CBs); Routing Engines (REs); and Power Entry Modules (PEMs), that is, the power supplies.

    Status

    Status of the specified chassis component. For example, if Class is Fans, the fan status can be:

    • OK: The fans are operational.

    • Testing: The fans are being tested during initial power-on.

    • Failed: The fans have failed or the fans are not spinning.

    • Absent: The fan tray is not installed.

    Measurement

    Depends on the Class. For example, if Class is Temp, indicates the temperature in degrees Celsius (°C) and degrees Fahrenheit (°F). If the Class is Fans, indicates actual fan RPM.

  2. Issue the command show chassis temperature-thresholds. This command displays the chassis temperature threshold settings. Following is a sample output on an EX9208 switch. The output is similar on other EX Series switches.

    show chassis temperature-thresholds (EX9208 Switch)

    user@host> show chassis temperature-thresholds

    Table 7 lists the output fields for the show chassis temperature-thresholds command. Output fields are listed in the approximate order in which they appear.

    Table 7: show chassis temperature-thresholds Output Fields

    Field name

    Field Description

    Item

    Chassis component. You can configure the output to display threshold information for components such as the chassis, the Routing Engines, and the FPC for each slot in each FRU. By default, information is displayed only for the chassis and the Routing Engines.

    Fan speed

    Temperature thresholds, in degrees Celsius, for the fans to operate at normal and at high speed.

    • Normal—The temperature threshold at which the fans operate at normal speed and when all the fans are present and functioning normally.

    • High—The temperature threshold at which the fans operate at high speed or when a fan has failed or is missing.

    Note: An alarm is not triggered until the temperature exceeds the threshold settings for a yellow alarm or a red alarm.

    Yellow alarm

    Temperature thresholds, in degrees Celsius, that trigger a yellow alarm.

    • Normal—The temperature threshold that must be exceeded on the component to trigger a yellow alarm when the fans are running at full speed.

    • Bad fan—The temperature threshold that must be exceeded on the component to trigger a yellow alarm when one or more fans have failed or are missing.

    Red alarm

    Temperature thresholds, in degrees Celsius, that trigger a red alarm.

    • Normal—The temperature threshold that must be exceeded on the component to trigger a red alarm when the fans are running at full speed.

    • Bad fan—The temperature threshold that must be exceeded on the component to trigger a red alarm when one or more fans have failed or are missing.

    Fire Shutdown

    Temperature threshold, in degrees Celsius, for the switch to shut down.

When a temperature alarm is triggered, you can identify the condition that triggered it in two steps. First, run the show chassis environment command to display the current temperature of each component. Then compare those values with the threshold values displayed by the show chassis temperature-thresholds command.

For example, for FPC 3:

  • If the temperature of FPC 3 exceeds 55° C, the output indicates that the fans are operating at a high speed (no alarm is triggered).

  • If the temperature of FPC 3 exceeds 65° C while one or more fans have failed or are missing, a yellow alarm is triggered.

  • If the temperature of FPC 3 exceeds 75° C while all fans are operating, a yellow alarm is triggered to indicate that the temperature threshold limit is exceeded.

  • If the temperature of FPC 3 exceeds 80° C while one or more fans have failed or are missing, a red alarm is triggered.

  • If the temperature of FPC 3 exceeds 105° C while all fans are operating, a red alarm is triggered to indicate that the temperature threshold limit is exceeded.

  • If the temperature of FPC 3 exceeds 110° C, the switch is powered off.
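The comparison in the FPC 3 example can be expressed as a small Python sketch. It is an illustration, not Junos OS logic: the threshold values are the ones quoted in the list above, the 65° C and 80° C values are read as the Bad fan thresholds from Table 7 (an interpretation), and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Temperature thresholds for one component, in degrees Celsius."""
    fan_high_speed: int
    yellow_normal: int
    yellow_bad_fan: int
    red_normal: int
    red_bad_fan: int
    fire_shutdown: int


# Values quoted in the FPC 3 example.
FPC3 = Thresholds(fan_high_speed=55, yellow_normal=75, yellow_bad_fan=65,
                  red_normal=105, red_bad_fan=80, fire_shutdown=110)


def classify(temp_c, thresholds, fan_failed=False):
    """Return the state implied by a temperature reading."""
    if temp_c > thresholds.fire_shutdown:
        return "shutdown"
    # A failed or missing fan selects the lower Bad fan thresholds.
    red = thresholds.red_bad_fan if fan_failed else thresholds.red_normal
    yellow = thresholds.yellow_bad_fan if fan_failed else thresholds.yellow_normal
    if temp_c > red:
        return "red alarm"
    if temp_c > yellow:
        return "yellow alarm"
    if temp_c > thresholds.fan_high_speed:
        return "fans at high speed"
    return "normal"


for reading, failed in [(50, False), (60, False), (70, False), (70, True), (85, True), (111, False)]:
    print(reading, failed, classify(reading, FPC3, failed))
```

The same 70° C reading produces no alarm when all fans are working but a yellow alarm when a fan has failed, which is why both commands are needed to interpret a temperature alarm.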

Table 8 lists the possible causes for the switch to generate a temperature alarm and the respective remedies.

Table 8: Causes and Remedies for Temperature Alarms

Cause

Remedy

Ambient temperature is above threshold temperature.

Ensure that the ambient temperature is within the threshold temperature limit. See Environmental Requirements and Specifications for EX Series Switches.

Fan module or fan tray has failed.

  • Check the fan.

  • Replace the faulty fan module or fan tray.

  • If the above two checks show no problems, open a support case using the Case Manager link at https://www.juniper.net/support/ or call 1-888-314-5822 (toll-free within the United States and Canada) or 1-408-745-9500 (from outside the United States).

Restricted airflow through the switch due to insufficient clearance around the installed switch.

Ensure that there is sufficient clearance around the installed switch. See the following topics to understand the clearance requirements of various EX Series switches.

Release History Table
Release
Description
Starting in Junos OS Release 14.1X53, a Raw Report can be generated from the log messages being loaded in the Events Detail table.
Starting in Junos OS Release 14.1X53, a Formatted Report can be generated from event log messages being loaded in an Events Detail table.