Error Handling by Fabric OAM
Fabric Operation, Administration, Maintenance (OAM) helps in detecting failures in fabric paths. Fabric OAM validates the fabric connectivity before sending traffic on a fabric plane whenever a new fabric path is brought up for a PFE. If a failure is detected, the software reports the fault and avoids using that fabric plane for that PFE. This feature works by sending a very low packets per second (PPS) self-destined OAM traffic over each of the available fabric planes and detecting any loss of traffic at the end points (fabric self-ping check).
- In Junos OS Evolved Release 20.4R1, the fabric OAM feature is enabled by
default. You can disable the feature by using the CLI command
set chassis fabric oam detection-disable
. - In Junos OS Evolved Releases 20.4R2 and 21.1R1, the fabric OAM feature is disabled by default.
- In Junos OS Evolved Release 22.1R1, the runtime fabric OAM feature is enabled by
default. You can disable the feature by using the CLI command
edit chassis fabric oam runtime-disable
. The runtime fabric OAM feature is supported on PTX10004, PTX10008, and PTX10016 routers.
The Fabric OAM checks are done at boot time. The failed paths are disabled. The system does not do any recovery action. However, you can try to recover the affected fabric planes by restarting the SIBs. The recovery steps depend on the nature of the failure.
A fabric plane represents an independent bidirectional path between a PFE and fabric ASIC. Runtime Fabric OAM periodically checks fabric connectivity and helps detect and report failures in fabric planes during system runtime. Runtime Fabric OAM detects the fabric reachability of each PFE.
When the same fabric planes fail on a single or multiple FPCs, restart the SIB containing the failed planes, using the following commands:
user@host> request chassis sib slot
slot-number offline
user@host> request chassis sib slot
slot-number online
When random fabric planes fail on multiple FPCs, the fault cannot be isolated to a specific FPC or SIB. However, you can try to recover the planes by restarting the SIBs that contain the affected planes in a sequential manner.
For each error detected by the fabric OAM feature, a syslog is generated. The following is an example:
Oct 29 23:02:46 router-dvi resiliencyd[12921]: Error: /fpc/0/fabspoked-pfe/0/cm/0/pfe/0/fabric_link_foam_fault (0x410009), scope: board, category: internal, severity: major, module: fab-pfe@0, type: fabric link foam fault
The following syslog message indicates that a fabric OAM-related error was cleared.
Oct 29 23:25:14 router-dvi resiliencyd[12921]: Performing action clear-cmalarm for error /fpc/0/fabspoked-pfe/0/cm/0/pfe/0/fabric_link_foam_fault (0x410009) in module: fab-pfe@0 with scope: board category: internal level: major
Also, you can use the CLI commands show system errors active detail
and
show system alarms
to view the Fabric OAM-related errors.
user@router> show system alarms
20 alarms currently active
Alarm time Class Description
2020-08-20 10:32:02 UTC Major FPC 0 Ideeprom read failure
2020-08-20 10:58:07 UTC Major FPC 0 Self_FOAM fault detected
[...Output truncated...]
user@router> show system alarms
14 alarms currently active
Alarm time Class Description
2022-02-15 23:45:28 PST Minor FPC 1 Volt Sensor Fail
2022-02-16 00:02:03 PST Major FPC 1 Self_Fabric OAM Runtime fault detected
2022-02-15 23:43:04 PST Minor FPC 1 Secure boot disabled or not enforced
2022-02-15 23:55:50 PST Minor FPC 3 Secure boot disabled or not enforced
[...Output truncated...]
The following output shows details for both single fabric plane failure (on Packet Forwarding Engine 0) and all fabric planes failure (on Packet Forwarding Engine 1).
user@router> show system errors active detail
System Active Errors Detail Information
FPC 0
----------------------------------------------------------------
Error Name : fabric_down_condition_on_pfe
Identifier : /fpc/0/fabricHub/0/cm/0/fabrichub/1/fabric_down_condition_on_pfe
Description : fabric_down_condition_on_pfe
State : enabled
Scope : pfe
Category : functional
Level : major
Threshold : 1
Error limit : 0
Occur count : 3
Clear count : 2
Last occurred(ms ago) : 103158
System Active Errors Detail Information
FPC 0
----------------------------------------------------------------
Error Name : fabric_link_foam_fault
Identifier : /fpc/0/fabspoked-pfe/0/cm/0/pfe/0/fabric_link_foam_fault
Description : fabric link foam fault
State : enabled
Scope : board
Category : internal
Level : major
Threshold : 1
Error limit : 100
Occur count : 2
Clear count : 0
Last occurred(ms ago) : 113277
System Active Errors Detail Information
FPC 0
----------------------------------------------------------------
Error Name : fabric_link_foam_fault
Identifier : /fpc/0/fabspoked-pfe/0/cm/0/pfe/1/fabric_link_foam_fault
Description : fabric link foam fault
State : enabled
Scope : board
Category : internal
Level : major
Threshold : 1
Error limit : 100
Occur count : 12
Clear count : 0
Last occurred(ms ago) : 103267
System Active Errors Detail Information
RE 0
----------------------------------------------------------------
Error Name : fpga_min_supported_fw_ver_mismatch
Identifier : /re/0/hwdre/0/cm/0/fpga_fw_events/UBAM FPGA/fpga_min_supported_fw_ver_mismatch
Description : firmware_version_lower_than_minimum_expected
State : enabled
Scope : board
Category : functional
Level : minor
Threshold : 10
Error limit : 1
Occur count : 1
Clear count : 0
Last occurred(ms ago) : 68886367
FPC 1
----------------------------------------------------------------
Error Name : fabric_link_self_fabric_oam_runtime_fault
Identifier : /fpc/1/fabspoked-pfe/0/cm/0/pfe/0/fabric_link_self_fabric_oam_runtime_fault
Description : fabric link self fabric oam runtime fault
State : enabled
Scope : board
Category : internal
Level : major
Threshold : 1
Error limit : 36
Occur count : 1
Clear count : 0
Last occurred(ms ago) : 2022-02-16 00:02:03 PST (448108 ms ago) System Active Errors Detail Information
You can use the CLI command show chassis fabric fpcs
to view the fabric OAM
self-ping state of each fabric plane.
user@router> show chassis fabric fpcs
Fabric management FPC state:
FPC #0
PFE #0
SIB0_Asic0_Fcore0 (plane 0) Plane Disabled, Links ok Fabric OAM failed
SIB0_Asic0_Fcore0 (plane 1) Plane Enabled, Links ok Fabric OAM success
SIB0_Asic0_Fcore0 (plane 2) Plane Enabled, Links ok Fabric OAM success
SIB0_Asic0_Fcore0 (plane 3) Plane Enabled, Links ok Fabric OAM success
SIB0_Asic0_Fcore0 (plane 4) Plane Enabled, Links ok Fabric OAM success
SIB0_Asic0_Fcore0 (plane 5) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 6) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 7) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 8) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 9) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 10) Plane Enabled, Links ok Fabric OAM success
SIB1_Asic0_Fcore0 (plane 11) Plane Enabled, Links ok Fabric OAM success
PFE #1
SIB0_Asic0_Fcore0 (plane 0) Plane Enabled, Links ok Fabric OAM success
SIB0_Asic0_Fcore0 (plane 1) Plane Enabled, Links ok Fabric OAM success
user@router> show chassis fabric fpcs Fabric management FPC state: FPC #1 PFE #0 SIB0_Asic0_Fcore0 (plane 0) Plane Enabled, Links ok Fabric OAM Runtime success SIB0_Asic0_Fcore0 (plane 1) Plane Disabled, Links ok Fabric OAM Runtime failed SIB0_Asic1_Fcore0 (plane 2) Plane Enabled, Links ok Fabric OAM Runtime success SIB0_Asic1_Fcore0 (plane 3) Plane Enabled, Links ok Fabric OAM Runtime success SIB0_Asic2_Fcore0 (plane 4) Plane Enabled, Links ok Fabric OAM Runtime success SIB0_Asic2_Fcore0 (plane 5) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic0_Fcore0 (plane 6) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic0_Fcore0 (plane 7) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic1_Fcore0 (plane 8) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic1_Fcore0 (plane 9) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic2_Fcore0 (plane 10) Plane Enabled, Links ok Fabric OAM Runtime success SIB1_Asic2_Fcore0 (plane 11) Plane Enabled, Links ok Fabric OAM Runtime success SIB2_Asic0_Fcore0 (plane 12) Plane Enabled, Links ok Fabric OAM Runtime success SIB2_Asic0_Fcore0 (plane 13) Plane Enabled, Links ok Fabric OAM Runtime success SIB2_Asic1_Fcore0 (plane 14) Plane Enabled, Links ok Fabric OAM Runtime success SIB2_Asic1_Fcore0 (plane 15) Plane Enabled, Links ok Fabric OAM Runtime success
The show chassis fabric fpcs
command displays the following output when the
fabric OAM feature is disabled:
user@router> show chassis fabric fpcs
Fabric management FPC state:
FPC #0
PFE #0
SIB0_Asic0_Fcore0 (plane 0) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 1) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 2) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 3) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 4) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 5) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 6) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 7) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 8) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 9) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 10) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 11) Plane Enabled, Links ok
PFE #1
SIB0_Asic0_Fcore0 (plane 0) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 1) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 2) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 3) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 4) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 5) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 6) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 7) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 8) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 9) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 10) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 11) Plane Enabled, Links ok
PFE #2
SIB0_Asic0_Fcore0 (plane 0) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 1) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 2) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 3) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 4) Plane Enabled, Links ok
SIB0_Asic0_Fcore0 (plane 5) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 6) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 7) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 8) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 9) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 10) Plane Enabled, Links ok
SIB1_Asic0_Fcore0 (plane 11) Plane Enabled, Links ok
PFE #3